Missing morph tagging in new Loeb volumes

Greg F
Greg F Member Posts: 278
edited November 20 in Resources Forum

I play a little game to see how quickly I can find errors in tagging, milestones, etc. on newly released Loeb editions. It took me about thirty seconds to notice problems in the recently released Medical Works of Antiquity series. (Which raises the question: if I can do it, why can't the quality control processes at Logos? ... anyway...)

The following volumes are missing the automatic morph tagging on the Greek side.

Galen's On the Natural Faculties: Greek Text

Heraclitus' On the Universe (the Greek and Latin quotes are sometimes tagged correctly later on, I think it's a language selection problem at the beginning)

Also, the Heraclitus book isn't marked as a separate book with its own milestones (?), meaning you can't link the English Heraclitus text with the Greek text. It would be much more helpful if each H fragment could receive its own milestone, so users could enter "Heraclitus 17" and be brought to the 17th fragment, with the matching English as well in the other panel.

I'll post more if I see problems in the Latin volumes.

Comments

  • Greg F
    Greg F Member Posts: 278

    Also, I just noticed that in the Index for the Hippocrates volume, the entries sometimes point to the Greek version, sometimes to the English version. See the first entry of the index on

    ABDERA, I. 266, 268, 270, 274, 278; II. 187

    The first five entries point to the Greek volume, the last points to the English volume. Same thing with "Amputation" etc.

    This should be harmonized.

  • Greg F
    Greg F Member Posts: 278

    The Greek title of the Heracleitus volume is misspelled: the final sigma has been OCR'd incorrectly as a capital epsilon.

    ΠΕΡΙ ΤΟΥ ΠΑΝΤΟΕ

    should be:

    ΠΕΡΙ ΤΟΥ ΠΑΝΤΟΣ

  • MJ. Smith
    MJ. Smith MVP Posts: 53,397

    This particular error needs to be reported via a typo report to ensure it is in the system when they update the book.

    Orthodox Bishop Alfeyev: "To be a theologian means to have experience of a personal encounter with God through prayer and worship."; Orthodox proverb: "We know where the Church is, we do not know where it is not."

  • Butters
    Butters Member Posts: 466

    The Loeb volumes are nearly useless; they are simply digital texts that one can get anywhere - but with jumbled/incorrect/missing morphological information, etc.

    For example, St. Augustine's Confessions - there are dozens of places I can get the digital text; some of the texts have oodles of information built into them.  

    So what exactly is Logos selling with these volumes?  They have added very little value to them.  

    I gave up and walked away thinking that eventually Logos would fix these issues; a year or so later absolutely nothing has been done.  

    It honestly feels a bit fraudulent.

    Very, very poor showing.  

    “To love means loving the unlovable.  To forgive means pardoning the unpardonable.  Faith means believing the unbelievable.  Hope means hoping when everything seems hopeless.” ~Chesterton

  • Kyle G. Anderson
    Kyle G. Anderson Member, Logos Employee Posts: 2,218

    Thanks for letting us know. 

    Greg F said:

    The following volumes are missing the automatic morph tagging on the Greek side.

    Galen's On the Natural Faculties: Greek Text

    Heraclitus' On the Universe (the Greek and Latin quotes are sometimes tagged correctly later on, I think it's a language selection problem at the beginning)

    Morph tagging has been added and will be available next resource update.

    Greg F said:

    Also, the Heraclitus book isn't marked as a separate book with its own milestones (?), meaning you can't link the English Heraclitus text with the Greek text. It would be much more helpful if each H fragment could receive its own milestone, so users could enter "Heraclitus 17" and be brought to the 17th fragment, with the matching English as well in the other panel.

    I'll have to analyze this one further. It would require creating a new data type which is something that won't be able to happen immediately.

    Greg F said:

    Also, I just noticed that in the Index for the Hippocrates volume, the entries sometimes point to the Greek version, sometimes to the English version. See the first entry of the index on

    ABDERA, I. 266, 268, 270, 274, 278; II. 187

    The first five entries point to the Greek volume, the last points to the English volume. Same thing with "Amputation" etc.

    This should be harmonized.

    This reflects the print. In the print, Greek and English are on opposite facing pages (which is why we had to split it into two resources). Since the print links to page 266 (which happens to be Greek) that is where we'll link.
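
The index behaviour described here can be sketched in a few lines. This is only an illustration, assuming the usual facing-page layout in which even-numbered (left-hand) pages carry the Greek and odd-numbered (right-hand) pages the English; the function name `resource_for_page` is hypothetical and is not part of Logos.

```python
# Hypothetical sketch: decide which of the two split resources a print page
# number lands in. Assumption (not confirmed in this thread): even pages hold
# the Greek text, odd pages hold the English translation.

def resource_for_page(page: int) -> str:
    """Return the split resource that a facing-page print page belongs to."""
    return "Greek" if page % 2 == 0 else "English"


# Page numbers taken from the ABDERA index entry quoted in this thread.
abdera_pages = {"I": [266, 268, 270, 274, 278], "II": [187]}

for volume, pages in abdera_pages.items():
    for page in pages:
        print(f"{volume}. {page} -> {resource_for_page(page)}")
```

Under that assumption the first five ABDERA references are even pages and the last is odd, which matches the Greek/English split reported above.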

    Greg F said:

    The Greek title of the Heracleitus volume is misspelled: the final sigma has been OCR'd incorrectly as a capital epsilon.

    ΠΕΡΙ ΤΟΥ ΠΑΝΤΟΕ

    should be:

    ΠΕΡΙ ΤΟΥ ΠΑΝΤΟΣ

    This has been updated and will be available next resource update.

  • Greg F
    Greg F Member Posts: 278

    Thanks for the update Kyle, I appreciate it.

    You can add the Celsus De Medicina volumes to the list of resources to be updated to add morphological tagging. The Latin is missing tagging too.

    And please do consider adding a new data type for Heraclitus. He's a major Greek Pre-Socratic philosopher, quoted by Luther and particularly important for the early church fathers (my library shows him being quoted by Justin Martyr, Tatian, Clement of Alexandria, etc.) Then, of course, there are all the references in secular literature: Dante, Lucretius, Marx, Francis Bacon, Marcus Aurelius, Plutarch, Hegel, etc.

  • Kyle G. Anderson
    Kyle G. Anderson Member, Logos Employee Posts: 2,218

    Greg F said:

    You can add the Celsus De Medicina volumes to the list of resources to be updated to add morphological tagging. The Latin is missing tagging too.

    I didn't mention it earlier but it was also updated. Thanks for pointing it out though.

    Greg F said:

    And please do consider adding a new data type for Heraclitus. He's a major Greek Pre-Socratic philosopher, quoted by Luther and particularly important for the early church fathers (my library shows him being quoted by Justin Martyr, Tatian, Clement of Alexandria, etc.) Then, of course, there are all the references in secular literature: Dante, Lucretius, Marx, Francis Bacon, Marcus Aurelius, Plutarch, Hegel, etc.

    I have a case in for Heraclitus for work on a data type. It'll probably be a couple of months before I'm able to get to it.

  • Greg F
    Greg F Member Posts: 278

    Butters said:

    The Loeb volumes are nearly useless; they are simply digital texts that one can get anywhere - but with jumbled/incorrect/missing morphological information, etc.

    I understand your frustration, Butters, and I do get a little tired of doing quality control for Logos for free, but I think your characterization of "nearly useless" might be a bit of an overstatement.

    While the automatic morphological tagging is not perfect, I've found it to be good enough for 95% of the words I need to look up. (I even use the lack of tagging as a first indicator that the word might be OCR'd incorrectly: in which case I go to one of the original scans and check before reporting a typo.) I don't expect perfect tagging, because that would require a human being (or more likely, a team of people) to go through and check and tag the texts manually--an unrealistic and incredibly costly scenario, especially for the number of volumes being produced. Automatic tagging is about as good as one can reasonably expect for the price.

    Also, the Logos/Noet team has been pretty good about updating resources with problems. See this thread for instance. Others, however, are still waiting.

    In the end, while I think the Logos Loeb volumes are fabulously overpriced for what they are, I'm very happy to have purchased all of them progressively through Community Pricing. The note-taking abilities, the cross-device syncing, the linking of English and Greek/Latin texts, as well as instant access to dictionaries and encyclopedias are all rather wonderful features, especially compared to working with scanned PDF files, even if they are freely available. (Even Perseus is less user-friendly than Logos, in my opinion).

  • Butters
    Butters Member Posts: 466

    Hi Greg, 

    Very much appreciate your thoughtful response.  :)  

    Well, maybe there's something wrong with my software, but allow me to provide an example.  Choosing a word at random ("nostrum" in the context of "tu excitas, ut laudare te delectet, quia fecisti nos ad te et inquietum est cor nostrum, donec requiescat in te"), in the 1st paragraph of Liber Primus of the Confessions.  

    So, I click on the word and here's the morphological info I get:  

    I get similarly vague, confusing and even incorrect results with every single word in the first paragraph.

  • Greg F
    Greg F Member Posts: 278

    Ah, I see! I'm afraid the problem you're having is to be blamed on Latin, not on Logos. :)

    As you probably already know, Latin morphology is rather polysemantic, meaning a given form can have many different meanings and interpretations. An automatic morphology program is going to give you ALL possible meanings, rather than just the right one.

    Take your nostrum example: from a purely morphological standpoint, it can either represent a form of noster, nostra, nostrum (the possessive pronoun our), OR it can be the genitive plural of ego (a personal pronoun). And then, within the possessive reading, it can either be singular, neuter, nominative OR singular, neuter, vocative OR singular, masculine, accusative, and so on and so forth.

    So what you're seeing in the Logos popup you posted is just a reflection of the nature of Latin: any automatic tagging will list all possible forms, not just the "right" one. Getting the exactly right form for a given sentence would require someone with a good knowledge of Latin to go in and choose from among all those possibilities. And, as I said in my previous post, that would cost a fortune. So, instead, Logos lists all possible interpretations for a given morpheme. You happened to pick one with lots of possibilities, but that's the richness (and difficulty) of classical languages. :)

    I hope that makes it clearer!
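
To make the difference between automatic and hand-checked tagging concrete, here is a minimal sketch. The lookup table is hand-written for illustration only and is not exhaustive; it is not Logos data, and the names `Parse`, `automatic_parses` and `human_choice` are hypothetical. An automatic tagger in this simplified picture returns every candidate parse for a form, while a human editor picks the one that fits the sentence (in "inquietum est cor nostrum", nostrum agrees with the neuter nominative cor).

```python
from typing import NamedTuple


class Parse(NamedTuple):
    lemma: str
    part_of_speech: str
    features: str


# Illustrative, hand-written candidates for one form (assumption: a real
# parser would derive these from a full lexicon and inflection rules).
CANDIDATES = {
    "nostrum": [
        Parse("noster", "possessive", "neuter nominative singular"),
        Parse("noster", "possessive", "neuter accusative singular"),
        Parse("noster", "possessive", "neuter vocative singular"),
        Parse("noster", "possessive", "masculine accusative singular"),
        Parse("ego", "personal pronoun", "genitive plural"),
    ],
}


def automatic_parses(form: str) -> list[Parse]:
    """Return every candidate parse, with no attempt to use context."""
    return CANDIDATES.get(form.lower(), [])


def human_choice(form: str, lemma: str, features: str) -> Parse:
    """Model an editor selecting the single parse that fits the sentence."""
    matches = [
        p for p in automatic_parses(form)
        if p.lemma == lemma and p.features == features
    ]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match for {form!r}")
    return matches[0]


for parse in automatic_parses("nostrum"):
    print(parse)

# The reading that fits "cor nostrum" in the Confessions passage.
print(human_choice("nostrum", "noster", "neuter nominative singular"))
```

The first loop is roughly what the popup shows; the last line is the extra step that costs editorial time.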

  • Butters
    Butters Member Posts: 466

    Greg, thanks for your thoughtful response once again :)

    As a former Classics (Greek & Latin) major, I'm exceedingly aware of the polysemantic nature of Latin morphology; "nostrum" in the abstract can indeed be many of those things, but not in the context of that sentence.

    A few years back, I read the Iliad; I ALWAYS read hardcopy - AND, I put Logos to good use by using it as a reading aid. If I was unsure or clueless about a morpheme or a definition, with a few exceptions (which I brought to the attention of Logos) the information was precise, concise and extremely helpful.

    So, when I attempt to use Logos in the same way when reading the Confessions, I find the information associated with the text to be - in comparison - a jumbled mess. I don't think it's just the difference in the morphological natures of the languages; clearly, a great deal of work was put into the Iliad; and clearly, very little has been put into the Confessions. I'd be willing to bet if one compared the various biblical texts that Logos sells, they would be far closer to the Iliad than to the Confessions.

    My point is, there ought to be a distinction in marketing language because clearly there is a difference in the reality of how much work has been put into the respective texts I've mentioned.

    Anyhoo, thanks again. ~Butters :)

  • Greg F
    Greg F Member Posts: 278

    Is this the Iliad text you used? That one happens to be the only non-Bible-related Greek text I'm aware of that has complete morphological tagging done by a human being, which explains why you were finding precise morphological information and not just a list of different possibilities as per Augustine's Confessions.

    I do agree with you that Logos could/should make it clearer that automatic tagging does not mean 100%-accurate morphological information for each and every word.

    One suggestion, if you're finding the popup to be unhelpful, is to use the Information panel; it offers a somewhat "cleaner" view of all the morphological possibilities for a given term.

  • Butters
    Butters Member Posts: 466

    ah, I see...lol.  So I just happened to use a text that would give me false expectations for others.  Thanks Greg - and thanks for the suggestion, I'll try that.  Cheers, :)  

  • Butters
    Butters Member Posts: 466

    Greg F said:

    I do agree with you that Logos could/should make it clearer that automatic tagging does not mean 100%-accurate morphological information for each and every word.

    Indeed. 

  • Butters
    Butters Member Posts: 466

    Greg F said:

    One suggestion, if you're finding the popup to be unhelpful, is to use the Information panel; it offers a somewhat "cleaner" view of all the morphological possibilities for a given term.

    Greg, just wanted to thank you for this suggestion; works MUCH better!  :)  

  • Greg F
    Greg F Member Posts: 278

    I have a case in for Heraclitus for work on a data type. It'll probably be a couple of months before I'm able to get to it.

    I noticed that a data type has been added for Heraclitus' fragments. Now books can point to individual fragments (Fragmentum 15, for example), rather than just the book as a whole. A big improvement, thanks Kyle!

    However, I noticed that the data type is misspelled as "Heraclides", one of the legendary descendants of the Greek hero Heracles, and not "Heraclitus" as the philosopher's name is generally spelled (though it is "Heracleitus" in the Loeb edition).

    Sorry to nitpick, but it's pretty important... :)

  • Kyle G. Anderson
    Kyle G. Anderson Member, Logos Employee Posts: 2,218

    Greg F said:

    I have a case in for Heraclitus for work on a data type. It'll probably be a couple of months before I'm able to get to it.

    I noticed that a data type has been added for Heraclitus' fragments. Now books can point to individual fragments (Fragmentum 15, for example), rather than just the book as a whole. A big improvement, thanks Kyle!

    However, I noticed that the data type is misspelled as "Heraclides", one of the legendary descendants of the Greek hero Heracles, and not "Heraclitus" as the philosopher's name is generally spelled (though it is "Heracleitus" in the Loeb edition).

    Sorry to nitpick, but it's pretty important... :)

    Thanks Greg! I'm glad you noticed it. This is a weird case where we actually didn't create anything to solve the problem. We had a data type from Perseus data all along. I just happened to notice it.

    The good news is that data can be updated. Due to release cycles it might take a bit for the change to show but I can make it.

  • gregory barton
    gregory barton Member Posts: 4

    Well, it gets worse. The web site that sells this, Logos, has pictures of the volumes, publication dates, the name of the publishers, quotes from the publisher, the number of volumes you will receive, etc. Only in small print, in light faded grey text, does one read "this is a download". Alas, I ordered a "set" expecting to be shipped 166 volumes. I realized my mistake when I saw no shipping charge. That sounded suspicious, so I studied the page again--but it was clearly 166 volumes. I am a professor of history and have ordered for years from all over the world. Never a scam. Then at this same site I see to my horror, in small print, and in faded gray type (quite easy to miss because the rest of the page is in clear type) that "this is a download." Jeez. I have canceled my order and we shall see if I get a refund. But this is a site I will never, ever trust. Worse than any publisher I have seen. Why can't they be honest and upfront? At least as honest and upfront as the secular sites? If it's an ebook then, for heaven's sake, say it! Loud and clear! And cut the whole layout of information that looks like hard copy volumes. Be honest!

  • MJ. Smith
    MJ. Smith MVP Posts: 53,397

    Be honest!

    Given that it is a software firm that offers resources to work in that software, Faithlife is normally very clear on resources that are physical and require shipping. In my decade in the forums, I can assure you that Faithlife had no intent to fool you into thinking they provided a physical book. I understand that you would be frustrated at the miscommunication but I am confident that your refund will be processed promptly.

    But this is a site I will never, ever trust. Worse than any publisher I have seen.

    Please don't be so severe - it is a software firm not a publisher as most of the routes into the site make clear. You obviously entered the site from a route that bypassed the software aspect.

  • gregory barton
    gregory barton Member Posts: 4

    Dear Mj Smith,

    Thank you for your polite response. Yes, perhaps I have been severe. If I get a refund I will be happy. To explain, I did indeed enter the forum from a Google search, looking for Loeb library books. But to give me some credit, look at the website in question.

    https://www.logos.com/product/55968/loeb-classical-library-builder

    You can see that it certainly looks like physical volumes from top to bottom, except for the small print, in faded gray. The books are even listed as "owned", which would clearly indicate used books. I am sure coming from the outside made the web page all the more convincing that these were....well...books.

    Warm Regards. 

    PS: so you know I'm not a crank and that I order constantly, you can check my own books online under Gregory A. Barton and see that I would indeed order quite a bit for my own research. I am not easily mistaken on book ordering. I just think the page could be clearer. Thanks again for your kind response.

  • Michel Pauw
    Michel Pauw Member Posts: 564 ✭✭✭

    Greg F said:

    Is this the Iliad text you used? That one happens to be the only non-Bible-related Greek text I'm aware of that has complete morphological tagging done by a human being, which explains why you were finding precise morphological information and not just a list of different possibilities as per Augustine's Confessions.

    How do you know that that one is done by a human? Do you infer that from the quality of the tagging?

    Also, how would one go about it if somebody is interested in doing such manual morphological tagging?

    Dell XPS 17 9700, W11, 32GB, 1TB SSD, NVIDIA GeForce RTX 2060
    L5+L9+L10 Portfolio | Logos Max | Translator's Workplace

  • MJ. Smith
    MJ. Smith MVP Posts: 53,397

    Also, how would one go about it if somebody is interested in doing such manual morphological tagging?

    Usually one starts with parsing software and makes corrections.

    https://wiki.digitalclassicist.org/Morphological_parsing_or_lemmatising_Greek_and_Latin is a good starting point for tracking down software, methodologies, etc.

  • Michel Pauw
    Michel Pauw Member Posts: 564 ✭✭✭

    Interesting!

    However, I would like to know how to go about it if I have somebody / a team that would occasionally work on morphological tagging for Logos on Greek resources that we are actually using. Any thoughts?

  • Greg F
    Greg F Member Posts: 278

    How do you know that that one is done by a human? Do you infer that from the quality of the tagging?

    I seem to remember the project description talking about how this text was made, but it's true that now it's not clear that this text was edited "by hand" by someone.

    Below is from the introduction by John J. Jackson:


    Preface

    My purpose has been to produce an interlinear that is both inductive and deductive in approach, allowing the student to quickly read large amounts of Greek or Latin, and provide a platform for comprehensive study. I find the Libronix Library System to be perfectly suited for this task and am proud to make an addition to the library.
    John J. Jackson
    [...]

    The text of Monro and Allen’s 1920 edition of the Iliad contains a total of 111,862 words, generated from 21,418 unique wordforms. These, when lemmatized, yield 18,430 unique dictionary headwords.  
    Upon further analysis we find 12,766 unambiguous forms (those with only one morphological parse). Forms of this type were parsed by the use of an automatic parser thus reducing the corpus to 8,652 ambiguous wordforms. Additionally we find that wordform to lemma assignment exhibit fewer than 200 ambiguous relationships.
    In addition to an automatic parser, lexicon driven knowledge based systems were created to resolve areas not handled by the parser.

    So it would appear that automatic parsing was also used on this text, albeit with more human attention.
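
As a quick sanity check on the figures quoted from the preface, the arithmetic can be reproduced directly. The variable names below are my own; the numbers are the ones in the quoted text, and the reading that both counts refer to unique wordforms (not running words) is an assumption that the subtraction happens to support.

```python
# Figures quoted from the preface to the Iliad interlinear.
unique_wordforms = 21_418   # distinct wordforms in the text
unambiguous_forms = 12_766  # forms with only one morphological parse

# Forms left over after the automatic parser handled the unambiguous ones.
ambiguous_forms = unique_wordforms - unambiguous_forms
print(ambiguous_forms)  # 8652, matching the 8,652 stated in the preface

# Share of unique wordforms the automatic parser could settle on its own.
automatic_share = unambiguous_forms / unique_wordforms
print(f"{automatic_share:.1%}")  # 59.6%
```

So roughly three in five unique wordforms needed no human decision, and the remaining 8,652 are where the additional systems and editorial attention went.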