Bug: Multiple problems with the lemma חצב - very inconsistent results in BWS


Posts 13379
Forum MVP
Mark Barnes | Forum Activity | Posted: Fri, Sep 24 2010 3:44 AM | Locked

The lemma חצב seems to be infected with a bug. The screenshot below shows a BWS executed from different Hebrew texts (from Isaiah 5:2). Top right is from AFAT, bottom left from WHM 4.2, and bottom right from SESB 2.0.

There are six problems that I can see:

  1. AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.
  2. WHM/SESB (bottom left/right) have zero results in the spark graph, and in example uses.
  3. WIVU (not shown) has 17 results in the spark graph, but 25 in the Hebrew Bible section.
  4. The SESB BWS tab can close unexpectedly. If you choose the 2nd homograph and regenerate the study, then go back to the first homograph, the tab closes. (This is because the second homograph isn't actually supported by SESB, so it switches to AFAT morphology. Then, when I switch back to the first homograph, it sticks with AFAT, but I already have an AFAT tab open to that homograph, so it closes the current tab and switches to that one.)
  5. If I have an AFAT BWS open, I can't run a WIVU BWS (and vice versa). But they're different morphologies, so I should be able to.
  6. Also, dragging a BWS to a tab from the right-click menu gives me a blank report.

There's also the question of the widely variant results (16-25), though I assume this is because SESB doesn't have homographs here.

Posts 25264
Forum MVP
Dave Hooton | Forum Activity | Replied: Fri, Sep 24 2010 7:15 AM | Locked

I found this to be a minefield of contradictions when looking at the OP in http://community.logos.com/forums/t/23496.aspx, so I'm glad you took it up!

Mark Barnes:
AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.

Same result if you execute BWS from Word-by-Word in Exegetical Guide on Is 5:2 using AFAT or ESV OT RI.

Dave
===

Windows 10 & Android 8

Posts 4077
Melissa Snyder | Forum Activity | Replied: Fri, Sep 24 2010 9:15 AM | Locked

I've submitted your report to development. Thanks.

Posts 912
David Knoll | Forum Activity | Replied: Fri, Sep 24 2010 10:05 AM | Locked

1) Bible Word Study in WIVU and AFAT return the right results in the Hebrew Bible Section (i.e. according to tagging of the relevant databases). But it always uses morphology similar to WHM 4.2 for the spark graph.

2) The results when you do a search with WIVU or AFAT (when clicking on the link in the Hebrew Bible section for instance) in analysis view are wrong: The verse that is displayed as 1Ki 5:15 is actually 1Ki 5:29 (also in other hits).

3) AFAT tagging is wrong. They split the results between two homonyms for no apparent reason. I see no difference between 1Ki 5:29 and 2Ki 12:13. Neither do BDB or HALOT. This seems to be an error.

4) The lemma chosen by AFAT BWS is correct: the occurrences where AFAT have for some reason tagged חצב 2 do not fit the homonym suggested by HALOT. But this is merely a coincidence: the Lemma section uses WHM 4.2 no matter where the BWS is triggered from. This can be proven by opening a BWS in Ps 29:7, where WHM follows HALOT in tagging חֹצֵב as חצב 2 and other databases do not.

Posts 433
Vincent Setterholm | Forum Activity | Replied: Fri, Sep 24 2010 12:33 PM | Locked

Mark Barnes:
AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.

I'm guessing you're getting that result by typing in the lemma, not by right clicking the word in context in Is 5:2 and launching directly from the text. Launching from the text will get what you'd expect for the KeyLink section. It sometimes happens that a single word has more than one destination in a given lexicon, and if you just type in the word, one of the KeyLink destinations is chosen at "random". It isn't actually random: it is hard coded somewhere, but the best guess is produced by a script that Development wrote, and the logic for which one is chosen isn't always obvious to the end user. Maybe they could do a better job of always selecting the most common match for a given multi-target lemma, but if you launch the BWS from a right click menu, you'll get context sensitive KeyLink destinations that will be consistently more correct for the text you're studying.

Mark Barnes:
WHM/SESB (bottom left/right) have zero results in the spark graph, and in example uses.

Much of the data for BWS comes from the Andersen-Forbes database, regardless of where you start out - the exception is the Textual Searches section, which is specific to the version you start from. Since the lexical analysis of each database varies quite a bit, there is a bit of code written by one of the developers that makes the conversion between one database and another. The SESB is producing zero hits and a fairly empty report on many words, which means that it isn't properly hooked into this system. This feature is generally working with WHM, but might have problems on some words (like this one) where the relationship between the analysis of the two databases is complex (many lemmas mapping to many lemmas). We'll look into this.

Mark Barnes:
WIVU (not shown) has 17 results in the spark graph, but 25 in the Hebrew Bible section.

The differing hit count with WIVU is basically because WIVU analyzes חֹצֵב as a participle instead of as a separate noun - a perfectly defensible position, even if it differs from the others. In Hebrew, many nouns are derived from verbs. But Hebrew verbs have participle forms which can function like nouns. So sometimes you can debate whether a participle should be grouped with the verb or broken into a separate entry as a noun.
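
A minimal sketch of why the counts diverge, using made-up tokens rather than real database tagging. The same three occurrences are counted once under a "split" analysis (the participle form listed as its own noun lemma) and once under a "merged" analysis (the participle grouped under the verb). The labels `verb_lemma` and `noun_lemma`, the references, and the helper `count_by_lemma` are all hypothetical placeholders.

```python
from collections import Counter

# Hypothetical tokens: (reference, lemma under split analysis, lemma under merged analysis).
TOKENS = [
    ("ref-1", "verb_lemma", "verb_lemma"),
    ("ref-2", "noun_lemma", "verb_lemma"),
    ("ref-3", "noun_lemma", "verb_lemma"),
]


def count_by_lemma(tokens, column):
    """Count occurrences per lemma, reading the lemma from the given tuple column."""
    return Counter(token[column] for token in tokens)


# Same text, two analyses: the hit count for the verb depends on where participles go.
split_counts = count_by_lemma(TOKENS, 1)
merged_counts = count_by_lemma(TOKENS, 2)
print(split_counts["verb_lemma"], merged_counts["verb_lemma"])
```

Neither count is wrong in this sketch; they answer slightly different questions about the same words.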

Posts 912
David Knoll | Forum Activity | Replied: Fri, Sep 24 2010 1:04 PM | Locked

Please note the fourth and fifth rows in this search. The reference does not match the verse.

Posts 433
Vincent Setterholm | Forum Activity | Replied: Fri, Sep 24 2010 1:08 PM | Locked

David Knoll:
Bible Word Study in WIVU and AFAT return the right results in the Hebrew Bible Section (i.e. according to tagging of the relevant databases). But it always uses morphology similar to WHM 4.2 for the spark graph.

I'm pretty sure the spark graph info comes from the reverse interlinear data, which is based on AFAT.

David Knoll:
The results when you do a search with WIVU or AFAT (when clicking on the link in the Hebrew Bible section for instance) in analysis view are wrong: The verse that is displayed as 1Ki 5:15 is actually 1Ki 5:29 (also in other hits).

Actually, this is a place where the Hebrew verse numbering is different from the English, so this is behaving correctly. (Edit: my bad, I hadn't read carefully that this was in analysis view, not BWS. Yes, the conversion to the default Bible data type when searching the Hebrew Bible looks like a bug to me.)
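
For reference, the numbering difference at this point in 1 Kings can be sketched as below. The function name is hypothetical, and the sketch covers only the chapter 4/5 boundary, where Hebrew 5:1-14 corresponds to English 4:21-34 and Hebrew 5:15-32 to English 5:1-18. Every other reference is passed through unchanged, which is a simplification and not a full versification map.

```python
def hebrew_to_english_1kings(chapter: int, verse: int) -> tuple[int, int]:
    """Map a Hebrew 1 Kings reference to English numbering (chapter 4/5 boundary only)."""
    if chapter == 5:
        if 1 <= verse <= 14:
            # Hebrew 5:1-14 is the tail of English chapter 4.
            return (4, verse + 20)
        if 15 <= verse <= 32:
            # Hebrew 5:15-32 is English chapter 5.
            return (5, verse - 14)
    # Assumption: no offset elsewhere (not true for every verse in the book).
    return (chapter, verse)


# The case from this thread: Hebrew 1Ki 5:29 shows up as English 1Ki 5:15.
print(hebrew_to_english_1kings(5, 29))
```

A results list that prints the English number next to Hebrew text would look exactly like the mismatch reported above.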

David Knoll:
But this is merely a coincidence: the Lemma section uses WHM 4.2 no matter where the BWS is triggered from.

The lemma section (and KeyLinking in general) is not based on the WHM 4.2. For most of the lexicons (not counting Strong's-based resources) we have custom KeyLink tables designed to provide a good look-up no matter how a database spells a word or what homograph numbers (if any) are present. This means that KeyLinking isn't based on a database, it is based on the analysis of the lexicon itself. So in some cases, you'll navigate to the "right" entry in the lexicon, even if that clashes with the alternate analysis of the database you are using. This method has a number of significant advantages in terms of maintainability (we don't have to re-ship every lexicon every time a database changes its analysis on one word - databases change more often than lexicons), the time it takes to create KeyLink tables (we can create one table for the entire Hebrew Bible, not one table for every database), the ability to add new Hebrew databases without needing to create a new KeyLink table for every lexicon, the ability to compare different databases (we can index based on the surface form instead of the 'lemma' so that the same word in two databases can be lined up even if they disagree on the analysis) - besides the obvious benefit of respecting the analysis of the lexicon (which is a benefit not because the lexicon is always right when there is a conflict with the database, but rather because if you follow the database where it disagrees with the lexicon, you're likely to miss notes specific to the verse you are studying because you're not looking at the article that treats your verse).
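
As a rough illustration of the idea only, and not a description of how the KeyLink tables are actually built or stored, a lookup keyed on the verse and surface form rather than on any one database's lemma might look like the sketch below. All table contents, entry names, and the `keylink` function are hypothetical.

```python
from typing import Optional

# Hypothetical context-sensitive table: (reference, surface form) -> lexicon entry.
# It follows the lexicon's own analysis, so no database homograph number is needed.
KEYLINKS_BY_CONTEXT = {
    ("Isa 5:2", "surface-form-1"): "lexicon-entry-A",
    ("1Ki 5:29", "surface-form-2"): "lexicon-entry-B",
}

# Hypothetical fallback for a typed-in lemma with more than one possible destination.
DEFAULT_BY_LEMMA = {"lemma-X": "lexicon-entry-A"}


def keylink(lemma: str, reference: Optional[str] = None,
            surface: Optional[str] = None) -> Optional[str]:
    """Prefer the context-sensitive destination; fall back to the lemma default."""
    if reference is not None and surface is not None:
        hit = KEYLINKS_BY_CONTEXT.get((reference, surface))
        if hit is not None:
            return hit
    return DEFAULT_BY_LEMMA.get(lemma)


# Launched from the text: the verse decides the destination.
print(keylink("lemma-X", "1Ki 5:29", "surface-form-2"))
# Typed in with no context: the default destination is used.
print(keylink("lemma-X"))
```

In this sketch, adding another Hebrew database needs no new table, because nothing in the lookup depends on how that database lemmatizes the word.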

Posts 13379
Forum MVP
Mark Barnes | Forum Activity | Replied: Fri, Sep 24 2010 1:53 PM | Locked

Vincent. Thanks for dropping by. I almost emailed you and asked that you would comment, so thanks for being alert to this post.

Vincent Setterholm:

Mark Barnes:
AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.

I'm guessing you're getting that result by typing in the lemma, not by right clicking the word in context in Is 5:2 and launching directly from the text.

Actually, no. I launched it from AFAT 1 Kings 5:29. (It seems to work correctly from other locations, but not from there.) So I think it is a bug.

Posts 912
David Knoll | Forum Activity | Replied: Fri, Sep 24 2010 1:56 PM | Locked

Vincent Setterholm:
I'm pretty sure the spark graph info comes from the reverse interlinear data, which is based on AFAT.

Yep, checked that. But if you hover on the number of hits you get the explanation: "Number of hits in Book/ Number of words in book". This is actually wrong, because AFAT has 17 hits in 16 verses for that lemma (and the BWS graph reads 16), while WHM has 16 results in 14 verses. So I think either the explanation which appears when you hover on the number needs to be changed or the way the number is generated needs to be changed.
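
The distinction between hits and verses can be shown with a minimal sketch. The references below are invented for illustration and the function name is hypothetical; the only point is that a verse containing the lemma twice adds two hits but one verse.

```python
def count_hits_and_verses(hit_references):
    """Return (number of hits, number of distinct verses containing a hit)."""
    return len(hit_references), len(set(hit_references))


# Invented example: one verse contains the lemma twice.
hits = ["Book 1:1", "Book 1:1", "Book 2:5"]
hit_count, verse_count = count_hits_and_verses(hits)
print(hit_count, verse_count)
```

A graph labelled as hits but showing the second number would be off by exactly the number of repeated verses.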

Why can't I get the relevant data in the graph from where the BWS was generated? (If I right click from WIVU I should get data from WIVU...)

Vincent Setterholm:
The lemma section (and KeyLinking in general) is not based on the WHM 4.2. For most of the lexicons (not counting Strong's-based resources) we have custom KeyLink tables designed to provide a good look-up no matter how a database spells a word or what homograph numbers (if any) are present. This means that KeyLinking isn't based on a database, it is based on the analysis of the lexicon itself. So in some cases, you'll navigate to the "right" entry in the lexicon, even if that clashes with the alternate analysis of the database you are using. This method has a number of significant advantages in terms of maintainability (we don't have to re-ship every lexicon every time a database changes its analysis on one word - databases change more often than lexicons), the time it takes to create KeyLink tables (we can create one table for the entire Hebrew Bible, not one table for every database), the ability to add new Hebrew databases without needing to create a new KeyLink table for every lexicon, the ability to compare different databases (we can index based on the surface form instead of the 'lemma' so that the same word in two databases can be lined up even if they disagree on the analysis) - besides the obvious benefit of respecting the analysis of the lexicon (which is a benefit not because the lexicon is always right when there is a conflict with the database, but rather because if you follow the database where it disagrees with the lexicon, you're likely to miss notes specific to the verse you are studying because you're not looking at the article that treats your verse).

I am impressed! How do you analyse HALOT for instance? Is it machine generated or was there a linguist who actually read every definition and matched it to the appropriate occurrences? I mean not every lemma in HALOT ends with a cross, so you actually have to decide for each word to what homonym it fits best. I am only curious as to how this is technically achieved.

Another thing: One of the advantages of Logos is that you get several morphologies (AFAT, WHM, WIVU, and hopefully the much anticipated Richter) but you actually have to compare them manually. That means searching for the lemma in each database and then comparing the results line by line. Couldn't you offer us a tool that would make the comparison easier? Like highlighting where WHM and WIVU differ? It would be very useful, not so much for the obvious questions like if a participle is a verb or a noun but for the analysis of the word מלתעות for instance.

Posts 13379
Forum MVP
Mark Barnes | Forum Activity | Replied: Fri, Sep 24 2010 1:56 PM | Locked

Vincent Setterholm:

David Knoll:
Bible Word Study in WIVU and AFAT return the right results in the Hebrew Bible Section (i.e. according to tagging of the relevant databases). But it always uses morphology similar to WHM 4.2 for the spark graph.

I'm pretty sure the spark graph info comes from the reverse interlinear data, which is based on AFAT.

That might be what's happening, but it shouldn't be, should it? Why a reverse interlinear in the lemma section? That's a bug too, isn't it?

Let me make sure I've understood everything (and what's still outstanding):

  • Not a bug: AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.
  • Confirmed problem with lemma mapping: WHM/SESB (bottom left/right) have zero results in the spark graph, and in example uses.
  • Debatable bug (spark graph uses RI): WIVU (not shown) has 17 results in the spark graph, but 25 in the Hebrew Bible section.
  • Not yet commented on: The SESB BWS tab can close unexpectedly. If you choose the 2nd homograph and regenerate the study, then go back to the first homograph, the tab closes. (This is because the second homograph isn't actually supported by SESB, so it switches to AFAT morphology. Then, when I switch back to the first homograph, it sticks with AFAT, but I already have an AFAT tab open to that homograph, so it closes the current tab and switches to that one.)
  • Not yet commented on: If I have an AFAT BWS open, I can't run a WIVU BWS (and vice versa). But they're different morphologies, so I should be able to.
  • Not yet commented on: Also, dragging a BWS to a tab from the right-click menu gives me a blank report.

Posts 912
David Knoll | Forum Activity | Replied: Fri, Sep 24 2010 2:04 PM | Locked

Mark Barnes:
Actually, no. I launched it from AFAT 1 Kings 5:29. (It seems to work correctly from other locations, but not from there.) So I think it is a bug.

I think it is not a bug because, as Mr Setterholm said, they follow the lexicon, and HALOT analyses this participle as a Nomen Agentis and references this verse.

Posts 13379
Forum MVP
Mark Barnes | Forum Activity | Replied: Fri, Sep 24 2010 2:08 PM | Locked

David Knoll:
Couldn't you offer us a tool that would make the comparison easier? Like highlighting where WHM and WIVU differ?

That would be a great use for the Bible comparison tool. The differences in morphology structure would limit the comparison, but it ought to be pretty easy to highlight situations where one morphology showed a verb, and the other a noun. It could be another option on the dropdown, 'ignore morphology', turned on by default.

Posts 13379
Forum MVP
Mark Barnes | Forum Activity | Replied: Fri, Sep 24 2010 2:12 PM | Locked

David Knoll:

Mark Barnes:
Actually, no. I launched it from AFAT 1 Kings 5:29. (It seems to work correctly from other locations, but not from there.) So I think it is a bug.

I think it is not a bug because, as Mr Setterholm said, they follow the lexicon, and HALOT analyses this participle as a Nomen Agentis and references this verse.

Yes, you're right. Logos is being too clever for me here! That's pretty impressive to locate the correct place in the lexicon even when the morphology points somewhere else.

Posts 433
Vincent Setterholm | Forum Activity | Replied: Fri, Sep 24 2010 2:20 PM | Locked

Mark Barnes:
Unconfirmed possible bug: AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report

I don't know if I'd call this a bug, since 'right' homograph is context dependent. AFAT has a one-to-many relationship on this particular word with regard to HALOT, so if you want the right instance for Isaiah 5:2, you need to launch the BWS from Is 5:2, not by just typing in the lemma. I think we could improve this a bit by making sure that the most common destination is the default for typed-in lemmas, but even if/when we implement that, you'll still get better results launching the report from the right-click menu.

Mark Barnes:
Why a reverse interlinear in the lemma section?

I think the reason the sparkline is coming from the RI is because Dev wanted to reuse the same code that generates the sparkline in the Translation section, but I agree that it would be more intuitive for the lemma section to reflect the database you are searching both in hit counts and in the order of books - I proposed this back when the feature was first designed. If this behavior is changed, there will then be a disconnect between the hit counts in the lemma section and the hit counts in the translation section if you're coming from anything not based on AF, since the translation data has to come from the RI data. So it won't make all confusion disappear - that's just a price we pay for letting you get at translation wheels and grammatical relationship data from any source (at least, theoretically), so that folk who prefer a different database can still get information from the AF.

I didn't comment on the tab behavior because that's way outside my area of expertise.
