Bug: Multiple problems with the lemma חצב - very inconsistent results in BWS

The lemma חצב seems to be infected with a bug. The screenshot below shows a BWS executed from different Hebrew texts (from Isaiah 5:2). Top right is from AFAT, bottom left from WHM 4.2, and bottom right from SESB 2.0.

There are six problems that I can see:

AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.
WHM/SESB (bottom left/right) have zero results in the spark graph, and in example uses.
WIVU (not shown) has 17 results in the spark graph, but 25 in the Hebrew Bible section.
The SESB BWS tab can close unexpectedly. If you choose the 2nd homograph and regenerate the study, then go back to the first homograph, the tab closes. (This is because the second homograph isn't actually supported by SESB, so it switches to AFAT morphology. Then when I switch back to the first homograph, it sticks with AFAT, but I already have an AFAT tab open to that homograph, so it closes the current tab and switches to that one.)
If I have an AFAT BWS open, I can't run a WIVU BWS (and vice versa). But they're different morphologies, so I should be able to.
Also, dragging a BWS to a tab from the right-click menu gives me a blank report.

There's also the question of the widely variant results (16-25), though I assume this is because SESB doesn't have homographs here.

Comments

Dave Hooton

I found this to be a minefield of contradictions when looking at the OP in http://community.logos.com/forums/t/23496.aspx, so I'm glad you took it up!

AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.

Same result if you execute BWS from Word-by-Word in Exegetical Guide on Is 5:2 using AFAT or ESV OT RI.

Melissa Snyder

I've submitted your report to development. Thanks.

David Knoll

1) Bible Word Study in WIVU and AFAT return the right results in the Hebrew Bible Section (i.e. according to tagging of the relevant databases). But it always uses morphology similar to WHM 4.2 for the spark graph.

2) The results when you do a search with WIVU or AFAT (When clicking on the link in the Hebrew Bible section for instance) in analysis view are wrong: The verse that is displayed as 1Ki 5:15 is actually 1Ki 5:29 (also in other hits).

3) AFAT tagging is wrong. They split the results between two homonyms for no apparent reason. I see no difference between 1Ki 5:29 and 2Ki 12:13. Neither do BDB or HALOT. This seems to be an error.

4) The lemma chosen by AFAT BWS is correct: The occurrences where AFAT has for some reason tagged חצב 2 do not fit the homonym suggested by HALOT. But this is merely a coincidence: the Lemma section uses WHM 4.2 no matter where the BWS is triggered from. This can be proven by opening a BWS in Ps 29:7, where WHM follows HALOT in tagging חֹצֵב as חצב 2 and other databases do not.

Vincent Setterholm

AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.

I'm guessing you're getting that result by typing in the lemma, not by right clicking the word in context in Is 5:2 and launching directly from the text. Launching from the text will get what you'd expect for the KeyLink section. It sometimes happens that a single word has more than one destination in a given lexicon, and if you just type in the word, one of the KeyLink destinations is chosen at "random". (It isn't actually random; it is hard coded somewhere, but the best guess is produced by a script that Development wrote, and the logic for which one is chosen isn't always obvious to the end user. Maybe they could do a better job of always selecting the most common match for a given multi-target lemma, but if you launch the BWS from a right-click menu, you'll get context-sensitive KeyLink destinations that will be consistently more correct for the text you're studying.)

WHM/SESB (bottom left/right) have zero results in the spark graph, and in example uses.

Much of the data for BWS comes from the Andersen-Forbes database, regardless of where you start out - the exception is the Textual Searches section, which is specific to the version you start from. Since the lexical analysis of each database varies quite a bit, there is a bit of code written by one of the developers that makes the conversion between one database and another. The SESB is producing zero hits and a fairly empty report on many words, which means that it isn't properly hooked into this system. This feature is generally working with WHM, but might have problems on some words (like this one) where the relationship between the analysis of the two databases is complex (many lemmas mapping to many lemmas). We'll look into this.
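
As a rough illustration of the "many lemmas mapping to many lemmas" conversion described above, here is a minimal Python sketch. It is not the Logos implementation: the table layout, the function names and the lemma ids (`hsb-1`, `hsb-2`) are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical mapping rows: (source db, source lemma id, target db, target lemma id).
# The ids are placeholders for illustration, not real tagging data.
MAPPING_ROWS = [
    ("WHM", "hsb-1", "AFAT", "hsb-1"),
    ("WHM", "hsb-1", "AFAT", "hsb-2"),
    ("WHM", "hsb-2", "AFAT", "hsb-1"),
]


def build_mapping(rows):
    """Index the rows so one source lemma can resolve to several target lemmas."""
    mapping = defaultdict(set)
    for src_db, src_lemma, dst_db, dst_lemma in rows:
        mapping[(src_db, src_lemma, dst_db)].add(dst_lemma)
    return mapping


def convert_lemma(mapping, src_db, src_lemma, dst_db):
    """Return all target lemmas, sorted; an empty list means no mapping exists."""
    return sorted(mapping.get((src_db, src_lemma, dst_db), set()))


mapping = build_mapping(MAPPING_ROWS)
print(convert_lemma(mapping, "WHM", "hsb-1", "AFAT"))   # one lemma maps to two
print(convert_lemma(mapping, "SESB", "hsb-1", "AFAT"))  # database not hooked in
```

In this sketch, a database with no rows in the table simply yields an empty list, which is one way a report could end up with zero hits.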

WIVU (not shown) has 17 results in the spark graph, but 25 in the Hebrew Bible section.

The differing hit count with WIVU is basically because WIVU analyzes חֹצֵב as a participle instead of as a separate noun - a perfectly defensible position, even if it differs from the others. In Hebrew, many nouns are derived from verbs. But Hebrew verbs have participle forms which can function like nouns. So sometimes you can debate whether a participle should be grouped with the verb or broken out into a separate entry as a noun.
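
To show how that one analytical choice changes the counts, here is a small Python sketch. The tokens, surface forms and lemma labels are invented for illustration and do not come from any of the databases.

```python
from collections import Counter

# Toy tokens as (reference, surface form); illustrative only.
TOKENS = [
    ("Ref 1:1", "finite-form"),
    ("Ref 1:2", "participle-form"),
    ("Ref 2:1", "finite-form"),
]

# Analysis A groups the participle with the verb lemma.
ANALYSIS_A = {"finite-form": "verb-lemma", "participle-form": "verb-lemma"}
# Analysis B breaks the participle out as a separate noun lemma.
ANALYSIS_B = {"finite-form": "verb-lemma", "participle-form": "noun-lemma"}


def hits_per_lemma(tokens, analysis):
    """Count occurrences per lemma under a given surface-form-to-lemma analysis."""
    return Counter(analysis[surface] for _, surface in tokens)


print(hits_per_lemma(TOKENS, ANALYSIS_A))
print(hits_per_lemma(TOKENS, ANALYSIS_B))
```

The same three tokens give three hits for the verb under analysis A but only two under analysis B, with the third hit filed under the noun.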

David Knoll

https://community.logos.com/discussion/comment/175306#Comment_175306

Please note the fourth and fifth rows in this search. The reference does not match the verse.

Vincent Setterholm

https://community.logos.com/discussion/comment/175267#Comment_175267

Bible Word Study in WIVU and AFAT return the right results in the Hebrew Bible Section (i.e. according to tagging of the relevant databases). But it always uses morphology similar to WHM 4.2 for the spark graph.

I'm pretty sure the spark graph info comes from the reverse interlinear data, which is based on AFAT.

The results when you do a search with WIVU or AFAT (When clicking on the link in the Hebrew Bible section for instance) in analysis view are wrong: The verse that is displayed as 1Ki 5:15 is actually 1Ki 5:29 (also in other hits).

Actually, this is a place where the Hebrew verse numbering is different from the English, so this is behaving correctly. (Edit: my bad, I hadn't read carefully that this was in analysis view, not BWS. Yes, the conversion to the default Bible data type when searching the Hebrew Bible looks like a bug to me.)
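
The numbering difference in question can be sketched in Python as follows. This is an illustration only, not how Logos converts references: the function name is hypothetical, and only the 1 Kings 5 offset (Hebrew 5:1-14 = English 4:21-34, Hebrew 5:15-32 = English 5:1-18) is handled.

```python
def hebrew_to_english_1kings(chapter, verse):
    """Convert a 1 Kings reference from Hebrew to English verse numbering.

    Only the chapter 5 offset is handled. Other chapters are passed through
    unchanged purely to keep the sketch short; a real conversion table would
    need to cover every divergence between the two numbering schemes.
    """
    if chapter != 5:
        return chapter, verse
    if verse <= 14:
        # Hebrew 5:1-14 corresponds to English 4:21-34.
        return 4, verse + 20
    # Hebrew 5:15-32 corresponds to English 5:1-18.
    return 5, verse - 14


print(hebrew_to_english_1kings(5, 29))
```

If the converted reference is shown next to verse text that is still in Hebrew numbering, the label and the text disagree, which is the mismatch reported for analysis view.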

But this is merely a coincidence the Lemma section uses WHM 4.2 no matter where the BWS is triggered from.

The lemma section (and KeyLinking in general) is not based on the WHM 4.2. For most of the lexicons (not counting Strong's-based resources) we have custom KeyLink tables designed to provide a good look-up no matter how a database spells a word or what homograph numbers (if any) are present. This means that KeyLinking isn't based on a database, it is based on the analysis of the lexicon itself. So in some cases, you'll navigate to the "right" entry in the lexicon, even if that clashes with the alternate analysis of the database you are using. This method has a number of significant advantages in terms of maintainability (we don't have to re-ship every lexicon every time a database changes its analysis on one word - databases change more often than lexicons), the time it takes to create KeyLink tables (we can create one table for the entire Hebrew Bible, not one table for every database), the ability to add new Hebrew databases without needing to create a new KeyLink table for every lexicon, the ability to compare different databases (we can index based on the surface form instead of the 'lemma' so that the same word in two databases can be lined up even if they disagree on the analysis) - besides the obvious benefit of respecting the analysis of the lexicon (which is a benefit not because the lexicon is always right when there is a conflict with the database, but rather because if you follow the database where it disagrees with the lexicon, you're likely to miss notes specific to the verse you are studying because you're not looking at the article that treats your verse).
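
A minimal Python sketch of the idea of indexing on the surface form in context, rather than on a database's lemma, might look like this. The table contents, keys and article names are hypothetical and are not taken from any actual KeyLink table.

```python
# Hypothetical KeyLink table keyed by (reference, surface form),
# not by any database's lemma or homograph number.
KEYLINK_TABLE = {
    ("Ref 1:1", "surface-x"): "lexicon-article-I",
    ("Ref 2:2", "surface-x"): "lexicon-article-II",
}


def keylink_target(reference, surface_form):
    """Return the lexicon article for a word in context, or None if unlisted."""
    return KEYLINK_TABLE.get((reference, surface_form))


# Two databases tag the same word with different lemmas...
token_in_db_a = {"ref": "Ref 2:2", "surface": "surface-x", "lemma": "lemma-1"}
token_in_db_b = {"ref": "Ref 2:2", "surface": "surface-x", "lemma": "lemma-2"}

# ...but the lookup ignores the lemma, so both reach the same article.
for token in (token_in_db_a, token_in_db_b):
    print(keylink_target(token["ref"], token["surface"]))
```

Because the key never includes the database lemma, one table serves every database, and a change in a database's analysis does not require rebuilding it.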

Mark Barnes

https://community.logos.com/discussion/comment/175306#Comment_175306

Vincent, thanks for dropping by. I almost emailed you and asked that you would comment, so thanks for being alert to this post.

AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.

I'm guessing you're getting that result by typing in the lemma, not by right clicking the word in context in Is 5:2 and launching directly from the text.

Actually, no. I launched it from AFAT 1 Kings 5:29. (It seems to work correctly from other locations, but not from there.) So I think it is a bug.

David Knoll

https://community.logos.com/discussion/comment/175313#Comment_175313

I'm pretty sure the spark graph info comes from the reverse interlinear data, which is based on AFAT.

Yep, checked that. But if you hover on the number of hits you get the explanation "Number of hits in Book / Number of words in book". This is actually wrong, because AFAT has 17 hits in 16 verses for that lemma (and the BWS graph reads 16), while WHM has 16 results in 14 verses. So I think either the explanation that appears when you hover on the number needs to be changed, or the way the number is generated needs to be changed.
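
The gap between the two figures is the difference between counting hits and counting distinct verses, which a short Python sketch makes concrete. The references below are invented; they are not the actual hit list.

```python
def hit_and_verse_counts(hit_references):
    """Return (number of hits, number of distinct verses containing a hit)."""
    return len(hit_references), len(set(hit_references))


# Illustrative hit list: one verse contains the lemma twice.
HITS = ["Ref 1:1", "Ref 1:1", "Ref 1:2", "Ref 2:5"]

hits, verses = hit_and_verse_counts(HITS)
print(hits, verses)
```

A graph labelled "number of hits" that actually reports the second figure would undercount whenever a verse contains the lemma more than once.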

Why can't I get the relevant data in the graph from where the BWS was generated? (If I right click from WIVU I should get data from WIVU...)

The lemma section (and KeyLinking in general) is not based on the WHM 4.2. For most of the lexicons (not counting Strong's-based resources) we have custom KeyLink tables designed to provide a good look-up no matter how a database spells a word or what homograph numbers (if any) are present. This means that KeyLinking isn't based on a database, it is based on the analysis of the lexicon itself. So in some cases, you'll navigate to the "right" entry in the lexicon, even if that clashes with the alternate analysis of the database you are using. This method has a number of significant advantages in terms of maintainability (we don't have to re-ship every lexicon every time a database changes its analysis on one word - databases change more often than lexicons), the time it takes to create KeyLink tables (we can create one table for the entire Hebrew Bible, not one table for every database), the ability to add new Hebrew databases without needing to create a new KeyLink table for every lexicon, the ability to compare different databases (we can index based on the surface form instead of the 'lemma' so that the same word in two databases can be lined up even if they disagree on the analysis) - besides the obvious benefit of respecting the analysis of the lexicon (which is a benefit not because the lexicon is always right when there is a conflict with the database, but rather because if you follow the database where it disagrees with the lexicon, you're likely to miss notes specific to the verse you are studying because you're not looking at the article that treats your verse).

I am impressed! How do you analyse HALOT, for instance? Is it machine generated, or was there a linguist who actually read every definition and matched it to the appropriate occurrences? I mean, not every lemma in HALOT ends with a cross, so you actually have to decide for each word which homonym it fits best. I am only curious as to how this is technically achieved.

Another thing: One of the advantages of Logos is that you get several morphologies (AFAT, WHM, WIVU, and hopefully the much anticipated Richter), but you actually have to compare them manually. That means searching for the lemma in each database and then comparing the results line by line. Couldn't you offer us a tool that would make the comparison easier? Like highlighting where WHM and WIVU differ? It would be very useful, not so much for the obvious questions like whether a participle is a verb or a noun, but for the analysis of the word מלתעות for instance.

Mark Barnes

https://community.logos.com/discussion/comment/175313#Comment_175313

Bible Word Study in WIVU and AFAT return the right results in the Hebrew Bible Section (i.e. according to tagging of the relevant databases). But it always uses morphology similar to WHM 4.2 for the spark graph.

I'm pretty sure the spark graph info comes from the reverse interlinear data, which is based on AFAT.

That might be what's happening, but it shouldn't be, should it? Why a reverse interlinear in the lemma section? That's a bug too, isn't it?

Let me make sure I've understood everything and what's still outstanding:

Not a bug: AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report.
Confirmed problem with lemma mapping: WHM/SESB (bottom left/right) have zero results in the spark graph, and in example uses.
Debatable bug (spark graph uses RI): WIVU (not shown) has 17 results in the spark graph, but 25 in the Hebrew Bible section.
Not yet commented on: The SESB BWS tab can close unexpectedly. If you choose the 2nd homograph and regenerate the study, then go back to the first homograph, the tab closes. (This is because the second homograph isn't actually supported by SESB, so it switches to AFAT morphology. Then I switch back to the first homograph it sticks with AFAT, but I already have an AFAT tab open to that homograph, so it closes the current tab and switches to that one.)
Not yet commented on: If I have an AFAT BWS, open, I can't run a WIVU BHS (and vice versa). But they're different morphologies, so I should be able to.
Not yet commented on: Also dragging a BWS to a tab from the right-click menu gives me a blank report.

David Knoll

https://community.logos.com/discussion/comment/175325#Comment_175325

Actually, no. I launched it from AFAT 1 Kings 5:29. (It seems to work correctly from other locations, but not from there.) So I think it is a bug

I think it is not a bug because as Mr Setterholm said they follow the lexicon and HALOT analyses this participle as a Nomen Agentis and references this verse.

Mark Barnes

https://community.logos.com/discussion/comment/175327#Comment_175327

Couldn't you offer us a tool that would make the comparison easier? Like highlighting where WHM and WIVU differ?

That would be a great use for the Bible comparison tool. The differences in morphology structure would limit the comparison, but it ought to be pretty easy to highlight situations where one morphology showed a verb, and the other a noun. It could be another option on the dropdown 'ignore morphology', turned on by default.

Mark Barnes

https://community.logos.com/discussion/comment/175333#Comment_175333

Actually, no. I launched it from AFAT 1 Kings 5:29. (It seems to work correctly from other locations, but not from there.) So I think it is a bug

I think it is not a bug because as Mr Setterholm said they follow the lexicon and HALOT analyses this participle as a Nomen Agentis and references this verse.

Yes, you're right. Logos is being too clever for me here! That's pretty impressive to locate the correct place in the lexicon even when the morphology points somewhere else.

Vincent Setterholm

https://community.logos.com/discussion/comment/175328#Comment_175328

Unconfirmed possible bug: AFAT (top right) has selected the wrong homograph for lexicon lookup, though the correct homograph for everything else in the report

I don't know if I'd call this a bug, since the 'right' homograph is context dependent. AFAT has a one-to-many relationship on this particular word with regard to HALOT, so if you want the right instance for Isaiah 5:2, you need to launch the BWS from Is 5:2, not by just typing in the lemma. I think we could improve this a bit by making sure that the most common destination is the default for typed-in lemmas, but even if/when we implement that, you'll still get better results launching the report from the right-click menu.
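
The proposed improvement (context-sensitive destination when launched from the text, most common destination when the lemma is typed in) could be sketched in Python like this. The data and function name are hypothetical; this is not the script Development wrote.

```python
from collections import Counter

# Hypothetical per-occurrence destinations for one multi-target lemma.
OCCURRENCE_TARGETS = {
    "Ref 1:1": "article-I",
    "Ref 1:2": "article-I",
    "Ref 2:1": "article-II",
}


def resolve_target(occurrence_targets, reference=None):
    """Pick a lexicon destination for a lemma.

    With a known reference (launched from the text), use that occurrence's
    destination. Otherwise (lemma typed in), fall back to the most common one.
    """
    if reference is not None and reference in occurrence_targets:
        return occurrence_targets[reference]
    counts = Counter(occurrence_targets.values())
    return counts.most_common(1)[0][0]


print(resolve_target(OCCURRENCE_TARGETS))             # typed-in lemma
print(resolve_target(OCCURRENCE_TARGETS, "Ref 2:1"))  # launched from the text
```

Even with this default, the typed-in case can only guess, so launching from the word in context remains the more reliable route.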

Why a reverse interlinear in the lemma section?

I think the reason the sparkline is coming from the RI is because Dev wanted to reuse the same code that generates the sparkline in the Translation section, but I agree that it would be more intuitive for the lemma section to reflect the database you are searching both in hit counts and in the order of books - I proposed this back when the feature was first designed. If this behavior is changed, there will then be a disconnect between the hit counts in the lemma section and the hit counts in the translation section if you're coming from anything not based on AF, since the translation data has to come from the RI data. So it won't make all confusion disappear - that's just a price we pay for letting you get at translation wheels and grammatical relationship data from any source (at least, theoretically), so that folk who prefer a different database can still get information from the AF.

I didn't comment on the tab behavior because that's way outside my area of expertise.