Bug/Dataset Issue: Building a Word List from the Old Testament

Robert Kelbe
Robert Kelbe Member Posts: 631 ✭✭✭
edited November 2024 in English Forum

I created a word list from the entire Old Testament (based on the NKJV) in order to study vocabulary. I noticed a few peculiarities so far...

The gloss for מַלְאָךְ misspells "God" - a simple spelling mistake: "messenger; messengers of God (prophets, priests, angels); angel of Go, Yahweh"

More worryingly, נוּס seems to be completely missing ("to flee, escape") even though it occurs 158 times in the NKJV.

In addition, there was at least one time where the word count did not match the results when I searched by lemma in the NKJV, which was strange.

Mostly, however, I am wondering why נוּס was missing and what else could possibly be missing.

Thank you,



  • MJ. Smith
    MJ. Smith MVP Posts: 54,573

    I hope you have reported via the typo option - then it helps the entire community.

    When a lemma appears to be missing, go to an instance of it and use (a) interlinear, (b) information panel, or (c) context menu to see how Logos treats it. Assigning lemmas is not an exact science - rather there are multiple theories as to how to assign them. Read the Logos glossary to understand their approach.

    One should expect the word count in the Greek to differ from the English in that the Greek is not exactly the text that the translators used - where they made choices, that choice is not reflected back into the Greek.

    Orthodox Bishop Alfeyev: "To be a theologian means to have experience of a personal encounter with God through prayer and worship."; Orthodox proverb: "We know where the Church is, we do not know where it is not."

  • Dave Hooton
    Dave Hooton MVP Posts: 36,026

    More worryingly, נוּס seems to be completely missing ("to flee, escape") even though it occurs 158 times in the NKJV.

    נוס  flee; drive on**  does occur 158 times in NKJV. I did a Word List for Gen 14-39 and it was included with the correct count of 7.


    ** the gloss does vary e.g. BHW 4.18 has "to flee; to drive on"

    Mostly, however, I am wondering why נוּס was missing and what else could possibly be missing.

    Have you now found נוס  flee; drive on? Unless very familiar with Hebrew, the word can be missed in a large List. I would sort by Gloss (looking for 'flee') or Count (looking for those occurring 158x).

    I would doubt that anything is missing.

    How did you build the Word List, e.g. Gen-Mal?

    In addition, there was at least one time where the word count did not match the results when I searched by lemma in the NKJV, which was strange.

    Can you provide examples?


    Windows 11 & Android 13

  • Robert Kelbe
    Robert Kelbe Member Posts: 631 ✭✭✭

    Thank you, MJ and Dave!

    Ahhh... נוּס was there. Long story... I had exported it into an Excel spreadsheet and accidentally had it filtered off. I also looked in the word list but I searched with a Shureq instead of a Vav. Sorry about that!
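The shureq-versus-vav mismatch described above can be avoided in an exported list by comparing consonants only. The following is a minimal sketch, assuming the export is plain Unicode text; `strip_hebrew_points` and `find_lemma` are hypothetical helper names, not part of Logos or Excel.

```python
import unicodedata


def strip_hebrew_points(text: str) -> str:
    """Return the text without vowel points, dagesh, or accents."""
    # NFD splits any precomposed letter+point forms into letter and mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Hebrew points and accents are nonspacing marks (category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def find_lemma(query: str, lemmas: list[str]) -> list[str]:
    """Return the lemmas that match the query once points are ignored."""
    key = strip_hebrew_points(query)
    return [lemma for lemma in lemmas if strip_hebrew_points(lemma) == key]


pointed = "\u05e0\u05d5\u05bc\u05e1"    # nun, vav + dagesh (shureq), samekh
unpointed = "\u05e0\u05d5\u05e1"        # nun, vav, samekh
malak = "\u05de\u05dc\u05d0\u05da"      # a second, unrelated lemma

matches = find_lemma(pointed, [unpointed, malak])
```

Searching with either the pointed or the unpointed spelling then finds the same entry, at the cost of also merging any lemmas that differ only in their vowels.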

    Regarding submitting a typo, I can't figure out what resource it is in, in order to submit a typo. I don't think it is LTW. 

  • Robert Kelbe
    Robert Kelbe Member Posts: 631 ✭✭✭

    Regarding the count, here is an example: the word list says 166 (based on the NKJV), but when I right-click, select the lemma, and Search "Bible" it shows as having 158 matches (in the NKJV). So I don't understand why that is, but that is OK; I don't need to waste anyone else's time!

  • Dave Hooton
    Dave Hooton MVP Posts: 36,026

    Regarding submitting a typo, I can't figure out what resource it is in, in order to submit a typo.

    Create a new thread with the title "BUG: Typo in gloss for מַלְאָךְ"

    The gloss is a composite or truncation, so the typo is the phrase "angel of Go, Yahweh"  as Yahweh is never a gloss for this word (I think it comes from Lexham Analytic Lexicon of the Hebrew Bible which states "the angel of God/Yahweh"). Just include the BWS screenshot above, though.



  • Dave Hooton
    Dave Hooton MVP Posts: 36,026

    Regarding the count, here is an example: the word list says 166 (based on the NKJV), but when I right-click, select the lemma, and Search "Bible" it shows as having 158 matches (in the NKJV). So I don't understand why that is,

    It is connected with the Hebrew Reverse Interlinear (complex to explain), so if you want a 'true'/reliable count build your Word List from a Hebrew Bible e.g. LHB, SESB.

    EDIT: Having said that, the word count in LHB was also wrong (139 instead of 160). I might create a bug report.



  • MJ. Smith
    MJ. Smith MVP Posts: 54,573

    I don't need to waste anyone else's time!

    Rarely a waste of time. We all learn tracking down the details.


  • Robert Kelbe
    Robert Kelbe Member Posts: 631 ✭✭✭

    I would love to figure out why there is the inconsistency in "count" and if it truly is an error, for Logos to make a formal bug report for it. Many people use their Bible software program for statistics of word count, etc. Here are two examples taken at random.

    (The counts compared are from exporting the entire Bible in the NKJV as a word list. I confirmed that the issue still exists when exporting only Matt-Rev as a word list.)

    קֶ֫דֶם - word list: 74 actual: 61

    προσκαρτερέω - word list: 18 actual: 10
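A comparison like the one above can be automated once both sets of counts are available as lemma-to-count tables. This is only a sketch: the two dictionaries below hold just the two examples from this post, and `count_discrepancies` is a hypothetical helper name. How the counts are exported from Logos is not covered here.

```python
def count_discrepancies(word_list_counts, search_counts):
    """Map each lemma whose counts disagree to (word_list, search, difference)."""
    diffs = {}
    for lemma, listed in word_list_counts.items():
        found = search_counts.get(lemma)
        # Skip lemmas that are absent from the search results.
        if found is not None and found != listed:
            diffs[lemma] = (listed, found, listed - found)
    return diffs


qedem = "קֶ֫דֶם"
proskartereo = "προσκαρτερέω"

# Figures taken from the two examples in this post.
word_list_counts = {qedem: 74, proskartereo: 18}
search_counts = {qedem: 61, proskartereo: 10}

diffs = count_discrepancies(word_list_counts, search_counts)
for lemma, (listed, found, delta) in diffs.items():
    print(f"{lemma}: word list {listed}, search {found}, difference {delta}")
```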

  • Dave Hooton
    Dave Hooton MVP Posts: 36,026

    I would love to figure out why there is the inconsistency in "count" and if it truly is an error, for Logos to make a formal bug report for it.

    At this time, I recommend that you create a separate BUG thread for this post to be noticed.



  • Andrew Batishko
    Andrew Batishko Member, Community Manager, Logos Employee Posts: 5,451

    I would love to figure out why there is the inconsistency in "count" and if it truly is an error, for Logos to make a formal bug report for it. Many people use their Bible software program for statistics of word count, etc.

    I can't tell you at the moment exactly why the count is wrong, but I can tell you generally what is wrong along with how to work around the problem.

    You are creating a list of lemmas from a translation, where zero or more words in English can map to zero or more words in the original language. Whatever mechanism the Word List is using to determine these counts is not accounting for this mapping in the right way. If you want to fix it, create your Word List from an original-language resource, such as the SBLGNT, and then the counts will be correct.
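The thread does not say how the Word List actually computes its counts, so the following is only a toy illustration of how a many-to-many mapping could inflate a count. It assumes a made-up alignment in which two English words point to a single Greek token. The data, the token ids, and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical alignment links: (english_word, original_token_id, lemma).
links = [
    ("continued", "tok1", "προσκαρτερέω"),
    ("steadfastly", "tok1", "προσκαρτερέω"),  # same Greek token as above
    ("prayer", "tok2", "προσευχή"),
]


def count_per_link(links):
    """Count a lemma once for every English word aligned to it."""
    return Counter(lemma for _, _, lemma in links)


def count_per_original_token(links):
    """Count a lemma once for every distinct original-language token."""
    lemma_by_token = {token: lemma for _, token, lemma in links}
    return Counter(lemma_by_token.values())


per_link = count_per_link(links)
per_token = count_per_original_token(links)
```

Under these assumptions the per-link count for προσκαρτερέω is 2 while the per-token count is 1, which is the kind of gap an original-language Word List would not show.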

    FYI, if you are attempting to get counts of words, the Word List may not be the right tool for you. The Concordance tool may be a better fit for your needs.

    Andrew Batishko | Logos software developer

  • Robert Kelbe
    Robert Kelbe Member Posts: 631 ✭✭✭

    Thank you, Andrew, for responding. Actually, I didn't know about the concordance tool, so I appreciate you showing me that!

    I am using the word list for studying vocabulary. I exported it into an Excel spreadsheet that I use for studying. I sort by "Count" to prioritize the most frequent words. You could say that it doesn't really matter if the count is off, since it is mostly correct, enough to get a rough order (although the error is large enough that the order is affected). However, if I know I've learned all the words with a count greater than a certain number, I would like to be able to input that number into the "READER'S EDITION" view in an interlinear and have the words actually correspond to the words I've already studied. There is enough discrepancy that that isn't the case. Finally, I generate some statistics based on the "Count". For example, I've memorized 820 lemmas, which is 15.1% of the total number of lemmas but 86.8% of the words in the New Testament based on "Count". The problem is that this is not true if the "Count" is inaccurate.
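The statistics described above can be sketched as follows, assuming the exported word list has been loaded into a lemma-to-count dictionary. The lemma names and counts here are placeholders, and `known_above` and `coverage` are hypothetical helper names. Both results depend directly on "Count", which is why an inaccurate count skews them.

```python
def known_above(counts, threshold):
    """Return the lemmas whose count is greater than the threshold."""
    return {lemma for lemma, count in counts.items() if count > threshold}


def coverage(counts, known):
    """Return (share of lemmas known, share of running words known)."""
    total_tokens = sum(counts.values())
    known_lemmas = sum(1 for lemma in counts if lemma in known)
    known_tokens = sum(count for lemma, count in counts.items() if lemma in known)
    return known_lemmas / len(counts), known_tokens / total_tokens


# Placeholder data standing in for an exported word list.
counts = {"a": 50, "b": 30, "c": 15, "d": 5}
known = known_above(counts, 20)

lemma_share, token_share = coverage(counts, known)
print(f"{lemma_share:.1%} of lemmas, {token_share:.1%} of words")
```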

    You said that the issue is likely caused by the fact that I am using the Greek text underlying an English translation (the NKJV). That may be true, but why is the concordance accurate for the NKJV? The concordance for the NKJV for the first page of results matches the concordance for the Scrivener 1881 Greek New Testament. It seems to me that the way it generates the count in the word list should match the count in the concordance. To me, this does seem like a bug - probably not your most important one, but one I would like to see fixed eventually. Thank you!