Is it possible to convert Hebrew text to just a string of root words?

Lachlan Davis
Lachlan Davis Member Posts: 6
edited November 2024 in English Forum

I think large-scale text comparison would be possible if one could copy the Hebrew text as a string of root verbs and nouns with no pronominal suffixes. Is this possible?

For example: Genesis 49:3 quotes Deuteronomy 21:17, but the conjugations are different.

Deut: רֵאשִׁ֣ית אֹנ֔וֹ (the first fruits of his vigor)

Gen: וְרֵאשִׁ֣ית אוֹנִ֑י (and the first fruits of my vigor)

Let's say I want to use the entirety of Genesis 37-50 as my base text to see if there are any other phrases borrowed from Deuteronomy.

If I were to take the texts, without accents or vowels and do a text comparison between Gen 37-50 and Deuteronomy 1-34 the above result would not show up because the words are conjugated differently. 
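
To make the problem concrete, here is a minimal Python sketch that removes vowel points and accents with the standard `unicodedata` module. The function name `strip_marks` is only illustrative. Even after stripping, the two phrases differ, because the conjunction and the suffixes are themselves consonants:

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop vowel points and cantillation accents, keeping letters and spaces."""
    # Decompose first so that any precomposed forms are split into base + marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Hebrew points and accents are non-spacing marks (Unicode category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


deut = strip_marks("רֵאשִׁ֣ית אֹנ֔וֹ")      # Deut 21:17
gen = strip_marks("וְרֵאשִׁ֣ית אוֹנִ֑י")    # Gen 49:3

print(deut)
print(gen)
print(deut == gen)   # False: the consonantal strings still differ
print(deut in gen)   # False: not even a substring match
```

So a plain comparison of consonantal text is not enough; the prefix and suffixes have to be separated from the base word first.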

So how can I do a neural network/fuzzy search between the two large sections of text that would successfully highlight Gen 49:3 and Deut 21:17?

Comments

  • Phil Gons (Logos)
    Phil Gons (Logos) Administrator, Logos Employee Posts: 3,799

    Can you provide a specific example of what you're trying to accomplish? Do you want roots or lemmas? How many?

  • DMB
    DMB Member Posts: 13,817 ✭✭✭

    I think large-scale text comparison would be possible if one could copy the Hebrew text as a string of root verbs and nouns with no pronominal suffixes. Is this possible?

    I'm not sure what you mean by 'root', since in Logos nomenclature a root is a level below what you seem interested in. Is yours a level between lemma and surface text?

    I don't think Logos has any facility or workaround to accomplish this or your other post for phrases (edit: if Logos roots, straight comparison as Phil may be leading to).

    I do both, but outside Logos, in my own Bible software. Neural networks, and their extension, fuzzy neural networks, are used. For maximum flexibility, you combine the base Hebrew with morphological tagging to do the matching.
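
    As a rough illustration of matching on tagged text rather than surface forms, the Python below compares lemma bigrams between two sets of verses. It is only a sketch and not the software described above. The lemma lists are hypothetical: transliterated, abridged, and hand-tagged for this example, with prefixes and pronominal suffixes assumed to be already split off. A real run would need a morphologically tagged Hebrew text, and `shared_ngrams` is an invented helper name.

```python
from collections import defaultdict

# Hypothetical toy data: verse reference -> lemmas of its content words.
GENESIS = {
    "Gen 49:3": ["reuven", "bekhor", "koach", "reshit", "on", "yeter"],
}
DEUTERONOMY = {
    "Deut 6:5": ["ahav", "yhwh", "elohim", "levav", "nefesh", "meod"],
    "Deut 21:17": ["bekhor", "nakar", "reshit", "on", "mishpat", "bekhorah"],
}


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def shared_ngrams(base, other, n=2):
    """List (base_ref, other_ref, ngram) for lemma n-grams found in both texts."""
    # Index the second text once so each lookup is a dictionary access.
    index = defaultdict(set)
    for ref, lemmas in other.items():
        for gram in ngrams(lemmas, n):
            index[gram].add(ref)

    hits = []
    for ref, lemmas in base.items():
        for gram in ngrams(lemmas, n):
            for other_ref in sorted(index.get(gram, ())):
                hits.append((ref, other_ref, gram))
    return hits


matches = shared_ngrams(GENESIS, DEUTERONOMY)
for base_ref, other_ref, gram in matches:
    print(base_ref, "~", other_ref, gram)
```

    Because the suffixes are gone before comparison, "his vigor" and "my vigor" reduce to the same lemma pair, which is what the surface-text comparison misses.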

    "If myth is ideology in narrative form, then scholarship is myth with footnotes." B. Lincoln 1999.

  • Lachlan Davis
    Lachlan Davis Member Posts: 6

    Hi Denise, that sounds like the sort of thing I'm trying to achieve. Would you be happy to give a little more detail as to how you go about it?

  • Lachlan Davis
    Lachlan Davis Member Posts: 6

    For example: Genesis 49:3 quotes Deuteronomy 21:17, but the conjugations are different.

    Deut: רֵאשִׁ֣ית אֹנ֔וֹ (the first fruits of his vigor)

    Gen: וְרֵאשִׁ֣ית אוֹנִ֑י (and the first fruits of my vigor)

    Let's say I want to use the entirety of Genesis 37-50 as my base text to see if there are any other phrases borrowed from Deuteronomy.

    If I were to take the texts, without accents or vowels and do a text comparison between Gen 37-50 and Deuteronomy 1-34 the above result would not show up because the words are conjugated differently. 

    So how can I do a neural network/fuzzy search between the two large sections of text that would successfully highlight Gen 49:3 and Deut 21:17?

  • DMB
    DMB Member Posts: 13,817 ✭✭✭

    If I were to take the texts, without accents or vowels and do a text comparison between Gen 37-50 and Deuteronomy 1-34 the above result would not show up because the words are conjugated differently. 

    Well, first, my comment was only to demonstrate that one solution (mine) to your idea of a bulk comparison is doable but non-trivial (as defined technically). Phil, above, is an executive at FL, and was likely seeing if they had a comparison level more practical for you.

    Without getting too 'deep', I also strip down to 'consonants', but layered relative to morphological use, with the goal of avoiding potential contamination after the Persian period.

    The neurals are also good for word series that are 'almost'. The technique is to train an author to write his own 'book'. Then use that training to have the same author write the other books. You get matches where the neural author can almost duplicate portions of the other book. Since the software evaluates all the author/books, you can find matches in both directions (the authors are in different time periods, ergo potential back-referencing).

    In your example, the exact phrasing would be easy (for the neural). But in a batch mode, Genesis would be a messy match to Deuteronomy, the latter being tight (very predictable, also matching early Isaiah), and Genesis being messy (author signatures).

  • Lachlan Davis
    Lachlan Davis Member Posts: 6

    I'd be interested in finding out more. If you could contact me at lachlanjdavis@gmail.com I'd love to chat.

  • For example: Genesis 49:3 quotes Deuteronomy 21:17, but the conjugations are different.

    Deut: רֵאשִׁ֣ית אֹנ֔וֹ (the first fruits of his vigor)

    Gen: וְרֵאשִׁ֣ית אוֹנִ֑י (and the first fruits of my vigor)

    Welcome [:D]

    Right click on רֵאשִׁ֣ית shows <Sense = beginning part>, which is part of the Bible Sense Lexicon (BSL) hierarchy <Sense concept>

    Right click on אֹנ֔וֹ shows <Sense = strength> while Right click on אוֹנִ֑י shows <Sense = vigor> that is part of BSL hierarchy <Sense strength>

    Bible Search for <Sense concept> BEFORE 1 WORD <Sense strength> in Lexham Hebrew Bible finds Ge49.3, Dt21.17, plus more.

    BSL <Sense concept> hierarchy can be broadened: e.g. <Sense knowledge> OR refined: e.g. <Sense division (portion)>

    Bible Search <Sense division (portion)> BEFORE 1 WORD <Sense strength> finds two verses: Genesis 49:3 & Deuteronomy 21:17

    Keep Smiling [:)]

  • Dave Hooton
    Dave Hooton MVP Posts: 35,880

    Comparison is not possible with the Logos tool, which is based on surface text. You would have to construct your own Bibles (as Personal Books) to facilitate that comparison.

    Dave
    ===

    Windows 11 & Android 13

  • Beloved Amodeo
    Beloved Amodeo Member Posts: 4,183 ✭✭✭

    For example: Genesis 49:3 quotes Deuteronomy 21:17, but the conjugations are different.

    Deut: רֵאשִׁ֣ית אֹנ֔וֹ (the first fruits of his vigor)

    Gen: וְרֵאשִׁ֣ית אוֹנִ֑י (and the first fruits of my vigor)

    Let's say I want to use the entirety of Genesis 37-50 as my base text to see if there are any other phrases borrowed from Deuteronomy.

    If I were to take the texts, without accents or vowels and do a text comparison between Gen 37-50 and Deuteronomy 1-34 the above result would not show up because the words are conjugated differently. 

    So how can I do a neural network/fuzzy search between the two large sections of text that would successfully highlight Gen 49:3 and Deut 21:17?

    Lachlan,

    This happens to be a keen interest of mine. I and others have advocated for FL to develop an OT use of OT intertext dataset, with no commitment from them so far. I hope your posts highlight the usefulness of this dataset. See these posts:

    https://community.logos.com/forums/p/188939/1091347.aspx#1091347
    https://community.logos.com/forums/t/173348.aspx

    In addition, you may find this prepub useful:

    https://www.logos.com/product/190419/old-testament-use-of-the-old-testament-a-book-by-book-guide

    Like Lonnie Spence, I find the Concordance Tool useful for study. Note that it generated the target texts you were searching to compare. Note the details of its use. If you need help generating this report, post back; I will be glad to help.

    Let's hope that FL sees the demand for this dataset and begins work to bring it to its hopeful customers. I would love to see an interactive on the level of a NTUOT developed for Logos.

    Meanwhile, Jesus kept on growing wiser and more mature, and in favor with God and his fellow man.

    International Standard Version. (2011). (Lk 2:52). Yorba Linda, CA: ISV Foundation.

    MacBook Pro MacOS Sequoia 15.2 1TB SSD

  • DMB
    DMB Member Posts: 13,817 ✭✭✭

    I'd be interested in finding out more. If you could contact me at ... I'd love to chat.

    Just for everyone's benefit, we try to keep conversations on the forum .... no offense. Questions are welcome. But neurals are probably at an unnecessary extreme, just guessing. MJ, in your other thread, mentions more practical software ... indeed, something like it would be useful embedded into Logos for academic value.
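
    For a sense of what a lighter-weight, non-neural approach might look like, here is a sketch using Python's standard `difflib.SequenceMatcher` on consonant-only text. It slides a two-word window over each passage and keeps the pairs whose character similarity clears a threshold. The helper names and the 0.75 threshold are assumptions chosen for this example, not tuned values, and the word lists are typed in by hand from the two verses.

```python
from difflib import SequenceMatcher


def windows(words, size):
    """Join every run of `size` consecutive words into one phrase."""
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def fuzzy_hits(base_words, other_words, size=2, threshold=0.75):
    """Return (base_phrase, other_phrase, score) for similar word windows."""
    hits = []
    for a in windows(base_words, size):
        for b in windows(other_words, size):
            # ratio() is 2 * matched characters / total characters, from 0 to 1.
            score = SequenceMatcher(None, a, b).ratio()
            if score >= threshold:
                hits.append((a, b, round(score, 2)))
    return hits


# Consonant-only words (points and accents already removed).
gen_49_3 = ["ראובן", "בכרי", "אתה", "כחי", "וראשית", "אוני"]
deut_21_17 = ["כי", "הוא", "ראשית", "אנו", "לו", "משפט", "הבכרה"]

hits = fuzzy_hits(gen_49_3, deut_21_17)
for base_phrase, other_phrase, score in hits:
    print(base_phrase, "~", other_phrase, score)
```

    Comparing every window against every other window grows quadratically, so running this over whole books would need some kind of indexing first. It also only measures spelling overlap, so it cannot judge the surrounding context the way the approaches discussed in this thread aim to.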

    The concept of shared text is heavily discussed, both across cultures (e.g., Ugaritic > Hebrew and especially Assyrian/Aramaic > Hebrew) and, as Beloved hopes for, as intra-canon matches. I avoided the word 'quote', since a common phrasing can exist over a lengthy period of time without direct contact.

    The neural approach is nastier (considerable data coding), but more valid, since any match can be evaluated against its surrounding text. And the approach can identify matching styles across larger text blocks. For example, several of the Genesis female narratives match up against similar ones in Ruth and Samuel, even though the neurals have no technical access to gender (stripped).