Suggestion for lemma searches

MJ. Smith
MJ. Smith MVP Posts: 55,108
edited December 2024 in English Forum

The question about LXX Englishman's search got me thinking about what I'd really like in a search - pie in the sky style and make mine blue huckleberries please.

 

image

What I would like to be able to do is

(1) Select English translation under consideration

(2) Select the languages used (Aramaic Targums, Syriac Peshitta, Amharic in addition to those shown above).

(3) Select manuscript(s) used for each language [not shown in picture]

(4) Select sequence of Biblical books

(5) Process would (a) find all lemmas translated into the English word in this particular translation (b) find all occurrences of those lemmas in the 'original' languages (c) sort the results into biblical book order within English translation lemma - putting those that match the originally selected word at the beginning

(6) Be able to open and close branches of the results to narrow down to what is of specific interest (basic Warnier-Orr or Brackets diagram functionality)

(7) Mouse over provides popup window with larger context

(8) Clicking on one of the English lemmas moves that lemma to the root and the diagram is rebuilt.
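
A minimal sketch of step (5), assuming the translation is already aligned token by token with its source lemmas. The `Token` record, the `lemma_search` function, the placeholder lemma ids and the references in the sample data are all invented for illustration; none of this reflects how Logos actually stores its tagging.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    """One aligned word: where it occurs, its source lemma, its English rendering."""
    book: str
    chapter: int
    verse: int
    lemma: str      # original-language lemma (placeholder ids below)
    english: str    # rendering in the selected English translation


def lemma_search(tokens, word, book_order):
    """Return [(english_rendering, [tokens...]), ...] for the selected word."""
    rank = {book: i for i, book in enumerate(book_order)}

    # (a) all lemmas that the translation renders with the selected word
    lemmas = {t.lemma for t in tokens if t.english == word}

    # (b) every occurrence of those lemmas, grouped by English rendering
    groups = defaultdict(list)
    for t in tokens:
        if t.lemma in lemmas:
            groups[t.english].append(t)

    # (c) selected word first, other renderings alphabetically;
    #     inside each group, the user's chosen book sequence
    renderings = sorted(groups, key=lambda e: (e != word, e))
    return [
        (e, sorted(groups[e], key=lambda t: (rank[t.book], t.chapter, t.verse)))
        for e in renderings
    ]


# Invented toy data, not real alignment output.
tokens = [
    Token("John", 3, 16, "lemma_A", "love"),
    Token("Gen", 22, 2, "lemma_B", "love"),
    Token("Ps", 5, 11, "lemma_A", "delight"),
    Token("Gen", 1, 1, "lemma_C", "create"),
    Token("Ps", 1, 2, "lemma_B", "love"),
]
result = lemma_search(tokens, "love", ["Gen", "Ps", "John"])
for rendering, hits in result:
    print(rendering, [(t.book, t.chapter, t.verse) for t in hits])
```

Each `(rendering, hits)` pair would correspond to one collapsible branch in (6), and re-running `lemma_search` with a different word is roughly what (8) asks for.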

I recognize that this is not a "quick and dirty" add-in but it would be extremely powerful and done right could really make Logos stand out from the competition.

Orthodox Bishop Alfeyev: "To be a theologian means to have experience of a personal encounter with God through prayer and worship."; Orthodox proverb: "We know where the Church is, we do not know where it is not."

Comments

  • Chris Ease
    Chris Ease Member Posts: 175 ✭✭

    Great suggestion!!!

  • George Somsel
    George Somsel Member Posts: 10,150 ✭✭✭

    MJ. Smith said:

    (5) Process would (a) find all lemmas translated into the English word in this particular translation (b) find all occurrences of those lemmas in the 'original' languages (c) sort the results into biblical book order within English translation lemma - putting those that match the originally selected word at the beginning

    Are you thinking of using one particular translation or do you want it to function for whichever translation you happen to choose?  The latter would require some processing beyond, say, a comparison of the Greek and Hebrew in Tov's Parallel Aligned Hebrew-Aramaic and Greek Texts or one of the interlinears (Ptui!).  It would need to be done "on the fly."  It might therefore be somewhat slow in execution.

    george
    gfsomsel

    יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

  • MJ. Smith
    MJ. Smith MVP Posts: 55,108

    Are you thinking of using one particular translation or do you want it to function for whichever translation you happen to choose? 

    I assumed that each choice - translation or base manuscript - would have to be limited in order to take advantage of the tagging that is already available and make additional tagging financially feasible. The execution time would depend upon the indexing available, the format of documents requiring an actual search, the space on the clients' machines for sorting, and most importantly the ingenuity used by the programmer in creating the algorithm. Well designed, it is technically feasible - I can give examples of applications that provide each function, albeit on different material.  However, I lack sufficient knowledge of how Logos handles certain features with regard to tagging and indexing to know if it is financially feasible.  But if it isn't financially feasible now, I'd still like them to keep it in mind as a future feature.

  • Dave Hooton
    Dave Hooton MVP Posts: 36,199

    MJ. Smith said:

    I assumed that each choice - translation or base manuscript - would have to be limited in order to take advantage of the tagging that is already available and make additional tagging financially feasible. The execution time would depend upon the indexing available, the format of documents requiring an actual search, the space on the clients' machines for sorting, and most importantly the ingenuity used by the programmer in creating the algorithm.

    The problem would be incorporating language lemmas into selected resources or building indexes separate to the resources. The task would be equivalent to building Strong's numbers for the KJV or NASB95. Libronix v4 could have the potential for separate indexes, but v3 uses algorithmic stemming for English searches which is not a substitute for proper root equivalents - it wouldn't know one well from another and

    It's going to overstem some words (e.g., both "one" and "on" to the
    same root) and understem others (e.g., "majestic" and "majesty"
    have different roots).

    The Porter Stemming Algorithm used can be found here: http://www.tartarus.org/~martin/PorterStemmer/ - you might find that words ending in "en" are not stemmed, e.g. fall gets falling and falls, but not fallen.
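
    To make the gap concrete, here is a toy comparison between suffix stripping and a lemma lookup. Note the assumptions: `naive_stem` is a deliberately tiny rule set and is NOT the Porter algorithm, and `LEMMA_TABLE` is a hypothetical hand-built table standing in for proper root equivalents.

```python
# Toy suffix list for illustration only; this is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical hand-built entries for forms that no suffix rule reaches.
LEMMA_TABLE = {"fallen": "fall", "fell": "fall"}


def lemma_of(word):
    """Prefer the curated table, fall back to suffix stripping."""
    return LEMMA_TABLE.get(word, naive_stem(word))


for w in ["fall", "falling", "falls", "fallen", "fell"]:
    print(w, naive_stem(w), lemma_of(w))
```

    The suffix rules group fall, falling and falls but leave fallen and fell on their own; only the lookup table brings them together, which is the manual work discussed above.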

    Dave
    ===

    Windows 11 & Android 13

  • MJ. Smith
    MJ. Smith MVP Posts: 55,108

    Thanks for the Porter Stemmer reference - I agree that it is a primitive algorithm. Actually I was intending to take the Greek, Latin and English back to their Indo-European roots (just kidding). You are correct that I am assuming "manual" coding.  However, the vast majority can be handled programmatically as replace literals in a file that applies to all texts in that particular language.  Basic process:

    1) Use a concordance building program to get a list of all the words appearing in the text. There are some excellent programs available for manuscript studies for less than $100.

    2) Run the list through a rudimentary stemmer program - or an etymological dictionary and custom "stemmer"

    3) Review and tweak output especially flagging homographs which will require manual coding.

    4) Run the list against the input file - using a replace statement to replace each word occurrence with the word and its tag.

    5) Manually code the homographs

    6) Use a short program to build indices based on tags.
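
    The six steps can be sketched roughly as follows. Everything here is an assumption made for illustration: the `word<tag>` markup, the function names, the stand-in `stem` lookup and the sample sentence are invented, and step 5 (manual coding) is represented only by leaving homographs untagged.

```python
import re
from collections import Counter, defaultdict

WORD = re.compile(r"[A-Za-z']+")
TAGGED = re.compile(r"([A-Za-z']+)(?:<([^>]+)>)?")


def build_word_list(text):
    """Step 1: concordance-style list of every word form with its count."""
    return Counter(w.lower() for w in WORD.findall(text))


def propose_tags(words, stem, homographs):
    """Steps 2-3: stem each form; flagged homographs are held back for manual coding."""
    return {w: stem(w) for w in words if w not in homographs}


def tag_text(text, tags):
    """Step 4: replace each occurrence with the word plus its tag."""
    def repl(match):
        word = match.group(0)
        tag = tags.get(word.lower())
        return f"{word}<{tag}>" if tag else word
    return WORD.sub(repl, text)


def build_index(tagged):
    """Step 6: map each tag to the token positions where it occurs."""
    index = defaultdict(list)
    for pos, match in enumerate(TAGGED.finditer(tagged)):
        if match.group(2):
            index[match.group(2)].append(pos)
    return dict(index)


# Invented sample; `stem` is a stand-in for a stemmer or etymological dictionary.
text = "He fell into the well and all was well"
stem = lambda w: {"fell": "fall", "was": "be"}.get(w, w)

words = build_word_list(text)
tags = propose_tags(words, stem, homographs={"well"})
tagged = tag_text(text, tags)   # both "well" tokens stay untagged for step 5
index = build_index(tagged)
print(tagged)
print(index)
```

    The `tags` mapping is the reusable part: a second text in the same language only needs entries for word forms not already in it.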

    I would suspect that Logos already uses a very similar process for some of its tagging. Review and tweaking are the most time consuming but the results are reusable. Each additional book in the same language only needs to look for new entries in the concordance.

    But then again, we've gone a bit off-topic.  My original post was a feature I would like to have available ... I really don't want to be a free analyst for Logos unless that is what is necessary to get my wishes [:D]

  • Dave Hooton
    Dave Hooton MVP Posts: 36,199

    MJ. Smith said:

    But then again, we've gone a bit off-topic.  My original post was a feature I would like to have available ... I really don't want to be a free analyst for Logos unless that is what is necessary to get my wishes [:D]

    That's OK. Based on some 2005 discussions (sample quoted above), Logos have already thought about 1) and 2), and the rest follows and I would like them to adopt the feature [:)]

    Dave
    ===

    Windows 11 & Android 13