Suggestion for lemma searches


Posts 25855
Forum MVP
MJ. Smith | Forum Activity | Posted: Sat, Aug 22 2009 1:02 PM

The question about LXX Englishman's search got me thinking about what I'd really like in a search - pie in the sky style and make mine blue huckleberries please.

 

What I would like to be able to do is

(1) Select English translation under consideration

(2) Select the languages used (Aramaic Targums, Syriac Peshitta, Amharic in addition to those shown above).

(3) Select manuscript(s) used for each language [not shown in picture]

(4) Select sequence of Biblical books

(5) The process would (a) find all lemmas translated into the selected English word in this particular translation, (b) find all occurrences of those lemmas in the 'original' languages, and (c) sort the results into biblical book order within each English translation lemma - putting those that match the originally selected word at the beginning

(6) Be able to open and close branches of the results to narrow down to what is of specific interest (basic Warnier-Orr or Brackets diagram functionality)

(7) Mouse over provides popup window with larger context

(8) Clicking on one of the English lemmas moves that lemma to the root and the diagram is rebuilt.
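Step (5) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Token` record, the `lemma_search` function, and the sample rows are all hypothetical and do not reflect how Logos stores its tagging or any real translation. Sub-step (c) is read here as "group by English rendering, selected word first, then book order".

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One tagged word; a hypothetical record, not a Logos data structure."""
    book_order: int  # position of the book in the user-selected sequence
    ref: str         # verse reference label
    lemma: str       # original-language lemma
    english: str     # English word it is rendered by in the chosen translation


def lemma_search(tokens, word):
    # (a) every lemma that the translation renders with `word` at least once
    lemmas = {t.lemma for t in tokens if t.english == word}
    # (b) every occurrence of those lemmas, however they were translated
    hits = [t for t in tokens if t.lemma in lemmas]
    # (c) selected word's group first, then other renderings alphabetically,
    #     each group in the chosen book order
    hits.sort(key=lambda t: (t.english != word, t.english, t.book_order, t.ref))
    return hits


# Invented sample rows, for illustration only.
SAMPLE = [
    Token(2, "ref-B1", "lemma_x", "love"),
    Token(1, "ref-A1", "lemma_x", "kindness"),
    Token(1, "ref-A2", "lemma_y", "love"),
    Token(1, "ref-A3", "lemma_z", "created"),
]

for hit in lemma_search(SAMPLE, "love"):
    print(hit.english, hit.lemma, hit.ref)
```

The single sort key is what produces the nested branches of item (6): each leading element of the tuple corresponds to one level that could be opened or closed.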

I recognize that this is not a "quick and dirty" add-in, but it would be extremely powerful and, done right, could really make Logos stand out from the competition.

Orthodox Bishop Hilarion Alfeyev: "To be a theologian means to have experience of a personal encounter with God through prayer and worship."

Posts 172
Chris Ease | Forum Activity | Replied: Sat, Aug 22 2009 1:45 PM

Great suggestion!!!

Posts 9945
George Somsel | Forum Activity | Replied: Sat, Aug 22 2009 1:55 PM

MJ. Smith:
(5) The process would (a) find all lemmas translated into the selected English word in this particular translation, (b) find all occurrences of those lemmas in the 'original' languages, and (c) sort the results into biblical book order within each English translation lemma - putting those that match the originally selected word at the beginning

Are you thinking of using one particular translation or do you want it to function for whichever translation you happen to choose?  The latter would require some processing beyond, say, a comparison of the Greek and Hebrew in Tov's Parallel Aligned Hebrew-Aramaic and Greek Texts or one of the interlinears (Ptui!).  It would need to be done "on the fly."  It might therefore be somewhat slow in execution.

george
gfsomsel

יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

Posts 25855
Forum MVP
MJ. Smith | Forum Activity | Replied: Sat, Aug 22 2009 4:13 PM

George Somsel:
Are you thinking of using one particular translation or do you want it to function for whichever translation you happen to choose? 

I assumed that each choice - translation or base manuscript - would have to be limited in order to take advantage of the tagging that is already available and make additional tagging financially feasible. The execution time would depend upon the indexing available, the format of documents requiring an actual search, the space on the clients' machines for sorting, and most importantly the ingenuity used by the programmer in creating the algorithm. Well designed, it is feasible technically - I can give examples of applications that provide each function, albeit on different material.  However, I lack sufficient knowledge of how Logos handles certain features with regard to tagging and indexing to know if it is financially feasible.  But if it isn't financially feasible now, I'd still like them to keep it in mind as a future feature.


Posts 24164
Forum MVP
Dave Hooton | Forum Activity | Replied: Sun, Aug 23 2009 1:14 AM

MJ. Smith:
I assumed that each choice - translation or base manuscript - would have to be limited in order to take advantage of the tagging that is already available and make additional tagging financially feasible. The execution time would depend upon the indexing available, the format of documents requiring an actual search, the space on the clients' machines for sorting, and most importantly the ingenuity used by the programmer in creating the algorithm.

The problem would be incorporating language lemmas into selected resources or building indexes separate to the resources. The task would be equivalent to building Strong's numbers for the KJV or NASB95. Libronix v4 could have the potential for separate indexes, but v3 uses algorithmic stemming for English searches which is not a substitute for proper root equivalents - it wouldn't know one well from another and

Bradley  (Sept 2005):
It's going to overstem some words (e.g., both "one" and "on" to the same root) and understem others (e.g., "majestic" and "majesty" have different roots).
The Porter Stemming Algorithm used can be found here http://www.tartarus.org/~martin/PorterStemmer/ - you might find that words ending "en" are not stemmed e.g. fall gets falling and falls, but not fallen.
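The limitation can be illustrated with a deliberately naive suffix stripper. This is not the Porter algorithm, only a toy sketch with a hypothetical `toy_stem` function and an assumed suffix list, but it shows the same kind of gap: regular endings collapse to one stem while "fallen" is left alone, and the two senses of "well" cannot be told apart.

```python
# Toy suffix stripper; NOT the Porter algorithm, just an illustration.
SUFFIXES = ("ing", "ed", "s")  # assumed suffix list for the example


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for w in ("fall", "falling", "falls", "fallen", "well"):
    print(w, "->", toy_stem(w))
```

Because the function sees only spelling, "well" (the adverb) and "well" (the water source) get the same stem, which is why a lemma index needs tagging rather than stemming.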

Dave
===

Windows & Android

Posts 25855
Forum MVP
MJ. Smith | Forum Activity | Replied: Sun, Aug 23 2009 11:16 AM

Thanks for the Porter Stemmer reference - I agree that it is a primitive algorithm. Actually, I was intending to take the Greek, Latin and English back to their Indo-European roots (just kidding). You are correct that I am assuming "manual" coding.  However, the vast majority can be handled programmatically as replace literals in a file that applies to all texts in that particular language.  Basic process:

1) Use a concordance building program to get a list of all the words appearing in the text. There are some excellent programs available for manuscript studies for less than $100.

2) Run the list through a rudimentary stemmer program - or an etymological dictionary and custom "stemmer"

3) Review and tweak output especially flagging homographs which will require manual coding.

4) Run the list against the input file - using a replace statement to replace each word occurrence with the word and its tag.

5) Manually code the homographs

6) Use a short program to build indices based on tags.
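A minimal sketch of steps 1, 4, 5 and 6, assuming plain ASCII English text and a hand-reviewed lemma table. The function names, the sample sentence, the table, and the homograph set are all hypothetical; real texts would need proper tokenization for each language.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[A-Za-z]+")  # assumption: simple ASCII word tokens


def word_list(text):
    """Step 1: concordance-style list of distinct words in the text."""
    return sorted({w.lower() for w in WORD.findall(text)})


def tag_tokens(text, lemma_table, homographs):
    """Steps 4-5: pair each word with its tag; homographs get None
    so they can be coded manually afterwards."""
    tagged = []
    for w in WORD.findall(text):
        key = w.lower()
        if key in homographs:
            tagged.append((w, None))
        else:
            # Words missing from the table fall back to themselves.
            tagged.append((w, lemma_table.get(key, key)))
    return tagged


def build_index(tagged):
    """Step 6: map each tag to the token positions where it occurs."""
    index = defaultdict(list)
    for position, (_, tag) in enumerate(tagged):
        if tag is not None:
            index[tag].append(position)
    return dict(index)


text = "The tree has fallen by the well and all falls well"
lemma_table = {"fallen": "fall", "falls": "fall"}  # reviewed output of steps 2-3
homographs = {"well"}                              # flagged in step 3

tagged = tag_tokens(text, lemma_table, homographs)
index = build_index(tagged)
print(word_list(text))
print(index["fall"])
```

The lemma table and homograph set are the reusable part: a further book in the same language only adds the entries that `word_list` reports as new.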

I would suspect that Logos already uses a very similar process for some of its tagging. Review and tweaking are the most time consuming but the results are reusable. Each additional book in the same language only needs to look for new entries in the concordance.

But then again, we've gone a bit off-topic.  My original post was a feature I would like to have available ... I really don't want to be a free analyst for Logos unless that is what is necessary to get my wishes :D


Posts 24164
Forum MVP
Dave Hooton | Forum Activity | Replied: Sun, Aug 23 2009 5:17 PM

MJ. Smith:
But then again, we've gone a bit off-topic.  My original post was a feature I would like to have available ... I really don't want to be a free analyst for Logos unless that is what is necessary to get my wishes :D

That's OK. Based on some 2005 discussions (sample quoted above), Logos have already thought about 1) and 2), and the rest follows, and I would like them to adopt the feature :)

Dave
