Surprising results with Morph Query Editor

Posts 1042
Harry Hahne | Forum Activity | Posted: Sat, Oct 22 2016 2:34 PM

In experimenting with the Morph Query Editor, I have come up with some surprising results.

I created the following Morph Query Document, in an attempt to find participles without an article in agreement before them. This is the query:

At the top of the search results was Matt 1:16, which has an article followed by a participle (the participle is highlighted). You can also see Matt 2:2, which is another false hit.

After some experimentation, I removed the extra search term, so there were only 2 words in the search, rather than 3. This search works better:

This removes Matt 1:16 and produces more reasonable search results.

This leads me to a suggestion and a bug report (unless I am not understanding something).

Suggestion: Start the Morph Query with 2 search terms. The user can always add another search term if they need it. But the extra word, which matches any word by default, can produce unexpected results and be confusing.

Bug: Even though I understand that the original search is highlighting "any" word that follows my desired word, it still should not have accepted Matt 1:16 or Matt 2:2. Both have an article before the participle. This is not producing correct search results.

Posts 25103
Forum MVP
Dave Hooton | Forum Activity | Replied: Sun, Oct 23 2016 1:31 AM

Harry Hahne:
Bug: Even though I understand that the original search is highlighting "any" word that follows my desired word, it still should not have accepted Matt 1:16 or Matt 2:2. Both have an article before the participle. This is not producing correct search results.

I agree, assuming your Agreement is on Lang, Case, Number & Gender because they are returned if the article is changed to "Exists" (verifying the Agreement). Also, if I untick only Agreement on Word 3 they are not returned (effectively the same as removing Word 3).

I'm trying another version with two Agreements ---> (Word 1+2 and Word 2+3),

And the results omit those verses! But there are results I don't understand, like Mt 5:4 & 5:6, which have the article & words that aren't adjacent.

Note that your proximity values should be at least 3 for 3 words, 4 for 4 words, etc. The value at the top of the query is the range of the search, e.g. look for 3 words over a range of 10 words. Your Query likely compensated and used minimum values.

Dave
===

Windows 10 & Android 8

Posts 1511
Forum MVP
Fr Devin Roza | Forum Activity | Replied: Sun, Oct 23 2016 8:52 AM

Harry Hahne:

In experimenting with the Morph Query Editor, I have come up with some surprising results.

I created the following Morph Query Document, in an attempt to find participles without an article in agreement before them. This is the query:

At the top of the search results was Matt 1:16, which has an article followed by a participle (the participle is highlighted). You can also see Matt 2:2, which is another false hit.

After some experimentation, I removed the extra search term, so there were only 2 words in the search, rather than 3. This search works better:

This removes Matt 1:16 and produces more reasonable search results.

As best I can tell your search in both cases is working correctly.

The problem you are having is that you have set your range up at the top of the Query to 2 words. Just leave it at the default 10 words, or set it to 1 verse or 2 verses, and you should get the results you are expecting.

The number of words or verses up at the top gives the Morph Query a range to search within. If you are limiting your range to only 2 words, you cannot possibly get precise results if you are actually asking it to examine 3 words. Instead, within the two word range it was actually looking at, the article was in fact NOT there in Mt 1:16 or Mt 2:2... because all it could see was two words! It needs at least a three word range to actually see whether the article is there or not.

For most searches, the default 10 words / segments should work perfectly.

Posts 1042
Harry Hahne | Forum Activity | Replied: Sun, Oct 23 2016 2:26 PM

Fr Devin Roza:
The problem you are having is that you have set your range up at the top of the Query to 2 words. Just leave it at the default 10 words, or set it to 1 verse or 2 verses, and you should get the results you are expecting.

If the search were treating the range as 2 words, it should not return any results at all, since by definition 3 words cannot occur within a range of 2 words. The search appears to look for some combination of 2 of the 3 search terms.

Suggestion: I think it would be best if the range setting and the "Within" setting could not be set to less than the number of search terms, so that they cannot create an indeterminate search like this. So if a third search term were added, the "range" and "Within" parameters would be increased to 3 if they were set to 2.
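
The suggested behaviour amounts to a simple clamp. Here is a minimal sketch in Python; the function name `clamp_to_term_count` is made up for illustration and is not part of Logos:

```python
def clamp_to_term_count(setting: int, term_count: int) -> int:
    """Raise a "range" or "Within" setting so it is never below the term count.

    Hypothetical helper: it only illustrates the suggestion, not Logos itself.
    """
    return max(setting, term_count)


# Adding a third search term would push a setting of 2 up to 3 ...
print(clamp_to_term_count(2, 3))
# ... while the default of 10 would be left alone.
print(clamp_to_term_count(10, 3))
```

With this rule in place, a query could never ask for more terms than its range can hold.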

Question: What is the difference between the "range" and "Within" parameters? 

Posts 25103
Forum MVP
Dave Hooton | Forum Activity | Replied: Sun, Oct 23 2016 4:09 PM

Fr Devin Roza:

As best I can tell your search in both cases is working correctly.

The problem you are having is that you have set your range up at the top of the Query to 2 words. Just leave it at the default 10 words, or set it to 1 verse or 2 verses, and you should get the results you are expecting.

If you read my comments above, you would see that is not so: a Faithlife developer had told me that the Query engine assumes minimum values in order to provide a non-zero result, and I got the same results as Harry with default range 10 words and proximity 3 words.

The third word seems to pose problems, as it is not defined beyond Agreement and Order. For example, what does Agreement mean with respect to "Does not exist"? Does the query engine first find the other two words, which must be in agreement, before evaluating that term for the first word? I tried to answer that with my amended query, and it gave results like Mt 5:4 & 5:6, which you could perhaps explain.

Dave

Posts 25103
Forum MVP
Dave Hooton | Forum Activity | Replied: Sun, Oct 23 2016 9:15 PM

Dave Hooton:
I got the same results as Harry with default range 10 words and proximity 3 words.

Well, I now get different results that omit the two verses Harry mentions (1460 in 1166). This has happened before and it now seems that I cannot trust results after altering parameters (the original issue that was acknowledged followed from altering the resource).

I also get different results with 1 Verse ---> 1391 in 1069.

That third word is causing difficulties, but it does not help to declare it as a Noun, or other Part of Speech. I think it is a bug where Words 2 & 3 are not adjacent.

Dave

Posts 1511
Forum MVP
Fr Devin Roza | Forum Activity | Replied: Mon, Oct 24 2016 12:43 AM

Harry Hahne:

If the search were treating the range as 2 words, it should not return any results at all, since by definition 3 words cannot occur within a range of 2 words. The search appears to look for some combination of 2 of the 3 search terms.

The problem is that when you set the range to 2 words, the word on the far left in fact does not exist, so it fulfills your criteria. Probably not the best design decision, but that is currently how the software is working when we set things to not exist.

The following two searches demonstrate that the software is currently working this way:

Search 1

This search returns Mt 1:16 because the range up top is set to two words. Within that two word range, the definite article does not exist, so Mt 1:16 is correctly included as a result.

Search 2

If I run the exact same search, and ONLY change the range to 3, it works properly:

This search does NOT return Mt 1:16. 

However, the search is clearly still not correctly designed. Notice the double hit in Mt 1:18, where the second word is two words away? 

Instead, when working with words that do NOT exist, we need to separate out that proximity constraint, to avoid allowing for hits like Mt 1:18. The search needs to be written like this, with an additional row of proximity constraint:

Now you are getting the type of results you were looking for:

Notice the rules I am following above:

1. The upper range has to be at least as large as the number of words in your query, including words that are set to "not exist". If you just leave it at the default 10, it should almost always work.

2. In your Proximity condition on the bottom, you need one proximity range for words that don't exist and a separate one for words that do exist.

Finally, to search for Anarthrous Participles, I would suggest the following search. It's not perfect, but it does give an approximate answer. I don't think the Morph Builder engine has enough information yet to actually run searches for arthrous and anarthrous words, at least not with a foolproof method:

Harry Hahne:

Question: What is the difference between the "range" and "Within" parameters? 

The "range" parameter up top is for the entire search, while the Within constraint down below is for the words that you have selected.

For the "range" parameter up top, you could imagine Logos going through the SBLGNT with a piece of paper with a hole cutout in it, and Logos can only see and consider what is inside of that hole to answer your query... nothing else. 

So, if you have a hole that only shows you two words at a time, and you search for two words that do exist, plus one word that does not exist on the far right or far left, the search will always return that the third word does not exist, because only two words can be seen by Logos at a time, so as far as the search is concerned, the 3rd word does not exist!

With the Within proximity constraint down below, it sets the range within which the selected words need to return "true". This basically allows for a lot of flexibility as regards range: maybe two words in your query do need to be right next to each other, but you don't care about the others.
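
The "hole in the paper" picture can be sketched as a toy model in Python. This is only an illustration of the analogy, not Logos's actual query engine. It assumes a three-term query (article that must NOT exist, then a participle, then any word), assumes the hole ends on the "any word" slot, and uses made-up part-of-speech tags instead of real morphology:

```python
def find_hits(tags, visible_range):
    """Return indexes of participles the toy engine reports as article-free.

    tags: list of part-of-speech labels, one per word (toy data).
    visible_range: how many words the "hole" shows at once.
    """
    hits = []
    for i, tag in enumerate(tags):
        # Need a participle followed by at least one more word ("any word").
        if tag != "participle" or i + 1 >= len(tags):
            continue
        window_end = i + 1  # assumed: the hole ends on the "any word" slot
        window_start = max(0, window_end - visible_range + 1)
        # Only words inside the hole and left of the participle are checked.
        article_visible = any(
            tags[k] == "article" for k in range(window_start, i)
        )
        if not article_visible:
            hits.append(i)
    return hits


# Toy sequence shaped like the false hits: article, participle, another word.
tags = ["other", "article", "participle", "other"]

print(find_hits(tags, visible_range=2))  # [2]: the article is outside the hole
print(find_hits(tags, visible_range=3))  # []: the article is now visible
```

With a two-word hole the article is never seen, so "does not exist" is trivially true and the participle is reported as a hit. Widening the hole to three words lets the toy engine see the article and reject the match.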

Posts 1511
Forum MVP
Fr Devin Roza | Forum Activity | Replied: Mon, Oct 24 2016 1:02 AM

Dave Hooton:

If you read my comments above, you would see that is not so: a Faithlife developer had told me that the Query engine assumes minimum values in order to provide a non-zero result, and I got the same results as Harry with default range 10 words and proximity 3 words.

I had also read the response from Faithlife saying that it automatically upped the range if we added more words. However, that is clearly not happening when we set words to "not exist." Cf. my post immediately above this one for an example search showing that the Query Engine is currently not taking into account words that do not exist when it calculates the minimum value range.

I would suggest Faithlife modify the design, so that words that do "not exist" are also included in the minimum range calculation.

Dave Hooton:

The third word seems to pose problems, as it is not defined beyond Agreement and Order. For example, what does Agreement mean with respect to "Does not exist"? Does the query engine first find the other two words, which must be in agreement, before evaluating that term for the first word? I tried to answer that with my amended query, and it gave results like Mt 5:4 & 5:6, which you could perhaps explain.

If I understand correctly, you are referring to this search:

This search returns results like Mt 5:4:

Once again, I think this search is working properly. Here's how I understand how this works.

Down below we set a range of three within which we are working for all three words. That is a three-word span, which must have a participle and an agreeing word within three words, and within that same three-word range, it must not have an agreeing definite article on the left-hand side.

All of those criteria are clearly met in Mt 5:4 and Mt 5:6. We have a three-word range in which there is no definite article, and in which we have a word agreeing with the participle.
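
To make that reading concrete, here is a toy Python sketch. It is not the real engine. It assumes the three-word span is anchored so that it ends on the agreeing word, and it collapses case, number and gender into a single made-up agreement label. The token list is hypothetical, shaped like the case described above: an agreeing article, the participle, an unrelated word, then an agreeing word.

```python
from typing import NamedTuple


class Token(NamedTuple):
    pos: str  # part of speech (toy label)
    agr: str  # case/number/gender collapsed to one label; "-" = not applicable


def is_hit(tokens, within):
    """True if some participle has an agreeing word after it inside a span of
    `within` words, and no agreeing article before it inside that same span."""
    for i, tok in enumerate(tokens):
        if tok.pos != "participle":
            continue
        for j in range(i + 1, len(tokens)):
            if tokens[j].agr != tok.agr:
                continue
            start = max(0, j - within + 1)  # assumed: span ends on word j
            if start > i:
                continue  # the participle itself falls outside the span
            article_visible = any(
                t.pos == "article" and t.agr == tok.agr
                for t in tokens[start:i]
            )
            if not article_visible:
                return True
    return False


# Hypothetical tokens; "NPM" stands for one shared agreement label.
tokens = [
    Token("article", "NPM"),
    Token("participle", "NPM"),
    Token("conjunction", "-"),
    Token("pronoun", "NPM"),
    Token("verb", "-"),
]

print(is_hit(tokens, within=3))  # True: the article lies outside the span
print(is_hit(tokens, within=4))  # False: the span now reaches the article
```

Under these assumptions a three-word span cannot reach back to the article, so the participle counts as a hit. A four-word span does reach it, and the hit disappears.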

The fact that there happens to be an agreeing definite article to the left of this three word range is simply irrelevant to the search, because we specified a three word range. If we up the proximity constraint down below to four:

Then Mt 5:4 and 5:6 are no longer returned:

Basically, for the search to work as you're hoping, this is the way to do it:

This requires the first term NOT to be next to the second (since the first term must not exist), and the 2nd and 3rd terms to be next to each other. Once these two are set, here are the results surrounding Mt 5:

Posts 25103
Forum MVP
Dave Hooton | Forum Activity | Replied: Mon, Oct 24 2016 2:46 AM

Thank you for taking the time to explain.

Fr Devin Roza:

I had also read the response from Faithlife saying that it automatically upped the range if we added more words. However, that is clearly not happening when we set words to "not exist." Cf. my post immediately above this one for an example search showing that the Query Engine is currently not taking into account words that do not exist when it calculates the minimum value range.

I would suggest Faithlife modify the design, so that words that do "not exist" are also included in the minimum range calculation.

Agree.

Fr Devin Roza:

Once again, I think this search is working properly. Here's how I understand how this works.

Down below we set a range of three within which we are working for all three words. That is a three-word span, which must have a participle and an agreeing word within three words, and within that same three-word range, it must not have an agreeing definite article on the left-hand side.

Fr Devin Roza:
Basically, for the search to work as you're hoping, this is the way to do it:

I came to those conclusions after working from a query in which the participle exists, but came across anomalies in the modified anarthrous query that I'll address in a new thread.

Dave