[closed] BUG: "Forli" is not a form of the word "for"

Posts 5921
SineNomine | Forum Activity | Posted: Wed, Jun 3 2015 2:33 PM

I built a concordance for the Roman Martyrology and let it show me connecting words while grouping similar forms. It grouped "for" with "Forli", which is a place in Italy with wet winters, warm summers, and apparently at least one martyr or two. It's hard for me to verify that last part because even with Forli in quotation marks in Inline Search, Verbum [Logos] still won't let me distinguish between place and preposition and gives me 818 or fewer false positives.

Please fix this. :) Thanks!

Please use descriptive thread titles to attract helpful posts & not waste others' time. Thanks!

Posts 19577
Rosie Perera | Forum Activity | Replied: Wed, Jun 3 2015 2:44 PM

SineNomine:
It's hard for me to verify that last part because even with Forli in quotation marks in Inline Search, Verbum [Logos] still won't let me distinguish between place and preposition and gives me 818 or fewer false positives.

Turn off "Match all word forms" in your Search tab. It affects inline search as well, even though it doesn't seem connected. I was able to find the one instance of Forli in that resource using inline search.

But you're right, when "Match all word forms" is on, searches for "for" and "Forli" should not be finding each other.

Posts 9181
LogosEmployee

This is a "known issue" with stemming (match all word forms / group similar forms); see https://community.logos.com/forums/p/87708/615401.aspx#615401 

Posts 19577
Rosie Perera | Forum Activity | Replied: Wed, Jun 3 2015 3:00 PM

I can see why there would be some errors in stemming due to using an algorithmic stemmer, but when is "li" an English ending that should be stripped off by a stemmer? Is it confusing "li" for "ly"?

Posts 3033
LogosEmployee
Andrew Batishko | Forum Activity | Replied: Wed, Jun 3 2015 3:15 PM

Rosie Perera:

I can see why there would be some errors in stemming due to using an algorithmic stemmer, but when is "li" an English ending that should be stripped off by a stemmer? Is it confusing "li" for "ly"?

My guess is that since stemmers have support for removing chains of inflection, "li" handles cases like: "uglier" -> "ugli" -> "ugly"

Here's some more info: http://snowball.tartarus.org/algorithms/english/stemmer.html
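As a rough illustration of why "Forli" could collapse to "for": the Snowball English stemmer page linked above describes a Step 2 rule that deletes a trailing "li" when it lies in the region R1 and is preceded by one of c, d, e, g, h, k, m, n, r, t. The Python sketch below implements only that one rule. The names r1_start and strip_li are made up for this example, and nothing here is confirmed to be what Logos actually runs, so treat it as a guess at the mechanism rather than the real stemmer.

```python
# Sketch of a single rule from the Snowball English stemmer description.
# Assumption: this is the rule responsible; the full stemmer has many more steps.
VOWELS = set("aeiouy")
VALID_LI_ENDINGS = set("cdeghkmnrt")


def r1_start(word):
    """Index where R1 begins: just past the first non-vowel that follows a vowel."""
    for i in range(1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)  # no such non-vowel: R1 is empty


def strip_li(word):
    """Delete a final 'li' if it lies in R1 and follows a valid li-ending."""
    word = word.lower()
    if (
        word.endswith("li")
        and len(word) - 2 >= r1_start(word)
        and word[-3] in VALID_LI_ENDINGS
    ):
        return word[:-2]
    return word


print(strip_li("Forli"))    # for      ('r' is a valid li-ending)
print(strip_li("quickli"))  # quick    ("quickly" after its y has become i)
print(strip_li("ravioli"))  # ravioli  ('o' is not a valid li-ending)
```

So under this reading the stemmer is not confusing "li" with "ly"; it treats "li" as what "ly" has already been turned into by an earlier step, and "Forli" happens to fit the pattern.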

Andrew Batishko | Faithlife software developer

Posts 5921
SineNomine | Forum Activity | Replied: Wed, Jun 3 2015 3:22 PM

Is it possible for Faithlife to manually override the imperfect algorithm?

Posts 9181
LogosEmployee

SineNomine:

Is it possible for Faithlife to manually override the imperfect algorithm?

Yes, but see caveats discussed here: https://community.logos.com/forums/p/61695/437957.aspx#437957 

Posts 5921
SineNomine | Forum Activity | Replied: Wed, Jun 3 2015 4:54 PM

Bradley Grainger (Faithlife):

Am I understanding correctly that the caveat is that it's a pain in the neck to do?

Posts 19577
Rosie Perera | Forum Activity | Replied: Wed, Jun 3 2015 4:57 PM

SineNomine:

Bradley Grainger (Faithlife):

Am I understanding correctly that the caveat is that it's a pain in the neck to do?

I think the caveat is that it might break something, because there are complicated dependencies.

Posts 5921
SineNomine | Forum Activity | Replied: Wed, Jun 3 2015 5:24 PM

Rosie Perera:
I think the caveat is that it might break something, because there are complicated dependencies.

If a specific exception of the type "'Forli' is not a form of the word 'for'" breaks something (I guess one might also need to define what other forms of Forli there might be; I'm thinking "Forli's" and nothing else), then that would seem to indicate that there's something else more seriously wrong.

Incidentally, an end user tool for making precisely this kind of separation, though quite possibly impractical or nearly impossible to program, would make me quite satisfied.
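To make concrete what I mean by a specific exception, here is a minimal Python sketch of an override table that is consulted before the algorithmic stemmer. Everything in it is hypothetical: STEM_OVERRIDES, toy_stem and stem_with_overrides are names invented for this example, toy_stem is only a stand-in that strips a final "li", and I have no idea how Faithlife's indexing is actually structured.

```python
# Hypothetical override table: each listed word keeps the stem given here.
STEM_OVERRIDES = {
    "forli": "forli",
    "forli's": "forli",
}


def toy_stem(word):
    """Stand-in for an algorithmic stemmer; it only strips a final 'li'."""
    if word.endswith("li") and len(word) > 2:
        return word[:-2]
    return word


def stem_with_overrides(word, overrides=STEM_OVERRIDES, stemmer=toy_stem):
    """Return the override for a word if one exists, else the stemmer's result."""
    key = word.lower()
    return overrides.get(key, stemmer(key))


print(toy_stem("forli"))               # for    (the unwanted grouping)
print(stem_with_overrides("Forli"))    # forli  (kept apart from "for")
print(stem_with_overrides("Forli's"))  # forli
print(stem_with_overrides("for"))      # for
```

The lookup itself is trivial. Any stems already stored in an existing index would presumably have to be rebuilt to match, which may be where the complicated dependencies come in.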

Posts 2068
Forum MVP
Reuben Helmuth | Forum Activity | Replied: Wed, Jun 3 2015 5:41 PM

Since this ability/inability seems like it would be directly connected to "fuzzy search," perhaps Faithlife should simply change the last phrase in this screenshot!
