Searching with British or US spelling - different results!

This post has 20 Replies | 2 Followers

Posts 234
Colin | Forum Activity | Posted: Fri, Mar 17 2017 8:24 AM

I had naively assumed that I could search with either British or US spelling and get the combined results of both queries. However, when I searched my library today for "empathise", I found just 26 results. A search for "empathize" returns 718 (different) hits. (My original search was more complex, but when I got no meaningful results I tried just this word.)

Is this the intended behaviour of the software? If so, is it worth going to uservoice and making a suggestion to return all results for a word whether it is printed by a British publisher or an American one? 

Colin

Posts 27000
Forum MVP
JT (alabama24) | Forum Activity | Replied: Fri, Mar 17 2017 10:20 AM

Colin:
Is this the intended behaviour of the software?

Yes! 

Colin:
is it worth going to uservoice and making a suggestion to return all results for a word whether it is printed by a British publisher or an American one?

Probably not. I assume this would be very difficult to implement. How would you go about it? 

OSX & iOS | Logs |  Install

Posts 234
Colin | Forum Activity | Replied: Fri, Mar 17 2017 11:03 AM

Thanks Alabama for clearing that up!

alabama24:
I assume this would be very difficult to implement. How would you go about it? 

No idea but Google manages it somehow .... 

Colin

Posts 2681
DominicM | Forum Activity | Replied: Fri, Mar 17 2017 11:11 AM

The easiest way is to use a wildcard in these cases: empathi*e would return both sets in the same search.
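To illustrate why a single wildcard covers both spellings, here is a minimal sketch in Python. It uses the standard-library fnmatch module as a stand-in; it is not how Logos evaluates wildcards, and the helper name wildcard_match and the word list are made up for the example.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match a shell-style wildcard pattern, ignoring case."""
    lowered = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), lowered)]


# Hypothetical word list, for illustration only.
words = ["empathise", "empathize", "Empathize", "emphasise", "empathy"]

# '*' stands for any run of characters, so both the 's' and 'z' spellings match.
matches = wildcard_match("empathi*e", words)
print(matches)
```

Because * matches any run of characters, a pattern like this can also pick up longer words that merely start with "empathi" and end in "e". In fnmatch, ? matches exactly one character, so "empathi?e" is the tighter pattern there; whether the same holds in Logos is not something this thread confirms.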

Never Deprive Anyone of Hope.. It Might Be ALL They Have

Posts 2968
Forum MVP
PetahChristian | Forum Activity | Replied: Fri, Mar 17 2017 11:47 AM

alabama24:
I assume this would be very difficult to implement. How would you go about it?

Some degree of search normalization is already implemented in Logos. For example, naïve returns results that include naive (diacritic-insensitive), and shepherd returns results that include Shepherd (case-insensitive).

UK/US normalization involves the same manner of canonicalization, but the real dilemma is how far you take it:

  • Do you only canonicalize spellings (e.g., normalise -> normalize)?
  • What about words with different meanings, such as bonnet? Should it also match hood but not hat?
  • Do you include jelly in results for jam? Was that the right sense for the term?

The difficulty really lies not with what is technically possible, but what people expect. As soon as you start returning results that don't literally match what a user expected, are the results incorrect? Should the user have to then exclude any false positives? At that point, it's a no-win situation.

I think FL has chosen the most practical solution, which is to not surprise users and return exactly what the user searched for, while giving them the power to use grouping/lists/OR to specify alternate spellings.
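As a rough illustration of the canonicalization described above, here is a minimal Python sketch. It is an assumption-laden example, not Logos's implementation: the function names fold and canonical and the tiny UK_TO_US table are hypothetical, and only the spelling-only case from the first bullet is handled.

```python
import unicodedata

# Hypothetical, hand-picked spelling table; a real one would be far larger.
UK_TO_US = {
    "normalise": "normalize",
    "empathise": "empathize",
    "behaviour": "behavior",
}


def fold(term: str) -> str:
    """Return a case- and diacritic-insensitive form of a term."""
    # NFD splits 'ï' into 'i' plus a combining diaeresis, which is then dropped.
    decomposed = unicodedata.normalize("NFD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def canonical(term: str) -> str:
    """Fold the term, then map a known UK spelling to its US spelling."""
    folded = fold(term)
    return UK_TO_US.get(folded, folded)


print(fold("naïve"), fold("Shepherd"), canonical("Normalise"), canonical("bonnet"))
```

Indexing and querying on canonical(term) would make normalise and normalize hit the same entries. Note that bonnet passes through unchanged: a spelling table says nothing about word senses, which is exactly where the dilemma in the other two bullets starts.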

Posts 27000
Forum MVP
JT (alabama24) | Forum Activity | Replied: Fri, Mar 17 2017 2:08 PM

Colin:

No idea but Google manages it somehow .... 

  1. Google spends BILLIONS on their search engine. Logos doesn't. 
  2. Google searches are done online, not locally. Would you be ok gaining this feature but losing offline (local) searches?
  3. You can accomplish what you want now, with the right syntax. 

OSX & iOS | Logs |  Install

Posts 873
Justin Gatlin | Forum Activity | Replied: Fri, Mar 17 2017 3:29 PM

PetahChristian:

alabama24:
I assume this would be very difficult to implement. How would you go about it?

Some degree of search normalization is already implemented in Logos. For example, naïve returns results that include naive (diacritic-insensitive), and shepherd returns results that include Shepherd (case-insensitive).

UK/US normalization involves the same manner of canonicalization, but the real dilemma is how far you take it:

  • Do you only canonicalize spellings (e.g., normalise -> normalize)?
  • What about words with different meanings, such as bonnet? Should it also match hood but not hat?
  • Do you include jelly in results for jam? Was that the right sense for the term?

The difficulty really lies not with what is technically possible, but what people expect. As soon as you start returning results that don't literally match what a user expected, are the results incorrect? Should the user have to then exclude any false positives? At that point, it's a no-win situation.

I think FL has chosen the most practical solution, which is to not surprise users and return exactly what the user searched for, while giving them the power to use grouping/lists/OR to specify alternate spellings.

The logical thing would be to include alternate spellings in Include All Word Forms, and not when that is unselected. That gives both power and ease of use.

Posts 24347
Forum MVP
Dave Hooton | Forum Activity | Replied: Fri, Mar 17 2017 3:49 PM

PetahChristian:
The difficulty really lies not with what is technically possible, but what people expect. As soon as you start returning results that don't literally match what a user expected, are the results incorrect?

Match All Word Forms will return results that don't literally match what was entered (and it could be a user's default). But it doesn't normali[sz]e British/US spellings. So a solution would have to involve a new Search option Normali[sz]e British/US Spellings, and hope that FL can also change and use it to feed the algorithm used by Match All....

EDIT: If FL do the British/US variants, the algorithm could be given each variant when Match All... is selected.

Dave
===

Windows & Android

Posts 27000
Forum MVP
JT (alabama24) | Forum Activity | Replied: Fri, Mar 17 2017 3:50 PM

I wonder how that works. I assume that some basic rules are applied (prefix, suffix, etc.) which wouldn't apply here. If that is the case, each and every variant would have to be added manually. I could be wrong. I have been before. I agree, however, with your thinking!

OSX & iOS | Logs |  Install

Posts 24347
Forum MVP
Dave Hooton | Forum Activity | Replied: Fri, Mar 17 2017 4:00 PM

alabama24:
I wonder how that works. I assume that some basic rules are applied (prefix, suffix, etc.) which wouldn't apply here. If that is the case, each and every variant would have to be added manually.

The algorithm for Match All.. is complex, but wouldn't have to be modified if FL gave it each British/US variant triggered by a separate British/US Spelling option.

Dave
===

Windows & Android

Posts 27000
Forum MVP
JT (alabama24) | Forum Activity | Replied: Fri, Mar 17 2017 4:09 PM

Dave Hooton:

alabama24:
I wonder how that works. I assume that some basic rules are applied (prefix, suffix, etc.) which wouldn't apply here. If that is the case, each and every variant would have to be added manually.

The algorithm for Match All.. is complex, but wouldn't have to be modified if FL gave it each British/US variant triggered by a separate British/US Spelling option.

I agree... if this were possible, this is how it should be implemented. My question, however, is would there be a good way to create rules for this? Would there be many false positives?

OSX & iOS | Logs |  Install

Posts 24347
Forum MVP
Dave Hooton | Forum Activity | Replied: Fri, Mar 17 2017 4:21 PM

Justin Gatlin:
The logical thing would be to include alternate spellings in Include All Word Forms, and not when that is unselected.

The most flexible approach is to have a separate option for British/US Spellings (see my other posts).

Dave
===

Windows & Android

Posts 24347
Forum MVP
Dave Hooton | Forum Activity | Replied: Fri, Mar 17 2017 4:32 PM

alabama24:
My question, however, is would there be a good way to create rules for this? Would there be many false positives?

I'd say a lexical approach would work much better than an algorithm based on letter usage e.g. 's' versus 'z'.
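A small Python sketch of the difference, under stated assumptions: rule_based is a deliberately naive letter rule (rewrite a trailing -ise as -ize), and LEXICON is a tiny hypothetical word list. Neither reflects anything FL has built.

```python
import re


def rule_based(word: str) -> str:
    """Naive letter rule: rewrite a trailing '-ise' as '-ize'."""
    return re.sub(r"ise$", "ize", word)


# Hypothetical lexicon of known British -> US spelling pairs.
LEXICON = {
    "empathise": "empathize",
    "normalise": "normalize",
}


def lexical(word: str) -> str:
    """Look the word up in the lexicon; leave unknown words untouched."""
    return LEXICON.get(word, word)


for word in ["empathise", "wise", "promise", "exercise"]:
    print(f"{word}: rule -> {rule_based(word)}, lexicon -> {lexical(word)}")
```

The letter rule turns wise, promise and exercise into non-words, which is the kind of false positive asked about above. The lexicon only changes words it actually lists, at the cost of someone having to maintain the list.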

Dave
===

Windows & Android

Posts 234
Colin | Forum Activity | Replied: Fri, Mar 17 2017 11:07 PM

Thanks for carrying on the discussion, guys, and for giving me a little hope at least. I want to make a point that I left unstated in my first post: this 'problem' affects all of us. Some results found with the British spelling did not show up with the American one.

It's in all our interest either to know how to construct a search which will discover both spellings or to see if the experts can figure out a way to modify the search engine so it will be done automatically.

Colin

Posts 948
Tom Reynolds | Forum Activity | Replied: Sat, Mar 18 2017 1:26 AM

While I appreciate that this might be difficult, it seems clear that people are missing out on results because of spelling variations or errors. For example, if you are not even aware that a word has different UK and US spellings, you are unlikely ever to find them. When you add in spelling mistakes or old forms of spelling, you have quite a mess. For example, to find all the instances of counselling you would have to do at least 5 searches (BTW I entered typo reports for the first two variations).

couseling (7x in my library)

conseling (1x)

counsselling (1x)

counseling (26503x)

counselling (1709x)
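One way to discover such near-miss spellings without guessing each one is approximate string matching. The sketch below uses Python's standard-library difflib; this is a swapped-in technique for illustration, not something the thread says Logos offers, and the vocabulary list is hypothetical (in practice it would come from the words indexed in a library).

```python
from difflib import get_close_matches

# Hypothetical vocabulary; the five spellings are the ones listed above.
vocabulary = [
    "couseling", "conseling", "counsselling", "counseling", "counselling",
    "console", "council",
]

# Keep every word whose similarity to the target is at least 0.8 (on a 0..1 scale).
candidates = get_close_matches("counselling", vocabulary, n=10, cutoff=0.8)

# The candidates can then be joined into a single OR search.
query = " OR ".join(sorted(candidates))
print(query)
```

The cutoff of 0.8 is an arbitrary choice for this example: set it lower and unrelated words creep in, set it higher and real typos are missed.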

Posts 2296
David Ames | Forum Activity | Replied: Sat, Mar 18 2017 4:33 AM

RE: search with British or US spelling 

https://en.wikipedia.org/wiki/Wikipedia:List_of_spelling_variants 

Has a list of 158 words that have different spellings

Posts 27000
Forum MVP
JT (alabama24) | Forum Activity | Replied: Sat, Mar 18 2017 5:37 AM

Colin:
It's in all our interest either to know how to construct a search which will discover both spellings

That I can help you with. A search for behaviour OR behavior will find both spellings of the word. The operator "OR" means that only one of the variants has to be true.

OSX & iOS | Logs |  Install

Posts 24347
Forum MVP
Dave Hooton | Forum Activity | Replied: Sat, Mar 18 2017 2:28 PM

alabama24:
That I can help you with. A search for behaviour OR behavior will find both spellings of the word.

They will have different highlight colours in results. Use behaviour, behavior for the same color.
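If you keep your own list of variants, both query forms mentioned here can be generated rather than typed each time. A minimal Python sketch, assuming a hypothetical VARIANTS table and helper names or_query and list_query; it only builds the query text to paste into the search box.

```python
# Hypothetical table mapping a spelling to its known alternates.
VARIANTS = {
    "behaviour": ["behavior"],
    "empathise": ["empathize"],
}


def all_spellings(word: str) -> list[str]:
    """Return the word followed by any alternates listed for it."""
    return [word, *VARIANTS.get(word, [])]


def or_query(word: str) -> str:
    """Build the OR form, e.g. 'behaviour OR behavior'."""
    return " OR ".join(all_spellings(word))


def list_query(word: str) -> str:
    """Build the comma-list form, e.g. 'behaviour, behavior'."""
    return ", ".join(all_spellings(word))


print(or_query("behaviour"))
print(list_query("behaviour"))
```

A word with no entry in the table comes back unchanged, so the helpers are safe to apply to every term in a longer search.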

Dave
===

Windows & Android

Posts 27000
Forum MVP
JT (alabama24) | Forum Activity | Replied: Sat, Mar 18 2017 3:04 PM

Yes

OSX & iOS | Logs |  Install

Posts 325
Sue McIntyre | Forum Activity | Replied: Sat, Mar 18 2017 4:13 PM

Dave Hooton:

alabama24:
That I can help you with. A search for behaviour OR behavior will find both spellings of the word.

They will have different highlight colours in results. Use behaviour, behavior for the same color.

I wonder if it would be helpful if the predictive text in the search box could offer these options as a reminder? Possibly even in a different colour? Or even color :-)
