Searching with British or US spelling - different results!

Colin
Colin Member Posts: 256 ✭✭
edited November 2024 in English Forum

I had naively assumed that I could search with either British or US spelling and get the results for both. However, I searched my library today for "empathise" and found just 26 results. When I searched for "empathize", it returned 718 (different) hits. (My search was more complex, but when I got no meaningful results I tried just this word.)

Is this the intended behaviour of the software? If so, is it worth going to uservoice and making a suggestion to return all results for a word whether it is printed by a British publisher or an American one? 

Colin

Comments

  • JT (alabama24)
    JT (alabama24) MVP Posts: 36,523

    Colin said:

    Is this the intended behaviour of the software?

    Yes! 

    Colin said:

    is it worth going to uservoice and making a suggestion to return all results for a word whether it is printed by a British publisher or an American one?

    Probably not. I assume this would be very difficult to implement. How would you go about it? 

    macOS, iOS & iPadOS |Logs| Install
    Choose Truth Over Tribe | Become a Joyful Outsider!

  • Colin
    Colin Member Posts: 256 ✭✭

    Thanks Alabama for clearing that up!

    alabama24 said:

    I assume this would be very difficult to implement. How would you go about it? 

    No idea but Google manages it somehow .... 

    Colin

  • DominicM
    DominicM Member Posts: 2,995 ✭✭✭

    The easiest way is to use a wildcard in these cases: empathi*e would return both sets in the same search.
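
    To see why a wildcard catches both spellings, here is a minimal Python sketch using shell-style matching from the standard library. It is only an illustration, not how Logos implements wildcards, and the word list is invented for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical word list standing in for terms indexed in a library.
WORDS = ["empathise", "empathize", "empathy", "emphasise", "empathetic"]


def wildcard_matches(pattern: str, words: list[str]) -> list[str]:
    """Return the words matching a shell-style pattern, where * is any run of characters."""
    return [w for w in words if fnmatchcase(w, pattern)]


# Both spellings match; the unrelated words do not.
print(wildcard_matches("empathi*e", WORDS))
```

    Because * stands for any run of characters, the pattern can also match longer words that merely start with "empathi" and end in "e", so it is worth checking the hits.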

    Never Deprive Anyone of Hope.. It Might Be ALL They Have

  • PetahChristian
    PetahChristian Member Posts: 4,635 ✭✭✭

    alabama24 said:

    I assume this would be very difficult to implement. How would you go about it?

    Some degree of search normalization is already implemented in Logos. For example, naïve returns results that include naive (diacritic-insensitive), and shepherd returns results that include Shepherd (case-insensitive).

    UK/US normalization involves the same manner of canonicalization, but the real dilemma is how far you take it:

    • Do you only canonicalize spellings (e.g., normalise -> normalize)?
    • What about words with different meanings, such as bonnet? Should it also match hood but not hat?
    • Do you include jelly in results for jam? Was that the right sense for the term?

    The difficulty really lies not with what is technically possible, but what people expect. As soon as you start returning results that don't literally match what a user expected, are the results incorrect? Should the user have to then exclude any false positives? At that point, it's a no-win situation.

    I think FL has chosen the most practical solution, which is to not surprise users and return exactly what the user searched for, while giving them the power to use grouping/lists/OR to specify alternate spellings
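
    As a rough illustration of the canonicalization described above, here is a minimal Python sketch. It is not Logos code: the UK_TO_US table and the function names are assumptions made up for the example, and the table only covers spellings, not differences in meaning such as bonnet/hood.

```python
import unicodedata

# Tiny hand-made table (an assumption for illustration): British -> US spellings.
UK_TO_US = {
    "normalise": "normalize",
    "empathise": "empathize",
    "behaviour": "behavior",
}


def fold(term: str) -> str:
    """Case- and diacritic-insensitive form of a term."""
    decomposed = unicodedata.normalize("NFD", term.casefold())
    # Drop the combining marks that NFD splits off from accented letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def canonical(term: str) -> str:
    """Fold the term, then map a known British spelling to its US counterpart."""
    folded = fold(term)
    return UK_TO_US.get(folded, folded)


print(fold("Naïve"), fold("Shepherd"), canonical("Normalise"), canonical("bonnet"))
```

    Two terms would then match when their canonical forms are equal. Anything not in the table, such as bonnet, is left alone, which is exactly where the "how far do you take it" question starts.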

    Thanks to FL for including Carta and a Hebrew audio bible in Logos 9!

  • JT (alabama24)
    JT (alabama24) MVP Posts: 36,523

    Colin said:

    No idea but Google manages it somehow .... 

    1. Google spends BILLIONS on their search engine. Logos doesn't. 
    2. Google searches are done online, not locally. Would you be ok gaining this feature but losing offline (local) searches?
    3. You can accomplish what you want now, with the right syntax. 


  • Justin Gatlin
    Justin Gatlin Member, MVP Posts: 2,336

    PetahChristian said:

    alabama24 said:

    I assume this would be very difficult to implement. How would you go about it?

    Some degree of search normalization is already implemented in Logos. For example, naïve returns results that include naive (diacritic-insensitive), and shepherd returns results that include Shepherd (case-insensitive).

    UK/US normalization involves the same manner of canonicalization, but the real dilemma is how far you take it:

    • Do you only canonicalize spellings (e.g., normalise -> normalize)?
    • What about words with different meanings, such as bonnet? Should it also match hood but not hat?
    • Do you include jelly in results for jam? Was that the right sense for the term?

    The difficulty really lies not with what is technically possible, but what people expect. As soon as you start returning results that don't literally match what a user expected, are the results incorrect? Should the user have to then exclude any false positives? At that point, it's a no-win situation.

    I think FL has chosen the most practical solution, which is to not surprise users and return exactly what the user searched for, while giving them the power to use grouping/lists/OR to specify alternate spellings

    The logical thing would be to include alternate spellings in Include All Word Forms, and not when that is unselected. That gives both power and ease of use.

  • Dave Hooton
    Dave Hooton MVP Posts: 36,357

    PetahChristian said:

    The difficulty really lies not with what is technically possible, but what people expect. As soon as you start returning results that don't literally match what a user expected, are the results incorrect?

    Match All Word Forms will return results that don't literally match what was entered (and it could be a user's default). But it doesn't normali[sz]e British/US spellings. So a solution would have to involve a new Search option Normali[sz]e British/US Spellings, and hope that FL can also change and use it to feed the algorithm used by Match All....

    EDIT: If FL do the British/US variants the algorithm could be given each variant when Match All... is selected.

    Dave
    ===

    Windows 11 & Android 13

  • JT (alabama24)
    JT (alabama24) MVP Posts: 36,523

    I wonder how that works. I assume that some basic rules are applied (prefix, suffix, etc.) which wouldn't apply here. If that is the case, each and every variant would have to be added manually. I could be wrong. I have been before. I agree, however, with your thinking!


  • Dave Hooton
    Dave Hooton MVP Posts: 36,357

    alabama24 said:

    I wonder how that works. I assume that some basic rules are applied (prefix, suffix, etc.) which wouldn't apply here. If that is the case, each and every variant would have to be added manually.

    The algorithm for Match All.. is complex, but wouldn't have to be modified if FL gave it each British/US variant triggered by a separate British/US Spelling option.

    Dave

  • JT (alabama24)
    JT (alabama24) MVP Posts: 36,523

    Dave Hooton said:

    alabama24 said:

    I wonder how that works. I assume that some basic rules are applied (prefix, suffix, etc.) which wouldn't apply here. If that is the case, each and every variant would have to be added manually.

    The algorithm for Match All.. is complex, but wouldn't have to be modified if FL gave it each British/US variant triggered by a separate British/US Spelling option.

    I agree... if this were possible, this is how it should be implemented. My question, however, is would there be a good way to create rules for this? Would there be many false positives?


  • Dave Hooton
    Dave Hooton MVP Posts: 36,357

    Justin Gatlin said:

    The logical thing would be to include alternate spellings in Include All Word Forms, and not when that is unselected.

    The most flexible approach is to have a separate option for British/US Spellings (see my other posts).

    Dave

  • Dave Hooton
    Dave Hooton MVP Posts: 36,357

    alabama24 said:

    My question, however, is would there be a good way to create rules for this? Would there be many false positives?

    I'd say a lexical approach would work much better than an algorithm based on letter usage e.g. 's' versus 'z'.
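
    A small Python sketch of the difference, under stated assumptions: the LEXICON table and both function names are hypothetical, and the letter rule is deliberately naive. It shows the kind of false positives a rule based on 's' versus 'z' could produce, and a variant it would miss.

```python
import re

# Hypothetical lexicon of known British -> US pairs (illustrative, not exhaustive).
LEXICON = {
    "empathise": "empathize",
    "normalise": "normalize",
    "behaviour": "behavior",
}


def rule_based(word: str) -> str:
    """Naive letter rule: rewrite a trailing 'ise' as 'ize'."""
    return re.sub(r"ise$", "ize", word)


def lexical(word: str) -> str:
    """Rewrite only the words that are listed in the lexicon."""
    return LEXICON.get(word, word)


for word in ["empathise", "promise", "wise", "behaviour"]:
    print(f"{word}: rule -> {rule_based(word)}, lexicon -> {lexical(word)}")
```

    The rule turns promise and wise into non-words and leaves behaviour untouched, while the lexicon handles all four correctly at the cost of having to list every pair.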

    Dave

  • Colin
    Colin Member Posts: 256 ✭✭

    Thanks for continuing to discuss this, guys, and for giving me a little hope at least. I want to make a point I left unstated in my first post: this 'problem' affects all of us. There were results found using the British spelling query which did not show when using the American one.

    It's in all our interest either to know how to construct a search which will discover both spellings or to see if the experts can figure out a way to modify the search engine so it will be done automatically.

    Colin

  • Tom Reynolds
    Tom Reynolds Member Posts: 1,460 ✭✭✭

    While I appreciate that this might be difficult, it seems clear that people are missing out on results because of spelling variations or errors. For example, if you are not even aware that a word has a different UK spelling, you are unlikely to ever find it. When you add in spelling mistakes or old forms of spelling, you have quite a mess. To find all the instances of counselling, for instance, you would have to do at least 5 searches (BTW I entered typo reports for the first two variations).

    couseling (7x in my library)

    conseling (1x)

    counsselling (1x)

    counseling (26503x)

    counselling (1709x)
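
    One way to round up near-miss spellings like these before searching is approximate string matching. The Python sketch below uses difflib from the standard library. It is an assumption-laden illustration, not a Logos feature: the vocabulary is just the five spellings above plus two unrelated words, and the 0.85 cutoff is an arbitrary choice.

```python
from difflib import get_close_matches

# The five spellings listed above, plus two unrelated words as a sanity check.
VOCABULARY = [
    "couseling", "conseling", "counsselling", "counseling", "counselling",
    "console", "council",
]


def near_spellings(term: str, vocabulary: list[str], cutoff: float = 0.85) -> list[str]:
    """Return vocabulary words whose similarity to the term is at least the cutoff."""
    return get_close_matches(term, vocabulary, n=len(vocabulary), cutoff=cutoff)


matches = near_spellings("counselling", VOCABULARY)
print(sorted(matches))
```

    With this vocabulary the five counselling spellings are returned and console and council are not. A looser cutoff would start pulling in different words, so the resulting list still needs a human eye before it is turned into a search.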

  • David Ames
    David Ames Member Posts: 2,971 ✭✭✭

    RE: search with British or US spelling 

    https://en.wikipedia.org/wiki/Wikipedia:List_of_spelling_variants 

    It has a list of 158 words that have different spellings.

  • JT (alabama24)
    JT (alabama24) MVP Posts: 36,523

    Colin said:

    It's in all our interest either to know how to construct a search which will discover both spellings

    That I can help you with. A search for behaviour OR behavior will find both spellings of the word. The operator "OR" means that only one of the variants has to be true.


  • Dave Hooton
    Dave Hooton MVP Posts: 36,357

    alabama24 said:

    That I can help you with. A search for behaviour OR behavior will find both spellings of the word.

    They will have different highlight colours in results. Use behaviour, behavior for the same color.
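
    If you keep your own list of variants, building either form of the query can be scripted. This is a minimal Python sketch: the VARIANTS table and the function names are hypothetical, and the only search syntax it relies on is the OR and comma forms mentioned in this thread.

```python
# Hypothetical variant table (British spelling -> US spellings); a real one
# would need many more entries.
VARIANTS = {
    "behaviour": ["behavior"],
    "empathise": ["empathize"],
    "counselling": ["counseling"],
}


def spellings(term: str) -> list[str]:
    """Return the term plus every spelling linked to it, in either direction."""
    t = term.lower()
    for uk, us_forms in VARIANTS.items():
        if t == uk or t in us_forms:
            return [uk, *us_forms]
    return [t]  # No known variants: search for the term as typed.


def or_query(term: str) -> str:
    """Query joined with OR (each variant may get its own highlight colour)."""
    return " OR ".join(spellings(term))


def list_query(term: str) -> str:
    """Query joined with commas (all variants share one highlight colour)."""
    return ", ".join(spellings(term))


print(or_query("behavior"))
print(list_query("empathise"))
```

    The output can be pasted into the search box. Terms with no entry in the table pass through unchanged.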

    Dave

  • Suzy
    Suzy Member Posts: 325 ✭✭

    Dave Hooton said:

    alabama24 said:

    That I can help you with. A search for behaviour OR behavior will find both spellings of the word.

    They will have different highlight colours in results. Use behaviour, behavior for the same color.

    I wonder if it would be helpful if the predictive text in the search box could offer these options as a reminder? Possibly even in a different colour? Or even color :-)

  • Dave Hooton
    Dave Hooton MVP Posts: 36,357

    Suzy said:

    I wonder if it would be helpful if the predictive text in the search box could offer these options as a reminder? Possibly even in a different colour? Or even color :-)

    It will suggest on a lexical basis if you type e.g. behavio, but you have to make the choice. What you suggest is possible but it implies the search term will be behaviour, behavior whereas a search option would enable British & US spellings or disable it for either word.

    Dave