SEARCH Punctuation question for Bradley, Dave or other super search nerd.
I just discovered that my second rule of building a search is wrong. It was my understanding that one could not use punctuation in a search term e.g. here's would be treated as heres. It was also my understanding that FileFormat.Info was our "official" super-nerd site to identify Unicode characters that are considered "punctuation".
I was wrong. "aint" and "ain't" produce different results - the apostrophe being treated as a legit letter although defined as punctuation at https://www.fileformat.info/info/unicode/category/Po/list.htm. What is the actual treatment of punctuation?
Orthodox Bishop Alfeyev: "To be a theologian means to have experience of a personal encounter with God through prayer and worship."; Orthodox proverb: "We know where the Church is, we do not know where it is not."
Comments
MJ. Smith said:
I just discovered that my second rule of building a search is wrong. It was my understanding that one could not use punctuation in a search term e.g. here's would be treated as heres. It was also my understanding that FileFormat.Info was our "official" super-nerd site to identify Unicode characters that are considered "punctuation".
I was wrong. "aint" and "ain't" produce different results - the apostrophe being treated as a legit letter although defined as punctuation at https://www.fileformat.info/info/unicode/category/Po/list.htm. What is the actual treatment of punctuation?
Not a super search nerd (I have never heard of your file reference), but I think it was Bradley who recently corrected my mistaken assumption that punctuation is simply left out. I was informed that this is correct only between words, not within words. The apostrophe within "ain't" should therefore make a difference. I'll look up the thread later.
Have joy in the Lord!
Thanks - that matches what I'm seeing but doesn't match some earlier answers in the forums.
MJ. Smith said:
Thanks - that matches what I'm seeing but doesn't match some earlier answers in the forums.
The discussion was in a Faithlife Search group thread about a bug with multiple "OR"s (fixed with 8.13). The search was for what in the end turned out to be "Beni Na'im", "Beni Naʿim", "Beni Na im", "Beni Naim". Now that the bug is fixed, you can use OR instead of the commas. However, different styles of apostrophe are distinct from each other, from a space, and from simply leaving the apostrophe out, regardless of the match-all-forms setting. Bradley commented on my cited side remark as follows:
"Intra-word punctuation has always been indexed."
Actually that wasn't what I had remembered, but maybe there just was no need to discuss intra-word punctuation before.
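To make the between-words versus within-words distinction concrete, here is a minimal Python sketch. It is only an illustration of the behaviour described in this thread, not the actual indexer: the `TOKEN` pattern, the `tokenize` function, and the choice of which marks count as intra-word are all my own assumptions.

```python
import re

# Hypothetical rule: a token is a run of word characters, optionally joined
# by a single intra-word mark (ASCII apostrophe, U+2019, U+02BF, or hyphen).
# Punctuation that is not surrounded by word characters is dropped.
TOKEN = re.compile(r"\w+(?:['\u2019\u02BF-]\w+)*")


def tokenize(text):
    """Split text into lowercase tokens, keeping intra-word punctuation."""
    return TOKEN.findall(text.lower())


# Punctuation between words disappears; punctuation inside a word survives.
print(tokenize("Here's, there!"))

# Each spelling yields a different token sequence, so each would need its
# own search term (or an OR between them).
for name in ("Beni Na'im", "Beni Na\u02BFim", "Beni Na im", "Beni Naim"):
    print(tokenize(name))
```

Under these assumptions "aint" and "ain't" are simply two different tokens, which would explain why they produce different results.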
NB.Mick said:
"Intra-word punctuation has always been indexed."
The classic case is a search for God with Match all word Forms enabled, which returns "God's" as a result (as well as "gods"). To search for "God's" only, you have to enter God's, and turn off Match all word Forms.
Hyphenated words are also indexed as written, e.g. intra-text, and you can also find them with the phrase "intra text" (quotes included).
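One way the hyphen behaviour could come about is sketched below in Python. This is a guess for illustration only, not a description of the real index: `build_index`, `find_phrase`, and the idea of storing both the whole hyphenated word and its parts at consecutive positions are assumptions.

```python
import re

# Hypothetical token rule: word characters joined by apostrophes or hyphens.
TOKEN = re.compile(r"\w+(?:['\u2019-]\w+)*")


def build_index(text):
    """Map each searchable term to the set of word positions where it occurs."""
    index = {}
    pos = 0
    for token in TOKEN.findall(text.lower()):
        parts = token.split("-")
        if len(parts) > 1:
            # Store the hyphenated word as a whole at its starting position.
            index.setdefault(token, set()).add(pos)
        for part in parts:
            # Also store each part, so a quoted phrase can match them in order.
            index.setdefault(part, set()).add(pos)
            pos += 1
    return index


def find_phrase(index, phrase):
    """Return start positions where the words of phrase occur consecutively."""
    words = phrase.lower().split()
    if not words:
        return []
    starts = index.get(words[0], set())
    return sorted(
        p for p in starts
        if all(p + i in index.get(w, set()) for i, w in enumerate(words))
    )


idx = build_index("An intra-text link and an intratext link")
print(find_phrase(idx, "intra-text"))  # the hyphenated form matches
print(find_phrase(idx, "intra text"))  # the quoted phrase matches the same spot
print(find_phrase(idx, "intratext"))   # the unhyphenated word is a separate term
```

In this sketch the hyphenated form and the two-word phrase land on the same position, while "intratext" without the hyphen remains a different term.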
Dave
===Windows 11 & Android 13
Dave Hooton said:
To search for "God's" only, you have to enter God's, and turn off Match all word Forms.
Alternative (with Match all word Forms checked) using an advanced search directive is:
Keep Smiling [:)]