Bug: Very weird indexing/search bug

This is very weird. If I do a search for a Bible reference like <bible ~ 2 Kings 8>, I get search results as if I'd been searching for church! The problem only occurs between 2 Kings 8 and 2 Kings 19. Any other Bible range seems to work fine!!
This is my personal Faithlife account. On 1 March 2022, I started working for Faithlife, and have a new 'official' user account. Posts on this account shouldn't be taken as official Faithlife views!
Comments
I can't reproduce it but I don't have those resources. If you search for "church" do you get hits on those same results in NNCD?
Dave
===Windows 11 & Android 13
0 -
Dave Hooton said:
If you search for "church" do you get hits on those same results in NNCD?
Yes, searching for church returns exactly the same results as <bible ~ 2 Kings 8> (so long as "search all word forms" is off).
In my main index, I get zero results for church NOTEQUALS <bible ~ 2 Kings 8>, but many results in my supplementary index.
If I run a search for <bible ~ 2 Kings 8> NOTEQUALS church, I get a few results in both indexes (688 resources).
If I run a search for <bible ~ 2 Kings 8> ANDEQUALS church, I get no results in supplementary, and many results in the main index (3,712 resources).
Given that Rosie can duplicate this, and it doesn't affect new resources, I guess it's likely to point to an old indexing bug that's subsequently been fixed.
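The reasoning behind those three searches can be sketched with plain sets. This is only a rough model and rests on assumptions: each hit is treated as a (resource, position) pair, NOTEQUALS is modelled as set difference and ANDEQUALS as set intersection, and all resource names and positions are made up for illustration. It is not how Logos evaluates these operators internally.

```python
# Hypothetical hits, each a (resource, position) pair.
church_hits = {("Resource A", 10), ("Resource A", 52), ("Resource B", 7)}
genuine_reference_hits = {("Resource C", 3)}

# In the affected index, the Bible reference search returns the "church"
# hits in addition to a few genuine reference hits.
bible_query_hits = church_hits | genuine_reference_hits


def not_equals(a, b):
    """Hits of a that do not coincide with any hit of b (assumed model)."""
    return a - b


def and_equals(a, b):
    """Hits where a and b coincide exactly (assumed model)."""
    return a & b


# church NOTEQUALS <bible ...>: nothing left, every "church" hit is also a reference hit.
print(not_equals(church_hits, bible_query_hits))
# <bible ...> NOTEQUALS church: only the few genuine reference hits remain.
print(not_equals(bible_query_hits, church_hits))
# <bible ...> ANDEQUALS church: all of the "church" hits.
print(len(and_equals(bible_query_hits, church_hits)))
```

An empty first result combined with a small second result is the pattern reported for the main index: the reference search contains every "church" hit plus a handful of real ones.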
0 -
Are they note markers (footnotes), i.e., do you get a popup when you hover over the word "church"?
EDIT: that is unlikely as you get zero results for church NOTEQUALS <bible ~ 2 Kings 8> - looks like a library rebuild will be needed.
Dave
0 -
Dave Hooton said:
looks like a library rebuild will be needed
I came to the same conclusion, but I'm holding off for now, in case Bradley wants me to check something.
0 -
I'm not getting results for "Church" or for the NNCD, or other unexpected results, so my guess is that you do need to rebuild your Library index.
0 -
Except Dave and Dominick have very similar problems with 'prophet' rather than 'church'. It does seem that there's an indexing bug somewhere with 2 Kings 8-19. If I fix it on my computer, that's fine for me, but it presumably means that hundreds of thousands of Logos users have a buggy index. I think we need to be confident that the bug has gone before we just fix it on individual machines.
0 -
Mark Barnes said:
Except Dave and Dominick have very similar problems with 'prophet' rather than 'church'
At first, I thought yours might be similar, but the "prophets" problem is an issue with hover content that was supposed to be fixed in B2.
I get significant (non-zero) results for both prophets NOTEQUALS <bible ~ 2 Kings 8> and church NOTEQUALS <bible ~ 2 Kings 8>.
Dave
0 -
Mark Barnes said:
This is very weird. If I do a search for a Bible reference like <bible ~ 2 Kings 8>, I get search results as if I'd been searching for church! The problem only occurs between 2 Kings 8 and 2 Kings 19. Any other Bible range seems to work fine!!
It appears that what is happening is that one of the individual references to which <bible ~ 2 Kings 8> expands (e.g., 2 Ki 8:1, 2 Ki 8:2, 2 Ki 8:1-2, 2 Ki 8:3, 2 Ki 8:1-3, 2 Ki 8:2-3, ...) is incorrectly mapped to the actual hits for the word "church", causing those results to appear in a search for Bible references. Unfortunately, there are potentially hundreds of possible references in that range, and even if we found the right one (which would be very time-consuming), there's not much it would actually tell us.
It's hard to say exactly how this could have happened, and transferring your entire 4+ GB index here for analysis is probably impractical. This and the double-counted "the" results in the HCSB OT Quote * search are almost certainly caused by the same underlying problem, though, and since you rebuilt your library index recently, it appears to be a bug that can be reproduced with some frequency. We'll examine the code for potential flaws, but I can't give a good workaround apart from rebuilding the entire index.
Is it possible that your index has been merged since you last rebuilt it, or do you know for sure that you rebuilt it from scratch and it has never been merged since then? (I'd like to know if we can rule out the index merging code, or if we need to examine that for problems, too.)
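To give a feel for how many references one chapter can expand to, here is a minimal sketch that enumerates every single verse and every contiguous verse range inside 2 Kings 8. It assumes the chapter has 29 verses and considers only ranges within that chapter; the function name `expand_chapter` and the output format are hypothetical, and the real expansion in Logos is not described in this thread and may well include other kinds of references.

```python
from itertools import combinations_with_replacement


def expand_chapter(book: str, chapter: int, verse_count: int) -> list[str]:
    """List every single verse and contiguous verse range inside one chapter."""
    refs = []
    verses = range(1, verse_count + 1)
    # Each (start, end) pair with start <= end is one candidate reference.
    for start, end in combinations_with_replacement(verses, 2):
        if start == end:
            refs.append(f"{book} {chapter}:{start}")
        else:
            refs.append(f"{book} {chapter}:{start}-{end}")
    return refs


refs = expand_chapter("2 Ki", 8, 29)
print(len(refs))  # 29 * 30 / 2 = 435 candidate references
print(refs[:3])   # ['2 Ki 8:1', '2 Ki 8:1-2', '2 Ki 8:1-3']
```

Even under these narrow assumptions a single chapter yields 435 candidates, which is why checking them one by one would be slow.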
0 -
Bradley Grainger said:
It appears that what is happening is that one of the individual references to which <bible ~ 2 Kings 8> expands (e.g., 2 Ki 8:1, 2 Ki 8:2, 2 Ki 8:1-2, 2 Ki 8:3, 2 Ki 8:1-3, 2 Ki 8:2-3, ...) is incorrectly mapped to the actual hits for the word "church", causing those results to appear in a search for Bible references. Unfortunately, there are potentially hundreds of possible references in that range, and even if we found the right one (which would be very time-consuming), there's not much it would actually tell us.
Given that it happens with <bible ~ 2 Kings 8> through <bible ~ 2 Kings 19>, but doesn't happen outside that range, doesn't that suggest that the problematic reference must also span exactly those chapters? If so, wouldn't this search, run in a library not affected by the bug, narrow it down very significantly: (<bible ~ 2 Kings 8> ANDEQUALS <bible ~ 2 Kings 19>) NOTEQUALS (<bible ~ 2 Kings 20>, <bible ~ 2 Kings 7>)?
Bradley Grainger said:
It's hard to say exactly how this could have happened, and transferring your entire 4+ GB index here for analysis is probably impractical.
I'm happy to do that if you wish. I can upload at 1 Mbit/s and have plenty of FTP space I can use, so I could run it overnight.
Bradley Grainger said:
since you rebuilt your library index recently, it appears to be a bug that can be reproduced with some frequency
Actually, with the HCSB bug, I realised from my earlier screenshot that I only rebuilt my Bible index, so I can't guarantee I haven't merged, sorry.
0 -
Mark Barnes said:
If so, wouldn't this search, run in a library not affected by the bug, narrow it down very significantly: (<bible ~ 2 Kings 8> ANDEQUALS <bible ~ 2 Kings 19>) NOTEQUALS (<bible ~ 2 Kings 20>, <bible ~ 2 Kings 7>)?
That's interesting. I just ran that search on my other 4.1 account (which doesn't have the bug), and it didn't return any potential hits. That doesn't make sense to me, if your diagnosis is accurate.
0 -
Bradley Grainger said:
It appears that what is happening is that one of the individual references to which <bible ~ 2 Kings 8> expands (e.g., 2 Ki 8:1, 2 Ki 8:2, 2 Ki 8:1-2, 2 Ki 8:3, 2 Ki 8:1-3, 2 Ki 8:2-3, ...) is incorrectly mapped to the actual hits for the word "church", causing those results to appear in a search for Bible references.
Further, <bible ~ 2 Kings 8:1-29> returns results correctly, though <bible ~ 2 Kings 8> does not. Likewise right through to 2 Kings 19.
0 -
In running some diagnostics for http://community.logos.com/forums/t/26594.aspx, I found that searching my own index for <= Ps 3:title-8> returns hits for "intertestamental". Unfortunately, I now have to rewrite my answer for that post, but I have now reproduced this bug locally. :)
0 -
Bradley Grainger said:
I found that searching my own index for <= Ps 3:title-8>
Bradley, please explain the syntax, especially the -8 part!
Dave
0 -
Dave Hooton said:
Bradley Grainger said:
I found that searching my own index for <= Ps 3:title-8>
Bradley, please explain the syntax, especially the -8 part!
That's probably Psalm 3:title through verse 8. Similar to Psalm 3:1-8, but starting with the title instead of verse 1.
MacBook Pro (2019), ThinkPad E540
0 -
Dave Hooton said:
We introduced "Psalm 3:title" (and similar) in Logos 4 to allow English Bibles to address psalm superscription verses in a uniform way. While strange looking (especially in a range), this notation lets you explicitly address the verse, e.g., in Text Comparison.
0 -
I've found the reason that this bug occurs, and it will be fixed in the next beta. (Specifically, if a term is removed during index merging, that term will be preserved in the merged index, but incorrectly mapped onto the initial term in the index. Searches for that term will then return the hits for the index's first term (which is "intertestamental" for me, and "church" for you).) On my system, 149 terms are affected; it's likely that the same number could be affected for you. Data type reference terms are just as likely to be affected as textual terms; however, an "OR" query that comprises thousands of data type reference terms is easier to generate than a textual query with thousands of terms, so that's why it's easier to discover the problem with Bible references.
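The failure mode described above can be illustrated with a toy index. This is a sketch under stated assumptions, not the actual Logos index format or merge code: the term dictionary is a plain dict from term to a slot number, slot 0 holds the first term's hits, and the names `merge_with_bug`, `search`, and the `ref:` term spelling are all hypothetical.

```python
def merge_with_bug(old_terms, current_postings):
    """Toy merge that keeps stale terms and points them at slot 0.

    old_terms        -- terms present in the index before the merge
    current_postings -- dict of term -> hits found in the library now
    """
    table = []    # postings lists; slot 0 belongs to the index's first term
    slot_of = {}  # term dictionary: term -> slot in `table`
    for term in sorted(current_postings):
        slot_of[term] = len(table)
        table.append(current_postings[term])
    for term in old_terms:
        if term not in slot_of:
            # The bug being illustrated: the term no longer occurs in the
            # library, yet it stays in the dictionary and lands on slot 0.
            slot_of[term] = 0
    return slot_of, table


def search(index, *terms):
    """OR query: union of the hits of every term found in the dictionary."""
    slot_of, table = index
    hits = set()
    for term in terms:
        if term in slot_of:
            hits.update(table[slot_of[term]])
    return sorted(hits)


current = {
    "church": ["Resource A", "Resource B"],
    "king": ["Resource C"],
    "ref:2Ki8:1": ["Resource D"],
}
old_terms = ["church", "king", "ref:2Ki8:1", "ref:2Ki8:1-2"]  # last one is stale
index = merge_with_bug(old_terms, current)

print(search(index, "ref:2Ki8:1-2"))                # the hits for "church"
print(search(index, "ref:2Ki8:1", "ref:2Ki8:1-2"))  # genuine hit plus "church" hits
```

In this model a large OR query only needs to include one stale term to pull in every hit for the first term, which matches the observation that Bible reference searches expose the problem more readily than single-word searches.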
0 -
Bradley Grainger said:
I've found the reason that this bug occurs, and it will be fixed in the next beta.
Well done!
Bradley Grainger said:
Specifically, if a term is removed during index merging
Please explain. If it is removed how can it be used later?
Dave
0 -
Very good. Thanks. Can you detect whether an index has been merged, and will you force a re-index for those indexes?
0 -
Dave Hooton said:
Bradley Grainger said:
Specifically, if a term is removed during index merging
Please explain. If it is removed how can it be used later?
The term was "removed" in the sense that it no longer occurs in the input data. (E.g., a book is updated, and a typo is removed; that word no longer occurs in your library.) However, it's not removed from the index file itself, so it's still possible to search for it and get incorrect results.
0 -
Mark Barnes said:
Can you detect whether an index has been merged, and will you force a re-index for those indexes?
We can detect if an index has been merged, but we won't force re-indexing because we don't need to. We can detect this type of term at runtime and ignore it.
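One way such a runtime check could work is sketched below. The thread does not say how Logos actually detects these terms, so everything here is an assumption: the toy index records which term owns each postings slot, and a term is treated as stale when it points at a slot owned by a different term. The names `Postings`, `is_stale`, `safe_search`, and the `ref:` term spelling are hypothetical.

```python
from typing import NamedTuple


class Postings(NamedTuple):
    owner: str       # the term this postings list was built for
    hits: list[str]  # resources containing that term


# A toy merged index in which a stale reference term points at slot 0.
table = [
    Postings("church", ["Resource A", "Resource B"]),
    Postings("ref:2Ki8:1", ["Resource D"]),
]
slot_of = {
    "church": 0,
    "ref:2Ki8:1": 1,
    "ref:2Ki8:1-2": 0,  # stale: slot 0 is owned by "church"
}


def is_stale(term: str) -> bool:
    """A term is stale if its slot was built for a different term."""
    return table[slot_of[term]].owner != term


def safe_search(term: str) -> list[str]:
    """Look up a term, ignoring unknown and stale dictionary entries."""
    if term not in slot_of or is_stale(term):
        return []
    return list(table[slot_of[term]].hits)


print(safe_search("ref:2Ki8:1-2"))  # [] instead of the "church" hits
print(safe_search("church"))        # ['Resource A', 'Resource B']
```

Filtering at query time like this leaves the index file untouched, which is consistent with not needing to force a re-index.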
0 -
Bradley Grainger said:
We can detect if an index has been merged, but we won't force re-indexing because we don't need to. We can detect this type of term at runtime and ignore it.
Does this mean that a fix to the Index Merge is not needed? Has the "runtime" (or any) fix been incorporated in RC1?
Dave
0 -
Dave Hooton said:
Bradley Grainger said:
We can detect if an index has been merged, but we won't force re-indexing because we don't need to. We can detect this type of term at runtime and ignore it.
Does this mean that a fix to the Index Merge is not needed? Has the "runtime" (or any) fix been incorporated in RC1?
We are still investigating whether a bigger fix is needed in index merging, but RC1 will detect whether this situation has been created in the past, and ignore the terms that were showing incorrect results. (I.e., Mark should be able to run the 2 Kings 8 search and not see any results for "church" with 4.2 RC 1.)
0 -
Bradley Grainger said:
We are still investigating whether a bigger fix is needed in index merging, but RC1 will detect whether this situation has been created in the past, and ignore the terms that were showing incorrect results.
Thanks, Bradley.
Dave
0 -
Bradley Grainger said:
Mark should be able to run the 2 Kings 8 search and not see any results for "church" with 4.2 RC 1.
Quite true. Thanks!