Changes for searching in 4.2a Beta 2

4.2a introduces a new index file format that is designed to speed up certain “AND” and phrase searches.
IMPORTANT NOTE: Beta 1 can create the new index, but there is no speed improvement when searching it. This is likely to be due to a bug in Beta 1 that we will fix in Beta 2.
Until further notice, the information below does NOT apply to 4.2a Beta 1.
All new indexes will be built in this format by default, but installing Beta 1 will not automatically reindex your entire library. To test the new index format, run “rebuild indexes” (and wait a few hours/overnight depending on the size of your library).
What’s changed?
- Searches that require both a frequently-occurring word and an infrequently-occurring word will execute more quickly (e.g., http://community.logos.com/forums/p/17900/197369.aspx#197369)
- Searches in a single book are faster (e.g., search result hits should appear very quickly when opening a result from the Search panel; this will also improve visual filters).
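The post does not describe how the new index file format works internally. As a rough illustration of why an "AND" of a frequent word and an infrequent word can be made cheap, here is a minimal sketch of a common technique: intersect two sorted posting lists by walking the shorter list and binary-searching the longer one. The function name, the document IDs and the data are all hypothetical, and this is not a claim about what Logos actually does.

```python
from bisect import bisect_left

def intersect(rare, common):
    """Return the IDs present in both sorted lists, driven by the shorter list."""
    if len(rare) > len(common):
        rare, common = common, rare
    hits = []
    lo = 0
    for doc_id in rare:
        # Jump ahead in the long list instead of scanning every entry.
        lo = bisect_left(common, doc_id, lo)
        if lo == len(common):
            break
        if common[lo] == doc_id:
            hits.append(doc_id)
    return hits

# Hypothetical data: "the" occurs almost everywhere, "evildoers" is rare.
the_docs = list(range(1_000_000))
evildoers_docs = [42, 9_001, 777_777]
print(intersect(evildoers_docs, the_docs))
```

With this approach the cost depends mostly on the length of the rare word's list, which would fit the observation that single-word and OR searches see no change.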
What hasn’t changed?
- Searches for a single word (e.g., “the” or “God”) across the entire library
- Searches for “*”
- Searches that use OR or the comma operator
What’s still coming (possibly in the 4.2a beta cycle, possibly later)?
- There’s an inefficiency in phrase searches that makes them run slower than we would like.
- We may be able to make improvements to both indexing and index merging times.
- Expanding/collapsing the “By Book” search results hasn’t been updated to use the new search engine yet.
Comments
-
Thank you for that good news, Bradley. I'm still holding out hope that you guys will optimize "*" searching so we can use it to find all our highlighting.
And sort of unrelated but still having to do with Search: it would be awesome if we could expand and collapse books in the "By Book" view without it having to run the entire search over again.
0 -
Rosie Perera said:
it would be awesome if we could expand and collapse books in the "By Book" view without it having to run the entire search over again.
This is under consideration (we count it as part of the "By Book" note under "What's still coming?").
0 -
Bradley Grainger said:
“By Book” search results hasn’t been updated to use the new search engine yet.
So we have to use Ranked to get the improvements?
Dave
===Windows 11 & Android 13
0 -
Dave Hooton said:
Bradley Grainger said:
“By Book” search results hasn’t been updated to use the new search engine yet.
So we have to use Ranked to get the improvements?
Sorry, that was confusing. "By Book" will use the new search engine, but collapse/expand is not improved. I'll edit my original post.
0 -
Bradley Grainger said:
Searches in a single book are faster (e.g., search result hits should appear very quickly when opening a result from the Search panel; this will also improve visual filters).
Another ambiguity:-
- the search of a single book vs. multiple books, is faster? AND/OR
- displaying the hit highlights in any resource is faster.
Dave
===Windows 11 & Android 13
0 -
Dave Hooton said:
- the search of a single book vs. multiple books, is faster? AND/OR
- displaying the hit highlights in any resource is faster.
Those are both true.
0 -
Bradley Grainger said:
Searches that require both a frequently-occurring word and an infrequently-occurring word will execute more quickly (e.g., http://community.logos.com/forums/p/17900/197369.aspx#197369)
Rebuilt the indexes but the search for dogs and evildoers takes the same 7s as it did in 4.2!
Dave
===Windows 11 & Android 13
0 -
Bradley Grainger said:
Dave Hooton said:
- the search of a single book vs. multiple books, is faster? AND/OR
- displaying the hit highlights in any resource is faster.
Those are both true.
In the case of the HCSB wildcard search for OT Quotations, the hit highlights took about 5 minutes to show in the resource vs. 10 minutes on 4.2, but the Search itself was not faster.
Dave
===Windows 11 & Android 13
0 -
Dave Hooton said:
Bradley Grainger said:
Searches that require both a frequently-occurring word and an infrequently-occurring word will execute more quickly (e.g., http://community.logos.com/forums/p/17900/197369.aspx#197369)
Rebuilt the indexes but the search for dogs and evildoers takes the same 7s as it did in 4.2!
Oh, whine whine. [;)] Some of us would love to have 7s searches!! (You know I'm just tweaking you. I wish you could have even faster searches, too.)
0 -
Dave Hooton said:
Bradley Grainger said:
Searches that require both a frequently-occurring word and an infrequently-occurring word will execute more quickly (e.g., http://community.logos.com/forums/p/17900/197369.aspx#197369)
Rebuilt the indexes but the search for dogs and evildoers takes the same 7s as it did in 4.2!
Hmm... something's wrong. I've just run some test searches at home, and (as you have observed) there is no speed increase. The new index file format is getting written to disk, but the search optimisations don't seem to be getting applied. For now, we'll have to assume that this improvement is not in Beta 1, but will be coming in Beta 2. (As far as I can tell so far, there is no harm in "rebuild indexes" right now; Beta 2 should be able to read the search index file format that Beta 1 has created and search it more quickly. I'll post updates as they are available.) I will update the original post.
0 -
Bradley Grainger said:
Hmm... something's wrong.
Unfortunately, this is a bug that slipped into Beta 1 at the last minute. The good news is that the indexes created by Beta 1 (if you have reindexed) are fine, and there will be no need to rebuild them for Beta 2. The other good news is that this bug is already fixed for Beta 2. The bad news is that Beta 2 is not available yet, so this new index feature can't be tested.
And just to taunt you [;)], here are some benchmarks for how searching should improve in the next beta:
- "look out for the dogs, look out for the evildoers" -- Old: 9.7s -- New: 0.5s
- "prince of the power of the air" -- Old: 13.1s -- New: 1.4s
- "from age to age the same" -- Old: 7.1s -- New: 3.0s
- "the god of this age" -- Old: 8.8s -- New: 5.4s
- "the order of melchizedek" -- Old: 6.7s -- New: 0.5s
(The phrase searches that still take multiple seconds with 4.2a Beta 2 are affected by the "inefficiency in phrase searches" (mentioned in the original post) that we hope to address in a future beta.)
0 -
Bradley Grainger said:
And just to taunt you, here are some benchmarks for how searching should improve in the next beta:
- "look out for the dogs, look out for the evildoers" -- Old: 9.7s -- New: 0.5s
- "prince of the power of the air" -- Old: 13.1s -- New: 1.4s
- "from age to age the same" -- Old: 7.1s -- New: 3.0s
- "the god of this age" -- Old: 8.8s -- New: 5.4s
- "the order of melchizedek" -- Old: 6.7s -- New: 0.5s
Yowza!
Robert Pavich
For help go to the Wiki: http://wiki.logos.com/Table_of_Contents__
0 -
Holy moly....
I don't know what's happening but even while indexing this is so fast it boggles my mind.
I just did a BWS and the results were instant...and I mean INSTANT...no "appearing"...but PLUNK!
Robert Pavich
For help go to the Wiki: http://wiki.logos.com/Table_of_Contents__
0 -
Bradley Grainger said:
And just to taunt you, here are some benchmarks for how searching should improve in the next beta:
- "look out for the dogs, look out for the evildoers" -- Old: 9.7s -- New: 0.5s
- "prince of the power of the air" -- Old: 13.1s -- New: 1.4s
- "from age to age the same" -- Old: 7.1s -- New: 3.0s
- "the god of this age" -- Old: 8.8s -- New: 5.4s
- "the order of melchizedek" -- Old: 6.7s -- New: 0.5s
So we can do our own benchmarking on these, were these Basic searches of All Text in Entire Library?
0 -
Dominick Sela said:
So we can do our own benchmarking on these, were these Basic searches of All Text in Entire Library?
The first was the Daddy of the phrases and that was as you state. I assume the others are the same.
Dave
===Windows 11 & Android 13
0 -
Bradley Grainger said:
And just to taunt you, here are some benchmarks for how searching should improve in the next beta:
My times (All Text in Entire Library) with 972 resources in B1 with rebuilt index:-
- "look out for the dogs, look out for the evildoers" -- Old: 7s
- "prince of the power of the air" -- Old: 9.5s
- "from age to age the same" -- Old: 5.1s
- "the god of this age" -- Old: 6.5s
- "the order of melchizedek" -- Old: 4.8s
Note: I ran the searches ONCE in above order. When I re-ran there was only 0.1s difference (quad core!).
Dave
===Windows 11 & Android 13
0 -
Dominick Sela said:
So we can do our own benchmarking on these, were these Basic searches of All Text in Entire Library?
Yes.
0 -
W00t!
Bradley Grainger said:
And just to taunt you, here are some benchmarks for how searching should improve in the next beta:
Sarcasm is my love language. Obviously I love you.
0 -
Beta 2. The basic search for "the dogs" phrase went from 16 seconds to 1 second. Impressive!
0 -
My times with 974 resources (4.2 B1 vs B2):-
- "look out for the dogs, look out for the evildoers" -- Old: 7s New: 0.44s
- "prince of the power of the air" -- Old: 9.5s New: 0.9s
- "from age to age the same" -- Old: 5.1s New: 1.8s
- "the god of this age" -- Old: 6.5s New: 3.1s
- "the order of melchizedek" -- Old: 4.8s New: 0.48s
Dave
===Windows 11 & Android 13
0 -
12 seconds down to 0.59 seconds for the Dog search!
Robert Pavich
For help go to the Wiki: http://wiki.logos.com/Table_of_Contents__
0 -
where's that Jealous icon when you need it - trying not to covet the development machine - still impressive:
on 3595 resources
"look out for the dogs, look out for the evildoers" dropped from 47secs to 7.3
"prince of the power of the air" down from 96 secs to 24.15
Never Deprive Anyone of Hope.. It Might Be ALL They Have
0 -
Rosie Perera said:
Oh, whine whine. Some of us would love to have 7s searches!! (You know I'm just tweaking you. I wish you could have even faster searches, too.)
I know the indexing might take a day or two but even your machine should have 7s with Beta 2[:D]
Dave
===Windows 11 & Android 13
0 -
Hmm…my figures are a bit odd: some good increases (though nothing as spectacular as Dave), but two searches are barely affected.
- "look out for the dogs, look out for the evildoers" -- Old: 29.79 -- New: 9.15 -- 225% increase
- "prince of the power of the air" -- Old: 37.99 -- New: 11.39 -- 233% increase
- "from age to age the same" -- Old: 21.79 -- New: 21.93 -- 1% decrease
- "the god of this age" -- Old: 29.92 -- New: 27.37 -- 9% increase
- "the order of melchizedek" -- Old: 18.94 -- New: 3.25 -- 483% increase
This is my personal Faithlife account. On 1 March 2022, I started working for Faithlife, and have a new 'official' user account. Posts on this account shouldn't be taken as official Faithlife views!
0 -
What are your machine specs?
Dave
===Windows 11 & Android 13
0 -
In the stable version the dog phrase took 2.88 sec in new resources and 28.51 in the entire library. This was without quotes. With quotes the scores are 0.63 in new resources and 16.71 in the entire library.
In 4.2a Beta 2, no quotes gave me 8.76 and with quotes 4.56. I have 2,347 resources in my library.
Mission: To serve God as He desires.
0 -
Lynden Williams said:
This was without quotes
Without quotes it is not a phrase as it just finds the words in any order.
Lynden Williams said:
With quotes the scores are 0.63 in new resources and 16.71 in the entire library
If the search was By Book the time is 16.71!
Lynden Williams said:
with quotes 4.56
A noticeable improvement.
Dave
===Windows 11 & Android 13
0 -
Dave Hooton said:
What are your machine specs?
This is a Core 2 Duo E8300 with 6Gb RAM and a 15,000rpm HDD. My library is a shade over 4,000 resources. I tried a re-index overnight, but my figures this morning are similar.
This is my personal Faithlife account. On 1 March 2022, I started working for Faithlife, and have a new 'official' user account. Posts on this account shouldn't be taken as official Faithlife views!
0 -
Mark Barnes said:
Dave Hooton said:
What are your machine specs?
This is a Core 2 Duo E8300 with 6Gb RAM and a 15,000rpm HDD. My library is a shade over 4,000 resources. I tried a re-index overnight, but my figures this morning are similar.
My times on the laptop for v.4.2 vs. v4.2a are:-
- "look out for the dogs, look out for the evildoers" -- Old: 15.2s -- New: 0.9s
- "prince of the power of the air" -- Old: 20.6s -- New: 2s
- "from age to age the same" -- Old: 12.4s -- New: 3.6s
- "the god of this age" -- Old: 20.2s -- New: 5.7s
- "the order of melchizedek" -- Old: 14.7s -- New: 0.9s
Times on 4.2 can be about 1.5s faster. Times on 4.2a are within 0.2s and about 2x slower than the quad-core desktop!
Laptop is Core 2 Duo T7200, 2 GB RAM, 5400rpm HDD! I'm struggling to explain the relative lack of improvement on your machine.
Dave
===Windows 11 & Android 13
0 -
Mark Barnes said:
Hmm…my figures are a bit odd: some good increases (though nothing as spectacular as Dave), but two searches are barely affected.
- "look out for the dogs, look out for the evildoers" -- Old: 29.79 -- New: 9.15 -- 225% increase
- "prince of the power of the air" -- Old: 37.99 -- New: 11.39 -- 233% increase
- "from age to age the same" -- Old: 21.79 -- New: 21.93 -- 1% decrease
- "the god of this age" -- Old: 29.92 -- New: 27.37 -- 9% increase
- "the order of melchizedek" -- Old: 18.94 -- New: 3.25 -- 483% increase
I have similar times. The only one I checked before the upgrade was the "dogs/evildoers" quote and my before and after times were similar. My current times for the others are similar to yours, e.g., "the god of this age" is 33.06 secs, on my dual core laptop.
The ones with the small speed increase are ones with mostly (all?) common words. Based on previous comments from Bradley, I believe that the search is doing an "AND" on all the words in a phrase as a first step (i.e., treating it not like a phrase), and then checking the results for word order. The index optimization only helps the first step, so searches like "the god of this age" will have a lot of hits when not considered as a phrase, and then searching those preliminary results for exact phrase order is the time-consuming part.
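To make that two-step idea concrete, here is a minimal sketch, assuming a simple in-memory positional index. The names `build_index` and `phrase_search` and the sample texts are hypothetical, and this is only my guess at the general shape, not the actual Logos implementation.

```python
from collections import defaultdict

def build_index(docs):
    """Map word -> doc_id -> list of word positions (a toy positional index)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index

def phrase_search(index, phrase):
    words = phrase.lower().split()
    if not words:
        return []
    # Step 1: "AND" of all the words, ignoring order (the part an index can speed up).
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))
    # Step 2: check word order in every candidate (slow when step 1 leaves many hits).
    hits = []
    for doc_id in sorted(candidates):
        positions = [set(index[word][doc_id]) for word in words]
        if any(all(start + i in positions[i] for i in range(len(words)))
               for start in positions[0]):
            hits.append(doc_id)
    return hits

# Hypothetical sample texts.
docs = {
    1: "the god of this age",
    2: "this age of the god",
    3: "the order of melchizedek",
}
index = build_index(docs)
print(phrase_search(index, "the god of this age"))  # prints [1]
```

Document 2 survives step 1 because it contains all five words, and is only rejected in step 2. With a phrase made entirely of common words, nearly every resource would survive step 1, so step 2 dominates the time.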
MacBook Pro (2019), ThinkPad E540
0 -
Bradley Grainger said:
4.2a introduces a new index file format that is designed to speed up certain “AND” and phrase searches.
Clarification: does this mean that in order to benefit from said change I need to rebuild my index?
Sarcasm is my love language. Obviously I love you.
0 -
Thomas Black said:
Clarification: does this mean that in order to benefit from said change I need to rebuild my index?
Yes
Bradley Grainger said:
All new indexes will be built in this format by default, but installing Beta 1 will not automatically reindex your entire library. To test the new index format, run “rebuild indexes” (and wait a few hours/overnight depending on the size of your library).
Prov. 15:23
0 -
Good grief
all I had to do was read the next sentence?
Oy, I need more coffee. [c]
Sarcasm is my love language. Obviously I love you.
0 -
Mark Barnes said:
Hmm…my figures are a bit odd: some good increases (though nothing as spectacular as Dave), but two searches are barely affected.
- "look out for the dogs, look out for the evildoers" -- Old: 29.79 -- New: 9.15 -- 225% increase
- "prince of the power of the air" -- Old: 37.99 -- New: 11.39 -- 233% increase
- "from age to age the same" -- Old: 21.79 -- New: 21.93 -- 1% decrease
- "the god of this age" -- Old: 29.92 -- New: 27.37 -- 9% increase
- "the order of melchizedek" -- Old: 18.94 -- New: 3.25 -- 483% increase
I had not tested any but the first of these prior to 4.2a B2, but I got an 86% increase for that one:
- "look out for the dogs, look out for the evildoers" -- 4.2 RC2 (= Gold): 234.07 -- 4.2a B2: 31.58 -- 85.5% faster
A search for "the order of melchizedek" isn't much faster (26.37 sec). But this is with my machine pretty overloaded (35 tabs on my Taskbar!), memory usage at 73%, and everything running sluggish. I haven't rebooted in several days. I need to shut everything down, reboot and try it fresh.
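A note for anyone comparing percentages across posts: the "% increase" figures quoted from Mark appear to be a speed increase (old time divided by new time, minus one), while a "% faster" figure like the one above reads more naturally as the share of time saved. The two are not interchangeable. A small sketch of both, with function names that are only my own labels:

```python
def speed_increase_pct(old_s, new_s):
    """How much faster the search runs: 100% means twice as fast."""
    return (old_s / new_s - 1) * 100

def time_saved_pct(old_s, new_s):
    """Share of the original wait that disappeared: can never exceed 100%."""
    return (1 - new_s / old_s) * 100

# Timings in seconds, taken from the posts above.
timings = [
    ("the dogs (Mark)", 29.79, 9.15),
    ("melchizedek (Mark)", 18.94, 3.25),
    ("the dogs (this post)", 234.07, 31.58),
]
for label, old_s, new_s in timings:
    print(f"{label}: {speed_increase_pct(old_s, new_s):.1f}% speed increase, "
          f"{time_saved_pct(old_s, new_s):.1f}% time saved")
```

By the second formula, 234.07 down to 31.58 works out to roughly 86.5% of the time saved, and by the first it is a speed increase of over 600%.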
0 -
I think there's a bug been introduced. I just searched for the BEFORE 1 WORD god BEFORE 1 WORD of BEFORE 1 WORD this BEFORE 1 WORD age, just as a test. It managed to complete the my content search in 4,224.73s (that's one hour 10 minutes). But the rest of the search is still going - a day and a half later! One core of my CPU has been at 100% ever since (at least I know my cooling's working, right?).
This is my personal Faithlife account. On 1 March 2022, I started working for Faithlife, and have a new 'official' user account. Posts on this account shouldn't be taken as official Faithlife views!
0 -
Mark Barnes said:
I think there's a bug been introduced. I just searched for the BEFORE 1 WORD god BEFORE 1 WORD of BEFORE 1 WORD this BEFORE 1 WORD age, just as a test. It managed to complete the my content search in 4,224.73s (that's one hour 10 minutes). But the rest of the search is still going - a day and a half later! One core of my CPU has been at 100% ever since (at least I know my cooling's working, right?).
Had you ever tried this search before, so that you know for sure it's been "introduced" rather than having been there all along?
0 -
Rosie Perera said:
Had you ever tried this search before, so that you know for sure it's been "introduced" rather than having been there all along?
Not this exact search, but experimented with similar searches when there were some problems with phrase searching in Hebrew Bibles. So I'm not sure, but I think it's a new bug. I'll go and try it in my 4.1 installation.
This is my personal Faithlife account. On 1 March 2022, I started working for Faithlife, and have a new 'official' user account. Posts on this account shouldn't be taken as official Faithlife views!
0 -
Mark Barnes said:
I think there's a bug been introduced.
It's an inherent weakness. It will search for the BEFORE 1 WORD god quite happily but bogs down as you introduce more terms. So the BEFORE 1 WORD god BEFORE 1 word of takes a long time in 4.2 and 4.2a (I gave up after 5 minutes).
trust BEFORE 1 word in BEFORE 1 WORD him takes about 2s presumably because it has less common words. OTOH trust BEFORE in BEFORE 1 WORD him produces some bizarre results; and I still await the verdict on the "correctness" of the syntax.
Dave
===Windows 11 & Android 13
0 -
Dave Hooton said:
trust BEFORE 1 word in BEFORE 1 WORD him takes about 2s presumably because it has less common words.
Stop leading me into temptation brother [:D] 2 secs IF only...
it takes mine 72+,
Never Deprive Anyone of Hope.. It Might Be ALL They Have
0 -
Rest easy! My result covers 354 resources which is roughly 1/3 of my library.
Dave
===Windows 11 & Android 13
0