Another plea for optimising the Parallel Resource menu

I've been running some tests on the parallel resource menu, as I'm really struggling with its efficiency. I rely on it heavily in my workflow, and it's a real bottleneck (rather than opening commentaries/dictionaries from a PG or BWS and having dozens of tabs, I like to select one at a time from the PR menu).

I decided to run some benchmarks with Process Monitor running. I know very well that gives me only the smallest insight into what Logos is actually doing, and lots of my guesses will be quite wrong. I'm also certain that some things that seem daft to end-users have perfectly logical and sensible explanations.

Nevertheless, there seems to be a huge amount of inefficiency in the parallel resource lookup, given that there are indexes to all the headwords.

This is for creating a PR menu from Enuma Elish in AYBD, but the choice of article wouldn't really matter. (I've run this test several times before benchmarking, so this is the best-case scenario - my Windows cache is well and truly primed.)

First, Logos searches through each of my collections which have "Show in Parallel Resources" to see whether AYBD appears in any of them. Time taken is 2.6691 seconds.
Second, Logos queries catalog.lxn, which seems to be for word-stemming (takes about 0.2 seconds).
Third, Logos gets a list of all resources with the same type as the one we're looking up the PRS for (I'm presuming this). This proved not to be true (see inefficiency 3 below).
Fourth, for every resource with this type
- Logos accesses the resource and reads several thousand bytes of data. Time taken = 0.005s.
- Logos queries the index (these are the .mstidx files). Time taken = 0.005s.
Fifth, if the resource is found in the index, the resource is accessed again = 0.004s
Sixth, the Library Catalog is accessed, probably to get cover images and book names. This takes 0.6 seconds.
Seventh, a series of apparently random books are all accessed. In this test, it included several volumes of Holman's commentaries, the Logos Hymnal and 1000 illustrations (see inefficiency 3 below).

Likely Inefficiency #1: During stage 1, nested queries aren't cached, so several queries run multiple times. In my case that means 14 unnecessary queries taking a total time of 1.357s (50.8% of the 2.6691s spent on stage 1 is wasted).
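
A minimal sketch of the caching idea, not Logos's actual code: the collection names, their contents and the `members` helper are all hypothetical. It assumes a "nested query" means one collection including another, and shows how memoising each collection's result stops the same query running more than once per menu build.

```python
from functools import lru_cache

# Hypothetical collection definitions: each collection lists its own
# resources plus the names of other collections it includes.
COLLECTIONS = {
    "Dictionaries": {"resources": {"AYBD", "ISBE"}, "includes": ()},
    "Reference": {"resources": set(), "includes": ("Dictionaries",)},
    "Study": {"resources": {"NICNT"}, "includes": ("Dictionaries", "Reference")},
}

evaluations = []  # one entry per collection query that actually runs


@lru_cache(maxsize=None)
def members(name):
    """Resolve a collection once; repeat requests are served from the cache."""
    evaluations.append(name)
    spec = COLLECTIONS[name]
    resolved = set(spec["resources"])
    for child in spec["includes"]:
        resolved |= members(child)
    return frozenset(resolved)


def collections_containing(resource_id):
    """Names of the collections whose resolved membership includes the resource."""
    return sorted(name for name in COLLECTIONS if resource_id in members(name))


print(collections_containing("AYBD"))  # ['Dictionaries', 'Reference', 'Study']
print(len(evaluations))                # 3
```

With the decorator, each of the three collections is evaluated once. Without it, the same call would evaluate seven times, because "Dictionaries" is re-resolved for every collection that includes it.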

Likely Inefficiency #2: Most of the resources don't contain a reference to be looked up. Yet despite this the resource is always accessed, even before the index is checked. Why? 0.005s doesn't seem like very much, but if you have 1,300 commentaries, as I do, that's 6.5 seconds seemingly wasted.
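
A rough sketch of checking the index first, under the assumption that an index probe can answer "does this resource have the headword?" without touching the resource itself. The `INDEXES` data and both function names are made up for illustration; they stand in for the .mstidx files and whatever Logos really does when it opens a book.

```python
# Hypothetical per-resource headword indexes (standing in for .mstidx files).
INDEXES = {
    "AYBD": {"enuma elish", "gilgamesh"},
    "ISBE": {"gilgamesh"},
    "NICNT-Romans": set(),
}

opened = []  # every resource we actually open


def open_resource(resource_id):
    """Stand-in for the 0.005s read of the resource file."""
    opened.append(resource_id)
    return resource_id


def parallel_resources(headword):
    """Probe each index first; open a resource only when its index has a hit."""
    key = headword.lower()
    return [open_resource(rid) for rid, index in INDEXES.items() if key in index]


print(parallel_resources("Enuma Elish"))  # ['AYBD']
print(opened)                             # ['AYBD']
```

Only the one matching resource is opened; the other two cost an index probe and nothing more. On the figures above, skipping the up-front read for 1,300 commentaries would save up to 1,300 x 0.005s = 6.5s.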

Likely Inefficiency #3: During stage 7, dozens of resources are looked up, even though they'll never appear in the PRS. They contain a headword index, but they're of the wrong type (monographs or commentaries) so they'll never appear in the list. If they won't show up, why bother searching them?

Likely Inefficiency #4: None of this is cached, so if I do a lookup on the same headword from a different resource a few seconds later the whole process will run again. But surely there's no need. If the headword is the same, the results will be the same. Likewise if I do a lookup on a different headword from the same resource, then steps 1 and 3, and part of step 6 can be cached.
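
To illustrate the kind of cache meant here, a small sketch follows. Everything in it (the `HeadwordCache` class, the `ARTICLES` data, the `invalidate` hook) is hypothetical and only assumes that results depend on the headword and not on the book the lookup started from.

```python
# Hypothetical data: which resources have an article for each headword.
ARTICLES = {
    "enuma elish": ["AYBD", "ISBE"],
    "gilgamesh": ["AYBD"],
}


class HeadwordCache:
    """Caches lookup results by headword, not by the resource you started from."""

    def __init__(self, search):
        self._search = search
        self._results = {}
        self.searches = 0  # how many times the slow search really ran

    def lookup(self, headword, from_resource):
        # from_resource is deliberately left out of the key: the same
        # headword gives the same list whichever book it was opened from.
        key = headword.casefold()
        if key not in self._results:
            self.searches += 1
            self._results[key] = list(self._search(key))
        return list(self._results[key])

    def invalidate(self):
        """Call when the library changes (new resources, re-indexing)."""
        self._results.clear()


cache = HeadwordCache(lambda key: ARTICLES.get(key, []))
first = cache.lookup("Enuma Elish", from_resource="AYBD")
second = cache.lookup("Enuma Elish", from_resource="ISBE")
print(first == second, cache.searches)  # True 1
```

The second lookup never reaches the slow search. A real cache would also need invalidating whenever resources are added or re-indexed, which is what the `invalidate` method gestures at.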

Like I said, I could be barking up the wrong tree entirely, and if so I'm sorry. But this is born out of frustration with a great feature that is frustratingly slow, and which I long to see improved.

Comments

Mark Smith

Well I'm impressed with the analysis at this point and your conclusions. I agree that there has to be unnecessary work being done, esp. when you look up the same headword in another resource. Caching would seem to be the answer for that. A second look-up should be almost instantaneous.

I would think what is happening first could also be helped through an index of one's collections being searched rather than a fresh search each time one accesses parallel resources.

I do hope Logos will find a way to make this operation much faster. Like you I use this feature frequently and often my run times are significantly longer than what you are experiencing.

Rosie Perera

https://community.logos.com/discussion/comment/175768#Comment_175768

I agree this needs to be sped up. I would use this feature more if it were faster.

Jacob Hantla

I agree with the need for more speed. I actually usually just access the parallel resource I want manually from "My Library" because of the slowness of parallel resources.

Mark Barnes

https://community.logos.com/discussion/comment/175771#Comment_175771

I've just created a database of all my headwords so I could run speed tests. If Logos created a single headword index (instead of one index per resource as currently), then a search for all resources matching the headword could be reduced to around 0.3s on my system (and 0.001s when cached). You'd still need to search the collections, get the library data from the catalog, and sort by prioritisation, but each of those database calls should be quicker than the initial search. These tests prove (I think!) that it's quite feasible to get the whole menu up in 1-2 seconds, even with a very large library.
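
For anyone wanting to try something similar, here is a minimal sketch of a single headword index. It uses SQLite purely as an assumption (the post above doesn't say which database was used), and the table name, column names and sample headwords are invented for the example.

```python
import sqlite3


def build_index(per_resource_headwords):
    """Load every resource's headwords into one indexed table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE headword_index ("
        "headword TEXT NOT NULL, resource_id TEXT NOT NULL)"
    )
    db.execute("CREATE INDEX idx_headword ON headword_index (headword)")
    db.executemany(
        "INSERT INTO headword_index VALUES (?, ?)",
        [
            (headword.lower(), resource_id)
            for resource_id, headwords in per_resource_headwords.items()
            for headword in headwords
        ],
    )
    db.commit()
    return db


def resources_for(db, headword):
    """One query against one index, instead of one query per resource."""
    rows = db.execute(
        "SELECT resource_id FROM headword_index "
        "WHERE headword = ? ORDER BY resource_id",
        (headword.lower(),),
    )
    return [row[0] for row in rows]


db = build_index({
    "AYBD": ["Enuma Elish", "Gilgamesh"],
    "ISBE": ["Gilgamesh"],
})
print(resources_for(db, "gilgamesh"))  # ['AYBD', 'ISBE']
```

Timings will depend entirely on the library and the machine, so the 0.3s figure above should not be read into this toy version.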

Trey Selman

Mark, thanks for taking the time to do this analysis.

I, too, use this feature quite heavily and hope to see it optimized.

For my work-flow, this feature exhibits the greatest amount of sluggish performance.

How much more wonderful it would be if the momentary and light frustration of the sluggish performance was gone!

Dave Hooton

I'm really struggling with its efficiency

If you look at your PRS in some instances the resource has no value for the reference whereas All Parallel Resources will not include that resource! I posted a couple of bug reports and I'm still puzzled by this response!

But this is born out of frustration of a great feature that is frustratingly slow,

I have 17 collections of which 8 are PRS so I get acceptable performance (2s) most of the time. But it can be an unacceptable 9s when a particular index (eg. bible) is used for the first time (see Laptop specs below). With better performance I would not be artificially limiting the number of PRS.

Mark Barnes

https://community.logos.com/discussion/comment/175794#Comment_175794

I'm still puzzled by this response!

What's really puzzling about that response is that the situation Melissa wants to avoid is exactly the way it is currently designed! (And, of course, if it can do it for the All Parallel Resource menu it can do it for a collections menu.)

Todd Phillips

https://community.logos.com/discussion/comment/175779#Comment_175779

If Logos created a single headword index (instead of one index per resource as currently), then a search for all resources matching the headword could be reduced to around 0.3s on my system (and 0.001 when cached).

If they did this, then they would also be able to provide us with a true headword search in the search window, which we want to be able to do.

spitzerpl

https://community.logos.com/discussion/comment/175770#Comment_175770

I agree this needs to be sped up. I would use this feature more if it were faster.

Same here. I see the value but never use it because of performance.

Robert Pavich

https://community.logos.com/discussion/comment/175843#Comment_175843

Same here. I see the value but never use it because of performance.

Ditto

Melissa Snyder

The case to optimize parallel resource sets is still open. I've added a link to this thread to it.

Ron

https://community.logos.com/discussion/comment/175843#Comment_175843

I agree this needs to be sped up. I would use this feature more if it were faster.

Same here. I see the value but never use it because of performance.

[Y] Same situation with me.

Mark Barnes

Likely Inefficiency #3: During stage 7, dozens of resources are looked up, even though they'll never appear in the PRS. They contain a headword index, but they're of the wrong type (monographs or commentaries) so they'll never appear in the list. If they won't show up, why bother searching them?

I've realised I'm wrong about this. Non-dictionaries do appear in headword look-ups if they're appropriately indexed (e.g. like the Africa Bible Commentary). Is this new?

Mark Barnes

https://community.logos.com/discussion/comment/175843#Comment_175843

I see the value but never use it because of performance.

Just a note of encouragement. The more you use the feature, the quicker it gets - particularly if you have lots of RAM and a fast HDD.

In my experience commentary lookups are the slowest (simply because we have more commentaries than dictionaries, and probably because differing verse ranges make verse matching fuzzier than headword matching).

Where I have many commentaries on a Bible book, I can find that the first use of the PRS menu takes more than 1 minute. But subsequent uses are usually down around the 7-10 seconds mark. Still unnecessarily slow and frustrating, but perhaps worth persevering with.

Dave Hooton

https://community.logos.com/discussion/comment/175964#Comment_175964

Just a note of encouragement. The more you use the feature, the quicker it gets

It also becomes quicker by getting more accurate! Here's my PRS Lexicons on a lookup of G5467 in ESL (Enhanced Strong's):-

All Parallel Resources lists only the resources that contain the reference (the New American Standard Dictionary is not in my Collection). My PRS lists the 11 Greek lexicons from the collection of 18, so it has done some filtering! However, 6 lexicons don't even have an Index for Strong's Greek.

After 4 or 5 repeats my PRS resembles reality except for the inclusion of TDNT, but it should be correct first time.

Dave Hooton

Fourth, for every resource with this type

Needs amending in light of 3 (not the same types)

Seventh, a series of apparently random books are all accessed. In this test, it included several volumes of Holman's commentaries, the Logos Hymnal, 1000 illustrations, (see inefficiency 3 below)

What is Stage 7 now?

Likely Inefficiency #3: During stage 7, dozens of resources are looked up, even though they'll never appear in the PRS. They contain a headword index, but they're of the wrong type (monographs or commentaries) so they'll never appear in the list. If they won't show up, why bother searching them?

As I've shown, resources that don't even have an index for the reference are included in the PRS. Resource type is not the issue because my collections will include different types having the same Index ie. data type. My commentaries include type:Bible Notes, and they should be included if they have the reference.

Russ Quinn

https://community.logos.com/discussion/comment/176021#Comment_176021

Thanks, Mark for your thorough analysis.

I'll have to leave the finer points of how all of this works to you and Dave.

I don't understand all that goes on under the hood enough to offer advice on actual optimizations.

However, I wonder if there are ways to at least change the user's perception of the speed of this feature.

Two suggestions in particular:

1. Is there a way to execute the search as a background process in anticipation of clicking the button instead of waiting on the user to click the button?
Perhaps the list could pre-populate for the current reference or headword whenever scrolling pauses for 5 seconds or so?

2. Could the results of the search be listed in the resource window as they are found instead of waiting for the entire search to be completed before any of the relevant resources are displayed?

Don't get me wrong . . . I'm all for optimizing the process to remove whatever inefficiencies are really there.
But most of the time, I feel like I am waiting for it to complete a search for more resources than I will probably use.

Fred Greco

https://community.logos.com/discussion/comment/175911#Comment_175911

I agree this needs to be sped up. I would use this feature more if it were faster.

Same here. I see the value but never use it because of performance.

Same situation with me.

I appreciate Mark's hard work in trying to figure this out (it is way beyond me), but these comments are the key for Logos. I also do not use this feature, even though it would be of great benefit almost every time I use Logos. It makes collections more usable, workspaces cleaner, RAM usage less (since less tabs need to be open) and makes commentaries much more valuable.

That is, it would, if it was at all usable. It is maddening to be working on a passage, reading one commentary, and have to click and wait like two minutes for the spinning wheel. It might even encourage users to buy more commentaries, as they would be more usable.

Right now, this feature might as well be removed. That is how bad it operates, IMO.

Westie

https://community.logos.com/discussion/comment/176066#Comment_176066

That is, it would, if it was at all usable. It is maddening to be working on a passage, reading one commentary, and have to click and wait like two minutes for the spinning wheel. It might even encourage users to buy more commentaries, as they would be more usable.

This is one of the strengths of L3, and why I still use L3. When I know that I will be reviewing various commentaries, dictionaries or lexicons, I use L3 because it is faster and easier. I feel that I have better access to books in L3, than I do in L4. I like L4, but this area is a real weakness for people that have a large library. I hope this changes.

Mark

Dave Hooton

https://community.logos.com/discussion/comment/176051#Comment_176051

1. Is there a way to execute the search as a background process in anticipation of clicking the button instead of waiting on the user to click the button?
Perhaps the list could pre-populate for the current reference or headword whenever scrolling pauses for 5 seconds or so?

You're on track with what could be accomplished:-

pre-populate for current reference if there is a pause (and sync'ing doesn't coincide!); and then
cache the results so the search does not have to be performed next time for the same reference.

It may not be possible to avoid a delay for the first access of PRS' ie. if you wait 5 secs you've already introduced a delay, even if the PRS is completed in that time!

2. Could the results of the search be listed in the resource window as they are found instead of waiting for the entire search to be completed before any of the relevant resources are displayed?

That would be messy because the resources need to appear in order of prioritization ie. after all resources have been found.

Mark Barnes

https://community.logos.com/discussion/comment/176007#Comment_176007

All Parallel Resources lists only the resources that contain the reference (the New American Standard Dictionary is not in my Collection). My PRS lists the 11 Greek lexicons from the collection of 18, so it has done some filtering! However, 6 lexicons don't even have an Index for Strong's Greek.

I have the opposite problem - almost nothing in my PRS for that reference! Certainly some kind of bug - perhaps with ESL itself?

I might do some more work on this and take it to the other thread if I have time.

Dave Hooton

https://community.logos.com/discussion/comment/176066#Comment_176066

I also do not use this feature, even though it would be of great benefit almost every time I use Logos. It makes collections more usable, workspaces cleaner, RAM usage less (since less tabs need to be open) and makes commentaries much more valuable.

It's part of the price for having dynamic collections, but I don't think it's a necessary price if we allow a few compromises ie. don't assume our collections will change every second and even if their content did change in the course of a session is that necessarily bad!? I'm not aware of metadata changes that affect my collections until something/someone brings it to my attention! But if my collections were indexed at startup this would vastly improve the efficiency of generating PRS's for a given reference. Possible triggers to perform a re-index would include editing a collection and resource metadata changes.

Note that I'm using the word "index" for any pre-defined structure that holds information on PRS's. It has more organisation than a typical cache. Some of your documents are indexed and indexing takes about 2s to 9s from various logs I've seen.
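
As a rough illustration of that idea (not a description of how Logos stores collections), the sketch below snapshots collection membership once at startup and rebuilds only when a collection is edited. The `CollectionIndex` class, the `LIBRARY` data and the rule format are all hypothetical.

```python
# Hypothetical library metadata: resource -> resource type.
LIBRARY = {
    "AYBD": "dictionary",
    "ISBE": "dictionary",
    "NICNT-Romans": "commentary",
}


def by_type(rule):
    """Stand-in for the expensive evaluation of a dynamic collection rule."""
    return {rid for rid, rtype in LIBRARY.items() if rtype == rule}


class CollectionIndex:
    """Snapshot of which PRS collections contain which resources."""

    def __init__(self, evaluate_rule, rules):
        self._evaluate = evaluate_rule
        self._rules = dict(rules)  # collection name -> rule
        self._members = {}
        self.rebuilds = 0
        self.rebuild()  # indexed once at startup

    def rebuild(self):
        """Re-run every rule; also the hook for resource metadata changes."""
        self._members = {
            name: frozenset(self._evaluate(rule))
            for name, rule in self._rules.items()
        }
        self.rebuilds += 1

    def collections_for(self, resource_id):
        """Answer from the snapshot without re-running any rule."""
        return sorted(n for n, m in self._members.items() if resource_id in m)

    def edit_collection(self, name, rule):
        """Editing a collection is one of the triggers for a re-index."""
        self._rules[name] = rule
        self.rebuild()


index = CollectionIndex(by_type, {"Dictionaries": "dictionary",
                                  "Commentaries": "commentary"})
print(index.collections_for("AYBD"), index.rebuilds)  # ['Dictionaries'] 1
```

Lookups between edits are plain set membership tests, so the cost of evaluating the dynamic rules is paid at startup and at each trigger, not on every PRS click.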

Mark Barnes

https://community.logos.com/discussion/comment/176087#Comment_176087

1. Is there a way to execute the search as a background process in anticipation of clicking the button instead of waiting on the user to click the button?
Perhaps the list could pre-populate for the current reference or headword whenever scrolling pauses for 5 seconds or so?

You're on track with what could be accomplished:-

pre-populate for current reference if there is a pause (and sync'ing doesn't coincide!); and then

cache the results so the search does not have to be performed next time for the same reference.

It may not be possible to avoid a delay for the first access of PRS' ie. if you wait 5 secs you've already introduced a delay, even if the PRS is completed in that time!

This was my initial thought until I started running the tests. But I think the problem is that PRS wasn't designed for Logos 4 out of the box. If you remember, it only got added after we all said how much we missed it. As a consequence the way indexing currently works is highly inefficient for PRS (though I have to say also for BWS, but we're more used to a wait there and run it much less often).

Pre-populating is a good idea, but if each datatype had its own index, rather than being spread over thousands of separate databases, that would solve the problem. (There's a small worry it might not be scalable, but I think it would scale to at least 20,000 records without problems, and more than that should only require a little lateral thinking.)

Mark Barnes

https://community.logos.com/discussion/comment/176021#Comment_176021

Needs amending

I lost my edit rights before I posted the last collection, unfortunately.

As I've shown resources that don't even have an index for the reference are included in the PRS.

I think we need to be careful to differentiate between bugs and features by design. I'll take the problems with ESL to your other thread.

What is Stage 7 now?

The file access that I called stage 7, is happening as part of stage 4. So stage 4 should now read "Fourth, for every similar resource with this index type", rather than "Fourth, for every resource with this type".

As you point out, I'd confused resource type with data type. In fact Logos appears to look at both resource type and data type. So Bibles aren't searched when you're looking at commentaries, even though they have the same data-type. But commentaries are searched if you're looking at dictionaries provided the commentary has an English headword index, which most don't.

Mark Barnes

https://community.logos.com/discussion/comment/176098#Comment_176098

It's part of the price for having dynamic collections,

That's only a tiny fraction of the problem. We lose about 1-2s because of the dynamic collections.

The reason this is much worse than L3 is because of the "All parallel resources" menu which wasn't there in L3. This means L4 searches ALL our resources, whilst in L3 it only searched resources we'd manually added to a parallel association.

But if my collections were indexed at startup this would vastly improve the efficiency of generating PRS's for a given reference.

If collections queries were cached, the maximum time saving would still only be 2-3 seconds on the first run, and less than 1 second on subsequent runs.

For those who are wondering what happens 'behind the hood', I created this little video demonstrating which files Logos is accessing during a PRS lookup. You'll have to be very quick to see the collections searching going on (look for ResourceCollectionManager.db flashing up a few times between 31s and 37s).

https://community.logos.com/discussion/comment/176100#Comment_176100

Hi Mark,

Wow, that's really slow for your large library. Thanks for posting a video clip to show that. Two thoughts:

1) Perhaps not everyone needs the "All parallel resources" drop down - for those who set up their own resource collections (like yourself), perhaps offering the option to not show "All parallel resources" is a quick interim fix?

2) You mentioned the possibility of adding another index with all the head words as a long-term fix to this problem. I'm concerned that this will further increase the indexing time AND frequency for us.

Thanks,

Peter

Mark Barnes

https://community.logos.com/discussion/comment/176103#Comment_176103

1) Perhaps not everyone needs the "All parallel resources" drop down - for those who set up their own resource collections (like yourself), perhaps offering the option to not show "All parallel resources" is a quick interim fix?

Yes, that would speed things up and you would imagine be very easy to achieve. Even the largest of my collections would only represent about a quarter of "All parallel resources".

2) You mentioned the possibility of adding another index with all the head words as a long-term fix to this problem. I'm concerned that this will further increase the indexing time AND frequency for us.

I'm not sure it needs another index (although without fully understanding the workings of Logos I can't be sure). I'm suggesting replacing the existing indexes with new, larger ones.

Currently, each resource has its own index file. Within that file are multiple indexes, one for each headword/datatype supported by that resource. So commentaries normally have a Bible index and a volume/page index, within that one file. 1,000 commentaries means 1,000 files. I'm suggesting having two files instead. One Bible index file, and one volume/page file. Each file would store one index. So instead of running 1,000 queries, taking perhaps 0.05s each (50 seconds), you'd only need to run one query which would take (very roughly) about 0.25 seconds. (You'd also need additional files for all the other datatypes, so you'd end up with perhaps a few dozen files instead of a few thousand.)

Because I'm suggesting these indexes replace the existing ones (it's exactly the same data, just stored in a different way), I don't believe it would increase indexing time or frequency.
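
The restructuring described above can be sketched as follows. This is only an illustration of the data layout, with invented resource names and keys; it says nothing about the real .mstidx format. It turns "one file per resource, several indexes inside" into "one index per datatype, covering every resource".

```python
from collections import defaultdict

# Hypothetical per-resource indexes: resource -> datatype -> keys it covers.
PER_RESOURCE = {
    "NICNT-Romans": {"bible": ["Rom 1:1", "Rom 1:2"], "page": ["NICNT:1"]},
    "WBC-Romans": {"bible": ["Rom 1:1"], "page": ["WBC:1", "WBC:2"]},
}


def merge_by_datatype(per_resource):
    """Same data, regrouped: datatype -> key -> resources containing that key."""
    merged = defaultdict(lambda: defaultdict(list))
    for resource_id, indexes in per_resource.items():
        for datatype, keys in indexes.items():
            for key in keys:
                merged[datatype][key].append(resource_id)
    # Return plain dicts so that lookups of missing keys don't create entries.
    return {datatype: dict(keys) for datatype, keys in merged.items()}


merged = merge_by_datatype(PER_RESOURCE)
print(sorted(merged))                      # ['bible', 'page']
print(merged["bible"].get("Rom 1:1", []))  # ['NICNT-Romans', 'WBC-Romans']
```

A lookup then touches one structure per datatype rather than one per resource. On the rough figures above, that is 1,000 x 0.05s = 50s against about 0.25s, roughly a 200-fold difference, though both numbers are estimates.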

Dave Hooton

https://community.logos.com/discussion/comment/176109#Comment_176109

Currently, each resource has its own index file. Within that file are multiple indexes, one for each headword/datatype supported by that resource. So commentaries normally have a Bible index and a volume/page index, within that one file. 1,000 commentaries means 1,000 files. I'm suggesting having two files instead. One Bible index file, and one volume/page file. Each file would store one index. So instead of running 1,000 queries, taking perhaps 0.05s each (50 seconds), you'd only need to run one query which would take (very roughly) about 0.25 seconds.

Isn't that fulfilled by the LibraryIndex?

EDIT: Ok - you mean the mstidx file in ResourceManager\logos!

Jack Caviness

https://community.logos.com/discussion/comment/176103#Comment_176103

1) Perhaps not everyone needs the "All parallel resources" drop down - for those who set up their own resource collections (like yourself), perhaps offering the option to not show "All parallel resources" is a quick interim fix?

That would be good for me. I never use the "All Parallel Resources". I am only interested in the collections where I have checked use as PRS.

Mark Barnes

https://community.logos.com/discussion/comment/176116#Comment_176116

Isn't that fulfilled by the LibraryIndex?

I'm not able to understand the format of LibraryIndex (I suspect it's proprietary). But LibraryIndex is used for searching, and is entirely separate from the Datatype/Headword indexing which is used for lookups. I think this is one of the reasons why it's not straightforward to add headword:term into the default search syntax.

So if you do a search for <HebrewStrongs = Strong’s Hebrew #1288> it will use LibraryIndex. But if you do a PRS lookup on the same number it will use the .mstidx files. I don't think LibraryIndex differentiates between headwords and occurrences within the text.

Bradley Grainger (Logos)

Mark,

Thanks for the performance analysis (which is spot on) and for sparking a conversation showing us that this feature is clearly useful and important to many users. Many problems could be solved (as you point out) by creating a global keylink index; in fact, we have a case in our bug tracking system about this that predates (I think) the PRS feature. As with any addition to an already complex system, there are some complicating factors that would make it a little tricky to implement. It wouldn't be as difficult as creating the Sentence Diagrammer, or optimising phrase searching, but certainly not something that could just be whipped out in a day. As the missing features list gets smaller, I hope we will have more time to focus on performance; it's clear that this is an important issue for many of you.

Mark Barnes

https://community.logos.com/discussion/comment/176204#Comment_176204

Thanks for the feedback, Bradley. We trust you guys to come up with a good solution once you're able to focus on this sort of thing.

Mark Smith

https://community.logos.com/discussion/comment/176204#Comment_176204

it's clear that this is an important issue for many of you.

[Y] Thanks for letting us know you will take a look at this.

Thanks to Mark for championing this cause!

steve clark

https://community.logos.com/discussion/comment/176222#Comment_176222

Thanks Mark for bringing this to the foreground.

Thanks Bradley for letting us know it is not forgotten.

i use PRS frequently, which prevents me from having too many resources open.

Dave Hooton

https://community.logos.com/discussion/comment/176204#Comment_176204

Mark,

Thanks for the performance analysis (which is spot on) and for sparking a conversation showing us that this feature is clearly useful and important to many users. Many problems could be solved (as you point out) by creating a global keylink index;

Yes. It's only when you run the tests Mark has performed that you realise the nature of the inefficiencies with the current system. So thanks to Mark, and to Bradley for his confirmation and indication of a performance improvement in the (near) future.

Damian McGrath

https://community.logos.com/discussion/comment/176204#Comment_176204

As the missing features list gets smaller, I hope we will have more time to focus on performance; it's clear that this is an important issue for many of you.

As someone who doesn't miss the missing features, I'm all for performance improvements. It's my number one issue (and, I notice, the third most important issue on uservoice)

I would also like to thank Mark for his efforts here. It's been fascinating, truly.

Trey Selman

https://community.logos.com/discussion/comment/176204#Comment_176204

Bradley, Thank you for your reply.

As Mark said, we trust you to come up with a good solution. That trust comes easy from the years of faithful work.

Each day I look forward to seeing how you make a great tool better.

Again Mark, thank you very much for your diligence, love for God, care for the tool that aids in knowing Him more, and your care and love for others.

Wyn Laidig

I agree that PRS is nearly unusable as it is now. I used this ALL THE TIME in L3, and it is one of the two major reasons I have a hard time accepting L4. (The other reason is the slowness of the editor when writing notes!) I hope Logos will invest time to solve these issues quickly.

Clinton Thomas

https://community.logos.com/discussion/comment/176559#Comment_176559

I agree that PRS is nearly unusable as it is now.

The word 'nearly' should not be in that sentence!

In June we were told in another thread that "Development is working on optimizing the parallel resource sets. I'll add this thread to the case. Thanks."

They have not done a good job.

Mark Barnes

https://community.logos.com/discussion/comment/176773#Comment_176773

They have not done a good job.

We haven't seen the results of their work yet. I presume they're doing a good job, but we'll have to wait and see.

Clinton Thomas

https://community.logos.com/discussion/comment/176816#Comment_176816

We haven't seen the results of their work yet. I presume they're doing a good job, but we'll have to wait and see.

Mine has become 2 to 10 times slower over the last 3+ months. They are presumably working on making it faster, but it clearly has gotten slower while they have been working on 'improving' performance.

Rosie Perera

https://community.logos.com/discussion/comment/176874#Comment_176874

We haven't seen the results of their work yet. I presume they're doing a good job, but we'll have to wait and see.

Mine has become 2 to 10 times slower over the last 3+ months. They are presumably working on making it faster, but it clearly has gotten slower while they have been working on 'improving' performance.

I believe they are working on it in a forked off code source for a future release. I don't think any of their work on this has been checked in to the existing releases (at least empirical evidence would support that interpretation). I expect we'll see an improvement in performance in the next version. They've only noted that the performance work has been "started" on UserVoice, but it has not ever been a major announcement with any of the releases thus far. I know they're constantly trying to tweak little things, but any major performance work is being pushed down in priority in order to get the missing feature work done. They've clearly stated in the past that getting features implemented takes priority for them over speeding things up. (Personally, I think new feature implementations should be done in such a way that they won't need performance tuning later; smart algorithms are fast by definition; there's nothing you can do to "performance tune" a slow algorithm other than scrapping it and starting over, which is a waste of development time. But that's just my opinion. I am not sure Logos shares that opinion.)

Mark Barnes

https://community.logos.com/discussion/comment/176880#Comment_176880

I am not sure Logos shares that opinion.

They certainly don't (it's all relevant, but look for "Code first, optimize later" towards the bottom).

Rosie Perera

https://community.logos.com/discussion/comment/176885#Comment_176885

I am not sure Logos shares that opinion.

They certainly don't (it's all relevant, but look for "Code first, optimize later" towards the bottom).

I know. I was trying to be diplomatic. [;)]