Logos 6 Best Feature: NO INDEXING !!!!!!!!!!

This post has 262 Replies | 5 Followers

Posts 10229
Denise | Forum Activity | Replied: Thu, Oct 3 2013 7:21 AM

Reading Australian Dave's initial comment, there's somewhat of an epiphany of sorts. '... and competitors searched at lightning speeds.'

I'm pretty much burned out on the question. I downloaded A-Company, it installed. It allowed me to select which books I wanted on my PC. And I started working. It was only later that I remembered I hadn't 'watched the videos' or 'signed up for Camp Accordance' or .... 'indexed'.  Indeed I'm not sure whether they have a manual or a forum (I guess they probably do).

All the hand-wringing to try to defend the current process is just that.


Posts 1691
LogosEmployee
Bob Pritchett | Forum Activity | Replied: Thu, Oct 3 2013 7:42 AM

Back in 2009 we posted a lot of technical info on why indexing 'is what it is.' While there may still be some room to optimize, a large part of the required time is literally required in order to do what we do.

Everybody has hundreds (or thousands) of different books making up their library. The permutations of ownership of 36,000 different books make it impossible to store every possible index and then just download it. While people who buy the same base package, and nothing else, in the same window of a few months might all have the same index, that's still a lot of combinations, and within a very short time someone will add a book and then need their own special index.

We can't just index each book separately, because if we did A) ranked results across your library would be impossible to compute with any accuracy, and B) we'd have to search each book, dramatically increasing the number of seeks on your hard drive.

Physical hard drives are much faster reading in sequence than 'seeking' to new locations. Wikipedia reports an average of 9 milliseconds for a seek on a desktop hard drive. That means that to load hits for "term" we have to spend 9 ms just getting to the list of hits, which can then be read continuously from that location. If you have 1,000 separate books with 1,000 separate pre-built indices, that'll be 9 ms x 1,000 = 9 seconds just in hard drive seek. Even super-smart coding can't argue with physics. And worse, you can't find the location of the hits in a single seek -- that seek just gets you there. You have to do other seeks (at least one) to load the index of where the term's hit lists are.
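The seek arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope model: the 9 ms figure and the 1,000 indexes come from the post, while the function name `seek_seconds` and the "two seeks per lookup" assumption (one to find the hit list, one to reach it) are illustrative.

```python
# Back-of-the-envelope seek cost. The 9 ms average seek is the figure quoted
# in the post; everything else here is an illustrative assumption.
SEEK_MS = 9


def seek_seconds(num_indexes: int, seeks_per_index: int = 1, seek_ms: float = SEEK_MS) -> float:
    """Time spent purely on seeking when a search must visit every index."""
    return num_indexes * seeks_per_index * seek_ms / 1000.0


per_book = seek_seconds(1000)                     # 1,000 separate per-book indexes
per_book_two_seeks = seek_seconds(1000, 2)        # plus one extra seek to locate each hit list
single_index = seek_seconds(1, 2)                 # one merged library index

print(per_book, per_book_two_seeks, single_index)
```

The model ignores read time and caching, so it only shows how the seek count, not the code quality, dominates on a spinning disk.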

So we build a single index of your library -- but building that index involves 'reading' every book and storing the hits. We can use more memory to minimize seeks for writing the index file, but it's still going to take a lot.
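To make "reading every book and storing the hits" concrete, here is a minimal sketch of a single inverted index built over a whole library. It is not the Logos indexer: the function name `build_library_index`, the whitespace tokenizer, and the two sample "books" are all assumptions made for illustration.

```python
from collections import defaultdict


def build_library_index(books: dict[str, str]) -> dict[str, list[tuple[str, int]]]:
    """Read every book once and record (book_id, word_position) hits per term."""
    index: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for book_id in sorted(books):
        for position, word in enumerate(books[book_id].lower().split()):
            # Strip simple punctuation so "Word." and "word" share one hit list.
            index[word.strip('.,;:!?"\'')].append((book_id, position))
    return dict(index)


library = {
    "book-a": "In the beginning was the Word",
    "book-b": "The word of the Lord endures",
}
index = build_library_index(library)
print(index["word"])
```

All hits for a term end up in one list, which is why a search needs only a lookup or two instead of one per book. The cost is that every book has to be read to build it.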

This is why SSDs make a huge difference -- much faster seeking. (None, really -- it's direct access to memory.)

So how is Google so fast even over the Internet?

A) Google has only one index for the whole Internet, not a separate index for every permutation of books a user owns. (We could pre-build a single index of all 36,000 books, but for most of you we'd keep returning results in books you don't own. Filtering out the books you don't own would slow you down. And you might not want us to re-download that multi-gig index to your hard drive every night as new books are released. We could leave it on the Internet, but then you'd have to be online.) Which leads to...

B) Google makes you use an index in the cloud, and does not support off-line use, like we do.

C) Google never 'seeks.' Google stores its entire index of the Internet in memory. Lots of memory. On lots of computers. Using lots of power. "Build my data center next to the dam" kind of power.

It's possible we could hand-optimize our indexing process and squeeze 10% or more out of it. (I'm not sure, but it's possible -- anything is possible.) But that's 6 minutes out of an hour, and may be literally the end of what we could do without changing behavior / functionality.

But laptops are quickly moving to SSDs, and there you'll see much bigger improvements than code will ever offer. If you're on a desktop, adding an SSD as an extra drive, and then moving resources and indexing there, is another option.

Sorry I don't have an easy solution, like "Bradley, stay up all night and optimize this!". :-) But I hope the explanation helps...

-- Bob

http://www.pcworld.com/article/2048120/benchmarks-dont-lie-ssd-upgrades-deliver-huge-performance-gains.html

Posts 2062
Forum MVP
Randy W. Sims | Forum Activity | Replied: Thu, Oct 3 2013 8:07 AM

Is it possible to better hide indexing? Smoke & mirrors. Find a way to make it more invisible, more transparent? Push it further into the background, an indexing service that is separated from resource downloads (which in turn could be given more user control, what downloads and when, etc). Download now. Make it available now for reading. But the indexer does its magic in the background later, tonight, or next week. I imagine that would be quite a re-working, but it might be easier than trying to squeeze out optimizations. Sometimes illusions are more useful than realities.

Posts 5564
Forum MVP
Rich DeRuiter | Forum Activity | Replied: Thu, Oct 3 2013 8:11 AM

mike:
this constant indexing really cripples my computer and my life, regardless of whether I change the schedule of updates in settings or not.

When it gets in the way of doing my work, I hit the "Pause indexing for 4 hours" option and then start indexing over lunch, or supper, or some other time when I'm away from the computer. Really, unless you need to do heavy searching in the resource you just downloaded, it isn't necessary to index it right away.

On the other hand, the index is what makes searches work. Logos won't work without an index. If we want large libraries and quick searches, indexes are as necessary as a card catalogue (or a computerized inventory) in a large library. You can find things without it, but...

What could be done, though, is to rework the way indexing happens. And actually indexing is much faster now than it was when Logos4 was first released. However, I think it could be further streamlined by doing things like running at the lowest priority (on Windows) for certain kinds of resources (like Vyrso books), or for single resource updates (except for Bibles, Lexicons & Commentaries?). I would think this option (tiered indexing priority) may have some merit. It may also be possible to streamline the indexing process itself. This is not at all a simple task, but I'd be surprised if this doesn't get studied from time to time.

Another thought is to have a "Download anytime," along with a "Schedule indexing for later" command. It would also be helpful to automate the indexing command to run automatically with a "Download later" command (this was suggested in another thread). That way we could set our Logos to update at 5pm (or whenever we typically walk away from our work computer for the day), and know that when we return in the morning, it will have updated and indexed and we could hit the ground running.

mike:
Really, this past month the indexing is just waaaaaaaaaay too much.. I really couldn't use the computer & I always play the waiting game whenever I want to use Logos.

There were some major updates to the RI and some other critical resources this month that required some heavy duty indexing. There were also problems with the way that update was released that caused multiple update sessions instead of one major one. But the pace and size of updates we experienced this month is not typical. We could ask Logos to stop updating our resources, but I don't think most users would want that (I know I wouldn't).

mike:
What if I buy 1-2 books everyday & want to read those right away.. I HAVE to index all the time???

No, you don't need to index the book to read it. You only need to index it if you want to include it in a "Basic" (AKA Library) search.

 Help links: WIKI;  Logos 6 FAQ. (Phil. 2:14, NIV)

Posts 178
DavidS | Forum Activity | Replied: Thu, Oct 3 2013 8:13 AM

Bradley Grainger (Logos):

mike:

I'm sure those smart programmers guys at Logos will know how to fix this (I'm praying for a miracle)

One possibility is to move the index from your computer to our servers (similar to what is done for the mobile apps right now). Would it be an acceptable solution to require an Internet connection for searching to work?

What if Logos would build one large index for all the books in their database? Then it would just have to be downloaded as it changes. This would greatly reduce the time compared to indexing on our much slower computers. [Assuming Logos has some real speed demons they use.] There would have to be changes in the search routine to check which books each user has purchased. Just build a filter to drop the unpurchased books so they don't show in the search results. We could even have an option to see a small part of what is available in other unpurchased books, which may increase sales for Logos!
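The filtering idea could be sketched roughly as follows. This is a hypothetical illustration only: `search_global_index`, the sample index, and the book IDs are made up, and nothing here reflects how Logos actually stores or searches its index.

```python
def search_global_index(
    global_index: dict[str, list[tuple[str, int]]],
    term: str,
    owned_books: set[str],
) -> tuple[list[tuple[str, int]], list[str]]:
    """Return hits in owned books, plus the unowned books that also match."""
    owned_hits: list[tuple[str, int]] = []
    unowned_matches: set[str] = set()
    for book_id, position in global_index.get(term, []):
        if book_id in owned_books:
            owned_hits.append((book_id, position))
        else:
            unowned_matches.add(book_id)  # candidate for a "preview" of other books
    return owned_hits, sorted(unowned_matches)


global_index = {"grace": [("book-a", 3), ("book-c", 10), ("book-d", 7)]}
hits, preview = search_global_index(global_index, "grace", owned_books={"book-a"})
print(hits, preview)
```

Note that the loop still has to read every hit for every book in the catalog before discarding the unowned ones, which is the slowdown Bob describes for a single all-books index.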

Posts 9947
George Somsel | Forum Activity | Replied: Thu, Oct 3 2013 8:18 AM

DavidS:
What if Logos would build one large index for all the books in their database? Then it would just have to be downloaded as it changes.

You mean every day?  While I may not receive new resources every day, something is generally being released every day.

george
gfsomsel

יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

Posts 3687
Francis | Forum Activity | Replied: Thu, Oct 3 2013 8:23 AM

Thank you, Bob, for once again, coming in to provide explanations.

I too find the frequency, length and computer slowdown caused by indexing tedious. I like Randy's direction of thinking. It is nice to be able to postpone indexing but perhaps it would be better to be able to prioritize or "send to background" (where we understand it will take more time but will have little effect on computer performance). If someone is in a hurry to have new resources indexed and searchable, the prioritize functionality could be invoked. If it's just regular updates, perhaps it does not need to be so demanding all the time. I know there are other programs that have options to do the background maintenance work when CPU usage is below a certain percentage.
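A "send to background" mode of this kind might look roughly like the sketch below. It is an assumption-laden illustration, not a description of the Logos indexer: `index_one` and `cpu_busy_fraction` are hypothetical callables supplied by the caller, and the 50% threshold is arbitrary.

```python
import time
from typing import Callable, Iterable


def index_in_background(
    work_items: Iterable[str],
    index_one: Callable[[str], None],
    cpu_busy_fraction: Callable[[], float],
    threshold: float = 0.5,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Index items one at a time, waiting whenever the machine looks busy."""
    done = 0
    for item in work_items:
        # Back off until measured CPU usage drops below the threshold.
        while cpu_busy_fraction() >= threshold:
            sleep(pause_seconds)
        index_one(item)
        done += 1
    return done


# Demo with fake load readings: busy once, then idle.
readings = iter([0.9, 0.2, 0.1])
pauses: list[float] = []
indexed: list[str] = []
count = index_in_background(
    ["book-a", "book-b"], indexed.append, lambda: next(readings), sleep=pauses.append
)
print(count, indexed, pauses)
```

The trade-off is the one described above: the work takes longer overall but yields to whatever else the user is doing.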

I don't have any complaint about the frequency of downloads. I understand and appreciate the constant work of updating resources. So mere frequency is not the issue when it comes to indexing. The real problem is how it affects usage given that it is both slow and demanding on computer resources.

Perhaps this will become a non-issue once SSD becomes everyone's lot, but we are far from being there yet. 

Here is a question about the suggestion of an online index that was made earlier. Does it have to be either/or (online or offline)? Can it be both? In other words, when working "online" can the desktop software call on the online index while indexing in the background for whenever offline use might come into play? I confess freely that I have no idea of what's under the hood, so it's just an idea, nothing more.

Posts 178
DavidS | Forum Activity | Replied: Thu, Oct 3 2013 8:26 AM

George Somsel:

DavidS:
What if Logos would build one large index for all the books in their database? Then it would just have to be downloaded as it changes.

You mean every day?  While I may not receive new resources every day, something is generally being released every day.

Yes! A zipped index would not be that bad to download.

Posts 178
DavidS | Forum Activity | Replied: Thu, Oct 3 2013 8:28 AM

Or allow us to download once a week as an option, or when we purchase new books, or when there are considerable changes in the books we own.

Posts 9947
George Somsel | Forum Activity | Replied: Thu, Oct 3 2013 8:33 AM

DavidS:

Or allow us to download once a week as an option.

Why not simply postpone indexing until you go to bed.  In the meantime you can read your books, but searches involving any new additions will be a bit iffy.

george
gfsomsel

יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

Posts 178
DavidS | Forum Activity | Replied: Thu, Oct 3 2013 8:37 AM

That is an option but it still involves an indexing process. I have SSD so it is not that much of a problem for me. But there are many that cannot afford to upgrade their computers.

Posts 2833
Doc B | Forum Activity | Replied: Thu, Oct 3 2013 8:38 AM

As I've said before, indexing is a pain, but I understand why it is necessary.  I'd MUCH rather be bothered with indexing after a download than with slow searches. And as Bob implied, having something like a Google search result is untenable. "We found your restaurant. It's on Earth." Start computing permutation formulae with a thirty-six thousand in there, and you'll need a mainframe computer to not choke on the numbers.

As for anyone for whom slow indexing becomes inconvenient, I sympathize. My computer is pretty much useless for that half-hour too. But as for anyone for whom slow indexing 'cripples my life', well, you really ought to find something else to do with your time.

My thanks to the various MVPs. Without them Logos would have died early. They were the only real help available.

Faithlife Corp. owes the MVPs free resources for life.

Posts 1129
Keith Larson | Forum Activity | Replied: Thu, Oct 3 2013 8:39 AM

mike:

I really couldn't use the computer & I always play the waiting game whenever I want to use Logos.

I suggest that you just use the "Pause for 4 hours" option if indexing is slowing you down too much. Once you stop having to do high CPU activity you can restart the indexing and pause it as many times as necessary. However, I rarely find indexing a problem and can keep on working just fine. I just close other CPU hogs like iTunes for example.

Posts 9947
George Somsel | Forum Activity | Replied: Thu, Oct 3 2013 8:46 AM

DavidS:

That is an option but it still involves an indexing process. I have SSD so it is not that much of a problem for me. But there are many that cannot afford to upgrade their computers.

I DON'T HAVE SSD, but I don't find the indexing to be a problem at all.  It slows things down, but it only lasts a short time.

george
gfsomsel

יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

Posts 1051
William Gabriel | Forum Activity | Replied: Thu, Oct 3 2013 9:08 AM

I've been watching this thread and thinking it through a bit. May I ask, Bob/Logos, what kind of indexing data structure do you use? As a Logos user and a developer, I imagine there's a smoother way of accomplishing what everyone wants, and I'd like to think through it a little more deeply.

Here's a thought: what if we kept the local index as is but also offered the option of the online supplement? You already do this for my iPad, right? I do a Basic Search on a word, it returns with results from my library in about a second. If I search for that same term on the desktop, it can take 30 seconds to retrieve the final output. (just tried it out now, and that was my experience)

Of course, not everyone is on the internet all the time, but some kind of a synthesized/hybrid search option may give us the best of both worlds.
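A hybrid search along these lines could be sketched as below. This is only a sketch under stated assumptions: `search_online` and `search_local` are hypothetical stand-ins for a server-side and a local index lookup, and no real Logos API is implied.

```python
from typing import Callable

SearchFn = Callable[[str], list[str]]


def hybrid_search(term: str, search_online: SearchFn, search_local: SearchFn) -> tuple[str, list[str]]:
    """Try the server-side index first; fall back to the local index when offline."""
    try:
        return "online", search_online(term)
    except OSError:  # ConnectionError and TimeoutError are subclasses of OSError
        return "local", search_local(term)


def offline_server(term: str) -> list[str]:
    """Simulates a server lookup with no network available."""
    raise ConnectionError("no network")


fallback = hybrid_search("grace", offline_server, lambda t: [f"local hit for {t}"])
online = hybrid_search("grace", lambda t: [f"server hit for {t}"], lambda t: [])
print(fallback, online)
```

The local index would still have to be built at some point for the fallback to return anything, so this hides indexing latency instead of removing it.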

As far as indexing is concerned, there has to be a way to lower the priority. I'm not talking about CPU prioritization--I know that's already reduced so that it's not supposed to get in the way. But I've got to tell you, the indexer hogs the system until it's done. I imagine it's due to the use of all sorts of resources like I/O in addition to the CPU. I have developed a strategy for Logos indexing, but frankly, I shouldn't have to do that. I shouldn't have to worry about it. I don't care if it takes longer to finish, I want to be able to productively use my computer while Logos indexes.

Finally, have you considered a drop-in product for the indexing and searching? Some really smart guys have already done a lot of work like this, and they offer their work freely. What would it look like if Logos incorporated Lucene?

Bill

Posts 13368
Forum MVP
Mark Barnes | Forum Activity | Replied: Thu, Oct 3 2013 9:18 AM

In Logos 2, indexes came on disk, and the application just merged them in (and therefore cut indexing time).

In Logos 3, indexes came on disk and were kept separate so that indexing wasn't required. Without one joint index though, searching was slow.

In Logos 4 and 5, no indexes were supplied, and the application creates and merges them.

It would be relatively easy for Logos to go back to the Logos 2 days, when indexes were created by Logos and then downloaded. That could cut indexing in half, but increase downloading by 50% or so.

That said, by the time Logos 6 comes out we'll all have SSDs and superfast processors, and we won't mind if indexing is taking place. I have a very large library, and books are frequently purchased and updated. I also have an i7 processor and pro-level SSD. I regularly have indexing taking place whilst I'm working, and I barely notice it.

Posts 13368
Forum MVP
Mark Barnes | Forum Activity | Replied: Thu, Oct 3 2013 9:20 AM

Bob Pritchett:
We can't just index each book separately, because if we did A) ranked results across your library would be impossible to compute with any accuracy, and B) we'd have to search each book, dramatically increasing the number of seeks on your hard drive.

So why do we still have .mstidx files? Isn't it time they were optimized too?

Posts 178
DavidS | Forum Activity | Replied: Thu, Oct 3 2013 10:07 AM

Bob Pritchett:

We can't just index each book separately, because if we did A) ranked results across your library would be impossible to compute with any accuracy, and B) we'd have to search each book, dramatically increasing the number of seeks on your hard drive.

Bob,

I am confused. I used to program computers years ago. If each book were indexed and the results sorted [at Logos], and then all of them merged into a library-wide index [on user computers], there would be no long indexing. With today's processing speeds this should be really fast. Am I missing something here?
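The merge being proposed here can be sketched with a k-way merge of already-sorted per-book postings. This is an illustration under assumptions, not the Logos file format: the `(term, book_id, position)` tuple layout and the name `merge_book_indexes` are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter

Posting = tuple[str, str, int]  # (term, book_id, position)


def merge_book_indexes(*book_indexes: list[Posting]) -> dict[str, list[tuple[str, int]]]:
    """Merge pre-sorted per-book posting lists into one library-wide index."""
    merged = heapq.merge(*book_indexes)  # lazy k-way merge; inputs must already be sorted
    return {
        term: [(book_id, position) for _, book_id, position in group]
        for term, group in groupby(merged, key=itemgetter(0))
    }


book_a = [("beginning", "book-a", 2), ("word", "book-a", 5)]
book_b = [("lord", "book-b", 4), ("word", "book-b", 1)]
library_index = merge_book_indexes(book_a, book_b)
print(library_index)
```

Merging skips re-reading and tokenizing the books, but every posting is still read and written once, so the disk work Bob describes does not disappear entirely.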

Thanks for your time to communicate with us. It really helps to understand Logos.

Posts 3938
abondservant | Forum Activity | Replied: Thu, Oct 3 2013 10:12 AM

I don't notice a difference in performance when my computer is indexing vs when it's not. But my computer is not old, has plenty of memory, and has an SSD.

However Bob - my thinking was that you have 36,000 individual indexes - one for each book.

But my computer should only have one index. So when I buy a new book, the book downloads, and then the index for that book is added to the master index of books I own on my computer. As in ALL book indexes (for which I own a license) get added to the same file as they do now. However, instead of my computer doing the indexing, why couldn't the index just be appended to my existing index?

Does that make more sense? This way most of the problems you outlined don't exist, and people with slower computers don't have to deal with indexing.

McAfee calls this "delta updates".

This way, if you have version 1 of the virus definitions and they are on version 3, rather than downloading version 2 and then version 3 in full, you simply download the things that are new in versions 2 and 3, and they are dynamically inserted into the virus definition file.
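Applied to a search index, a delta update might look like the minimal sketch below. The names `apply_delta`, `master`, and the per-book posting layout are assumptions for illustration, and the thread does not describe how the real index file is laid out.

```python
def apply_delta(
    master: dict[str, list[tuple[str, int]]],
    book_postings: dict[str, list[int]],
    book_id: str,
) -> None:
    """Add one book's prebuilt postings to the master index in place."""
    for term, positions in book_postings.items():
        master.setdefault(term, []).extend((book_id, pos) for pos in positions)


master = {"word": [("book-a", 5)]}
new_book = {"word": [1], "lord": [4]}  # postings prebuilt elsewhere and downloaded
apply_delta(master, new_book, "book-b")
print(master)
```

In memory, appending is cheap. Whether an on-disk index can be extended this way depends on whether each term's hit list must stay contiguous, which is the sequential-read property Bob's explanation relies on.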


L2 lvl4, L3 Scholars, L4 Scholars, L5 Platinum,  L6 Collectors. L7 Baptist Portfolio. L8 Baptist Platinum.

Posts 777
JRS | Forum Activity | Replied: Thu, Oct 3 2013 10:13 AM

Why not simply incorporate/expand the scheduler for downloading and indexing with power options after the dnload/index is complete? 

IOW something like this:

Check for downloads Every day / Every Sun/Mon/Tue/Wed/Thu/Fri/Sat between xx:xx am/pm and zz:zz am/pm. 

Shut down / Reboot / Put Computer to Sleep / Do Nothing / Index after Downloading.

Shut down / Reboot / Put Computer to Sleep / Do Nothing after Indexing.

 

Logos should also be able to wake the computer from a sleep state to perform this dnload/index function (If it isn't already able to do so).

And, it should send Notification emails with log files attached upon any errors and/or a successful dnload/index report upon completion.

How blessed is the one whom Thou dost choose, and bring near to Thee(Psa 65:4a)
