Indexing Again?

Comments

  • Juanita
    Juanita Member Posts: 1,339

    If you want Bob's explanation, you can find it in this thread on pages 3 & 4.

  • Simon’s Brother
    Simon’s Brother Member Posts: 6,816 ✭✭✭

     What is the purpose of this indexing?


    In short: greatly improved search speeds.

  • Joe K
    Joe K Member Posts: 48 ✭✭

    Logos,

    I really do not like the indexing that is occurring, and on every single customer's computer. I am not much of a search-through-my-whole-library type of guy, so I could live with the search speed of L3. This constant index rebuilding is a big turn-off for me. This really needs to be done better. I cannot imagine folks are going to live with 12 hours of indexing after a new resource is acquired. People will stay away from purchasing in order to avoid the dreadful indexing. Please re-think the process!

    Thanks! -JoeK

  • Mark Smith
    Mark Smith Member, MVP Posts: 11,790 ✭✭✭

    Please re-think the process!

    My feelings exactly. Whether our resources are indexed, and when the indexing occurs if we choose it, should be up to us. There have been many good points made in support of this. Let Logos ship this with full indexing turned on and with indexing under the control of the program, updater, or whatever it is that now controls it, but give us some control over this behavior.

    Right now other things in 4.0 (rendering of search results is a big one) slow things down so much that all the advantages for the types of searches I routinely do are lost. I take quite a hit in performance for everything else I use my computer for in the six hours or more it takes to index my Library. At least let me schedule it for after I go to bed!

    Pastor, North Park Baptist Church

    Bridgeport, CT USA

  • Bradley Grainger (Logos)
    Bradley Grainger (Logos) Administrator, Logos Employee Posts: 11,950

    You don't need to do B by slowing the process; just drop the process priority. For instance, I've done lots of distributed computing projects, and while they're designed to peg your CPU at 100%, they're also designed to let go of the processor immediately, because they run in a lower-priority process. Best of both worlds, I think. If you don't need the power, the indexer has it, but if you need the power, the indexer lets you have it immediately until you don't need it anymore.

    The indexer already runs its indexing threads at the lowest possible priority. It doesn't drop the process priority during that period, but Windows uses thread priority as the basis of scheduling. (The process priority is just a base value that is applied to the thread.) Still, we could try dropping the process priority during indexing to see if it makes a difference.
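
    To illustrate how a process priority class and a thread priority level combine, here is a minimal Python sketch of the base-priority table from Microsoft's "Scheduling Priorities" documentation. It is only a model for reasoning about the numbers, not Logos code; the function and dictionary names are made up for this example.

```python
# Base priority for each Windows process priority class
# (values from the Windows "Scheduling Priorities" table).
PROCESS_CLASS_BASE = {
    "IDLE": 4,
    "BELOW_NORMAL": 6,
    "NORMAL": 8,
    "ABOVE_NORMAL": 10,
    "HIGH": 13,
    "REALTIME": 24,
}

# Relative thread priority levels are offsets from the class base.
THREAD_OFFSET = {
    "LOWEST": -2,
    "BELOW_NORMAL": -1,
    "NORMAL": 0,
    "ABOVE_NORMAL": 1,
    "HIGHEST": 2,
}


def base_priority(process_class, thread_level):
    """Return the scheduler base priority (1-31) of a thread."""
    realtime = process_class == "REALTIME"
    # IDLE and TIME_CRITICAL are absolute values, not offsets.
    if thread_level == "IDLE":
        return 16 if realtime else 1
    if thread_level == "TIME_CRITICAL":
        return 31 if realtime else 15
    return PROCESS_CLASS_BASE[process_class] + THREAD_OFFSET[thread_level]


print(base_priority("NORMAL", "LOWEST"))  # 6
print(base_priority("IDLE", "LOWEST"))    # 2
print(base_priority("NORMAL", "IDLE"))    # 1
print(base_priority("IDLE", "IDLE"))      # 1
```

    If the indexing threads use the idle thread level (an assumption; the post only says "lowest possible priority"), they already sit at base priority 1 in a normal-class process, so lowering the process class would not lower them any further.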

    My feeling is the biggest problem is that the indexer is doing a huge amount of I/O, which makes the system feel really sluggish. (Programs that are just cracking RC5 keys, for example, do almost zero I/O.) On Vista, we tried using "background processing mode" (http://code.logos.com/blog/2008/10/using_background_processing_mode_from_c.html); while it definitely improved the subjective responsiveness of the app, it made indexing run for 5-10x longer than it normally would; even on a top-of-the-line machine, indexing would now take days. (Basically, every indexer I/O request got pushed behind everything else running on the system: virus scanners, backup, defrag, etc.)
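
    For readers unfamiliar with background processing mode: on Vista and later, a thread enters it by calling the Win32 function SetThreadPriority with THREAD_MODE_BACKGROUND_BEGIN, which lowers its I/O and memory priority as well as its CPU priority. The sketch below does this from Python via ctypes. It is an illustration only (the linked blog post covers doing it from C#), and the context-manager name is made up for this example. On non-Windows systems it does nothing.

```python
import sys
from contextlib import contextmanager

# Win32 constants for SetThreadPriority.
THREAD_MODE_BACKGROUND_BEGIN = 0x00010000
THREAD_MODE_BACKGROUND_END = 0x00020000


@contextmanager
def background_mode():
    """Run the enclosed block with the calling thread in background mode.

    Yields True if background mode was entered, False otherwise
    (for example, when not running on Windows).
    """
    if sys.platform != "win32":
        yield False
        return

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentThread.restype = wintypes.HANDLE
    kernel32.SetThreadPriority.argtypes = [wintypes.HANDLE, ctypes.c_int]
    kernel32.SetThreadPriority.restype = wintypes.BOOL

    # GetCurrentThread returns a pseudo-handle for the calling thread.
    thread = kernel32.GetCurrentThread()
    entered = bool(
        kernel32.SetThreadPriority(thread, THREAD_MODE_BACKGROUND_BEGIN)
    )
    try:
        yield entered
    finally:
        # Only leave background mode if we actually entered it.
        if entered:
            kernel32.SetThreadPriority(thread, THREAD_MODE_BACKGROUND_END)


with background_mode() as entered:
    # I/O-heavy work would go here.
    pass
```

    The mode applies to the calling thread only, so each worker thread would have to enter and leave it around its own work.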

    We are profiling I/O usage and trying to bring it to a minimum; this should really help systems with slower HDDs. Unfortunately, we're using a third-party component during the indexing that probably issues 10x more I/O than is strictly necessary; even more unfortunately, it's doubtful that we'll have time to rewrite this before we ship 4.0. But we're doing everything else we can to reduce CPU and disk usage, and we hope to deliver even more improvements in the upcoming betas.

  • Jacob Hantla
    Jacob Hantla Member, MVP Posts: 3,871 ✭✭✭

    it made indexing run for 5-10x longer than it normally would; even on a top-of-the-line machine, indexing would now take days.

    I personally don't think that an index time in the week+ range is a problem AFTER the initial index is built...so long as we still have use of the previous index while the next is being built.

    Would it be possible to search new resources without an index (or to generate a secondary, "quick" index as soon as new resources are downloaded) and then display those as "new" results? Then, a week or two (or more) later, once the new index is built, the old one can be replaced and the "new" results will no longer appear.
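
    A toy sketch of the two-index idea, assuming a simple word-to-titles inverted index. The function names and resource titles are hypothetical and say nothing about how the Logos indexer actually stores its data.

```python
from collections import defaultdict


def build_index(resources):
    """Map each lowercase word to the set of resource titles containing it."""
    index = defaultdict(set)
    for title, text in resources.items():
        for word in text.lower().split():
            index[word].add(title)
    return index


def search(word, main_index, new_index):
    """Query both indexes and report the hits in two separate groups."""
    word = word.lower()
    return {
        "new": sorted(new_index.get(word, set())),
        "library": sorted(main_index.get(word, set())),
    }


def merge(main_index, new_index):
    """Fold the small index into the main one, then empty it."""
    for word, titles in new_index.items():
        main_index[word] |= titles
    new_index.clear()


main = build_index({"Commentary A": "grace and peace", "Lexicon B": "peace"})
new = build_index({"Sermons C": "grace upon grace"})

print(search("grace", main, new))
# {'new': ['Sermons C'], 'library': ['Commentary A']}

merge(main, new)
print(search("grace", main, new))
# {'new': [], 'library': ['Commentary A', 'Sermons C']}
```

    Building the small index costs time proportional to the new resources only, which is why they could become searchable right away while the full rebuild or merge waits.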

    I think we can forgive some processor work upon install (especially if it's pausable and can survive a restart), but I could imagine quite a few people would be hesitant to install (or purchase new resources) if it would consign their machine to the perpetual indexing overload that we're feeling now.

    Jacob Hantla
    Pastor/Elder, Grace Bible Church
    gbcaz.org

  • TCBlack
    TCBlack Member Posts: 10,978

    Thanks for the explanation Bradley. 

    It helps to keep the beta testers fed with lots of nubilous information.  [:)]

    Hmm Sarcasm is my love language. Obviously I love you. 

  • Dave Hooton
    Dave Hooton Member, MVP Posts: 35,672 ✭✭✭

    A) We can index newly installed books in their own tiny index, and
    then run searches against two indexes. Your results would be presented
    as "Hits in new books:" (a short list) and "Hits in your library:" (the
    longer, normal list). Then we could re-index or merge the indexes
    later. You'd get immediate access to your new book, but have to wait a
    day or so for it to be integrated into the master index. To sell this
    as a feature..."For the first day you own it, we highlight every time it
    shows a search hit! See the value of your new book!" :-)

    B) We can index slowly. (This is what Windows and other large
    indexers are doing.) You won't notice your CPU getting pounded, and
    your drive will spin less often. We'll just take 12 hours to do 6 hours
    of work. Advantage: less thrashing. Disadvantage: Takes twice as long.
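
    Option B amounts to a duty cycle: work for a while, then stay idle for a proportional while. A minimal sketch, assuming the work can be split into per-item steps; `index_one` and the other names are hypothetical stand-ins, not the Logos indexer.

```python
import time


def index_throttled(items, index_one, duty_cycle=0.5,
                    sleep=time.sleep, clock=time.perf_counter):
    """Index items one at a time, idling after each to cap average load.

    duty_cycle is the fraction of wall-clock time spent working:
    0.5 means every second of work is followed by a second of idling,
    so 6 hours of work takes about 12 hours.
    Returns the total time spent idling, in seconds.
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    idle_ratio = (1 - duty_cycle) / duty_cycle
    total_idle = 0.0
    for item in items:
        start = clock()
        index_one(item)
        busy = clock() - start
        pause = busy * idle_ratio
        if pause > 0:
            sleep(pause)
        total_idle += pause
    return total_idle
```

    With `duty_cycle=0.5` this reproduces the "12 hours to do 6 hours of work" trade-off from the quote; a real indexer would probably also want to pause entirely while the user is active.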

    I somehow missed this thread from 4 days ago, so my answer is that option A is needed and my laptop will love option B!

    But we're doing everything else we can to reduce CPU and disk usage, and we hope to deliver even more improvements in the upcoming betas.

    I really appreciate the 30% improvement I experienced with indexing for B3 vs B2.

    Dave
    ===

    Windows 11 & Android 13