Please explain "Ranked" results...

Damian McGrath Member Posts: 3,051 ✭✭✭
edited November 20 in English Forum

I do a basic search for Luke 1:1-4....

The result:

 

Three separate hits from the Word in Life Study Bible, two hits in a row from the Lectionary, and a passage explaining the layout of the ACCS commentary....

Are they ranked by places we'd never even dream of looking?

Comments

  • Simon’s Brother
    Simon’s Brother Member Posts: 6,816 ✭✭✭

    Are they ranked by places we'd never even dream of looking?


    There are other places where I wonder the same thing about the hits we get presented.

  • Bob Pritchett
    Bob Pritchett Member, Logos Employee Posts: 2,280

    Ranking is done with standard information retrieval algorithms. They find articles that have the most similarity to your query. So if your query is:

    red, blue, green

    and the article is

    red, blue, green

    You get a 100% match. (High rank.)

    If the article is:

    red, blue

    you get a 66% match.

    If the article is:

    red white blue white green white

    you get a 50% match.

    A side effect of this standard system is that short articles that are similar to your query get high ranks. And a short article gets preferred over a long one (since long ones have more non-query words). There are attempts to "normalize for length", but on single-term queries -- and a Bible verse is a single term -- it's hard to overcome the effect. In this case the lectionary article is very tiny, since the Bible text itself is inserted dynamically (so you can select the version of your choice). To the indexer, it's just an article with 3 or 4 Bible verses. That makes a query for one of those Bible verses a 25% match, which ranks much higher than the 0.002% match it might be in a much larger article.
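The percentages above can be reproduced with a toy "bag of words" overlap score. This is only a minimal sketch under stated assumptions: the actual Logos ranking algorithm is not described in this thread, and `tokenize` and `bag_similarity` are hypothetical names chosen for illustration. The score used here is a multiset Jaccard ratio (shared tokens divided by the union of both bags), which happens to match the three examples and the lectionary case.

```python
from collections import Counter


def tokenize(text):
    """Toy tokenizer (an assumption): lowercase, treat commas as spaces, split."""
    return text.lower().replace(",", " ").split()


def bag_similarity(query, article):
    """Multiset Jaccard: shared token count over the union of both bags."""
    q = Counter(tokenize(query))
    a = Counter(tokenize(article))
    shared = sum((q & a).values())  # per-token minimum of the two counts
    total = sum((q | a).values())   # per-token maximum of the two counts
    return shared / total if total else 0.0


query = "red, blue, green"
articles = [
    "red, blue, green",
    "red, blue",
    "red white blue white green white",
]
for article in articles:
    print(f"{bag_similarity(query, article):.0%}  {article}")
```

The three articles score 1.0, 2/3, and 0.5. If each verse reference is treated as a single token, a one-verse query against a four-verse article scores 1/4, the 25% figure mentioned above, and every extra non-query token in a longer article pushes the score lower.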

    And arguably, it is a good match. That article only has four things in it, and your search was for one of those four things. That's a better match to your query than a big long commentary article that has hundreds of unrelated words in it.

    Now you and I know that's not a great match. But the system doesn't. And it doesn't know what would be a better match, since it can't read your mind. I can't even read your mind, :-). I'm not sure why you're searching for Luke 1:1-4. Are you looking to find the text of it? (In which case the lectionary page isn't so bad.) Or are you looking for a commentary on it?  Or are you looking into Theophilus? Or into book introductions in general? Or for a contrast in Luke's framing vs. the other gospels?

    What would you expect as the first hit? (Honest question, so we can try to tune the system in the future.)

    Generally, you'll get better results the more terms you provide. Even one more term helps; I tried:

    <bible ~ "luke 1:1-4"> theophilus

    and got a much more useful journal article contrasting Luke 1:1-4 and Acts 1:1-3. Seems pretty relevant to my query.

    Ideally, we're like Google. (Which also returns a "short page" featuring the text "Luke 1:1-4" -- in this case, from a Bible that has just the reference and the text.) A "bag of words" query works best.

    Hope this helps!

     

  • Simon’s Brother
    Simon’s Brother Member Posts: 6,816 ✭✭✭

    Thanks, Bob. That makes sense of what is happening, and it suggests how we might refine our searches a little more.

  • Damian McGrath
    Damian McGrath Member Posts: 3,051 ✭✭✭

    Bob,

    Thanks for the comprehensive response. That's another item checked off my mega wishlist (have you seen the others?)

     

    I can't even read your mind, :-).

    No need.... I'll tell you what's on it :)

     

     I'm not sure why you're searching for Luke 1:1-4. Are you looking to find the text of it? (In which case the lectionary page isn't so bad.) Or are you looking for a commentary on it?  Or are you looking into Theophilus? Or into book introductions in general? Or for a contrast in Luke's framing vs. the other gospels?

    What would you expect as the first hit? (Honest question, so we can try to tune the system in the future.)

    I suppose the sections of books or articles dedicated to these verses. Maybe articles in which the passage appears in the title, or in chapter headings. I expect a "ranked" result to be one weighted for importance. I can't imagine how hard it would be to program for that....
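One common way to approximate "weighted for importance" is to score each field of an article separately and give titles and headings a larger weight than body text. The sketch below is purely illustrative: the field names, the weights, and `field_weighted_score` are assumptions, not anything Logos is known to use.

```python
def field_weighted_score(query_terms, doc, weights):
    """Sum, per field, the fraction of query terms found there times the field weight."""
    if not query_terms:
        return 0.0
    score = 0.0
    for field, weight in weights.items():
        words = set(doc.get(field, "").lower().split())
        hits = sum(1 for term in query_terms if term.lower() in words)
        score += weight * hits / len(query_terms)
    return score


# Hypothetical weights: a title match counts three times as much as a body match.
weights = {"title": 3.0, "heading": 2.0, "body": 1.0}
query_terms = ["luke", "1:1-4"]

commentary = {
    "title": "Luke 1:1-4 The Prologue",
    "body": "Luke addresses Theophilus and explains his purpose",
}
lectionary = {
    "title": "Third Sunday in Ordinary Time",
    "body": "Nehemiah 8:2-10 Psalm 19 Luke 1:1-4",
}

print(field_weighted_score(query_terms, commentary, weights))
print(field_weighted_score(query_terms, lectionary, weights))
```

With these made-up weights the commentary section, which names the passage in its title, outranks the short lectionary entry that only lists the reference in its body.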

     

     

     

    Generally, you'll get better results the more terms you provide. Even one more term helps; I tried:

    <bible ~ "luke 1:1-4"> theophilus

    and got a much more useful journal article contrasting Luke 1:1-4 and Acts 1:1-3. Seems pretty relevant to my query.

     

     

    I only get one result for this search. Why is that? I would have thought that I had plenty of resources which dealt with Luke 1:1-4 and with Theophilus. What am I missing here?

    Also, I've been searching from the right-click context menu, where I can't add extra search terms.

    Ideally, we're like Google

    Ideally, I'd like the results to be more Google-like. I know that Google factors in all sorts of things when weighing up its rankings, not just the number of hits per word count.

     

    But, all said, at least I understand. I'll probably just stick to the By Book search and drill my way down....

    PS can we please make these collapsible:

    image