PBB: Query regarding headwords and tagging

Posts 2279
Andy | Forum Activity | Posted: Sat, Aug 20 2011 1:38 AM

I am currently working on my first proper personal book, George Herbert's The Country Parson, and I was wondering if someone would be able to help clarify a couple of points.

The first query pertains to headwords.

I understand from the Wiki that headwords operate in a similar way to topics in L3. I presume, therefore, if I were to insert 'Rome' as a headword (as per the example in the Wiki), a search for 'Rome' would then locate and bring up the headed passage in the list of results. (I presume the result would be returned in the topic section with the Bible dictionaries and Encyclopaedias prior to the longer list of general results by word.)

I was wondering, however, whether there is a defined set of headwords and whether these are listed anywhere? I presume it is good practice to limit the number of headwords in use in order to ensure that the topic searches are not rendered useless.

I was also wondering about the appropriateness of utilising headwords in a book like The Country Parson. I understand that Bible Dictionaries, etc., utilise headwords, but, again, want to be consistent with regards to the usual approach in typical L4 resources. There will be opportunity to utilise headwords as the book tends to fall into sections dealing with discrete and defined topics. My issue, then, is one of consistency.

Finally, there is the question of whether I should modernise the language or not.

In truth, I would prefer to retain as much of the original language as possible (I would not want to use a modernised version in an academic paper, for example), but appreciate that this would pose practical difficulties (i.e., I understand a search for the "cross of Christ" would not return results for the "crosse of Christ"). I was wondering, therefore, if there was any way of tagging the archaic variants so that they will be picked up when running a search using the modern spelling.

Apologies for the string of questions and thank you, in advance, for your kind assistance.

Andy

Posts 2751
DominicM | Forum Activity | Replied: Sat, Aug 20 2011 2:13 AM

Headwords can be whatever you want. As you say, for dictionaries it would be the topic/word, but each book will have a different set of them, so a definitive list is impossible. If it's a specific sub-topic, again I would use headwords for those, but it's personal preference and depends on how much time you wish to spend tagging.

I would not modernise, but that's a personal choice. You could always add a footnote with your observations.

If you need to make an internal link other than to a "topic" headword, use something weird like ZZZ-MyLink as the headword; then your search results will not be adversely affected.

I haven't found a way of firing a search from a link in a PB (yet), but I haven't tried. If it is possible, you could make your own PB for topical searching and just update the link accordingly so that it searches all variants in one click. Whether this is possible I don't yet know.

Edit
After testing: it is possible.

[["crosse of Christ" OR "cross of Christ"  >> logos4:Search;kind=Basic;q=$22crosse_of_Christ$22_OR_$22cross_of_Christ$22;match=stem]]
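For anyone who wants to generate such links for many variant sets, here is a minimal Python sketch. It assumes only what the single example above shows: double quotes are written as $22 and spaces as underscores inside the q= parameter. The function names are hypothetical, and how other characters should be encoded is not covered in this thread, so they are passed through unchanged.

```python
def encode_query(query: str) -> str:
    """Encode a query the way the example link appears to:
    double quotes become $22 and spaces become underscores.
    Any other character is left untouched (an assumption)."""
    return query.replace('"', "$22").replace(" ", "_")


def build_search_link(phrases: list[str]) -> str:
    """Return PB link markup that runs a Basic search for any of the phrases."""
    # Quote each phrase and join with OR, as in the example above.
    query = " OR ".join(f'"{p}"' for p in phrases)
    target = f"logos4:Search;kind=Basic;q={encode_query(query)};match=stem"
    return f"[[{query} >> {target}]]"


print(build_search_link(["crosse of Christ", "cross of Christ"]))
```

With the two phrases from the example, this prints the same link text and target as the line above.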

Never Deprive Anyone of Hope.. It Might Be ALL They Have

Posts 2279
Andy | Forum Activity | Replied: Sat, Aug 20 2011 2:45 AM

Thanks so much for your response, Dominic, the information regarding headwords is really helpful and has resolved my query.

With regards to modernisation, I agree entirely. What I guess I want to do is to be able to run a search for, say, 'cross of Christ' and return results for both 'cross of Christ' and 'crosse of Christ' across all resources (or a particular collection). In other words, I want the surface text to read 'crosse', 'temporall', 'countrey', etc., but L4 to pick up the variants, 'cross', 'temporal', 'country', etc., with regards to search results.

Footnotes and, I guess, editorial notes (in square brackets) would achieve this to some extent, but would interrupt the surface text as there are many, many variants in every paragraph (which would necessitate many, many footnotes).

I will have a play around and report back if I stumble across any satisfactory workaround. I would be grateful if anything occurs to anyone else.

Apologies for belabouring this point, but I have half a mind to turn to either Edmund Spenser (I have a minor obsession with The Faerie Queene; I know, I know, but the medication is helping somewhat...) or perhaps Donne, but would be reluctant to do so without a workaround. Modernisation (for which I share your inclination) is not really an option with poetry and, pre-Samuel Johnson, the variation in spelling would render the search function almost useless with such Early Modern texts. 

Seriously, thanks again for the clarification regarding headwords. 

Edit: Dominic: You beat me to it with your edit. Thanks for testing that. I will give it a whirl and see if it achieves what I am after. Thank you again.

Posts 24005
Forum MVP
Graham Criddle | Forum Activity | Replied: Sat, Aug 20 2011 3:44 AM

Andy

Have you tried searching with "Match all word forms" set?

I tried this for "cross" and "crosse" and got virtually the same number of hits.

Don't know if this is a potential approach to your search challenge?

Graham

Posts 26223
Forum MVP
Dave Hooton | Forum Activity | Replied: Sat, Aug 20 2011 11:34 PM

Andy Evans:
In other words, I want the surface text to read 'crosse', 'temporall', 'countrey', etc., but L4 to pick up the variants, 'cross', 'temporal', 'country', etc., with regards to search results.

There is no easy solution to this. The "Match all Word forms" option cannot be relied upon as its word forms are algorithmic and it can provide fewer results than you expect (e.g. not matching the archaic form) or more results (e.g. plural forms).

Don't use that option; instead, type the forms you expect as a list in Basic Search, e.g.

(cross, crosse)   ==> same as cross OR crosse

(fairy, faery, faerie)
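Until saved word lists exist, the lists can be kept outside Logos and pasted in when needed. The Python sketch below is one way to do that; the VARIANTS table and the search_list function are hypothetical names for illustration, and the output format simply follows the two examples above.

```python
# Hypothetical table: modern spelling -> archaic variants seen in the text.
VARIANTS = {
    "cross": ["crosse"],
    "fairy": ["faery", "faerie"],
}


def search_list(modern: str, variants: dict[str, list[str]] = VARIANTS) -> str:
    """Build a parenthesised, comma-separated list for Basic Search."""
    forms = [modern, *variants.get(modern, [])]
    return "(" + ", ".join(forms) + ")"


print(search_list("cross"))  # (cross, crosse)
print(search_list("fairy"))  # (fairy, faery, faerie)
```

A word with no recorded variants comes back as a one-item list, e.g. (temporal).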

At some stage Logos will provide Word Lists (and allow them to be searched) so you don't have to generate them each time you search.

Dave
===

Windows 10 & Android 8

Posts 2279
Andy | Forum Activity | Replied: Sun, Aug 21 2011 12:02 AM

Thanks for this, Dave.

I was aware that it is possible to run a string of search terms (fairy, faery, etc.) in order to identify variants, but hoped to be able to address this in the creation of the PBB. The issue with some of the early modern texts is that the spelling is inconsistent within the actual document itself. It is difficult, therefore, to anticipate the spelling of each and every variant and there is a danger, even with a creative string of search terms, that something might be missed.
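One way to reduce the risk of missing a spelling is to scan the source text for look-alike words before building the search list. The following is only a rough Python sketch using the standard library's difflib; the function name, the sample sentence and the 0.8 cutoff are all assumptions, and the output is a list of candidates to review by hand rather than a reliable set of variants.

```python
import difflib
import re


def candidate_variants(text: str, modern: str, cutoff: float = 0.8) -> list[str]:
    """Return distinct words in `text` that resemble `modern` but differ from it."""
    # Collect the distinct lower-cased words that occur in the text.
    vocabulary = sorted(set(re.findall(r"[a-z]+", text.lower())))
    # Keep words whose similarity to the modern spelling meets the cutoff.
    close = difflib.get_close_matches(modern, vocabulary, n=10, cutoff=cutoff)
    return [word for word in close if word != modern]


# Made-up sample sentence, not a quotation from The Country Parson.
sample = "The crosse of Christ, and the cross; a countrey parson in the country."
print(candidate_variants(sample, "cross"))
print(candidate_variants(sample, "country"))
```

A lower cutoff catches more distant spellings at the cost of more false matches, so the threshold would need tuning against the actual text.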

I was hoping that it might be possible to address this (with syntax) in the creation of the PB (i.e. tag 'faery' as 'fairy' and 'crosse' as 'cross'). This would involve more work in the creation of the PB, but would eliminate inconsistencies in searches.

I understand from the response that this is not presently possible.

The only alternative (short of living with it) is to amend the inconsistencies prior to creating the PB. This could be done, but is less than ideal (certainly in an academic context).

I agree that Word Lists will be helpful and will assist in the building of searches. It would also be helpful if we could edit and add our own Word Lists (this would be a more than acceptable workaround).

All of this makes me more appreciative of the work of Samuel Johnson ;)

Thanks again for your help with this.

 

Posts 26223
Forum MVP
Dave Hooton | Forum Activity | Replied: Sun, Aug 21 2011 12:28 AM

Andy Evans:
It would also be helpful if we could edit and add our own Word Lists

This will be the basic functionality.

 

Dave
===

Windows 10 & Android 8

Posts 2279
Andy | Forum Activity | Replied: Sun, Aug 21 2011 12:50 AM

Dave Hooton:

This will be the basic functionality.

Thanks, Dave. That is really useful to know.

Andy
