In past discussions I have read explanations of why search analyses produce what look, in plain terms, like oddities, such as the "--" results. One problem I have yet to resolve for myself is how to get beyond technical explanations about reverse interlinears, cells, and whatnot to data I can actually use.
For instance, this morning I searched for verbs in a section (in the LHB) to see which grammatical subjects dominate. Apparently only clause search allows me to sort by subject, so that is what I used. But when I sorted, a good number of results were listed as "--", which I had difficulty understanding, since the verbs listed had clearly identifiable subjects. The problem is this: what value do the statistics have if a large number of instances are classified as "--" for whatever reason? I was trying to determine which person entity was the leading subject of verbs in the section, but I could not know the answer with confidence. Nor is the problem only statistical: when I look at one particular subject, how am I to know that there are not instances classified as "--" that actually refer to that subject?
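To make the worry concrete, here is a minimal sketch in Python of how one might bound the uncertainty once the sorted results are copied out by hand. The subject names and tallies are invented purely for illustration, the function names are hypothetical, and nothing here is a feature of the search tool itself. It assumes only that every "--" result could belong to any subject, including one not otherwise listed.

```python
from collections import Counter

UNCLASSIFIED = "--"  # label the search results use for an unidentified subject


def subject_bounds(subjects):
    """Return ({subject: (min_count, max_count)}, number_of_unclassified).

    min_count is what the sort reports; max_count assumes every "--"
    result actually belongs to that subject.
    """
    counts = Counter(subjects)
    unknown = counts.pop(UNCLASSIFIED, 0)
    bounds = {s: (n, n + unknown) for s, n in counts.items()}
    return bounds, unknown


def certain_leader(subjects):
    """Return the leading subject only if no assignment of the "--"
    results could let a rival tie or overtake it; otherwise None."""
    bounds, unknown = subject_bounds(subjects)
    if not bounds:
        return None
    ranked = sorted(bounds.items(), key=lambda kv: kv[1][0], reverse=True)
    leader, (leader_min, _) = ranked[0]
    # Worst case for the leader: all "--" results go to its strongest rival,
    # or to a subject that never appears in the list at all.
    rival_max = max((high for _, (_, high) in ranked[1:]), default=unknown)
    return leader if leader_min > rival_max else None


# Hypothetical tallies, not real search output.
few_unknown = ["Moses"] * 12 + ["Aaron"] * 5 + [UNCLASSIFIED] * 4
many_unknown = ["Moses"] * 12 + ["Aaron"] * 5 + [UNCLASSIFIED] * 9

print(subject_bounds(few_unknown)[0])
print(certain_leader(few_unknown))
print(certain_leader(many_unknown))
```

With few "--" results the leading subject is safe whatever they turn out to be; with many, the ranking cannot be trusted until the "--" rows are checked one by one. It does nothing to explain why the rows are unclassified, only how much they can distort the totals.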
So, what is one to do with analysis and sorting and "--" results?