L/V 10+ Tip of the Day #302 What is a word? no, seriously

MJ. Smith
edited November 21 in English Forum

Another tip of the day (TOTD) series for Logos/Verbum 10. They will be short and often drawn from forum posts. Feel free to ask questions and/or suggest forum posts you'd like to see included. Adding comments about the behavior on mobile and web apps would be appreciated by your fellow forumites. A search for "L/V 10+ Tip of the Day site:community.logos.com" on Google should bring the tips up as should this Reading List within the application.

This tip is inspired by the forum post: L/V 10+ Tip of the Day #301 Unit of meaning ... no "I" is not added to the text - Logos Forums

In public Canvas documents, Phil Gons shared an excellent diagram.

For most purposes this is sufficient. We think of a word as a collection of letters with a space at either end. Okay, we know the collection of letters is not random and that punctuation can replace the space, but you trust me if I say "pyx" that it is a word even if you haven't a clue as to what it means. And if I say "bucket list" you know to treat it as a single word. However, computer science switches to "token" and linguistics to "lexical unit" because there are several problem cases where a word defined as the letters between spaces just doesn't work. Think of lexical units as being what comprehensive lexicons try to provide. Wikipedia's list of them is:

Lexical item - Wikipedia

Common types of lexical items/chunks include:

  1. Words, e.g. cat, tree
  2. Parts of words, e.g. -s in trees, -er in worker, non- in nondescript, -est in loudest
  3. Phrasal verbs, e.g. put off or get out
  4. Multiword expressions, e.g. by the way, inside out
  5. Collocations, e.g. motor vehicle, absolutely convinced
  6. Institutionalized utterances, e.g. I'll get it, We'll see, That'll do, If I were you, Would you like a cup of coffee?
  7. Idioms, e.g. break a leg, was one whale of a, a bitter pill to swallow
  8. Sayings, e.g. The early bird gets the worm, The devil is in the details
  9. Sentence frames and heads, e.g. That is not as...as you think, The problem was
  10. Text frames, e.g. In this paper we explore...; First...; Second...; Lastly....

An associated concept is that of noun-modifier semantic relations, wherein certain word pairings have a standard interpretation. For example, the phrase "cold virus" is generally understood to refer to the virus that causes a cold, rather than to a virus that is cold.
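To make the gap between "letters between spaces" and a lexical unit concrete, here is a minimal Python sketch. It is only an illustration, not how Logos/Verbum or any real lexicon tags text: the tiny `MULTIWORD_UNITS` lexicon, the function names, and the sample sentence are all made up for the example, and the grouping is a simple greedy longest-match.

```python
import re

# Hypothetical mini-lexicon of multiword lexical units (illustrative only).
MULTIWORD_UNITS = {
    ("bucket", "list"),
    ("by", "the", "way"),
    ("put", "off"),
    ("break", "a", "leg"),
}
MAX_UNIT_LEN = max(len(unit) for unit in MULTIWORD_UNITS)


def space_words(text):
    """Naive 'word': a run of letters/apostrophes bounded by spaces or punctuation."""
    return re.findall(r"[A-Za-z']+", text.lower())


def lexical_units(text):
    """Group naive words into lexical units by greedy longest match."""
    words = space_words(text)
    units = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, falling back to a single word.
        for n in range(min(MAX_UNIT_LEN, len(words) - i), 0, -1):
            candidate = tuple(words[i:i + n])
            if n == 1 or candidate in MULTIWORD_UNITS:
                units.append(" ".join(candidate))
                i += n
                break
    return units


sentence = "By the way, I put off my bucket list."
print(space_words(sentence))    # nine space-delimited words
print(lexical_units(sentence))  # five lexical units
```

The same nine "words" collapse into five units once "by the way", "put off" and "bucket list" are treated as single items. Parts of words such as -s or non- would need the opposite operation, splitting a word rather than joining several, which this sketch does not attempt.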

I would personally like to see more coding identifying these partial/multiword lexical units but that is because I don't know the original languages well enough to recognize them.

