Off topic: machine translation

Many Logos users are interested in translation, so this article may be of interest:
How Google Converted Language Translation Into a Problem of Vector Space Mathematics
Sample:
Now Tomas Mikolov and a couple of pals at Google in Mountain View have developed a technique that automatically generates dictionaries and phrase tables that convert one language into another.
The new technique does not rely on versions of the same document in different languages. Instead, it uses data mining techniques to model the structure of a single language and then compares this to the structure of another language.
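The excerpt does not spell out how the two language structures are compared. One common reading is that each language gets its own word-vector space and a linear map is fitted between them. The following is only a minimal sketch of that idea, not the article's implementation. The 2-D vectors are made up for illustration, the three-word seed dictionary is an assumption, and the names `fit_linear_map`, `apply_map`, `cosine` and `translate` are hypothetical.

```python
import math

# Toy 2-D "word vectors". All numbers are invented for illustration;
# real vectors have hundreds of dimensions and are learned from text.
english = {"one": (1.0, 0.0), "two": (0.8, 0.6),
           "three": (0.0, 1.0), "four": (-0.6, 0.8)}
# The Spanish space is the English one rotated by 90 degrees, mimicking
# the claim that the two languages share a similar geometric structure.
spanish = {"uno": (0.0, 1.0), "dos": (-0.6, 0.8),
           "tres": (-1.0, 0.0), "cuatro": (-0.8, -0.6)}

# Assumed small seed dictionary used to fit the map.
seed = [("one", "uno"), ("two", "dos"), ("three", "tres")]


def fit_linear_map(pairs):
    """Least-squares 2x2 matrix W minimising sum ||W x - z||^2."""
    a11 = a12 = a22 = 0.0          # A = sum of x x^T (symmetric)
    b11 = b12 = b21 = b22 = 0.0    # B = sum of z x^T
    for (x1, x2), (z1, z2) in pairs:
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b11 += z1 * x1
        b12 += z1 * x2
        b21 += z2 * x1
        b22 += z2 * x2
    det = a11 * a22 - a12 * a12
    i11, i12, i22 = a22 / det, -a12 / det, a11 / det   # inverse of A
    # W = B * A^-1
    return ((b11 * i11 + b12 * i12, b11 * i12 + b12 * i22),
            (b21 * i11 + b22 * i12, b21 * i12 + b22 * i22))


def apply_map(w, x):
    """Multiply the 2x2 matrix w by the 2-D vector x."""
    return (w[0][0] * x[0] + w[0][1] * x[1],
            w[1][0] * x[0] + w[1][1] * x[1])


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))


W = fit_linear_map([(english[e], spanish[s]) for e, s in seed])


def translate(word):
    """Map an English vector into Spanish space, return the nearest word."""
    projected = apply_map(W, english[word])
    return max(spanish, key=lambda s: cosine(projected, spanish[s]))


print(translate("four"))  # cuatro
```

"four" is not in the seed dictionary, yet it lands on "cuatro" because the toy spaces have the same shape. Where two languages carve up ideas differently, the nearest neighbour found this way can be a poor match.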
Comments
-
Intriguing. The weak point of this approach is that it hinges on the proposition that "every language must describe a similar set of ideas, so the words that do this must also be similar."
That is true up to a point. It's fine for nouns and numbers (the examples used in that article), and for adjectives and verbs. But our language shapes the way we think and the ideas we can construct in our minds. Hebrew, for example, is a very concrete language. People speaking Hebrew think in terms of physical objects and characteristics, not abstract ideas. Yes, they can conceive of abstract ideas such as compassion, but they conceive of them in very physical, concrete ways. The word for compassion is the same word that means "womb", and you can see why the ideas would be connected: a mother has compassion on the child in her womb. BDB says the word compassion was "originally brotherhood, brotherly feeling, of those born from same womb". We do not think of wombs when we say the word "compassion".
Some words and concepts can't even be translated into a single word or concept in another language, because the idea doesn't exist in that language. The only example I can think of off the top of my head is the word for lamb, which doesn't exist in certain languages (there are some parts of the world where sheep are unheard of). So when translating the Bible, where it talks about Jesus as the "Lamb of God", translators have had to resort to cultural equivalents (some other sacrificial animal, such as an ox). But that isn't a one-to-one correspondence, because the whole world of sheep, shearing included, is built into the word "lamb" when we hear it.
-
On Rosie's note, there are some languages in this world in which there are no extant linguistic representations of "love" or "thank you."
Machine translation keeps getting better and better. But does it replace humans? So far, no. And probably, never.
Attempts to rely heavily on machine translations have so far produced rather inadequate texts...
-
On the other hand, just a couple of hours ago I directed someone to the machine-translated version of a German Wikipedia article about Hermann Strathmann. So yes, there is a place for machine translation. But it will never suffice completely.
-
If you're genuinely interested in vector space models of semantics and how they tackle the difficulties raised above, see the overview article at http://www.jair.org/media/2934/live-2934-4846-jair.pdf which presents several paths of research currently supported by open source software.
Orthodox Bishop Alfeyev: "To be a theologian means to have experience of a personal encounter with God through prayer and worship."; Orthodox proverb: "We know where the Church is, we do not know where it is not."