Saturday, December 20, 2008

Textbook - Section 4


As we continue my flying tour of language and computers, we move from talking to other people in the same language through computers to talking to people in different languages on computers, i.e., we are moving into machine translation. This is going to start getting increasingly tech-y, but it never gets hardcore at all. Anyway, here's the next bit. Unfortunately, there's a lot of formatting that's getting lost here. I've tried to put most of it back.

My Trusty Speech-o-Matic!

We’ve talked so far about how the language we use online can be a window into our own subconscious knowledge of our language and how the communicative abilities of the Internet have increased language opportunities. One of the greatest dreams for computers is that they would somehow allow us to communicate across languages. The dream version might be the supposed Universal Translator as seen in the Star Trek world, or its companion in a million other movies. Anyone in the universe walks up, speaks through whatever means their alien bodies allow, and magically the computer translator pops everything out as Mainstream American English. Brilliant! Could it ever be possible? How would it work?

Perhaps the best way to start looking at Machine Translation is by looking at what it is not. Translation is not substituting the words of one language for the words of another language. This may work for some sentences, but it quickly breaks down. Let’s take the English sentence Sylvia goes to the library and attempt to translate it into French. There are parallel words in French for most of this sentence. English to go is similar to French aller; English to is similar to French à; English the is similar to French la; and English library is similar to French bibliothèque. So let’s just substitute one word for another. Sylvia goes to the library becomes Sylvia va à la bibliothèque.
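
To make the word-swapping idea concrete, here is a minimal sketch in Python. The lexicon and the function name substitute_words are invented for illustration, and mapping goes straight to va is a shortcut that skips the real work of choosing a verb form.

```python
# Toy English-to-French lexicon holding only the word pairs from the example.
# Mapping "goes" directly to "va" is a shortcut: the right form of "aller"
# is picked by hand here, not worked out by the program.
LEXICON = {
    "goes": "va",
    "to": "à",
    "the": "la",
    "library": "bibliothèque",
}


def substitute_words(sentence):
    """Swap each word for its lexicon entry; unknown words pass through unchanged."""
    return " ".join(LEXICON.get(word, word) for word in sentence.split())


print(substitute_words("Sylvia goes to the library"))
# prints: Sylvia va à la bibliothèque
```

Feed the same function Sylvia is going to the library and the words is and going come through untranslated, since the lexicon has no entries for them. Adding word-level entries would not rescue it either, because French does not build this sentence out of a form of to be plus a participle.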

That turned out fairly well, but I’ve already fudged things a bit. You might have noticed I said that to go is similar to aller, but then I used va in the sentence. This is because French, like English, has different verb forms based upon tense, aspect, and mood. JAKE NOTE, DO THEY KNOW ALL OF THESE TERMS FROM EARLIER CHAPTERS? There are many parallels between French and English verb forms, but they do not always match. French verbs, for instance, mark whether the subject of the sentence is first person, second person, or third person right on the verb, while English, by and large, does not. Therefore, to translate any verb into French, we need to know what the subject is. Since this is not part of the English verb, we cannot read it from the word itself. Instead, the machine translator must work out what the subject is. This, in turn, requires a grammatical analysis of the sentence.
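
A small sketch of why the subject matters, again in Python. The present-tense forms of aller are standard French; the subject table and the function name french_go are made up for this example, and the sketch simply assumes that someone has already identified the subject, which is exactly the part that needs grammatical analysis.

```python
# Present-tense forms of French "aller" (to go), keyed by (person, number).
ALLER_PRESENT = {
    (1, "sg"): "vais",
    (2, "sg"): "vas",
    (3, "sg"): "va",
    (1, "pl"): "allons",
    (2, "pl"): "allez",
    (3, "pl"): "vont",
}

# Hypothetical lookup table: person and number for a few English subjects.
SUBJECT_FEATURES = {
    "I": (1, "sg"),
    "we": (1, "pl"),
    "they": (3, "pl"),
    "Sylvia": (3, "sg"),
}


def french_go(subject):
    """Pick the form of "aller" that agrees with an already identified subject."""
    return ALLER_PRESENT[SUBJECT_FEATURES[subject]]


# English uses the same word "go" after all three of these subjects,
# but French needs a different form each time.
for subject in ("I", "we", "they"):
    print(subject, "go ->", french_go(subject))
```

The English verb go is identical in I go, we go, and they go, so nothing in the word itself tells the program whether to produce vais, allons, or vont.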

A similar problem occurs with translations between English and German. English generally follows a sentence pattern of subject-verb-object, while German subordinate clauses follow a pattern of subject-object-verb, i.e., the verb goes in the middle of an English clause but at the end of such a German clause. JAKE NOTE, A POINTER SOMEWHERE IN THIS PARAGRAPH TO YOUR TYPOLOGY CHAPTER SOUNDS APPROPRIATE. In such a case, even if there were a perfect one-to-one match between English words and German words, the translator again cannot simply substitute one word for another. It needs to know what the subject is, what the object is, and what the verb is in order to re-arrange things. In short, any non-trivial translation requires that the computer be able to grammatically analyze any sentence it encounters. Even this might seem simpler than it is. Most of the sentences that you have read in this chapter have never been written before, and you, the reader, have never encountered them. (Some of you may wish you never encounter sentences like this again.) These exact sentences will not be in any database of the English language. Instead, both you, the reader, and the computer translator must know English grammar well enough to analyze sentences you have never heard.
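
Once the subject, verb, and object have been identified, the re-arranging itself is the easy part, as this rough Python sketch suggests. The role labels, the example sentence, and the function name reorder are all assumptions for illustration; the German words are given in the verb-final order found in a clause such as dass Sylvia das Buch liest (that Sylvia reads the book).

```python
# Word-order patterns expressed as sequences of grammatical roles.
SVO = ("subject", "verb", "object")
SOV = ("subject", "object", "verb")


def reorder(roles, pattern):
    """Lay out already-labeled sentence parts in the order the pattern asks for."""
    return " ".join(roles[slot] for slot in pattern)


# The labels below were assigned by hand. Producing them automatically is the
# grammatical analysis that word-for-word substitution skips.
english = {"subject": "Sylvia", "verb": "reads", "object": "the book"}
german = {"subject": "Sylvia", "verb": "liest", "object": "das Buch"}

print(reorder(english, SVO))  # prints: Sylvia reads the book
print(reorder(german, SOV))   # prints: Sylvia das Buch liest
```

Everything difficult is hidden in the two hand-labeled dictionaries: a real translator has to work out which words are the subject, verb, and object for sentences it has never seen.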

Guess what? The problem gets worse. Let’s say we want to translate between English and either Japanese or Korean. Both of those languages express what are called honorifics right in their grammar and verb forms. Honorifics are sort of a grammatical form of politeness, and their use depends upon a series of factors such as the status of the person being spoken to, the status of the person speaking, the intimacy of their relationship, and more. English deals with many of these same issues. We do not speak the same way when giving a talk at a funeral as we do while playing Guitar Hero III with a friend. Nor do we talk the same way to our lovers as we do to our company’s CEO -- not if we want to keep our jobs. These social factors are expressed through the word choices we make, the elaborateness of the phrasing, and the assertiveness with which we speak. JAKE NOTE; AGAIN, CAN WE REFERENCE THE SOCIO CHAPTERS FOR THIS? Japanese and Korean use these methods to express social relationships as well, but they additionally code some of this in the exact verb endings used in a sentence. This means that the computer cannot perform a true translation of Sylvia went to the library into Korean or Japanese without some pragmatic knowledge of each society – who is of high status, who is of low, and what their relationships are like. Such knowledge is, of course, nowhere in the sentence itself; it requires knowing the entire social context of the sentence.


writtenwyrdd said...

More fascinating stuff, paca!

fairyhedgehog said...

So you're saying the babel fish doesn't really exist?

pacatrue said...

It's not fair, is it? Bring on the babel fish!