A few weeks ago I introduced the dissertation topic by linking to a clip of Finnish hockey to show the problem that we all have in finding words in a new language. This post talks about some of the solutions to this problem, comparing written and spoken language.
In many written languages, it's obvious where words are. In everything I'm writing here (well, not everything -- is "I'm" one word or two?) we have a word with a nice space on each side to say where it starts and stops, and so, if you did not know a word you could go look it up in a dictionary, which is just a big list of words. This even works in some written languages that we may not know. Here is the sentence, "I will destroy you all with my college game picks" in Italian:
io distruggere voi tutto con mio università gioco selezion
or in Portuguese:
Eu destrui-lo-ei todo com minhas picaretas do jogo da faculdade
At least, babelfish thinks they are the same. So I could go look up the meaning of "picaretas" because it looks to be a word. (Actually, I don't speak any Portuguese, but you can figure out almost all of the Portuguese sentence, can't you? I will put my best English guess for each word underneath (this is called glossing).
Eu destrui-lo-ei todo com minhas picaretas do jogo da faculdade
I destroy-Future-you? all with my picks of game of college
It's also interesting to see the Romance language connections going through in the two languages. Here I've paired words that are obviously similar: Io-Eu; distruggere-destrui; tutto-todo; con-com; mio-minhas; gioco-jogo. Noticing these connections is how people reconstruct languages that are no longer with us, such as the language that Portuguese and Italian both came from.)
However, word boundaries are not always clear even in written language. Chinese for instance uses its famous characters. While each character is an individual word, they are also combined to make new words, and in fact most Chinese words have at least two syllables, meaning two characters when written. So when you look at a string of characters, there's nothing on the page which tells you to take a character by itself as a word, or to combine it with the next one to make a word. An example might help here.
Take the word, Beijing. It's a single word, a name of a city, written with two characters, each of which has a meaning on its own. "Bei" means north and "jing" means capital. If you are reading and get to the character for Bei and look it up, you will find a meaning of north. But that's not really the word in this case. You really need to be looking up the whole thing "Beijing". (FYI, Nanjing is Southern Capital; and they were very creative with city names in Taiwan. Tai is the place, then the capital is Taipei (Taibei), meaning Tai North, and there's Tainan, meaning Tai South, and even a Taichung (Taizhong), meaning Tai Middle. Guess where on the island each of these cities is located.)
This used to drive me crazy when I got to intermediate Chinese. I spent hours looking up each character and then trying to figure out the meaning of the sentence, but I was usually wrong because I wasn't looking up the meaning of 2 or 3 or 4 characters together. It's kind of like finding the word "boa constrictor" and trying to figure out what a feather boa was doing in the jungle; or not knowing that a blackboard is not the same as any old board painted black.
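Here is a rough sketch of the fix for my intermediate-Chinese problem: at each position, try the longest chunk that is in the dictionary before falling back to a single character. This is the classic "greedy longest match" trick, not a claim about what fluent readers do. The tiny dictionary, the pinyin spelling of each character, and the function name are all made up for illustration.

```python
# Toy dictionary keyed by sequences of characters (written here in pinyin).
# The entries are illustrative assumptions, not a real lexicon.
LEXICON = {
    ("wo",): "I",
    ("zai",): "in, at",
    ("bei",): "north",
    ("jing",): "capital",
    ("nan",): "south",
    ("bei", "jing"): "Beijing",
    ("nan", "jing"): "Nanjing",
}


def segment_longest_match(syllables, lexicon, max_len=4):
    """Greedy left-to-right segmentation: prefer the longest known chunk."""
    words = []
    i = 0
    while i < len(syllables):
        # Try 4 characters together, then 3, then 2, then give up and take 1.
        for size in range(min(max_len, len(syllables) - i), 0, -1):
            chunk = tuple(syllables[i:i + size])
            if chunk in lexicon or size == 1:
                words.append(chunk)
                i += size
                break
    return words


sentence = ["wo", "zai", "bei", "jing"]  # "I am in Beijing"
print(segment_longest_match(sentence, LEXICON))
# [('wo',), ('zai',), ('bei', 'jing')]
```

Looking things up one character at a time would have given "north" and "capital" separately, which is exactly the mistake I kept making.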
This doesn't appear to be a real problem for people fluent in Chinese, because we don't read in our native languages one word at a time. Instead we take in several words at once, so we can weigh all the possibilities together in our heads. Many things just don't make any sense in the context, so the mind discards them without us even being conscious of it.
In short, the word segmentation problem exists even in written language; however, it is ubiquitous in spoken language, in which obvious pauses on each side of a word are a rare, rare exception. In written language, we might solve the problem by looking at a bunch of text at once, but this won't work in spoken language. In the written word, everything sits there for you without changing and your eye jumps from group to group. But speech happens in real time with sound after sound coming at you and then disappearing forever. Think of a word like "disappearing". The whole word never hits your ear at once. There's a little burst of sound for a "d", then a loud "i" sound, then an "s", and so on. There are at least 7 different sounds (and really more) in the word. The listener must capture each of these sounds in memory and assemble the words from these lingering memory traces.
Now, it turns out there are several cues to help you find words in speech. I talked in an earlier post about one, which is stress. In English, most, but not all, words are stressed on the first syllable. So if you can hear stress, then you can guess that the stress is the start of the word and be right most of the time. (But, again, not all of the time; click on my earlier post to see the trouble this causes for B.)
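To make the stress idea concrete, here is a small sketch that guesses a word boundary in front of every stressed syllable. The input format is my own assumption for illustration: the speech is already chopped into syllables, and stressed syllables are written in capitals.

```python
def segment_by_stress(syllables):
    """Start a new word at every stressed (ALL CAPS) syllable."""
    words = []
    for syl in syllables:
        if syl.isupper() or not words:
            words.append([syl])      # stressed syllable: guess a word start
        else:
            words[-1].append(syl)    # unstressed: glue onto the current word
    return ["".join(word).lower() for word in words]


# Works when words are stressed on the first syllable...
print(segment_by_stress(["HAP", "py", "KIT", "ty"]))   # ['happy', 'kitty']
# ...and goes wrong when they are not ("guitar" is stressed on "TAR").
print(segment_by_stress(["PLAY", "gui", "TAR"]))       # ['playgui', 'tar']
```

The second example is the "not all of the time" part: the guess chops "guitar" in half and glues its first syllable onto the previous word.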
Another cue is what's called phonotactics. Phonotactics are the rules each language has for what sounds can go together. For instance, in Hawaiian, all syllables must either be a vowel by itself or a single consonant followed by a vowel (V or CV in structure). The pull of phonotactics is very strong. You may have heard that Merry Christmas in Hawaiian is Mele Kalikimaka. Why? First off for Merry, there's no "r" sound in Hawaiian, so you put in an "l" instead. Mele. (Just to be safe; that's two syllables, me.le; not a word that rhymes with "meal".) Now, look at the word Christmas. I'll put it in phonetics with a period for each syllable. "krIst.mas", roughly. Compare it to other Hawaiian words that follow the normal V or CV pattern I mentioned -- kahuna, kalo, honolulu, likelike, kamehameha, humuhumunukunukuapua'a. The last is the Hawaii state fish and with periods for syllable breaks is hu.mu.hu.mu.nu.ku.nu.ku.a.pu.a.'a. Look how every syllable is just a V or a CV; no mass of consonants in a row like in [krist.mas]. So for a Hawaiian speaker to say Christmas, she must convert it to the right syllable structure, which means either deleting some of those consonants that are all bunched together, or inserting vowels to give each consonant a syllable. In this case, we insert vowels (and lose the "t", which most English speakers don't pronounce in "Christmas" anyway) to give [karisimasa]. This might sound a bit like Japanese accented English, because Japanese syllable structure is kind of like Hawaiian. Now, Hawaiian again doesn't have the "r" sound, so sub in "l" like in mele, and it also doesn't have "s", so sub in "k" and you get [kalikimaka]. Mele Kalikimaka.
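If you want to check the V-or-CV rule for yourself, here is a minimal sketch. It works on ordinary spelling, treats the apostrophe as a consonant (the glottal stop), and ignores long vowels and diphthongs, so take it as a simplified illustration and not a real description of Hawaiian. The function name is my own.

```python
import re

# One syllable = an optional Hawaiian consonant followed by one vowel.
SYLLABLE = re.compile(r"[hklmnpw']?[aeiou]")
WHOLE_WORD = re.compile(r"(?:[hklmnpw']?[aeiou])+")


def syllabify(word):
    """Return the V/CV syllables of a word, or None if it breaks the rule."""
    word = word.lower()
    if not WHOLE_WORD.fullmatch(word):
        return None
    return SYLLABLE.findall(word)


print(".".join(syllabify("humuhumunukunukuapua'a")))
# hu.mu.hu.mu.nu.ku.nu.ku.a.pu.a.'a
print(syllabify("Kalikimaka"))   # ['ka', 'li', 'ki', 'ma', 'ka']
print(syllabify("kristmas"))     # None: wrong sounds and bunched consonants
```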
OK, that was a bit of an aside, but, besides being interesting, it is just supposed to illustrate how all languages have rules about what sounds can go together in a word or syllable. In English, for instance, "p" can only be followed IN THE SAME SYLLABLE by a vowel, an "l" or an "r". Pain, preen, plan, etc. But never another consonant like "s" or "t" or "k". A word like "pkan" doesn't sound like English, while "pron" isn't a word as far as I know, but it could be. So, if we encounter such sounds, we usually just drop the "p". Psychology, pneumonia, pterodactyl. Sometimes when we encounter an illegal consonant cluster, we go the Hawaiian route and add in a vowel to make a syllable. The Vietnamese name "nguyen" is often pronounced "nagyun" by English speakers, because we can't do an "n" followed by a "g" IN THE SAME SYLLABLE in English.
I keep saying "IN THE SAME SYLLABLE" in this post, because we can certainly have a "p" followed by a "t" when they are in different syllables or in different words. "Copter" [kap.ter] is easy as pie to say. All of these phonotactic rules can be cues to word boundaries. English has a couple dead giveaways. One is "ng" as in "sing" "wing" and "bling". That "ng" sound is always found only at the end of words (but sometimes with an ending like "singing" or "singer"), so, if you hear that "ng" sound, you know it is an end. English speakers are so dedicated to the idea that "ng" only goes at the ends of things that we have a hard time saying the sound at all when it's not at the end. Going back to the name "Nguyen", that's actually not an [n] followed by a [g], but is just the regular [ng] sound, except that Vietnamese allows that sound to occur at the beginnings of words. So even though it is the exact same sound as in English, boy, is it a beast to say in the wrong place.
Another giveaway is the "h" sound. It only ever shows up at the start of a syllable, never at the end. Heart, happy, hello, but never "buh" or "youpah" with the "h" actually pronounced.
In sum, phonotactics help in finding word boundaries.
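As a toy version of these English cues, the sketch below walks through a list of sounds and marks the spots where a boundary is likely. It only knows the three rules from this post ("p" plus an illegal consonant, "ng" at an end, "h" at a start), the sound symbols are rough spellings I picked for illustration, and a hint may be a syllable break or a word break.

```python
# Sounds that may follow "p" inside one syllable, per the rule in this post.
LEGAL_AFTER_P = set("aeiou") | {"l", "r"}


def boundary_hints(sounds):
    """Return the positions where a boundary probably falls between sounds."""
    hints = []
    for i in range(len(sounds) - 1):
        left, right = sounds[i], sounds[i + 1]
        if left == "ng":
            hints.append(i + 1)          # "ng" closes a word or stem
        elif right == "h":
            hints.append(i + 1)          # "h" opens a syllable
        elif left == "p" and right not in LEGAL_AFTER_P:
            hints.append(i + 1)          # "pt", "pk", ... must be split
    return hints


print(boundary_hints(["k", "a", "p", "t", "e", "r"]))        # [3]  cop|ter
print(boundary_hints(["s", "i", "ng", "h", "a", "p", "i"]))  # [3]  sing|happy
```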
The final cue I will discuss tonight is hearing words by themselves. B's first word was "kitty" though pronounced "giggy". Maybe what happens is we sometimes hear "kitty" by itself and so we can pull it out of the speech stream when we later hear "look, there's a kitty", which remember will not have any pauses between the words. This probably does happen, but it doesn't happen as much as we might think. If you look at transcripts of talking to children and adults, words are rarely used all by themselves to make an entire sentence. Instead we say things like: there's the kitty; isn't the kitty soft?; look, the kitty!; there goes kitty!
We can't shut up and just say "kitty". It's almost always with a bunch of stuff tagged on.
My guess is that this already-knowing-words is important a bit later in learning a language. You use whatever other cues you have to pull out "kitty" and that makes other segmentation easier.
Before I close this down, it's worth looking at my kitty sentences again. One interesting thing about them is that only one important word seems to recur, namely "kitty". The sentences start sounding a little bit like 'the word "kitty" being surrounded by other stuff that you can just ignore'. Only those sounds [kIdi] really repeat together over and over. Everything else changes. "Kitty" then is frequent, while everything else is not frequent and harder to predict.
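You can see the "only kitty repeats" effect with a few lines of code. The sketch below squashes each kitty sentence into one unbroken stream (no spaces, no punctuation, a crude stand-in for speech) and looks for the longest stretch shared by all of them. It uses spelling instead of sounds, and brute-force search is just the simplest thing that works here; it is an illustration of the idea, not the method from my dissertation.

```python
def squash(utterance):
    """Drop spaces and punctuation to mimic an unbroken speech stream."""
    return "".join(ch for ch in utterance.lower() if ch.isalpha())


def longest_shared_chunk(utterances):
    """Longest stretch of letters that occurs in every utterance."""
    shortest = min(utterances, key=len)
    for size in range(len(shortest), 0, -1):           # longest first
        for start in range(len(shortest) - size + 1):
            chunk = shortest[start:start + size]
            if all(chunk in u for u in utterances):
                return chunk
    return ""


sentences = [
    "there's the kitty",
    "isn't the kitty soft?",
    "look, the kitty!",
    "there goes kitty!",
]
streams = [squash(s) for s in sentences]
print(streams[0])                      # theresthekitty
print(longest_shared_chunk(streams))   # kitty
```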
This is the math angle of finding words and is where my dissertation topic starts, but that's for next time.
Do I get a trick or a treat for a blog post of this size?