
Thoughts on Teaching Listening


By David Barker
Author and Publisher of Materials for
Japanese Learners of English

PART 1

I can’t remember who said it (I have a feeling it may have been Penny Ur), but I remember hearing a quote about teaching listening once that really made me stop and think:

We don’t really teach listening; we just keep testing it.

Whoever it was, I think he or she had a very valid point. Our standard methodology for teaching listening is a cycle of giving listening tasks and then asking questions in order to test the learners’ comprehension of what they have heard. In our defence, of course, it is difficult to see how we could do otherwise. Like reading, listening is a receptive skill that can only be developed through repeated practice, so there are good reasons for teaching it the way we do. Anyway, I was recently asked to do a presentation on this topic, and I started thinking about aspects of listening that do actually need to be taught rather than simply practiced. The first thing that came to mind was a list of general principles of which learners often seem to be unaware, and I want to write about the first of those today.

The first point is that we listen with our brains, not our ears.

Many learners think that improving their listening skills means developing their ability to recognize audio signals produced by proficient speakers of the language. In fact, the audio signal that most speakers produce does not contain sufficient information for it to be recognized in isolation. To hear an example of this, try saying the following sentence quickly.

“There was a great movie on TV last night.”

Now say it again at the same speed, but stop at “a.” Repeat the first three words, but keep your pronunciation exactly the way it was when you said the full sentence. Now dictate this segment of language to a proficient speaker of English and ask them to write down what you are saying. The chances are that they will be unable to do so. If you then complete the sentence, however, you will find that the listener is suddenly able to understand what just sounded like random noise a few seconds ago. The person will probably even tell you that they can “hear” the words now. Of course, the listener cannot hear “There was a” because you are not really saying it. The reason they think they can hear it is that after the words enter their ears, their brain takes over to decode the signal in the light of what went before, what came after, and what that person knows about the way English works. In other words, it is the brain that is doing the listening, not the ears.

You can see this process in action for yourself by playing around with an iPhone 4S. Even if you do not have one of these yourself, I am sure you have been bored to death by friends who do have one telling you about it and/or demonstrating its features. Anyway, Siri is a “virtual personal assistant” that, so its developers claim, is capable of recognizing natural speech. The interesting thing about Siri is that it only works when you have a good Internet connection. This is because it has to take the audio signal and process it through powerful servers in order to work out what has actually been said. If listening really were a matter of simply recognizing audio signals, this would not be necessary. The fact that it is necessary shows that it is the computers that are doing the listening, not the microphone.

I was quite skeptical of Siri before I tried it, because the voice recognition software I have used in the past has been worse than useless. Siri, however, is remarkably accurate, and it gets things right a lot more often than it gets them wrong. When it does make mistakes, though, it provides interesting insights into the challenges the human brain faces when it attempts to comprehend spoken language. One of the most difficult things about decoding the audio signal produced by spoken language is working out where the word boundaries lie. This is because different sequences of words can produce an identical audio signal. To give you an example of this, here is a true story about a message I tried to send the other day. On that day, I was feeling particularly pleased with myself because I had managed to get up at 6 a.m., and I decided to send an email to a friend to inform her of this remarkable achievement. Picking up my phone, I dictated to Siri, “I just broke the world record for getting up early.” After taking a moment to absorb the sequence of sounds that came from my mouth, Siri displayed the following message on the screen:

I just broke the world record forgetting up early.

If you think about it, the audio signal produced by my original sentence and the one produced by Siri’s transcription of it would be identical, but no speaker of English would interpret those sounds as Siri did because that sentence simply does not make any sense.

Another issue that both humans and computers have to cope with is the problem of homophones (different words that share the same pronunciation). One way software engineers attempt to deal with this is by taking into account the relative frequencies of words. For example, “rain” is more common than “reign,” so when in doubt, the computer will opt for the more common word. This, however, can lead to mistakes. As a follow-up to my first sentence, I continued my message: “Maybe I should contact the newspapers to tell them about my feat.” Before you read on, can you guess how Siri transcribed this sentence?

Of course, “Maybe I should contact the newspapers to tell them about my feet” is perfectly grammatical, and it would even make sense in some contexts. The problem is that in order to arrive at the correct interpretation, the listener has to hold in his or her mind a continually developing sense of what is being discussed. This sounds simple, and indeed it is—for humans. Unfortunately (or maybe fortunately!), however, it is still beyond the capabilities of even the most powerful modern computers.
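For anyone who likes to see an idea spelled out as code, here is a tiny sketch in Python of what a frequency-only tie-breaker between homophones might look like. It is purely illustrative (it is certainly not how Siri actually works), and the homophone groups, frequency figures, and function names are all invented for the example.

# A toy, frequency-only "transcriber" (illustrative; not Siri's real algorithm).
# The homophone groups and frequency numbers are invented for the example.

WORD_FREQUENCIES = {   # hypothetical counts per million words
    "feet": 150.0,
    "feat": 6.0,
    "rain": 80.0,
    "reign": 9.0,
}

HOMOPHONES = {   # one sound, several possible words
    "/fi:t/": ["feet", "feat"],
    "/rein/": ["rain", "reign"],
}

def transcribe(sound):
    """With no sense of topic or context, the more common word always wins."""
    candidates = HOMOPHONES.get(sound, [])
    if not candidates:
        return "<unknown>"
    return max(candidates, key=lambda word: WORD_FREQUENCIES.get(word, 0.0))

print(transcribe("/fi:t/"))   # feet -- so "my feat" comes out as "my feet"
print(transcribe("/rein/"))   # rain

Feed it “/fi:t/” and it will always choose “feet,” no matter how many world records you have just broken.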

To summarize, it is important for learners to understand that the reason they cannot “hear” clearly what proficient speakers of English are saying is that the speakers are not saying it clearly in the first place. Other proficient speakers can’t “hear” English in that sense either, so the sooner learners give up on that idea, the easier things will become for them. What allows us to decode the signal and understand what is being said is our knowledge of English vocabulary and grammar, and our ability to keep track of topics over the course of a conversation. As I said, we listen with our brains, not our ears.

PART 2

Pronunciation is one element of language courses that often gets overlooked. Part of the reason for this is that experienced teachers know how difficult it is to learn the sounds of a foreign language as an adult, especially if that language is nothing like your own. This basically means we accept that Japanese students will always have a Japanese accent, that Koreans will always have a Korean accent, and so on. Incidentally, I always used to think in terms of learners “gaining” the accent of a foreign language, but I remember hearing a friend talking about a Japanese person he knew who had managed to “lose” her Japanese accent. That is an interesting way of looking at it. I wonder which viewpoint is more common among teachers?

Anyway, as well as acknowledging the difficulty of the task of teaching pronunciation, most teachers also realize that even with a heavy accent, the majority of learners will be able to make themselves understood to proficient speakers of English. The combined effect of these two beliefs is that pronunciation often gets relegated to a once-in-a-while exercise with the sole purpose of providing a bit of variety in the course.

There are at least two problems with this way of thinking. The first is that teachers, particularly those of monolingual classes, are often very poor judges of how comprehensible their students actually are to regular speakers of the language. When I lived in New Zealand, I did the examiner training for IELTS (International English Language Testing System). As part of the workshop, we had to watch videos of candidates speaking and assign grades. What soon became clear was that teachers were giving far higher grades to students of nationalities they were familiar with. For example, two teachers who had worked in Korea gave a Korean student a high grade for her speaking, whereas the teachers who had mainly worked with European learners gave her a low one. Their reasoning was, “We can’t really understand what she is saying.”

The second reason why pronunciation deserves more attention in language courses is that a learner’s knowledge of the sounds of a language will directly affect their ability to perceive and recognize those sounds. In other words, having good pronunciation is just as important for listening as it is for speaking. My limited understanding of how recognition systems work is that they compare sensory input with stored representations of a variety of forms. For example, we learn how the word “boy” sounds, and we then create and store a template of it in our brains. When audio signals reach our ears, they are run through the database in order to find matches. The same principle applies to the recognition of words and letters. You recognize “x” as the letter that comes before “y” because the marks on this screen fit the representation of that letter that you already have stored in your brain. Of course, you would probably recognize it if I wrote it as “X” too, and even if I wrote it by hand. The human brain has an incredible tolerance for variation that allows it to recognize shapes in a way that computers cannot. That is the theory behind those weirdly shaped letters you have to input manually on some blogs in order to post a comment. The system works because humans can tolerate greater manipulation of basic forms than computers can.

Even so, there are limits to the tolerance (I am using the word here in its engineering sense) of even the human brain’s recognition systems, and these become stricter when representations of objects or phenomena resemble each other. For example, in many cases, it is impossible for us to distinguish between “1,” “l,” and “I” when written in isolation because they look so similar. When that happens, the knowledge of language and context that I described in my previous entry kicks in and allows us to make inferences that go beyond the information that is being provided by the senses.

When a language student learns a new word, they create a template for it and store that template in their database. It is quite possible that when they reproduce the word from its template, the audio signal that results will be within the limits of tolerance of proficient speakers of the language, so the learner will be able to make themselves understood. A problem arises, however, when the focus switches to listening. Because the template the learner has created does not really match the signal produced by proficient speakers, and because the learner’s recognition system will naturally have a more limited tolerance owing to their lower mastery of the language, there is a very good chance that they will not recognize what they are hearing. It’s a bit like going to meet someone that you have never met at an airport, armed only with a photograph that was taken twenty years ago. If the person doesn’t actually look like the photograph, there is a good chance that they will walk right past you without you recognizing them at all.

Like all language teachers, I constantly struggle to make myself understood to my students. I have often noticed that the reason my students cannot understand what I am saying is that they have learned an incorrect pronunciation of a particular word. The following is a typical example of a conversation in one of my classes:

Me: Can you close the curtain?

Student: ??

Me: The CURTAIN.

Student: Curtain??

Me: (gesturing) The curtain!!

Student: Ah, kah-ten!!

It is almost as if they are correcting my pronunciation to match their internal representation of the word. Every teacher in Japan knows that we can easily make ourselves understood by simply saying a word the way our students say it, and I suspect the same is true of any teacher with experience of teaching a particular language group.
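Thinking back to the template idea from a few paragraphs ago, here is a small sketch in Python of one way to picture what is happening in that exchange. It is purely illustrative: the lists of numbers standing in for acoustic features, the tolerance values, and the function names are all made up. The point is simply that a distorted stored template combined with a narrower tolerance means that no match is found.

# A toy version of "templates plus tolerance" (illustrative only).
# The three-number lists stand in for acoustic features of a word; all values are invented.

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def recognize(signal, templates, tolerance):
    """Return the closest stored word, or None if nothing falls within tolerance."""
    best_word, best_dist = None, float("inf")
    for word, template in templates.items():
        d = distance(signal, template)
        if d < best_dist:
            best_word, best_dist = word, d
    return best_word if best_dist <= tolerance else None

# A proficient listener: accurate templates and a generous tolerance.
native_templates = {"curtain": [1.0, 0.2, 0.8], "cotton": [0.3, 0.9, 0.1]}

# A learner who stored "kah-ten": the template itself is off, and the tolerance is narrower.
learner_templates = {"curtain": [0.6, 0.7, 0.4], "cotton": [0.3, 0.9, 0.1]}

spoken = [0.95, 0.25, 0.75]   # what a proficient speaker actually produces

print(recognize(spoken, native_templates, tolerance=0.5))    # curtain
print(recognize(spoken, learner_templates, tolerance=0.3))   # None -- the word goes unrecognized

In this toy version, the proficient listener matches the spoken word comfortably, while the learner’s system rejects it as noise, which is more or less what happens when I ask about the curtain.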

My point is that learners need to learn words as accurately as possible so that the template they create reflects the audio signal that is produced when proficient speakers of the language pronounce that word. If a learner creates a template that is significantly different, it might be close enough for their recreation of it to be understood by proficient speakers, but it may not be close enough for them to recognize the word when they hear it.

As teachers, I think we need to start realizing that pronunciation is just as much a listening skill as it is a speaking one, and we need to start giving it greater prominence in our courses.

PART 3

Speaking from my own experience, I think a strong argument could be made that, wherever possible, it is better to study the pronunciation of a language before you study the actual language itself. This is because listening to a language when you have no idea of its vocabulary or grammar forces you to rely 100% on your ears, which results in you hearing the language the way it really sounds. If you learn a non-phonetic language like English or Chinese by reading and writing graphic representations of the words, your brain will automatically assign sounds to those characters according to how it thinks they would be pronounced in your first language. I had that experience when trying to read Chinese words written in “pinyin.” I was fortunate in my learning of Japanese that, by listening to Japanese pop songs and learning the words by heart, I was able to learn the sound system before doing any formal study of the language. One great way of helping your students to understand what it means to use only their ears is to play them videos or recordings of songs in a language that none of them is familiar with. Check out this video for a famous example of someone just using their ears to copy the sounds of a foreign language. Isn’t it amazing how much it sounds like English while being completely incomprehensible!

In my last post, I discussed the importance of developing pronunciation skills in order to improve your listening ability, but I did not say exactly what skills I was talking about. That will be the topic of today’s post. There will be nothing new here for experienced teachers, but I hope it will remind people of things that they might have forgotten over the years. For newer teachers, I hope some of the points will give you ideas about how the teaching of pronunciation can be broken down into manageable (i.e., teachable) components.

In his excellent book “Sound Foundations” (essential reading for new teachers), Adrian Underhill breaks the English sound system into three parts:
1. Sounds in isolation
2. Words in isolation
3. Connected speech

Of these, it is probably the first that has traditionally received the most attention in EFL classes, yet many teachers (including me) would argue that it is actually the least important. I do not have the space to go into the mechanics of English phonetics here, but I would like to mention two points that learners may not be aware of.

The first concerns our ability to produce sounds. As far as my limited understanding goes, musical instruments can be divided into two groups: those that can produce a potentially infinite range of sounds, and those that can only produce a specified number. An example of the former group is a violin. Whatever note you play on a violin, it is always theoretically possible to play another that is slightly higher or lower by moving your fingers a tiny amount. An example of the latter group is a piano. If you play the note “C” on a piano, the next note up on the keyboard is a “C#.” It is not possible to play a note that lies between the two.

The human voice is far more versatile than any musical instrument, and when we are born, we are like violins in that we have the potential to recognize and produce any sound of any language in the world. As we grow up and master our first language, however, we become “pianos,” only able to make and recognize the sounds that our language requires us to distinguish. This is not a limitation of our brains; it is one of their strengths. Knowing which sounds are used to distinguish meanings in our own language allows our brains to have a far wider tolerance for variation, which is a key element in our ability to decode spoken language. To return to the musical instrument metaphor, it allows us to hear which note is being played on any kind of piano, even ones that are not quite in tune. This is one of the biggest challenges faced by learners of English, who not only have to learn sounds that may not exist in their own language, but who then also have to learn to tolerate the variations of those sounds that occur in different accents and dialects.

The second point is that it is extremely difficult to produce sounds that you cannot hear. When I first studied Chinese, I remember being drilled extensively in the four tonal variations that are used to carry meaning in that language. My problem was not so much that I couldn’t produce the tones, but rather that they all sounded the same to me when the teacher pronounced them. It may help learners to know that there is no way they will be able to produce sounds that they cannot hear, and that the pronunciation and recognition of sounds will be an ongoing project for them as they continue with their studies.

Underhill’s second level of “words in isolation” is one of the key areas where I believe teachers and learners need to focus their attention in the classroom. Although it is tempting to analyse the individual sounds that make up a word, it is far more beneficial from both a “speaking” and a “listening” point of view to focus on its syllable pattern – i.e., how many syllables it has, and which one(s) are accented. I was taught to use circles to represent syllables when introducing new words, with a big circle used to show the location of accents. For example, “computer” would be “oOo,” and “America” would be “oOoo.” I also use underlining to show secondary accents, so “information” would be “ooOo.” I have found that it is very useful for students to practice saying the pattern of a new word using “da-da-da” before they try to pronounce the word properly. Once the correct pattern has been established, it becomes much easier to say the word with good pronunciation. (Using this system, “information” would be “DA-da-DA-da.”) By learning how many syllables a word has and which of these is/are accented, students will be able to store an accurate representation of a word’s “silhouette” in their brain. This, more than anything, will enable them to recognize it when they hear it in spoken language. (If you think about it, this is also how we recognize people: we tend to look at the overall size and shape of prominent features rather than at the details of how those parts are made up.)
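As a purely illustrative aside, the circle notation maps very neatly onto the “da-da-da” chant. The little Python sketch below writes that mapping down, with “u” standing in for an underlined circle (secondary accent), since underlining is hard to type; the patterns are just the examples given above.

# The circle notation as a tiny script (illustrative). "o" = weak syllable,
# "O" = primary accent, and "u" stands in here for an underlined circle (secondary accent).

def chant(pattern):
    """Turn a syllable pattern like 'oOo' into the 'da-DA-da' chant students practice."""
    sounds = {"o": "da", "O": "DA", "u": "DA"}   # secondary accents are also chanted as DA
    return "-".join(sounds[c] for c in pattern)

print(chant("oOo"))    # computer    -> da-DA-da
print(chant("oOoo"))   # America     -> da-DA-da-da
print(chant("uoOo"))   # information -> DA-da-DA-da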

Underhill’s third level is “connected speech.” From the point of view of learners, particularly speakers of Asian languages, “word linking” is a vital concept that needs to be explicitly dealt with in the classroom. When learners complain of spoken English being “too fast,” the problem is often not one of speed at all, but rather an issue of words being pronounced together differently from the way they are pronounced in isolation. One common example of this is the shifting of final consonant sounds to the start of following words that begin with a vowel. (Betty Azar’s co-author, Stacey Hagen, has done a video explaining how this works.) I used to use John Lennon’s song “Imagine” for song dictations in some of my more advanced classes, and I found that someone would always ask me “What does ‘sonly’ mean?” I didn’t understand the question until we listened together and the students told me to stop the music as Lennon sang “above us only sky.” A particularly common example of this type of word linking occurs with “an” in phrases like “an umbrella” or “an orange.” Actually, I remember hearing somewhere that the only reason the word “an” exists is that it allows “a” to “lend” a consonant to the following word so that the phrase becomes easier to say.
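For those who enjoy tinkering, here is a very crude sketch in Python of that consonant-shifting idea. Real linking operates on sounds rather than spellings, so this letter-based version (and the function name) is only a rough illustration, but it does manage to reproduce the mysterious “sonly.”

# A crude, letter-based sketch of consonant-to-vowel linking (real linking happens
# with sounds, not spellings, so treat this as a rough illustration only).

VOWELS = "aeiou"

def link(phrase):
    """Shift a word's final consonant letter onto a following word that starts with a vowel."""
    words = phrase.lower().split()
    linked = []
    for i in range(len(words)):
        current = words[i]
        if (i + 1 < len(words)
                and current[-1] not in VOWELS
                and words[i + 1][0] in VOWELS):
            linked.append(current[:-1])
            words[i + 1] = current[-1] + words[i + 1]
        else:
            linked.append(current)
    return " ".join(linked)

print(link("above us only sky"))   # above u sonly sky -- the mysterious "sonly"
print(link("an umbrella"))         # a numbrella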

This is a very brief summary of an extremely complicated area, but once again, I would like to stress its importance not so much in speaking classes, but rather in the teaching of listening. There are many great books and online resources available for those who want to do some further study. Stacey Hagen’s series (link above) is a great place to start, and you can also find some useful basic guidelines on word linking by searching for “rules for connected speech” on the Internet. As you study how word-linking works, however, remember to keep your “listening teacher’s hat” on and think about how it will help your students to develop their listening ability as well as their speaking skills.

