Experiment assessment at the 20% mark: Accelerating comprehension? – Week 51

I have now watched 240 hours of Mandarin-language movies and TV shows, or 20% of the total time for my experiment. Nearly a year has gone by since I began this adventure on January 17, 2014.

The sounds of a language that was once utterly foreign to me have now become familiar, though not quite intelligible. As I reported at the 10% mark, I continue to make steady progress in my deciphering and comprehension. I now occasionally understand complete phrases, and in most sentences I can pick up at least one word.

My incipient comprehension is starting to become useful. When watching a regular movie or show without subtitles, the words and phrases I understand enhance my understanding of the plot, even if marginally.

At this 240-hour mark, I tested my listening comprehension using a new episode of the same Chinese soap opera I have used for this purpose in the past—A Tale of 2 Cities[1]. I think it is a good test because I never watch this particular show or even this genre—so the results are not influenced by previous familiarity with the content or specific voices and manners of speaking. At the same time, the dialogue seems to be in standard Mandarin[2] and is not technical, but rather about daily life. Thus, the results should be representative.

This time, I devised a simple system to measure more accurately and objectively the percentage of word occurrences I was understanding. As I watched, for the first time, 15 minutes of the episode, I jotted down the words I believed I understood. I then watched the entire 15 minutes again, one section at a time, verifying as best as I could which words I got right (discarding the ones I was unsure of) and estimating the total number of words in each section. Thus, within a couple of percentage points, I can confidently affirm that I now understand 8% of words in a routine standard Mandarin conversation, including repeats, inasmuch as this soap opera is a representative sample.

The following graph shows how my estimated comprehension has evolved over time (blue line), alongside the time I have put in (red line).


If the rate of learning as measured for the first 240 hours were to continue indefinitely, I would understand 40% of the words (including repeats) by the end of my experiment, and it would take 3,000 hours to reach 100% listening comprehension. Of course, that extrapolation is tenuous at best. The main reason the rate of learning would decline is diminishing returns, more specifically the diminishing word frequency of new words.[3]
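For readers who want to check the arithmetic, here is a minimal sketch of that straight-line extrapolation. The function and constant names are illustrative only, and the constant rate is exactly the assumption described above as tenuous.

```python
# Straight-line extrapolation of listening comprehension.
# Assumption: the rate observed over the first 240 hours never changes.
HOURS_WATCHED = 240        # hours viewed so far
COMPREHENSION_PCT = 8.0    # estimated % of word occurrences understood
TOTAL_HOURS = 1200         # planned length of the experiment


def projected_comprehension(hours: float) -> float:
    """Projected % of word occurrences understood after `hours`, capped at 100."""
    return min(100.0, COMPREHENSION_PCT * hours / HOURS_WATCHED)


def hours_to_reach(target_pct: float) -> float:
    """Viewing hours needed to reach `target_pct` at the same constant rate."""
    return target_pct * HOURS_WATCHED / COMPREHENSION_PCT


print(projected_comprehension(TOTAL_HOURS))  # 40.0
print(hours_to_reach(100))                   # 3000.0
```

The same helper gives 2,700 hours for a 90% target, which is the figure used further down for the second hypothesis.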

On the other hand, the rate of learning might also accelerate because of the nature of the language acquisition process. I am listening to a large amount of audio content that I do not understand, but it nonetheless is entering my brain, which is evolutionarily designed to recognize patterns and create neural synapses to process the sounds efficiently. I am convinced that this cognitive development occurs far beyond what I can consciously and self-referentially perceive at any given time in terms of comprehension of actual words. As my brain silently labors, its Mandarin repository and processing ability gradually increase before finally manifesting as actual conscious comprehension of words and phrases.

Furthermore, like pieces in a 10,000-piece puzzle, the more words I learn (especially the “corner pieces” of key pronouns, verbs, conjunctions, and so forth), the more the general panorama comes into view. As this happens, deciphering new words in context becomes easier.

Although my self-assessments are rough estimates—especially the previous, less meticulous ones—my progress would seem to indicate that thus far, the latter beneficial phenomena have outweighed the diminishing word frequency factor. After the first 120 hours, I estimated I was understanding 2.75% of word occurrences, while after another 120 hours, I now estimate I understand 8% of them.
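A small sketch makes the comparison between the two 120-hour blocks explicit. The starting point of 0% at hour 0 is an assumption added here for illustration; the other two checkpoints are the self-assessments quoted above.

```python
# Percentage-point gain in comprehension over each 120-hour block.
# (hours watched, estimated % of word occurrences understood)
# The (0, 0.0) starting point is an assumption, not a measured value.
checkpoints = [(0, 0.0), (120, 2.75), (240, 8.0)]


def gains_per_block(points):
    """Return the gain in percentage points between consecutive checkpoints."""
    return [later[1] - earlier[1] for earlier, later in zip(points, points[1:])]


gains = gains_per_block(checkpoints)
print(gains)  # [2.75, 5.25]
```

On these rough numbers the second block added nearly twice as many percentage points as the first, which is what suggests acceleration rather than diminishing returns so far.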

For the sake of conjecture, and despite the tenuous nature of any extrapolation, let us assume that I did continue my rate of an 8% increase in word occurrence comprehension for every 240 hours of listening. What would that spell for my hypotheses?

The first and main hypothesis is that I can learn to understand Mandarin just by watching authentic videos. Obviously, that hypothesis would be proven correct, since eventually I would get to 100% comprehension. Though any conclusive affirmations would be premature at this point, that conjecture is logical and consistent with my experience thus far. If I was able to get past the initial hurdle of deciphering and consolidating comprehension of a few dozen words in Mandarin[4], it seems self-evident that I will continue to make progress and eventually understand the language.

Skipping ahead, the third hypothesis is that after watching 1,200 hours of authentic Mandarin videos, I will have attained sufficient comprehension to tackle a new video, and on first viewing, understand the general plot or the topics that are being discussed. According to my extrapolation, after 1,200 hours I would understand 40% of word occurrences. I am unsure whether that would be enough to attain the aforementioned intermediate level of comprehension, but I do not believe it would be. I think to really understand the general plot and topics of any new video, one would need to understand closer to 60% of word occurrences.

This projection coincides with my subjective expectation based on how the experiment is going thus far. I think it is quite possible that my rate of acquisition will accelerate and, as a result, the percentage of word occurrences I understand will increase more quickly and reach 60%. On the other hand, I would not be surprised if that does not happen, and five or six years from now, at the end of my experiment, I am in fact at 40% comprehension, thus refuting the third hypothesis.

The second hypothesis is that this method is actually efficient and effective as compared to traditional, old school methods that are heavy on formal study, grammar rules, translations, and memorization. This hypothesis will be the most complex and controversial to assess.

A presumably very efficient method requires at least 2,200 hours to achieve a “professional working proficiency” in Mandarin, comprising listening, speaking, reading, and writing. I would guess that an inefficient traditional method might take twice that amount of time.

Further, I estimate that one needs to understand about 90% of word occurrences in speech between natives, as in a soap opera, to attain that level of proficiency[5]. At my current rate, extrapolated, that would take me 2,700 hours of viewing. It might then take me another 1,350 to achieve an equivalent level of speaking proficiency[6], bringing the total to 4,050 hours. That does not include learning Chinese characters and being able to read and write. If these estimates and my extrapolation prove accurate, it seems my method would be about as inefficient as traditional (old school) academic methods, and my second hypothesis would be refuted as well.

. . .

More importantly, though, I am having a lot of fun. As I’ve discovered during my current vacation period, watching Chinese movies and Boonie Bears cartoons is a great way to avoid dealing with more urgent, practical matters. I watched 48 hours of Mandarin between December 11 and January 6, but did not even touch the piles of unfiled papers in my closet!

Many of the Chinese movies I have watched enriched my life culturally, aesthetically, and philosophically.

The Boonie Bears have been a great bonding experience with my daughter and even with my wife on a few late nights when no one was sleepy! While watching the sadistic bears and their logger nemesis in action is not any more culturally or morally edifying than Bugs Bunny or Tom and Jerry, the great thing is that you can enjoy the plot and the antics without subtitles.

That is important, because in the past 40 hours, I have deliberately reduced my use of subtitles from a previous 70% of viewing to a current 60%. I will continue to reduce their use until most, and then all, of my viewing is without this crutch.

Of course, the most useful show I have found is Qiao Hu. It has no subtitles, I understand half of the dialogue, and I can easily pick up several new words in each episode. And it is really enjoyable, for a two-year-old! Needless to say, I watch much less Qiao Hu than I “should” to avoid giving up on my experiment due to boredom.

I really look forward to being able to understand and enjoy movies without subtitles. While I probably will not get to that point anytime soon for first viewings, I expect that sometime this year or next it will become feasible to enjoy my favorite movies without subtitles, when watching them for the fourth or fifth time.

Since last July, my daughter has not watched enough Mandarin to make notable progress. Alas, I do not think she will learn in this way. Nevertheless, I believe the exposure she has had to this difficult and important language, and to Chinese culture through film, is enriching. If she decides to learn Mandarin when she is a little older, she will have a leg up because of this early exposure.

For me, the Mandarin wilderness trek continues with enthusiasm unabated.


[1] http://en.wikipedia.org/wiki/A_Tale_of_2_Cities

[2] OK, I just looked this up and apparently it is in standard Singaporean Mandarin (oh man oh man), but that seems to be close enough to Standard Chinese in China. (http://en.wikipedia.org/wiki/Standard_Singaporean_Mandarin)

[3] If I understood every single occurrence of just 5 or 10 Mandarin words, my percentage would be much higher than my current result. However, that is not trivial, because the trick is being able to decipher those words in the context of sentences spoken quickly by native speakers.

To illustrate the importance of word frequency, a word corpus taken from English language movie and television transcripts reveals that just 10 words account for 21.8% of word occurrences (http://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/TV/2006/1-1000).

During my 15-minute test, I understood 29 unique words, for a total of 76 word occurrences out of an approximate 943 total words spoken.
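As a quick check of the headline figure, the counts above can be turned into percentages as follows. The variable names are illustrative, and the total of 943 is itself only an estimate.

```python
# Comprehension figures from the 15-minute test.
unique_words_understood = 29   # distinct words recognized
occurrences_understood = 76    # recognized word occurrences, including repeats
total_occurrences = 943        # approximate number of words spoken

# Share of all word occurrences that were understood.
share_pct = occurrences_understood / total_occurrences * 100
# Average number of times each understood word came up.
avg_repeats = occurrences_understood / unique_words_understood

print(f"{share_pct:.1f}% of word occurrences understood")     # 8.1%
print(f"{avg_repeats:.1f} occurrences per understood word")   # 2.6
```

Each understood word came up between two and three times on average, which is why a small vocabulary of frequent words can account for a noticeable share of what is heard.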

[4] My total words deciphered, but not consolidated to the point where I can systematically pick them out in conversation, is in the hundreds.

[5] When natives speak to you as a foreigner, they slow down their speech and restrict their vocabulary a bit, allowing you to understand close to 100% at professional working proficiency.

[6] Assuming that having a very high level of listening comprehension will make learning to speak well much easier.

11 thoughts on “Experiment assessment at the 20% mark: Accelerating comprehension? – Week 51”

  1. Bete says:

    That is a very interesting analysis. I’m impressed. I like the puzzle concept and believe your learning curve will taper off and then, if you keep at it, surge at some point.

  2. “The sounds of a language that was once utterly foreign to me have now become familiar” — that’s a good sign, as was your being able to recognize some overheard conversation as being Mandarin a few months ago (week 27). I’ve spent a lot of time listening to spoken Thai that I can only understand a little (or sometimes nothing) of, but even still you can often catch something of what’s going on, even if it’s just a kind of non/sub-verbal emotional exchange. Anyway, I like how other languages sound.

    Yeah, watching movies (etc) in the target language really can be a good way to put off doing things you really should do but that aren’t quite as…fun. 😀

    • Thanks for your comments.

      What do you think about the usefulness of listening to content of which you understand very little, i.e. mostly incomprehensible input? The reason I ask is that, if I understood correctly, ALG frowns upon this, and seeks to have all content be in the 70% comprehension range. In a language learning forum I participate in, almost all the participants also seem to think that listening to incomprehensible input is a waste of time. I, on the contrary, think it is quite useful as long as it is done with full attention–in fact, that’s a premise of my project.

  3. That’s an interesting question…the big questions for me are, at/below what percentage comprehension-wise is input useless? At what percentage (range) is it optimal? I don’t think there are any answers to these questions as yet, so if you want to experiment on yourself, I for one look forward to reading about your experiences and progress.

    I certainly sat through my fair share of classes in AUA where my idea of what was going on was really vague, sometimes practically nonexistent. This usually happened in the early phases of being at a new level of class; eventually, my comprehension would come up. So obviously, getting that “very low comprehension input” worked in terms of acquiring Thai; whether getting “higher comprehension input” at that point would have accelerated the learning process, that I don’t know.

    I’ve also sat through snatches of dialogue or scenes in TV shows and movies where my comprehension has been extremely low (occasionally entire shows or movies); these are harder to assess in terms of their efficacy as language learning tools, at least in part because they were just one element in a mix of more comprehensible input (easier to understand shows and movies, classes at AUA). Most of the TV I’ve watched has been lakorn (Thai soap operas), where it’s very easy to get at least some idea of what’s going on, even when the language is problematic.

    My hunch is that certain kinds of totally incomprehensible input really are totally (or almost totally) useless. My list would include things like the following, where the language is so advanced as to be totally incomprehensible to the listener/viewer: audio only, and especially something like a recording of a university lecture, where you’re just getting an information monologue delivered in a fairly monotone, unemotional voice; and shows where the visuals have little or nothing to do with the language used, there’s no easy to follow narrative, there are no obvious emotional interactions between people — stuff like “high level” news shows (politics, economics), or talk shows about similar topics (with “talking heads”); and maybe certain types of documentaries. Basically, I’m talking about situations where the input consists of language that’s way out of your reach, and the context and nonverbal side of things don’t really provide any real help in terms of “getting” what’s going on.

    I think you need to be able to have some understanding of what you’re experiencing, in order for that experience to be an instance of language learning — but it doesn’t have to be a very high level of understanding; it could be pretty vague.

    A key idea behind comprehensible input is that if your language capabilities are weak, then you need to have things like context and nonverbal cues. Good stuff for beginner level — if you’re limited to video — might be shows that demonstrate things and talk about them at the same time (cooking, crafts); I’m big on soap operas, but by the time I started really trying to watch them, I was already at a fairly decent level of Thai (around 1750 hours class time, I think, plus a fair amount of out-of-class experience) — though in retrospect, I probably could have started watching them earlier.

    Here’s a thought experiment of mine: totally incomprehensible input: an audio recording of someone reading, in a flat tone of voice, a series of vocabulary words, or even phrases, that don’t add up to anything meaning-wise; i.e., they don’t form any kind of narrative or information or communication, just instances of the language itself. My guess is that that kind of input would get you nowhere in terms of language acquisition — it would be a waste of time; if it were the only form of input you were getting, you’d never learn the language.

    So again, for me, the questions are: how low of a level of comprehension could you get away with and still be learning; what level of comprehension would be optimal in terms of maximal learning?

    Then there’s also always the question of enjoyment. If what you’re getting is totally, absolutely incomprehensible (like in the example I give above), you might find that your ability to enjoy it fades away after the novelty wears off — and there’s nothing there to really hold your attention. But even a fairly low level of comprehension can make for an enjoyable experience. For example, you’re watching a scene in a movie or show, and you understand that these 2 people have a conflict going and are having a showdown; something is at stake. You can’t figure out what is at stake, what’s going on, or what they’re saying to one another; but you get the scene on an emotional level. — I’ve been through this plenty of times watching movies and TV in Thai, and I enjoyed what I was watching and it wasn’t a chore to keep paying attention, despite the extreme “sketchiness” of my understanding.

    Never underestimate the fun/enjoyment-factor 😉 …

  4. That’s a really complete answer and I will reference it in the future, as I think you really hit the nail on the head.

    In your implied (and much more logical) definition of incomprehensible input, an example of which is pure audio when one is just beginning to learn a language that is unrelated to languages one already knows, I agree that its usefulness is limited at best.

    And in that definition, none of what I’m listening to is actually incomprehensible input, since I always have rich visual cues and can get the gist of what is going on. A good example is the cartoon Boonie Bears. At the beginning of my experiment, an entire episode could go by without my recognizing any words. Nonetheless, I understood the basic plot. And in many episodes I was able to pick up a word here and there. Even the hardest material–such as movies without any subtitles–is not really incomprehensible in that strict sense.

    Some of the strongest critics of my experiment (namely on forums), however, consider my watching these sources to be incomprehensible input and thus a waste of time.

    As you say, a very interesting question is what is the optimal percentage of comprehension for learning. I would venture to guess that, while there might be an optimal point (which would probably vary depending on each person, L1 and L2, etc.), the benefits of watching with full attention at any rate of comprehension would not decline very significantly as compared to that optimal point.

    It’s actually probably optimal to mix things up, since you might glean different benefits at different levels. For instance, and just for the sake of argument, watching at 80% comprehension might be fantastic for picking up new vocabulary, at 98% might be best for strengthening one’s grammar, learning idiomatic expressions, contributing to spoken fluency, etc., and at below 10% might be really good (for beginners, by definition) for learning high frequency vocabulary, paying attention to phonemes and cadence, etc.

    In that case, and in line with what you suggest, better criteria for choosing what to watch might be simply what you enjoy more, feel motivated by, what is easily available, what contributes to your life in other ways beyond language acquisition, and so forth.

  5. “I always have rich visual cues and can get the gist of what is going on”: that’s comprehensible input, in my book. “Comprehensible input” to me means getting the overall meaning; it doesn’t mean that you have to understand every detail, as you would with a language that you’re totally fluent in.

    Mixing things up is good for fun, for getting different types of language (a kids’ show vs. the nightly news vs. a soap opera; dialogue vs. monologue, etc), for covering different topics (and their associated sets of vocabularies). I’ve sometimes watched stuff that’s way over my head — just as an experiment, for the sound of the language, for exposure to the particular type or register of language they’re using; but I usually find that I can’t sustain it, and end up setting those kinds of shows aside, to maybe pick them up again at some future point.

    Anyway, there’s no penalty for experimenting with your input sources…. 🙂

  6. Oh, and regarding the “comprehensible” in “comprehensible input”: it refers to understanding the situation as a whole, not just the language; language is only one facet. So if you’re following what’s going on — in a Boonie Bears episode, or while talking to a native speaker who’s willing to supplement their speech with gestures, pictures, etc. — then it’s “comprehensible” even if you can’t understand a word of what’s being said.

    “Traditional” language learning is, I think, premised on the idea that you need to learn all the vocabulary (and grammar rules, etc) before you can understand an instance of the language being used; ALG/natural language learning assumes that if you can follow what’s going on in a situation where language is being used, you will end up acquiring language. So it kind of reverses the logic and assumptions of more “studious” language learning methods.

    As you acquire more and more of your target language, you become less and less reliant on context and nonverbal cues to make the input comprehensible; you can get to the point where you can understand input that consists of nothing other than language.

  7. If you want a suggestion on a show, check out this one – http://en.wikipedia.org/wiki/Love_Cheque_Charge

    We’ve started and ditched a number of romantic comedies but this one actually grabbed us. We’re up around episode 45 and are enjoying it still. I’ve learned a ton, even watching with subtitles. My husband isn’t actively studying and needs the subs to enjoy it. I try to listen first before reading. I’d probably get more benefit from watching without subs but want my hubby to be able to watch too. He hates to admit it, but he’s actually enjoying this show. lol
