Analyzing Narrative Samples

The ENNI scores used by the CHESL project are:

  • Story Grammar (A1 and A3)
  • Mean Length of Communicative Unit (MLCU)
    • Measure of sentence length
  • Number of Different Words and Total Number of Words
    • Measure of expressive vocabulary

How to Calculate Mean Length of Communicative Unit

The ENNI Mean Length of Communicative Unit (MLCU) is calculated using the MLU (Mean Length of Utterance) command in CLAN.
  1. Run the check command to ensure accuracy of transcript coding.
    1. Open a second CLAN window and position it next to your transcript so you can see both windows. This way, you can run your analyses in one window, and view your transcript in the other.
    2. Open the Commands window in this new instance of CLAN, if it is not already open.
    3. Type check into this Commands window.
    4. Click File In and select the correct .cha transcript.
    5. Click Add-> then Done.
    6. Click Run in the Commands window.
    7. The output of the check command will appear in the second CLAN window. If the output reports any errors, correct them in your transcript and save the corrected version. To run check again, type check into the Commands window and click File In. Select the transcript you are analyzing in the right-hand pane and click Remove. Then add it back from the lower left-hand pane by clicking Add->, click Done, and run the command again (step 6 above). CLAN commands will not run correctly if you do not remove and then re-add the transcript before running the same command again.
  2. Add a morphological tier using the mor command.
    1. Type mor +t*CHI into the Commands window.
    2. Click File In and select the correct .cha transcript.
    3. Click Add-> then Done.
    4. Click Run in the Commands window. A new file with the same name as the original, but with the extension .mor.cex, will be created in your working directory. This file contains a %MOR (morphological) tier, which includes part-of-speech tags for every word. The mor command automatically parses the words in the transcript and excludes words marked as repeats, hesitations, and fillers. An example of a transcript that includes a morphological tier is shown below.

      Sample Narrative Transcription with Morphological Tier

  3. Calculate MLCU on your newly created .mor.cex file.
    1. Type mlu +t*CHI -t%mor -s"[+ bch]" into the Commands window.
    2. Click File In and select the .mor.cex file created in step 2.
    3. Click Add-> then Done.
    4. Click Run in the Commands window. The value labelled "Ratio of morphemes over utterances" in the output is your raw MLCU score. A sample output window is shown below. In this example, the MLCU = 7.063.

      Sample Narrative Transcription MLCU Output
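
The ratio reported by CLAN is the total number of morphemes divided by the number of utterances counted. The following Python sketch illustrates that arithmetic only; it does not reproduce how CLAN parses a transcript. The function name mean_length_of_cunit and the morpheme counts are hypothetical and are not taken from a real transcript.

```python
def mean_length_of_cunit(morphemes_per_unit):
    """Return total morphemes divided by the number of communication units."""
    if not morphemes_per_unit:
        raise ValueError("at least one communication unit is required")
    return sum(morphemes_per_unit) / len(morphemes_per_unit)


# Hypothetical morpheme counts for four communication units (illustration only).
counts = [6, 9, 5, 8]
score = mean_length_of_cunit(counts)
print(f"MLCU = {score:.3f}")  # 28 morphemes / 4 units -> MLCU = 7.000
```

A hand calculation like this can serve as a rough check on a short transcript, but the score to report is the one CLAN produces.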

How to Calculate Number of Different Words and Total Number of Words

Number of Different Words (NDW) measures the number of unique words (also referred to as types in language sampling). Total Number of Words (TNW) counts all the words in the transcript (also referred to as tokens).

  1. Mark boundaries for grammatical morphemes.
    1. Save a new copy of your transcript (so you have a version with and without hyphens). Adding NW (number of words) to the file name identifies it as the transcript which has morpheme boundaries marked.
    2. Go through the entire transcript and place hyphens between word stems and the following grammatical morphemes:
      1. plural –s, (e.g., balloon-s)
      2. third person singular –s, (e.g., run-s)
      3. possessive –'s, (e.g., boy-'s)
      4. present progressive –ing, (e.g., bounce-ing)
      5. past tense –ed, (e.g., bounce-ed)
        Note that spelling should be altered (as in 4 and 5 above) when letters are doubled or dropped, so that CLAN can recognize the stem of the word. This means that play-ing and play-ed will be counted as two instances of the word play. Separate morphemes even in words that children have mismarked, for example fall-ed, ate-ed. Do not mark boundaries between derivational morphemes (e.g., –ful in beautiful) because it is not clear that children recognize the relationship between stem words and words with derivational morphemes. For example, it is not clear that a child recognizes that beauty and beautiful have the same root; therefore, these are counted as two separate words.
  2. Calculate NDW and TNW using the freq (frequency) command.
    1. Type freq +t*CHI +r6 -s"[+ bch]" +s"*-%%" into the Commands window.
    2. Click File In and select the correct .cha transcript (marked with hyphens).
    3. Click Add-> then Done.
    4. Click Run in the Commands window. This command generates a list of words. The number beside each word indicates how often that particular word occurred in the transcript. A summary of the number of types and tokens is provided at the bottom of the word list. A shortened sample output window is shown below. In this example, NDW = 159 and TNW = 452.

      Sample Narrative Transcription Number of Words Measures
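
The effect of marking morpheme boundaries on the type and token counts can be illustrated with a short Python sketch. It assumes that everything after the first hyphen is a grammatical morpheme and that counting is case-insensitive; both are simplifications for illustration, not a description of how CLAN works internally. The function names stem and count_words and the sample words are hypothetical.

```python
from collections import Counter


def stem(word):
    """Return the part of a hyphen-marked word before the first hyphen."""
    return word.split("-", 1)[0].lower()


def count_words(words):
    """Return (NDW, TNW, per-stem counts) for a list of hyphen-marked words."""
    counts = Counter(stem(w) for w in words)
    ndw = len(counts)           # number of different words (types)
    tnw = sum(counts.values())  # total number of words (tokens)
    return ndw, tnw, counts


# Hypothetical words with morpheme boundaries marked (illustration only).
sample = ["the", "boy", "play-ed", "and", "the", "girl", "is", "play-ing", "balloon-s"]
ndw, tnw, counts = count_words(sample)
print(ndw, tnw, counts["play"])  # prints: 7 9 2
```

Here play-ed and play-ing both count toward the stem play, so they add two tokens but only one type.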

    Note: If you receive error messages while running any CLAN commands, check to ensure that your 'lib' (library) and 'mor lib' (morphological library) directories indicate the correct folders. For more information, see the Installing and Running CLAN section of Transcribing Narrative Samples.
