Authors Anonymous Statistics In Your World 
Brief Description
 
Aims and Objectives
 
Prerequisites
 
Equipment and Planning
 
Section A - Introduction
 
Section B - Word Length
 
Section C - Sentence Length
 
Section D - Summary
 
Test Questions
 
Test Questions - Answers
 
Connections with Other Units
 

Brief Description

The unit encourages pupils to look for patterns in word length, sentence length, and summary statistics among different passages from the same author and between different authors, to establish which two of three passages are by the same author.

Design time: 3-4 hours

 

Aims and Objectives

On completion of the unit pupils should have an appreciation of how the summary statistics can be used in inference, in particular as clues to authorship. They will have practised summarizing data in tables, including using class intervals, calculating a mean from raw data, drawing bar charts and identifying the mode or modal class. They will also have used simple proportions and the range, and will be expected to compare such summary statistics obtained from different sources.

They should be more aware of the effects of extreme values in the calculation of the mean through the effect of short sentences as used in direct speech. Optional sections include the calculation of the mean from grouped data and with the use of the mid-point of an interval. Comparison of the accuracy of the means calculated in different ways is also mentioned.

 

Prerequisites

There are no statistical prerequisites. Pupils need to be able to use simple proportions and correct numbers to 1 decimal place.

 

Equipment and Planning

The unit is flexible in use. Initially pupils are guided through the analysis of the given passage A, and are expected to analyse a passage 30 sentences long of their own choosing. This can be passage B, or another one taken from a reading book, or one of their own essays. The use of individual passages provides variety and plenty of data for comparison, but is difficult to mark.

Care is needed in drawing conclusions from passages containing a great deal of direct speech, since this increases the proportion of short sentences. Contrasts can usefully be drawn between books read by first-year pupils and those read by adults, or books written in different centuries.

Section D gives an opportunity to compare analysed data, and to attempt to decide which two of three passages are by the same author.

The use of R pages depends on the approach used.

R1 is needed by all students, and is provided as an R page for easy reference at various stages of the unit. Copies can be collected in and reused on future occasions.

Each student will require a copy of R2 and R3 for each passage he is asked to analyse, unless he is to draw his own tables and axes for bar charts.

R4 (passage B) can be used instead of providing other passages in Sections B and C. If it is not used there, then either passage B will need to be analysed, or the summary pages issued, for Section D.

R5 (passage C) can be used for additional analysis if required.

R6 and R7 show sufficient summary data of passages B and C for pupils to complete the clues needed for Section D without doing all the initial analysis themselves. Notice that for ease of duplication the scales used are not the same as those in the text.

Pupils can work individually or in pairs when analysing the data, although it is recommended that each pupil completes the bar charts.

Class discussion is likely to be necessary at the outset and in Section D, and will be useful whenever comparisons are made.

 

Detailed Notes

Section A

Class discussion here on possible ways of identifying authors' styles will set the scene for the unit. Comparison of perhaps Dickens or Shakespeare with a modern day author with whom the pupils are familiar may help, and, although not considered in the unit, mention could be made of the dating methods used to identify the paper of early manuscripts. Analysis of personal essays can yield information about help received from parents with homework, or cheating in examinations. One of the earliest applications of this field of literary authorship was in trying to decide on the authorship of the epistles in the New Testament.

 

Section B

Pupils either analyse their own passage of 30 sentences (provided by themselves or by the teacher from a familiar book), or passage B on page R4. Answers for this are given at the end of these notes. Decisions may have to be made about whether to include numbers, and how to count words that are hyphenated or have apostrophes. Discussion of any such problems would help, and a consensus opinion could be used to deal with them.

B1
Page R2 will be needed to record the results.

If a book is being used, it may not be advisable to allow pupils to write the number of letters over each word. Working in pairs to complete Table 5 can overcome this. Care will also have to be taken to see that Table 5 is extended if necessary, as mentioned in the text.
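Where the passage is available as text, the tally for Table 5 can be checked by machine. The following is only a sketch: the function name `word_lengths` and the flag `count_marks` are illustrative, and it assumes that numbers are left out and that a hyphenated word counts as one word. Whether apostrophes and hyphens count as characters is left as a switch, since that is one of the decisions the class has to agree on.

```python
import re
from collections import Counter

# Opening sentences of passage A (page R1).
text = ("After they had eaten Ralph and the biguns set out along the beach. "
        "They left Piggy propped up on the platform.")

def word_lengths(text, count_marks=True):
    """Return the length of each word in the text, in order.

    Assumption: a word is a run of letters, optionally joined by
    apostrophes or hyphens; digits are ignored altogether.
    """
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    if count_marks:
        # Apostrophes and hyphens are counted as characters.
        return [len(word) for word in words]
    # Only the letters are counted.
    return [sum(ch.isalpha() for ch in word) for word in words]

lengths = word_lengths(text)
tally = Counter(lengths)

# Print a simple tally chart: word length, tally marks, number of words.
for length in sorted(tally):
    print(length, "|" * tally[length], tally[length])
```

With `count_marks=True` a word such as "Ralph's" is given length 7, which agrees with the marking on page R1.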

B2
Comparison of bar charts is generally easier than comparison of tables. If R pages are not being used, or pupils are being asked to draw bar charts from scratch, then for comparison purposes the scales used must be the same as in Figure 1. The bars have purposely been kept separate, to stress the fact that the data (word lengths) are discrete.

Try to make pupils look at the overall shape of each bar chart when making comparisons.

  1. This is for pupils who have analysed different passages, and discussion could prove interesting and useful here.

B3
It is unlikely that there will be a great deal of difference between the modes - the incidence of three-letter words is high in the English language. The range can be more revealing, but can be distorted by something like a particularly long place name.
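As a check on pupils' answers, the mode and range can be read off a frequency table as in the sketch below. It uses the word-length counts for passage B from page R6; the name `mode_and_range` is illustrative only. Lengths with a frequency of zero are ignored so that they cannot widen the range.

```python
# Word-length frequency table for passage B (page R6):
# word length in letters -> number of words.
word_table_b = {1: 4, 2: 20, 3: 25, 4: 18, 5: 18, 6: 7,
                7: 4, 8: 1, 9: 0, 10: 1, 11: 2}

def mode_and_range(freq):
    """Return (mode, range) for a {value: frequency} table."""
    # Keep only the lengths that actually occur in the passage.
    observed = [length for length, count in freq.items() if count > 0]
    # The mode is the length with the highest frequency
    # (if two lengths tie, the first one found is returned).
    mode = max(observed, key=lambda length: freq[length])
    return mode, max(observed) - min(observed)

mode_b, range_b = mode_and_range(word_table_b)
print(f"Mode = {mode_b} letters, range = {range_b} letters")
```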

*B4
Optional for brighter pupils. This allows pupils to calculate the mean word length from the frequency table. It provides good practice for them, but does not yield a great deal of information about authorship.
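The calculation can be sketched as follows, using the word-length counts for passage C from page R7. Each word length is multiplied by its frequency, the products are added to give the total number of letters, and this is divided by the total number of words. The function name `mean_from_table` is illustrative.

```python
# Word-length frequency table for passage C (page R7):
# word length in letters -> number of words.
word_table_c = {1: 3, 2: 11, 3: 25, 4: 17, 5: 13, 6: 10,
                7: 15, 8: 4, 9: 1, 10: 0, 11: 1}

def mean_from_table(freq):
    """Return the mean of a {value: frequency} table."""
    total_words = sum(freq.values())
    # Total letters = sum of (word length x number of words of that length).
    total_letters = sum(length * count for length, count in freq.items())
    return total_letters / total_words

mean_c = mean_from_table(word_table_c)
print(f"Mean word length = {mean_c:.1f} letters")
```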

B5
Since the mode and, to some extent, the range are not necessarily good discriminators, proportions are sometimes used. Again the pupils should be warned to look out for possible distortions due to the repeated use of a particular word.

Nevertheless, comparisons provide interesting results, particularly if the pupils have analysed different passages. The use of 100 words in each passage makes the comparisons easier.
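A sketch of the proportion calculation is given below, using Table 1 from the test questions as data. The function name `proportion` is illustrative; the cut-offs (8 or more letters for long words, 3 or less for short words) are those used in the unit. The result is kept as a pair of counts rather than a decimal, to match the way the proportions are written in the answers.

```python
# Table 1 from the test questions: word length -> number of words.
test_word_table = {1: 9, 2: 20, 3: 24, 4: 19, 5: 8, 6: 9, 7: 4,
                   8: 5, 9: 0, 10: 0, 11: 0, 12: 1, 13: 1}

def proportion(freq, keep):
    """Return (matching words, total words) for lengths where keep(length) is true."""
    total = sum(freq.values())
    matching = sum(count for length, count in freq.items() if keep(length))
    return matching, total

long_words = proportion(test_word_table, lambda length: length >= 8)
short_words = proportion(test_word_table, lambda length: length <= 3)

print("Long words: {}/{}".format(*long_words))
print("Short words: {}/{}".format(*short_words))
```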

 

Section C

Sentence length is usually more helpful in the analysis of literary style than word length; but difficulties can arise if the passage contains much direct speech.

C1
It is easier if this is done in pairs, with one person counting the number of words and the other recording the results.

It is recommended that the individual sentence lengths be recorded to assist in the calculation of the mean and the range.

Pupils are then expected to find the range and the modal class, and compare them with those for passage A. Care will have to be taken in deciding the mode if any additional figures have been more conveniently grouped into an interval of width greater than five.
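The grouping into class intervals can be sketched as below, using the 30 sentence lengths for passage B listed on page R6. The names `class_interval`, `table` and `modal_class` are illustrative, and every interval is assumed to have width five (1-5, 6-10, and so on), as in Table 6.

```python
from collections import Counter

# Sentence lengths (number of words) for passage B, from page R6.
sentence_lengths_b = [
    20, 37, 15, 12, 20, 18, 26, 21, 13, 26,
    18, 16, 43, 14, 20, 17, 14, 13, 19, 22,
    31, 22, 19, 10, 23, 27, 14, 16, 13, 17,
]

def class_interval(n, width=5):
    """Return the (low, high) class interval containing n: 1-5, 6-10, ..."""
    low = ((n - 1) // width) * width + 1
    return low, low + width - 1

# Count how many sentences fall in each class interval.
table = Counter(class_interval(n) for n in sentence_lengths_b)
modal_class = max(table, key=table.get)
length_range = max(sentence_lengths_b) - min(sentence_lengths_b)

for (low, high), count in sorted(table.items()):
    print(f"{low}-{high}: {count}")
print(f"Modal class = {modal_class[0]}-{modal_class[1]} words")
print(f"Range = {length_range} words")
```

Note that the range is taken from the individual sentence lengths, not from the grouped table, which is one reason for recording them.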

C2
Bars have again been kept separate because of the use of discrete data. If the ability of the class is high, there may be a case for introducing the approximation to continuous data here, using a horizontal axis as shown:

with the bars touching at the marked points.

C3
This is best done using a calculator to add up the number of words per sentence already listed. Some may prefer to count the individual words in the passage. Comparison of mean sentence lengths for descriptive passages can be very revealing.

*C4
An opportunity exists here for brighter pupils to calculate the mean from a grouped frequency table. Explanation may be necessary of how to find the mid-point of the class interval, why this is necessary, and the implications of using it.

  1. 14.0 is from the original raw data. Some accuracy has been lost in grouping the data.
  2. Calculations may have to be based on a grouped frequency table when the raw data are not available or are too unwieldy to handle.
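The loss of accuracy from grouping can be illustrated with the sentence lengths from the test questions, for which both the raw data and the grouped table are given in these notes. In the sketch below each class is represented by its mid-point, for example (1 + 5) / 2 = 3 for the class 1-5; the function name `grouped_mean` is illustrative.

```python
# Sentence lengths from the test questions (raw data).
test_lengths = [
    13, 7, 31, 9, 15, 22, 12, 14, 6, 35,
    5, 25, 14, 18, 33, 20, 5, 11, 13, 16,
    10, 10, 5, 30, 6, 22, 18, 22, 6, 8,
]

# The same data as a grouped frequency table (Table 2 in the answers):
# (lowest, highest) sentence length in the class -> number of sentences.
grouped = {(1, 5): 3, (6, 10): 8, (11, 15): 7, (16, 20): 4,
           (21, 25): 4, (26, 30): 1, (31, 35): 3}

def grouped_mean(table):
    """Estimate the mean by treating every value as its class mid-point."""
    total = sum(table.values())
    weighted = sum(((low + high) / 2) * count
                   for (low, high), count in table.items())
    return weighted / total

raw_mean = sum(test_lengths) / len(test_lengths)
estimate = grouped_mean(grouped)

print(f"Mean from raw data: {raw_mean:.1f} words")
print(f"Mean from grouped table: {estimate:.1f} words")
```

The two results differ slightly because the mid-point stands in for every sentence in its class, whatever the actual lengths were.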

C5
Proportions of long and short sentences are considered to be one of the best clues to authorship. Indeed, some authors repeatedly use such long sentences that this is often a good place to start an investigation.

Comparison of books written in the 19th century with those of the 20th century shows a marked difference in sentence length proportions.

 

Section D

The approach here depends on what has been done before, but it gives the pupils a chance to practise the techniques at the same time as trying to identify authorship. Item 4, 'Mean word length', has an asterisk, since the equivalent section earlier in the unit (B4) was also denoted as being optional for brighter pupils.

D1
If passage B has already been analysed in the previous section, then all pupils can analyse passage C (if further practice is required) or use the summary data on page R7.

If passage B has not been analysed and further practice is required, then half the class can do passage B and the rest passage C. If no further practice is required, then the summary sheets on pages R6 and R7 will both be required. These sheets contain enough information for summary statistics to be calculated.

Pupils should make a check list of the clues mentioned in the summary, insert the relevant answers for each of the three passages (they will need to look through the unit for the summary statistics for passage A), and then look for similarities or differences. Mention should be made of the use of direct speech in passages A and C.

 

Summary Statistics
Words Passage B Passage C
Mode 3 letters 3 letters
Range 10 letters 10 letters
Mean 3.9 letters 4.5 letters
Proportion of long words (8 or more) 3/100 1/100
Proportion of short words (3 or less) 49/100 39/100

 

 

Sentences Passage B Passage C
Mode 16-20 words 1-5 words
Range 33 words 34 words
Proportion of long sentences (26 or more) 6/30 2/30
Proportion of short sentences (5 or less) 0/30 10/30 (influenced by direct speech)

 

The passages are taken from:

Passage A
The Lord of the Flies by William Golding (Faber and Faber)

Passage B
Prester John by John Buchan (Thomas Nelson & Sons)

Passage C
The Spire by William Golding (Faber and Faber)

 

 

Page R1

PASSAGE A

After(5) they(4) had(3) eaten(5) Ralph(5) and(3) the(3) biguns(6) set(3) out(3) along(5) the(3) beach(5). They(4) left(4) Piggy(5) propped(7) up(2) on(2) the(3) platform(8). This(4) day(3) promised(8), like(4) the(3) others(6), to(2) be(2) a(1) sunbath(7) under(5) a(1) blue(4) dome(4). The(3) beach(5) stretched(9) away(4) before(6) them(4) in(2) a(1) gentle(6) curve(5) till(4) perspective(11) drew(4) it(2) into(4) one(3) with(4) the(3) forest(6); for(3) the(3) day(3) was(3) not(3) advanced(8) enough(6) to(2) be(2) obscured(8) by(2) the(3) shifting(8) veils(5) of(2) mirage(6). Under(5) Ralph's(7) direction(9), they(4) picked(6) a(1) careful(7) way(3) along(5) the(3) palm(4) terrace(7), rather(6) than(4) dare(4) the(3) hot(3) sand(4) down(4) by(2) the(3) water(5). He(2) let(3) Jack(4) lead(4) the(3) way(3), and(3) Jack(4) / trod with theatrical caution though they could have seen an enemy twenty yards away. Ralph walked in the rear, thankful to have escaped responsibility for a time.

Simon, walking in front of Ralph, felt a flicker of incredulity - a beast with claws that scratched, that sat on a mountain-top, that left no tracks and yet was not fast enough to catch Samneric. However Simon thought of the beast, there rose before his inward sight the picture of a human at once heroic and sick.

He sighed. Other people could stand up and speak to an assembly, apparently, without that dreadful feeling of the pressure of personality; could say what they would as though they were speaking to only one person. He stepped aside and looked back. Ralph was coming along, holding his spear over his shoulder. Diffidently, Simon allowed his pace to slacken until he was walking side by side with Ralph and looking up at him through the coarse black hair that fell now to his eyes. Ralph glanced sideways, smiled constrainedly as though he had forgotten that Simon had made a fool of himself, then looked away again at nothing. For a moment or two Simon was happy to be accepted and then he ceased to think about himself. When he bashed into a tree Ralph looked sideways impatiently and Robert sniggered. Simon reeled and a white spot on his forehead turned red and trickled. Ralph dismissed Simon and returned to his personal hell. They would reach the castle some time; and the chief would have to go forward.

Jack came trotting back.
'We're in sight now.'
'All right. We'll get as close as we can.'
He followed Jack towards the castle where the ground rose slightly. On their left was an impenetrable tangle of creepers and trees.
'Why couldn't there be something in that?'
'Because you can see. Nothing goes in or out.'
'What about the castle then?'

 

Page R2

Passage used: __________

Word length Tally marks Number of words
1    
2    
3    
4    
5    
6    
7    
8    
9    
10    
11    
12    
     
     

Total

 

Table 5 - Word lengths of 100 words.

 


Figure 3 - Bar chart of word lengths

Mode = _____ letters.
Shortest word had _____ letters, longest word had _____ letters.
Range = _____ - _____ letters = _____ letters

 

Page R3

Passage used: __________ Individual sentence lengths: __________

Class interval: Sentence length (number of words) Tally marks Number of sentences
1-5    
6-10    
11-15    
16-20    
21-25    
26-30    
31-35    
36-40    
41-45    
     

Table 6 - Sentence lengths of 30 sentences.

 


Figure 4 - Bar chart of sentence lengths.

 

Page R4

PASSAGE B

To(2) begin(5) with(4), it(2) was(3) no light task to fight one's way through the dense undergrowth of the lower slopes. Every kind of thorn bush lay in wait for my skin, creepers tripped me up, high trees shut out the light, and I was in mortal fear lest a black mamba might appear out of the tangle. It grew very hot, and screes above the thicket were blistering to the touch. My tongue, too, stuck to the roof of my mouth with thirst.

The first chimney I tried ran out on the face into nothingness, and I had to make a dangerous descent. The second was a deep gully, but so choked with rubble that after nearly braining myself I desisted. Still going eastwards, I found a sloping ledge which took me to a platform from which ran a crack with a little tree growing in it. My glass showed me that beyond this tree the crack broadened into a clearly defined chimney which led to the top. If I can once reach that tree, I thought, the battle is won.

The crack was only a few inches wide, large enough to let in an arm and a foot, and it ran slant-wise up a perpendicular rock. I do not think I realised how bad it was till I had gone too far to return. Then my foot jammed, and I paused for breath with my legs and arms cramping rapidly. I remember that I looked to the west, and saw through the sweat which kept dropping into my eyes that about half a mile off a piece of cliff which looked unbroken from the foot had a fold in it to the right. The darkness of the fold showed me that it was a deep, narrow gully. However, I had no time to think of this, for I was fast in the middle of my confounded crack. With immense labour I found a chockstone above my head, and managed to force my foot free. The next few yards were not so difficult, and then I stuck once more.

For the crack suddenly grew shallow as the cliff bulged out above me. I had almost given up hope, when I saw that about three feet above my head grew the tree. If I could reach it and swing out I might hope to pull myself up to the ledge on which it grew. I confess it needed all my courage, for I did not know but that the tree might be loose, and that it and I might go rattling down four hundred feet. It was my only hope, however, so I set my teeth, and wriggling up a few inches, made a grab at it. Thank God it held, and with a great effort I pulled my shoulder over the ledge, and breathed freely.

My difficulties were not ended, but the worst was past. The rest of the gully gave me good and safe climbing, and presently a very limp and weary figure lay on the cliff-top. It took me many minutes to get back my breath and to conquer the faintness which seized me as soon as the need for exertion was over.

When I scrambled to my feet and looked round, I saw a wonderful prospect. It was a plateau like the high-veld, only covered with bracken and little bushes like hazels. Three or four miles off the ground rose, and a shallow vale opened. But in the foreground, half a mile or so distant, a lake lay gleaming in the sun.

 

Page R5

PASSAGE C

One delver relaxed, and smeared a hand over his sweaty face. The other disappeared from sight and began to make grunting noises. The master builder knelt down quickly, his hands on the edge of a slab, and leaned still further forward.

'Anything?'
'Nothing, master. Come-hup!'

The man's head appeared and his two hands. He held the iron rod in both of them, one thumb marking a distance, the other on the shining point. The master builder inspected the rod slowly from one thumb to the other. He looked through Jocelin, shaped his lips to whistle but made no sound. Jocelin understood that he was ignored, and turned away to examine the nave. He caught sight of the white, noble head of Anselm where he sat two hundred yards away by the west door, obeying the letter of his instructions, but out of earshot and almost out of sight. Jocelin felt a sudden return of pain that the man should look like one thing and behave like another; a touch of astonishment too, and incredulity. If he wants to behave like a child, let him sit there till he grows to the stone! I shall say nothing. He turned back to the master builder, and this time knew himself to be recognised.

'Well Roger my son?'

The master builder straightened up, knocked the dust from his knees, then brushed it from his hands. The delvers were at work again, scrape, chunk.

'Did you understand what you saw, reverend father?'
'Only that the legends are true. But then; legends are always true.'
'You priests pick and choose.'
You priests.
I must be careful not to anger him, he thought. As long as he does what I want, let him say what he likes.
'Confess, my son. I told you the building was a miracle and you would not believe me. Now your eyes have seen.'

 

Page R6

ANALYSIS OF PASSAGE B
Word length Number of words
1 4
2 20
3 25
4 18
5 18
6 7
7 4
8 1
9 0
10 1
11 2

Word lengths of 100 words

Total number of words: 100
Total number of letters used: 391

 


Bar chart showing word lengths

Sentence lengths of 30 sentences
20, 37, 15, 12, 20, 18, 26, 21, 13, 26
18, 16, 43, 14, 20, 17, 14, 13, 19, 22
31, 22, 19, 10, 23, 27, 14, 16, 13, 17

 

Sentence length Number of sentences
1-5 0
6-10 1
11-15 8
16-20 11
21-25 4
26-30 3
31-35 1
36-40 1
41-45 1

Table showing summarized sentence lengths

 


Bar chart showing sentence lengths

 

Page R7

ANALYSIS OF PASSAGE C

Word length Number of words
1 3
2 11
3 25
4 17
5 13
6 10
7 15
8 4
9 1
10 0
11 1

Word lengths of 100 words

Total number of words: 100
Total number of letters used: 450

 


Bar chart showing word lengths

 

Sentence lengths of 30 sentences
11, 11, 19, 1, 2, 1, 8, 20, 13, 13
13, 36, 26, 18, 4, 15, 4, 16, 8, 8
6, 6, 5, 2, 10, 14, 4, 14, 5, 2

 

 

Sentence length Number of sentences
1-5 10
6-10 6
11-15 8
16-20 4
21-25 0
26-30 1
31-35 0
36-40 1

Table showing summarized sentence lengths

 


Bar chart showing sentence lengths

 

 

Page R8

TEST ANSWER SHEET
1   Word lengths
  a
  b 1 Mode of word lengths = _____ letters
2 Range of word lengths = _____ letters
3 Proportion of long words = _____
4 Proportion of short words = _____
     
2   Sentence lengths
  a
Sentence lengths Tally marks Number of sentences
1-5    
6-10    
11-15    
16-20    
21-25    
26-30    
31-35    

Total

 
   
  b 1 Modal class is _____ words.
2 Range of sentence lengths is _____ words.
3 Mean sentence length is _____
4 Proportion of long sentences is _____
5 Proportion of short sentences is _____
     
3   Record your decision, giving reasons, on the reverse of this sheet.

 

 

Test Questions

Answer all questions on the sheet provided (R8).

  1. Word length
    Table 1 shows the number of letters in the first 100 words of a passage from a novel.
    Word length Number of words
    1 9
    2 20
    3 24
    4 19
    5 8
    6 9
    7 4
    8 5
    9 0
    10 0
    11 0
    12 1
    13 1
    Total 100

    Table 1

    1. Use this information to complete the bar chart on the answer sheet. Give the chart a title and label the axes.
    2. Find:
      1. the mode of word lengths,
      2. the range of word lengths,
      3. the proportion of long words (8 or more letters),
      4. the proportion of short words (3 letters or less).
      Record your answers in the appropriate spaces on the answer sheet.
  2. Sentence length
    The numbers of words in the first 30 sentences of the passage were:
    13, 7, 31, 9, 15, 22, 12, 14, 6, 35
    5, 25, 14, 18, 33, 20, 5, 11, 13, 16
    10, 10, 5, 30, 6, 22, 18, 22, 6, 8
    1. Use tally marks to complete Table 2 on the answer sheet and draw a bar chart on the axes provided.
    2. Find:
      1. the modal class,
      2. the range of sentence lengths,
      3. the mean sentence length,
      4. the proportion of long sentences (26 words or more),
      5. the proportion of short sentences (5 words or less).
  3. Which author?
    You will need pages R6 and R7 and the summary statistics below to provide the information for this question.
    The author of the passage used in this test wrote either passage B or passage C.
    1. Use your answers to the test questions and the information you have been given to decide which passage the author wrote. Write your answers on the reverse side of the answer sheet, stating your reasons.
  Passage B Passage C
Words
Mode 3 letters 3 letters
Range 10 letters 10 letters
Mean 3.9 letters 4.5 letters
Proportion of long words (8 or more) 3/100 1/100
Proportion of short words (3 or less) 49/100 39/100
Sentences
Mode 16-20 words 1-5 words
Range 33 words 34 words
Mean 20 words 10.5 words
Proportion of long sentences (26 or more) 6/30 2/30
Proportion of short sentences (5 or less) 0/30 10/30 (influenced by direct speech)

Summary Statistics

 

Answers

Preliminary requirements
To complete this test each pupil will require:

  1. A copy of the test questions.
  2. A copy of the test answer sheet on page R8, unless they are to draw the tables and diagrams for themselves.
  3. Access to pages R6 and R7 that were used with the unit.
1   Word lengths
  a Bar chart showing length of 100 words
  b 1 Mode of word lengths = 3 letters
2 Range of word lengths = 12 letters
3 Proportion of long words = 7/100
4 Proportion of short words = 53/100

Mean is not asked for, but if calculated gives 3.8 letters.

     
2   Sentence length
  a
Sentence lengths Tally marks Number of sentences
1-5 3
6-10 8
11-15 7
16-20 4
21-25 4
26-30 1
31-35 3

Total

30

Table 2

 


Bar chart showing lengths of 30 sentences

  b 1 Modal class is 6-10 words.
2 Range of sentence length is 30 words.
3 Mean sentence length is 15.4 words (from original data); 15.2 words (from grouped frequency table).
4 Proportion of long sentences is 4/30.
5 Proportion of short sentences is 3/30.
     
3   The test passage is from Free Fall by William Golding, hence the correct answer is passage C.

Comparison of word lengths may lead pupils to conclude passage B is by the same author as the test passage. The work covered in the unit should, however, make them doubt the use of word lengths as a good indicator.

Comparison of the bar charts of sentence lengths shows a reasonable similarity between passage C and the test passage.
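The unit relies on comparing the bar charts by eye. For a teacher who wants a rough numerical check, one possibility is to add up, class by class, the differences between the numbers of sentences in two passages. This measure is an assumption made for illustration and is not part of the unit; the smaller the total, the more alike the two bar charts are. The frequencies below are taken from Table 2 and from pages R6 and R7.

```python
# Numbers of sentences in the classes 1-5, 6-10, ..., 41-45.
test_counts = [3, 8, 7, 4, 4, 1, 3, 0, 0]   # test passage (Table 2)
b_counts = [0, 1, 8, 11, 4, 3, 1, 1, 1]     # passage B (page R6)
c_counts = [10, 6, 8, 4, 0, 1, 0, 1, 0]     # passage C (page R7)

def total_difference(first, second):
    """Sum of the absolute class-by-class differences in frequency."""
    return sum(abs(x - y) for x, y in zip(first, second))

diff_b = total_difference(test_counts, b_counts)
diff_c = total_difference(test_counts, c_counts)

print(f"Test passage against passage B: {diff_b}")
print(f"Test passage against passage C: {diff_c}")
```

On this measure the test passage is nearer to passage C than to passage B, which agrees with the answer given above, though with only 30 sentences per passage the result should be treated as a clue and no more.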

 

Connections with Other Published Units from the Project

Other Units at the Same Level (Level 2)

On the Ball
Opinion Matters
Seeing is Believing
Getting it Right
Fair Play

Units at Other Levels in the Same or Allied Areas of the Curriculum

Level 3

Multiplying People
Pupil Poll
Phoney Figures

This unit is particularly relevant to: English, Humanities, Mathematics.

Interconnections between Concepts and Techniques Used in these Units

These are detailed in the following table. The code number in the left-hand column refers to the items spelled out in more detail in Chapter 5 of Teaching Statistics 11-16.

An item mentioned under Statistical Prerequisites needs to be covered before this unit is taught. Units which introduce this idea or technique are listed alongside.

An item mentioned under Idea or Technique Used is not specifically introduced or necessarily pointed out as such in the unit. There may be one or more specific examples of a more general concept. No previous experience is necessary with these items before teaching the unit, but more practice can be obtained before or afterwards by using the other units listed in the two columns alongside.

An item mentioned under Idea or Technique Introduced occurs specifically in the unit and, if a technique, there will be specific detailed instruction for carrying it out. Further practice and reinforcement can be carried out by using the other units listed alongside.

Code No. Statistical Prerequisites Introduced In
  None  
  Idea or Technique Used Introduced in Also Used in
1.2a Using discrete data Seeing is Believing Fair Play
Getting it Right
Phoney Figures
Opinion Matters
Multiplying People
Pupil Poll
1.2c Problems of classification of data Opinion Matters
Getting it Right
Multiplying People
 
1.3a Sampling from small well-defined population Pupil Poll Fair Play
  Idea or Technique Introduced Also Used in
2.1a Constructing single variable frequency tables On the Ball
Multiplying People
Seeing is Believing
Pupil Poll
Opinion Matters
2.2a Bar charts for discrete data Seeing is Believing
Pupil Poll
Multiplying People
Phoney Figures
2.2e Bar charts for continuous data Seeing is Believing
3.1a Mode for discrete data Seeing is Believing
3.1c Mean for small data set On the Ball
Getting it Right
Seeing is Believing
Fair Play
3.1e Modal class Multiplying People
3.1f Mean for frequency distribution Seeing is Believing
Fair Play
3.2a Range Phoney Figures
3.2b Fractiles (intuitive)  
5a Reading tables Seeing is Believing
Phoney Figures
Opinion Matters
Multiplying People
5e Comparing directly comparable data On the Ball
5o Use of a test statistic  
5e Comparison of two samples  
5u Inference from bar chart Multiplying People
Phoney Figures

 
