
The Strange World of EFL Testing
Has psychology no place in testing?

by Mario Rinvolucri, Pilgrims, UK


Menu
You do not need to read this article through from beginning to end; you can go straight to the part you find most interesting or relevant to your needs.

A look at a classic of testing literature

The built-in psychological unfairness of the test situation

The exam taker's imposed frame of mind

Where are the humanistic voices in the world of EFL testing?

Some features of an ideal humanistic test

Testing intermediate writing

Testing low level writing

Past experiments in humanising testing

Self-evaluation of this article

(A thank you to Ahmet Sofuoglu, Longman Turkey Business Development Manager, for having encouraged me, pushed me, challenged me into giving a plenary on Testing at Marmara College, Istanbul, during their April 2002 Conference, testing being an area that I normally shy away from. Thank you, Ahmet.)


Do you find the world of Alice through the Looking Glass unsettling, thought-provoking and deeply strange? This is precisely what I feel about the world of language testing, with its breath-taking disregard for the person of the test-taker.


A look at a classic of Testing Literature.

To illustrate what I mean, let us take a look at Language Testing in Practice, by Lyle F. Bachman and Adrian S. Palmer, Oxford, 1996. This solid tome, a thorough work of well-researched seriousness, runs to 377 pages. Not more than 10 pages deal with the psychology of testing, that is to say the psychology of test takers. The main statement that Bachman and Palmer make on candidate psychology comes on pages 114-115:

As noted in Chapter 4, the test takers' responses to the characteristics of the test environment and tasks can potentially inhibit or facilitate optimum performance.

The authors then list three aspects of testing that may affect some candidates' ability to acquit themselves well:

1…….. Test takers' familiarity with test setting may determine, in part, their affective responses to test tasks. When there is a high level of correspondence between the characteristics of the target language use setting and tasks on the one hand, and the test setting and tasks on the other, we may be able to assume that test takers will have a generally positive affective response to the tests and test tasks…….

2. …….We would generally expect that test takers who have the relevant topical knowledge will have positive affective responses to the test and test tasks……..

3. Finally, test takers' general levels and profile of language ability can influence their affective responses. Test takers who have high levels of language ability are likely to feel positive about taking a language test, while less proficient test takers may feel threatened by the test.

To summarise the authors' thoughts in simpler language: if a given test seems to be measuring language use they will need in real life, the candidates will feel happy; if they know the answers to the test questions, the candidates will feel happy; and, finally, if their language level is high, test takers will be happy sitting tests.

Is that really all there is to say about affectivity in language testing?

Yet Bachman and Palmer are honourable men who have advised the UCLES examination board in Cambridge, UK, and many other exam authorities.
Their book is convincing when it comes to discussion of various types of validity, reliability, construct development, scoring criteria, scoring methods, scoring procedures and scoring scales, but what about the human being who sits at the centre of all these conceptualisations, the person taking the exam/test?

From my (limited) reading of the testing literature in EFL, little has been written about the student, the human being, invited or forced into the crisis situation of the exam room. Without consideration of the human factors in testing, what is the use of elaborating scientifically honed and perfected tests?


The built-in psychological Unfairness of the Test Situation

Major exams have a different psychological effect on different individuals. In my own particular case tests often filled me with a feeling of adrenalin pumping, joy at performance, a feeling of challenge and exhilarating risk. The effect they had on my brother was largely destructive: his writing hand trembled so much in his 16 plus UK State exams that he could hardly hold a pen. He passed only one subject and this "failure" governed the path he has taken through life. According to a teacher who dealt with us both, Bernard was noticeably more intelligent than me. I would submit that the British State's academic judgement of these two brothers at 16 was grossly inaccurate because it put Bernard in a situation he could not bear while offering me an ideal circus ring to show off in. I jumped through the hoops with more glee than awareness or dignity.
The EFL testing experts do not concern themselves with cases like Bernard's. Words like anxiety, panic, fear, crisis, stress do not figure much in the indexes of their books. You have to go to sources like Journal of Behaviour Therapy and Experimental Psychiatry (1972), No. 3, to find the work of people like T. K. Beck, who produced video-taped scenes for desensitisation of test anxiety. The scenes on his video include these:

- a person tossing and turning the night before an exam is to be taken

- a typical classroom with pupils talking nervously before the instructor arrives. He comes in carrying the exam papers.

- a close-up of time slipping by as the anxious student writes frantically on official paper.

Are manifestations of anxiety and stress in the face of exams rare occurrences that only affect that tiny minority of the student population who need psychiatric help, or are the scenes above typical of what a quite large number of exam takers live through?
I have yet to find, in the literature, any comprehensive list of the ways that people cope with pre-exam stress, but here are two idiosyncratic examples.
a) In her mid-teens, a now highly successful professional woman did ballet exercises from 6.00 till 8.00 am on the day of the exam. She would thus go into the exam with a relaxed and slightly tired body and a very alert mind.

b) A man who now runs and markets a major language exam used to smuggle an old pair of slippers into the exam room. He would sneak his feet into them and a sensation of comfort would come over him. With his stress levels thus lowered, he reckoned he could write much better papers.

These two people managed to cope with test-generated tension creatively and successfully. Many people, like these two, manage to cope with the internal crisis situation that an exam can generate, but there may be a serious price to pay in terms of unhappiness. The words that follow are those of a Spanish EFL teacher on a TT course at Pilgrims in the UK:

"Yesterday I was talking to some of my friends about university and student life, and most of us thought it was an experience we didn't want to go through again. All the pressure of exams and results was too hard to make us want to repeat it: one of us said that after finishing her studies she still had dreams about having to pass a test again, and not being able to do it." (Humanising Language Teaching, www.hltmag.co.uk, Year 4, Issue 1, Jan 2002, Readers' Letters)

The group who had this discussion were all professionals in their 30s and 40s. They are the "successful" products of the Spanish academic system with its strict hurdle race of tests and exams. If they feel like this, what do the "rejects", the "failures", feel?


The Exam Taker's imposed Frame of Mind

We have so far had a look at the way EFL testing literature avoids dealing with the exam as a psychological crisis that can generate stress, anxiety, fear and even panic. We have also looked at clear cases of exam takers entering the testing room in a far from optimal state of body, heart and mind.

But there are other, more cognitive aspects to most tests that need looking at. The majority of candidates go into a language exam in a "mistakes avoidance" state of mind. They often have a strong mapping of what they do not know or are unsure about and are determined to hide these areas from the examiners. A dramatic example of this came up when UCLES (the Cambridge, UK, exam authority) did an analysis, by nationality, of mistakes being made at First Certificate (FCE) level. They discovered that Japanese students had made no mistakes with relative clauses. They smelled a rat and had a close look at the Japanese scripts: this national group had scrupulously avoided using any relative clauses! (You translate "the woman, who has two studies, always does her best work in the other one" into Japanese this way: "The two studies having woman always does her best work…..")
Is a "mistakes avoidance" strategy a resourceful state of mind and heart? Is it conducive to showing your paces, to really shining in the target language? My own feeling is that the fear of falling into error fiercely inhibits natural linguistic and intellectual creativity.

I remember once showing a long letter I had received from a student to the Secretary of a Language Examination Board. He read through the eight lower intermediate pages of handwriting, in which the writer was desperately trying to teach me some economics (her specialism), and then looked up and said, pensively,
"This text was not written to be corrected."
He was dead right. This student wanted me to understand her meaning, despite her language having more holes in it than a piece of crochet work. The exams man was amazed to read a piece of communicative writing. In his work he would normally only see mistakes avoidance writing.

What do we think we are measuring if we put the exam taker into a linguistically defensive state of mind and then evaluate her shrunken production?

John Fanselow, in Breaking Rules, Longman, 1987, points out that it is the tester who always initiates, by setting a composition title, by generating a cloze procedure, a C test, a Multiple Choice exercise or whatever. As Fanselow puts it, the test taker is perpetually playing on the away ground, working within a frame strictly prescribed by the other. The candidate is the uneasy guest at the examiner's table.

Sometimes an exam taker refuses to act out the passive, reactive role assigned to him.

This was the case in a University physics test where the candidate was asked this question:
Show how it is possible to determine the height of a tall building with the aid of a barometer.

The student suggested lowering the barometer from the top of the building to the street on a rope and then measuring the length of the rope.

This logical and feasible answer earned him a zero mark. He appealed.

The external examiner who was brought in asked him to re-answer the question, giving him six minutes to do so.

The student offered this, as one of several possible correct answers:

Take the barometer to the top of the building. Drop it and time its fall with a stopwatch. Then, using the formula S = ½at², calculate the height of the building.

The external gave his second answer nearly 100%. The candidate then offered three or four more solutions to the problem, none of them the conventional answer the original examiner had been after.

This lad was not in a mood to give the examiner what he knew he expected. He was determined to play the game on his own highly intelligent home ground. He rejected the intellectual state of obedience and passivity that the exam implicitly required.
(The barometer story was written for the New Yorker by Alexander Calandra, professor of Physics at Washington University, St Louis, USA.)


Where are the humanistic voices in the World of EFL Testing?

If you search the literature for major work on testing by members of the humanistic language teaching movement you don't find much. People like Caleb Gattegno, Earl Stevick, Charles Curran, Lozanov, Herbert Kohl, Gertrude Moskowitz, Bernard Dufeu, John Morgan, Herbert Puchta, Alan Maley, Alan Duff are fascinated by the processes of learning language. They have written thousands of pages, between them, on the learner as a whole person, as a creative mind, but nothing major springs to mind from their work when we look at the area of testing and exams. (The work of John Fanselow is a serious exception to this generalisation.)
The humanistic movement's failure to address the problem of testing is a grievous one, as no teacher has ever moved through her career without somehow coming to terms with this difficult area. At Pilgrims, with a network of excellent, humanistically motivated teacher trainers, we have offered a course on testing only once in a quarter of a century's work. A cop-out? Yes, I have to admit it is. The area of testing is far too important to be left to the personality types who naturally gravitate towards wanting to measure, to quantify, to evaluate and generally to establish themselves as the gate-keepers.


Some features of an ideal Humanistic Test

The first question to be asked when testing language is "What is language?"

Following Dufeu (Teaching Myself, Oxford, 1994), I would suggest that language is Being rather than Having. In my own case I have Latin. I studied it for 8 years and if I have to produce any, I construct it, consciously applying the rules I learnt.
It goes something like this: agricolam (accusative case of "farmer", first declension noun, and it can come at the start of the sentence even though it is the object) puella (puella, or girl, is the subject, so no "m" at the end) amat (yep, amo, amas, amat, so this is third person singular…. and it looks good to have the verb at the end, not like in Church Latin, where it can go in the middle…). So, the girl loves the farmer.

I hardly need to point out that the way I know Latin has nothing to do with being able to communicate in a language. I know no Turkish, and yet the sounds of Merhaba have a place in my head and my heart. Merhaba evokes a first meeting with someone, a feeling of beginning, and seems to me to be an excellent way of greeting someone.
I am, I exist in and through Merhaba, while agricola is an intellectually dead translation of the English term farmer.

Merhaba is a Mario word, a Mario pleasure, a Mario handshake.
Puella is a counter on a language chessboard and does nothing to evoke the many puellas I have met and appreciated, in some cases loved. The "signifier", in the case of puella, is a thousand miles from the very important "signified".


Following the work of Carter and McCarthy at Nottingham University I would say that language is essentially relational, a bridging between two or more people, a central aspect of their coming together, of their meeting.
Let me give you a detailed instance of how the grammar of spoken UK English encodes for relationship.

If a speaker says "She was saying they're coming tonight", the speaker, by using the past continuous, implies that he knows the woman he is reporting.

If the speaker says "She said they're coming tonight", then we know nothing about his relationship to the woman whose words he is reporting.

This is one of the nitty-gritty examples from the CANCODE corpus of oral English that Carter and McCarthy have been working on for the past ten years.

The trouble with almost all tests is that they deal with language as having, as an inert mound of knowledge, and that nearly all written tests are non-relational, in that the candidate is not doing them in any strong I-thou frame. When the class teacher sets the test the student is in some sort of relationship with the teacher, but really more with her red pencil, with her language-critical faculty, than with her as a person.


How, then, can we test language as being and language as relationship?

This is a revolutionary question to which I can only offer a couple of tenuous answers which I have not yet checked out in the reality of an evaluating situation.


A. Testing intermediate Writing

1. Tell the candidates that the best six pieces of their writing will go up on the school web site and so will be read by other students, by parents and prospective parents. (You are providing the test takers with a real audience, a palpable audience and a largely well-disposed audience, that could well include their own family.)

2. Give the students four or five one-page extracts of excellent, simple English prose. Ask them to read and re-read these for 15 minutes before writing. Ask them to enjoy and soak up the voices of the writers.

3. Ask the students to write a piece of their own, under the influence of the style of one of the passages…. they can even write a continuation of the passage of their choice, or what went before it.

4. The pieces of writing from the test go up round the walls of the classroom for all to read. The students' task is to pick the six pieces to go up on the website.

5. The teacher then does her normal marking according to normal linguistic criteria and awards her technical, L2 correctness marks accordingly.


B. Testing low level writing

1. The teacher asks each student to write her a two-page letter about a topic that has not yet been discussed in class (the topic could be technical, personal or whatever). The student is to write the letter as much as possible in English but is allowed to code switch to mother tongue where absolutely necessary.

After marking the letters, you can usefully ask the students to work with colleagues and try to find adequate English for the mother tongue parts of their letters.
This type of test not only permits evaluation but also immediate further learning.
The permission to use L1 allows the students to express themselves in much less curtailed language and so to enrich what they dare to want to try to say.


In both the tests proposed above the candidate has an addressee or audience to write to. Her writing is relational, whether addressed to the teacher personally or to the school's website audience. In the first exercise the student is also in strong linguistic relationship to the authors of the model texts.
In both tests, the student chooses the topic area to write about, within the relationship she perceives with the reader, so, in John Fanselow's terms, the candidate is playing on her home ground.
To get a good technical mark the student will be aware of mistakes avoidance but also has the human motivation to express herself fully to a reader or readers.
To say that these two tests do away with exam stress, anxiety and fear is to claim too much. My hope is that they may reduce these negative factors.


Past Experiments in Humanising Testing

The Cooperative Language Movement Tests.

In this approach, widely practised in US secondary education, the students do most of their work together in groups of 4-6, and each group is organised to be as heterogeneous as possible in terms of race, of class and of academic ability.

When the time comes for the test, the members of each group do their preparation together, with the stronger ones helping the weaker ones. It is in their interest to do so, as the students know that, while they will take the test as individuals, and while their test papers will be evaluated individually, the mark they finally receive will be the average mark for the group.
This mode of testing raises the hackles of people in very individualistic societies, for instance Germany, but is realistic in terms of what happens in later life. If a team of engineers build a bridge, the whole group will be judged on the outcome and the less good professionals will benefit from the presence of those who are stronger. Isn't the team you work in judged as a whole, as well as sometimes individually?
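The group-average rule described above is simple arithmetic, and a minimal sketch may make it concrete. The function name, the student names and the marks below are all hypothetical, invented for illustration; the only thing taken from the description is that papers are marked individually and each member then receives the group's average.

```python
from statistics import mean


def group_marks(individual_marks):
    """Give every member of a group the group's average mark.

    individual_marks maps a student name to the mark awarded to that
    student's own paper (hypothetical data, for illustration only).
    """
    average = mean(individual_marks.values())
    return {student: average for student in individual_marks}


# One hypothetical group of four, marked out of 100.
marks = {"Ana": 80, "Ben": 60, "Cem": 70, "Dua": 50}
final = group_marks(marks)

for student, mark in final.items():
    print(f"{student}: own paper {marks[student]}, final mark {mark}")
```

In this invented group the weakest paper gains 15 points and the strongest loses 15, which is exactly the incentive for the stronger students to help the weaker ones prepare.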


Learner-Teacher Co-evaluation

Evaluating another person's work puts you in boss/parent position over them.
There have been many attempts at power-sharing over the past 50 years and a recent one, at upper secondary level, is described by Christoph Ruehlemann in his article:
Sharing the power: action research into learner and teacher co-evaluation
(You can read the whole article at www.hltmag.co.uk under Major Article, Year 4, Issue 1, January 2002.)
In describing his experiment with co-evaluation in a German State School, Christoph describes a system of careful checks and balances. The first text in the exam is marked by both the teacher and a peer-evaluator, using the same type and number of criteria. They each have a 50% say. The second text in the test is marked for one criterion by the teacher and for three by the peer-evaluator, thus giving the student a 75% say. The third text is marked only by a peer-evaluator, giving the student full power of decision.
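The percentages in this scheme follow from counting criteria, which the sketch below spells out. It assumes that every criterion carries equal weight, which the description implies but does not state. The counts for the second text (one teacher criterion, three peer criteria) come from the description; the counts used for the first and third texts are hypothetical, since only their proportions are given.

```python
def peer_share(teacher_criteria, peer_criteria):
    """Fraction of the mark decided by the peer-evaluator.

    Assumes each criterion carries equal weight (an assumption made
    for this sketch, not a detail confirmed by the article).
    """
    total = teacher_criteria + peer_criteria
    if total == 0:
        raise ValueError("at least one criterion is needed")
    return peer_criteria / total


# (teacher criteria, peer criteria) for each text in the exam.
texts = {
    "first text": (2, 2),   # equal numbers; the count of 2 is hypothetical
    "second text": (1, 3),  # counts as given in the description
    "third text": (0, 4),   # peer only; the count of 4 is hypothetical
}

for name, (teacher, peer) in texts.items():
    print(f"{name}: student say = {peer_share(teacher, peer):.0%}")
```

Seen this way, the exam hands over power in three steps, from an equal say to a three-quarters say to full student control.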

Christoph, at the end of his careful, detailed article, asks:
Do teachers and learners benefit from co-evaluation?
and then has this to say:
The answer is a clear yes. The obvious benefit for the teacher lies in the diagnostic exploitability of rating disagreements. ……..Astonishingly, accuracy turned out to be an area of relative rating harmony between teacher and students…….. There was much greater rating disharmony around the criterion variety. It became evident that this criterion had not yet been sufficiently well taught and learnt, an insight that contrasted sharply with the teacher's expectations. So, investigating these rating differences may greatly help identify learner weaknesses and define areas of additional learning and teaching.

…. Co-evaluation provides an occasion for genuine learner and teacher cooperation in a field where, traditionally, teacher autonomy is paid for by teacher isolation.

Co-evaluation benefits learners too. Getting to read their classmates' texts puts them in the place of the audience, which establishes writing as a communicative act, rather than a language exercise. Interestingly, for learners to accept their peers as 'real readers' it is prerequisite that evaluating and grading is not the prerogative of the teacher, but shared by the classroom community.

Finally, Co-evaluation greatly contributes to learner autonomy and responsibility.


Student self-evaluation

In Freedom to Learn for the 80's, Charles E. Merrill, 1983, Carl Rogers describes the pioneering work of Dr Herbert Levitan, a lecturer in neurophysiology. In the context of an undergraduate course where the contents and manner of teaching were extensively negotiated with the class group, Levitan decided that the marks awarded for the course should be based entirely on student self-evaluation. Each student had to submit the following:

- a portfolio of all written material he had produced over the semester;
- a diary of reflections on his work over the semester;
- the grade he awarded himself and a justification.

Levitan writes: I reminded them that I reserved the right, and indeed felt the obligation, to give them feedback on the grade they assigned themselves. I made clear, however, that I would respect their final decision on the grade they wished to have submitted to the University.

Here are two of Levitan's students' self-evaluations:

Evaluating myself is difficult, but I will try and be objective. I feel I've come a long way since the start of the course. Instead of just learning facts I learned how to ask questions and approach a problem…. but more importantly, I learnt how to discover more on my own. I believe my effort in the course is worth a B.

Based on the amount of time I spent in class compared to the amount of time I could have spent and the number of concepts I could have learned I give myself the grade of C for the course. I do not think a higher grade is justified, simply because I did not make a formal attempt at synthesis of a topic of interest (term paper). Also a lower grade than C would not reflect the amount of time I placed in the course and my satisfaction with what I learnt.


Levitan reports that the distribution of self-evaluation grades for the course was:
33% A
45% B
20% C
2% D.

On many previous courses on the same topic, which he had taught without consulting the students on what they wanted to learn and how they wanted to learn it, and without asking them to self-evaluate, he had suffered a drop-out rate of 30-40%. On this course no one dropped out.
Yes, of course, Levitan's experiment would not work in all contexts and in all cultures. Any experiment's generalisable value will be constrained by major cultural and belief variables.


Self-evaluation of this Article

For those readers who are convinced, at belief level, that the psychological aspect of testing must be ignored, because otherwise one simply enters an issueless touchy-feely jungle, Mario, you will have confirmed and hardened their conviction. From now on they will devote yet more energy to their validities and their reliabilities.

For those readers who have generally accepted that current testing ways are simply a given about which nothing much can be done, you may have half-opened a window on a hazy, new thought-landscape.

For people who feel that most of current testing is psychologically unfair, this article may articulate things they have always felt and suspected.

People looking for alternative ways of testing may be motivated to try out the practical systems outlined in the second half of the article.

I give myself a B+++ grade for effort,
a B- grade for width and depth of knowledge of the area,
a B+ grade for trying to find an appropriate voice for this piece.


If you have testing experiments you would like to share with colleagues round the world, why don't you send them in for publication?

mario@pilgrims.co.uk
