The Multiple-Choice Test: truly objective assessment?
Pick up an examination paper from any country in the world, and you will find lots of multiple-choice questions. Love them or hate them, they are endemic to educational assessment. Why is this? Research by Professor Glenn Fulcher of Leicester’s School of Education has sought to answer this question, looking in detail at multiple-choice tests – their history, structure, strengths and weaknesses.
The multiple-choice question made its debut in 1914, the creation of Frederick J. Kelly in his doctoral dissertation "Teachers’ Marks, Their Variability and Standardization". Before the end of the First World War, it had become the item of choice in all standardised psychological tests, and by the 1920s it was used in most educational tests.
Two key problems were identified that the multiple-choice item was meant to address. The first was the use of the teacher’s subjective judgment when assessing learners. While it is true that scoring a key as 'correct' is purely objective, the belief that the question itself is objective is an illusion. As the example below shows, it is possible to embed cultural and social assumptions into items, producing responses that do not reflect learners’ true level on the ability of interest (score contamination).
It is also possible to construct multiple-choice questions for which one can imagine a context in which more than one response is correct. Writing good multiple-choice questions is therefore extremely difficult.
The second problem was time. Kelly was writing when the education system was expanding rapidly and teachers simply did not have enough time to mark examinations. The multiple-choice question was intended to be quick to score for teachers, and cheap to score for education authorities.
The early 20th century saw the development of the first accountability policies, and test scores were the means of implementation. So when Wood published his evaluation of what he called the 'new type tests' in 1928, he produced a table comparing the costs of traditional written examinations and multiple-choice examinations, and simply concluded “these differences are too large to need comment.”
These were the heady days of efficiency drives, as the industrial economies realised that the Great War would be won or lost on well-organised munitions production as much as on military strategy. Taylor’s time studies were flavour of the month, and the multiple-choice question maximised industrial testing productivity in the army and in the schools.
Whatever the criticisms of the multiple-choice test, it is a cost-effective technology that has been tried and tested over a hundred years. We know how to build high-quality standardised tests from multiple-choice items. So it will still be a common feature of our tests in another hundred years.
Professor Glenn Fulcher
- Language testing and assessment
- Policy issues in test use
- Philosophy of testing and assessment
- Construct definition and operationalisation
- Designing test specifications
- Task design, prototyping and piloting
- Content analysis for EAP or specific purposes testing
- Investigating task difficulty
- Designing rating scales for performance assessment
- Interface issues in computer based testing
- Pragmatic notions of validity and the philosophy of educational assessment
- Ethics and Standards in language testing practice
- Test use, political mandates and philosophy