What are the major differences between norm-referenced and criterion-referenced reliability and validation procedures?
Norm-referenced tests (NRTs, sometimes referred to as
standardized tests) and criterion-referenced tests (CRTs, also
known as classroom tests) are two families of tests that are
distinguished most clearly by how scores are interpreted, the
purposes of the tests, their levels of specificity, the
distributions of scores, the structures of the tests, and what we
want the students to know in advance. In more detail, the two types
of tests differ as follows:
1) The ways scores are interpreted differ in that NRTs are designed
to compare the performances of students to one another in relative
terms, while CRTs are built to identify the amount or percent of
the material each examinee knows or can do in absolute terms.
2) The purposes of the tests also differ: NRTs are primarily
designed to spread examinees out on a continuum of general
abilities so that their performances can be compared to each other
(usually with standardized scores), while CRTs are designed to
assess the amount of material that the examinees know or can do
(usually expressed in percentages).
3) Levels of specificity are necessarily different: NRTs tend to
measure very general language abilities (for proficiency or
placement purposes), while CRTs usually focus on specific,
well-defined (and usually objectives-based) language knowledge or
skills (for diagnostic or achievement purposes).
4) The distributions of scores also differ. Ideally, NRT scores are
normally distributed (indeed, items are selected to ensure this is
the case), while CRT scores ideally produce quite different
distributions at different times in the learning process. On a
diagnostic CRT at the beginning of a course, students score very
low in a positively skewed distribution (indicating that they need
to learn the material); on an achievement CRT at the end of the
course, students score generally high in a negatively skewed
distribution (indicating that most of them have mastered the
material). Indeed, in the unlikely event that all students master
all the material, they should all score 100%.
5) The structures of the tests also differ: NRTs tend to consist of
a few long subtests (e.g., listening, grammar, reading), each with
many items of diverse content, while CRTs are typically built
around numerous short subtests, each containing well-defined and
similar items.
6) What we want the students to know in advance of the test differs
in that, for NRTs, security is usually an important issue because
we do not want examinees to know the content of the test items,
while for CRTs, we teach the content of the course and want the
students to study that content, so we tell them what to study, and
we test that content. If they know the content, they should
succeed.
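
The contrasts in points 1, 2, and 4 can be sketched numerically. The
following is only a minimal illustration, not a standard procedure:
the function names, the 20-item test length, and both score lists
are hypothetical and invented for the example. It shows a percent
score (the absolute, CRT-style reading), a z-score (one common
standardized score, the relative, NRT-style reading), and a simple
moment-based skewness to check the direction of skew.

```python
from statistics import mean, pstdev


def percent_scores(raw, n_items):
    """CRT-style reading: percent of the material each examinee got right."""
    return [100.0 * r / n_items for r in raw]


def z_scores(raw):
    """NRT-style reading: each score's distance from the group mean, in SD units."""
    m, sd = mean(raw), pstdev(raw)
    if sd == 0:
        # Everyone scored the same (e.g., all 100%), so there is no spread
        # to compare examinees on and z-scores are undefined.
        raise ValueError("z-scores are undefined when all scores are equal")
    return [(r - m) / sd for r in raw]


def skewness(raw):
    """Population skewness: > 0 means a tail toward high scores, < 0 toward low."""
    m, sd = mean(raw), pstdev(raw)
    if sd == 0:
        raise ValueError("skewness is undefined when all scores are equal")
    return sum(((r - m) / sd) ** 3 for r in raw) / len(raw)


# Hypothetical raw scores on a 20-item CRT (assumed data, for illustration only).
N_ITEMS = 20
diagnostic = [2, 3, 3, 4, 4, 5, 6, 9, 14]          # start of course
achievement = [11, 16, 17, 17, 18, 18, 19, 19, 20]  # end of course

if __name__ == "__main__":
    print("percent:", percent_scores(achievement, N_ITEMS))
    print("z-scores:", [round(z, 2) for z in z_scores(achievement)])
    print("diagnostic skew > 0:", skewness(diagnostic) > 0)
    print("achievement skew < 0:", skewness(achievement) < 0)
```

Note that a percent score can be read on its own, while a z-score
only has meaning relative to the group. If every student scored
100%, the percentages would still be informative, but the standard
deviation would be zero and the z-scores could not be computed.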