Mathematics Benchmarking Report TIMSS 1999–Eighth Grade




CHAPTER 2: Performance at International Benchmarks

The TIMSS 1999 international benchmarks delineate performance of the top 10 percent, top quarter, top half, and lower quarter of students in the entities participating in the study. To help interpret the achievement results, Chapter 2 describes eighth-grade mathematics achievement at each of these benchmarks together with examples of the types of items typically answered correctly by students performing at the benchmark.


To provide an idea of the mathematics understandings and skills displayed by students performing at different levels on the TIMSS mathematics achievement scale, TIMSS described performance at four international benchmarks. The TIMSS 1999 international benchmarks delineate performance of the top 10 percent, top quarter, top half, and lower quarter of students in the countries participating in the TIMSS 1999 study. (The benchmarks were set at the 90th, 75th, 50th, and 25th percentiles, respectively.)
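As a rough illustration of how benchmark cut points at the 90th, 75th, 50th, and 25th percentiles could be read off a pooled score distribution, the following Python sketch applies a simple nearest-rank rule. The scores, the function name, and the nearest-rank rule are all assumptions made for illustration; the sketch ignores sampling weights and any other refinements the study itself may have applied.

```python
import math

def nearest_rank_percentile(scores, pct):
    """Return the score at the given percentile using the nearest-rank rule."""
    ordered = sorted(scores)
    # Rank of the smallest score with at least pct percent of scores at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical pooled scale scores (not TIMSS data).
pooled_scores = [
    310, 355, 390, 420, 445, 470, 487, 505, 520, 540,
    560, 575, 590, 610, 640, 655, 670, 700, 720, 760,
]

# Benchmark names and percentiles as stated in the text.
BENCHMARK_PERCENTILES = {
    "Top 10%": 90,
    "Upper Quarter": 75,
    "Median": 50,
    "Lower Quarter": 25,
}

benchmarks = {
    name: nearest_rank_percentile(pooled_scores, pct)
    for name, pct in BENCHMARK_PERCENTILES.items()
}

for name, cut in benchmarks.items():
    print(f"{name}: {cut}")
```

With these made-up scores, each benchmark is simply the scale score below which the stated share of the pooled students falls.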

As states and school districts spend time and energy on improving students’ mathematics achievement, it is important that educators, curriculum developers, and policy makers understand what students know and can do in mathematics, and what areas, concepts, and topics need more focus and effort. To help interpret the range of achievement results for the TIMSS 1999 Benchmarking participants presented in Chapter 1, this chapter describes eighth-grade mathematics achievement at each of the TIMSS 1999 international benchmarks, explaining the types of mathematics understandings and skills typically displayed by students performing at the benchmarks. The benchmark descriptions are presented together with examples of the types of mathematics test questions typically answered correctly by students reaching the benchmark. Appendix D contains the descriptions of the understandings and skills assessed by each item in the TIMSS 1999 assessment at each benchmark.(1)

For each of the example test questions, the percentages of correct responses are provided for selected countries as well as for the jurisdictions participating in the TIMSS 1999 Benchmarking project. The countries and Benchmarking jurisdictions are presented in descending order, with those performing highest shown first. The countries included for purposes of comparison are the United States as well as a dozen European and Asian countries of interest. These include several high-performing European countries (Belgium (Flemish), the Czech Republic, the Netherlands, and the Russian Federation), countries that are major economic trading partners of the United States (Canada, England, and Italy), and the top-scoring Asian countries of Chinese Taipei, Hong Kong, Japan, Korea, and Singapore.

Presented previously in Chapter 1, Exhibit 1.4 shows the percentages of students in each participating entity reaching each international benchmark – Top 10%, Upper Quarter, Median, and Lower Quarter. If an entity had high average achievement in mathematics and a large percentage of its students at or above the upper benchmarks, this indicates that its students are concentrated among the highest-achieving students internationally. For example, top-performing Singapore had nearly half (46 percent) of its students reaching the Top 10% Benchmark and three-fourths (75 percent) reaching the Upper Quarter Benchmark – the point on the scale that only 25 percent of students would be expected to reach if achievement were distributed the same from country to country. Most of the Singaporean students (93 percent) reached the Median Benchmark. Performance in the United States was closer to the distribution that might be expected if achievement were distributed the same from country to country: nine percent of the students reached the Top 10% Benchmark, 28 percent reached the Upper Quarter Benchmark, and 61 percent reached the Median Benchmark.
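The percentages discussed above count students scoring at or above each benchmark. A minimal Python sketch of that tabulation follows; the cut scores, the entity's scores, and the function name are hypothetical values chosen for illustration, not figures from the study.

```python
def percent_reaching(scores, cut_score):
    """Percentage of students scoring at or above the cut score."""
    reached = sum(1 for s in scores if s >= cut_score)
    return 100 * reached / len(scores)

# Hypothetical benchmark cut scores on the achievement scale (assumed values).
cut_scores = {
    "Top 10%": 620,
    "Upper Quarter": 560,
    "Median": 480,
    "Lower Quarter": 400,
}

# Hypothetical scale scores for the students of one participating entity.
entity_scores = [380, 420, 455, 490, 510, 530, 565, 580, 600, 630]

profile = {
    name: percent_reaching(entity_scores, cut)
    for name, cut in cut_scores.items()
}

for name, pct in profile.items():
    print(f"{name}: {pct:.0f} percent")
```

Comparing such a profile with the 10, 25, 50, and 75 percent that would be expected if achievement were distributed the same everywhere shows whether an entity's students are concentrated toward the upper or lower end of the international distribution.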

The analysis of performance at these benchmarks in mathematics suggests that three primary factors appeared to differentiate performance at the four levels:

• The mathematical operation required
• The complexity of the numbers or number system
• The nature of the problem situation.

For example, there is evidence that students performing at the lower end of the scale could add, subtract, and multiply whole numbers. In contrast, students performing at the higher end of the scale solved non-routine problems involving relationships among fractions, decimals, and percents; various geometric properties; and algebraic rules.

How Were the Benchmark Descriptions Developed?

To develop descriptions of achievement at the TIMSS 1999 international benchmarks, the International Study Center used the scale anchoring method. Scale anchoring is a way of describing students’ performance at different points on the TIMSS 1999 achievement scale in terms of the types of items they answered correctly. It involves an empirical component in which items that discriminate between successive points on the scale are identified, and a judgmental component in which subject-matter experts examine the content of the items and generalize to students’ knowledge and understandings.

For the scale anchoring analysis, the results of students from all the TIMSS 1999 countries were pooled, so that the benchmark descriptions refer to all students achieving at that level. (That is, it does not matter which country the students are from, only how they performed on the test.) Certain criteria were applied to the TIMSS 1999 achievement scale results to identify the sets of items that students reaching each international benchmark were likely to answer correctly and those at the next lower benchmark were unlikely to answer correctly.(2) The sets of items thus produced represented the accomplishments of students reaching each benchmark and were used by a panel of subject-matter experts from the TIMSS countries to develop the benchmark descriptions.(3) The work of the panel involved developing a short description for each item of the mathematical understandings demonstrated by students answering it correctly, summarizing students’ knowledge and understandings across the set of items for each benchmark to provide more general statements of achievement, and selecting example items illustrating the descriptions.
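The empirical component of scale anchoring can be sketched in Python using the criterion given in footnote 2: an item is included for a benchmark if at least 65 percent of students at that benchmark answered it correctly and less than 50 percent of students at the next lower benchmark did. The item names and percentages below are hypothetical, and the sketch covers only the two benchmarks for which the footnote states the rule.

```python
MIN_PCT_AT_BENCHMARK = 65   # at least this percent correct at the benchmark
MAX_PCT_AT_LOWER = 50       # less than this percent correct at the next lower one

# (benchmark, next lower benchmark) pairs covered by footnote 2.
BENCHMARK_PAIRS = [
    ("Top 10%", "Upper Quarter"),
    ("Upper Quarter", "Median"),
]

def anchors_at(pct_at_benchmark, pct_at_next_lower):
    """Apply the 65 percent / 50 percent inclusion criterion to one item."""
    return (pct_at_benchmark >= MIN_PCT_AT_BENCHMARK
            and pct_at_next_lower < MAX_PCT_AT_LOWER)

def anchored_items(items):
    """Map each benchmark to the items that meet the criterion there."""
    result = {benchmark: [] for benchmark, _ in BENCHMARK_PAIRS}
    for item_id, pct in items.items():
        for benchmark, lower in BENCHMARK_PAIRS:
            if anchors_at(pct[benchmark], pct[lower]):
                result[benchmark].append(item_id)
    return result

# Hypothetical percent-correct values for students scoring at each benchmark.
items = {
    "item_A": {"Top 10%": 72, "Upper Quarter": 44, "Median": 20},
    "item_B": {"Top 10%": 88, "Upper Quarter": 70, "Median": 41},
    "item_C": {"Top 10%": 69, "Upper Quarter": 55, "Median": 30},
}

anchored = anchored_items(items)
print(anchored)
```

In this made-up data, item_C meets neither criterion, which illustrates that not every item necessarily discriminates between two adjacent benchmarks under this rule.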

How Should the Descriptions Be Interpreted?

In general, the parts of the descriptions that relate to the understanding of mathematical concepts or familiarity with procedures are relatively straightforward. It needs to be acknowledged, however, that the cognitive behavior necessary to answer some items correctly may vary according to students’ experience. An item may require only simple recall for a student familiar with the item’s content and context, but necessitate problem-solving strategies from one unfamiliar with the material. Nevertheless, the descriptions are based on what the panel believed to be the way the great majority of eighth-grade students could be expected to perform.

It also needs to be emphasized that the descriptions of achievement characteristic of students at the international benchmarks are based solely on student performance on the TIMSS 1999 items. Since those items were developed in particular to sample the mathematics domains prescribed for this study, neither the set of items nor the descriptions based on them purport to be comprehensive. There are undoubtedly other mathematics curriculum elements on which students at the various benchmarks would have been successful if they had been included in the assessment.

Please note that students reaching a particular benchmark demonstrated the knowledge and understandings characterizing that benchmark as well as those characterizing the lower benchmarks. The description of achievement at each benchmark is cumulative, building on the description of achievement demonstrated by students at the lower benchmarks.

Finally, it must be emphasized that the descriptions of the international benchmarks are one possible way of beginning to examine student performance. Some students scoring below a benchmark may indeed know or understand some of the concepts that characterize a higher level. Thus, it is important to consider performance on the individual items and clusters of items in developing a profile of student achievement in each participating entity.

Several example items are included for each benchmark to complement the descriptions by giving a more concrete notion of the abilities students demonstrated. Each example item is accompanied by the percentage of correct responses for each TIMSS 1999 Benchmarking participant. Percentages are also provided for selected countries, as is the international average for all 38 countries that participated in TIMSS 1999. In general, the several entities scoring highest on the overall test also scored highest on many of the example items. Not surprisingly, this was true for items assessing a range of performance expectations – recall, ability to carry out routine procedures, and ability to solve routine and non-routine problems. The TIMSS 1999 results support the premise that successful problem solving is grounded in mastery of more fundamental knowledge and skills.

Item Examples and Student Performance

The remainder of this chapter describes each benchmark and presents three to five example items illustrating what students know and can do at that level. The correct answer is circled for multiple-choice items. For open-ended items, the answers shown exemplify the types of student responses that were given full credit. The example items are ones that students reaching each benchmark were likely to answer correctly, and they represent the types of items used to develop the description of achievement at that benchmark.(4)


1 For a detailed description of the items and benchmarks for TIMSS 1995 at fourth and eighth grades and how they compare to the National Council of Teachers of Mathematics’ (NCTM) Principles and Standards for School Mathematics, see Kelly, D.L., Mullis, I.V.S., and Martin, M.O., Profiles of Student Achievement in Mathematics at the TIMSS International Benchmarks: U.S. Performance and Standards in an International Context, Chestnut Hill, MA: Boston College.
2 For example, for the Top 10% Benchmark, an item was included if at least 65 percent of students scoring at the scale point corresponding to this benchmark answered the item correctly and less than 50 percent of students scoring at the Upper Quarter Benchmark answered it correctly. Similarly, for the Upper Quarter Benchmark, an item was included if at least 65 percent of students scoring at that point answered the item correctly and less than 50 percent of students at the Median Benchmark answered it correctly.
3 The participants in the scale anchoring process are listed in Appendix E.
4 Some of the items used to develop the benchmark descriptions are being kept secure to measure achievement trends in future TIMSS assessments and are not available for publication.


TIMSS 1999 is a project of the International Study Center
Boston College, Lynch School of Education