Mathematics Benchmarking Report TIMSS 1999–Eighth Grade




CHAPTER 2: Performance at International Benchmarks

What Issues Emerge from the Benchmark Descriptions?

The benchmark descriptions and example items strongly suggest a gradation in achievement, from the top-performing students’ ability to generalize and solve non-routine or contextualized problems to the lower-performing students’ ability to use primarily routine, mainly numeric procedures. The fact that even at the Median Benchmark students demonstrate only limited achievement in problem solving beyond straightforward one-step problems may suggest a need to reconsider the role, or priority, of problem solving in mathematics curricula.

The choices teachers make determine, to a large extent, what students learn. According to the NCTM’s “The Teaching Principle,” in effective teaching, worthwhile mathematical problems are used to introduce important ideas and engage students’ thinking. The TIMSS 1999 Benchmarking results show that higher achievement is related to the emphasis that teachers place on reasoning and problem-solving activities (see Chapter 6, Exhibit 6.11). This finding is consistent with the video study component of TIMSS conducted in 1995. Analyses of videotapes of mathematics classes revealed that in the typical mathematics lesson in Japan, students worked on developing solution procedures, often expected to be original constructions, to report to the class. In contrast, in the typical U.S. lesson, students essentially practiced procedures that had been demonstrated by the teacher.

In looking across the item-level results, it is also important to note the variation in performance across the topics covered. On the 16 items presented in this chapter, there was a substantial range in performance for many Benchmarking participants. For example, students in the Benchmarking entities performed relatively well on the items requiring rounding (Exhibits 2.13 and 2.17), and students in Texas did very well on the subtraction questions (Exhibits 2.18 and 2.19). Conversely, students in the Benchmarking entities had particular difficulty with measurement items containing figures (Exhibits 2.2 and 2.9). In some cases, differences of this sort will result from intended differences in emphasis in state or district curricula. It is likely, however, that some of the variation in results is unintended, and in those cases the findings will provide important information about strengths and weaknesses in intended or implemented curricula. For example, Maryland, the Michigan Invitational Group, Chicago, Rochester, and Miami-Dade may not have anticipated performing below the international average on a relatively straightforward word problem involving proportional reasoning (Exhibit 2.8). At the very least, an in-depth examination of the TIMSS 1999 results may reveal aspects of curricula that merit further investigation.





7 TIMSS used item response theory (IRT) methods to summarize the achievement results on a scale with a mean of 500 and a standard deviation of 100. Given the matrix-sampling approach, scaling averages students’ responses in a way that accounts for differences in the difficulty of different subsets of items. It allows students’ performance to be summarized on a common metric even though individual students responded to different items in the test. For more detailed information, see the “IRT Scaling and Data Analysis” section of Appendix A.
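The idea in this note, that scaling lets students who answered different subsets of items be placed on one metric, can be illustrated with a deliberately simplified sketch. The code below assumes a one-parameter (Rasch) model with known item difficulties and a direct linear mapping of the ability estimate onto a scale with mean 500 and standard deviation 100. These are illustrative assumptions only: the item difficulties, function names, and response patterns are hypothetical, and the operational TIMSS procedure described in Appendix A is more elaborate than this.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate_ability(responses, difficulties, iterations=50):
    """Maximum-likelihood ability estimate via Newton-Raphson.

    responses: list of 0/1 item scores.
    difficulties: item difficulties in the same order (assumed known).
    Requires a mixed pattern (not all correct and not all incorrect).
    """
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        # First derivative of the log-likelihood with respect to theta.
        gradient = sum(x - p for x, p in zip(responses, probs))
        # Test information (negative second derivative).
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information
    return theta


def to_reporting_scale(theta, mean=500.0, sd=100.0):
    """Map an ability estimate onto a mean-500, SD-100 scale.

    Assumes theta is already standardized in the reference population.
    """
    return mean + sd * theta


# Hypothetical booklets: same raw score (3 of 4), different item difficulty.
easy_booklet = [-1.5, -1.0, -0.5, 0.0]
hard_booklet = [0.0, 0.5, 1.0, 1.5]
pattern = [1, 1, 1, 0]

theta_easy = estimate_ability(pattern, easy_booklet)
theta_hard = estimate_ability(pattern, hard_booklet)

print(to_reporting_scale(theta_easy), to_reporting_scale(theta_hard))
```

In this sketch both hypothetical students answer three of four items correctly, but the student who took the harder booklet receives the higher scale score, which is the sense in which scaling accounts for differences in item difficulty.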


TIMSS 1999 is a project of the International Study Center
Boston College, Lynch School of Education