© 2001 International Association for the Evaluation of Educational Achievement (IEA)
Over the last decade, many states and school districts have created content and performance standards targeted at improving students' achievement in mathematics and science. In mathematics in particular, most states are in the process of updating and revising their standards. All states except Iowa (which as a matter of policy publishes no state standards) now have content or curriculum standards in mathematics, and many educational jurisdictions have worked successfully to improve their initial standards in clarity and content.(1) Much of this effort has been based on work done at the national level during this period to develop standards aimed at increasing the mathematics competencies of all students. Since 1989, when the National Council of Teachers of Mathematics (NCTM) published Curriculum and Evaluation Standards for School Mathematics, the mathematics education community has had the benefit of a unified set of goals for mathematics teaching and learning. The NCTM standards have been a springboard for state and local efforts to focus and improve mathematics education.(2)
Particularly during the past decade, there has been an enormous amount of energy expended in states and school districts not only on developing mathematics content standards but also on improving teacher quality and school environments as well as on developing assessments and accountability measures.(3) Participating in an international assessment provides states and school districts with a global context for evaluating the success of their policies and practices aimed at raising students' academic achievement.
What Is TIMSS 1999 Benchmarking?
TIMSS 1999, a successor to the 1995 Third International Mathematics and Science Study (TIMSS), focused on the mathematics and science achievement of eighth-grade students. Thirty-eight countries including the United States participated in TIMSS 1999 (also known as TIMSS-Repeat or TIMSS-R). Even more significantly for the United States, however, TIMSS 1999 included a voluntary Benchmarking Study. Participation in the TIMSS 1999 Benchmarking Study at the eighth grade provided states, districts, and consortia an unprecedented opportunity to assess the comparative international standing of their students' achievement and evaluate their mathematics and science programs in an international context. Participants were also able to compare their achievement with that of the United States as a whole, and, where both participated, school districts could compare their performance with that of their states.
Originally conducted in 1994-1995,(4) TIMSS compared the mathematics and science achievement of students in 41 countries at five grade levels. Using questionnaires, videotapes, and analyses of curriculum materials, TIMSS also investigated the contexts for learning mathematics and science in the participating countries. TIMSS results, which were first reported in 1996, have stirred debate, spurred reform efforts, and provided important information to educators and decision makers around the world. The findings from TIMSS 1999, a follow-up to the earlier study, add to the richness of the TIMSS data and their potential to have an impact on policy and practice in mathematics and science teaching and learning.
Twenty-seven jurisdictions from all across the nation, including 13 states and 14 districts or consortia, participated in the Benchmarking Study (see Exhibit 1). To conduct the Benchmarking Study, the TIMSS 1999 assessments were administered to representative samples of eighth-grade students in each of the participating districts and states in the spring of 1999, at the same time and following the same guidelines as those established for the 38 countries.
In addition to testing achievement in mathematics and science, the TIMSS 1999 Benchmarking Study involved administering a broad array of questionnaires. TIMSS collected extensive information from students, teachers, and school principals as well as system-level information from each participating entity about mathematics and science curricula, instruction, home contexts, and school characteristics and policies. The TIMSS data provide an abundance of information, making it possible to analyze differences in current levels of performance in relation to a wide variety of factors associated with the classroom, school, and national contexts within which education takes place.
Why Did Countries, States, Districts, and Consortia Participate?
The decision to participate in any cycle of TIMSS is made by each country according to its own data needs and resources. Similarly, the states, districts, and consortia that participated in the Benchmarking Study decided to do so for various reasons.
Primarily, the Benchmarking participants are interested in building educational capacity and looking at their own situations in an international context as a way of improving mathematics and science teaching and learning in their jurisdictions. International assessments provide an excellent basis for gaining multiple perspectives on educational issues and examining a variety of possible reasons for observed differences in achievement. While TIMSS helps to measure progress towards learning goals in mathematics and science, it is much more than an educational Olympics; it is a tool for examining a broad range of questions about curriculum, teaching, and the contexts in which students learn.
Unlike in many countries around the world where educational decision making is highly centralized, in the United States the opportunities to learn mathematics derive from an educational system that operates through states and districts, allocating opportunities through schools and then through classrooms. Improving students' opportunities to learn requires examining every step of the educational system, including the curriculum, teacher quality, availability and appropriateness of resources, student motivation, instructional effectiveness, parental support, and school safety.
Which Countries, States, Districts, and Consortia Participated?
Exhibit 1 shows the 38 countries, 13 states, and the 14 districts and consortia that participated in TIMSS 1999 and the Benchmarking Study.
The consortia consist of groups of entire school districts or individual schools from several districts that organized together either to participate in the Benchmarking Study or to collaborate across a range of educational issues. Descriptions of the consortia that participated in the project follow.
What Is the Relationship Between the TIMSS 1999 Data for the United States and the Data for the Benchmarking Study?
The results for the 38 countries participating in TIMSS 1999, including those for the United States, were reported in December 2000 in two companion reports: the TIMSS 1999 International Mathematics Report and the TIMSS 1999 International Science Report.(5) Performance in the United States relative to that of other nations was reported by the U.S. National Center for Education Statistics in Pursuing Excellence.(6) The results for the United States in those reports, as well as in this volume and its companion science report,(7) were based on a nationally representative sample of eighth-grade students drawn in accordance with TIMSS guidelines for all participating countries.
Because having valid and efficient samples in each country is crucial to the quality and integrity of TIMSS, procedures and guidelines have been developed to ensure that the national samples are of the highest quality possible. Following the TIMSS guidelines, representative samples were also drawn for the Benchmarking entities. Sampling statisticians at Westat, the organization responsible for sampling and data collection for the United States, worked in accordance with TIMSS standards to design procedures that would coordinate the assessment of separate representative samples of students within each Benchmarking entity.
For the most part, the U.S. TIMSS 1999 national sample was separate from the students assessed in each of the Benchmarking jurisdictions. Each Benchmarking participant had its own sample to provide comparisons with each of the TIMSS 1999 countries including the United States. In general, the Benchmarking samples were drawn in accordance with the TIMSS standards, and achievement results can be compared with confidence. Deviations from the guidelines are noted in the exhibits in the reports. The TIMSS 1999 sampling requirements and the outcomes of the sampling procedures for the participating countries and Benchmarking jurisdictions are described in Appendix A. Although taken collectively the Benchmarking participants are not representative of the United States, the effort was substantial in scope, involving approximately 1,000 schools, 4,000 teachers, and 50,000 students.
How Was the TIMSS 1999 Benchmarking Study Conducted?
The TIMSS 1999 Benchmarking Study was a shared venture. In conjunction with the Office of Educational Research and Improvement (OERI) and the National Science Foundation (NSF), the National Center for Education Statistics (NCES) worked with the International Study Center at Boston College to develop the study. Each participating jurisdiction invested valuable resources in the effort, primarily for data collection, including the costs of administering the assessments at the same time and using identical procedures as for TIMSS in the United States. Many participants have also devoted considerable resources to team building as well as to staff development to facilitate use of the TIMSS 1999 results as an effective tool for school improvement.
The TIMSS studies are conducted under the auspices of the International Association for the Evaluation of Educational Achievement (IEA), an independent cooperative of national and governmental research agencies with a permanent secretariat based in Amsterdam, the Netherlands. Its primary purpose is to conduct large-scale comparative studies of educational achievement to gain a deeper understanding of the effects of policies and practices within and across systems of education.
TIMSS is part of a regular cycle of international assessments of mathematics and science that are planned to chart trends in achievement over time, much like the regular cycle of national assessments conducted in the United States by the National Assessment of Educational Progress (NAEP). Work has begun on TIMSS 2003, and a regular cycle of studies is planned for the years beyond.
The IEA delegated responsibility for the overall direction and management of TIMSS 1999 to the International Study Center in the Lynch School of Education at Boston College, headed by Michael O. Martin and Ina V.S. Mullis. In carrying out the project, the International Study Center worked closely with the IEA Secretariat, Statistics Canada in Ottawa, the IEA Data Processing Center in Hamburg, Germany, and Educational Testing Service in Princeton, New Jersey. Westat in Rockville, Maryland, was responsible for sampling and data collection for the Benchmarking Study as well as the U.S. component of TIMSS 1999 so that procedures would be coordinated and comparable.
Funding for TIMSS 1999 was provided by the United States, the World Bank, and the participating countries. Within the United States, funding agencies included NCES, NSF, and OERI, the same group of organizations supporting major components of the TIMSS 1999 Benchmarking Study for states, districts, and consortia, including overall coordination as well as data analysis, reporting, and dissemination.
What Was the Nature of the Mathematics Test?
The TIMSS curriculum frameworks developed for 1995 were also used for 1999. They describe the content dimensions for the TIMSS tests as well as the performance expectations (behaviors that might be expected of students in school mathematics).(8) Five content areas were covered in the TIMSS 1999 mathematics test. These areas and the percentage of the test items devoted to each are: fractions and number sense (38 percent), measurement (15 percent), data representation, analysis, and probability (13 percent), geometry (13 percent), and algebra (22 percent). The performance expectations include knowing (19 percent), using routine procedures (23 percent), using complex procedures (24 percent), investigating and solving problems (31 percent), and communicating and reasoning (2 percent).
The test items were developed through a cooperative and iterative process involving the National Research Coordinators (NRCs) of the participating countries. All of the items were reviewed thoroughly by subject matter experts and field tested. Nearly all the TIMSS 1999 countries participated in field testing with nationally representative samples, and the NRCs had several opportunities to review the items and scoring criteria. The TIMSS 1999 mathematics test contained 162 items representing a range of mathematics topics and skills.
About one-fourth of the questions were in the free-response format, requiring students to generate and write their answers. These questions, some of which required extended responses, were allotted about one-third of the testing time. Responses to the free-response questions were evaluated to capture diagnostic information, and some were scored using procedures that permitted partial credit. Chapter 2 of this report contains 16 example items illustrating the range of mathematics concepts and processes covered in the TIMSS 1999 test. Appendix D contains descriptions of the topics and skills assessed by each item.
Testing was designed so that no one student took all the items, which would have required more than three hours of testing time. Instead, the test was assembled in eight booklets, each requiring 90 minutes to complete. Each student took only one booklet, and the items were rotated through the booklets so that each item was answered by a representative sample of students.
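The rotated-booklet design described above is a form of matrix sampling. The sketch below illustrates the idea under simplified assumptions: the item counts (162 items, 8 booklets) come from the text, but the block size and rotation scheme are hypothetical, not the actual TIMSS 1999 assembly plan.

```python
# Illustrative matrix-sampling sketch: every item appears in at least one
# booklet, but no single booklet contains all the items.
# NOTE: block size and rotation pattern here are invented for illustration;
# the real TIMSS 1999 booklet assembly followed its own cluster design.

NUM_ITEMS = 162      # mathematics items in the TIMSS 1999 test
NUM_BOOKLETS = 8     # booklets, each requiring 90 minutes

# Group the items into fixed-size blocks.
items = list(range(NUM_ITEMS))
block_size = 9
blocks = [items[i:i + block_size] for i in range(0, NUM_ITEMS, block_size)]

# Give each booklet a window of blocks, wrapping around the block list so
# that neighboring booklets overlap (shared blocks link the booklets).
blocks_per_booklet = -(-len(blocks) // NUM_BOOKLETS) + 1  # ceil + 1 overlap
booklets = []
for b in range(NUM_BOOKLETS):
    start = b * len(blocks) // NUM_BOOKLETS
    chosen = [blocks[(start + k) % len(blocks)] for k in range(blocks_per_booklet)]
    booklets.append([item for block in chosen for item in block])

# Every item lands in some booklet, yet each booklet is far shorter than
# the full test, so each student answers only a subset of the items.
covered = {item for booklet in booklets for item in booklet}
assert covered == set(items)
assert all(len(booklet) < NUM_ITEMS for booklet in booklets)
```

Because each item appears in at least one booklet and booklets are assigned across the student sample, every item is still answered by a representative sample of students even though each student sees only one 90-minute booklet.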
How Does TIMSS 1999 Compare with NAEP?
The National Assessment of Educational Progress (NAEP) is an ongoing program that has reported the mathematics achievement of U.S. students for some 30 years. TIMSS and NAEP were designed to serve different purposes, and this is evident in the types of assessment items as well as the content areas and topics covered in each assessment. TIMSS and NAEP both assess students at the eighth grade, and both tend to focus on mathematics as it is generally presented in classrooms and textbooks. However, TIMSS is based on the curricula that students in the participating countries are likely to have encountered by the eighth grade, while NAEP is based on an expert consensus of what students in the United States should know and be able to do in mathematics and other academic subjects at that grade. For example, TIMSS 1999 appears to place more emphasis on number sense, properties, and operations than NAEP. NAEP appears to distribute its focus more equally across the content areas included in the assessment frameworks.(9)
Whereas NAEP is designed to provide comparisons among and between states and the nation as a whole, the major purpose of the TIMSS 1999 Benchmarking Study was to provide entities in the United States with a way to compare their achievement and instructional programs in an international context. Thus, the point of comparison or benchmark consists primarily of the high-performing TIMSS 1999 countries. The sample sizes were designed to place participants near the top, middle, or bottom of the TIMSS continuum of performance internationally, but not necessarily to detect differences in performance among different Benchmarking participants. For example, all 13 of the participating states performed similarly in mathematics, near the middle of the distribution of TIMSS countries. As findings from the NAEP assessment in 2000 are released, it is important to understand the differences and similarities between the assessments in order to make sense of the findings in relation to each other.
How Do Country Characteristics Differ?
International studies of student achievement provide valuable comparative information about student performance, instructional practice, and curriculum. Accompanying the benefits of international studies, though, are challenges associated with making comparisons across countries, cultures, and languages. TIMSS attends to these issues through careful planning and documentation, cooperation among the participating countries, standardized procedures, and rigorous attention to quality control throughout.(10)
It is extremely important, nevertheless, to consider the TIMSS 1999 results in light of countrywide demographic and economic factors. Selected demographic characteristics of the TIMSS 1999 countries are presented in Exhibit 2. Countries ranged widely in population, from almost 270 million in the United States to less than one million in Cyprus, and in size, from almost 17 million square kilometers in the Russian Federation to less than one thousand in Hong Kong SAR and Singapore. Countries also varied widely on indicators of health, such as life expectancy at birth and infant mortality rate, and of literacy, including adult literacy rate and daily newspaper circulation. Exhibit 3 shows information for selected economic indicators, such as gross national product (GNP) per capita, expenditure on education and research, and development aid. The data reveal that there is great disparity in the economic resources available to participating countries.
How Do the Benchmarking Jurisdictions Compare on Demographic Indicators?
Together, the indicators in Exhibit 2 and Exhibit 3 highlight the diversity of the TIMSS 1999 countries. Although the factors the indicators reflect do not necessarily determine high or low performance in mathematics, they do provide a context for considering the challenges involved in the educational task from country to country. Similarly, there was great diversity among the TIMSS 1999 Benchmarking participants. Exhibit 4 presents information about selected characteristics of the states, districts, and consortia that took part in the TIMSS 1999 Benchmarking Study.
As illustrated previously in Exhibit 1, geographically the Benchmarking jurisdictions were from all across the United States, although there was a concentration of east coast participants, with six of the states and several of the districts and consortia from the eastern seaboard. Illinois was well represented, by the state as a whole and by three districts or consortia: the Chicago Public Schools, the Naperville School District, and the First in the World Consortium. Several other districts and consortia also had the added benefit of a state comparison: the Michigan Invitational Group and Michigan; Guilford County and North Carolina; Montgomery County and Maryland; and the Southwest Pennsylvania Math and Science Collaborative and Pennsylvania.
As shown in Exhibit 4, demographically the Benchmarking participants varied widely. They ranged greatly in the size of their total public school enrollment, from about 244,000 in Idaho to nearly four million in Texas among states, and from about 11,000 in the Michigan Invitational Group to about 430,000 in the Chicago Public Schools among districts and consortia.
It is extremely important to note that the Benchmarking jurisdictions had widely differing percentages of limited English proficient and minority student populations. They also had widely different percentages of students from low-income families (based on the percentage of students eligible to receive free or reduced-price lunch). Among states, Texas had more than half minority students, compared with less than one-fifth in Idaho, Indiana, and Michigan. Among the school districts, those in urban areas had more than four-fifths minority students, including the Chicago Public Schools (89 percent), the Jersey City Public Schools (93 percent), the Miami-Dade County Public Schools (93 percent), and the Rochester City School District (84 percent). These four districts also had very high percentages of students from low-income families. In comparison, Naperville and the Academy School District had less than one-fifth minority students and less than five percent of their students from low-income families.
Research on disparities between urban and non-urban schools reveals a combination of factors, often interrelated, that mesh to lessen students' opportunities to learn in urban schools. Students in urban districts with high percentages of low-income families and minorities often attend schools with higher proportions of inexperienced teachers.(11) Urban schools also have fewer qualified teachers than non-urban schools. In reviewing the U.S. Department of Education's 1994 Schools and Staffing Survey, Education Week prepared a 1998 study on urban education that found that urban school districts experience greater difficulty filling teacher vacancies, particularly in certain fields including mathematics, and that they are more likely than non-urban schools to hire teachers who have an emergency or temporary license.(12) Studies of under-prepared teachers indicate that such teachers have more difficulty with classroom management, teaching strategies, curriculum development, and student motivation.(13) Teacher absenteeism is also a more serious problem in urban districts. An NCES report on urban schools found they have fewer resources, such as textbooks, supplies, and copy machines, available for their classrooms.(14) It also found that urban students had less access to gifted and talented programs than suburban students. Additionally, several large studies have found urban school facilities to be functionally older and in worse condition than non-urban ones.(15)
How Is the Report Organized?
This report provides a preliminary overview of the mathematics results for the Benchmarking Study. The real work will take place as policy makers, administrators, and teachers in each participating entity begin to examine the curriculum, teaching force, instructional approaches, and school environment in an international context. As those working on school improvement know full well, there is no silver bullet or single factor that is the answer to higher achievement in mathematics or any other school subject. Making strides in raising student achievement requires tireless diligence in all of the various areas related to educational quality.
The report is in two sections. Chapters 1 through 3 present the achievement results. Chapter 1 presents overall achievement results. Chapter 2 shows international benchmarks of mathematics achievement illustrated by results for individual mathematics questions. Chapter 3 gives results for the five mathematics content areas. Chapters 4 through 7 focus on the contextual factors related to teaching and learning mathematics. Chapter 4 examines student factors, including the availability of educational resources in the home, how much time students spend studying mathematics outside of school, and their attitudes towards mathematics. Chapter 5 provides information about the curriculum, such as the mathematics included in participants' content standards and curriculum frameworks as well as the topics covered and emphasized by teachers in mathematics lessons. Chapter 6 presents information on mathematics teacher preparation and professional development activities as well as on classroom practices. Chapter 7 focuses on school factors, including the availability of resources for teaching mathematics and school safety.
Each of chapters 4 through 7 is accompanied by a set of reference exhibits in the reference section of the report, following the main chapters. Appendices at the end of the report summarize the procedures used in the Benchmarking Study, present the multiple comparisons for the mathematics content areas, provide the achievement percentiles, list the topics and processes measured by each item in the assessment, and acknowledge the numerous individuals responsible for implementing the TIMSS 1999 Benchmarking Study.
TIMSS 1999 is a project of the International Study Center, Lynch School of Education, Boston College.