Kenji Hakuta


SUPPLEMENTAL DECLARATION OF KENJI HAKUTA

(note: I am indebted to Diane August, Claude Goldenberg, Daria Witt and Mike Broom for their help in preparing this declaration)

1. I am a professor of education at Stanford University. I have previously filed a declaration in this case and a copy of my vita is attached thereto.

2. I was chair of a committee at the National Research Council (NRC), which is the operating arm of the National Academy of Sciences, the Institute of Medicine and the National Academy of Engineering, that examined the research base on the education of limited-English proficient children, referred to in the report as English-language learners. The nine-member committee was appointed by the NRC to represent knowledge from diverse disciplines. The NRC committee included notable scholars in the areas of linguistics, psychology, sociology, program evaluation, assessment, teacher education, and effective schooling. Committee members, as is customary with National Academy committees, also represented a diversity of perspectives regarding the most effective means of educating language minority children. To ensure that the report conformed to the highest standards of scientific rigor in an area that has been very politicized, the report was peer reviewed by nine outside experts prior to publication, and the review process was overseen by an external monitor (also appointed by NRC). Charles Glenn, in his deposition, identifies himself as one of the nine reviewers. In their declarations for the defendants, several experts refer to this report. These include Porter (paragraphs 9 and 10), Rossell (footnote 8), Gersten (paragraph 21), and Glenn (paragraph 9).

3. In this declaration, I will make four points.

Point 1. Declarants' citations regarding the NRC report misrepresent its main findings. They claim the NRC report indicates use of the native language is an ineffective instructional strategy, whereas in fact, the report finds this technique to be an effective approach. Declarants espouse an extremely selective, narrow and limited view of the education of language minority students which fails to address access of students to academic content. They have focused on whether native language use results in better or worse outcomes than some form of English-only program, and within that, on evaluation studies that define outcomes as English language and reading, and occasionally, mathematics. They have not addressed the essential areas of subject area knowledge and skills across the content areas (e.g., science and social studies).

Point 2. I will address the question of whether there is a sound theoretical basis on which structured English immersion programs in California can be developed. I will examine declarants' claim as it relates both to programs in other countries as well as in the United States. My conclusion will be that there is no defensible theory base to the programs prescribed by Proposition 227.

Point 3. I will examine the extent to which programs resembling those proposed under Proposition 227 will be successful in ensuring that LEP students learn English and attain high levels of subject matter skills and knowledge. I will show that the outcomes for students placed in programs similar to those proposed by Proposition 227 are alarmingly poor, hardly worthy of state-wide prescription, and harmful to students.

Point 4. I will point to several major misrepresentations of research in the declarations offered by the defendants.

Declaration relative to Point 1: Misrepresentation of the NRC Report Findings.

4. On the question of effectiveness of programs that use the native language to educate English language learners, the defendants either misconstrue or fail to understand the NRC report's conclusions. Indeed, the NRC report found evidence in favor of native language use. It is inaccurate to characterize the report as inconclusive on this point.

The committee focused its review on the major evaluation studies conducted over the past 20 years including three large-scale national evaluations of programs as well as five key reviews of smaller-scale program evaluations.

5. The three large-scale national evaluations of programs for English-language learners are: (1) the American Institutes for Research evaluation of programs, referred to as the AIR study (Danoff, 1978); (2) the National Longitudinal Evaluation of the Effectiveness of Services for Language Minority Limited English Proficient Students, referred to as the Longitudinal Study (Development Associates, 1984; Burkheimer et al., 1989); and (3) the Longitudinal Study of Immersion and Dual Language Instructional Programs for Language Minority Children, referred to as the Immersion Study (Ramirez et al., 1991). The committee also reviewed the report of a prior National Research Council committee that thoroughly examined the Longitudinal Study and the Immersion Study (Meyer and Fienberg, 1992).

6. Five key reviews of smaller-scale program evaluations include: (1) Baker and de Kanter's (1981) review of 28 studies of programs designed for English-language learners with evaluations considered to be methodologically sound; (2) Rossell and Ross (1986) and (3) Rossell and Baker's (1996) review of studies that evaluated alternative second-language programs which had random assignment to programs or statistical control for pretreatment differences between groups when random assignment was not possible; (4) Willig's (1985) meta-analysis of studies reviewed by Baker and de Kanter (in contrast with previous reviews, her analysis quantitatively measured the program effect in each study, even if it was not statistically significant); and (5) the U.S. General Accounting Office's (1987) survey of 10 experts in the field to gauge the effectiveness of bilingual education programs (this methodology was quickly dismissed as unsound).

7. Based on the review of these evaluation studies, the NRC committee found evidence for the beneficial effects of native-language instruction. The Committee made special note of the merits of Willig's meta-analysis. Meta-analysis is a well-recognized statistical procedure that is more robust and sophisticated than simple tallying of the numbers of studies that support or do not support bilingual education, which is the approach used by Baker and de Kanter, and by Rossell and Baker. Meta-analysis takes into account two key statistical characteristics of the studies being summarized: the size of the effect and its statistical robustness (which is mostly a function of the size of the sample in the study, and the variability of the individual scores within the sample). An overall test of statistical significance is then performed summing across all studies.
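The contrast between tallying and meta-analysis can be illustrated with a minimal sketch. The Python below assumes a fixed-effect, inverse-variance pooling model, which is one common form of meta-analysis and not necessarily the procedure Willig used. The five effect sizes and standard errors are hypothetical values chosen for illustration and are not drawn from any study cited in this declaration.

```python
import math

# HYPOTHETICAL data: (effect size in standard deviation units, standard error).
# These numbers are illustrative only and come from no cited study.
STUDIES = [(0.25, 0.20), (0.15, 0.18), (0.30, 0.25), (0.10, 0.15), (0.28, 0.22)]

Z_CRIT = 1.96  # two-sided 5% critical value of the standard normal distribution


def vote_count(studies, z_crit=Z_CRIT):
    """Tally the studies whose own effect is significantly positive."""
    return sum(1 for effect, se in studies if effect / se > z_crit)


def fixed_effect_meta(studies):
    """Pool effects with inverse-variance weights (fixed-effect assumption).

    Returns the pooled effect, its standard error, and the z statistic
    for the overall test of significance across all studies.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, pooled / pooled_se


if __name__ == "__main__":
    pooled, pooled_se, z = fixed_effect_meta(STUDIES)
    print("studies significant on their own:", vote_count(STUDIES))
    print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}, z = {z:.2f}")
```

With these hypothetical inputs, no single study reaches significance on its own, so a tally counts zero supporting studies, yet the pooled estimate is significantly positive. Precise studies receive more weight, which is how the size of the effect and its statistical robustness both enter the overall test.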

8. Subsequent to the publication of the report, Dr. Jay Greene of the University of Texas at Austin conducted a meta-analysis using the studies reviewed by Rossell and Ross. After eliminating studies that were invalid for inclusion by Rossell and Ross, he found a statistically significant effect in favor of bilingual education. He reports an average effect size of .21 standard deviation units in English reading, and .11 standard deviation units in math, measured in English. Overall, the results are statistically significant for reading, but not significant (though approaching significance) for math. The most likely reason his analysis found effects where Rossell and Ross did not is that meta-analysis is a more finely tuned method of analysis. (It is like measuring the 100-meter dash with a stopwatch instead of a sand clock. The sand clock is only effective in detecting a difference if there are huge differences among runners, or if a very large number of races are measured and averaged.)

9. Dr. Greene notes, however, that 1/5 of a standard deviation is small compared to the overall gap between LEP students and the national average. This gap is typically about a full standard deviation in size. Thus, one might say that use of the native language takes care of 1/5 of the task of effectively educating LEP students. This is consistent with the overall conclusion of the NRC report, which states that many other things need to happen to effectively educate these children: improving teachers and teaching, focusing on academic content, putting accountability into place, and involving parents.

10. The NRC report also found advantages of native language use in its rigorous review of empirical studies of schools and classrooms that were considered effective and exemplary. We identified and reviewed the entire set of research studies on effective schooling conducted over the past 20 years. The advantages of native-language use are a prominent theme among these studies, either explicitly (e.g., Henderson and Landesman, 1992; Hernandez, 1991; Muniz-Swicegood, 1994; Lucas et al., 1990; Berman et al., 1995; Rosebery et al., 1992; Tikunoff, 1983; Pease-Alvarez et al., 1991; Calderon et al., 1996) or implicitly (Carter and Chatfield, 1986, and Goldenberg and Sullivan, 1994, both of which took place in school settings where there was a firm commitment to bilingual education).

11. However, as previously noted, the studies point to many additional individual, school and classroom factors that influence the achievement of English language learners. The report carefully examined what we know from cognitive science and developmental psychology about an individual's development of language and especially cognition. This is important because educating language minority children should be not just about language acquisition, but also about learning the academic curriculum. There are also many variables and dimensions of schooling that influence LEP children’s (indeed, all children’s) academic achievement, aside from language of instruction per se. These factors include: a supportive but challenging school-wide climate; effective school leadership; articulation of explicit learning goals and outcomes for students; coordination of programs and instruction within and between schools; effective explicit skills instruction; opportunities for student-directed activities; instructional strategies that enhance understanding; opportunities for practice; systematic student assessment; staff development; and home and parent involvement. Among the most important conclusions of our report—which the defendants’ declarations completely ignore—is that to be truly effective, programs for LEP students must be designed with these principles in mind. It is my belief and that of the NRC committee, based on our expertise and knowledge of the entirety of the research on bilingual children and school programs and policies, that the key to program improvement is not in finding a program that works for all children and all localities, or finding a program component (such as native language instruction) that works as some sort of "magic bullet," but rather finding a set of program components that works for the children in the community of interest, given the goals, demographics, and resources of that community. 
Native language instruction is one of the program components educators must be free to use in constructing effective programs. By depriving educators of this component, Prop. 227 summarily removes one of many useful tools that can be used to improve learning outcomes for these students. Prop. 227 represents a meat-cleaver approach to a challenge that requires much greater subtlety, complexity, and depth of understanding.

12. In sum, the NRC report finds that on average, bilingual education programs are more effective than English-only programs. However, there are many other important factors that influence student outcomes. There is much more work left to do by the schools if we are to enable LEP students to achieve at high academic levels. Improvement would have to focus on teachers, teaching, academic content and standards, accountability, school-wide leadership, program integration, parent involvement—and effective use of the native language to assure high level and meaningful learning for all students from the time they enter school. Proposition 227 removes an important tool -- use of the native language -- from the hands of educators. It would only serve to make even more difficult the challenges of school improvement.

Declaration relative to Point 2: The Theoretical Basis for Structured English Immersion.

13. By theoretical basis, I mean the availability of data or published reports that contain (1) a theory and accompanying objective descriptions of the policies, practices and the populations served by the program, and (2) outcomes of the program, such as English language development, content area development, and redesignation rate.

International Theory Base for Structured English Immersion.

14. Porter and Glenn in their declarations refer to examples from other nations as providing a theoretical basis for Proposition 227. During the political debate, experiences from European nations as well as Israel were frequently evoked. I will now examine the claims made on these grounds. I am not an expert on international comparisons in education, although I follow the literature with great interest. I did, however, serve as an expert consultant on an international team at the Organization for Economic Cooperation and Development (OECD) in 1988-1989 that looked at innovative practices in educating immigrant students in Australia, Canada, the Netherlands, Norway, Sweden, Switzerland, Yugoslavia, and the United States. This was a case study approach and examined a broad range of practices. Very few of these cases examined educational achievement due to the unavailability of data, and the member nations were interested in such innovations precisely because of the high failure rates of immigrant students in their schools through traditional programs.

15. Charles Glenn argues that the program advocated by Proposition 227 is a "well-established approach, a sound educational practice, which has been used for over 30 years by more than a dozen nations with excellent educational systems, and by thousands of schools in the United States" (paragraph 18). He bases his observation on a book he co-authored with Ester de Jong, Language Minority Children in School: A Comparative Study of Twelve Nations, which I have read in manuscript form. The book is a broad description of how 12 Western European and English-speaking nations have responded to a tremendous increase in their immigration rates. A major question of the study was how each nation has developed its educational system to provide equal educational opportunities to its diverse immigrant student population. The nations are Australia, Belgium, Canada, Denmark, France, Germany, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United States.

16. Glenn declares that there is close similarity between the types of programs in the 12 nations studied in his report and the structured immersion program supported by Proposition 227. Although some of the educational programs of the 12 nations do share some characteristics of structured immersion programs, there are still a number of significant differences between programs. Furthermore, although all the nations offer both bilingual and immersion programs, the extent to which each nation relies on one or the other varies. For example, Australia and Denmark place a greater emphasis on bilingual education, whereas Belgium and Switzerland place a greater emphasis on immersion programs. Probably the most significant difference between these nations' systems and Proposition 227 is that all 12 nations offer both bilingual and immersion programs, whereas Proposition 227 supports only one approach, structured immersion.

17. Glenn declares that there is a "norm for the schooling of immigrant and other language-minority children in every country except the United States" (paragraph 11) that consists of a one-year period of intensive language instruction that could serve as a theoretical basis for Proposition 227. Careful reading of his book shows that this is a vast oversimplification and that Glenn is much more of a balanced scholar than his statement would indicate. In fact, it is clear from his book that each nation provides a very different type of educational system for its immigrant student population.

18. Among the 12 nations, there is also very little agreement on how long immigrant students should remain in immersion or bilingual programs before being mainstreamed into regular classes. Glenn indicates that many of the nations have created goals of mainstreaming immigrant students in immersion programs after three months to three years. There is no discussion of the length of time immigrant children remain in bilingual programs before being mainstreamed.

19. A seriously misleading assertion in Glenn's declaration is that structured immersion programs have been proven effective. In his report, he admits that there is "meager evidence--mostly anecdotal--that exists for the effectiveness of each approach [meaning both immersion and bilingual programs]." One study that he does cite evaluating an immersion program actually shows high rates of student failure. It examines what students in a one-year immersion program are doing four years after they were mainstreamed: "After four years, 32 percent of the group has continued their schooling without retention in grade, 38 percent had repeated one year, 8 percent two years, and 21 percent had dropped out of school."

20. In his declaration, Glenn uses only one reference to outcomes of structured immersion programs (paragraph 15) and he seriously misrepresents the point of the referenced material. He quotes E.D. Hirsch: "the initial gap between advantaged and disadvantaged students, instead of widening steadily as in the United States, decreases with each school grade. By the end of seventh grade, the child of a North African immigrant who has attended two years of French preschool (école maternelle) will on average have narrowed the socially induced learning gap." But this statement about the effectiveness of the program is taken totally out of its context, in which Hirsch is actually discussing how effective a strong core curriculum (such as the one in France) is for both advantaged and disadvantaged students. Thus, instead of supporting Proposition 227, as Glenn intends, these results show the educational risks of Proposition 227, which ignores content curriculum and replaces it with language instruction.

21. Porter in her declaration refers to Israel as another example of a successful program resembling Proposition 227 (paragraphs 12-13). Israel offers three educational programs for incoming immigrant students: a three-month intensive second language program, a one-year intensive second language program (perhaps similar to Proposition 227), and placement of students in mainstream classes with pull-out classes in Hebrew. None of these approaches addresses academic content material. A recent study by the Ministry of Education showed an extremely high dropout rate among immigrants in the 1996-1997 school year: "89.9 percent of non-immigrant 17-year old Jews were in school, as opposed to 66.4 percent of immigrants in the same age group. In other words, one-third of all 17 year-old immigrants were not in school."

22. Glenn in his book notes that over the last few decades, all of the nations have switched back and forth between using bilingual and immersion programs due to poor student achievement. Germany, for example, has changed the emphasis of its educational program used to instruct immigrant students four times since 1964. Thus, it is inappropriate to consider the numerous and varied educational systems of the 12 nations Glenn studied as one "well-established approach." It would be more appropriate to characterize each nation as still searching for an educational approach to effectively instruct its immigrant student population.

23. I conclude from my reading of Glenn's work, as well as from my own readings in the broader research in this area and my experience with the Organization for Economic Cooperation and Development (OECD), that research in other countries does not provide a sound theoretical basis for Proposition 227. First, there is no well-established immersion approach common to all countries. Second, these countries have not developed successful means to educate their immigrant children, who tend to drop out of school and fail to learn either content or language. Third, the individual international studies cited in support of structured immersion are misrepresented, and actually show alarming failure rates.

U. S. Theory Base on Structured English Immersion.

24. Porter refers to several structured immersion programs as extant examples of those envisioned by Proposition 227. However, our review of these programs finds they are very different from those that would be created by Proposition 227. They use some native language and provide flexibility to local districts and schools to design programs that meet their local needs. In those programs where data is available, students remain for more than one year. Most importantly, we know of no data from these programs that can establish that academic content can be learned successfully through a language a child does not know.

25. The other declaration claiming the research-based existence of Structured English Immersion Programs is from Gersten. In paragraph 45, he writes: "Students can access the core curriculum in mainstream classrooms while they are receiving supported instruction in English using specialized techniques such as those described and validated in the research of Elba Reyes and Candace Bos, Janette Klingner and Sharon Vaughn, Anna Chamot and Michael O’Malley, William Saunders, Valerie Anderson and Marsha Roit." These authors all contributed to a special issue of the Elementary School Journal, of which he is the editor. However, with the exception of Saunders’ research, these techniques have NOT been validated as means for permitting students to "access the core curriculum in mainstream classrooms." Goldenberg, writing in that same issue of the journal, reviewed all of the research, and says in his commentary focusing on these very researchers’ studies:

"What is needed now is clear-cut evidence of effects for programs and strategies suggested by these authors. This same need exists for related approaches currently receiving widespread attention. For example, advocates of "sheltered English," sometimes called "specially designed academic instruction in English (SDAIE)," say this set of techniques . . .can be used to teach intermediate or advanced LEP students challenging content in English . . . As compelling as many of these recommended practices are, there are still many questions about implementation . . and effects. . . . Although all the articles can claim substantial theoretical and/or practical foundation, there is still a significant need for assessment and evaluation data." (p. 369)

26. Of the studies mentioned by Gersten, the one that has been validated in terms of measured student outcomes is Saunders. However, this research has been conducted in the Los Angeles USD and in schools with strong commitments to primary language instruction. Saunders’ work shows that given the academic and literacy foundation provided by reasonable primary language instruction in the first years of schooling, students can be helped to make a more successful transition to literacy and academic study in English.

27. In paragraph 20, Gersten declares that Ramirez found no difference between immersion and bilingual education models. That is not true. Ramirez found an effect in favor of Early Exit Bilingual over Immersion, a finding that was validated by the Meyer and Fienberg report from the National Research Council.

28. In paragraph 35, Gersten declares that Greene’s analysis found no differences between bilingual programs and English-only programs. That is contrary to Greene’s report. (See declaration of Jay Greene.)

Declaration Relative to Point 2: The Time on Task Theory

29. Both Rossell (paragraphs 38-39) and Porter (paragraph 20) use "time-on-task" theory to support structured English immersion, and to question the effectiveness of bilingual education. Time-on-task is a theory that is based on a straightforward input-output model. It says that there is a direct and causal relationship between the amount of time spent on input (instruction and learning) and the level of output accomplished (knowledge and skills accrued). Time-on-task is a theory that comes from the brain and behavioral sciences, especially the fields of animal behavior, experimental psychology, and cognitive science/artificial intelligence. My academic training and most of my early research have been in the areas of experimental psychology and cognitive and language development, and so I am familiar with various versions of time-on-task theory. Porter characterizes time-on-task as "the more time spent learning a subject, the better that subject will be learned." Rossell characterizes it as: "whatever advantage one gains from having the core curriculum explained in your so-called ‘primary’ language is offset by the reduced instructional time that of necessity must come about because of the need to squeeze Spanish language arts into a fixed day." The time-on-task model that Rossell and Porter describe is based on the traditional learning theories of John Watson and B. F. Skinner developed in the 1940s and 1950s. This can usually be described through linear models, such as the familiar equation from high school algebra, y = mx + b, where m is the slope, b is the zero-intercept, x is the input, and y is the output. This traditional model of learning is no longer accepted by learning scientists because it does not work.
Currently viable models of learning are cognitive theory, which predicts a non-linear, metamorphosis-like type of development (such as the stages of development of a butterfly), and brain-maturational theory, which predicts innate knowledge and essentially says that minimal input is necessary to accomplish the output. There are many variations of each of these theories, but I need not trouble the court with these. My point is that no serious cognitive or brain behavioral scientist since the late 1970s has subscribed to traditional learning theory. There is simply too much evidence against it. Steven Pinker, the widely-respected Director of the Center for Cognitive Neuroscience at MIT and a leading proponent of the brain-maturational theory, has written two entire books that lay learning theory to rest. The only debate remaining is between various versions of cognitive theory and brain-maturational theory. The question of learning is not how much, but when and in what sequence.

30. Porter, in the same paragraph in which she describes time-on-task theory, also describes the "critical age hypothesis" that "the optimal time to learn a second language is between age three and five or as soon thereafter as possible, and certainly before the onset of puberty". This is curious, because this hypothesis is founded on a central claim that time-on-task is not important! The critical period hypothesis, which comes from ethology and the brain sciences, says three things about learning: (1) exposure has to happen within a given time (often said to be before puberty for a second language), (2) within that critical age period, even a short amount of exposure is sufficient, and (3) outside of that critical period, even a large amount of exposure is not sufficient. The key evidence over which researchers in the field have been fussing is the finding that if learning happens within a critical period, the amount of exposure is irrelevant. Currently, there is vigorous debate over whether this applies to second language acquisition. Clearly, Porter is confused about the fundamental relationship between theory and data about critical periods as they apply to second language acquisition. If she simply means "the earlier the better," that is at least a debatable issue, but that has nothing to do with a critical period. The earlier one is exposed to English, the better; but the same is true of exposure to any school subject matter.

31. In any event, the time-on-task theory has serious limitations; indeed, both Drs. Rossell and Baker concede that time spent exclusively in English for non-English speakers appears to be counter-productive to learning English. (Rossell & Baker, "The Emperor Has No Clothes," at 61; "Baker, Ramirez, et al.: Misled By Bad Theory" at 64.)

Declaration Relative to Point 3: The Probable Outcomes of Proposition 227 on Student Learning.

32. Porter declares in paragraph 22 that "learning subject matter content in a second language can begin to occur in a matter of weeks, starting with the subjects that can be partially understood through symbols (mathematics), active experiments and demonstrations (science), and progressing to the social science". This is correct only in the most trivial sense, and it utterly fails to meet even the most watered-down interpretation of academic standards.

33. I will offer a specific example from the Idea Proficiency Test (IPT), a commonly used test of English language proficiency, which assesses students in the areas of vocabulary, oral comprehension, grammar, and oral production. Performance on the IPT is divided into 6 levels (A through F). Based on the test level summary provided by the publishers of the test, the following characterizations can be made of performance at the different levels:

Level A: essentially cannot perform at any level.

Level B: a student can 1) tell his or her name and age; 2) identify family and common school personnel, classroom objects, basic body parts, common pets, and fruits; 3) use the present tense of the verb "to be"; 4) use regular plurals; 5) answer simple "yes/no" questions appropriately; 6) follow simple directions involving basic positions in space.

Level C: a student can 1) identify common occupations, clothing, farm animals, and foods; 2) express himself or herself using the present progressive tense (he or she is working) of common verbs; 3) use negatives and subject pronouns correctly; 4) use mass nouns (some glue, not a glue) appropriately; 5) follow the teacher’s directions related to identifying positions on a page; 6) repeat simple sentences correctly; and 7) comprehend and remember major facts of a simple story.

Level D: a student can 1) identify modes of transportation and household items; 2) name the days of the week; 3) describe common weather conditions; 4) use possessive pronouns correctly; 5) ask simple future tense questions, and so forth.

A full text of the summary sheet provided by the publisher is appended in Exhibit A.

33. One can reasonably ask two questions: (1) how quickly do students progress through these levels of English acquisition? and (2) what is a reasonable level of English at which one can assume that meaningful learning of the academic curriculum is taking place? I pursue these two questions below.

34. On the issue of expectations about the progress of students through levels of English acquisition, I published a paper in 1974 ("A preliminary report on the development of grammatical morphemes in a Japanese child learning English as a second language." Working Papers in Bilingualism, 3, 18-38. Reprinted in E. Hatch (Ed.). Studies in Second Language Acquisition. Rowley, Mass.: Newbury House Publishers, 1979). It is one of the foundational studies in the field of second language acquisition, which serves as the theoretical base upon which tests such as the IPT, the Bilingual Syntax Measure, and the Language Assessment Scales were developed. In that paper, I reported on a case study of the daughter of a Japanese visiting scholar family at Harvard who enrolled in kindergarten and received what would today be considered pull-out ESL support. Despite her privileged background, parents who were proficient in English, and being in a school where she was one of only a few English language learners, she went through a silent period of at least 5 months. She then began her journey towards full English production in April, at which point I began intensive tape recordings of her language. However, even this child took 2 years before she had a strong command of critical aspects of language, such as number, tense, conditionals, and complex sentences. Based on my knowledge of the literature on second language acquisition, including many longitudinal studies conducted under a paradigm similar to mine, I conclude that for the vast majority of children, grammatical control of English as a second language takes considerably longer than one year, even under ideal circumstances. I know of no longitudinal case study in which grammatical control is established in a period shorter than one year.

35. There is also strong evidence based on cross-sectional studies that basic English language acquisition takes three to five years on average. One graphic example comes from Canada, based on a study of 1,200 immigrant students in the Toronto Board of Education. These students were not in bilingual education programs but in English-only programs, and were given a picture vocabulary test and a test of English grammar (Wright, E. & Ramsey, C. A., 1970, ‘Students of non-Canadian origin: Age on arrival, academic achievement and ability’. Research Report #88, Toronto Board of Education). I have replotted their data separately for 5th, 7th, and 9th graders, as a function of their length of residence in Canada. This is shown in Exhibit B. The developmental curves (expressed in English deficit from native English comparisons) rise until 5 years of exposure, at which point they flatten out. These conclusions are strongly supported in a number of other studies, including a recent study by Harold Klesmer (‘E.S.L. Achievement Project: Development of English as a Second Language Achievement Criteria as a Function of Age and Length of Residence in Canada.’ North York Board of Education, September, 1993) of a randomly selected sample of 285 ESL students and 43 control native English students who were 12 years old. Using an impressive array of standardized tests of language proficiency, he showed developmental curves that tell a similar story.

36. Finally, and perhaps most importantly, one can look to sources of data closest to the students to be impacted by Proposition 227 in order to understand the likely harm these students will experience. Defendants offer a declaration from the Westminster School District in Orange County, which provides a program supposedly similar to Proposition 227. In their declaration, the Superintendent for the district, Dr. Barbara DeHart, reports in Paragraph 13(c) that "Students with IPT scores in levels A-E increased by an average 1.1 IPT level." Presumably this means the average student increased one language level (of 6 in total, A-F) per school year. We have obtained from the State Department of Education data supplied by the school district on the English language development of LEP students in that district, showing the distribution of IPT levels of the students, and the percentage of students at each level who developed 1 or more levels:

ELD Level (IPT) Pretest	Number of Students	Number Progressed ≥ 1 Level	% Progressed ≥ 1 Level
A	329	261	79%
B	386	280	73%
C	206	161	78%
D	156	122	78%
E	120	88	73%
F	-	-	-
Total	1197	912	76%

37. Several things are noteworthy about Westminster’s data, particularly the use of it to show that the district’s all-English "program is successful in overcoming language barriers" (Westminster declaration; para. 13).

38. The average LEP student in Westminster gains slightly more than one (1.1) language level per year of instruction. This means that if a student begins school in first grade at language level A (i.e., a non-English speaker unable to function in English at any level), she or he will require nearly 3 years to be at level D, which IPT test developers (IPT 1 Oral, Grades K-6, English forms C & D) designate as "limited English speaking," and an additional 2 years to become a fluent English speaker. Even on the face of it, Westminster’s data appear to support the proposition that achieving English fluency requires approximately 5 years: A non-English speaker entering 1st grade will become "limited English proficient" in late 3rd/early 4th grade and will not become a fluent English speaker until around the end of 5th grade.
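The arithmetic behind this projection can be sketched as follows. This is only an illustration, assuming (as the paragraph above does) a constant gain of 1.1 IPT levels per year and equal spacing between the six levels A-F. The function name `years_to_reach` is hypothetical and is not drawn from any district or test-developer material.

```python
# IPT oral levels as named in the declaration, from lowest (A) to fluent (F).
LEVELS = ["A", "B", "C", "D", "E", "F"]

# Average yearly gain reported in the Westminster declaration (para. 13(c)).
GAIN_PER_YEAR = 1.1


def years_to_reach(start, target, gain_per_year=GAIN_PER_YEAR):
    """Years needed to move from `start` to `target`, assuming a constant gain.

    Assumption: progress is linear, which is a simplification used only
    to reproduce the back-of-the-envelope projection.
    """
    steps = LEVELS.index(target) - LEVELS.index(start)
    return steps / gain_per_year


years_a_to_d = years_to_reach("A", "D")  # non-English to "limited English speaking"
years_d_to_f = years_to_reach("D", "F")  # "limited English speaking" to fluent
years_a_to_f = years_to_reach("A", "F")  # full path from level A to level F

print(f"A to D: {years_a_to_d:.1f} years")
print(f"D to F: {years_d_to_f:.1f} years")
print(f"A to F: {years_a_to_f:.1f} years")
```

Under these assumptions the projection comes to roughly 2.7 years to reach level D and roughly 4.5 years to reach level F, consistent with the "nearly 3 years" and "approximately 5 years" stated above.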

39. However, we suspect the story in Westminster is more complex than this. Inspection of the above table reveals that nearly 60% of the LEPs who progressed one or more language levels were functioning at the very lowest levels of English proficiency. Only a minority (23%) moved into the highest levels (E & F). The district superintendent’s claim that on average LEP students progress 1.1 language levels needs therefore to be interpreted in light of the fact that a disproportionate amount of that progress takes place at the lowest levels of English proficiency. Relatively few students are functioning or moving into the highest levels of English language proficiency. This of course helps explain the approximately 5% per year LEP to FEP redesignation rate in Westminster, which is no different from the statewide figure opponents of bilingual education use to argue that bilingual education has failed.
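The two shares cited in this paragraph can be recomputed from the Westminster table. The sketch below is illustrative only. It assumes that "the very lowest levels" means pretest levels A and B, and that students who "moved into the highest levels (E & F)" are those who started at D or E and progressed at least one level. Both readings are assumptions made for this example.

```python
# Counts copied from the Westminster table above: pretest level -> students.
tested = {"A": 329, "B": 386, "C": 206, "D": 156, "E": 120}
progressed = {"A": 261, "B": 280, "C": 161, "D": 122, "E": 88}

total_tested = sum(tested.values())
total_progressed = sum(progressed.values())

# Overall share of students who gained at least one level.
overall_rate = total_progressed / total_tested

# Assumption: "very lowest levels" = pretest levels A and B.
lowest_share = (progressed["A"] + progressed["B"]) / total_progressed

# Assumption: a student who starts at D or E and gains one or more levels
# ends the year at E or F.
highest_share = (progressed["D"] + progressed["E"]) / total_progressed

print(f"Progressed overall: {overall_rate:.0%}")
print(f"Progressors who started at A or B: {lowest_share:.0%}")
print(f"Progressors who moved into E or F: {highest_share:.0%}")
```

With these readings, the counts yield about 59% and 23%, matching the "nearly 60%" and "23%" figures in the text.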

40. The implications of this pattern and timing of language development (in a district, we would emphasize, that claims already to be using "structured immersion"; para. 6), and how 227 would at a minimum delay LEP children’s learning in the academic domains, become evident if we look carefully at California content standards in a given subject. For illustrative purposes, we will use California’s science standards (CA Dept. of Education, 1990). Contrary to Porter’s specious declaration that subject matter learning in areas such as science can begin within weeks of second language learning, science learning as outlined by the state framework presupposes a high level of functional English proficiency. The English language levels required for meaningful conceptual learning to take place are far higher than the defendants admit or that this Proposition would permit LEP students to attain before subjecting them to what in essence remains a sink or swim instructional environment.

41. The California Science Framework presents a high-level and rigorous vision for science education. Among the many concepts outlined as central to its study and understanding are hypothesis, law, theory, fact, technology, energy, evolution, scale and structure, stability, and systems and interactions. Although the framework advocates the use of various teaching strategies, such as demonstrations and experiments, it also encourages teachers to engage students in learning activities that require high levels of language proficiency. For example: In developing science concepts, a teacher should: (1) pose questions to determine what ideas students hold about a topic before beginning instruction; (2) be sensitive to and capitalize on students’ conceptions about science; (3) employ a variety of instructional techniques to help students achieve conceptual understanding; and (4) include all students in discussions and cooperative learning situations (CA Dept. of Education, 1990, p. 3).

42. Even a cursory reading of the framework reveals how language-dependent science instruction is and, accordingly, how disadvantaged LEP students will be in a science program that conforms to the standards of the state’s framework. We return to our hypothetical Westminster student, who progresses slightly more than one language level per year. The table below correlates the student’s grade, likely IPT-determined language level at that grade, examples of oral language skills at that level, and excerpts from the California Science Framework outlining standards for conceptual understanding in science at early (K-3rd) and late (3rd-6th) elementary school. It is crucial to note that although a skilled teacher could produce demonstrations and concrete experiences that would help students learn some of the vocabulary and perhaps even the concepts at a superficial level, it is extremely difficult to see how students at language levels A-E, and especially at the lowest 3 levels of English proficiency, could meaningfully be expected to learn the content outlined in the framework and do so in a language they understand tenuously. If LEP students are already familiar with the subject matter content outlined in the last column, they can certainly learn aspects of the corresponding English vocabulary and how to talk about the concepts in English; perhaps this is what Dr. Porter means. But it is impossible to see how students can simultaneously learn these very challenging concepts presented in a language of which they have limited understanding.

43. The unreasonable difficulties presented to LEP students, depicted in the table below, are further compounded when we consider that the Westminster data include only students in grades 2 through 7. In other words, the large numbers of students at the lowest levels of language proficiency are not first graders; they are at least in 2nd grade. The challenges these students face are even more severe than the table depicts.

Grade	Probable IPT level	Sample oral language skills at this level	Excerpts from Science framework	Sample Content and Performance Standards
1	A (non English speaking)	Fewer than half the skills in level "B"	(Kindergarten through grade 3): "Forms of energy can be classified in several ways, depending on our purposes. Energy is manifested when we drop a ball, strike a match, make waves in a bathtub, clap our hands or rub them briskly together, or turn on a flashlight. Each form of energy has its own characteristics. For example, a given material will transmit some forms of energy and absorb or reflect others. A sheet of thick paper transmits sound but not light. A stretched sheet of plastic wrap transmits light but not water waves. . . . Energy is required when work is done on a system or when matter changes its form."	(Kindergarten through grade 2): Students identify forms of energy that are observable as light, heat, sound, and motion. Examples of types of work students should be able to do: Observe and describe the differences between striking a cup or not striking a cup when it is placed against the ear. Observe and describe how the light from a flashlight is affected by placing a piece of black construction paper or some clear object over it. Using a solar oven, students will identify forms of energy involved in cooking and will describe these forms using pictures and words.
2	B (non English speaking)	Tell name and age; identify family and common school personnel, classroom objects, basic body parts, common pets; use present tense verb "to be"; use regular plurals; answer simple "yes/no" questions; follow simple directions involving basic positions in space.
3	C (non English speaking) (D toward end of grade 3)	Identify common occupations, clothing, farm animals, foods; express self using the present progressive tense (he or she is working); use negatives and subject pronouns; use mass nouns (some glue, not a glue); follow directions related to identifying positions on a page; repeat simple sentences; comprehend, remember major facts of a simple story
4	D (limited English speaking)	Identify modes of transportation and household items; name the days of the week; describe common weather conditions; use possessive pronouns; ask simple future tense questions; understand, express comparative and quantitative concepts; repeat complex sentences; express creative thoughts in complete sentences	(Grades 3 through 6): "Energy passes through ecosystems in food chains mainly in the form of the chemical energy supplied to each organism by the nourishment it consumes. All organisms convert some of this energy into heat. Animals also convert some of it into mechanical energy. Green plants convert light energy into chemical energy by means of the photochemical process called photosynthesis."	(Grades 3 through 5): Students demonstrate an understanding that energy can flow into and out of a system causing measurable changes. Examples of types of work students should be able to do: Describe and compare input and output with various forms of energy used at recess or in the cafeteria during lunch. Compare the effects of various amounts of light on plants grown in sunlight, in artificial light, and in the dark. Students will observe how a meal is processed through the digestive system .... They will trace the flow of energy from the sun to the various food and packaging components, through the human body systems.... Students will produce a multimedia product that incorporates both visual and written forms of communication to present their results.
5	E (limited English speaking) (F toward end of grade 5)	Identify content area vocabulary; use superlatives and past tense; understand and name opposites; ask past tense questions; discriminate differences in closely paired words; describe and organize the main properties of common objects
6	F (fluent English speaking)	Use conditional tense verbs; discriminate fine differences in closely paired words; comprehend and predict the outcome of a story; recall and retell the main facts of a story; share meaningful personal experiences

Declaration Relative to Point 4: Major Misrepresentations of Research in Declarations

44. There is much to critique about the use of research by the defendants. However, several instances are egregious and cannot escape mention as examples of the defendants’ eagerness to bend the truth. Porter makes use of an evaluation study conducted by the New York City Board of Education in 1994 that compared students in bilingual education programs with those in ESL programs, to support her claim that "there is an abundance of evidence that native language instruction programs ... have produced disappointing results most of the time" (paragraph 11). The findings of this study were described in an article by Barbara Mujica in the Fall 1995 issue of READ Perspectives, a journal edited by Porter. The New York City study, and Mujica’s writeup of it, were described in the following way by my NRC committee: "Recently, extensive publicity has been given to an evaluation of the New York City bilingual program (Board of Education of the City of New York, 1994). That study compared the exit rates (how long children stayed in the program) and achievement of students in ESL and bilingual programs. The comparisons made between the two programs were seriously confounded with native language: most of the students in the bilingual program had Spanish as a native language, while the students in ESL had other language backgrounds. Ironically, the New York City study carefully documented the native-language confound, but made no attempt to control for this variable or for other confounds (e.g., socioeconomic status). In the preface, the authors describe the results of the study as ‘preliminary’ and ‘ongoing.’ Yet advocates and the media accepted the conclusions of the report. The New York City evaluation has been heralded by advocates as providing ‘hard evidence’ (Mujica, 1995) because it makes bilingual education look ineffective. Again study quality is ignored if the results support the advocate’s position" (NRC report, p. 149).

45. Dr. Porter concludes her declaration by quoting from a study by Lopez and Mora, whose paper is attached in Appendix II from Governor Pete Wilson. She declares that: "Beyond high school graduation, the quality of schooling provided to English language learners has a profound effect on each student’s ability to pursue higher education, meaningful work, and the responsibilities of citizenship". She quotes from them: "we find that first and (to a lesser extent) second generation Hispanics who attended a bilingual education program earn significantly less than their otherwise similar English-immersed peers who received monolingual English instruction".

46. Even a cursory inspection of the study results shows that this conclusion is not supported by the data at all. Indeed, the most reasonable conclusions to draw from them would support the plaintiff's argument, rather than Porter’s. They find no statistically significant difference between the incomes of all Hispanics in their sample, whether they received bilingual education, English as a second language, or no special instruction (See Table 3, column 1). Statistically significant differences are only found when the analysis is done separately for first, second, third, and fourth generation Americans. It is not clear why one would want to perform these analyses separately. But even if we were to accept their analysis at face value, when one compares ESL to bilingual education, the only statistically significant difference between the two favors the effects of bilingual education on long-term income over ESL, for just the third generation. In other words, if ESL in this study is meant to be equivalent to the sheltered immersion proposed by Proposition 227, this study suggests that sheltered immersion will have a significant negative effect on the long-term income of at least some of the students compared to the effect of bilingual education.

47. But there are also good reasons to doubt the validity of all of these findings. The fact that students are not randomly assigned to the different programs means that it is essential to control for the initial differences in the students who are likely to be assigned to one program or another. Students with lower initial English ability may be more likely to be assigned to bilingual education than to ESL or no program, and those students may also possess a host of other educational disadvantages having to do with social class and family background. Because Lopez and Mora do not have a measure of initial English ability and have only limited measures of social class, they are failing to control fully for the disadvantages likely possessed by the bilingual students. Failing to control for those disadvantages fully means that their lower incomes after many years may well be caused by the uncontrolled disadvantages and not by the bilingual program to which they were assigned. It is plausible that the incomes of the bilingual students might have been even lower had they not received the claimed benefit of bilingual education.

48. The bias in their analyses caused by failing to control fully for the disadvantages associated with students who are more likely to be assigned to bilingual programs is more than theoretically possible; the fact that the combined analysis shows no significant differences while the analyses separating generations do show differences suggests that these results are largely artifacts of inadequate controls. The fact that the estimate for program effects varies across generations, even when they control for as many demographic variables as are available, suggests that there are powerful demographic factors that differentiate the populations in each generation for which they have no controls. Those same uncontrolled, powerful demographic factors are also likely to vary across students who are assigned to bilingual, ESL, and submersion programs. The fact that separating out the generations changes the results so dramatically is essentially an admission that they are not controlling for very important background characteristics.

49. Dr. Rossell presents an analysis of cumulative redesignation rates based on yearly data from the California State Department of Education, in Table 2 (paragraph 53). This table allegedly shows very low reclassification rates, with only 42 percent of a cohort of LEP students entering kindergarten in 1991 expected to be redesignated. This is sheer conjecture based on data that fail to take into account a number of factors, including the entry of new students into the schools and high student mobility. State data do not enable individual students to be tracked over time to determine redesignation rates. The data she reports are not "cumulative" in any sense other than that of an arithmetic exercise in adding up the percentages for each year. This is supported by looking at the distribution of students who are exited after certain numbers of years: the percentages of students who are exited after 1, 2, 3, 4, 5, or 6 years remain more or less the same. This is not consistent with what we know about language acquisition, and certainly defies my experience with exit rates of students followed over long periods of time. This is easiest to show graphically, in Exhibit D. In the exhibit, I compare Rossell’s conjecture based on state data against actual data obtained from the San Francisco Unified School District for 1996-7, which reports the percentage of students who were redesignated after 1, 2, 3, 4, 5, 6, and more years. To make the data comparable, the figure only looks at students who exited in 6 or fewer years, and then plots the cumulative percentage of those students who exited after 1 year, 1 or 2 years, 1, 2, or 3 years, etc. Rossell’s estimates are represented by the dotted line; SFUSD’s actual data are represented by the solid line. Clearly, her estimates look nothing like the real progression of students towards redesignation as shown in San Francisco, in which a relatively small proportion of the students are redesignated in 1, 2, or 3 years, and then the pace picks up rapidly between 3 and 6 years. Rossell’s Table 2, therefore, tells us nothing about the cumulative percentage of students redesignated across the years.
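The cumulative-percentage calculation described for Exhibit D can be sketched as follows. This is an illustration of the procedure only. The two lists of counts are HYPOTHETICAL placeholders chosen to show a flat pattern and a late-rising pattern; they are not the state data, Rossell's figures, or the SFUSD data, and the function name `cumulative_percent` is likewise an assumption made for this example.

```python
from itertools import accumulate


def cumulative_percent(exits_by_year):
    """Cumulative % of students exited after 1, 1-2, 1-3, ... years.

    `exits_by_year[i]` is the number of students redesignated after i + 1
    years. Only students who exited within the listed years are counted,
    so the last value is always 100.
    """
    total = sum(exits_by_year)
    return [100 * running / total for running in accumulate(exits_by_year)]


# HYPOTHETICAL counts, for illustration only (not from any exhibit).
flat_pattern = [10, 10, 10, 10, 10, 10]  # same share exits every year
late_pattern = [2, 4, 8, 16, 22, 28]     # few early exits, most after year 3

flat_curve = cumulative_percent(flat_pattern)
late_curve = cumulative_percent(late_pattern)

for year, (flat, late) in enumerate(zip(flat_curve, late_curve), start=1):
    print(f"after {year} yr: flat {flat:5.1f}%   late-rising {late:5.1f}%")
```

In this made-up example, a flat yearly pattern puts half of the exiting students out within 3 years, while a late-rising pattern puts well under a quarter out in the same period. That is the kind of contrast the exhibit is described as showing between the dotted and solid lines.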

I declare under penalty of perjury under the laws of the State of California that the foregoing is true and correct.

Executed this ____ day of June, 1998, at ____________________, California.

____________________________

Dr. Kenji Hakuta