Indiana Department of Education Report on ISTEP

Indiana Department of Education Releases Report on ISTEP+ Validity

INDIANAPOLIS – In response to widespread problems associated with CTB McGraw-Hill's administration of the high-stakes ISTEP+ this spring, Indiana Superintendent of Public Instruction Glenda Ritz hired Dr. Richard Hill of the National Center for the Improvement of Educational Assessment to review the results.  A copy of Dr. Hill's report, as well as an interactive map that details the frequency of interruptions statewide and by school corporation is below.

Among other things, the report shows the following:

- Because of the efforts of teachers, administrators, students and parents, as well as the swift and decisive actions taken by Superintendent Ritz, the average negative statewide impact on scores was not measurable.  However, this does not mitigate the effect the interruptions had on students, parents and teachers throughout Indiana.

- At this time, the exact impact of interruptions at the individual, classroom and teacher level cannot be ascertained.

"First, I want to acknowledge the extraordinary efforts of Indiana students, parents, teachers, administrators and the employees of the Department of Education," said Superintendent Ritz.  Because of their dedication and hard work, the impact of these interruptions was limited.  However, let me be clear, the problems with the ISTEP+ contractor were absolutely unacceptable.  Every student deserves the opportunity to take a fair and uninterrupted assessment. 

"I have spent the last several months talking with Hoosiers about the impact these interruptions had in the classroom.  Although Dr. Hill's report found that the statewide average score was not affected by the interruptions, there is no doubt that thousands of Hoosier students were affected.  As Dr. Hill stated in his report, ‘We cannot know definitively how students would have scored this spring if the interruptions had not happened.' Because of this, I have given local schools the flexibility they need to minimize the effect these tests have on various matters, such as teacher evaluation and compensation.  I have also instructed CTB McGraw-Hill to conduct enhanced stress and load testing to ensure that their servers are fully prepared for next year's test and ensure that this never happens again." 

The Department of Education is negotiating a settlement with CTB McGraw-Hill.  Next steps for the Department include processing student reports, which will be made available online to parents and students, and calculating A-F accountability results.

An interactive map showing the ISTEP+ interruptions by school corporation can be found by clicking here:  http://www.stats.indiana.edu/maptools/ISTEPinterruptions.html

 

An Analysis of the Impact of Interruptions on the 2013 Administration of the

Indiana Statewide Testing for Educational Progress—Plus (ISTEP+)

 

Richard Hill

The National Center for the Improvement of Educational Assessment, Inc.

July 27, 2013

 

Background

 

The Indiana Statewide Testing for Educational Progress—Plus (ISTEP+) is Indiana's statewide testing program.  Students in public and nonpublic schools in grades 3 through 8 take this test.  Test results carry substantial consequences at all levels of the public school system, including for teachers.

 

Indiana has been transitioning the administration of the test from paper-and-pencil to on-line testing since 2009.  This past spring, approximately 95 percent of the students took the test on-line, an increase from 71 percent the previous year.

 

Testing began this year on Monday, April 29.  Starting at about 10:30 that morning, students throughout Indiana experienced interruptions during their testing.  It was quickly discovered that the interruptions were caused by a memory issue on the CTB/McGraw-Hill (CTB) servers.  When CTB's immediate efforts to resolve the situation were unsuccessful, its technology engineers worked to isolate the source of the issues and made the adjustments necessary to return the system to normal status as soon as possible.  In response to these interruptions, Indiana's Superintendent of Public Instruction Glenda Ritz extended the testing window by two days, to May 14, 2013.

 

On the second day of testing, at around 11:15, a different memory issue on CTB/McGraw-Hill's servers caused additional widespread interruptions for Indiana students.  Students again experienced the issues seen on April 29, but in greater volume.  In response, CTB determined that the ISTEP+ Online system had to be "cut over" to the disaster recovery site.  While the system remained accessible, this "cut over" caused interruptions for almost all students who were active in the system.  Also, as the system was moved from the regular to the disaster recovery servers, not all of the student responses were immediately accessible to students when they logged back into that test session.  All of the student responses had been saved, but they were not immediately available due to the system issues.  Based on the severity of the interruptions and a recommendation from CTB, the State Superintendent requested that students should complete their current test session and then schools should suspend online testing for the rest of the day.  Superintendent Ritz asked that schools reduce their online testing to 50 percent of their planned testing load for the following day.  Also, Superintendent Ritz extended the online testing window three additional days, through May 17, 2013.

 

On May 1, online testing resumed at 50% of planned capacity.  Students using CTB's system experienced no further widespread interruptions.  As a precautionary measure, Superintendent Ritz asked schools to continue to reduce online testing to 50% of their planned testing load for the following day.   On May 2, Superintendent Ritz once again asked schools to reduce online testing to 50% of their planned testing load for one more day as a precautionary measure. On May 3, Superintendent Ritz conducted three conference calls with Indiana superintendents.  On May 6, she directed schools to resume online testing at 100% of their capacity.  Online testing was completed on May 17.

 

On May 24, the Department of Education provided schools with a list of students that CTB indicated had interrupted testing sessions.  The Department gave that list to local schools so that they could check the list against their records and add any students they determined were impacted by the interruptions but missed by CTB. 

 

On that same day, the Department also issued a request for qualifications to three national companies experienced in validating test results.  From that process, the National Center for the Improvement of Educational Assessment was awarded a contract to investigate the impact the interruptions had on ISTEP+ test scores.  This report is the outcome of that investigation.

 

Description of the Interruptions

 

There are two sources of data available about the interruptions.  The first comes from the records of CTB.  As students completed the test, data were captured about the timing of all events.  As a result, the CTB data can, for example, tell how much time a student spent on the test before an interruption occurred, how many items were presented to the student before the interruption, and how long it was before the student answered another question.  In addition to the CTB data, local school systems were provided with the opportunity to identify additional students who were interrupted—or affected by interruptions, in the judgment of the local person completing the form.  These data were collected by providing local school systems a list of the students identified by CTB as having been interrupted and allowing them to append additional students to the file.  In contrast to the detail of the CTB data, the local appends identified only the test (Mathematics, English/language arts {ELA}, science, social studies) for which a student had been affected.

 

Table 1 provides the number of interruptions, reported by grade, session and type of school, as identified by CTB.  As can be seen from the data, there were significant numbers of interruptions at all grades, but grades 3-6 had a higher proportion of interruptions than grades 7 and 8.  This may be simply a function of the time of day that testing started—it is reasonable to presume that students in grades 7 and 8 started testing earlier in the day than students at the lower grades, and therefore more students at those grades were finished before the interruptions started.  It is also clear that the substantial majority of interruptions occurred during Sessions 1 and 2 (when students were taking the mathematics test) rather than during the later sessions.  Of course, it is possible that a student who was interrupted during Session 1 was affected for the remainder of the testing—that is, we cannot assume that because far fewer interruptions occurred during Sessions 3 and 4 (when students were taking the ELA test), ELA scores were unaffected by the interruptions.  Non-public school students had approximately the same proportion of interruptions as public school students, although this trend varied from grade to grade.  Non-public school students make up about 7.5 percent of the tested population, and had slightly less than 8 percent of the interruptions, totaled across the grades.  Their percentage ranged from a high of 12 percent at grade 7 down to 6 percent at grade 3.

 

Table 1

 

CTB-Reported Interruptions,

By Grade, Session and School Type

 

Grade | School Type | Session 1 | Session 2 | Session 3 | Session 4 | Session 5 | Session 6 | Total
------|-------------|-----------|-----------|-----------|-----------|-----------|-----------|------
3 | Public | 10,745 | 5,429 | 784 | 929 | 0 | 0 | 17,887
3 | Non-Public | 522 | 421 | 131 | 46 | 0 | 0 | 1,120
4 | Public | 10,821 | 5,588 | 1,046 | 590 | 543 | 598 | 19,186
4 | Non-Public | 510 | 607 | 102 | 67 | 16 | 37 | 1,339
5 | Public | 12,006 | 5,684 | 947 | 864 | 862 | 481 | 20,844
5 | Non-Public | 1,019 | 321 | 49 | 110 | 17 | 15 | 1,531
6 | Public | 9,474 | 7,145 | 1,332 | 1,132 | 595 | 659 | 20,337
6 | Non-Public | 735 | 738 | 169 | 59 | 55 | 43 | 1,799
7 | Public | 8,729 | 4,321 | 813 | 986 | 594 | 518 | 15,961
7 | Non-Public | 1,315 | 711 | 111 | 86 | 26 | 16 | 2,265
8 | Public | 7,255 | 4,399 | 1,104 | 1,054 | 0 | 0 | 13,812
8 | Non-Public | 571 | 474 | 90 | 163 | 0 | 0 | 1,298
Total | Public | 59,030 | 32,566 | 6,026 | 5,555 | 2,594 | 2,256 | 108,027
Total | Non-Public | 4,672 | 3,272 | 652 | 531 | 114 | 111 | 9,352
Total | All Schools | 63,702 | 35,838 | 6,678 | 6,086 | 2,708 | 2,367 | 117,379

 

Once students were interrupted, there was a range of time before they restarted the test.  Sometimes, the length of that delay was a function of the responsiveness of the system;  at other times, it was  due to a school decision to stop the administration for students for a period of time and have them restart the test at a later time.  When students restarted, they sometimes had to redo the last item they had been working on before the interruption occurred, but for the vast majority of students, this was the extent of lost data.  However, there were 600 students (440 in math and 160 in ELA) whose data was not "restored" when they logged back in.  These students ended up with two sets of responses to their interrupted session and if any of their answers were different (and either one was correct), they were given credit for the correct answer. 
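
For readers who want to see the scoring rule made concrete, the short sketch below illustrates one way the "credit for the correct answer" reconciliation could be expressed; the layout and column names (student_id, item_id, correct) are hypothetical and are not taken from CTB's actual files.

```python
# Illustrative sketch only (hypothetical column names): when an interrupted
# session produced two sets of responses, an item counts as correct if either
# response to it was correct.
import pandas as pd

responses = pd.DataFrame({
    "student_id": [101, 101, 101, 101, 202, 202],
    "item_id":    [1,   1,   2,   2,   1,   2],
    "correct":    [0,   1,   1,   1,   1,   0],  # duplicate rows for interrupted items
})

# Taking the maximum of the 0/1 correctness flags per student-item pair gives
# credit whenever either of the duplicate responses was correct.
reconciled = (responses.groupby(["student_id", "item_id"])["correct"]
              .max()
              .reset_index())
print(reconciled)
```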

 

In order to summarize the length of the interruptions, they have been categorized as follows:

 

  1. Less than 2 minutes
  2. 2 minutes or more, but less than 5 minutes
  3. 5 minutes or more, but less than 15 minutes
  4. 15 minutes or more, but less than one hour
  5. One hour or more, but less than a day
  6. One day or more
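
To make the scheme concrete, the following sketch assigns a restart delay, measured in minutes, to the six categories listed above; the function name and the choice of minutes as the unit are assumptions made for illustration only.

```python
# Illustrative sketch: map a restart delay (in minutes) to the six length
# categories used in Table 2. The thresholds follow the list above.
def interruption_length_code(delay_minutes: float) -> int:
    if delay_minutes < 2:
        return 1       # less than 2 minutes
    if delay_minutes < 5:
        return 2       # 2 minutes or more, but less than 5
    if delay_minutes < 15:
        return 3       # 5 minutes or more, but less than 15
    if delay_minutes < 60:
        return 4       # 15 minutes or more, but less than one hour
    if delay_minutes < 24 * 60:
        return 5       # one hour or more, but less than a day
    return 6           # one day or more

print(interruption_length_code(10))  # a 10-minute delay falls in category 3
```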

 

Table 2 provides the information about the length of delays using the above categorization scheme.  For public school students, the most common delay was for a day or more, although that was less than a majority of the interruptions.  For students delayed less than a day, the most common delay was for 5 minutes or more, but less than 15.  Students in non-public schools had more of a tendency to restart the test the same day they were interrupted, with the most common delay being 5-15 minutes for them, too.  A total of 734 observations (less than 1 percent) could not have their delay coded because their end-of-interruption time was not recorded on the interruptions file.

 

Table 2

 

CTB-Reported Interruptions,

By Length of Interruption

 

Grade | School Type | Under 2 min | 2-5 min | 5-15 min | 15-60 min | 1 hr-1 day | 1 day or more | Total
------|-------------|-------------|---------|----------|-----------|------------|---------------|------
3 | Public | 452 | 1,721 | 5,395 | 1,619 | 1,196 | 7,433 | 17,816
3 | Non-Public | 53 | 129 | 437 | 62 | 76 | 343 | 1,100
4 | Public | 806 | 2,429 | 4,756 | 2,039 | 1,620 | 7,417 | 19,067
4 | Non-Public | 123 | 251 | 369 | 101 | 88 | 399 | 1,331
5 | Public | 1,202 | 2,629 | 5,347 | 1,868 | 1,456 | 8,217 | 20,719
5 | Non-Public | 113 | 261 | 522 | 117 | 134 | 367 | 1,514
6 | Public | 1,285 | 2,716 | 5,396 | 1,832 | 981 | 8,003 | 20,213
6 | Non-Public | 224 | 272 | 600 | 147 | 111 | 436 | 1,790
7 | Public | 1,324 | 2,516 | 4,791 | 1,442 | 592 | 5,195 | 15,860
7 | Non-Public | 273 | 303 | 727 | 328 | 67 | 549 | 2,247
8 | Public | 1,098 | 1,904 | 3,656 | 1,243 | 651 | 5,164 | 13,716
8 | Non-Public | 106 | 227 | 388 | 151 | 114 | 286 | 1,272
Total | Public | 6,167 | 13,915 | 29,341 | 10,043 | 6,496 | 41,429 | 107,391
Total | Non-Public | 892 | 1,443 | 3,043 | 906 | 590 | 2,320 | 9,254
Total | All Schools | 7,059 | 15,358 | 32,384 | 10,949 | 7,086 | 43,809 | 116,645

There were a total of 117,379 interruptions.  Some students were interrupted more than once, and the data in Tables 1 and 2 are a duplicated count—that is, if students were interrupted more than once, they show up in those tables as many times as they had interruptions.  Table 3 provides information about the numbers of times students were interrupted, and these are unduplicated counts.  A total of 79,442 students were interrupted, which is about one-sixth of the total population.  Earlier, we provided a caution that just because a student was interrupted while taking the mathematics test, one cannot assume that the interruption did not affect the student's performance on later sections of the test.  Similarly, we caution here that just because a student was not reported as interrupted, that does not mean the student was unaffected by the interruptions.  The interruption of one student in a room could conceivably have an effect on other students in that same room.  Table 3 is a count of the numbers of students directly affected by the interruptions. 
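
The difference between the duplicated counts of Tables 1 and 2 and the unduplicated counts of Table 3 can be illustrated with a small sketch; the event-level layout and column names below are hypothetical, not CTB's actual file format.

```python
# Illustrative sketch: one row per interruption event (the duplicated count of
# Tables 1 and 2), collapsed to one row per student (the unduplicated count of
# Table 3). Column names are hypothetical.
import pandas as pd

events = pd.DataFrame({
    "student_id": [1, 1, 2, 3, 3, 3],
    "session":    [1, 2, 1, 1, 1, 4],
})

total_interruptions = len(events)                  # duplicated count
per_student = events.groupby("student_id").size()  # interruptions per student
students_interrupted = per_student.shape[0]        # unduplicated count

print(total_interruptions, students_interrupted)   # -> 6 3
print(per_student.value_counts().sort_index())     # how many students had 1, 2, 3... interruptions
```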

 

 

Table 3

 

CTB-Reported Interruptions,

By Numbers of Interruptions for Students

 

Grade | School Type | 1 | 2 | 3 | 4 | 5 | 6 or more | Total
------|-------------|---|---|---|---|---|-----------|------
3 | Public | 9,132 | 2,844 | 665 | 156 | 46 | 32 | 12,875
3 | Non-Public | 497 | 177 | 49 | 18 | 10 | 0 | 751
4 | Public | 9,155 | 2,543 | 1,056 | 260 | 80 | 51 | 13,145
4 | Non-Public | 507 | 212 | 75 | 32 | 11 | 0 | 837
5 | Public | 9,179 | 2,985 | 1,164 | 366 | 85 | 47 | 13,826
5 | Non-Public | 688 | 223 | 91 | 26 | 4 | 0 | 1,032
6 | Public | 8,607 | 2,845 | 998 | 467 | 153 | 66 | 13,136
6 | Non-Public | 707 | 211 | 85 | 40 | 34 | 14 | 1,091
7 | Public | 7,913 | 2,133 | 751 | 223 | 86 | 32 | 11,138
7 | Non-Public | 634 | 246 | 142 | 102 | 27 | 26 | 1,177
8 | Public | 6,904 | 1,802 | 617 | 214 | 72 | 36 | 9,645
8 | Non-Public | 517 | 136 | 75 | 38 | 9 | 14 | 789
Total | Public | 50,890 | 15,152 | 5,251 | 1,686 | 522 | 264 | 73,765
Total | Non-Public | 3,550 | 1,205 | 517 | 256 | 95 | 54 | 5,677
Total | All Schools | 54,440 | 16,357 | 5,768 | 1,942 | 617 | 318 | 79,442

 

The data in Table 4 includes both CTB- and locally-reported interruptions, and therefore is reported at a somewhat coarser level.  For example, rather than specifying the session during which a student was interrupted, this table is limited to the test.  (The mathematics test was administered in Sessions 1 and 2 and the ELA was administered in Sessions 3 and 4.  For students in grades 4-7, there were two additional sessions, during which they took either social studies or science, depending on their grade.)  Also, rather than reporting the number of interruptions, these data provide the number of tests for which students were interrupted (some students were interrupted more than once during a testing session, which would have been reflected in the previous tables, but is a level of detail that cannot be reported in Table 4). 

 

 

Table 4

 

Numbers of Tests for Which Students Were Interrupted,

Combining CTB- and Locally-Reported Data

 

Grade | School Type | 0 Tests | 1 Test | 2 Tests | 3 Tests | 4 Tests | Total Students
------|-------------|---------|--------|---------|---------|---------|---------------
3 | Public | 54,001 | 18,887 | 4,204 | 296 | 269 | 77,657
3 | Non-Public | 5,421 | 949 | 147 | 8 | 72 | 6,597
4 | Public | 50,059 | 18,240 | 1,825 | 2,588 | 223 | 72,935
4 | Non-Public | 5,030 | 1,018 | 138 | 103 | 53 | 6,342
5 | Public | 51,520 | 18,454 | 1,951 | 2,919 | 186 | 75,030
5 | Non-Public | 5,072 | 887 | 288 | 103 | 47 | 6,397
6 | Public | 55,737 | 17,069 | 2,333 | 3,169 | 279 | 73,687
6 | Non-Public | 4,387 | 1,430 | 266 | 150 | 77 | 6,310
7 | Public | 56,054 | 16,907 | 1,582 | 2,800 | 286 | 77,629
7 | Non-Public | 4,087 | 1,384 | 302 | 69 | 23 | 5,865
8 | Public | 57,086 | 14,946 | 4,050 | 253 | 198 | 76,533
8 | Non-Public | 4,012 | 1,227 | 286 | 6 | 21 | 5,552
Total | Public | 324,457 | 104,503 | 15,945 | 12,025 | 1,441 | 458,371
Total | Non-Public | 28,009 | 6,895 | 1,427 | 439 | 293 | 37,063
Total | All Schools | 352,466 | 111,398 | 17,372 | 12,464 | 1,734 | 495,434

 

From Table 3, we know that CTB identified interruptions for just short of 80,000 students. From Table 4, we see that of the 495,434 students tested statewide across all grades, 352,466 had no tests interrupted—meaning 142,968 were reported as having at least one test interrupted when the locally-reported interruptions are added into the CTB-reported interruptions.  Thus, we know that the locally-reported interruptions added about 60,000 students to the list.  Combined across both data sets, approximately 29 percent of the students were identified as being directly affected by the interruptions.  The number that were indirectly affected—that is, did not have an interruption in their own test, but had a disruption in their classroom that affected them—is unknown.
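
A minimal check of the arithmetic behind this paragraph, using only the totals already reported in Tables 3 and 4, is sketched below.

```python
# Arithmetic check using totals from Tables 3 and 4.
tested_statewide = 495_434      # total students tested, all grades (Table 4)
no_interrupted_tests = 352_466  # students with zero interrupted tests (Table 4)
ctb_identified = 79_442         # unduplicated CTB-reported count (Table 3)

at_least_one = tested_statewide - no_interrupted_tests
print(at_least_one)                                  # 142,968 students
print(round(100 * at_least_one / tested_statewide))  # about 29 percent
print(at_least_one - ctb_identified)                 # roughly 60,000 added by local reports
```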

 

Some inconsistencies in Table 4 should be noted.  For example, no student in grade 3 or grade 8 took more than two tests (those students are tested in mathematics and ELA only), and no student in any grade took more than 3 tests, so some locally-reported interruptions do not reflect the reality of the testing system.  But those discrepancies are small compared to the general information, so it appears as though the vast majority of local school personnel completing the form did so accurately to the best of their ability.

 

Table 5 provides the counts from the CTB- and locally-reported data set on the number of students interrupted for each test.

 

 

 

 

 

Table 5

 

Numbers of Students Interrupted by Test,

Combining CTB- and Locally-Reported Data

 

Grade | School Type | Math | ELA | Science | Social Studies
------|-------------|------|-----|---------|---------------
3 | Public | 21,717 | 6,577 | N/A | N/A
3 | Non-Public | 1,029 | 368 | N/A | N/A
4 | Public | 20,194 | 5,810 | 4,067 | N/A
4 | Non-Public | 1,138 | 392 | 220 | N/A
5 | Public | 20,703 | 6,180 | N/A | 4,331
5 | Non-Public | 1,159 | 529 | N/A | 219
6 | Public | 19,719 | 7,202 | 4,815 | N/A
6 | Non-Public | 1,695 | 609 | 323 | N/A
7 | Public | 18,932 | 6,144 | N/A | 4,023
7 | Non-Public | 1,635 | 450 | N/A | 173
8 | Public | 17,220 | 6,612 | N/A | N/A
8 | Non-Public | 1,331 | 518 | N/A | N/A

 

Table 5 provides some interesting information.  For example, CTB had identified slightly over 12,000 students interrupted in math for grade 3;  after adding in the locally-reported interruptions, the number is almost twice that.  In addition, about 85 percent of the interruptions in the CTB file were during the math test, but that percentage is much lower in Table 5.  While a strong majority of the interruptions are in math, the interruptions during the ELA test total about one-fourth of all the interruptions.  A reasonable assumption is that school personnel did indeed frequently code students as being interrupted in ELA not because they were directly interrupted during that test, but because they felt interruptions occurring during the math test carried over to later tests.

 

While some of the data to be presented in this paper deal with student-level analyses, another portion will look at results aggregated to the school level.  For the CTB-reported interruptions, 169 schools (out of 1,831—over 9 percent) had no interruptions for any students at any grade within the school.  Half the schools had interruptions for 12 percent or fewer of their students, and only 10 percent of the schools had more than 37 percent of their students interrupted.  The average percentage of interruptions for public schools was 16.5 percent; for non-publics, the average was 14.3 percent.  At first, it seemed as though it might be worthwhile to look at the schools with no interrupted students separately (as a baseline, since they had no interruptions).  However, since these schools were disproportionately non-public (93 out of 169, or more than half) and tended to be considerably smaller than average (about half the number of students as an average school), they cannot be presumed to be representative of the state as a whole, and therefore that area of investigation was abandoned.

 

The correlations of the percentage of students interrupted across grades within a school were modest.  For public schools, the highest correlation was between the percentage interrupted at grade 6 and the percentage interrupted at grade 7—0.25.  Almost all of the remaining correlations were less than 0.20.  This means that a school with a high percentage of interrupted students at one grade did not necessarily have a high percentage at other grades.  The consequence of this is that whatever impact the interruptions might have had on student achievement would be somewhat diminished when results are aggregated across all grades in a school.
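
For readers who want to see the computation spelled out, the sketch below shows one way such cross-grade correlations could be computed; the values and column names are invented for illustration and do not come from the actual school file.

```python
# Illustrative sketch: correlate school-level interruption percentages across
# grades. Each row stands for a school; the values are made up.
import pandas as pd

pct_interrupted = pd.DataFrame({
    "grade_6": [0.10, 0.35, 0.00, 0.22, 0.05],
    "grade_7": [0.12, 0.30, 0.02, 0.18, 0.00],
    "grade_8": [0.40, 0.05, 0.15, 0.00, 0.25],
})

# Pearson correlations between grade-level percentages; values near zero mean
# that a high interruption rate at one grade says little about the rate at
# another grade within the same school.
print(pct_interrupted.corr())
```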

 

The Impact of Interruptions on Test Scores

 

It is important to note the range and number of interruptions that occurred during ISTEP+ testing this past spring.  The interruptions created a significant burden for students, teachers and administrators, who had to deal with the issue and make their best efforts to ensure that students' responses reflected their real achievement levels.  In this section, we will look at the extent to which their efforts were successful—did the interruptions have a negative impact on student achievement, or were schools able to get valid scores from students despite the obstacle that the interruptions presented?

 

We cannot know definitively how students would have scored this spring if the interruptions had not happened.  However, we can look at historical information and determine whether the scores attained this spring were consistent with predictions we would have made from an historical perspective.  We will look at four sources of data to inform these predictions:

 

  1. The overall statewide results—that is, the change in statewide mean scaled scores between 2012 and 2013.  If the interruptions this spring had a negative effect on student scores, we might expect statewide mean scaled scores this year to have declined from last year.
  2. The improvement in school scores from 2012 to 2013, especially in comparison to the improvements shown by those schools from 2011 to 2012.  Some schools had no students with interruptions; others had interruptions for a substantial majority of their students.  If the interruptions had a negative effect on student scores, we would expect the improvements to be better sustained in schools with lower percentages of interrupted students.  This analysis holds grade within school constant, but looks at different cohorts of students (e.g., comparing grade 3 in 2012 to grade 3 in 2013).
  3. The gain in school mean scores, following a cohort of students across grades within a school (e.g., looking at grade 3 in 2012 and grade 4 in 2013).  Again, one would expect the gains to be higher in the schools with fewer interruptions.
  4. Student-level data matched across years.  Again, one would expect the students without interruptions to have the largest gains from year to year, and those with the most troublesome interruptions (early in the testing session, multiple times within session, longer delays during a session) to have smaller gains than all other students.

 

For the last two analyses, we will compare the changes from 2012 to 2013 with comparable data from 2011 to 2012.  Since there were no interruptions in 2012, looking at the data from 2011 to 2012 in the same way as 2012 to 2013 provides a baseline of expectations.  So, for example, we will be looking at the gains from 2011 to 2012 for the schools that had larger percentages of interruptions in 2013 to see how much they changed the year before they were interrupted and then comparing that to the change the year they were interrupted.

 

 


 

Overall Statewide Results

 

Table 6 provides an overview of the statewide results since the inception of the ISTEP+ test in 2009.  As can be seen from the table, the state enjoyed substantial gains from the first year to the second year of the program, which is not unusual—scores often change the most in the first years of a testing program as schools adjust their curriculum to the new material being assessed.

 

The purpose of providing Table 6 is to set an historical context for the 2013 results.  If the interruptions had a serious impact on student test scores, we could expect the 2013 scores, and in particular the gains from 2012 to 2013, to be out of line with changes from previous years.  That did not happen.  Averaged across the grades, the state increased by 4 scaled score points a year in mathematics between 2010 and 2012, and 3 scaled score points in English language arts.  Between 2012 and 2013, the state increased by an average of 4 scaled score points in mathematics and 1 scaled score point in ELA.

 

Table 6

 

Mean ISTEP+ Scaled Scores for Public School Students, 2009 through 2013

 

Grade | Math 2009 | Math 2010 | Math 2011 | Math 2012 | Math 2013 | ELA 2009 | ELA 2010 | ELA 2011 | ELA 2012 | ELA 2013
------|-----------|-----------|-----------|-----------|-----------|----------|----------|----------|----------|---------
3 | 452 | 463 | 470 | 469 | 470 | 452 | 460 | 463 | 467 | 465
4 | 478 | 491 | 495 | 495 | 509 | 470 | 479 | 484 | 485 | 491
5 | 506 | 520 | 527 | 529 | 531 | 493 | 496 | 500 | 505 | 506
6 | 532 | 533 | 536 | 544 | 543 | 510 | 522 | 529 | 531 | 531
7 | 542 | 553 | 555 | 562 | 567 | 523 | 533 | 538 | 536 | 534
8 | 566 | 578 | 583 | 587 | 593 | 534 | 544 | 545 | 545 | 549

 

Scores increased from 2012 to 2013 in five grades in mathematics (the exception being a decrease of 1 point in grade 6) and in three grades in ELA.  Scores increased more in mathematics than in ELA in five grades, which is an interesting result, given that the substantial majority of the interruptions occurred while students were taking the mathematics test.  However, it is possible that the effect of the interruptions was cumulative—that is, once interruptions started happening, their impact grew as disruptions caused, for example, alterations in testing schedules. Combined with the fact that students completed some portion of the mathematics test before the interruptions started (and thus can be presumed to have some portion of the mathematics test reflect their full level of achievement), it is possible that some effect of the interruptions can be seen in this table.  However, Indiana has seen greater gains in mathematics scores than ELA scores over the years, and therefore observing greater gains in mathematics is consistent with historical patterns.

 

Table 7 looks at the 2012 and 2013 results in a bit more detail.  The substantial increase in scaled scores in both mathematics and ELA in grade 4, combined with the lack of improvement at grade 3 (indeed, a loss of 2 scaled score points in ELA), warranted a more careful look at what might have been the cause of those changes.

 

 

Table 7

 

Numbers of Students Tested and Mean Scaled Scores

On the ISTEP+ Test for 2012 and 2013

 

Grade | Math 2012 N | Math 2012 Mean | Math 2013 N | Math 2013 Mean | Math Change | ELA 2012 N | ELA 2012 Mean | ELA 2013 N | ELA 2013 Mean | ELA Change
------|-------------|----------------|-------------|----------------|-------------|------------|---------------|------------|---------------|-----------
3 | 74,283 | 469 | 76,410 | 470 | +1 | 73,771 | 467 | 75,928 | 465 | -2
4 | 74,133 | 495 | 71,755 | 509 | +14 | 73,717 | 485 | 71,359 | 491 | +6
5 | 77,150 | 529 | 73,719 | 531 | +2 | 76,770 | 505 | 73,363 | 506 | +1
6 | 75,587 | 544 | 77,012 | 543 | -1 | 75,130 | 531 | 76,581 | 531 | 0
7 | 74,873 | 562 | 75,768 | 567 | +5 | 74,396 | 536 | 75,372 | 534 | -2
8 | 74,534 | 587 | 74,675 | 593 | +6 | 74,099 | 545 | 74,307 | 549 | +4

 

A clue as to what happened comes from looking at the changes in the numbers of students tested across years, following the same cohort.  At every grade, the 2013 numbers are consistent with those of the previous year, except going from grade 3 in 2012 to grade 4 in 2013, where the number of students tested declined by over 2,000.  An inquiry revealed that a new policy was put into place in 2013, whereby third-grade students who did not pass a reading test the previous spring or summer would continue to receive Grade 3 reading and literacy instruction, would receive additional interventions based on individual student learning needs, and would be officially reported as a third-grader the following school year (in this case, 2012-13).   As a result of this policy, approximately 2,500 students who would have been tested in the fourth grade in previous years took the third grade test instead. 

 

The following is a more detailed description of the policy, the implementation process, and the number of affected students.

 

To implement IC 20-32-8.5 (Reading Deficiency Remediation Plan), the Indiana State Board of Education and the Indiana Department of Education enacted a new policy during the 2011-12 school year, whereby third-grade students that 1) did not achieve a passing score on the IREAD-3 assessment in either Spring 2012 or Summer 2012, and 2) were not eligible for good cause exemptions, were retained as third graders for the 2012-13 school year as a last resort.

It is important to note that some of the retained students were actually placed in  grade 4 classrooms for instruction, as it is the responsibility of the local school to design a program that meets the learning needs of students and to determine classroom assignments.

In February 2013, Superintendent Ritz communicated to schools and corporations the flexibility that would exist during the spring of 2013 to provide the Grade 4 ISTEP+ test to any third grade student who met these criteria:

1) The student did not pass IREAD-3 in Spring or Summer 2012 or receive a Good Cause Exemption (and was thus reported as a third grader during the 2012-13 school year),

2) The student received fourth grade instruction in all content areas (including literacy) during the 2012-13 school year, and

3) The student's parents understood that their child would be assessed using the Grade 4 ISTEP+ test.

Superintendent Ritz's memo to superintendents and principals outlining this flexibility emphasized that all students participating in the Grade 4 ISTEP+ test (including those students who met the above criteria) would factor into a school or corporation's accountability calculations for Grade 4.  In total, schools and corporations exercised the option to administer the Grade 4 ISTEP+ test to nearly 250 Indiana third grade students in the spring of 2013.

Thus, approximately 2,500 students are included in the grade 3 results for 2013 whose counterparts are missing from the 2012 results—and they are not included in the grade 4 results for 2013.  Since these are students who did not pass a grade 3 reading test in 2012, it is reasonable to presume that they would have been among the lowest scoring students in reading, and below average in mathematics.  Removing those students from the fourth grade results and adding them into the third grade certainly raised the grade 4 2013 average, and may very well have lowered the grade 3 average as well.

 

To further investigate the issue, we looked at the numbers of students passing the ISTEP+ test in both years.  If the increase in grade 4 scores was mostly due to the change in policy, we should see the numbers of students passing the test approximately equal across the years, but a sharp decline in the number of failing students.  That is indeed what happened.  The number of students passing the grade 4 ELA test remained almost identical across the years, but the number of "Did Not Pass" students declined by over 2,000.  In mathematics, about 1,500 more students passed, but the number of "Did Not Pass" students declined by over 3,700.  So it is reasonable to presume that if the new policy had not been in place, and those 2,500 students affected by it had been tested in the fourth grade rather than the third, the change in mean scaled scores would be modestly positive for ELA for both grade 3 and grade 4, and mathematics mean scaled scores would have increased by several points at both grades.

 

Another policy change that complicates the interpretation of the changes of scores from one year to the next is the change from paper-and-pencil to on-line administration of the test.  Beginning with the 2009 administration of the ISTEP+ test, Indiana has been transitioning to online administration.  The percentage of students taking the test online was quite small in 2009 and 2010, but it was 36 percent in 2011, 71 percent in 2012, and 95 percent in 2013.  That rate of transition has not been constant across the grades, however.  In 2012, 92 percent of the grade 8 students took the test online, while only 34 percent of the third graders did.  The most typical pattern has been to transition one grade per year, and for the highest grades to start the transition first.  As a result, grade 3 in the elementary grades had the largest percentage of students transitioning this year, and grade 6 in the middle school grades.

 

While studies done in previous years have shown that the impact of the transition on test scores has been minimal, those studies have been done on schools and grades that have been earlier adopters.  The improvement in scores for the middle school grades was highest for grade 8, followed by grade 7 and grade 6 in that order—and that is the same order of percentage of online administration in 2012 (grade 8 was 92 percent, grade 7 was 86 percent, and grade 6 was 66 percent).  As a result, interpretation of the changes from 2012 to 2013 should not only take into account the interruptions but the change in mode of administration for many students.

 

The changes in scores from 2012 to 2013, once the changes in populations in grades 3 and 4 due to the new retention policy implemented this year are taken into account, are generally positive, and consistent with changes that Indiana has seen in the past.  Thus, while it is possible that some small portion of students may have had the interruptions affect their scores, it appears that on average across the vast majority of students, student performance was as high as it would have been if the interruptions had not occurred.

 

 

The Improvement in School Scores

 

A second investigation into the impact of the interruptions on student scores is the look at the changes in test scores at the school level across years, holding grade constant—that is, for example, comparing how grade 3 in a school scored in 2013 to how the third graders in that same school scored in 2012.  This statistic of cross-cohort change is generally referred to as "improvement" (in contrast to "growth," which refers to following the same cohort across grades).

 

For these analyses, we computed the percentage of students interrupted in each grade in each school in the state twice—once for the CTB-reported interruptions, and then again for the interruptions added by local school personnel.  Table 8 provides the average percentages of students interrupted.

 

Table 8

 

School Mean Percentages of Students Interrupted

 

Grade | CTB: Public N | CTB: Public Mean % | CTB: Non-Public N | CTB: Non-Public Mean % | All: Public Mean % | All: Non-Public Mean %
------|---------------|--------------------|-------------------|------------------------|--------------------|-----------------------
3 | 1,063 | 16 | 263 | 10 | 29 | 16
4 | 1,057 | 18 | 267 | 14 | 31 | 21
5 | 975 | 19 | 266 | 15 | 31 | 20
6 | 692 | 17 | 260 | 16 | 29 | 26
7 | 511 | 13 | 247 | 18 | 24 | 27
8 | 501 | 13 | 243 | 11 | 24 | 23

 

For the next analysis, also done grade by grade, public schools are grouped into three categories.  The first group had no students interrupted at that grade;  the second had some interrupted students, but less than 20 percent; and the third group had 20 percent or more students interrupted.  Table 9 provides the changes in test scores from 2012 to 2013, holding grade constant, for the three groups of schools.
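
A minimal sketch of this grouping and averaging follows, under the assumption of a school-level file with hypothetical columns pct_interrupted, change_math and change_ela; it is illustrative only and not the actual analysis code.

```python
# Illustrative sketch: group schools by percentage of students interrupted and
# average the 2012-to-2013 change in mean scaled score, as in Table 9.
# Column names and values are hypothetical.
import pandas as pd

schools = pd.DataFrame({
    "pct_interrupted": [0.00, 0.05, 0.12, 0.30, 0.55, 0.00],
    "change_math":     [3,    1,    2,    -1,   0,    4],
    "change_ela":      [-1,   -3,   -2,   1,    -2,   0],
})

def interruption_group(pct: float) -> str:
    if pct == 0:
        return "None"
    if pct < 0.20:
        return "Some, under 20%"
    return "20% or more"

schools["group"] = schools["pct_interrupted"].map(interruption_group)
print(schools.groupby("group")[["change_math", "change_ela"]].mean())
```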

 

 

Table 9

 

Average Change in ISTEP+ Test School Mean Scaled Scores between 2012 and 2013,

Reported by Percentage of Students Interrupted—Public Schools Only

 

Grade | Percentage Interrupted | CTB: Schools | CTB: Math Change | CTB: ELA Change | All: Schools | All: Math Change | All: ELA Change
------|------------------------|--------------|------------------|-----------------|--------------|------------------|----------------
3 | None | 265 | 3 | -1 | 133 | 4 | 0
3 | Some, under 20% | 489 | 1 | -3 | 439 | 1 | -2
3 | 20% or more | 290 | 2 | -2 | 472 | 1 | -2
4 | None | 238 | 13 | 5 | 138 | 10 | 3
4 | Some, under 20% | 485 | 13 | 5 | 407 | 13 | 6
4 | 20% or more | 314 | 13 | 4 | 492 | 14 | 5
5 | None | 191 | 0 | 0 | 119 | 2 | 1
5 | Some, under 20% | 448 | 0 | 1 | 375 | -1 | 1
5 | 20% or more | 304 | 3 | 2 | 449 | 2 | 2
6 | None | 138 | 1 | 2 | 75 | 1 | 1
6 | Some, under 20% | 332 | -1 | 0 | 293 | -1 | 1
6 | 20% or more | 192 | -3 | -1 | 294 | -2 | -1
7 | None | 74 | 6 | 0 | 50 | 9 | 3
7 | Some, under 20% | 299 | 5 | -2 | 248 | 5 | -2
7 | 20% or more | 104 | 5 | -2 | 179 | 5 | -2
8 | None | 82 | 9 | 7 | 45 | 7 | 4
8 | Some, under 20% | 281 | 6 | 3 | 252 | 6 | 4
8 | 20% or more | 113 | 7 | 4 | 179 | 8 | 5

 

If the interruptions had an impact on student test scores, the expectation for Table 9 would be that schools with no interruptions would show the most positive changes between 2012 and 2013, and that schools with greater rates of interruption would show less positive (or more negative) gains.  An example of this expected pattern occurs in grade 6 mathematics, where the schools with no CTB-reported interruptions had a mean gain of 1 scaled score point, while those with up to 20 percent of their students interrupted had a mean loss of 1 point, and those with 20 percent or more of their students interrupted had a mean loss of 3 points.  If that pattern had held up over the grades, it might be reasonable to presume that the interruptions had a small but measurable impact on test scores.  However, the pattern varies from grade to grade and from content area to content area.  The lack of a discernible pattern is true whether one looks at the CTB-reported interruptions only, or those combined with the school-reported interruptions.  On average across the grades, the gap between the non-interrupted schools and those with interruptions is about 1 point—on a test where the student-level standard deviation is between 50 and 75 points, depending on the grade and subject.

 

 

The Gain in School Scores

 

In contrast to the previous analysis, this one looks at the gains in scaled scores of cohorts of students across grades.  For this analysis, we need a baseline of growth expectations—that is, simply knowing that students gained from one year to the next would be insufficient information, since most students grow from year to year.  Therefore, we looked at the gains from 2011 to 2012 to use as a basis for comparing the growth from 2012 to 2013.

 

Schools are included in this analysis at a particular grade only if they also enrolled students the previous year at the lower grade.  Thus, for example, if a middle school enrolls students in grades 6-8, that school would be included in this analysis at grades 7 and 8, but not grade 6.  This is an issue that will be dealt with differently in the next analysis, where students will be matched from year to year regardless of their school in either year.

 

Tables 10a and 10b are identical to each other, except that Table 10a reports the results for schools broken down on the basis of the percentage of students interrupted as per the CTB-reported interruptions, whereas Table 10b includes all reported interruptions.  The same scores for each school are used in both tables—the only difference between them is the categorization of the schools.  Since the school-appended interruption files contain more records than the CTB interruption files, more schools are categorized in the third level of interruption, and fewer in the first level.

 

One interesting aspect of this analysis is that the schools are categorized by the percentage of students interrupted at the grade in 2013, but the tables also include information on the change from 2011 to 2012—the year before the interruptions took place.  Given that the interruptions were broadly distributed across schools, we would expect no differences among the three groups within a grade for that baseline year.  So, for example, all three groups of schools had approximately the same amount of gain from grade 3 in 2011 to grade 4 in 2012—about 25 points.  However, there are differences in those baseline scores as large as 5 points among the groups (grade 6-7 math and grade 5-6 ELA) in Table 10a, and one as high as 9 points in Table 10b (grade 6-7 math), and these likely reflect the normal variation one might expect to find across scores from year to year with this limited number of schools in each group.  Therefore, if we were to see a difference of this magnitude in the 2012 to 2013 gains, that difference might very well have been simply a reflection of this normal variation for that particular group.

 

But in fact, the differences between the groups tend to be smaller in 2013—when the interruptions happened—than they were in 2012—the year before the interruptions.  Also, when one aggregates the data across grade levels and compares the average changes from 2011 to 2012 with the changes from 2012 to 2013, the results for all three categories of schools are almost identical, whether one uses the CTB-only data or the CTB data aggregated with the school-reported interruptions.  The gains schools made in 2013 are not related to the amount of interruption their students endured.  The schools with no interruptions did not have larger gains than schools that were interrupted, and schools with more moderate amounts of interruption did not have larger gains than schools with larger percentages of interrupted students.

 

 

Table 10a

 

Average Growth in ISTEP+ Test School Mean Scaled Scores between 2012 and 2013,

Reported by Percentage of Students Interrupted—Public Schools Only

CTB-Reported Interruptions Only

 

 

 

Grades | Percentage Interrupted | Number of Schools | Math Change 2011 to 2012 | Math Change 2012 to 2013 | ELA Change 2011 to 2012 | ELA Change 2012 to 2013
-------|------------------------|-------------------|--------------------------|--------------------------|-------------------------|------------------------
3-4 | None | 232 | 25 | 40 | 22 | 25
3-4 | Some, under 20% | 470 | 26 | 40 | 22 | 23
3-4 | 20% or more | 300 | 26 | 39 | 23 | 22
4-5 | None | 177 | 36 | 34 | 22 | 19
4-5 | Some, under 20% | 402 | 33 | 35 | 20 | 21
4-5 | 20% or more | 289 | 34 | 38 | 21 | 22
5-6 | None | 101 | 24 | 20 | 34 | 32
5-6 | Some, under 20% | 213 | 23 | 20 | 29 | 28
5-6 | 20% or more | 132 | 20 | 18 | 28 | 25
6-7 | None | 47 | 22 | 22 | 7 | 10
6-7 | Some, under 20% | 182 | 27 | 24 | 8 | 4
6-7 | 20% or more | 64 | 23 | 21 | 9 | 4
7-8 | None | 74 | 31 | 32 | 3 | 12
7-8 | Some, under 20% | 270 | 32 | 32 | 7 | 13
7-8 | 20% or more | 109 | 31 | 32 | 6 | 12

 

 

 

Table 10b

 

Average Growth in ISTEP+ Test School Mean Scaled Scores between 2012 and 2013,

Reported by Percentage of Students Interrupted—Public Schools Only

Using Both CTB and Locally Reported Interruptions

 

 

 

Grades | Percentage Interrupted | Number of Schools | Math Change 2011 to 2012 | Math Change 2012 to 2013 | ELA Change 2011 to 2012 | ELA Change 2012 to 2013
-------|------------------------|-------------------|--------------------------|--------------------------|-------------------------|------------------------
3-4 | None | 135 | 26 | 40 | 23 | 25
3-4 | Some, under 20% | 396 | 26 | 40 | 22 | 24
3-4 | 20% or more | 473 | 26 | 39 | 23 | 22
4-5 | None | 109 | 35 | 34 | 24 | 20
4-5 | Some, under 20% | 340 | 33 | 34 | 20 | 20
4-5 | 20% or more | 419 | 34 | 38 | 20 | 22
5-6 | None | 60 | 24 | 19 | 33 | 28
5-6 | Some, under 20% | 189 | 22 | 21 | 29 | 29
5-6 | 20% or more | 197 | 21 | 18 | 30 | 28
6-7 | None | 33 | 18 | 22 | 3 | 9
6-7 | Some, under 20% | 151 | 27 | 25 | 9 | 4
6-7 | 20% or more | 110 | 24 | 21 | 8 | 5
7-8 | None | 39 | 34 | 31 | 3 | 10
7-8 | Some, under 20% | 247 | 31 | 32 | 6 | 12
7-8 | 20% or more | 169 | 32 | 32 | 7 | 13

 

 

Student-level Data Matched across Years

 

The fourth analysis is a look at student-level data matched across years.  The first step in the analysis was to get student-level files for two consecutive years, then match each student's performance in the second year with that of the first.  This was done for two cohorts—the 2011-2012 group, and 2012-2013.

 

Students were matched only if they took the ISTEP+ test in consecutive grades, so students who were retained in a grade were not included in this analysis.  In addition, students who moved in or out of the state between tests were not included, and students were included only if they had valid test scores in both ELA and mathematics for both years.  Despite these restrictions, the vast majority of students were included.  Over 90 percent of the students had a match and valid test scores across years for all grades and years.  The lowest percentage of matched students naturally came from the match from grade 3 in 2012 to grade 4 in 2013, when approximately 2,000 additional students were retained in grade 3.  Even there, the match rate was over 90 percent.
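
The matching step can be pictured as a merge of the two year files on a student identifier, followed by subtraction of scaled scores; the sketch below is illustrative only, and the identifier and column names are assumptions.

```python
# Illustrative sketch: match students across two consecutive years and compute
# scaled-score gains. Only students present in both files with valid scores in
# both subjects are kept. Column names are hypothetical.
import pandas as pd

y2012 = pd.DataFrame({"student_id": [1, 2, 3],
                      "math": [469, 495, 510], "ela": [467, 485, 490]})
y2013 = pd.DataFrame({"student_id": [1, 2, 4],
                      "math": [505, 531, 520], "ela": [488, 506, 495]})

# An inner merge keeps only students with records in both years.
matched = y2012.merge(y2013, on="student_id", suffixes=("_2012", "_2013")).dropna()

matched["gain_math"] = matched["math_2013"] - matched["math_2012"]
matched["gain_ela"] = matched["ela_2013"] - matched["ela_2012"]
print(matched[["student_id", "gain_math", "gain_ela"]])
print(matched[["gain_math", "gain_ela"]].mean())  # average gains, as in Table 11
```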

 

Table 11 provides the numbers of students matched across years and the scaled score gains.  For 2013, the same statistics are provided for students who CTB reported as interrupted and for the CTB plus locally-reported interruptions.

 

The results reported in Table 11 show trends consistent with those of the three previous analyses.  The gains public school students made in 2013 were larger than their gains had been in 2012 for three of the grades, and smaller in the remaining two grades, for both mathematics and English language arts.  Public school students that CTB reported as interrupted had the same or larger gains than the overall average at every grade in mathematics and for three of the five grades in ELA.  Public school students reported by either CTB or locally as having been interrupted had gains equal to or greater than all students at all grades in mathematics and two of five grades in ELA.  In short, the data about overall interruptions indicate that students who were interrupted had gains that were as high as the students who were not interrupted.

 

Table 11

 

Average Growth in ISTEP+ Test Scaled Scores for Students Matched across Years,

2011 to 2012 and 2012 to 2013

 

Matched Grades | Group | Public N | Public Math Gain | Public ELA Gain | Non-Public N | Non-Public Math Gain | Non-Public ELA Gain
---------------|-------|----------|------------------|-----------------|--------------|----------------------|--------------------
3-4 | 2012—All Students | 70,218 | 23 | 20 | 5,585 | 13 | 23
3-4 | 2013—All Students | 68,329 | 36 | 21 | 5,891 | 30 | 15
3-4 | 2013—CTB-reported | 12,387 | 36 | 20 | 793 | 39 | 19
3-4 | 2013—All reported | 26,969 | 36 | 19 | 1,390 | 36 | 18
4-5 | 2012—All Students | 73,275 | 32 | 19 | 5,621 | 31 | 21
4-5 | 2013—All Students | 70,385 | 35 | 20 | 5,927 | 33 | 14
4-5 | 2013—CTB-reported | 13,028 | 37 | 21 | 991 | 35 | 17
4-5 | 2013—All reported | 27,708 | 36 | 22 | 1,350 | 32 | 14
5-6 | 2012—All Students | 71,447 | 16 | 31 | 5,252 | 18 | 39
5-6 | 2013—All Students | 73,396 | 14 | 27 | 5,857 | 14 | 27
5-6 | 2013—CTB-reported | 12,433 | 15 | 25 | 1,058 | 15 | 30
5-6 | 2013—All reported | 27,118 | 14 | 23 | 2,052 | 15 | 27
6-7 | 2012—All Students | 70,444 | 25 | 4 | 4,805 | 23 | -1
6-7 | 2013—All Students | 71,909 | 22 | 2 | 5,413 | 21 | -1
6-7 | 2013—CTB-reported | 10,471 | 23 | 3 | 1,140 | 23 | -3
6-7 | 2013—All reported | 25,971 | 22 | 6 | 1,870 | 20 | -2
7-8 | 2012—All Students | 69,725 | 31 | 7 | 4,794 | 29 | 17
7-8 | 2013—All Students | 70,876 | 31 | 13 | 5,117 | 28 | 18
7-8 | 2013—CTB-reported | 9,085 | 32 | 13 | 770 | 29 | 19
7-8 | 2013—All reported | 24,106 | 32 | 11 | 1,645 | 28 | 16

 

One advantage of student-level data is that the interruptions can be explored in more detail.  For example, while the substantial majority of interrupted students were interrupted in mathematics (the first test taken), some had their first interruption during the ELA test, and therefore were not interrupted during the mathematics test.  We might expect the gains of those students on the mathematics test to be the same as non-interrupted students, but different from students who were interrupted while taking the mathematics test.  Also, the interruption data supplied by CTB provided much more detail about the interruptions.  From those data, we can look at students who were interrupted multiple times during one session of the test and the specific session of the test when they were first interrupted.

 

Table 12 provides data from the interruption data provided by both CTB and local school personnel.  Students were categorized as "None" if they were not interrupted in either the mathematics or the ELA test, "Math" if they were interrupted during the mathematics test, and "ELA" if they were not interrupted during the mathematics test but were first interrupted during the ELA test.  If student test scores were impacted by the interruptions, we would expect the "ELA" and the "None" students to have the same gains on the mathematics test (since the "ELA" students weren't interrupted until after they had completed the mathematics test), but lower gains on the ELA test.  In contrast, we would expect "Math" students to have lower gains than the other two groups on the mathematics test for sure, and possibly on the ELA test as well if we thought interruptions on one test would carry over to a later one.
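
The categorization rule described above can be written out in a few lines; the sketch below uses hypothetical flags (interrupted_math, interrupted_ela) purely for illustration.

```python
# Illustrative sketch: label each student by the first test during which an
# interruption occurred, following the rule described above. Flag names are
# hypothetical.
def first_interruption_group(interrupted_math: bool, interrupted_ela: bool) -> str:
    if interrupted_math:
        return "Math"   # interrupted during the mathematics test
    if interrupted_ela:
        return "ELA"    # not interrupted in math, first interrupted during ELA
    return "None"       # not interrupted in either test

print(first_interruption_group(True, True))    # -> Math
print(first_interruption_group(False, True))   # -> ELA
print(first_interruption_group(False, False))  # -> None
```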

 

Table 12

 

Average Growth in 2013 ISTEP+ Test Scaled Scores for Students Matched across Years,

Reported by First Test during Which They Were Interrupted

 

Matched Grades | Test of First Interruption | Public N | Public Math Gain | Public ELA Gain | Non-Public N | Non-Public Math Gain | Non-Public ELA Gain
---------------|----------------------------|----------|------------------|-----------------|--------------|----------------------|--------------------
3-4 | None | 47,862 | 36 | 21 | 4,710 | 28 | 14
3-4 | Math | 19,049 | 35 | 20 | 1,088 | 37 | 19
3-4 | ELA | 1,418 | 36 | 21 | 93 | 33 | 16
4-5 | None | 49,254 | 34 | 19 | 4,689 | 33 | 14
4-5 | Math | 19,555 | 36 | 21 | 1,099 | 33 | 16
4-5 | ELA | 1,576 | 35 | 21 | 139 | 29 | 6
5-6 | None | 52,860 | 14 | 27 | 4,051 | 13 | 27
5-6 | Math | 18,702 | 14 | 26 | 1,638 | 15 | 28
5-6 | ELA | 1,834 | 14 | 21 | 168 | 15 | 30
6-7 | None | 52,482 | 23 | 2 | 3,711 | 21 | 0
6-7 | Math | 17,791 | 23 | 2 | 1,587 | 20 | -3
6-7 | ELA | 1,636 | 20 | 5 | 115 | 27 | 2
7-8 | None | 52,696 | 31 | 13 | 3,628 | 28 | 18
7-8 | Math | 16,214 | 32 | 13 | 1,285 | 28 | 18
7-8 | ELA | 1,966 | 30 | 12 | 204 | 23 | 11

 

The data do not support that interpretation.  The public school students first interrupted during the math test had math and ELA gains that were not much different from those of the students who were never interrupted at all.  Public school students who were first interrupted during the ELA test had ELA gains within one point of the students who were never interrupted, with the exceptions of grade 6 (where they had significantly lower gains) and grade 7 (where their gains were actually somewhat higher than those of the students who were never interrupted).

 

The results in Table 13 are calculated using the CTB-interrupted data for public school students only.  The reports of interruptions by CTB have provided results similar to the total interruption reports for all the other analyses in this report (with the exception of the number of students identified), but they also provide a level of detail not available from the all-interruptions file.  This table uses information about the specific session during which students were first interrupted (Sessions 1 and 2 were the mathematics sessions, Sessions 3 and 4 were the ELA sessions).  In addition, we identified students who had been interrupted more than once during a session, anticipating that students who had encountered multiple interruptions might have lower gains than students who were interrupted just once (or were not interrupted at all).

 

Table 13

 

Gain Scores for Matched Students,

Reported by Type of Interruption

CTB-Reported Interruptions Only

Public School Students Only

 

Content Area | Grades | Not Interrupted | Any Interruption | First Interrupted in Session 1 | Session 2 | Session 3 | Session 4 | Multiple Interruptions in One Session
-------------|--------|-----------------|------------------|--------------------------------|-----------|-----------|-----------|--------------------------------------
Math | 3-4 | 36 | 36 | 37 | 35 | 30 | 36 | 37
Math | 4-5 | 34 | 37 | 38 | 35 | 31 | 37 | 37
Math | 5-6 | 14 | 15 | 16 | 14 | 14 | 11 | 16
Math | 6-7 | 22 | 23 | 24 | 21 | 21 | 18 | 24
Math | 7-8 | 31 | 32 | 33 | 31 | 32 | 27 | 32
ELA | 3-4 | 21 | 20 | 20 | 20 | 19 | 21 | 20
ELA | 4-5 | 20 | 21 | 21 | 23 | 23 | 22 | 21
ELA | 5-6 | 27 | 25 | 27 | 24 | 18 | 24 | 27
ELA | 6-7 | 2 | 3 | 3 | 1 | 7 | 3 | 2
ELA | 7-8 | 13 | 13 | 14 | 12 | 11 | 12 | 14

 

Consistent with the findings reported earlier, Table 13 shows that students who were interrupted scored at about the same level as, and often slightly higher than, the students who were not interrupted at all.  And contrary to expectations, students who were interrupted multiple times within a session gained as many points as students who were not interrupted at all.  But perhaps the most interesting finding from Table 13 is that the group with the lowest gains for mathematics was always one that was first interrupted in Session 3 or Session 4—sessions that were taken after they had completed the mathematics test.

 

Summary

 

There is considerable evidence that the interruptions had no negative impact on student scores for the vast majority of students;  indeed, students who were interrupted had somewhat larger gains across years than those who were not interrupted.  Given the volume and the nature of the interruptions, this finding certainly will come as a surprise to many.  One possible explanation that might be offered is that the interruptions affected students who were not identified as interrupted—that is, students in a class for which some, but not all, were interrupted might have all been affected by the interruptions.  However, that explanation does not seem plausible, since the state as a whole performed better in 2013 than it had in 2012.  If large numbers of students—numbers beyond the 20-25 percent who were identified as having been interrupted—had been affected, it does not seem possible that the state could have experienced these increases.

 

Although no data were collected that would confirm this hypothesis, it seems most plausible that the response to the interruptions, by both students and school personnel, was enough to overcome the potential problems created by the interruptions.  Students apparently worked as diligently on the tests as they would have if they hadn't been interrupted, and school personnel apparently minimized the impact of the interruptions on students' testing experiences.  Thus, while it certainly took significantly more effort to complete the testing this year because of the interruptions, that effort apparently was successful at negating the impact of the interruptions for the vast majority of students.

 

There were three major events that could have potentially impacted test scores this year:

 

  1. The new policy to retain students in grade 3 because of unsatisfactory scores on the IREAD test.
  2. The switch from paper-and-pencil to online administration for many schools.
  3. The interruptions affecting the online administration.

 

Clearly, the policy to retain students in grade 3 had an impact on changes to the grade 3 and grade 4 scores between 2012 and 2013.  The switch from paper-and-pencil to online administration has not had much of an impact on scores in previous years, but the impact might have been greater this year as the last grades within schools made that transition.

 

It is important to note that this paper addresses only the larger issue of the impact of the interruptions when aggregated over large numbers of students.  When viewed from a high level, no consistent impact on test scores from the interruptions could be seen.  However, this is not the same as saying no student in the state was affected.  It certainly is possible that some students were affected;  if so, those occurrences were overshadowed by the lack of impact on the vast majority of students.  The interruptions data from CTB would permit a study of specific interruption patterns that might indeed permit one to identify students who likely were impacted by the interruptions.  Indeed, CTB has proposed some patterns in the data that will be pursued during the next phase of this study, and it is possible that some students will then be identified as having been affected by the interruptions.  If so, that will be important information to take into account during reporting. 

 

As noted earlier in this report, we cannot know definitively how students would have scored this spring if the interruptions had not happened.  In addition, the interruptions were not the only element that changed in the test administration this year, thereby adding a level of uncertainty as to the root cause of changes when they occurred.  However, the data strongly suggest that the vast majority of students scored as well as they would have had the interruptions never happened.



 

Testing began this year on Monday, April 29.  Starting at about 10:30 that morning, students throughout Indiana experienced interruptions during their testing.  It was quickly discovered that the interruptions were caused by a memory issue on the CTB/McGraw-Hill (CTB) servers.  When CTB's immediate efforts to resolve the situation were unsuccessful, its technology engineers worked to isolate the source of the issues and made the adjustments necessary to return the system to normal status as soon as possible.  In response to these interruptions, Indiana's Superintendent of Public Instruction Glenda Ritz extended the testing window by two days, to May 14, 2013.

 

On the second day of testing, at around 11:15, a different memory issue on CTB/McGraw-Hill's servers caused additional widespread interruptions for Indiana students.  Students again experienced the issues seen on April 29, but in greater volume.  In response, CTB determined that the ISTEP+ Online system had to be "cut over" to the disaster recovery site.  While the system remained accessible, this "cut over" caused interruptions for almost all students who were active in the system.  Also, as the system was moved from the regular to the disaster recovery servers, not all of the student responses were immediately accessible to students when they logged back into that test session.  All of the student responses had been saved, but they were not immediately available due to the system issues.  Based on the severity of the interruptions and a recommendation from CTB, the State Superintendent requested that students complete their current test session and that schools then suspend online testing for the rest of the day.  Superintendent Ritz asked that schools reduce their online testing to 50 percent of their planned testing load for the following day.  Also, Superintendent Ritz extended the online testing window three additional days, through May 17, 2013.

 

On May 1, online testing resumed at 50% of planned capacity.  Students using CTB's system experienced no further widespread interruptions.  As a precautionary measure, Superintendent Ritz asked schools to continue to reduce online testing to 50% of their planned testing load for the following day.   On May 2, Superintendent Ritz once again asked schools to reduce online testing to 50% of their planned testing load for one more day as a precautionary measure. On May 3, Superintendent Ritz conducted three conference calls with Indiana superintendents.  On May 6, she directed schools to resume online testing at 100% of their capacity.  Online testing was completed on May 17.

 

On May 24, the Department of Education provided schools with a list of students that CTB indicated had interrupted testing sessions.  The Department gave that list to local schools so that they could check the list against their records and add any students they determined were impacted by the interruptions but missed by CTB. 

 

On that same day, the Department also issued a request for qualifications to three national companies experienced in validating test results.  From that process, the National Center for the Improvement of Educational Assessment was awarded a contract to investigate the impact the interruptions had on ISTEP+ test scores.  This report is the outcome of that investigation.

 

Description of the Interruptions

 

There are two sources of data available about the interruptions.  The first comes from the records of CTB.  As students completed the test, data were captured about the timing of all events.  As a result, the CTB data can, for example, tell how much time a student spent on the test before an interruption occurred, how many items were presented to the student before the interruption, and how long it was before the student answered another question.  In addition to the CTB data, local school systems were provided with the opportunity to identify additional students who were interrupted—or affected by interruptions, in the judgment of the local person completing the form.  These data were collected by providing local school systems a list of the students identified by CTB as having been interrupted and allowing them to append additional students to the file.  In contrast to the detail of the CTB data, the local appends identified only the test (mathematics, English/language arts [ELA], science, or social studies) for which a student had been affected.

 

Table 1 provides the number of interruptions, reported by grade, session and type of school, as identified by CTB.  As can be seen from the data, there were significant numbers of interruptions at all grades, but grades 3-6 had a higher proportion of interruptions than grades 7 and 8.  This may be simply a function of the time of day that testing started—it is reasonable to presume that students in grades 7 and 8 started testing earlier in the day than students at the lower grades, and therefore more students at those grades were finished before the interruptions started.  It is also clear that the substantial majority of interruptions occurred during Sessions 1 and 2 (when students were taking the mathematics test) rather than during the later sessions.  Of course, it is possible that a student who was interrupted during Session 1 was affected for the remainder of the testing—that is, we cannot assume that because far fewer interruptions occurred during Sessions 3 and 4 (when students were taking the ELA test), ELA scores were unaffected by the interruptions.  Non-public school students had approximately the same proportion of interruptions as public school students, although this trend varied from grade to grade.  Non-public school students make up about 7.5 percent of the tested population, and had slightly less than 8 percent of the interruptions, totaled across the grades.  Their percentage ranged from a high of 12 percent at grade 7 down to 6 percent at grade 3.

 

Table 1
CTB-Reported Interruptions, by Grade, Session and School Type

Grade   School Type     Session 1   Session 2   Session 3   Session 4   Session 5   Session 6      Total
3       Public             10,745       5,429         784         929           0           0     17,887
3       Non-Public            522         421         131          46           0           0      1,120
4       Public             10,821       5,588       1,046         590         543         598     19,186
4       Non-Public            510         607         102          67          16          37      1,339
5       Public             12,006       5,684         947         864         862         481     20,844
5       Non-Public          1,019         321          49         110          17          15      1,531
6       Public              9,474       7,145       1,332       1,132         595         659     20,337
6       Non-Public            735         738         169          59          55          43      1,799
7       Public              8,729       4,321         813         986         594         518     15,961
7       Non-Public          1,315         711         111          86          26          16      2,265
8       Public              7,255       4,399       1,104       1,054           0           0     13,812
8       Non-Public            571         474          90         163           0           0      1,298
Total   Public             59,030      32,566       6,026       5,555       2,594       2,256    108,027
Total   Non-Public          4,672       3,272         652         531         114         111      9,352
Total   All                63,702      35,838       6,678       6,086       2,708       2,367    117,379

 

Once students were interrupted, there was a range of time before they restarted the test.  Sometimes the length of that delay was a function of the responsiveness of the system;  at other times, it was due to a school decision to stop the administration for students for a period of time and have them restart the test at a later time.  When students restarted, they sometimes had to redo the last item they had been working on before the interruption occurred, but for the vast majority of students, this was the extent of lost data.  However, there were 600 students (440 in math and 160 in ELA) whose data were not "restored" when they logged back in.  These students ended up with two sets of responses to their interrupted session, and if any of their answers were different (and either one was correct), they were given credit for the correct answer.
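The scoring rule applied to those 600 students can be stated compactly in code.  The following is purely an illustrative sketch, with invented item identifiers and response records rather than anything from the actual scoring system:

def merge_response_sets(first_attempt, second_attempt, answer_key):
    """Return one score per item, giving credit if either saved attempt is correct."""
    merged = {}
    for item_id, correct in answer_key.items():
        responses = [attempt.get(item_id) for attempt in (first_attempt, second_attempt)]
        # Credit the item if either recorded response matches the key.
        merged[item_id] = 1 if correct in responses else 0
    return merged

# Hypothetical example for one interrupted student with two saved response sets.
key = {"M01": "B", "M02": "D", "M03": "A"}
before_cutover = {"M01": "B", "M02": "C"}
after_cutover = {"M01": "B", "M02": "D", "M03": "C"}
print(merge_response_sets(before_cutover, after_cutover, key))   # {'M01': 1, 'M02': 1, 'M03': 0}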

 

In order to summarize the length of the interruptions, they have been categorized as follows (a small illustrative binning sketch follows the list):

 

  1. Less than 2 minutes
  2. 2 minutes or more, but less than 5 minutes
  3. 5 minutes or more, but less than 15 minutes
  4. 15 minutes or more, but less than one hour
  5. One hour or more, but less than a day
  6. One day or more
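As a rough illustration of this scheme (not the report's actual processing code), a delay measured in seconds could be assigned to these categories as follows:

def delay_category(delay_seconds):
    # Map a delay length to category codes 1-6; return None if the delay is unknown,
    # as with the records that had no end-of-interruption time.
    if delay_seconds is None:
        return None
    minutes = delay_seconds / 60.0
    if minutes < 2:
        return 1
    if minutes < 5:
        return 2
    if minutes < 15:
        return 3
    if minutes < 60:
        return 4
    if minutes < 60 * 24:
        return 5
    return 6

print([delay_category(s) for s in (45, 300, 7200, 172800, None)])   # [1, 3, 5, 6, None]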

 

Table 2 provides the information about the length of delays using the above categorization scheme.  For public school students, the most common delay was for a day or more, although that was less than a majority of the interruptions.  For students delayed less than a day, the most common delay was for 5 minutes or more, but less than 15.  Students in non-public schools had more of a tendency to restart the test the same day they were interrupted, with the most common delay being 5-15 minutes for them, too.  A total of 734 observations (less than 1 percent) could not have their delay coded because their end-of-interruption time was not recorded on the interruptions file.

 

Table 2
CTB-Reported Interruptions, by Length of Interruption
(Length codes: 1 = under 2 min; 2 = 2 to under 5 min; 3 = 5 to under 15 min; 4 = 15 min to under 1 hour; 5 = 1 hour to under 1 day; 6 = 1 day or more)

Grade   School Type    Code 1   Code 2   Code 3   Code 4   Code 5   Code 6      Total
3       Public            452    1,721    5,395    1,619    1,196    7,433     17,816
3       Non-Public         53      129      437       62       76      343      1,100
4       Public            806    2,429    4,756    2,039    1,620    7,417     19,067
4       Non-Public        123      251      369      101       88      399      1,331
5       Public          1,202    2,629    5,347    1,868    1,456    8,217     20,719
5       Non-Public        113      261      522      117      134      367      1,514
6       Public          1,285    2,716    5,396    1,832      981    8,003     20,213
6       Non-Public        224      272      600      147      111      436      1,790
7       Public          1,324    2,516    4,791    1,442      592    5,195     15,860
7       Non-Public        273      303      727      328       67      549      2,247
8       Public          1,098    1,904    3,656    1,243      651    5,164     13,716
8       Non-Public        106      227      388      151      114      286      1,272
Total   Public          6,167   13,915   29,341   10,043    6,496   41,429    107,391
Total   Non-Public        892    1,443    3,043      906      590    2,320      9,254
Total   All             7,059   15,358   32,384   10,949    7,086   43,809    116,645

           

There were a total of 117,379 interruptions.  Some students were interrupted more than once, and the data in Tables 1 and 2 are a duplicated count—that is, if students were interrupted more than once, they show up in those tables as many times as they had interruptions.  Table 3 provides information about the numbers of times students were interrupted, and these are unduplicated counts.  A total of 79,442 students were interrupted, which is about one-sixth of the total population.  Earlier, we provided a caution that just because a student was interrupted while taking the mathematics test, one cannot assume that the interruption did not affect the student's performance on later sections of the test.  Similarly, we caution here that just because a student was not reported as interrupted, that does not mean the student was unaffected by the interruptions.  The interruption of one student in a room could conceivably have an effect on other students in that same room.  Table 3 is a count of the numbers of students directly affected by the interruptions. 
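The difference between duplicated and unduplicated counts is easy to see in a small sketch; the column names and records below are hypothetical, not the actual CTB file layout:

import pandas as pd

events = pd.DataFrame({
    "student_id": ["A1", "A1", "B2", "C3", "C3", "C3"],
    "session":    [1,    2,    1,    1,    1,    3],
})

duplicated_count = len(events)                           # one row per interruption (Tables 1 and 2)
unduplicated_count = events["student_id"].nunique()      # each student counted once (Table 3)
times_interrupted = events.groupby("student_id").size()  # basis for the number-of-interruptions columns

print(duplicated_count, unduplicated_count)              # 6 3
print(times_interrupted.to_dict())                       # {'A1': 2, 'B2': 1, 'C3': 3}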

 

 

Table 3
CTB-Reported Interruptions, by Numbers of Interruptions for Students

Grade   School Type         1        2       3       4      5   6 or more      Total
3       Public          9,132    2,844     665     156     46          32     12,875
3       Non-Public        497      177      49      18     10           0        751
4       Public          9,155    2,543   1,056     260     80          51     13,145
4       Non-Public        507      212      75      32     11           0        837
5       Public          9,179    2,985   1,164     366     85          47     13,826
5       Non-Public        688      223      91      26      4           0      1,032
6       Public          8,607    2,845     998     467    153          66     13,136
6       Non-Public        707      211      85      40     34          14      1,091
7       Public          7,913    2,133     751     223     86          32     11,138
7       Non-Public        634      246     142     102     27          26      1,177
8       Public          6,904    1,802     617     214     72          36      9,645
8       Non-Public        517      136      75      38      9          14        789
Total   Public         50,890   15,152   5,251   1,686    522         264     73,765
Total   Non-Public      3,550    1,205     517     256     95          54      5,677
Total   All            54,440   16,357   5,768   1,942    617         318     79,442

 

The data in Table 4 include both CTB- and locally-reported interruptions, and therefore are reported at a somewhat coarser level.  For example, rather than specifying the session during which a student was interrupted, this table is limited to the test.  (The mathematics test was administered in Sessions 1 and 2 and the ELA test was administered in Sessions 3 and 4.  For students in grades 4-7, there were two additional sessions, during which they took either social studies or science, depending on their grade.)  Also, rather than reporting the number of interruptions, these data provide the number of tests for which students were interrupted (some students were interrupted more than once during a testing session, which would have been reflected in the previous tables, but is a level of detail that cannot be reported in Table 4).

 

 

Table 4
Numbers of Tests for Which Students Were Interrupted, Combining CTB- and Locally-Reported Data

Grade   School Type    0 Tests   1 Test   2 Tests   3 Tests   4 Tests      Total
3       Public          54,001   18,887     4,204       296       269     77,657
3       Non-Public       5,421      949       147         8        72      6,597
4       Public          50,059   18,240     1,825     2,588       223     72,935
4       Non-Public       5,030    1,018       138       103        53      6,342
5       Public          51,520   18,454     1,951     2,919       186     75,030
5       Non-Public       5,072      887       288       103        47      6,397
6       Public          55,737   17,069     2,333     3,169       279     78,587
6       Non-Public       4,387    1,430       266       150        77      6,310
7       Public          56,054   16,907     1,582     2,800       286     77,629
7       Non-Public       4,087    1,384       302        69        23      5,865
8       Public          57,086   14,946     4,050       253       198     76,533
8       Non-Public       4,012    1,227       286         6        21      5,552
Total   Public         324,457  104,503    15,945    12,025     1,441    458,371
Total   Non-Public      28,009    6,895     1,427       439       293     37,063
Total   All            352,466  111,398    17,372    12,464     1,734    495,434

 

From Table 3, we know that CTB identified interruptions for just short of 80,000 students. From Table 4, we see that of the 495,434 students tested statewide across all grades, 352,466 had no tests interrupted—meaning 142,968 were reported as having at least one test interrupted when the locally-reported interruptions are added into the CTB-reported interruptions.  Thus, we know that the locally-reported interruptions added about 60,000 students to the list.  Combined across both data sets, approximately 29 percent of the students were identified as being directly affected by the interruptions.  The number that were indirectly affected—that is, did not have an interruption in their own test, but had a disruption in their classroom that affected them—is unknown.
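For readers who want to trace the arithmetic, the combined figures quoted above can be reproduced directly:

ctb_reported = 79_442         # unique students in the CTB interruption file (Table 3)
combined_reported = 142_968   # unique students after local appends: 495,434 - 352,466 (Table 4)
tested = 495_434

print(combined_reported - ctb_reported)                # 63526, i.e., about 60,000 students added locally
print(round(100 * combined_reported / tested, 1))      # 28.9, i.e., approximately 29 percent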

 

Some inconsistencies in Table 4 should be noted.  For example, no student in grade 3 or grade 8 took more than two tests (those students are tested in mathematics and ELA only), and no student in any grade took more than three tests, so some locally-reported interruptions do not reflect the reality of the testing system.  But those discrepancies are small compared to the general information, so it appears as though the vast majority of local school personnel completing the form did so accurately to the best of their ability.

 

Table 5 provides the counts from the CTB- and locally-reported data set on the number of students interrupted for each test.

 

 

 

 

 

Table 5
Numbers of Students Interrupted by Test, Combining CTB- and Locally-Reported Data

Grade   School Type       Math     ELA   Science   Social Studies
3       Public          21,717   6,577       N/A              N/A
3       Non-Public       1,029     368       N/A              N/A
4       Public          20,194   5,810     4,067              N/A
4       Non-Public       1,138     392       220              N/A
5       Public          20,703   6,180       N/A            4,331
5       Non-Public       1,159     529       N/A              219
6       Public          19,719   7,202     4,815              N/A
6       Non-Public       1,695     609       323              N/A
7       Public          18,932   6,144       N/A            4,023
7       Non-Public       1,635     450       N/A              173
8       Public          17,220   6,612       N/A              N/A
8       Non-Public       1,331     518       N/A              N/A

 

Table 5 provides some interesting information.  For example, CTB had identified slightly over 12,000 students interrupted in math for grade 3;  after adding in the locally-reported interruptions, the number is almost twice that.  In addition, about 85 percent of the interruptions in the CTB file were during the math test, but that percentage is much lower in Table 5.  While a strong majority of the interruptions are in math, the interruptions during the ELA test total about one-fourth of all the interruptions.  A reasonable assumption is that school personnel did indeed frequently code students as being interrupted in ELA not because they were directly interrupted during that test, but because they felt interruptions occurring during the math test carried over to later tests.

 

While some of the data to be presented in this paper deal with student-level analyses, another portion will look at results aggregated to the school level.  For the CTB-reported interruptions, 169 schools (out of 1,831—over 9 percent) had no interruptions for any students at any grade within the school.  Half the schools had interruptions for 12 percent or fewer of their students, and only 10 percent of the schools had more than 37 percent of their students interrupted.  The average percentage of interruptions for public schools was 16.5 percent; for non-publics, the average was 14.3 percent.  At first, it seemed as though it might be worthwhile looking at the schools with no interrupted students separately (as a baseline, since they had no interruptions).  However, since these schools were disproportionately non-public (93 out of 169, or more than half) and tended to be considerably smaller than average (about half the number of students as an average school), they cannot be presumed to be representative of the state as a whole, and therefore that area of investigation was abandoned.

 

The correlations of the percentage of students interrupted across grades within a school were modest.  For public schools, the highest correlation was between the percentage interrupted at grade 6 and the percentage interrupted at grade 7—0.25.  Almost all of the remaining correlations were less than 0.20.  This means that schools that had many interrupted students at one grade tended not to have as high a percentage at other grades.  The consequence of this is that whatever impact the interruptions might have had on student achievement would be somewhat diminished when results are aggregated across all grades in a school.
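A sketch of that correlation check, using hypothetical school-level percentages rather than the actual data, might look like the following:

import pandas as pd

school_pct = pd.DataFrame({
    "school_id": ["S1", "S2", "S3", "S4", "S5"],
    "grade6_pct_interrupted": [0.05, 0.40, 0.12, 0.00, 0.25],
    "grade7_pct_interrupted": [0.10, 0.30, 0.05, 0.02, 0.20],
})

# Pearson correlation between the two grade-level interruption percentages;
# the highest value actually observed in the report was 0.25.
corr = school_pct["grade6_pct_interrupted"].corr(school_pct["grade7_pct_interrupted"])
print(round(corr, 2))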

 

The Impact of Interruptions on Test Scores

 

It has been important to note the range and number of interruptions that occurred during ISTEP+ testing this past spring.  The interruptions created a significant burden for students, teachers and administrators who had to deal with the issue and make their best efforts to get students' responses to reflect their real achievement levels.  In this section, we will look at the extent to which their efforts were successful—did the interruptions have a negative impact on student achievement, or were schools able to get valid scores from students despite the obstacle that the interruptions provided?

 

We cannot know definitively how students would have scored this spring if the interruptions had not happened.  However, we can look at historical information and determine whether the scores attained this spring were consistent with predictions we would have made from an historical perspective.  We will look at four sources of data to inform these predictions:

 

  1. The overall statewide results—that is, the change in statewide mean scaled scores between 2012 and 2013.  If the interruptions this spring had a negative effect on student scores, we might expect statewide mean scaled scores this year to have declined from last year.
  2. The improvement in school scores from 2012 to 2013, especially in comparison to the improvements shown by those schools from 2011 to 2012.  Some schools had no students with interruptions;  others had a substantial majority.  If the interruptions had a negative effect on student scores, we would expect the improvements to be better sustained in schools with lower percentages of interrupted students.  This analysis holds grade within school constant, but looks at different cohorts of students (e.g., comparing grade 3 in 2012 to grade 3 in 2013).
  3. The gain in school mean scores, following a cohort of students across grades within a school (e.g., looking at grade 3 in 2012 and grade 4 in 2013).  Again, one would expect the gains to be higher in the schools with fewer interruptions.
  4. Student-level data matched across years.  Again, one would expect the students without interruptions to have the largest gains from year to year, and those with the most troublesome interruptions (early in the testing session, multiple times within session, longer delays during a session) to have smaller gains than all other students.

 

For the last two analyses, we will compare the changes from 2012 to 2013 with comparable data from 2011 to 2012.  Since there were no interruptions in 2012, looking at the data from 2011 to 2012 in the same way as 2012 to 2013 provides a baseline of expectations.  So, for example, we will be looking at the gains from 2011 to 2012 for the schools that had larger percentages of interruptions in 2013 to see how much they changed the year before they were interrupted and then comparing that to the change the year they were interrupted.

 

 


 

Overall Statewide Results

 

Table 6 provides an overview of the statewide results since the inception of the ISTEP+ test in 2009.  As can be seen from the table, the state enjoyed substantial gains from the first year to the second year of the program, which is not unusual—scores often change the most in the first years of a testing program as schools adjust their curriculum to the new material being assessed.

 

The purpose of providing Table 6 is to set an historical context for the 2013 results.  If the interruptions had a serious impact on student test scores, we could expect the 2013 scores, and in particular the gains from 2012 to 2013, to be out of line with changes from previous years.  That did not happen.  Averaged across the grades, the state increased by 4 scaled score points a year in mathematics between 2010 and 2012, and 3 scaled score points in English language arts.  Between 2012 and 2013, the state increased by an average of 4 scaled score points in mathematics and 1 scaled score point in ELA.
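The averaging described above can be checked directly against the grade-level means in Table 6; this is a simple arithmetic illustration, not code from the original analysis:

math_2012 = {3: 469, 4: 495, 5: 529, 6: 544, 7: 562, 8: 587}
math_2013 = {3: 470, 4: 509, 5: 531, 6: 543, 7: 567, 8: 593}
ela_2012  = {3: 467, 4: 485, 5: 505, 6: 531, 7: 536, 8: 545}
ela_2013  = {3: 465, 4: 491, 5: 506, 6: 531, 7: 534, 8: 549}

def mean_change(later, earlier):
    return sum(later[g] - earlier[g] for g in later) / len(later)

print(round(mean_change(math_2013, math_2012)))   # 4 scaled score points in mathematics
print(round(mean_change(ela_2013, ela_2012)))     # 1 scaled score point in ELA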

 

Table 6
Mean ISTEP+ Scaled Scores for Public School Students, 2009 through 2013

                      Mathematics                      English Language Arts
Grade    2009   2010   2011   2012   2013      2009   2010   2011   2012   2013
3         452    463    470    469    470       452    460    463    467    465
4         478    491    495    495    509       470    479    484    485    491
5         506    520    527    529    531       493    496    500    505    506
6         532    533    536    544    543       510    522    529    531    531
7         542    553    555    562    567       523    533    538    536    534
8         566    578    583    587    593       534    544    545    545    549

 

Scores increased from 2012 to 2013 in five grades in mathematics (the exception being a decrease of 1 point in grade 6) and in three grades in ELA.  Scores increased more in mathematics than in ELA in five grades, which is an interesting result, given that the substantial majority of the interruptions occurred while students were taking the mathematics test.  However, it is possible that the effect of the interruptions was cumulative—that is, once interruptions started happening, their impact grew as disruptions caused, for example, alterations in testing schedules. Combined with the fact that students completed some portion of the mathematics test before the interruptions started (and thus can be presumed to have some portion of the mathematics test reflect their full level of achievement), it is possible that some effect of the interruptions can be seen in this table.  However, Indiana has seen greater gains in mathematics scores than ELA scores over the years, and therefore observing greater gains in mathematics is consistent with historical patterns.

 

Table 7 looks at the 2012 and 2013 results in a bit more detail.  The substantial increase in scaled scores in both mathematics and ELA in grade 4, combined with the lack of improvement at grade 3 (indeed, a loss of 2 scaled score points in ELA) warranted a more careful look at what might have been the cause of those changes.

 

 

Table 7
Numbers of Students Tested and Mean Scaled Scores on the ISTEP+ Test for 2012 and 2013

                          Mathematics                                    English Language Arts
         2012 N   2012 Mean   2013 N   2013 Mean   Change     2012 N   2012 Mean   2013 N   2013 Mean   Change
Grade 3  74,283       469     76,410       470       +1       73,771       467     75,928       465       -2
Grade 4  74,133       495     71,755       509      +14       73,717       485     71,359       491       +6
Grade 5  77,150       529     73,719       531       +2       76,770       505     73,363       506       +1
Grade 6  75,587       544     77,012       543       -1       75,130       531     76,581       531        0
Grade 7  74,873       562     75,768       567       +5       74,396       536     75,372       534       -2
Grade 8  74,534       587     74,675       593       +6       74,099       545     74,307       549       +4

 

A clue as to what happened comes from looking at the changes in the numbers of students tested across years, following the same cohort.  At every grade, the 2013 numbers are consistent with those of the previous year, except going from grade 3 in 2012 to grade 4 in 2013, where the number of students tested declined by over 2,000.  An inquiry revealed that a new policy was put into place in 2013, whereby third-grade students who did not pass a reading test the previous spring or summer would continue to receive Grade 3 reading and literacy instruction, would receive additional interventions based on individual student learning needs, and would be officially reported as a third-grader the following school year (in this case, 2012-13).   As a result of this policy, approximately 2,500 students who would have been tested in the fourth grade in previous years took the third grade test instead. 

 

The following is a more detailed description of the policy, the implementation process, and the number of affected students.

 

To implement IC 20-32-8.5 (Reading Deficiency Remediation Plan), the Indiana State Board of Education and the Indiana Department of Education enacted a new policy during the 2011-12 school year, whereby third-grade students who 1) did not achieve a passing score on the IREAD-3 assessment in either Spring 2012 or Summer 2012, and 2) were not eligible for good cause exemptions, were retained as third graders for the 2012-13 school year as a last resort.

It is important to note that some of the retained students were actually placed in grade 4 classrooms for instruction, as it is the responsibility of the local school to design a program that meets the learning needs of students and to determine classroom assignments.

In February 2013, Superintendent Ritz communicated to schools and corporations the flexibility that would exist during the spring of 2013 to provide the Grade 4 ISTEP+ test to any third grade student who met these criteria:

1) The student did not pass IREAD-3 in Spring or Summer 2012 or receive a Good Cause Exemption (and was thus reported as a third grader during the 2012-13 school year);

2) The student received fourth grade instruction in all content areas (including literacy) during the 2012-13 school year; and

3) The student's parents understood that their child would be assessed using the Grade 4 ISTEP+ test.

Superintendent Ritz's memo to superintendents and principals outlining this flexibility emphasized that all students participating in the Grade 4 ISTEP+ test (including those students who met the above criteria) would factor into a school or corporation's accountability calculations for Grade 4.  In total, schools and corporations exercised the option to administer the Grade 4 ISTEP+ test to nearly 250 Indiana third grade students in the spring of 2013.

Thus, there were approximately 2,500 students who are included in the grade 3 results for 2013 whose counterparts are missing from the 2012 results—and are not included in the grade 4 results for 2013.  Since these are students who did not pass a grade 3 reading test in 2012, it is reasonable to presume that they would have been among the lowest scoring students in reading, and below average in mathematics.  Removing those students from the fourth grade results and adding them into the third grade certainly raised the grade 4 2013 average, and may very well have lowered the grade 3 average as well.

 

To further investigate the issue, we looked at the numbers of students passing the ISTEP+ test in both years.  If the increase in grade 4 scores was mostly due to the change in policy, we should see the numbers of students passing the test approximately equal across the years, but a sharp decline in the number of failing students.  That is indeed what happened.  The number of students passing the grade 4 ELA test remained almost identical across the years, but the number of "Did Not Pass" students declined by over 2,000.  In mathematics, about 1,500 more students passed, but the number of "Did Not Pass" students declined by over 3,700.  So it is reasonable to presume that if the new policy had not been in place, and those 2,500 students affected by it had been tested in the fourth grade rather than the third, the change in mean scaled scores would be modestly positive for ELA for both grade 3 and grade 4, and mathematics mean scaled scores would have increased by several points at both grades.

 

Another policy change that complicates the interpretation of the changes of scores from one year to the next is the change from paper-and-pencil to on-line administration of the test.  Beginning with the 2009 administration of the ISTEP+ test, Indiana has been transitioning to online administration.  The percentage of students taking the test online was quite small in 2009 and 2010, but it was 36 percent in 2011, 71 percent in 2012, and 95 percent in 2013.  That rate of transition has not been constant across the grades, however.  In 2012, 92 percent of the grade 8 students took the test online, while only 34 percent of the third graders did.  The most typical pattern has been to transition one grade per year, and for the highest grades to start the transition first.  As a result, grade 3 in the elementary grades had the largest percentage of students transitioning this year, and grade 6 in the middle school grades.

 

While studies done in previous years have shown that the impact of the transition on test scores has been minimal, those studies were done on schools and grades that were earlier adopters.  The improvement in scores for the middle school grades was highest for grade 8, followed by grade 7 and grade 6 in that order—and that is the same order as the percentage of online administration in 2012 (grade 8 was 92 percent, grade 7 was 86 percent, and grade 6 was 66 percent).  As a result, interpretation of the changes from 2012 to 2013 should take into account not only the interruptions but also the change in mode of administration for many students.

 

The changes in scores from 2012 to 2013, once the changes in populations in grades 3 and 4 due to the new retention policy implemented this year are taken into account, are generally positive, and consistent with changes that Indiana has seen in the past.  Thus, while it is possible that some small portion of students may have had the interruptions affect their scores, it appears that on average across the vast majority of students, student performance was as high as it would have been if the interruptions had not occurred.

 

 

The Improvement in School Scores

 

A second investigation into the impact of the interruptions on student scores is a look at the changes in test scores at the school level across years, holding grade constant—that is, for example, comparing how grade 3 in a school scored in 2013 to how the third graders in that same school scored in 2012.  This statistic of cross-cohort change is generally referred to as "improvement" (in contrast to "growth," which refers to following the same cohort across grades).
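The distinction can be made concrete with a short sketch; the school, years, and mean scores below are hypothetical:

import pandas as pd

means = pd.DataFrame({
    "school":  ["S1", "S1", "S1", "S1"],
    "year":    [2012, 2013, 2012, 2013],
    "grade":   [3,    3,    4,    4],
    "mean_ss": [468,  471,  494,  508],
})

def improvement(df, grade, y1, y2):
    # Cross-cohort change: the same grade in consecutive years (different students).
    g = df[df["grade"] == grade].set_index("year")["mean_ss"]
    return g[y2] - g[y1]

def growth(df, grade, y1, y2):
    # Same cohort followed across grades: grade g in y1 versus grade g+1 in y2.
    start = df[(df["grade"] == grade) & (df["year"] == y1)]["mean_ss"].iloc[0]
    end = df[(df["grade"] == grade + 1) & (df["year"] == y2)]["mean_ss"].iloc[0]
    return end - start

print(improvement(means, grade=3, y1=2012, y2=2013))   # 3: grade 3 in 2013 vs. grade 3 in 2012
print(growth(means, grade=3, y1=2012, y2=2013))        # 40: grade 3 in 2012 vs. grade 4 in 2013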

 

For these analyses, we computed the percentage of students interrupted in each grade in each school in the state twice—once for the CTB-reported interruptions, and then again for the interruptions added by local school personnel.  Table 8 provides the average percentages of students interrupted.

 

Table 8
School Mean Percentages of Students Interrupted

               CTB-Reported Interruptions                   All Reported Interruptions
         Public              Non-Public                   Public          Non-Public
Grade    N      Mean %       N      Mean %                Mean %          Mean %
3        1,063    16         263      10                    29              16
4        1,057    18         267      14                    31              21
5          975    19         266      15                    31              20
6          692    17         260      16                    29              26
7          511    13         247      18                    24              27
8          501    13         243      11                    24              23

 

For the next analysis, also done grade by grade, public schools are grouped into three categories.  The first group had no students interrupted at that grade;  the second had some interrupted students, but less than 20 percent; and the third group had 20 percent or more students interrupted.  Table 9 provides the changes in test scores from 2012 to 2013, holding grade constant, for the three groups of schools.

 

 

Table 9
Average Change in ISTEP+ Test School Mean Scaled Scores between 2012 and 2013,
Reported by Percentage of Students Interrupted—Public Schools Only

                                   CTB-Reported Interruptions            All Reported Interruptions
Grade   Pct. Interrupted      Schools   Chg. Math   Chg. ELA        Schools   Chg. Math   Chg. ELA
3       None                      265           3         -1            133           4          0
3       >0 to <20%                489           1         -3            439           1         -2
3       20% or more               290           2         -2            472           1         -2
4       None                      238          13          5            138          10          3
4       >0 to <20%                485          13          5            407          13          6
4       20% or more               314          13          4            492          14          5
5       None                      191           0          0            119           2          1
5       >0 to <20%                448           0          1            375          -1          1
5       20% or more               304           3          2            449           2          2
6       None                      138           1          2             75           1          1
6       >0 to <20%                332          -1          0            293          -1          1
6       20% or more               192          -3         -1            294          -2         -1
7       None                       74           6          0             50           9          3
7       >0 to <20%                299           5         -2            248           5         -2
7       20% or more               104           5         -2            179           5         -2
8       None                       82           9          7             45           7          4
8       >0 to <20%                281           6          3            252           6          4
8       20% or more               113           7          4            179           8          5

 

If the interruptions had an impact on student test scores, the expectation for Table 9 would be that schools with no interruptions would show the most positive changes between 2012 and 2013, and that schools with greater rates of interruption would show less positive (or more negative) gains.  An example of this expected pattern occurs in grade 6 mathematics, where the schools with no CTB-reported interruptions had a mean gain of 1 scaled score point, while those with up to 20 percent of their students interrupted had a mean loss of 1 point, and those with 20 percent or more of their students interrupted had a mean loss of 3 points.  If that pattern had held up over the grades, it might be reasonable to presume that the interruptions had a small but measurable impact on test scores.  However, the pattern varies from grade to grade and from content area to content area.  The lack of a discernible pattern is true whether one looks at the CTB-reported interruptions only, or those combined with the school-reported interruptions.  On average across the grades, the gap between the non-interrupted schools and those with interruptions is about 1 point—on a test where the student-level standard deviation is between 50 and 75 points, depending on the grade and subject.
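A compact sketch of the grouping and averaging behind Table 9, with invented school records and column names, is given below:

import pandas as pd

schools = pd.DataFrame({
    "school_grade": ["S1-g4", "S2-g4", "S3-g4", "S4-g4"],
    "pct_interrupted": [0.00, 0.08, 0.35, 0.22],
    "change_math": [13, 15, 12, 14],   # 2013 mean minus 2012 mean at the same grade
})

def band(pct):
    if pct == 0:
        return "None"
    return ">0 to <20%" if pct < 0.20 else "20% or more"

schools["band"] = schools["pct_interrupted"].apply(band)
print(schools.groupby("band")["change_math"].mean().round(1))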

 

 

The Gain in School Scores

 

In contrast to the previous analysis, this one looks at the gains in scaled scores of cohorts of students across grades.  For this analysis, we need a baseline of growth expectations—that is, simply knowing that students gained from one year to the next would be insufficient information, since most students grow from year to year.  Therefore, we looked at the gains from 2011 to 2012 to use as a basis for comparing the growth from 2012 to 2013.

 

Schools are included in this analysis at a particular grade only if they also enrolled students the previous year at the lower grade.  Thus, for example, if a middle school enrolls students in grades 6-8, that school would be included in this analysis at grades 7 and 8, but not grade 6.  This is an issue that will be dealt with differently in the next analysis, where students will be matched from year to year regardless of their school in either year.

 

Tables 10a and 10b are identical to each other, except that Table 10a reports the results for schools broken down on the basis of the percentage of students interrupted as per the CTB-reported interruptions, whereas Table 10b includes all reported interruptions.  The same scores for each school are used in both tables—the only difference between them is the categorization of the schools.  Since the school-appended interruption files contain more records than the CTB interruption files, more schools are categorized in the third level of interruption, and fewer in the first level.

 

One interesting aspect of this analysis is that the schools are categorized by the percentage of students interrupted at the grade in 2013, yet the tables also include information on change from 2011 to 2012—the year before the interruptions took place.  Given that the interruptions were broadly distributed across schools, we would expect no systematic differences among the three groups within a grade for that baseline year.  So, for example, all three groups of schools had approximately the same amount of gain from grade 3 in 2011 to grade 4 in 2012—about 25 points.  However, there are differences in those baseline scores as large as 5 points among the groups (grade 6-7 math and grade 5-6 ELA) in Table 10a, and one as high as 9 points in Table 10b (grade 6-7 math), and these likely reflect the normal variation one might expect to find across scores from year to year with this limited number of schools in each group.  Therefore, if we were to see a difference of this magnitude in the 2012 to 2013 gains, that difference might very well have been simply a reflection of this normal variation for that particular group.

 

But in fact, the differences between the groups tend to be smaller in 2013—when the interruptions happened—than they were in 2012—the year before the interruptions.  Also, when one aggregates the data across grade levels and compares the average changes from 2011 to 2012 with the changes from 2012 to 2013, the results for all three categories of schools are almost identical, whether one uses the CTB-only data or the CTB data aggregated with the school-reported interruptions.  The gains schools made in 2013 are not related to the amount of interruption their students endured.  The schools with no interruptions did not have larger gains than schools that were interrupted, and schools with more moderate amounts of interruption did not have larger gains than schools with larger percentages of interrupted students.

 

 

Table 10a
Average Growth in ISTEP+ Test School Mean Scaled Scores between 2012 and 2013,
Reported by Percentage of Students Interrupted—Public Schools Only
CTB-Reported Interruptions Only

                                            Average Change in Math         Average Change in ELA
Grades   Pct. Interrupted    Schools     2011 to 2012   2012 to 2013    2011 to 2012   2012 to 2013
3-4      None                    232               25             40              22             25
3-4      >0 to <20%              470               26             40              22             23
3-4      20% or more             300               26             39              23             22
4-5      None                    177               36             34              22             19
4-5      >0 to <20%              402               33             35              20             21
4-5      20% or more             289               34             38              21             22
5-6      None                    101               24             20              34             32
5-6      >0 to <20%              213               23             20              29             28
5-6      20% or more             132               20             18              28             25
6-7      None                     47               22             22               7             10
6-7      >0 to <20%              182               27             24               8              4
6-7      20% or more              64               23             21               9              4
7-8      None                     74               31             32               3             12
7-8      >0 to <20%              270               32             32               7             13
7-8      20% or more             109               31             32               6             12

 

 

 

Table 10b
Average Growth in ISTEP+ Test School Mean Scaled Scores between 2012 and 2013,
Reported by Percentage of Students Interrupted—Public Schools Only
Using Both CTB and Locally Reported Interruptions

                                            Average Change in Math         Average Change in ELA
Grades   Pct. Interrupted    Schools     2011 to 2012   2012 to 2013    2011 to 2012   2012 to 2013
3-4      None                    135               26             40              23             25
3-4      >0 to <20%              396               26             40              22             24
3-4      20% or more             473               26             39              23             22
4-5      None                    109               35             34              24             20
4-5      >0 to <20%              340               33             34              20             20
4-5      20% or more             419               34             38              20             22
5-6      None                     60               24             19              33             28
5-6      >0 to <20%              189               22             21              29             29
5-6      20% or more             197               21             18              30             28
6-7      None                     33               18             22               3              9
6-7      >0 to <20%              151               27             25               9              4
6-7      20% or more             110               24             21               8              5
7-8      None                     39               34             31               3             10
7-8      >0 to <20%              247               31             32               6             12
7-8      20% or more             169               32             32               7             13

 

 

Student-level Data Matched across Years

 

The fourth analysis is a look at student-level data matched across years.  The first step in the analysis was to get student-level files for two consecutive years, then match each student's performance in the second year with that of the first.  This was done for two cohorts—the 2011-2012 group, and 2012-2013.

 

Students were matched only if they took the ISTEP+ test in consecutive grades, so students who were retained in a grade were not included in this analysis.  In addition, students who moved in or out of the state between tests were not included, and students were included only if they had valid test scores in both ELA and mathematics for both years.  Despite these restrictions, the vast majority of students were included.  Over 90 percent of the students had a match and valid test scores across years for all grades and years.  The lowest percentage of matched students naturally came from the match from grade 3 in 2012 to grade 4 in 2013, when approximately 2,000 additional students were retained in grade 3.  Even there, the match rate was over 90 percent.
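A minimal sketch of that matching logic, assuming hypothetical file layouts and student identifiers, follows:

import pandas as pd

y2012 = pd.DataFrame({
    "student_id": ["A", "B", "C", "D"],
    "grade":   [3, 3, 5, 6],
    "math_ss": [460.0, 470.0, 520.0, float("nan")],   # nan = no valid score
    "ela_ss":  [455.0, 468.0, 500.0, 530.0],
})
y2013 = pd.DataFrame({
    "student_id": ["A", "B", "C", "E"],
    "grade":   [4, 3, 6, 7],          # B was retained in grade 3; E moved into the state
    "math_ss": [500.0, 475.0, 535.0, 560.0],
    "ela_ss":  [480.0, 470.0, 525.0, 540.0],
})

matched = y2012.merge(y2013, on="student_id", suffixes=("_2012", "_2013"))
matched = matched[matched["grade_2013"] == matched["grade_2012"] + 1]      # consecutive grades only
matched = matched.dropna(subset=["math_ss_2012", "ela_ss_2012", "math_ss_2013", "ela_ss_2013"])
matched["gain_math"] = matched["math_ss_2013"] - matched["math_ss_2012"]
matched["gain_ela"] = matched["ela_ss_2013"] - matched["ela_ss_2012"]
print(matched[["student_id", "gain_math", "gain_ela"]])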

 

Table 11 provides the numbers of students matched across years and the scaled score gains.  For 2013, the same statistics are provided for students who CTB reported as interrupted and for the CTB plus locally-reported interruptions.

 

The results reported in Table 11 show trends consistent with those of the three previous analyses.  The gains public school students made in 2013 were larger than their gains had been in 2012 for three of the grades, and smaller in the remaining two grades, for both mathematics and English language arts.  Public school students that CTB reported as interrupted had the same or larger gains than the overall average at every grade in mathematics and for three of the five grades in ELA.  Public school students reported by either CTB or locally as having been interrupted had gains equal to or greater than all students at all grades in mathematics and two of five grades in ELA.  In short, the data about overall interruptions indicate that students who were interrupted had gains that were as high as the students who were not interrupted.

 

Table 11
Average Growth in ISTEP+ Test Scaled Scores for Students Matched across Years,
2011 to 2012 and 2012 to 2013

                                                   Public                         Non-public
Grades   Year and Group                  N     Gain Math   Gain ELA        N     Gain Math   Gain ELA
3-4      2012—All Students          70,218         23          20      5,585         13          23
3-4      2013—All Students          68,329         36          21      5,891         30          15
3-4      2013—CTB-reported          12,387         36          20        793         39          19
3-4      2013—All reported          26,969         36          19      1,390         36          18
4-5      2012—All Students          73,275         32          19      5,621         31          21
4-5      2013—All Students          70,385         35          20      5,927         33          14
4-5      2013—CTB-reported          13,028         37          21        991         35          17
4-5      2013—All reported          27,708         36          22      1,350         32          14
5-6      2012—All Students          71,447         16          31      5,252         18          39
5-6      2013—All Students          73,396         14          27      5,857         14          27
5-6      2013—CTB-reported          12,433         15          25      1,058         15          30
5-6      2013—All reported          27,118         14          23      2,052         15          27
6-7      2012—All Students          70,444         25           4      4,805         23          -1
6-7      2013—All Students          71,909         22           2      5,413         21          -1
6-7      2013—CTB-reported          10,471         23           3      1,140         23          -3
6-7      2013—All reported          25,971         22           6      1,870         20          -2
7-8      2012—All Students          69,725         31           7      4,794         29          17
7-8      2013—All Students          70,876         31          13      5,117         28          18
7-8      2013—CTB-reported           9,085         32          13        770         29          19
7-8      2013—All reported          24,106         32          11      1,645         28          16

 

One advantage of student-level data is that the interruptions can be explored in more detail.  For example, while the substantial majority of interrupted students were interrupted in mathematics (the first test taken), some had their first interruption during the ELA test, and therefore were not interrupted during the mathematics test.  We might expect the gains of those students on the mathematics test to be the same as non-interrupted students, but different from students who were interrupted while taking the mathematics test.  Also, the interruption data supplied by CTB provided much more detail about the interruptions.  From those data, we can look at students who were interrupted multiple times during one session of the test and the specific session of the test when they were first interrupted.

 

Table 12 provides data from the interruption data provided by both CTB and local school personnel.  Students were categorized as "None" if they were not interrupted in either the mathematics or the ELA test, "Math" if they were interrupted during the mathematics test, and "ELA" if they were not interrupted during the mathematics test but were first interrupted during the ELA test.  If student test scores were impacted by the interruptions, we would expect the "ELA" and the "None" students to have the same gains on the mathematics test (since the "ELA" students weren't interrupted until after they had completed the mathematics test), but lower gains on the ELA test.  In contrast, we would expect "Math" students to have lower gains than the other two groups on the mathematics test for sure, and possibly on the ELA test as well if we thought interruptions on one test would carry over to a later one.
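That three-way grouping can be sketched as follows, with made-up student identifiers and interruption records:

def first_interruption_group(tests_interrupted):
    if "Math" in tests_interrupted:
        return "Math"    # interrupted during the mathematics test
    if "ELA" in tests_interrupted:
        return "ELA"     # first interrupted during the ELA test only
    return "None"        # no reported interruption on either test

interrupted_tests = {"A": {"Math"}, "B": {"Math", "ELA"}, "C": {"ELA"}, "D": set()}
print({sid: first_interruption_group(t) for sid, t in interrupted_tests.items()})
# {'A': 'Math', 'B': 'Math', 'C': 'ELA', 'D': 'None'}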

 

Table 12
Average Growth in 2013 ISTEP+ Test Scaled Scores for Students Matched across Years,
Reported by First Test during Which They Were Interrupted

                                              Public                         Non-public
Grades   First Interruption        N     Gain Math   Gain ELA        N     Gain Math   Gain ELA
3-4      None                 47,862         36          21      4,710         28          14
3-4      Math                 19,049         35          20      1,088         37          19
3-4      ELA                   1,418         36          21         93         33          16
4-5      None                 49,254         34          19      4,689         33          14
4-5      Math                 19,555         36          21      1,099         33          16
4-5      ELA                   1,576         35          21        139         29           6
5-6      None                 52,860         14          27      4,051         13          27
5-6      Math                 18,702         14          26      1,638         15          28
5-6      ELA                   1,834         14          21        168         15          30
6-7      None                 52,482         23           2      3,711         21           0
6-7      Math                 17,791         23           2      1,587         20          -3
6-7      ELA                   1,636         20           5        115         27           2
7-8      None                 52,696         31          13      3,628         28          18
7-8      Math                 16,214         32          13      1,285         28          18
7-8      ELA                   1,966         30          12        204         23          11

 

The data do not support that interpretation.  The public school students first interrupted during the math test had math and ELA gains that were not much different from those of the students who were never interrupted at all.  Public school students who were first interrupted during the ELA test had ELA gains within one point of the students who were never interrupted, with the exceptions of grade 6 (where they had significantly lower gains) and grade 7 (where their gains were actually somewhat higher than those of the students who were never interrupted).

 

The results in Table 13 are calculated using the CTB-interrupted data for public school students only.  The reports of interruptions by CTB have provided similar results to the total interruption reports for all the other analyses in this report (with the exception of the number of students identified), but also provide a level of detail not available from the all-interruptions file.  This table uses information about the specific session during which students were first interrupted (Sessions 1 and 2 were the mathematics sessions, Sessions 3 and 4 were the ELA sessions).  In addition, we identified students who had been interrupted more than once during a session, anticipating that students who had encountered multiple interruptions might have lower gains than students who were just interrupted once (or were not interrupted at all).
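A small sketch of how those two details could be derived from event-level interruption records is shown below; the field names and toy records are assumptions for illustration:

import pandas as pd

events = pd.DataFrame({
    "student_id": ["A", "A", "B", "C", "C"],
    "session":    [1,   1,   3,   2,   4],
})

per_student = events.groupby("student_id").agg(
    first_session_interrupted=("session", "min"),
    interruptions=("session", "size"),
)

# Flag students who were interrupted more than once within any single session.
per_session = events.groupby(["student_id", "session"]).size()
per_student["multiple_within_one_session"] = per_session.groupby(level="student_id").max() > 1
print(per_student)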

 

Table 13
Gain Scores for Matched Students, Reported by Type of Interruption
CTB-Reported Interruptions Only; Public School Students Only

                                  Not            Any       First Session Interrupted     Multiple Interruptions
Content Area   Grades     Interrupted   Interruption        1      2      3      4       within One Session
Math           3-4                 36             36       37     35     30     36                       37
Math           4-5                 34             37       38     35     31     37                       37
Math           5-6                 14             15       16     14     14     11                       16
Math           6-7                 22             23       24     21     21     18                       24
Math           7-8                 31             32       33     31     32     27                       32
ELA            3-4                 21             20       20     20     19     21                       20
ELA            4-5                 20             21       21     23     23     22                       21
ELA            5-6                 27             25       27     24     18     24                       27
ELA            6-7                  2              3        3      1      7      3                        2
ELA            7-8                 13             13       14     12     11     12                       14

 

Consistent with the findings reported earlier, Table 13 shows that students who were interrupted scored at about the same level as, and often slightly higher than, the students who were not interrupted at all.  And contrary to expectations, students who were interrupted multiple times within a session gained as many points as students who were not interrupted at all.  But perhaps the most interesting finding from Table 13 is that the group with the lowest gains in mathematics was always one that was first interrupted in Session 3 or Session 4—sessions that were taken after the students had completed the mathematics test.

 

Summary

 

There is considerable evidence that the interruptions had no negative impact on student scores for the vast majority of students;  indeed, students who were interrupted had somewhat larger gains across years than those who were not interrupted.  Given the volume and the nature of the interruptions, this finding certainly will come as a surprise to many.  One possible explanation that might be offered is that the interruptions affected students who were not identified as interrupted—that is, students in a class for which some, but not all, were interrupted might have all been affected by the interruptions.  However, that explanation does not seem plausible, since the state as a whole performed better in 2013 than it had in 2012.  If large numbers of students—numbers beyond the 20-25 percent who were identified as having been interrupted—had been affected, it does not seem possible that the state could have experienced these increases.

 

Although no data were collected that would confirm this hypothesis, it seems most plausible that the response to the interruptions, by both students and school personnel, was enough to overcome the potential problems created by the interruptions.  Students apparently worked as diligently on the tests as they would have if they hadn't been interrupted, and school personnel apparently minimized the impact of the interruptions on students' testing experiences.  Thus, while it certainly took significantly more effort to complete the testing this year because of the interruptions, that effort apparently was successful at negating the impact of the interruptions for the vast majority of students.

 

There were three major events that could have potentially impacted test scores this year:

 

  1. The new policy to retain students in grade 3 because of unsatisfactory scores on the IREAD test.
  2. The switch from paper-and-pencil to online administration for many schools.
  3. The interruptions affecting the online administration.

 

Clearly, the policy to retain students in grade 3 had an impact on changes to the grade 3 and grade 4 scores between 2012 and 2013.  The switch from paper-and-pencil to online administration has not had much of an impact on scores in previous years, but the impact might have been greater this year as the last grades within each school made that transition.

 

It is important to note that this paper addresses only the larger issue of the impact of the interruptions when aggregated over large numbers of students.  When viewed from a high level, no consistent impact on test scores from the interruptions could be seen.  However, this is not the same as saying no student in the state was affected.  It certainly is possible that some students were affected;  if so, those occurrences were overshadowed by the lack of impact on the vast majority of students.  The interruptions data from CTB would support a study of specific interruption patterns that might indeed permit one to identify students who likely were impacted by the interruptions.  Indeed, CTB has proposed some patterns in the data that will be pursued during the next phase of this study, and it is possible that some students will then be identified as having been affected by the interruptions.  If so, that will be important information to take into account during reporting.

 

As noted earlier in this report, we cannot know definitively how students would have scored this spring if the interruptions had not happened.  In addition, the interruptions were not the only element that changed in the test administration this year, adding a level of uncertainty as to the root cause of changes when they occurred.  However, the data strongly suggest that the vast majority of students scored as well as they would have if the interruptions had never happened.
