Balanced Assessment Systems: Today and Tomorrow

The Every Student Succeeds Act (ESSA) sets expectations for innovation in assessment, assurance that assessment programs provide meaning and value, and input from a range of stakeholders. These themes have prompted conversations about balanced assessment systems. Many of us in the K–12 world understand the inherent rightness of a balanced assessment system, but it’s not easy to define or implement.

Getting Our Terms Straight

One difficulty lies in the various terms educators, policy makers, and other leaders use when they discuss an assessment program or system. In our experience, we’ve learned that talking about different types of assessment can lead to a “Who’s on first?” circular conversation. After a few laps around the circle, we realize that we’re using the same term to mean different things, or using different terms to describe similar kinds of assessments.

To help establish clarity and consistency in the terminology we use with our colleagues and policy makers, we created an infographic with definitions that help explain the parts and the whole of a balanced assessment system. We don’t insist that our definitions are “right” and others are wrong; we use them as a starting point for conversations. So even if we don’t all use the same terms, we can start from a common understanding.

We break the system down along two characteristics that can be used to describe any assessment activity: purpose and practice. Broadly speaking:

  • Purpose can be classified as formative (assessment for instruction) or summative (assessment of instruction).
  • Practice can be thought of as informal (unstructured and dynamic) or formal (structured and standardized).

Table 1

At the intersections of these categories are the specific assessment activities that take place across all levels and throughout the school year. Here we offer a brief look at the definitions we use for those activities. You can download the infographic to learn more.

Formative Assessment
Formative assessment, or assessment for learning, is a multi-step instructional process that includes collecting evidence of student learning relevant to a current learning target. One of the fundamental principles of formative assessment is that students share responsibility for their own learning. After agreeing on specific learning targets, the teacher and student gather evidence and reflect on the student’s progress.

Techniques of formative assessment are varied and are employed at both the individual and group levels. Classroom assessment activities are the most recognized instances of assessment for learning; informal methods include over-the-shoulder observation and questioning, as well as self- or peer-evaluation, while increasingly formal methods include homework and quizzes.

By our definition, interim assessments are grade-level achievement measures, spanning a full year’s content standards and following district-wide standardized administration. They’re used to monitor students’ progress towards expectations—to show whether students are on track for what they’re expected to learn by the end of the school year. Administered following instruction, these assessments may serve multiple purposes at the student, class, school, and district levels. As “early warning” indicators, interim assessments can help inform instruction in the short term, identifying curricular areas that individuals or groups of students may not be grasping, and also help make longer-term program decisions.

Summative Assessment
Student testing that takes place after instruction with the intention of evaluating student understanding or performance is known as summative assessment, or assessment of learning. Such tests may cover a single unit of instruction, a chapter of a textbook, a semester’s coursework, or an entire school year.

Benchmark assessments are used to monitor how well students have learned recently taught curriculum, and they take the form of quizzes, chapter or unit tests, and mid-terms. These tests follow local curriculum and pacing, and may be homegrown by educators or purchased at the district level from an assessment vendor, with varying degrees of customization. These activities serve to calculate grades and/or to provide ongoing information to students, parents, teachers, and administrators.

Perhaps the most recognizable summative assessment technique is the statewide accountability assessment. These highly formalized tests and procedures are implemented within each state to evaluate schools and districts. Their goals are to ensure that states are raising student performance, providing equal access to high-quality instructional opportunities, and initiating improvement efforts as needed. These tests were mandated by the Elementary and Secondary Education Act (ESEA) of 1965 and its subsequent reauthorizations, including the No Child Left Behind Act (2002), which dramatically increased federal testing requirements for states. The most recent reauthorization of ESEA is the Every Student Succeeds Act (2015).

Considered a civil rights law, the ESEA was created to promote “equal educational opportunity” as our “first national goal.”

What Does the Future Look Like?

Signed by President Obama in late 2015, ESSA is the latest reauthorization of the ESEA. While retaining its precursors’ commitment to equal opportunity for all students and requirements for annual state testing, ESSA returns more control and responsibility to the states. ESSA is less prescriptive about how to demonstrate school improvement and how to intervene in underperforming schools and districts.

Statewide accountability assessment remains a major component in evaluating state performance. States must submit their accountability plans to the U.S. Department of Education (USDOE), and must include a non-academic measure in addition to test scores to demonstrate students’ progress. In addition, ESSA offers a degree of flexibility that allows states to be more innovative in their assessment approaches.

Despite the buzz about ESSA opening the door to innovation, not much has changed at the statewide level. That’s because the accountability and peer review requirements under ESSA reinforce past models of statewide assessment. The USDOE has approved the ESSA implementation plans for all 50 states, the District of Columbia, and Puerto Rico. Most of these plans contain statewide accountability assessments similar to those implemented in past years, plus the non-academic measures required by ESSA, such as absenteeism/attendance and participation in accelerated courses. Only a few state plans include indicators of student satisfaction or school climate.

New Uses for Interim Assessments
As a result of ESSA and stakeholders’ desire to reduce the time spent on testing, some states and districts have begun looking for interim assessments that integrate with and provide data for accountability purposes. (Under ESSA, the term “interim” could mean either interim or benchmark assessments as we’ve defined them.) Such integration would provide consistent results throughout the school year, including end-of-year tests, leading to more accurate predictions of student achievement. To contribute to state and federal accountability measures, the interim tests must be of similar technical quality to the end-of-year tests in terms of assessment design, content alignment and rigor, psychometrics, test security, and administration. Test design, development, and delivery practices for these assessments need to adhere to the standards established by the Council of Chief State School Officers and outlined in Standards for Educational and Psychological Testing, and meet peer review requirements. While a commitment to taking these positive steps is not without a cost in both time and money, and would require states to manage the programs, the benefits outweigh the challenges.

Local Influence
Historically, innovations in assessment have begun at the state level and filtered down to district and local programs. But what if that balance were to shift? District leaders and local educators can now be active stakeholders in state ESSA plans. School districts seeking to eliminate duplicate or unnecessary tests, get meaningful data in time to inform instruction, and assess deeper learning may put pressure on states to find ways to support their initiatives. Additionally, districts can begin to consider efficient and effective ways to evaluate student engagement, school culture and climate, school quality, and the like, within a broader assessment ecosystem.

States’ ESSA plans are heavily weighted toward what happens at the end of the year with the statewide assessment. Because of the statewide programs, district and local assessments are hyper-focused on grades 3–8 and on high school students’ readiness for college and career, but we’re hearing concerns from educators about gaps exposed by that focus.

Overarching Principles

With so much discourse about assessment among educators, legislators, and the public, it’s important to keep coming back to the core goal of any assessment: to provide fair, reliable, and valid results that support meaningful insights and decisions about student learning and learning needs.

To accomplish that, we must all:

  • Use common language to describe assessments and our expectations
  • Ensure that all stakeholders in a program understand the purpose of an assessment and what information it can realistically provide
  • Deliver assessments designed to support educator and program decisions with meaningful, actionable data and insights
  • Examine the purpose and usefulness of every element in an assessment system to eliminate duplication and let go of ineffective measures

Matthew Gushta, Ph.D., is a principal research scientist for AdvancED | Measured Progress. He has more than 15 years of experience as a research and measurement professional supporting K–12 education, focusing on effective application of technology, cognition, and reporting design in the areas of large-scale accountability assessment as well as formative and diagnostic assessment.