October 2019
Tools
A review of instruments
for measuring social and
emotional learning skills among
secondary school students
Joshua Cox
Brandon Foster
David Bamat
Education Development Center, Inc.
In collaboration with the Social and
Emotional Learning Research Alliance
Summary
This resource is intended to support state and local education
agencies in identifying reliable and valid instruments to measure
three social and emotional learning skills among secondary
school students: collaboration, perseverance, and self-regulated
learning.
U.S. Department of Education
At Education Development Center, Inc.

U.S. Department of Education
Betsy DeVos, Secretary

Institute of Education Sciences
Mark Schneider, Director

National Center for Education Evaluation and Regional Assistance
Matthew Soldner, Commissioner
Elizabeth Eisner, Associate Commissioner
Amy Johnson, Action Editor
Felicia Sanders, Project Officer

REL 2020–010

The National Center for Education Evaluation and Regional Assistance (NCEE) conducts unbiased large-scale evaluations of education programs and practices supported by federal funds; provides research-based technical assistance to educators and policymakers; and supports the synthesis and the widespread dissemination of the results of research and evaluation throughout the United States.

October 2019

This report was prepared for the Institute of Education Sciences (IES) under Contract ED-IES-17-C-0008 by Regional Educational Laboratory Northeast & Islands administered by Education Development Center, Inc. The content of the publication does not necessarily reflect the views or policies of IES or the U.S. Department of Education, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

This REL report is in the public domain. While permission to reprint this publication is not necessary, it should be cited as:

Cox, J., Foster, B., & Bamat, D. (2019). A review of instruments for measuring social and emotional learning skills among secondary school students (REL 2020–010). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Northeast & Islands. Retrieved from http://ies.ed.gov/ncee/edlabs.

This report is available on the Regional Educational Laboratory website at http://ies.ed.gov/ncee/edlabs.
Summary
This resource was designed to support state and local education agencies in identifying
reliable and valid instruments for measuring collaboration, perseverance, and self-regulated
learning among secondary school students. It was developed through the Regional Educational Laboratory Northeast & Islands Social and Emotional Learning Alliance, whose members aim to identify and synthesize emerging evidence on social and emotional learning to improve education outcomes. The alliance's focus on social and emotional learning skills is supported by evidence suggesting that these skills may improve students' academic and career outcomes (Farrington et al., 2012; Gayl, 2017; Heckman, 2008; West et al., 2016). This resource can help alliance members and other state and local education agencies develop and track students' social and emotional learning skills as an indicator of student success within accountability models required by the Every Student Succeeds Act of 2015.
This resource supports stakeholders in:
• Identifying available instruments for measuring collaboration, perseverance, and
self-regulated learning among secondary school populations.
• Understanding the information about reliability and validity that is available for
each of these instruments.
In addition, the resource offers questions that schools and districts should consider when
reviewing reliability and validity information.
Contents

Summary
Why this resource?
What is this resource?
Worksheet 1. Questions to identify and evaluate instruments for measuring social and emotional learning skills
    Step 1. Indicate your response for each of the following questions. Be as specific as possible. Use this information to identify an initial list of instruments that your school or district might use.
    Step 2. Now that you have identified the specific skills to be measured, the target group of respondents, and the purpose for using the instrument, work through the following yes/no questions and associated considerations.
What instruments are available?
What information is available about the reliability of the instruments?
What information is available about the validity of the instruments?
    Content validity
    Structural validity
    External validity
    Consequential validity
    Generalizability, fairness, and substantive validity
Implications
Limitations
Appendix A. Methodology
Appendix B. Summary of reliability and validity information on eligible instruments
Notes
References

Boxes
1 Methodology
2 Definitions of key terms

Tables
1 Format of eligible instruments, by social and emotional learning skill
2 Availability of reliability and validity information in the 16 eligible instruments
3 Snapshot of whether reliability and structural validity information met conventionally accepted criteria
4 Snapshot of consequential validity information
B1 Revised Self-Report Teamwork Scale: Summary of reliability and validity information
B2 Teamwork Scale: Summary of reliability and validity information
B3 Teamwork Scale for Youth: Summary of reliability and validity information
B4 Subjective Judgement Test: Summary of reliability and validity information
B5 Teacher-Report Teamwork Assessment: Summary of reliability and validity information
B6 Engagement with Instructional Activity: Summary of reliability and validity information
B7 Expectancy-Value-Cost Scale: Summary of reliability and validity information
B8 Grit Scale—Original Form: Summary of reliability and validity information
B9 Grit Scale—Short Form: Summary of reliability and validity information
B10 Inventory of Metacognitive Self-Regulation on Problem-Solving: Summary of reliability and validity information
B11 Junior Metacognitive Awareness Inventory: Summary of reliability and validity information
B12 Self-Directed Learning Inventory: Summary of reliability and validity information
B13 Self-Regulation Strategy Inventory—Self-Report: Summary of reliability and validity information
B14 Motivated Strategies for Learning Questionnaire: Summary of reliability and validity information
B15 Program for International Student Assessment Student Learner Characteristics as Learners: Summary of reliability and validity information
B16 Student Engagement Instrument: Summary of reliability and validity information
Why this resource?

Students who enhance their social and emotional learning skills may also improve their academic and career outcomes (Farrington et al., 2012; Gayl, 2017; Heckman, 2008; West et al., 2016). These skills may also be malleable and amenable to intervention (Durlak, Weissberg, Dymnicki, Taylor, & Schellinger, 2011; What Works Clearinghouse, 2007). Accordingly, some education agencies have started to develop and track students' social and emotional learning skills as an indicator of student success within accountability models required by the Every Student Succeeds Act of 2015.

However, researchers have not reached consensus on the best ways to measure social and emotional learning skills or the appropriate uses of existing instruments (see, for example, Duckworth & Yeager, 2015). Some researchers have cautioned against using instruments to measure social and emotional learning skills for summative purposes in high-stakes accountability models because most instruments are fairly new and lack adequate information on their reliability and validity (Duckworth & Yeager, 2015; Melnick, Cook-Harvey, & Darling-Hammond, 2017). Reliability refers to whether an instrument consistently measures the skill across respondents, time, or raters. Validity refers to whether an instrument measures what it intends to measure and whether the inferences drawn from an instrument are appropriate. Indeed, this review of instruments for measuring social and emotional learning skills found no explicit language from their developers suggesting that any of them should be used for summative purposes. It may be more appropriate for schools to use data collected from instruments that measure social and emotional learning skills for formative purposes to inform teaching, learning, and program investments (Melnick et al., 2017). To make these decisions, state and local education agency administrators need information to better understand the intended uses and limitations of available instruments, as well as their reliability and validity.
The Regional Educational Laboratory Northeast & Islands Social and Emotional Learning Alliance is a researcher–practitioner partnership comprising researchers, regional educators, policymakers, and others. Alliance partners share an interest in identifying and synthesizing evidence on social and emotional learning skills to improve education outcomes. Champlain Valley School District (CVSD) in Vermont, one of the alliance's partners, seeks to identify instruments for measuring three social and emotional learning skills described in its mission statement: collaboration, perseverance, and self-regulated learning (Champlain Valley School District, 2016). Its intention is to better understand students' intrapersonal and interpersonal competencies, the two main areas into which social and emotional learning competencies are organized. Intrapersonal competencies refer to how students deal with their own thoughts and emotions, and interpersonal competencies refer to the skills and attitudes students use to interact with others.

CVSD is exploring ways to use scores from instruments that measure collaboration, perseverance, and self-regulated learning to formally evaluate students' social and emotional learning skills, develop its formative and summative assessment systems in accordance with the Every Student Succeeds Act, and provide guidance for educators on measuring and integrating social and emotional learning skills into instruction.
What is this resource?

This resource supports local and state education agencies' efforts to identify reliable and valid instruments that measure social and emotional learning skills (see box 1 for an overview of methods and appendix A for more detailed information). The resource also provides questions that schools and districts should consider when reviewing reliability and validity information. Where possible, the resource describes whether the available reliability and validity information meets conventionally accepted criteria. In recognition that each school or district context is unique, this resource does not recommend specific instruments to use with accountability models required by the Every Student Succeeds Act.

This resource supports stakeholders in:
• Identifying available instruments for measuring collaboration, perseverance, and self-regulated learning among secondary school populations.
• Understanding information about reliability and validity that is available for each of these instruments.
Box 1. Methodology
Instrument identification process
The instruments were identified and reviewed in a six-step process:
1. Described collaboration, perseverance, and self-regulated learning skills.
2. Identified terms related to collaboration, perseverance, and self-regulated learning.
3. Searched the literature for relevant instruments measuring collaboration, perseverance,
and self-regulated learning.
4. Determined the eligibility for inclusion in this resource of instruments identified in the
literature search.
5. Reviewed the reliability and validity information available for eligible instruments.
6. Determined whether the available reliability and validity information met conventionally
accepted criteria.
Construct descriptions
To undertake the first step listed above, the Regional Educational Laboratory Northeast &
Islands and Champlain Valley School District (CVSD) partnered to describe each of the social
and emotional learning skills (or constructs) to be addressed in this resource:
• Collaboration: collaborating effectively and respectfully to enhance the learning
environment.
• Perseverance: persevering when challenged, dealing with failure in a positive way, and
seeking to improve one’s performance.
• Self-regulated learning: taking initiative and responsibility for learning.
These descriptions align with commonly used definitions in the research (for example,
Kuhn, 2015; Shechtman, DeBarger, Dornsife, Rosier, & Yarnall, 2013; Zimmerman, 2000).¹
Eligibility criteria
Instruments had to meet the following eligibility criteria to be included in this resource:
• Measure one of the three targeted social and emotional learning skills.
• Have been used with a population of secondary school students (students in grades 9–12) in the United States.²
• Be publicly available online at no or low cost.³
• Be published or had psychometric validation work completed between 1994⁴ and 2017.
• Not be published as part of a doctoral dissertation.⁵
As recommended by the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014), this resource presents information on fairness and reliability. Six components of validity were also evaluated for each instrument following Messick's (1995) construct validity framework: content, substantive, structural, external, generalizability, and consequential. See box 2 for definitions of reliability, validity, the six components of validity, and fairness. Each eligible instrument was then categorized using the criteria in box 2 (see appendix B for a summary of psychometric evidence for eligible instruments).
Notes
1. There are many definitions of these skills in the research literature. See appendix A for additional citations for each skill.
2. CVSD is specifically interested in identifying reliable and valid instruments for secondary school students.
3. Instruments published in research journals at a relatively low cost (less than $50) were included.
4. A group of educators, researchers, and child advocates met in 1994 to discuss social and emotional learning and coined the term (Durlak, Domitrovich, Weissberg, & Gullotta, 2015).
5. These instruments were excluded because they are not typically analyzed with the same rigor as other published instruments (for example, instruments published in peer-reviewed journals).
Source: American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (2014) and Messick (1995).
This resource indicates whether psychometric information was available for reliability and seven components of validity—content, structural, external, consequential, generalizability, fairness, and substantive¹ (see box 2 for definitions). Schools and districts can use reliability and validity information to evaluate whether the adoption of a measure will likely produce sufficient information to meet the needs of that school or district. It is important to note that the mere availability of reliability and validity information for a measure does not necessarily indicate support for the use of that measure.
Box 2. Definitions of key terms
Reliability. Whether an instrument consistently measures the skill across respondents, time, or raters. Information that could support reliability includes several families of statistics, including measures of internal consistency such as Cronbach's α, omega, or test–retest associations, and inter-rater reliability. For the instruments included in this resource, evaluation of reliability information was based on the widely cited conventional criterion of a Cronbach's α ≥ .70 (Nunnally, 1978).
Validity. Whether an instrument measures what it is intended to measure and whether the
inferences drawn from an instrument are appropriate. This resource focuses on the following
seven components of validity.
Content validity. Whether items for an instrument contain content that adequately describes the skill(s). Information that could support this component of validity includes a review of survey items by experts, use of a well-constructed theory for the skill(s), and the use of multiple perspectives in creating the items. An example of using multiple perspectives is asking teachers and students to define the skill and using that information with experts to construct items.

Structural validity. Whether an instrument's items relate statistically to the skill being measured. Information that could support this component of validity includes advanced statistical analyses, typically factor analyses that examine whether items are correlated in expected ways. For example, one approach to measuring structural validity is to show that an instrument's items statistically relate to the skills and associated subscales that the instrument measures.

External validity. Whether there are correlations between scores from the instrument and scores from other instruments measuring similar skills. Information that could support this component of validity includes positive correlations between scores generated from the instrument and other similar instruments measuring the skill. For example, one approach to measuring external validity is to establish whether correlations exist between two different instruments that measure perseverance.

Consequential validity. Whether scores generated from an instrument are associated with the intended consequences of using the instrument, such as improving student outcomes.¹ Information that could support this component of validity includes correlations between students' scores on the measure and grade point average or standardized test scores, if the intended consequence is improved student learning, or graduation or job attainment.

Generalizability validity. Whether scores from the instrument are correlated with other modes of measurement for the same skill, such as self-reported or observational information. Unlike external validity, which explores correlations between scores from different instruments measuring similar skills, generalizability explores correlations between scores from different modes of measurement of the same skill. Information that could support generalizability includes correlations between students' self-report scores and either teacher-report scores for students or other observational instruments measuring student behavior.

Fairness. Whether an instrument is free of bias against specific subgroups of students. Information that could support this component of validity includes having versions of the measure available in multiple languages or statistical tests showing that scores from the measure function similarly across all subgroups. For example, subgroups could include students from different racial/ethnic backgrounds, students eligible for the national school lunch program, or English learner students.

Substantive validity. Whether students process an instrument's items or tasks in the way the developers intended. Information that could support this component of validity includes cognitive interviews with a series of students as they engage with items from the measure. For example, interviewing students can provide information about sources of confusion that might emerge as students respond to items on the measure.
Note
1. According to Messick (1995, p. 11), "the consequential aspect of construct validity includes evidence and rationales for evaluating the intended and unintended consequences of score interpretation and use in both the short and long term." In practice, the consequences of using a measure can be quite broad and depend on how a district might propose to use a measure. However, the purpose of this resource is to review instruments that might be predictive of important student outcomes. Thus, a consequence of using one of the instruments in this resource should be that it helps districts focus on developing behaviors or skills that could predict important student outcomes (for example, test scores, graduation, and college enrollment). For this reason the resource defines consequential validity by whether scores generated on the measure are associated with important student outcomes.
Practitioners using this resource should evaluate whether the reliability and validity information for each instrument provides sufficient support for adopting the measure in their district. Practitioners looking for support in evaluating the psychometric information for each instrument can find a series of guiding questions in worksheet 1 to help identify the information necessary to select instruments that are likely to be useful in various local contexts.

Different uses of an instrument will place unique demands on the quality of reliability and validity information presented. For example, a district could be interested in monitoring perseverance among students. In that case, whether an instrument has demonstrated correlations with other student outcomes (that is, consequential or generalizability validity) may not be a concern. On the other hand, practitioners might be interested in ensuring that an instrument has demonstrated correlations with other student outcomes if their goal is to provide formative feedback to support a broader goal such as improving students' grades or increasing educational attainment. Thus, practitioners should first identify the components of reliability and validity that are most important to their school or district, which puts them in a better position to evaluate reliability and validity information for instruments measuring social and emotional learning skills.
Worksheet 1. Questions to identify and evaluate instruments for measuring social and emotional learning skills

Practitioners can use the questions in this worksheet to identify instruments and determine whether those instruments align with their needs. The questions in step 1 can be used to identify the skills that will be measured, the target group of respondents, and the purpose for using the instrument. The questions in step 2 can be used to identify the components of reliability and validity that are most important in the practitioner's school or district. Practitioners can use their yes/no responses in step 2 and accompanying considerations when reviewing the psychometric information for each instrument presented in appendix B. The worksheet concludes by offering additional considerations to support practitioners in identifying whether administering an instrument is feasible in their school or district.
Step 1. Indicate your response for each of the following questions. Be as specific as possible. Use this information to identify an initial list of instruments that your school or district might use.

What are the specific skills to be measured? That is, are you interested in measuring collaboration, perseverance, or self-regulated learning? (Then look at table 1 to see which instruments cover the specific skills of interest to you.)

What students are you planning to assess (for example, all high school students)?

What is the purpose for using an instrument (for example, to provide support to teachers)?
Step 2. Now that you have identified the specific skills to be measured, the target group of respondents, and the purpose for using the instrument, work through the following yes/no questions and associated considerations.

You can use responses to this worksheet to synthesize and evaluate information in the tables in appendix B. You can also use table 2 to quickly identify whether information on reliability and validity is available for each instrument.
Note: For each instrument, practitioners should, at a minimum, consider information presented in appendix B
on reliability, content validity, and structural validity. Table 3 in the main text provides information on whether
reliability information and structural validity information met conventionally accepted criteria. If this information
is not provided, or is provided but is not within conventionally accepted criteria, then scores generated from the
instrument may not be useful. Finally, reviewing content validity is necessary since it considers how the content
from an instrument’s items overlaps with the practitioner’s understanding of a particular social and emotional
learning skill. The alignment between the developer’s definition of a social and emotional learning skill and the
practitioner’s is a core issue for ensuring that a school or district is measuring what it intends to measure.
Is it important that students were involved in the instrument's development process? ■ Yes ■ No
If yes, consider that this resource found no information on substantive validity for any of the instruments reviewed.

Are you interested in using scores from the instrument along with instruments that measure other related social and emotional learning skills? ■ Yes ■ No
If yes, consider examining information presented in appendix B for external validity to check whether scores generated from the instrument are related to scores from other conceptually similar instruments of social and emotional learning skills.

Are you interested in measuring a specific social and emotional learning skill using more than one mode of measurement, such as student self-report surveys and observations? ■ Yes ■ No
If yes, consider examining information presented in appendix B for generalizability validity to see if analyses were undertaken to establish whether scores from the instrument are correlated with other modes of measurement of the same social and emotional learning skill.

Are you interested in connecting your students' social and emotional learning skills scores to other consequential outcomes, such as achievement scores, graduation rates, and attendance? ■ Yes ■ No
If yes, consider information presented in appendix B for consequential validity and table 4 in the main text. Specifically, examine whether scores from the instrument are correlated with other desired student outcomes.

Are you interested in comparing scores on the instrument for different subgroups of students (for example, by race/ethnicity, eligibility for the national school lunch program, or English learner student status)? ■ Yes ■ No
If yes, consider examining information presented in appendix B for fairness to see whether information is available about the specific subgroups you are comparing.
Practitioners should also consider the following questions for all instruments under consideration:
• How affordable is the instrument to administer to the target group of respondents? Practitioners should consider costs associated with purchasing the instrument, if applicable, as well as any administrative costs associated with administering and scoring the assessment and reporting the results.
• How much time is available for students to complete an instrument? Practitioners might consider piloting the instrument with a few students to better understand the amount of time that students require to complete the instrument.
• What format of administration is feasible in your context? Practitioners might consider piloting the instrument with a few students to better understand whether the format is feasible in their context.
• How will scores be reported? Are the scores easily interpreted and useful? Practitioners should consider their purpose for administering the instrument (see step 1) and whether the results from the instrument will support that purpose.
What instruments are available?

In total, 16 instruments were assessed to be eligible for inclusion in the resource. The initial search yielded 67 instruments as possible measures of one or more of the social and emotional learning skills of interest. Of those 67 instruments, 30 were excluded because they had not been administered with secondary school students in the United States, 24 were excluded because they were not publicly available, and 7 were excluded because they were not published or had not undergone validation work between 1994 and 2017.²

Eligible instruments included five measures of collaboration, four measures of perseverance, four measures of self-regulated learning, and three measures of both perseverance and self-regulated learning (table 1). All instruments measuring perseverance or self-regulated learning were student self-report surveys. Three of the five instruments measuring collaboration were student self-report surveys, one was task or performance based, and one was a teacher-report survey.
Note that different instruments measuring a social and emotional learning skill can define that skill differently. For example, the Motivated Strategies for Learning Questionnaire measures self-regulated learning, which the authors describe as including planning, monitoring, and regulating activities (Pintrich, Smith, García, & McKeachie, 1991, p. 23). The Junior Metacognitive Awareness Inventory, another measure of self-regulated learning, uses a framework focused on planning, monitoring, and evaluation (Sperling, Howard, Miller, & Murphy, 2002, p. 55). These small differences are driven by underlying differences in the theoretical research used to shape the content of items and can ultimately lead to variation between different instruments that measure the skill. Before using any instrument, practitioners should examine items from the instrument to see whether they align with their perspective on the skill. Social and emotional skills described in research may share a common name, but they can be defined slightly differently by the items in an instrument.
Instruments differ in their intended purpose, including whether for research, formative, or summative uses. Descriptions of each use are as follows:
• Research use. The intention is to use results produced by the instrument to describe these skills for a particular population or to examine relationships. The research may then be used as evidence to inform policy or practice, but it is not typically linked to a punitive or beneficial consequence for individuals, schools, or districts.
• Formative use. The intention is to use results produced by the instruments to inform instructional change that can influence positive change in students.
• Summative use. The intention is to assign a final rating or score to each student by comparing each student against a standard or benchmark. These comparisons can be used to assess the effectiveness of a teacher, school, or district, and an assessment of underachievement can lead to negative consequences for teachers, schools, or districts. Additionally, these comparisons can be used to determine whether a student should be promoted to the next grade level or graduate. For that reason, summative instruments are perceived as having higher stakes, and instruments used for summative purposes traditionally require more stringent reliability and validity evidence before use (Crooks, Kane, & Cohen, 1996; Haladyna & Downing, 2004).
Table 1. Format of eligible instruments, by social and emotional learning skill
Social and emotional learning skill measured and instrument Format
Collaboration
Revised Self-Report Teamwork Scale Student self-report survey
Teamwork Scale Student self-report survey
Teamwork Scale for Youth Student self-report survey
Subjective Judgement Test Task or performance based
Teacher-Report Teamwork Assessment Teacher-report survey
Perseverance
Engagement with Instructional Activity Student self-report survey
Expectancy-Value-Cost Scale Student self-report survey
Grit Scale—Original Form Student self-report survey
Grit Scale—Short Form Student self-report survey
Self-regulated learning
Inventory of Metacognitive Self-Regulation on Problem-Solving Student self-report survey
Junior Metacognitive Awareness Inventory Student self-report survey
Self-Directed Learning Inventory Student self-report survey
Self-Regulation Strategy Inventory—Self-Report Student self-report survey
Perseverance and self-regulated learning
Motivated Strategies for Learning Questionnaire Student self-report survey
Program for International Student Assessment Student Learner Characteristics as Learners    Student self-report survey
Student Engagement Instrument Student self-report survey
Note: Instruments are sorted alphabetically first by the measured skill and second by the format of the
instrument.
Source: Authors’ analysis based on sources shown in tables B1–B16 in appendix B.
Of the 16 instruments reviewed for this resource, 11 were used for research purposes and 5
as a formative tool. There was no explicit language suggesting that any of the instruments
should be used for summative purposes. Practitioners should avoid using an instrument for
purposes other than those outlined by the instrument developer unless there is sufficient
reliability and validity evidence supporting its use for a different purpose.
What information is available about the reliability of the instruments?
All 16 instruments eligible for inclusion in this resource have information on reliability (table 2; see appendix B for more detailed information). Fifteen of the instruments have information on a measure of reliability known as Cronbach's α, which is used to gauge internal consistency by measuring the extent of the relationship among the items (for example, survey questions) in a measure. Reliability statistics, such as Cronbach's α, are used to examine the likelihood of an instrument generating similar scores under consistent conditions. Twelve of the instruments met conventionally accepted criteria for Cronbach's α, indicating that the instruments generated scores that were reliable (table 3). Two instruments providing measures for Cronbach's α showed mixed results, with some subscales in the instrument meeting conventionally accepted criteria for reliability and some not. One instrument reported a Cronbach's α value that was below conventionally accepted criteria. Instruments that meet conventionally accepted thresholds for reliability might be more suitable for informing policy and practice.
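To make the statistic concrete: for a scale with k items, Cronbach's α = (k / (k − 1)) × (1 − Σσᵢ² / σ_total²), where σᵢ² is the variance of item i and σ_total² is the variance of students' summed scale scores. The short Python sketch below illustrates the computation; the response data are made up for illustration and are not drawn from any instrument reviewed here.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) matrix of item scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of summed scale scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical data: 5 students answering a 4-item survey on a 1-5 scale
responses = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
])
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")  # compare against the conventional .70 criterion
```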
Table 2. Availability of reliability and validity information in the 16 eligible instruments

Column key: Rel = reliability; Con = content; Str = structural; Ext = external; Csq = consequential; Gen = generalizability; Fai = fairness; Sub = substantive.

Instrument                                                      Rel  Con  Str  Ext  Csq  Gen  Fai  Sub
Revised Self-Report Teamwork Scale                               ●    ●    ●    ●    ●    ●    ●
Subjective Judgement Test                                        ●    ●    ●    ●    ●    ●    ●
Expectancy-Value-Cost Scale                                      ●    ●    ●    ●    ●    ●    ●
Teacher-Report Teamwork Assessment                               ●    ●         ●    ●    ●
Teamwork Scale for Youth                                         ●    ●    ●    ●         ●
Grit Scale—Original Form                                         ●    ●    ●    ●    ●
Grit Scale—Short Form                                            ●    ●    ●    ●    ●
Student Engagement Instrument                                    ●    ●    ●    ●    ●
Self-Directed Learning Inventory                                 ●    ●    ●    ●    ●
Teamwork Scale                                                   ●    ●    ●    ●    ●
Self-Regulation Strategy Inventory—Self-Report                   ●    ●         ●    ●
Junior Metacognitive Awareness Inventory                         ●    ●    ●         ●
Motivated Strategies for Learning Questionnaire                  ●    ●    ●    ●
Program for International Student Assessment Student Learner
  Characteristics as Learners                                    ●    ●    ●
Inventory of Metacognitive Self-Regulation on Problem-Solving    ●    ●
Engagement with Instructional Activity                           ●    ●

● Information is available; a blank cell indicates that information is not available.
Note: Instruments are sorted according to the availability of reliability and validity information.
Source: Authors' analysis based on sources shown in tables B1–B16 in appendix B.
Table 3. Snapshot of whether reliability and structural validity information met conventionally accepted criteria

Instrument                                                      Reliability met criteriaᵃ   Structural validity met criteriaᵇ
Junior Metacognitive Awareness Inventory                        Yes                         Yes
Program for International Student Assessment Student Learner
  Characteristics as Learners                                   Yes                         Yes
Self-Directed Learning Inventory                                Yes                         Yes
Student Engagement Instrument                                   Yes                         Yes
Subjective Judgement Test                                       Yes                         Yes
Teamwork Scale                                                  Yes                         Yes
Teamwork Scale for Youth                                        Yes                         Yes
Grit Scale—Original Form                                        Yes                         No
Revised Self-Report Teamwork Scale                              Yes                         No
Inventory of Metacognitive Self-Regulation on Problem-Solving   Yes                         —
Self-Regulation Strategy Inventory—Self-Report                  Yes                         —
Teacher-Report Teamwork Assessment                              Yes                         —
Grit Scale—Short Form                                           Some                        Some
Motivated Strategies for Learning Questionnaire                 Some                        No
Engagement with Instructional Activity                          No                          —
Expectancy-Value-Cost Scale                                     —                           Yes

Note: Instruments are sorted first according to whether information for the instrument met conventionally accepted criteria for reliability and then for structural validity. The two statistics were evaluated using only the source articles presented for each measure in appendix B. Additional details regarding conventionally accepted criteria are available in appendix A.
a. Evaluation of reliability information was based on the widely cited conventional criterion of a Cronbach's α ≥ .70 (Nunnally, 1978). However, researchers have highlighted that high Cronbach's α values can also correspond with measuring a decreased range of the measured skill (Sijtsma, 2009). Yes indicates Cronbach's α ≥ .70; No indicates Cronbach's α < .70; Some indicates mixed results, with some subscales containing Cronbach's α values that meet the .70 threshold and some containing values that do not; and — indicates that articles did not provide information for Cronbach's α.
b. Evaluation of structural validity information was based on whether the articles reported confirmatory factor models with fit statistics for the models falling into conventionally acceptable ranges: Tucker–Lewis index ≥ .90, comparative fit index ≥ .90, standardized root mean square residual ≤ .08, and root mean square error of approximation ≤ .05. Some indicates mixed results, with some subscales meeting conventionally accepted criteria while others did not, and — indicates that articles did not provide information from confirmatory factor tests.
Source: Authors' analysis based on sources shown in tables B1–B16 in appendix B.
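The conventional thresholds cited in the notes to table 3 can be written down directly as a checklist. The sketch below is a hypothetical helper function, not part of the review protocol; it flags whether reported statistics fall within those ranges, and None marks a statistic that an article did not report.

```python
def meets_conventional_criteria(alpha=None, tli=None, cfi=None,
                                srmr=None, rmsea=None) -> dict:
    """Check reported reliability and model-fit statistics against the
    conventional thresholds cited in the notes to table 3."""
    return {
        "Cronbach's alpha >= .70":        None if alpha is None else alpha >= 0.70,
        "Tucker-Lewis index >= .90":      None if tli   is None else tli   >= 0.90,
        "comparative fit index >= .90":   None if cfi   is None else cfi   >= 0.90,
        "SRMR <= .08":                    None if srmr  is None else srmr  <= 0.08,
        "RMSEA <= .05":                   None if rmsea is None else rmsea <= 0.05,
    }

# Example with made-up values: strong reliability, borderline model fit
print(meets_conventional_criteria(alpha=0.84, cfi=0.91, rmsea=0.06))
```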
What information is available about the validity of the instruments?
All 16 instruments eligible for inclusion in this resource had information related to at least
one component of validity (see table 2).
Content validity
The component of validity most commonly available for eligible instruments was content validity (see table 2). While content validity is often supported by reviews involving multiple stakeholders with content expertise, none of the instruments included in this resource had evidence of these types of review. All 16 instruments did, however, offer information on content validity in discussions about the theory and previous research that guided the construction of items that make up the instruments.
Structural validity
Twelve of the 16 instruments in this resource had information on structural validity (see table 2). Eight of these instruments were supported by evidence indicating that the instrument met conventionally accepted criteria based on results from confirmatory factor analysis, a statistical technique to establish whether the items from a measure are related to the centrally measured skill (see table 3). One additional measure showed mixed results because the authors reported only one of many statistics needed to assess whether the model fit adequately.

The predominant form of information provided for structural validity was results from confirmatory factor analysis. Social and emotional learning skills are often made up of more than one component. For this reason, instruments that measure social and emotional learning skills often comprise subscales that measure the components that make up the broader social and emotional learning skill. For example, developers of the Grit Scale (Duckworth, Peterson, Matthews, & Kelly, 2007) hypothesize that the instrument consists of two separate but related components, Consistency of Interest and Perseverance of Effort. A confirmatory factor analysis of the Grit Scale was conducted to answer whether the data confirmed the existence of these two separate components. All items for a measure should conceptually align with the overall concept of the skill in some way, with each item describing a different component of the skill.
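For readers who want to see what such an analysis looks like in practice, the sketch below specifies a two-factor confirmatory model using the third-party Python package semopy. The item names, data file, and factor assignments are hypothetical; this illustrates the general technique, not the analysis published by the Grit Scale's developers.

```python
import pandas as pd
import semopy  # third-party structural equation modeling package (pip install semopy)

# Hypothesized two-factor structure for a grit-style measure:
# each latent factor is indicated by a subset of the survey items.
model_desc = """
Consistency_of_Interest =~ item1 + item2 + item3
Perseverance_of_Effort  =~ item4 + item5 + item6
"""

responses = pd.read_csv("grit_item_responses.csv")  # hypothetical item-level data

model = semopy.Model(model_desc)
model.fit(responses)

# Fit statistics (CFI, TLI, RMSEA, ...) to compare against conventional criteria
print(semopy.calc_stats(model).T)
```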
External validity
Twelve of the 16 instruments in this resource had information on external validity (see table 2). External validity refers to information about the extent to which results produced from an instrument are correlated with results from other instruments in strength and direction in the theoretically expected manner. For instance, if instruments for depression, anxiety, and happiness were administered to the same sample, scores on the depression instrument would be expected to be positively associated with scores on the anxiety instrument, while scores on the happiness instrument would be expected to be negatively associated with scores on the depression instrument. If these relationships were not observed, then the instrument likely did not measure what it was intended to measure and its external validity is questionable.
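In code, this check reduces to correlating the scale scores that two instruments produce for the same students. The snippet below uses made-up scores for two hypothetical perseverance measures; a clearly positive Pearson r would be the expected pattern.

```python
import numpy as np

# Hypothetical total scores for the same 8 students on two perseverance instruments
grit_scores        = np.array([3.2, 4.1, 2.8, 4.6, 3.9, 2.5, 4.4, 3.0])
persistence_scores = np.array([3.0, 4.3, 2.6, 4.8, 3.5, 2.9, 4.1, 3.3])

r = np.corrcoef(grit_scores, persistence_scores)[0, 1]
print(f"Pearson r = {r:.2f}")  # a strong positive r would support external validity
```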
Consequential validity
Eleven of the 16 instruments in this resource had information on consequential validity (see table 2). Consequential validity refers to information about the extent to which results produced by an instrument are associated with important student outcomes. The predominant student outcomes examined in these studies were student achievement outcomes. These associations were in the expected positive direction for 10 of these 11 instruments (table 4). One measure, the Student Engagement Instrument, was also negatively correlated with student suspension rates.
Table 4. Snapshot of consequential validity information

Instrument                                                      Positively correlated with consequential student outcomesᵃ   Type of consequential student outcome reported in article
Student Engagement Instrument                                   Yes   Grade point averages, course grades, and suspension rates
Revised Self-Report Teamwork Scale                              Yes   Course grades
Teamwork Scale                                                  Yes   Grade point averages
Expectancy-Value-Cost Scale                                     Yes   Achievement scores
Junior Metacognitive Awareness Inventory                        Yes   Swanson Metacognitive Questionnaire, course grades, and grade point averages
Self-Directed Learning Inventory                                Yes   Grade point averages
Self-Regulation Strategy Inventory—Self-Report                  Yes   Course grades
Subjective Judgement Test                                       Yes   Course gradesᵇ
Teacher-Report Teamwork Assessment                              Yes   Course grades
Grit Scale—Short Form                                           Yes   Achievement scores
Grit Scale—Original Form                                        No    Achievement scores
Engagement with Instructional Activity                          —     —
Inventory of Metacognitive Self-Regulation on Problem-Solving   —     —
Motivated Strategies for Learning Questionnaire                 —     —
Program for International Student Assessment Student Learner
  Characteristics as Learners                                   —     —
Teamwork Scale for Youth                                        —     —

Note: Instruments are sorted in the table according to whether they are positively correlated with consequential student outcomes.
a. Yes indicates a positive association with a consequential student outcome; No indicates a negative association with a consequential student outcome; and — indicates that the articles did not provide information.
b. Although correlations between scores on the Subjective Judgement Test and course grades were in the expected direction, the correlation was not statistically significant.
Source: Authors' analysis based on sources shown in tables B1–B16 in appendix B.
Generalizability, fairness, and substantive validity
Five of the 16 eligible instruments had information on generalizability (see table 2). Generalizability is the extent to which results measuring a trait are associated with results measuring the same trait using a different mode of measurement, such as student self-report and teacher report.

Three of the 16 instruments had information on fairness. Although some of the instruments indicated that the sought-after information existed in multiple languages, the search did not identify information on whether the instruments behave similarly for different social groups.
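As box 2 notes, fairness evidence often takes the form of statistical tests showing that a measure functions similarly across subgroups. A rigorous analysis would use differential item functioning or measurement invariance models; the sketch below is only a hypothetical first-pass screen that flags items whose mean responses differ between two subgroups, using made-up data.

```python
import numpy as np
from scipy import stats

def screen_item_fairness(items_group_a: np.ndarray,
                         items_group_b: np.ndarray,
                         alpha_level: float = 0.05) -> list:
    """Flag items whose mean response differs between two subgroups.
    A flagged item is a candidate for closer differential-item-functioning
    analysis, not proof of bias."""
    flagged = []
    for i in range(items_group_a.shape[1]):
        t, p = stats.ttest_ind(items_group_a[:, i], items_group_b[:, i])
        if p < alpha_level:
            flagged.append(i)
    return flagged

# Hypothetical 1-5 responses: 4 students per subgroup, 3 items
group_a = np.array([[4, 3, 5], [5, 4, 4], [4, 4, 5], [3, 3, 4]])
group_b = np.array([[4, 3, 2], [5, 4, 1], [4, 3, 2], [4, 4, 1]])
print(screen_item_fairness(group_a, group_b))  # e.g., [2] if the third item differs
```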
No instruments had information on substantive validity. Substantive validity refers to whether respondents process items or tasks in the way the developers intended. One way that instrument developers can determine whether there is evidence of substantive validity is to conduct cognitive interviews during instrument development and collect verbal information from individuals as they respond to individual items within a particular measure. Information collected in these interviews is used to ensure that respondents are engaging with the instrument as the developers intended.
Implications
This resource included 16 publicly available instruments that focus on measuring collaboration, perseverance, and self-regulated learning among secondary school students in the
United States. More instruments were initially identified but ultimately excluded because
they were not intended for use with secondary school students. Practitioners should use
caution when administering any instrument that was not developed for the population of
students that the instrument will be used to evaluate because those instruments often lack
psychometric evidence for that population. In the absence of such psychometric evidence,
practitioners cannot ensure that the analyses of scores generated from the measure are
reliable or valid for the target population. For example, analyzing student change across
time, comparing key subgroups, and benchmarking current levels of a trait in the district
are not warranted.
Among the 16 instruments identified, 11 were developed for use in research and 5 for
formative instruction. None of the information collected suggested that the instruments
should be used for summative purposes. With schools and districts ramping up efforts to
measure social and emotional learning skills for formative and summative use, practitioners
would benefit from the development of additional instruments for these purposes. Likewise,
additional work is needed to better understand whether existing instruments that were not
specifically developed for formative or summative purposes can be used for those purposes.
Meanwhile, practitioners should be cautious when using any measure for summative purposes that has not been developed and validated for that purpose. Without evidence to
support that an instrument is valid and reliable for a specific purpose, administrators are at
risk of using an invalid and unreliable assessment to inform high-stakes decisionmaking.
Finally, none of the instruments identified in this resource had information for substantive
validity, and only three had information on fairness. Information for the substantive component of validity is necessary to facilitate understanding of whether respondents process
the content of items from a measure as the developers intended. Information on fairness
is necessary for evaluating whether the measure is valid for comparing scores between
subgroups of students. To help practitioners better understand whether instruments are
measuring social and emotional learning skills for all populations of students, instrument
developers could assess the substantive and fairness components of validity. Practitioners
should use caution when administering instruments that lack information on substantive
validity or fairness, since these instruments may not be appropriate for all students that are
being evaluated.
Limitations
The first limitation of this resource is that only a small subset of instruments was reviewed.
The criteria for inclusion in this resource were more specific than in prior reviews of
instruments measuring social and emotional learning skills. For example, only instruments
measuring collaboration, perseverance, or self-regulated learning were reviewed. Further,
this resource excludes any instrument that had not been used with a secondary school
student population in the United States. These criteria were established because CVSD was interested in identifying instruments that measure collaboration, perseverance, and self-regulated learning in students in secondary school. Similarly, with a focus on identifying instruments that practitioners can use in their schools and districts, this resource excludes any instrument that was not publicly available.
Second, the review could include only instruments and validation studies that appeared in response to the search queries. It is possible that other instruments or validation studies exist that were not identified in the queries. For example,
if a validation study did not include any of the search terms described in appendix A, the
psychometric information for that instrument does not appear in this resource.
Appendix A. Methodology
The review of instruments included six primary steps:
• Described collaboration, perseverance, and self-regulated learning skills.
• Identified terms related to collaboration, perseverance, and self-regulated learning.
• Searched the literature for relevant instruments measuring collaboration, perseverance, and self-regulated learning.
• Determined the eligibility for inclusion in this resource of instruments identified
in the literature search.
• Reviewed the reliability and validity information available for eligible instruments.
• Determined whether the reliability and validity information met conventionally
accepted criteria.
Described collaboration, perseverance, and self-regulated learning skills
The Regional Educational Laboratory (REL) Northeast & Islands and Champlain Valley
School District (CVSD) collaborated to describe each of the social and emotional learning
skills (commonly referred to as constructs in the research literature) to be addressed in this
resource:
• Collaboration: collaborating effectively and respectfully to enhance the learning
environment.
• Perseverance: persevering when challenged, dealing with failure in a positive way,
seeking to improve one’s performance.
• Self-regulated learning: taking initiative in and responsibility for learning.
These descriptions align with commonly used definitions in the research for collaboration
(Johnson & Johnson, 1994; Kuhn, 2015; Wang, MacCann, Zhuang, Liu, & Roberts, 2009);
perseverance (Duckworth & Quinn, 2009; Eccles, Wigfield, & Schiefele, 1998; Pintrich,
2003; Seifert, 2004; Shechtman et al., 2013); and self-regulated learning (Muis, Winne, &
Jamieson-Noel, 2007; Zimmerman, 2000).
Identified terms related to collaboration, perseverance, and self-regulated learning
The study team identified terms that are synonymous with or related to collaboration,
perseverance, and self-regulated learning. For collaboration, related terms included:
• Collaborative competence.
• Collaborative learning.
• Collaborative problem-solving.
• Cooperation.
• Cooperative learning.
• Teamwork.
For perseverance, related terms included:
• Academic courage.
• Academic motivation.
• Conscientiousness.
• Coping.
• Delayed gratification.
• Determination.
• Grit.
• Locus of control.
• Motivation.
• Persistence.
• Resilience.
• Self-control.
• Self-discipline.
• Self-management.
• Tenacity.
For self-regulated learning, related terms included:
• Achievement goals.
• Metacognition.
• Motivation.
• Self-efficacy.
• Task interest.
Searched the literature for relevant instruments
Next, both peer-reviewed academic literature and reports by practitioners and districts
were searched. The following were among the main sources searched:
• Academic journal databases such as EBSCOhost (https://www.ebscohost.com) and
the Education Resources Information Center (ERIC) (https://eric.ed.gov).
• Institute of Education Sciences publications (https://ies.ed.gov).
• National Science Foundation (https://www.nsf.gov) publications and databases of
instruments (for example, through the STEM Learning and Research Center at
Education Development Center, http://stelar.edc.org).
• Compendia of instruments and literature reviews (Atkins-Burnett, Fernández,
Akers, Jacobson, & Smither-Wulsin, 2012; Fredricks & McColskey, 2012; Fredricks
et al., 2011; Kafka, 2016; National Center on Safe Supportive Learning Environments, 2017; O'Conner, De Feyter, Carr, Luo, & Romm, 2017; Rosen, Glennie,
Dalton, Lennon, & Bozick, 2010).
• Other relevant databases such as the American Psychological Association PsycTESTS database (http://www.apa.org/pubs/databases/psyctests/) and the Rush University Neurobehavioral Center's SELweb database (http://rnbc.org/research/selweb/).
• Other relevant sources, including those from the following organizations:
• The Collaborative for Academic, Social, and Emotional Learning at the University of Illinois at Chicago (http://www.casel.org).
• The National Academies of Sciences, Engineering, and Medicine (http://www.
nationalacademies.org).
• The Ecological Approach to Social Emotional Learning Laboratory at the
Harvard Graduate School of Education (https://easel.gse.harvard.edu).
• The Massachusetts Consortium for Innovative Education Assessment (http://www.mciea.org).
• P21 Partnership for 21st Century Learning (http://www.p21.org).
• The Character Lab (https://characterlab.org).
Searches were also conducted for performance assessments, self-report surveys, and other
instruments that seek to measure collaboration, perseverance, and self-regulated learning.
For example, search terms related to perseverance included:
• [Assessment] AND [Student] AND [Perseverance].
• [Instrument] AND [Student] AND [Perseverance].
• [Performance-based] AND [Assessment] AND [Student] AND [Perseverance].
• [Research] AND [Studies] AND [Student] AND [Perseverance].
• [Survey] AND [Student] AND [Perseverance].
Searches also included terms related to collaboration, perseverance, and self-regulated
learning. For example, a secondary search focused on identifying instruments that measure
perseverance included:
• [Assessment] AND [Student] AND [Persistence].
• [Instrument] AND [Student] AND [Persistence].
• [Performance-based] AND [Assessment] AND [Student] AND [Persistence].
• [Research] AND [Studies] AND [Student] AND [Persistence].
• [Survey] AND [Student] AND [Persistence].
Determined the eligibility of instruments identified in the literature search

A study team member used a structured protocol to determine the eligibility of instruments identified in the literature search. A second study team member then checked the instruments identified as meeting the eligibility criteria against the protocol a second time. In cases of disagreement on the eligibility of an instrument, the two study team members met to discuss and resolve the discrepancy.

Instruments were deemed eligible if they (see the sketch after this list):
• Measured one of the three targeted social and emotional learning skills.
• Were used with a population of secondary school students in the United States.
• Were publicly available online at no or low cost.
• Were published or had psychometric validation work completed between 1994 and
2017.
• Were not published as part of a doctoral dissertation.
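As a minimal sketch of how such a screening protocol can be encoded, the snippet below expresses the five criteria as a single boolean check. The Instrument fields and example values are hypothetical; the actual protocol was a document applied by human reviewers, not software.

```python
from dataclasses import dataclass

TARGET_SKILLS = {"collaboration", "perseverance", "self-regulated learning"}

@dataclass
class Instrument:
    skill: str                    # construct the instrument claims to measure
    used_with_us_secondary: bool  # used with grades 9-12 students in the United States
    publicly_available: bool      # available online at no or low cost
    validation_year: int          # year published or psychometrically validated
    doctoral_dissertation: bool   # published only as part of a dissertation

def is_eligible(inst: Instrument) -> bool:
    """Apply the five eligibility criteria from appendix A."""
    return (inst.skill in TARGET_SKILLS
            and inst.used_with_us_secondary
            and inst.publicly_available
            and 1994 <= inst.validation_year <= 2017
            and not inst.doctoral_dissertation)

print(is_eligible(Instrument("perseverance", True, True, 2007, False)))  # True
```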
Reviewed the reliability and validity information available for eligible instruments
A second search procedure was carried out for each of the instruments that met the initial
screening criteria to identify studies that might provide information about the reliability
and validity properties for each instrument. Search terms included: [Name of the instrument] AND [terms that included but were not limited to psychometrics, measurement, reliability, and validity].
The reliability and validity information for eligible instruments was then categorized and
summarized to assess whether a measure was reliable and valid to use. Messick’s (1995)
construct validity framework was used in evaluating six components of validity for each
instrument: content, substantive, structural, external, generalizability, and consequential.
In addition, this resource presents information on fairness and reliability, as recommended
in Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 2014).