Tools

A review of instruments for measuring social and emotional learning skills among secondary school students

Joshua Cox
Brandon Foster
David Bamat

Education Development Center, Inc.
In collaboration with the Social and Emotional Learning Research Alliance

Summary

This resource is intended to support state and local education agencies in identifying reliable and valid instruments to measure three social and emotional learning skills among secondary school students: collaboration, perseverance, and self-regulated learning.

U.S. Department of Education
Betsy DeVos, Secretary

Institute of Education Sciences
Mark Schneider, Director

National Center for Education Evaluation and Regional Assistance
Matthew Soldner, Commissioner
Elizabeth Eisner, Associate Commissioner
Amy Johnson, Action Editor
Felicia Sanders, Project Officer

REL 2020–010

The National Center for Education Evaluation and Regional Assistance (NCEE) conducts
unbiased large-scale evaluations of education programs and practices supported by federal
funds; provides research-based technical assistance to educators and policymakers; and
supports the synthesis and the widespread dissemination of the results of research and
evaluation throughout the United States.

October 2019

This report was prepared for the Institute of Education Sciences (IES) under Contract ED-IES-17-C-0008 by Regional Educational Laboratory Northeast & Islands, administered by Education Development Center, Inc. The content of the publication does not necessarily reflect the views or policies of IES or the U.S. Department of Education, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

This REL report is in the public domain. While permission to reprint this publication is
not necessary, it should be cited as:

Cox, J., Foster, B., & Bamat, D. (2019). A review of instruments for measuring social and emotional learning skills among secondary school students (REL 2020–010). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Northeast & Islands. Retrieved from http://ies.ed.gov/ncee/edlabs.

This report is available on the Regional Educational Laboratory website at http://ies.ed.gov/ncee/edlabs.

Summary

This resource was designed to support state and local education agencies in identifying reliable and valid instruments for measuring collaboration, perseverance, and self-regulated learning among secondary school students. It was developed through the Regional Educational Laboratory Northeast & Islands Social and Emotional Learning Alliance, whose members aim to identify and synthesize emerging evidence on social and emotional learning to improve education outcomes. The alliance’s focus on social and emotional learning skills is supported by evidence suggesting that these skills may improve students’ academic and career outcomes (Farrington et al., 2012; Gayl, 2017; Heckman, 2008; West et al., 2016). This resource can help alliance members and other state and local education agencies develop and track students’ social and emotional learning skills as an indicator of student success within accountability models required by the Every Student Succeeds Act of 2015.

This resource supports stakeholders in:
   • Identifying available instruments for measuring collaboration, perseverance, and
        self-regulated learning among secondary school populations.
   • Understanding the information about reliability and validity that is available for
        each of these instruments.

In addition, the resource offers questions that schools and districts should consider when
reviewing reliability and validity information.

Contents

Summary

Why this resource?

What is this resource?
   Worksheet 1. Questions to identify and evaluate instruments for measuring social and emotional learning skills
   Step 1. Indicate your response for each of the following questions. Be as specific as possible. Use this information to identify an initial list of instruments that your school or district might use.
   Step 2. Now that you have identified the specific skills to be measured, the target group of respondents, and the purpose for using the instrument, work through the following yes/no questions and associated considerations.

What instruments are available?

What information is available about the reliability of the instruments?

What information is available about the validity of the instruments?
   Content validity
   Structural validity
   External validity
   Consequential validity
   Generalizability, fairness, and substantive validity

Implications

Limitations

Appendix A. Methodology

Appendix B. Summary of reliability and validity information on eligible instruments

Notes

References

Boxes
1   Methodology
2   Definitions of key terms

Tables
1    Format of eligible instruments, by social and emotional learning skill
2    Availability of reliability and validity information in the 16 eligible instruments
3    Snapshot of whether reliability and structural validity information met conventionally accepted criteria
4    Snapshot of consequential validity information
B1   Revised Self-Report Teamwork Scale: Summary of reliability and validity information
B2   Teamwork Scale: Summary of reliability and validity information
B3   Teamwork Scale for Youth: Summary of reliability and validity information
B4   Subjective Judgement Test: Summary of reliability and validity information
B5   Teacher-Report Teamwork Assessment: Summary of reliability and validity information
B6   Engagement with Instructional Activity: Summary of reliability and validity information
B7   Expectancy-Value-Cost Scale: Summary of reliability and validity information
B8   Grit Scale—Original Form: Summary of reliability and validity information
B9   Grit Scale—Short Form: Summary of reliability and validity information
B10  Inventory of Metacognitive Self-Regulation on Problem-Solving: Summary of reliability and validity information
B11  Junior Metacognitive Awareness Inventory: Summary of reliability and validity information
B12  Self-Directed Learning Inventory: Summary of reliability and validity information
B13  Self-Regulation Strategy Inventory—Self-Report: Summary of reliability and validity information
B14  Motivated Strategies for Learning Questionnaire: Summary of reliability and validity information
B15  Program for International Student Assessment Student Learner Characteristics as Learners: Summary of reliability and validity information
B16  Student Engagement Instrument: Summary of reliability and validity information
Why this resource?

Students who enhance their social and emotional learning skills may also improve their
academic and career outcomes (Farrington et al., 2012; Gayl, 2017; Heckman, 2008; West
et al., 2016). These skills may also be malleable and amenable to intervention (Durlak,
Weissberg, Dymnicki, Taylor, & Schellinger, 2011; What Works Clearinghouse, 2007).
Accordingly, some education agencies have started to develop and track students’ social
and emotional learning skills as an indicator of student success within accountability
models required by the Every Student Succeeds Act of 2015.

However, researchers have not reached consensus on the best ways to measure social and emotional learning skills or the appropriate uses of existing instruments (see, for example, Duckworth & Yeager, 2015). Some researchers have cautioned against using instruments to measure social and emotional learning skills for summative purposes in high-stakes accountability models because most instruments are fairly new and lack adequate information on their reliability and validity (Duckworth & Yeager, 2015; Melnick, Cook-Harvey, & Darling-Hammond, 2017). Reliability refers to whether an instrument consistently measures the skill across respondents, time, or raters. Validity refers to whether an instrument measures what it intends to measure and whether the inferences drawn from an instrument are appropriate. Indeed, this review of instruments for measuring social and emotional learning skills found no explicit language from their developers suggesting that any of them should be used for summative purposes. It may be more appropriate for schools to use data collected from instruments that measure social and emotional learning skills for formative purposes to inform teaching, learning, and program investments (Melnick et al., 2017). To make these decisions, state and local education agency administrators need information to better understand the intended uses and limitations of available instruments, as well as their reliability and validity.

The Regional Educational Laboratory Northeast & Islands Social and Emotional Learning Alliance is a researcher–practitioner partnership comprising researchers, regional educators, policymakers, and others. Alliance partners share an interest in identifying and synthesizing evidence on social and emotional learning skills to improve education outcomes. Champlain Valley School District (CVSD) in Vermont, one of the alliance’s partners, seeks to identify instruments for measuring three social and emotional learning skills described in its mission statement: collaboration, perseverance, and self-regulated learning (Champlain Valley School District, 2016). Its intention is to better understand students’ intrapersonal and interpersonal competencies, the two main areas into which social and emotional learning competencies are organized. Intrapersonal competencies refer to how students deal with their own thoughts and emotions, and interpersonal competencies refer to the skills and attitudes students use to interact with others.

CVSD is exploring ways to use scores from instruments that measure collaboration, perseverance, and self-regulated learning to formally evaluate students’ social and emotional learning skills, develop its formative and summative assessment systems in accordance with the Every Student Succeeds Act, and provide guidance for educators on measuring and integrating social and emotional learning skills into instruction.

What is this resource?

This resource supports local and state education agencies’ efforts to identify reliable and valid instruments that measure social and emotional learning skills (see box 1 for an overview of methods and appendix A for more detailed information). The resource also provides questions that schools and districts should consider when reviewing reliability and validity information. Where possible, the resource describes whether the available reliability and validity information meets conventionally accepted criteria. In recognition that each school or district context is unique, this resource does not recommend specific instruments to use with accountability models required by the Every Student Succeeds Act.

This resource supports stakeholders in:
   • Identifying available instruments for measuring collaboration, perseverance, and
        self-regulated learning among secondary school populations.
   • Understanding information about reliability and validity that is available for each
        of these instruments.

Box 1. Methodology

Instrument identification process
The instruments were identified and reviewed in a six-step process:
1. Described collaboration, perseverance, and self-regulated learning skills.
2. Identified terms related to collaboration, perseverance, and self-regulated learning.
3. Searched the literature for relevant instruments measuring collaboration, perseverance,
    and self-regulated learning.
4. Determined the eligibility for inclusion in this resource of instruments identified in the
    literature search.
5. Reviewed the reliability and validity information available for eligible instruments.
6. Determined whether the available reliability and validity information met conventionally
    accepted criteria.

Construct descriptions
To undertake the first step listed above, the Regional Educational Laboratory Northeast &
Islands and Champlain Valley School District (CVSD) partnered to describe each of the social
and emotional learning skills (or constructs) to be addressed in this resource:
• Collaboration: collaborating effectively and respectfully to enhance the learning
     environment.
• Perseverance: persevering when challenged, dealing with failure in a positive way, and
     seeking to improve one’s performance.
• Self-regulated learning: taking initiative and responsibility for learning.
     These descriptions align with commonly used definitions in the research (for example, Kuhn, 2015; Shechtman, DeBarger, Dornsife, Rosier, & Yarnall, 2013; Zimmerman, 2000).¹

Eligibility criteria
Instruments had to meet the following eligibility criteria to be included in this resource:
• Measure one of the three targeted social and emotional learning skills.


•    Have been used with a population of secondary school students (students in grades 9–12) in the United States.²
•    Be publicly available online at no or low cost.³
•    Be published or have had psychometric validation work completed between 1994⁴ and 2017.
•    Not be published as part of a doctoral dissertation.⁵
     As recommended by the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014), this resource presents information on fairness and reliability. Six components of validity were also evaluated for each instrument following Messick’s (1995) construct validity framework: content, substantive, structural, external, generalizability, and consequential. See box 2 for definitions of reliability, validity, the six components of validity, and fairness. Each eligible instrument was then categorized using the criteria in box 2 (see appendix B for a summary of psychometric evidence for eligible instruments).

Notes
1. There are many definitions of these skills in the research literature. See appendix A for additional citations for each skill.
2. CVSD is specifically interested in identifying reliable and valid instruments for secondary school students.
3. Instruments published in research journals at a relatively low cost (less than $50) were included.
4. A group of educators, researchers, and child advocates met in 1994 to discuss social and emotional learning and coined the term (Durlak, Domitrovich, Weissberg, & Gullotta, 2015).
5. These instruments were excluded because they are not typically analyzed with the same rigor as other published instruments (for example, instruments published in peer-reviewed journals).
Source: American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (2014) and Messick (1995).

This resource indicates whether psychometric information was available for reliability and seven components of validity—content, structural, external, consequential, generalizability, fairness, and substantive¹ (see box 2 for definitions). Schools and districts can use reliability and validity information to evaluate whether the adoption of a measure will likely produce sufficient information to meet the needs of that school or district. It is important to note that the mere availability of reliability and validity information for a measure does not necessarily indicate support for the use of that measure.

Box 2. Definitions of key terms

Reliability. Whether an instrument consistently measures the skill across respondents, time, or raters. Information that could support reliability includes several families of statistics, including measures of internal consistency (such as Cronbach’s α and omega), test-retest associations, and inter-rater reliability. For the instruments included in this resource, evaluation of reliability information was based on the widely cited conventional criterion of a Cronbach’s α ≥ .70 (Nunnally, 1978).
Validity. Whether an instrument measures what it is intended to measure and whether the
inferences drawn from an instrument are appropriate. This resource focuses on the following
seven components of validity.


Content validity. Whether items for an instrument contain content that adequately describes the skill(s). Information that could support this component of validity includes a review of survey items by experts, use of a well-constructed theory for the skill(s), and the use of multiple perspectives in creating the items. An example of using multiple perspectives is asking teachers and students to define the skill and using that information with experts to construct items.

Structural validity. Whether an instrument’s items relate statistically to the skill being measured. Information that could support this component of validity includes advanced statistical analyses, typically factor analyses that examine whether items are correlated in expected ways. For example, one approach to measuring structural validity is to show that an instrument’s items statistically relate to the skills and associated subscales that the instrument measures.

External validity. Whether there are correlations between scores from the instrument and
scores from other instruments measuring similar skills. Information that could support this
component of validity includes positive correlations between scores generated from the
instrument and other similar instruments measuring the skill. For example, one approach to
measuring external validity is to establish whether correlations exist between two different
instruments that measure perseverance.

Consequential validity. Whether scores generated from an instrument are associated with the intended consequences of using the instrument, such as improving student outcomes.¹ Information that could support this component of validity includes correlations between students’ scores on the measure and grade point average or standardized test scores (if the intended consequence is improved student learning), or graduation or job attainment.

Generalizability validity. Whether scores from the instrument are correlated with other modes of measurement for the same skill, such as self-reported or observational information. Unlike external validity, which explores correlations between scores from different instruments measuring similar skills, generalizability explores correlations between scores from different modes of measurement of the same skill. Information that could support generalizability includes correlations between students’ self-report scores and either teacher-report scores for students or other observational instruments measuring student behavior.

Fairness. Whether an instrument is free of bias against specific subgroups of students. Information that could support this component of validity includes having versions of the measure available in multiple languages or statistical tests showing that scores from the measure function similarly across all subgroups. For example, subgroups could include students from different racial/ethnic backgrounds, students eligible for the national school lunch program, or English learner students.

Substantive validity. Whether students process an instrument’s items or tasks in the way
the developers intended. Information that could support this component of validity includes
cognitive interviews with a series of students as they engage with items from the measure. For
example, interviewing students can provide information about sources of confusion that might
emerge as students respond to items on the measure.

Note
1. According to Messick (1995, p. 11), “the consequential aspect of construct validity includes evidence and rationales for evaluating the intended and unintended consequences of score interpretation and use in both the short and long term.” In practice, the consequences of using a measure can be quite broad and depend on how a district might propose to use a measure. However, the purpose of this resource is to review instruments that might be predictive of important student outcomes. Thus, a consequence of using one of the instruments in this resource should be that it helps districts focus on developing behaviors or skills that could predict important student outcomes (for example, test scores, graduation, and college enrollment). For this reason the resource defines consequential validity by whether scores generated on the measure are associated with important student outcomes.

Practitioners using this resource should evaluate whether the reliability and validity information for each instrument provides sufficient support for adopting the measure in their district. Practitioners looking for support in evaluating the psychometric information for each instrument can find a series of guiding questions in worksheet 1 to help identify the information necessary to select instruments that are likely to be useful in various local contexts.

Different uses of an instrument will place unique demands on the quality of reliability and validity information presented. For example, a district could be interested in monitoring perseverance among students. In that case, whether an instrument has demonstrated correlations with other student outcomes (that is, consequential or generalizability validity) may not be a concern. On the other hand, practitioners might be interested in ensuring that an instrument has demonstrated correlations with other student outcomes if their goal is to provide formative feedback to support a broader goal such as improving students’ grades or increasing educational attainment. Thus, practitioners should first identify the components of reliability and validity that are most important to their school or district, which puts them in a better position to evaluate reliability and validity information for instruments measuring social and emotional learning skills.

Worksheet 1. Questions to identify and evaluate instruments for measuring social and emotional learning skills

Practitioners can use the questions in this worksheet to identify instruments and determine whether those instruments align with their needs. The questions in step 1 can be used to identify the skills that will be measured, the target group of respondents, and the purpose for using the instrument. The questions in step 2 can be used to identify the components of reliability and validity that are most important in the practitioner’s school or district. Practitioners can use their yes/no responses in step 2 and accompanying considerations when reviewing the psychometric information for each instrument presented in appendix B. The worksheet concludes by offering additional considerations to support practitioners in identifying whether administering an instrument is feasible in their school or district.

Step 1. Indicate your response for each of the following questions. Be as specific as possible. Use this information to identify an initial list of instruments that your school or district might use.

What are the specific skills to be measured? That is, are you interested in measuring collaboration, perseverance, or self-regulated learning? (Then look at table 1 to see which instruments cover the specific skills of interest to you.)

What students are you planning to assess (for example, all high school students)?

What is the purpose for using an instrument (for example, to provide support to teachers)?

Step 2. Now that you have identified the specific skills to be measured, the target group of
respondents, and the purpose for using the instrument, work through the following yes/no questions
and associated considerations.
You can use responses to this worksheet to synthesize and evaluate information in the tables
in appendix B. You can also use table 2 to quickly identify whether information on reliability and
validity is available for each instrument.
Note: For each instrument, practitioners should, at a minimum, consider information presented in appendix B
on reliability, content validity, and structural validity. Table 3 in the main text provides information on whether
reliability information and structural validity information met conventionally accepted criteria. If this information
is not provided, or is provided but is not within conventionally accepted criteria, then scores generated from the
instrument may not be useful. Finally, reviewing content validity is necessary since it considers how the content
from an instrument’s items overlaps with the practitioner’s understanding of a particular social and emotional
learning skill. The alignment between the developer’s definition of a social and emotional learning skill and the
practitioner’s is a core issue for ensuring that a school or district is measuring what it intends to measure.

Question: Is it important that students were involved in the instrument’s development process? ■ Yes ■ No
If yes, consider that this resource found no information on substantive validity for any of the instruments reviewed.

Question: Are you interested in using scores from the instrument along with instruments that measure other related social and emotional learning skills? ■ Yes ■ No
If yes, consider examining information presented in appendix B for external validity to check whether scores generated from the instrument are related to scores from other conceptually similar instruments of social and emotional learning skills.

Question: Are you interested in measuring a specific social and emotional learning skill using more than one mode of measurement, such as student self-report surveys and observations? ■ Yes ■ No
If yes, consider examining information presented in appendix B for generalizability validity to see if analyses were undertaken to establish whether scores from the instrument are correlated with other modes of measurement of the same social and emotional learning skill.

Question: Are you interested in connecting your students’ social and emotional learning skills scores to other consequential outcomes, such as achievement scores, graduation rates, and attendance? ■ Yes ■ No
If yes, consider information presented in appendix B for consequential validity and table 4 in the main text. Specifically, examine whether scores from the instrument are correlated with other desired student outcomes.

Question: Are you interested in comparing scores on the instrument for different subgroups of students (for example, by race/ethnicity, eligibility for the national school lunch program, or English learner student status)? ■ Yes ■ No
If yes, consider examining information presented in appendix B for fairness to see whether information is available about the specific subgroups you are comparing.

Practitioners should also consider the following questions for all instruments under consideration:
• How affordable is the instrument to administer to the target group of respondents? Practitioners should consider costs associated with purchasing the instrument, if applicable, as well as any administrative costs associated with administering and scoring the assessment and reporting the results.
• How much time is available for students to complete an instrument? Practitioners might consider piloting the instrument with a few students to better understand the amount of time that students require to complete the instrument.
• What format of administration is feasible in your context? Practitioners might consider piloting the instrument with a few students to better understand whether the format is feasible in their context.
• How will scores be reported? Are the scores easily interpreted and useful? Practitioners should consider their purpose for administering the instrument (see step 1) and whether the results from the instrument will support that purpose.

What instruments are available?

In total, 16 instruments were assessed to be eligible for inclusion in the resource. The initial search yielded 67 instruments as possible measures of one or more of the social and emotional learning skills of interest. Of those 67 instruments, 30 were excluded because they had not been administered with secondary school students in the United States, 24 were excluded because they were not publicly available, and 7 were excluded because they were not published or had not undergone validation work between 1994 and 2017.²

Eligible instruments included five measures of collaboration, four measures of perseverance, four measures of self-regulated learning, and three measures of both perseverance and self-regulated learning (table 1). All instruments measuring perseverance or self-regulated learning were student self-report surveys. Three of the five instruments measuring collaboration were student self-report surveys, one was task or performance based, and one was a teacher-report survey.

Note that different instruments measuring a social and emotional learning skill can define that skill differently. For example, the Motivated Strategies for Learning Questionnaire measures self-regulated learning, which the authors describe as including planning, monitoring, and regulating activities (Pintrich, Smith, García, & McKeachie, 1991, p. 23). The Junior Metacognitive Awareness Inventory, another measure of self-regulated learning, uses a framework focused on planning, monitoring, and evaluation (Sperling, Howard, Miller, & Murphy, 2002, p. 55). These small differences are driven by underlying differences in the theoretical research used to shape the content of items and can ultimately lead to variation between different instruments that measure the skill. Before using any instrument, practitioners should examine items from the instrument to see whether they align with their perspective on the skill. Social and emotional learning skills described in research may share a common name, but they can be defined slightly differently by the items in an instrument.

Instruments differ in their intended purpose, including whether for research, formative, or summative uses. A description of each use follows:
    • Research use. The intention is to use results produced by the instrument to describe these skills for a particular population or to examine relationships. The research may then be used as evidence to inform policy or practice, but it is not typically linked to a punitive or beneficial consequence for individuals, schools, or districts.
    • Formative use. The intention is to use results produced by the instruments to inform instructional change that can influence positive change in students.
    • Summative use. The intention is to assign a final rating or score to each student by comparing each student against a standard or benchmark. These comparisons can be used to assess the effectiveness of a teacher, school, or district, and an assessment of underachievement can lead to negative consequences for teachers, schools, or districts. Additionally, these comparisons can be used to determine whether a student should be promoted to the next grade level or graduate. For that reason, summative instruments are perceived as having higher stakes, and instruments used for summative purposes traditionally require more stringent reliability and validity evidence before use (Crooks, Kane, & Cohen, 1996; Haladyna & Downing, 2004).

Table 1. Format of eligible instruments, by social and emotional learning skill
 Social and emotional learning skill measured and instrument                 Format
 Collaboration
 Revised Self-Report Teamwork Scale                                          Student self-report survey
 Teamwork Scale                                                              Student self-report survey
 Teamwork Scale for Youth                                                    Student self-report survey
 Subjective Judgement Test                                                   Task or performance based
 Teacher-Report Teamwork Assessment                                          Teacher-report survey
 Perseverance
 Engagement with Instructional Activity                                      Student self-report survey
 Expectancy-Value-Cost Scale                                                 Student self-report survey
 Grit Scale—Original Form                                                    Student self-report survey
 Grit Scale—Short Form                                                       Student self-report survey
 Self-regulated learning
 Inventory of Metacognitive Self-Regulation on Problem-Solving               Student self-report survey
 Junior Metacognitive Awareness Inventory                                    Student self-report survey
 Self-Directed Learning Inventory                                            Student self-report survey
 Self-Regulation Strategy Inventory—Self-Report                              Student self-report survey
 Perseverance and self-regulated learning
 Motivated Strategies for Learning Questionnaire                             Student self-report survey
 Program for International Student Assessment Student Learner                Student self-report survey
 Characteristics as Learners
 Student Engagement Instrument                                               Student self-report survey

Note: Instruments are sorted alphabetically first by the measured skill and second by the format of the
instrument.
Source: Authors’ analysis based on sources shown in tables B1–B16 in appendix B.

Of the 16 instruments reviewed for this resource, 11 were used for research purposes and 5 for formative purposes. There was no explicit language suggesting that any of the instruments should be used for summative purposes. Practitioners should avoid using an instrument for purposes other than those outlined by the instrument developer unless there is sufficient reliability and validity evidence supporting its use for a different purpose.

What information is available about the reliability of the instruments?

All 16 instruments eligible for inclusion in this resource have information on reliability (table 2; see appendix B for more detailed information). Fifteen of the instruments have information on a measure of reliability known as Cronbach’s α, which is used to gauge internal consistency by measuring the extent of the relationship among the items (for example, survey questions) in a measure. Reliability statistics, such as Cronbach’s α, are used to examine the likelihood of an instrument generating similar scores under consistent conditions. Twelve of the instruments met conventionally accepted criteria for Cronbach’s α, indicating that the instruments generated scores that were reliable (table 3). Two instruments providing measures for Cronbach’s α showed mixed results, with some subscales in the instrument meeting conventionally accepted criteria for reliability and some not. One instrument reported a Cronbach’s α value that was below conventionally accepted criteria. Instruments that meet conventionally accepted thresholds for reliability might be more suitable for informing policy and practice.
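
For readers who want to check internal consistency on their own pilot data, the following minimal sketch shows one way to compute Cronbach’s α in Python and compare it against the .70 criterion cited above; the response matrix and names are hypothetical, not drawn from any of the reviewed instruments.

import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: five students answering four Likert-type items.
responses = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")  # compare against .70

In practice, α should be computed separately for each subscale, since the statistic assumes that the items measure a single underlying skill.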

Table 2. Availability of reliability and validity information in the 16 eligible instruments

For each instrument, the components with available information are listed; components not listed had no available information. No instrument had information available on substantive validity.

Revised Self-Report Teamwork Scale: reliability; content, structural, external, consequential, generalizability, and fairness validity
Subjective Judgement Test: reliability; content, structural, external, consequential, generalizability, and fairness validity
Expectancy-Value-Cost Scale: reliability; content, structural, external, consequential, generalizability, and fairness validity
Teacher-Report Teamwork Assessment: reliability; content, external, consequential, and generalizability validity
Teamwork Scale for Youth: reliability; content, structural, external, and generalizability validity
Grit Scale—Original Form: reliability; content, structural, external, and consequential validity
Grit Scale—Short Form: reliability; content, structural, external, and consequential validity
Student Engagement Instrument: reliability; content, structural, external, and consequential validity
Self-Directed Learning Inventory: reliability; content, structural, external, and consequential validity
Teamwork Scale: reliability; content, structural, external, and consequential validity
Self-Regulation Strategy Inventory—Self-Report: reliability; content, external, and consequential validity
Junior Metacognitive Awareness Inventory: reliability; content, structural, and consequential validity
Motivated Strategies for Learning Questionnaire: reliability; content, structural, and external validity
Program for International Student Assessment Student Learner Characteristics as Learners: reliability; content and structural validity
Inventory of Metacognitive Self-Regulation on Problem-Solving: reliability; content validity
Engagement with Instructional Activity: reliability; content validity

Note: Instruments are sorted according to the availability of reliability and validity information.
Source: Authors’ analysis based on sources shown in tables B1–B16 in appendix B.

Table 3. Snapshot of whether reliability and structural validity information met
conventionally accepted criteria
                                                                         Information for        Information for
                                                                          reliability met      structural validity
                                                                         conventionally        met conventionally
 Instrument                                                             accepted criteriaᵃ     accepted criteriaᵇ
 Junior Metacognitive Awareness Inventory                                       Yes                    Yes
 Program for International Student Assessment Student Learner
 Characteristics as Learners                                                    Yes                    Yes
 Self-Directed Learning Inventory                                               Yes                    Yes
 Student Engagement Instrument                                                  Yes                    Yes
 Subjective Judgement Test                                                      Yes                    Yes
 Teamwork Scale                                                                 Yes                    Yes
 Teamwork Scale for Youth                                                       Yes                    Yes
 Grit Scale—Original Form                                                       Yes                     No
 Revised Self-Report Teamwork Scale                                             Yes                     No
 Inventory of Metacognitive Self-Regulation on Problem-Solving                  Yes                     —
 Self-Regulation Strategy Inventory—Self-Report                                 Yes                     —
 Teacher-Report Teamwork Assessment                                             Yes                     —
 Grit Scale—Short Form                                                        Some                    Some
 Motivated Strategies for Learning Questionnaire                              Some                      No
 Engagement with Instructional Activity                                         No                      —
 Expectancy-Value-Cost Scale                                                    —                      Yes

Note: Instruments are sorted first according to whether information for the instrument met conventionally accepted criteria for reliability and then for structural validity. The two statistics were evaluated using only the source articles presented for each measure in appendix B. Additional details regarding conventionally accepted criteria are available in appendix A.
a. Evaluation of reliability information was based on the widely cited conventional criterion of a Cronbach’s α ≥ .70 (Nunnally, 1978). However, researchers have highlighted that high Cronbach’s α values also correspond with measuring a decreased range of the measured skill (Sijtsma, 2009). Yes indicates Cronbach’s α ≥ .70; No indicates Cronbach’s α < .70; Some indicates mixed results, with some subscales containing Cronbach’s α values that meet the .70 threshold and some containing values that do not; and — indicates that articles did not provide information for Cronbach’s α.
b. Evaluation of structural validity information was based on whether the articles reported confirmatory factor models with fit statistics for the models falling into conventionally acceptable ranges: Tucker Lewis index ≥ .90, comparative fit index ≥ .90, standardized root mean square residual ≤ .08, and root mean square error of approximation ≤ .05. Some indicates mixed results, with some subscales meeting conventionally accepted criteria while others did not, and — indicates that articles did not provide information for confirmatory factor tests.
Source: Authors’ analysis based on sources shown in tables B1–B16 in appendix B.

What information is available about the validity of the instruments?

All 16 instruments eligible for inclusion in this resource had information related to at least
one component of validity (see table 2).

Content validity

The component of validity most commonly available for eligible instruments was content validity (see table 2). While content validity is often supported by reviews involving multiple stakeholders with content expertise, none of the instruments included in this resource had evidence of these types of review. All 16 instruments did, however, offer information on content validity in discussions about the theory and previous research that guided the construction of items that make up the instruments.

Structural validity

Twelve of the 16 instruments in this resource had information on structural validity (see table 2). Eight of these instruments were supported by evidence indicating that the instrument met conventionally accepted criteria based on results from confirmatory factor analysis, a statistical technique to establish whether the items from a measure are related to the centrally measured skill (see table 3). One additional measure showed mixed results because the authors reported only one of many statistics needed to assess whether the model fit adequately.

The predominant form of information provided for structural validity was results from confirmatory factor analysis. Social and emotional learning skills are often made up of more than one component. For this reason, instruments that measure social and emotional learning skills often comprise subscales that measure the components that make up the broader social and emotional learning skill. For example, developers of the Grit Scale (Duckworth, Peterson, Matthews, & Kelly, 2007) hypothesize that the instrument consists of two separate but related components, Consistency of Interest and Perseverance of Effort. A confirmatory factor analysis of the Grit Scale was conducted to answer whether the data confirmed the existence of these two separate components. All items for a measure should conceptually align with the overall concept of the skill in some way, with each item describing a different component of the skill.
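
To make these conventionally accepted criteria concrete, the short sketch below checks a set of hypothetical confirmatory factor analysis fit statistics against the thresholds described in the note to table 3; the values are invented for illustration and would in practice come from whatever software estimated the factor model.

# Hypothetical fit statistics reported for a confirmatory factor model.
fit_stats = {"TLI": 0.94, "CFI": 0.95, "SRMR": 0.06, "RMSEA": 0.04}

# Conventionally accepted criteria (see the note to table 3).
criteria = {
    "TLI":   lambda v: v >= 0.90,   # Tucker Lewis index
    "CFI":   lambda v: v >= 0.90,   # comparative fit index
    "SRMR":  lambda v: v <= 0.08,   # standardized root mean square residual
    "RMSEA": lambda v: v <= 0.05,   # root mean square error of approximation
}

for name, meets in criteria.items():
    verdict = "meets" if meets(fit_stats[name]) else "does not meet"
    print(f"{name} = {fit_stats[name]:.2f} {verdict} the conventional criterion")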

External validity

Twelve of the 16 instruments in this resource had information on external validity (see table 2). External validity refers to information about the extent to which results produced from an instrument are correlated with results from other instruments in strength and direction in the theoretically expected manner. For instance, if instruments for depression, anxiety, and happiness were administered to the same sample, scores on the depression instrument would be expected to be positively associated with scores on the anxiety instrument, while scores on the happiness instrument would be expected to be negatively associated with scores on the depression instrument. If these relationships were not observed, then the instrument likely did not measure what it was intended to measure and its external validity is questionable.
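
As an illustration of this kind of evidence, the sketch below correlates hypothetical scores from two instruments administered to the same students; scipy’s pearsonr is used here simply as one common way to obtain the coefficient and an accompanying significance test.

import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores for the same ten students on two perseverance instruments.
instrument_a = np.array([3.2, 4.1, 2.8, 3.9, 4.5, 3.0, 3.7, 4.2, 2.5, 3.8])
instrument_b = np.array([3.0, 4.3, 2.9, 3.6, 4.4, 3.2, 3.5, 4.0, 2.7, 3.9])

r, p_value = pearsonr(instrument_a, instrument_b)
print(f"r = {r:.2f}, p = {p_value:.3f}")
# A strong positive r in the expected direction is consistent with external
# validity; a weak or oppositely signed r would call it into question.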

Consequential validity

Eleven of the 16 instruments in this resource had information on consequential validity (see table 2). Consequential validity refers to information about the extent to which results produced by an instrument are associated with important student outcomes. The predominant student outcomes examined in these studies were student achievement outcomes. These associations were in the expected positive direction for 10 of these 11 instruments (table 4). One measure, the Student Engagement Instrument, was also negatively correlated with student suspension rates.
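
Consequential validity evidence is likewise correlational, but the expected direction depends on the outcome. The sketch below, using invented data, checks that hypothetical instrument scores correlate positively with grade point average and negatively with suspensions, mirroring the pattern reported for the Student Engagement Instrument.

import numpy as np

# Hypothetical data for the same eight students.
scores = np.array([3.1, 4.0, 2.7, 3.8, 4.4, 2.9, 3.5, 4.1])   # instrument scores
gpa = np.array([2.8, 3.6, 2.5, 3.4, 3.9, 2.6, 3.2, 3.7])      # grade point averages
suspensions = np.array([2, 0, 3, 1, 0, 2, 1, 0])              # days suspended

# np.corrcoef returns a correlation matrix; the off-diagonal entry is r.
r_gpa = np.corrcoef(scores, gpa)[0, 1]
r_susp = np.corrcoef(scores, suspensions)[0, 1]
print(f"r with GPA = {r_gpa:.2f} (expected positive)")
print(f"r with suspensions = {r_susp:.2f} (expected negative)")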

Table 4. Snapshot of consequential validity information
                                                        Positively correlated     Type of consequential
                                                         with consequential       student outcome
 Instrument                                              student outcomesᵃ        reported in article
 Student Engagement Instrument                                   Yes              Grade point averages,
                                                                                  course grades, and
                                                                                  suspension rates
 Revised Self-Report Teamwork Scale                              Yes              Course grades
 Teamwork Scale                                                  Yes              Grade point averages
 Expectancy-Value-Cost Scale                                     Yes              Achievement scores
 Junior Metacognitive Awareness Inventory                        Yes              Swanson Metacognitive
                                                                                  Questionnaire, course grades,
                                                                                  and grade point averages
 Self-Directed Learning Inventory                                Yes              Grade point averages
 Self-Regulation Strategy Inventory—Self-Report                  Yes              Course grades
 Subjective Judgement Test                                        Yes              Course gradesᵇ
 Teacher-Report Teamwork Assessment                              Yes              Course grades
 Grit Scale—Short Form                                           Yes              Achievement scores
 Grit Scale—Original Form                                        No               Achievement scores
 Engagement with Instructional Activity                           —               —
 Inventory of Metacognitive Self-Regulation on                    —               —
 Problem-Solving
 Motivated Strategies for Learning Questionnaire                  —               —
 Program for International Student Assessment                     —               —
 Student Learner Characteristics as Learners
 Teamwork Scale for Youth                                         —               —

Note: Instruments are sorted in the table according to whether they are positively correlated with consequential student outcomes.
a. Yes indicates a positive association with a consequential student outcome; No indicates a negative association with a consequential student outcome; and — indicates that the articles did not provide information.
b. Although correlations between scores on the Subjective Judgement Test and course grades were in the expected direction, the correlation was not statistically significant.
Source: Authors’ analysis based on sources shown in tables B1–B16 in appendix B.

Generalizability, fairness, and substantive validity

Five of the 16 eligible instruments had information on generalizability (see table 2). Generalizability is the extent to which results measuring a trait are associated with results measuring the same trait using a different mode of measurement, such as student self-report and teacher report.

Three of the 16 instruments had information on fairness. Although some of the instruments are available in multiple languages, the search did not identify information on whether the instruments behave similarly for different social groups.

No instruments had information on substantive validity. Substantive validity refers to whether respondents process items or tasks in the way the developers intended. One way that instrument developers can determine whether there is evidence of substantive validity is to conduct cognitive interviews during instrument development and collect verbal information from individuals as they respond to individual items within a particular measure. Information collected in these interviews is used to ensure that respondents are engaging with the instrument as the developers intended.

Implications

This resource included 16 publicly available instruments that focus on measuring collaboration, perseverance, and self-regulated learning among secondary school students in the United States. More instruments were initially identified but ultimately excluded because they were not intended for use with secondary school students. Practitioners should use caution when administering any instrument that was not developed for the population of students that the instrument will be used to evaluate because those instruments often lack psychometric evidence for that population. In the absence of such psychometric evidence, practitioners cannot ensure that the analyses of scores generated from the measure are reliable or valid for the target population. For example, analyzing student change across time, comparing key subgroups, and benchmarking current levels of a trait in the district are not warranted.

Among the 16 instruments identified, 11 were developed for use in research and 5 for formative instruction. None of the information collected suggested that the instruments should be used for summative purposes. With schools and districts ramping up efforts to measure social and emotional learning skills for formative and summative use, practitioners would benefit from the development of additional instruments for these purposes. Likewise, additional work is needed to better understand whether existing instruments that were not specifically developed for formative or summative purposes can be used for those purposes. Meanwhile, practitioners should be cautious when using any measure for summative purposes that has not been developed and validated for that purpose. Without evidence to support that an instrument is valid and reliable for a specific purpose, administrators are at risk of using an invalid and unreliable assessment to inform high-stakes decisionmaking.

Finally, none of the instruments identified in this resource had information on substantive
validity, and only three had information on fairness. Information on the substantive
component of validity is needed to understand whether respondents process the content
of items as the developers intended. Information on fairness
is necessary for evaluating whether a measure is valid for comparing scores between
subgroups of students. To help practitioners better understand whether instruments are
measuring social and emotional learning skills for all populations of students, instrument
developers could assess the substantive and fairness components of validity. Practitioners
should use caution when administering instruments that lack information on substantive
validity or fairness, since these instruments may not be appropriate for all students
being evaluated.

                                       Limitations

The first limitation of this resource is that only a small subset of instruments was reviewed.
The criteria for inclusion in this resource were more specific than in prior reviews of
instruments measuring social and emotional learning skills. For example, only instruments
measuring collaboration, perseverance, or self-regulated learning were reviewed. Further,
this resource excludes any instrument that had not been used with a secondary school
student population in the United States. These criteria were established because CVSD
was interested in identifying instruments that measure collaboration, perseverance, and
self-regulated learning in students in secondary school. Similarly, with a focus on iden­
tifying instruments that practitioners can use in their schools and districts, this resource
excludes any instrument that was not publicly available.

Second, the review captured only instruments and validation studies that appeared in
response to the search queries; other relevant instruments or validation studies may exist
that the queries did not identify. For example,
if a validation study did not include any of the search terms described in appendix A, the
psychometric information for that instrument does not appear in this resource.

Appendix A. Methodology

The review of instruments included six primary steps:
   • Described collaboration, perseverance, and self-regulated learning skills.
   • Identified terms related to collaboration, perseverance, and self-regulated learning.
   • Searched the literature for relevant instruments measuring collaboration, perse­
       verance, and self-regulated learning.
   • Determined whether instruments identified in the literature search were eligible
       for inclusion in this resource.
   • Reviewed the reliability and validity information available for eligible instruments.
   • Determined whether the reliability and validity information met conventionally
       accepted criteria.

Described collaboration, perseverance, and self-regulated learning skills

The Regional Educational Laboratory (REL) Northeast & Islands and Champlain Valley
School District (CVSD) collaborated to describe each of the social and emotional learning
skills (commonly referred to as constructs in the research literature) to be addressed in this
resource:
     • Collaboration: collaborating effectively and respectfully to enhance the learning
          environment.
     • Perseverance: persevering when challenged, dealing with failure in a positive way,
          seeking to improve one’s performance.
     • Self-regulated learning: taking initiative in and responsibility for learning.

These descriptions align with commonly used definitions in the research for collaboration
(Johnson & Johnson, 1994; Kuhn, 2015; Wang, MacCann, Zhuang, Liu, & Roberts, 2009);
perseverance (Duckworth & Quinn, 2009; Eccles, Wigfield, & Schiefele, 1998; Pintrich,
2003; Seifert, 2004; Shechtman et al., 2013); and self-regulated learning (Muis, Winne, &
Jamieson-Noel, 2007; Zimmerman, 2000).

Identified terms related to collaboration, perseverance, and self-regulated learning

The study team identified terms that are synonymous with or related to collaboration,
perseverance, and self-regulated learning. For collaboration, related terms included:
    • Collaborative competence.
    • Collaborative learning.
    • Collaborative problem-solving.
    • Cooperation.
    • Cooperative learning.
    • Teamwork.

For perseverance, related terms included:
    • Academic courage.
    • Academic motivation.
    • Conscientiousness.
    • Coping.
    • Delayed gratification.
    • Determination.
    • Grit.
    • Locus of control.
    • Motivation.
    • Persistence.
    • Resilience.
    • Self-control.
    • Self-discipline.
    • Self-management.
    • Tenacity.

For self-regulated learning, related terms included:
    • Achievement goals.
    • Metacognition.
    • Motivation.
    • Self-efficacy.
    • Task interest.

Searched the literature for relevant instruments

Next, both peer-reviewed academic literature and reports by practitioners and districts
were searched. The following were among the main sources searched:
    • Academic journal databases such as EBSCOhost (https://www.ebscohost.com) and
        the Education Resources Information Center (ERIC) (https://eric.ed.gov).
    • Institute of Education Sciences publications (https://ies.ed.gov).
    • National Science Foundation (https://www.nsf.gov) publications and databases of
        instruments (for example, through the STEM Learning and Research Center at
        Education Development Center, http://stelar.edc.org).
    • Compendia of instruments and literature reviews (Atkins-Burnett, Fernández,
        Akers, Jacobson, & Smither-Wulsin, 2012; Fredricks & McColskey, 2012; Fredricks
        et al., 2011; Kafka, 2016; National Center on Safe Supportive Learning Environ­
        ments, 2017; O’Conner, De Feyter, Carr, Luo, & Romm, 2017; Rosen, Glennie,
        Dalton, Lennon, & Bozick, 2010).
    • Other relevant databases such as the American Psychological Association
        PsycTESTS database (http://www.apa.org/pubs/databases/psyctests/) and the Rush
        University Neurobehavioral Center’s SELweb database (http://rnbc.org/research/
        selweb/).
    • Other relevant sources, including those from the following organizations:
        • The Collaborative for Academic, Social, and Emotional Learning at the Uni­
            versity of Illinois at Chicago (http://www.casel.org).
        • The National Academies of Sciences, Engineering, and Medicine (http://www.
            nationalacademies.org).
        • The Ecological Approach to Social Emotional Learning Laboratory at the
            Harvard Graduate School of Education (https://easel.gse.harvard.edu).
        • The Massachusetts Consortium for Innovative Education Assessment (http://
            www.mciea.org).
        • P21 Partnership for 21st Century Learning (http://www.p21.org).
        • The Character Lab (https://characterlab.org).

Searches were also conducted for performance assessments, self-report surveys, and other
instruments that seek to measure collaboration, perseverance, and self-regulated learning.
For example, search terms related to perseverance included:
    • [Assessment] AND [Student] AND [Perseverance].
    • [Instrument] AND [Student] AND [Perseverance].
    • [Performance-based] AND [Assessment] AND [Student] AND [Perseverance].
    • [Research] AND [Studies] AND [Student] AND [Perseverance].
    • [Survey] AND [Student] AND [Perseverance].

Secondary searches paired the same templates with each related term identified for
collaboration, perseverance, and self-regulated learning (a sketch that generates these
combinations programmatically follows the list). For example, a secondary search for
instruments measuring perseverance used the related term persistence:
    • [Assessment] AND [Student] AND [Persistence].
    • [Instrument] AND [Student] AND [Persistence].
    • [Performance-based] AND [Assessment] AND [Student] AND [Persistence].
    • [Research] AND [Studies] AND [Student] AND [Persistence].
    • [Survey] AND [Student] AND [Persistence].
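
The sketch below shows how such query lists can be expanded programmatically. The term
lists are abbreviated from those identified above, and the helper name build_queries is
invented for illustration; the report does not describe an automated query generator.

```python
from itertools import product

# Abbreviated, hypothetical term lists; the review used the full sets of
# related terms identified above for each skill.
TERMS = {
    "collaboration": ["Collaboration", "Cooperation", "Teamwork"],
    "perseverance": ["Perseverance", "Persistence", "Grit"],
    "self-regulated learning": ["Self-regulated learning", "Metacognition"],
}
TEMPLATES = [
    "[Assessment] AND [Student] AND [{term}]",
    "[Instrument] AND [Student] AND [{term}]",
    "[Performance-based] AND [Assessment] AND [Student] AND [{term}]",
    "[Research] AND [Studies] AND [Student] AND [{term}]",
    "[Survey] AND [Student] AND [{term}]",
]

def build_queries(terms_by_skill, templates):
    """Expand every (template, term) pair into a literal query string."""
    return {
        skill: [t.format(term=term) for t, term in product(templates, terms)]
        for skill, terms in terms_by_skill.items()
    }

for skill, queries in build_queries(TERMS, TEMPLATES).items():
    print(f"{skill}: {len(queries)} queries")
```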

Determined the eligibility of instruments identified in the literature search

A study team member used a structured protocol to determine the eligibility of
instruments identified in the literature search. A second study team member then
independently checked each instrument identified as eligible against the same protocol.
In cases of disagreement on the eligibility of an instrument, the two study team members
met to discuss and resolve the discrepancy.
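
The report does not state how agreement between the two reviewers was quantified. For
teams that want a numeric check before resolving discrepancies, the sketch below computes
Cohen's kappa for two raters' eligibility decisions; the data are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: summed products of each rater's marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(rater_a) | set(rater_b)
    )
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

rater_a = ["eligible", "eligible", "ineligible", "eligible", "ineligible"]
rater_b = ["eligible", "ineligible", "ineligible", "eligible", "ineligible"]
print(f"kappa = {cohen_kappa(rater_a, rater_b):.2f}")  # 0.62 for this data
```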

Instruments were deemed eligible if they (a minimal screening sketch follows this list):
    • Measured one of the three targeted social and emotional learning skills.
    • Were used with a population of secondary school students in the United States.
    • Were publicly available online at no or low cost.
    • Were published or had psychometric validation work completed between 1994 and
       2017.
    • Were not published as part of a doctoral dissertation.
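
As an illustration only, the sketch below codifies these criteria as a screening
function. The Instrument fields and the example record are hypothetical; the actual
review applied the criteria through the structured protocol described above.

```python
from dataclasses import dataclass
from typing import Optional

TARGET_SKILLS = {"collaboration", "perseverance", "self-regulated learning"}

@dataclass
class Instrument:
    # Hypothetical fields mirroring the eligibility criteria above.
    skill: str                       # targeted social and emotional learning skill
    used_with_us_secondary: bool     # used with U.S. secondary school students
    publicly_available: bool         # available online at no or low cost
    validation_year: Optional[int]   # publication or latest validation year
    from_dissertation: bool          # published as part of a doctoral dissertation

def is_eligible(inst: Instrument) -> bool:
    """Apply all five eligibility criteria; every one must hold."""
    return (
        inst.skill in TARGET_SKILLS
        and inst.used_with_us_secondary
        and inst.publicly_available
        and inst.validation_year is not None
        and 1994 <= inst.validation_year <= 2017
        and not inst.from_dissertation
    )

example = Instrument("perseverance", True, True, 2009, False)  # hypothetical record
print(is_eligible(example))  # True
```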

Reviewed the reliability and validity information available for eligible instruments

A second search procedure was carried out for each of the instruments that met the initial
screening criteria to identify studies that might provide information about the reliability
and validity properties for each instrument. Search terms included: [Name of the instru­
ment] AND [terms that included but were not limited to psychometrics, measurement,
reliability, and validity].
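
A minimal sketch of this second search step appears below. The instrument name is a
placeholder, and the term list condenses the open-ended set described above.

```python
PSYCHOMETRIC_TERMS = ["psychometrics", "measurement", "reliability", "validity"]

def validity_queries(instrument_name, terms=PSYCHOMETRIC_TERMS):
    """Pair an instrument's name with each psychometric search term."""
    return [f"[{instrument_name}] AND [{term}]" for term in terms]

# Hypothetical instrument name used only to show the query format.
for query in validity_queries("Example Perseverance Scale"):
    print(query)
```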

The reliability and validity information for eligible instruments was then categorized and
summarized to assess whether a measure was reliable and valid to use. Messick’s (1995)
construct validity framework was used in evaluating six components of validity for each
instrument: content, substantive, structural, external, generalizability, and consequential.
In addition, this resource presents information on fairness and reliability, as recommended
in Standards for Educational and Psychological Testing (American Educational Research
Association, American Psychological Association, and National Council on Measurement
in Education, 2014).
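
One simple way to organize this categorization step is sketched below: a per-instrument
record of which validity components have evidence, with the gaps listed. The component
names follow Messick's framework plus fairness and reliability; the evidence flags are
hypothetical, and the report itself presents this information in tables B1-B16 of
appendix B.

```python
# Messick's (1995) six validity components, plus fairness and reliability,
# as reviewed for each instrument in this resource.
COMPONENTS = [
    "content", "substantive", "structural", "external",
    "generalizability", "consequential", "fairness", "reliability",
]

# Hypothetical evidence record for a single instrument.
evidence = {component: False for component in COMPONENTS}
evidence.update({"content": True, "external": True, "reliability": True})

missing = [c for c, found in evidence.items() if not found]
print("components lacking evidence:", ", ".join(missing))
```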