Social Science and Evidence-based Everything: the case of education

Page created by Stacy Lang
 
CONTINUE READING
Educational Review, Vol. 54, No. 3, 2002

                                                                        Social Science and Evidence-based
                                                                        Everything: the case of education
                                                                        ANN OAKLEY, Social Science Research Unit, University of London Institute of
                                                                        Education, London, UK
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        ABSTRACT     Recent moves in academic and policy circles to strengthen the social
                                                                        science research evidence base have raised questions about the quality and status of
                                                                        educational research. They have suggested a need for systematic research synthesis,
                                                                        for greater accessibility of sound educational research evidence, and greater respect
                                                                        for the perspectives of the different stakeholders in the educational research process.
                                                                        This paper looks at the background to ‘the evidence movement’, and discusses a
                                                                        particular government-funded initiative designed to take forward the challenge of
                                                                        systematic reviews of educational research. It considers some of the ways in which
                                                                        this activity poses challenges for social science methodology.

                                                                        A Brief History of Evidence-based Everything
                                                                        There is a long history of people’s dissatisfaction with expert and non-systematic
                                                                        approaches to evidence, mostly in medicine but also in other areas (Oakley, 1998).
                                                                        Early in the twentieth century the science of research synthesis as a remedy for this
                                                                        dissatisfaction began to emerge with Karl Pearson’s classic study in 1904 on the
                                                                        effectiveness of typhoid vaccine (Pearson, 1904), and with other work in education,
                                                                        agriculture and physics. Many of the tools of research synthesis were developed by
                                                                        American social scientists from the 1960s on. Notable in this effort were the
                                                                        educational psychologist Gene Glass and his wife, psychotherapist Mary Lee Smith,
                                                                        who computed 833 effect sizes derived from 375 studies obtained from over 1000
                                                                        citations in order to Ž nd out whether or not psychotherapy worked (Smith & Glass,
                                                                        1977). Their conclusion that psychotherapy does work made the psychotherapists
                                                                        happier than their colleagues in the social work Ž eld, who around the same time had
                                                                        to deal with a discouraging review about casework which suggested that troubled
                                                                        clients left to their own devices did no worse, and sometimes distinctly better, than
                                                                        those who were ‘helped’ by social workers (Fischer, 1973).

                                                                        Current Challenges
                                                                        Although research synthesis began in social science, today’s evidence movement in
                                                                        education and social policy more generally is very much driven by the example of
                                                                        what can be achieved with research synthesis in health care. The Cochrane Collabo-
                                                                        ration (http://www.cochrane.org), an international network of health care researchers,
                                                                        has established 50 different collaborative review groups which undertake pro-
                                                                        ISSN 0013-1911 print; 1465-3397 online/02/030277-10 Ó   2002 Educationa l Review
                                                                        DOI:10.1080/0013191022000016329
278                                                                          A. Oakley

                                                                        grammes of systematic reviews, contributing the results to an electronic database of
                                                                        research evidence, the Cochrane Library. This currently holds 1297 completed
                                                                        reviews, with another 1013 in protocol stage.
                                                                           The example of the Cochrane Collaboration has made professionals and policy-
                                                                        makers in other disciplines think hard about the parallels and differences between
                                                                        health care and other forms of professional intervention in people’s lives. As the
                                                                        President of the Royal Statistical Society said in 1996: ‘We are, through the media,
                                                                        as ordinary citizens, confronted daily with controversy and debate across a whole
                                                                        spectrum of public policy issues. But typically we have no access to any form of a
                                                                        systematic ‘evidence base’— and therefore no means of participating in the debate
                                                                        in a mature and informed manner’ (Smith, 1996, pp. 369–70).
                                                                           These criticisms have been extended to education in the form of a number of
                                                                        arguments about the unscientiŽ c, non-cumulative, uncollaborative and inaccessible
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        nature of much educational research (Hargreaves, 1996; Hillage et al., 1998;
                                                                        McIntyre & McIntyre, 1999; OECD, 2001; Tooley & Darby, 1998). There have been
                                                                        many responses to these criticisms. Is the objective of basing policy and practice on
                                                                        evidence simply a government plot? Is it an attempt to restrict academic freedom by
                                                                        imposing standard criteria for research quality—much like the attempts to standard-
                                                                        ise tests for educational attainment and the quality of teachers’ work? Is it a wildly
                                                                        mistaken move to impose the ‘positivism’ of the medical model on a social world
                                                                        that cannot possibly be expected to meet the same requirements? (Atkinson, 2000;
                                                                        Elliott, 2001; Hammersley, 2001). Of course, each of these arguments is analytically
                                                                        distinct, but their coalescence at a historical moment when the notion of ‘evidence’
                                                                        is coming at us from all quarters suggests a rhetorical linkage between them.
                                                                           One way to consider these criticisms is to take a closer look at some of the work
                                                                        that is actually going on under the general heading of ‘evidence-based/informed
                                                                        education’.

                                                                        The EPPI Centre and Research Synthesis
                                                                        In 2000 the DfES funded a programme of systematic reviews of educational research
                                                                        supported by the Evidence for Policy and Practice Information and Coordinating
                                                                        (EPPI) Centre at the Social Science Research Unit, University of London Institute of
                                                                        Education. The DfES initiative at the EPPI Centre is currently funded until 2005 and
                                                                        covers schooling from 0–19. Its aims are: to set up and coordinate groups of
                                                                        researchers and others who want to do systematic reviews in education; to provide
                                                                        an infrastructure of training and support in how to do these; to develop standardised
                                                                        systems for collecting and classifying bibliographic references and for undertaking,
                                                                        holding and updating reviews; to coordinate methodological work; and to develop
                                                                        ways of working with different groups of users. The vision at the end of this work
                                                                        is REEL—The Research Evidence in Education Library. This is an electronic
                                                                        database of completed reviews of educational research evidence— and the hope is
                                                                        that it will be available in every school and educational institution in the country, and
                                                                        that it will be user-friendly and relevant to the needs and interests of a large range
                                                                        of stakeholders in the education process.
                                                                           The nearest parallel to REEL is the Cochrane Library. However, we in education
                                                                        have the opportunity to beneŽ t from the experiences of the Cochrane Collaboration
                                                                        in sponsoring systematic reviews of health care. One example of what we hope will
Evidence-based Everything                                                            279

                                                                        not happen in educational research synthesis can be illustrated by a recent enquiry
                                                                        about what is known about the best way to treat the epidemic of head lice in
                                                                        schoolchildren. Most Internet search engines will locate a large number of different
                                                                        sites relating to this topic, and many of these give contradictory advice. The
                                                                        Cochrane Library does have a recently updated review of interventions to treat head
                                                                        lice (Dodd, 2002), and this is superior to most Internet offerings in providing a
                                                                        systematic account of where the information included in it has been taken from, and
                                                                        how it has been quality assessed. The head lice treatment review is 45 pages long,
                                                                        examines 70 studies, dismisses 67 of them as being of unsound quality and makes
                                                                        detailed recommendations for future research needs. But the review is full of
                                                                        technical language and, having read the review as a user of health care, one is not
                                                                        much nearer knowing what one ought to do to treat a child with head lice than one
                                                                        was at the beginning.
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                           If the notion of ‘evidence’ is to mean anything other than the intellectual property
                                                                        of elite groups, the accessibility of both the process and the results of research
                                                                        synthesis to a range of users must be an integral value. The EPPI Centre work is
                                                                        based on the following principles: that different user groups need access to high
                                                                        quality review of research; that collaborative developments are needed in systematic
                                                                        review methodology; and that such moves towards a Ž rmer evidence base for social
                                                                        science and policy will get nowhere unless they follow principles of openness and
                                                                        equal participation of all stakeholders. Like all good things, this sounds wonderful.
                                                                        It is unlikely to be easy.
                                                                           During our Ž rst two years, ten groups of researchers and practitioners have been
                                                                        through a formal registration system to become review groups (RGs). The groups are
                                                                        in the following areas: English teaching; assessment and learning; school leadership;
                                                                        gender and education; post-compulsory education; inclusive education; early years;
                                                                        thinking skills; modern languages; and the impact of continuous professional devel-
                                                                        opment on classroom teaching and learning. The methods for facilitating the review
                                                                        work of these groups were piloted in a systematic review of strategies to support
                                                                        pupils with emotional and behavioural difŽ culties in mainstream primary classrooms
                                                                        conducted by two researchers at the Institute of Education and the National Foun-
                                                                        dation for Educational Research (Evans & BeneŽ eld, 2001). Two years into the
                                                                        initiative and members of the EPPI education team are beginning to re ect on their
                                                                        experiences in carrying out the Ž rst reviews (see Torgerson et al., in press, for an
                                                                        example). These re ections identify both successes and challenges, and will form an
                                                                        important learning experience for future research synthesis in education. They point
                                                                        to particular gaps in the methodology of research synthesis, among which the lack
                                                                        of agreed quality criteria for establishing the validity and reliability of ‘qualitative’
                                                                        research is probably the most critical. Although there has been some progress in this
                                                                        direction, for example in the Ž eld of health promotion reviews (Harden et al., 2001;
                                                                        Shepherd et al., 2001), it is as yet unclear how this is likely to translate into the
                                                                        related Ž eld of educational inquiry.
                                                                           The registration process for RGs includes peer refereeing of plans and protocols.
                                                                        Each review undertaken by a review group requires a detailed review protocol which
                                                                        deŽ nes the review question, inclusion and exclusion criteria and strategies for
                                                                        searching the literature. All protocols are placed on the EPPI Centre website for open
                                                                        comment. Most reviews are done in two stages: a mapping stage, in which relevant
                                                                        literature is captured and systematically keyworded to provide a descriptive account
                                                                        of the research effort in that particular area; and an in-depth review stage, in which
280                                                                         A. Oakley

                                                                        a subset of the literature found is examined and interrogated in more detail and data
                                                                        extracted from primary studies. An important feature of the reviewing software
                                                                        developed by the EPPI Centre is that it allows data to be entered on a range of study
                                                                        designs. These data are then analysed together for individual reviews, but they can
                                                                        be combined differently to answer other review questions; reviews can also be
                                                                        updated by adding data from new primary studies.
                                                                           The function of the groups is to produce systematic reviews which are part of the
                                                                        REEL electronic database of educational research reviews. The RGs decide the
                                                                        questions to be tackled in the reviews, and what kinds of studies they wish to
                                                                        include. The review process follows a clear set of stages. It begins with setting the
                                                                        review question and developing the rest of the protocol, continues through searching
                                                                        for studies, working out what literature to include at title and abstract stage,
                                                                        keywording the studies found and entering them into a bibliographic database,
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        deciding which studies to look at in depth, extracting data from these, putting the
                                                                        data into a synthesis database, undertaking some quality assessment to identify
                                                                        reliable studies, and Ž nally synthesising the evidence and writing the review report.

                                                                        Systematic Reviews
                                                                        The above list includes accepted stages for a systematic review (Davies, 2000). A
                                                                        recent paper by Mark Petticrew (2001) at the Social and Public Health Sciences Unit
                                                                        in Glasgow listed eight myths about systematic reviews: that they are just like
                                                                        ordinary reviews, only bigger; that they involve statistical synthesis; that they have
                                                                        to be done by experts; that they can be done without proper information and library
                                                                        support; that they are a substitute for good quality primary research; that they are not
                                                                        relevant to the real world; they require a biomedical model with easily quantiŽ able
                                                                        outcomes; and only include randomised controlled trials.
                                                                           However, and contrary to its mythical representation, a systematic review is
                                                                        simply a way of accessing research knowledge; in this sense it is a piece of research
                                                                        in its own right. Systematic reviews synthesise the results of primary research, use
                                                                        explicit and transparent methods, and are accountable, replicable and updateable.
                                                                        Although the history of both medicine and social Ž elds such as education are full of
                                                                        sporadic attempts to develop systematic methods for Ž nding out what we know, most
                                                                        research reviews traditionally use either an expert model—in which experts are asked
                                                                        to say what they think—or a non-systematic approach (or both). Most traditional
                                                                        literature reviews are discursive rampages through selected bits of literature the
                                                                        researcher happens to know about or can easily reach on his or her bookshelves at
                                                                        the time. Neither in this approach, or in the expert model are the conclusions of a
                                                                        review based on anything explicit. It is very hard to know what authors are drawing
                                                                        on, and why they conclude what they do (Jackson, 1980).
                                                                           Table I is from a programme of health education review work we have been doing
                                                                        for a number of years. We found six reviews of older people and accident prevention
                                                                        which included a total of 137 studies. But only 33 studies were common to at least
                                                                        two reviews, only two were common to all six reviews, and only one study was
                                                                        treated consistently in all six reviews. The second example, shown in Table II, is
                                                                        from studies of anti-smoking education for young people: two so-called systematic
                                                                        reviews carried out at the same time; 27 studies included in both reviews, only three
                                                                        common to both. The last Ž gure in Table II—70—is the number of studies we knew
Evidence-based Everything                                                         281

                                                                                                TABLE I. Six reviews of older people and
                                                                                                           accident preventio n

                                                                                                Total studies included                 137
                                                                                                Common to at least two reviews          33
                                                                                                Common to all six reviews                2
                                                                                                Treated consistentl y in all reviews     1

                                                                                                Oliver et al., 1999.

                                                                                               TABLE II. Two reviews of smoking preventio n
                                                                                                       programmes for young people

                                                                                               Total studies included                  27
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                                               Common to both reviews                  3
                                                                                               Available for review at the time        70

                                                                                               Oakley & Fullerton, 1995

                                                                        from our systematically keyworded bibliographic database were actually available to
                                                                        be included in these reviews at the time they were done.
                                                                           The consequence of this situation is that reviewers reach very different conclu-
                                                                        sions about the nature of research evidence. In the above two examples, looking at
                                                                        what studies had been included and how they were treated explained the different
                                                                        conclusions of the different reviews about the best ways to stop older people falling
                                                                        over and persuading young people not to smoke.

                                                                        Challenges for Social Science of Evidence-based Everything
                                                                        There are at least four issues ahead of social science in accommodating itself to the
                                                                        challenge of the evidence movement. These are to: revisit critically the question of
                                                                        the differences between medicine and other professions; Ž nd ways of reducing bias
                                                                        in policy and practice evaluation; develop methods for assessing the trustworthiness
                                                                        of qualitative research; and soften the polemic of ‘quantitative’ versus ‘qualitative’
                                                                        methods.

                                                                        (i) Medicine, etc.
                                                                        In 1999, Diane Ravitch, a Research Professor in Education,  ew back to New York
                                                                        from a trip to California and the following day developed pains in her left leg
                                                                        together with a strange shortness of breath. Her neighbour, who was a radiologist,
                                                                        recognised the signs of pulmonary embolism, and sent her to hospital. There Ravitch
                                                                        listened to the doctors discussing her condition, and she began to fantasise about
                                                                        what would happen if the doctors who were gathered round her bed discussing her
                                                                        treatment were to be replaced with education experts. The Ž rst thing that would
                                                                        happen would be the disappearance of the doctors’ certainty that she had a problem.
                                                                        The education experts would argue about whether anything was actually wrong with
                                                                        her at all—after all, illness is a socially constructed experience. Some would point
                                                                        out that attributing problems to people is simply to blame the victim. In Ravitch’s
                                                                        fantasy (or nightmare) the hospital administrator walks in at this point and announces
282                                                                                 A. Oakley

                                                                                     TABLE III. Crime prevention research reviews: research design and
                                                                                                           effectivenes s claimed

                                                                                                                             ‘Successful’ ‘Unsuccessful ’
                                                                                                                             interventio n interventio n

                                                                                     Logan 1972 (N 5 100 studies).
                                                                                     Random control groups                      27%            73%
                                                                                     Non-rando m control groups                 50%            50%
                                                                                     No control groups                          69%            31%
                                                                                     Wright & Dixon 1977 (N 5 96 studies).
                                                                                     Random/matched control groups              25%            75%
                                                                                     Other control groups                       51%            49%
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        that the hospital has just received a large grant to treat people like her. Immediately
                                                                        the experts decide that her symptoms are real after all, but now they are unable to
                                                                        agree on what to do. Each has their favourite cure, and each cites different sets of
                                                                        research ‘Ž ndings’ to support their case. Ravitch, who did get better, because there
                                                                        is an established and effective treatment for her complaint, was left wondering why
                                                                        medicine and education should behave so differently (Ravitch, 1999).
                                                                           Of course, it is part of the whole process of professionalisation that we should
                                                                        defend our territory and our professional claims to expertise, and this is an endemic
                                                                        reason why educationalists, social workers, criminologists and so on behave as
                                                                        though what they do is unique, and not open to question. But is it really?

                                                                        (ii) Reducing Bias
                                                                        People doing research synthesis in both health care and other areas have been
                                                                        impressed by the way in which well-designed experimental studies versus other
                                                                        kinds of studies seem to behave differently when it comes to research Ž ndings. Table
                                                                        III shows the relationship between research design and successful versus unsuccess-
                                                                        ful interventions in two reviews of crime prevention research. Experimental evalua-
                                                                        tions using random or matched control groups were considerably less likely to
                                                                        demonstrate effectiveness than those using other designs.
                                                                           There are many examples of practices which were considered helpful on the basis
                                                                        of observational evidence, but which rigorous evaluations showed to be harmful. A
                                                                        more recent example from the criminology Ž eld is a review of prison tour pro-
                                                                        grammes for preventing juvenile delinquency. Known as ‘scared straight’, these
                                                                        initiatives employed the principle that seeing people in prison would deter young
                                                                        people from committing crimes. They have enjoyed a great deal of popularity in the
                                                                        USA, despite only ‘anecdotal’ evidence of their success. A systematic review by
                                                                        Anthony Petrosino and colleagues (Petrosino et al., 2000) found that ‘scared straight’
                                                                        programmes actually have the opposite effect from the one intended, making crime
                                                                        more likely.
                                                                           These Ž ndings from research synthesis are worrying for Ž elds such as education
                                                                        where so much of the ‘evidence’ is derived from small-scale qualitative research,
                                                                        depends heavily on practitioner judgements about the right thing to do, and/or is
                                                                        taken from poorly evaluated interventions.
Evidence-based Everything                                                              283

                                                                                               TABLE IV. ‘Quantitative ’ criteria proposed by four
                                                                                               groups* for judging the trustworthines s of ‘qualita-
                                                                                                                  tive’ studies

                                                                                  Total number of criteria                                     46
                                                                                  Used in:
                                                                                                             one review                        28
                                                                                                             two reviews                       10
                                                                                                             three reviews                     6
                                                                                                             all four reviews                  2

                                                                                  *Boulton et al., 1996; Cobb & Hagemaster, 1987; Mays & Pope, 1995; Medical
                                                                                  Sociology Group, 1996.
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        (iii) Critically Appraising Different Study Designs
                                                                        The third challenge, referred to earlier, concerns how to appraise different study
                                                                        designs. Whatever kinds of study we work with in research synthesis, it is necessary
                                                                        to make a decision about whether these have anything reliable to say or not. This
                                                                        ‘quality’ screening is needed for all types of research design included in a review.
                                                                        But, while there are lots of different scales for scoring the quality of experimental
                                                                        research, with some degree of consensus, for example on the importance on having
                                                                        socially comparable intervention and control groups, sorting out trustworthy from
                                                                        untrustworthy qualitative studies is a much more murky business.
                                                                           Table IV shows the ‘quantitative’ criteria for assessing quality proposed by four
                                                                        research groups. The criteria included items such as ‘study purpose clearly stated’,
                                                                        ‘adequate and appropriate Ž nal sample’ and ‘careful records of data’, and added up
                                                                        to 46 in total, of which 28 occurred in only one of the lists, ten in two, and six in
                                                                        three, with only two being common to all four lists— a clear description of the
                                                                        sample and how it was recruited, and an adequate description of how the Ž ndings
                                                                        were derived from the data. The alternative is a list, shown in Table V, of
                                                                        ‘qualitative’ criteria for assessing the trustworthiness of qualitative studies proposed
                                                                        by another four groups. There were 28 different criteria in these four lists, including
                                                                        items such as ‘persistent observation’, ‘understanding data within holistic contexts’
                                                                        and ‘the privileging of subjective meaning’. Thirteen of the criteria appeared once
                                                                        and 11 twice; one—the collection of ‘thick’ data—appeared in three lists, but none
                                                                        were common to all four.

                                                                                  TABLE V. ‘Qualitative ’ criteria proposed by four groups* for judging the
                                                                                                     trustworthines s of ‘qualitative ’ studies

                                                                                  Total number of criteria                                     28
                                                                                  Used in:
                                                                                                             one review                        13
                                                                                                             two reviews                       11
                                                                                                             three reviews                     1
                                                                                                             all four reviews                  0

                                                                                  *Leininger , 1994; Lincoln & Guba, 1985; Muecke, 1994; Popay et al., 1998.
284                                                                                          A. Oakley

                                                                        (iv) Dissolving the Methods War
                                                                        A main danger ahead is that large areas of research, traditionally important in
                                                                        education, escape the evidence net either because no one can reach any consensus
                                                                        about how to sort out the reliable from the unreliable, or (which might be worse)
                                                                        because a new orthodoxy sets in according to which qualitative research is simply a
                                                                        world apart—nothing to do with evidence at all. The last few years have seen a rash
                                                                        of papers on this theme. One recent one by a ‘qualitative’ sociologist (Barbour, 2001)
                                                                        argued that none of the Ž ve ‘technical Ž xes’ most cited in the methods literature as
                                                                        rescuing qualitative research from its rigourlessness—purposive sampling, grounded
                                                                        theory, multiple coding, triangulation and respondent validation—are anything more
                                                                        than spurious ‘bumper stickers’ designed to boost academic credibility.
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        Conclusion
                                                                        The DfES-funded initiative at the EPPI Centre referred to in this paper is one of three
                                                                        listed at the beginning of the Strategy Consultation Paper produced by the National
                                                                        Educational Research Forum in 2000 as addressing the requirement for a coherent
                                                                        approach to existing educational research (NERF, 2000). The other two are the
                                                                        ESRC initiative on evidence-informed policy, coordinated by Ken Young at Queen
                                                                        Mary and WestŽ eld College in London (http://www.evidencenetwork.org), and the
                                                                        Campbell Collaboration (http://campbell.gse. upenn.edu), an international network of
                                                                        people doing systematic reviews in education, social work and criminal justice.
                                                                        These three initiatives have common themes: the need to Ž nd out what we already
                                                                        know; the desirability of holding this knowledge in an easily accessible and
                                                                        updateable format; and the requirement that research should be more policy-relevant
                                                                        than it has sometimes been in the past.
                                                                           Evidence-based everything, which includes evidence-informed education, repre-
                                                                        sents a paradigm shift in thinking about the relationship between academic research
                                                                        and real world policy and practice. What exactly it will achieve remains to be seen,
                                                                        but it is unlikely to be a short-lived fashion. It poses particular challenges both for
                                                                        educational research and for social science approaches to knowledge synthesis and
                                                                        management more generally. The early experience of the EPPI Centre initiative for
                                                                        supporting research synthesis work in education suggests that it is possible to
                                                                        develop collaborative, democratic and systematic structures for reviewing research
                                                                        evidence, which will help to open up the traditionally rather esoteric world of
                                                                        educational research to public scrutiny.

                                                                        Correspondence: Ann Oakley, Social Science Research Unit, University of London
                                                                        Institute of Education, 18 Woburn Square, London WC1H ONR, UK; E-mail:
                                                                        a.oakley@ioe.ac.uk

                                                                        REFERENCES
                                                                        ATKINSON, E. (2000) In defence of ideas, or why ‘what works’ is not enough, British Journal of
                                                                          Sociology of Education, 21(3), pp. 317–330.
                                                                        BARBOUR, R. S. (2001) Checklists for improving rigour in qualitativ e research : a case of the tail wagging
                                                                          the dog?, British Medical Journal, 322, pp. 115–117.
Evidence-based Everything                                                                            285

                                                                        COBB, A. K., HAGEMASTER, J. N. (1987) Ten criteria for evaluatin g qualitative research proposals ,
                                                                          Journal of Nursing Education, 26(4), pp. 138–43.
                                                                        DAVIES , P. (2000) The relevanc e of systematic reviews to educationa l policy and practice, Oxford Review
                                                                          of Education, 26(3&4), pp. 365–78.
                                                                        DODD , C. S. (2002) Intervention s for treating headlice (Cochrane Review), in: The Cochrane Library,
                                                                          Issue 1 (Oxford, Update Software).
                                                                        ELLIOTT, J. (2001) Making evidence-base d practice educational , British Educationa l Research Journal,
                                                                          27(5), pp. 555–574.
                                                                        EVANS, J. & BENEFIELD, P. (2001) Systematic reviews of educationa l research : does the medical model
                                                                          Ž t?, British Educational Research Journal, 27(5), pp. 527–41.
                                                                        FISCHER, J. (1973) Is casework effective ? A review, Social Work, January, pp. 5–20.
                                                                        HAMMERSLEY, M. (2001) On systematic reviews of research literatures: a ‘narrative’ response to Evans
                                                                          and BeneŽ eld, British Educationa l Research Journal, 27(5), pp. 543–54.
                                                                        HARDEN, A., OAKLEY, A. & OLIVER , S. (2001) Peer-delivere d health promotion for young people: a
                                                                          systematic review of different study designs, Health Education Journal, 60(4), pp. 339–53.
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013

                                                                        HARGREAVES, D. (1996) Teaching as a research-base d profession: possibilitie s and prospects, Teacher
                                                                          Training Agency Annual Lecture (London, TTA).
                                                                        HILLAGE , J., PEARSON, R., ANDERSON, A., TAMKIN, P. (1998) Excellence in Research in Schools (London,
                                                                          Department for Education and Employment/Institute of Employment Studies).
                                                                        JACKSON, G. B. (1980) Methods for integrativ e reviews, Review of Educationa l Research, 50, pp. 438–
                                                                          60.
                                                                        LEININGER , M. (1994) Evaluation criteria and critique of qualitativ e research studies, in: J. M. MORSE
                                                                          (Ed.) Critical Issues in Qualitative Research Methods (Thousand Oaks, CA, Sage Publications) .
                                                                        LINCOLN, Y. S. & GUBA, E. G. (1985) Naturalistic Inquiry (Beverly Hills, CA, Sage Publications).
                                                                        LOGAN, C. H. (1972) Evaluation research in crime and delinquency : a reappraisal , Journal of Criminal
                                                                          Law, Criminology and Police Science, 63(3), pp. 378–87.
                                                                        MCINTYRE, D. & MCINTYRE, A. (1999) Capacity for Research into Teaching and Learning (Final Report
                                                                          to ESRC Teaching and Learning Programme, Cambridge, School of Education, University of
                                                                          Cambridge).
                                                                        MAYS, N. & POPE, C. (1995) Rigour and qualitativ e research, British Medical Journal, 311, pp. 109–112.
                                                                        MEDICAL SOCIOLOGY GROUP (1996) Criteria for the evaluation of qualitative research papers, Medical
                                                                          Sociology News, 22(1), pp. 69–71.
                                                                        MUECKE, M. A. (1994) On the evaluation of ethnographies , in: J. M. MORSE (Ed.) Critical Issues in
                                                                          Qualitative Research Methods (Thousand Oaks, CA, Sage Publications) .
                                                                        NATIONAL EDUCATIONAL RESEARCH FORUM (2000) Research and Development in Education: a National
                                                                          Strategy Consultation Paper (London, NERF).
                                                                        OAKLEY, A. (1998) Experimentatio n in social science: the case of health promotion, Social Sciences in
                                                                          Health, 4(2), pp. 73–89.
                                                                        OAKLEY, A. & FULLERTON, D. (1995) A Systematic Review of Smoking Prevention Programmes for
                                                                          Young People (London, EPPI Centre).
                                                                        OECD (2000) Knowledge Management in the Learning Society (Centre for Educational Research and
                                                                          Innovation).
                                                                        OLIVER , S., PEERSMAN, G., HARDEN, A. & OAKLEY, A. (1999) Discrepancies in Ž ndings from effective -
                                                                          ness reviews: the case of health promotion for older people in accident and injury prevention , Health
                                                                          Education Journal, 58, pp. 66–77.
                                                                        PEARSON, K. (1904) Report of certain enteric fever inoculatio n statistics , British Medical Journal, 1,
                                                                          pp. 1243–1246.
                                                                        PETROSINO, A., TURPIN-PETROSINO, C., FINCKENAUER , J. O. (2000) Well-meaning programs can have
                                                                          harmful effects! Lessons from experiment s of programs such as Scared Straight, Crime and Delin-
                                                                          quency, 46(3), pp 354–379.
                                                                        PETTICREW, M. (2001) Systematic reviews from astronomy to zoology: myths and misconceptions ,
                                                                          British Medical Journal, 322, pp. 98–101.
                                                                        POPAY, J., ROGERS, A. & WILLIAMS , G. (1998) Rationale and standards for the systematic review of
                                                                          qualitative literature in health services research , Qualitative Health Research, 8(3), pp. 341–351.
                                                                        RAVITCH, D. (1999) Physicians leave education researcher s for dead, Sydney Morning Herald, 22
                                                                          February.
                                                                        SHEPHERD, J., GARCIA, J., OLIVER S., HARDEN, A., REES, R., BRUNTON, G. & OAKLEY, A. (2001) Barriers
                                                                          to, and Facilitators of, the Health of Young People: a systematic review of evidence on young people’s
286                                                                                       A. Oakley

                                                                          views and on intervention s in mental health, physical activity and healthy eating. Volume 2, Complete
                                                                          Report. (London, EPPI Centre).
                                                                        SMITH, A. F. M. (1996) Mad cows and ecstasy: chance and choice in an evidence-base d society, Journal
                                                                          of the Royal Statistica l Society, 159(3), pp. 367–383.
                                                                        SMITH, M. L. & GLASS, G. V. (1977) Meta-analysi s of psychotherap y outcome studies, American
                                                                          Psychologist, 32, pp. 752–760.
                                                                        TOOLEY, J. & DARBY, D. (1998) Educationa l Research—a Critique (London, OfŽ ce for Standards in
                                                                          Education).
                                                                        TORGERSON, C., ROBERTS, B., THOMAS, J., DYSON, A. & ELBOURNE, D. (in press) Developing protocol s
                                                                          for systematic reviews in education: early experience s from EPPI Centre review groups, Evaluation
                                                                          and Research in Education.
                                                                        WRIGHT, W. E., DIXON , M. C. (1977) Community preventio n and treatment of juvenile delinquency ,
                                                                          Journal of Research in Crime and Delinquency, 35, p. 67.
Downloaded by [Vrije Universiteit Amsterdam] at 03:40 27 January 2013
You can also read