Drug Effectiveness Review Project - Systematic Review Methods and Procedures

 
Drug Effectiveness Review Project

Systematic Review Methods and Procedures

                                              Revised January 2011

Principal Investigator: Marian McDonagh, PharmD
Oregon Evidence-based Practice Center
Oregon Health & Science University
Mark Helfand, MD, MPH, Director
Copyright © 2011 by Oregon Health & Science University
Portland, Oregon 97239. All rights reserved.
TABLE OF CONTENTS
Introduction ................................................................................................................................. 3
Review Methods .......................................................................................................................... 3
   Conflict of Interest Policy .......................................................................................................................... 3
   Drug Effectiveness Review Project Topic Selection Process................................................................... 3
   Formulating Key Questions ...................................................................................................................... 4
   The Clinical Advisory Group ..................................................................................................................... 4
   Searching the Literature and Other Sources of Data ............................................................................... 5
      MEDLINE and other database searches .............................................................................................. 5
      Dossier solicitation................................................................................................................................ 6
      Web resources ..................................................................................................................................... 6
   Study Selection and Inclusion................................................................................................................... 7
      Application of study design criteria ....................................................................................................... 7
      Cut-off date for new drug inclusion....................................................................................................... 8
      Inclusion of active-control and placebo-controlled trials....................................................................... 8
         Pooled analyses ............................................................................................................................... 9
         Systematic reviews ........................................................................................................................ 10
         Single-arm studies: Cohort or open-label extension of a trial ........................................................ 10
         Unpublished studies or data .......................................................................................................... 11
      Process for determining study eligibility ............................................................................................. 11
   Quality Assessment of Individual Studies ............................................................................................... 12
   Systematic Reviews ................................................................................................................................ 14
   Data Synthesis ........................................................................................................................................ 16
   Applicability ............................................................................................................................................. 16
   Grading the Strength of the Overall Body of Evidence ........................................................................... 17
   Summary Table....................................................................................................................................... 17
   Peer Review and Public Comment ......................................................................................................... 18
      Peer review......................................................................................................................................... 18
      Public comment .................................................................................................................................. 18
   Updating Reports .................................................................................................................................... 18
   Single Drug Addendum to Reports ......................................................................................................... 19
   Outline of a Typical Drug Effectiveness Review Project Report............................................................. 19
References................................................................................................................................. 20

Tables
Table 1. Drug Effectiveness Review Project guidelines to assess quality of trials ..................................... 13
Table 2. Strength of evidence grades and definitions ................................................................................. 17

Drug Effectiveness Review Project                                                                                                                 January 2011
Systematic Review Methods                                                                                                                               2 of 21
Introduction
The methodology used by the Evidence-based Practice Centers in producing comparative
systematic reviews for the Drug Effectiveness Review Project is described here. The methods
follow the principles of “best evidence”, focusing on randomized controlled trials with direct
comparisons and health outcomes wherever possible. The methods we use evolve as the
international methods for evidence review evolve, incorporating newly developed methods, as
appropriate, to our goal of producing high quality systematic reviews that meet the needs of the
Participating Organizations of the Drug Effectiveness Review Project (see “About DERP” for
more information on the Participating Organizations that govern DERP).

Review Methods

Conflict of Interest Policy

Drug Effectiveness Review Project investigators and staff comply with a policy on conflicts of
interest whereby there is a formal, written declaration that there are no financial interests in any
pharmaceutical company for the duration of the time the person is working on Drug
Effectiveness Review Project projects. Prior to initiating work, all investigators and staff sign a
form indicating they have no conflicts of interest. The assurance of an absence of conflicts of
interest related to financial interests in pharmaceutical companies is declared annually for any
investigator or staff member continuing to work with the Drug Effectiveness Review Project.
        For clinicians invited to participate in a Clinical Advisory Group, the Center for
Evidence-based Policy obtains declarations of conflicts of interest. The policy on these conflicts
is discussed in the section on Clinical Advisory Groups.

Drug Effectiveness Review Project Topic Selection Process

When new topics are considered, the Drug Effectiveness Review Project Participating
Organizations follow an explicit selection process over a 3-month period. This process ensures
that all organizations participate equally and that topics selected are relevant for the majority of
participants. This process is undertaken at various points throughout the 3-year Drug
Effectiveness Review Project contract cycle depending on the needs of the Participating
Organizations and funds available. When new topics are considered, the Center for Evidence-
based Policy solicits topics from each Participating Organization. The initial list is circulated
among Participating Organizations and if there are a large number of potential topics (e.g. 10 or
more) a vote is taken to narrow the list down to approximately 5 to receive further consideration.
The number of topics chosen for additional work depends on the number of new reports to be
initiated, but in general is not more than 5. The original topic submissions include the general
scope (drugs and populations) and the reasoning behind the proposed topic. After reviewing the
list of topics proposed, the Participating Organizations discuss the pros and cons of each
potential topic prior to having the Center for Evidence-based Policy proceed with the production
of briefing papers.
         Briefing papers include original participant submissions, pros and cons, and an overview
of available evidence completed by the Oregon Evidence-based Practice Center. For each
proposed topic the Oregon Evidence-based Practice Center conducts a search of MEDLINE
using a search strategy designed specifically to identify systematic reviews. The Oregon
Evidence-based Practice Center also searches the Websites of the Agency for Healthcare Research
and Quality, Canadian Agency for Drugs and Technologies in Health, the Cochrane
Collaboration, Effective Healthcare, National Coordinating Center for Health Technology
Assessment, National Institute for Clinical Excellence, and the National Health Service Center
for Reviews and Dissemination to identify high quality systematic reviews relevant to the
proposed topics. Additionally, a MEDLINE search for randomized controlled trials pertaining to
the new topic is included to estimate the proposed topic size.

Formulating Key Questions

Based on the discussion held during topic selection, Key Questions are formulated and serve to
define the scope of a Drug Effectiveness Review Project report. In general, the Key Questions
follow this template:

     1. What is the comparative effectiveness of [included drugs] for treatment of [condition]
        in [population]?
     2. What are the comparative harms of [included drugs] for treatment of [condition] in
        [population]?
     3. Does the comparative effectiveness or harms of [included drugs] vary in patient
        subgroups defined by demographics (age, racial groups, gender, etc.), socioeconomic
        status, use of other medications, or presence of comorbidities?

        The questions are modified to best suit the particular review and can include specific
outcomes of focus, such as mortality or symptom relief. Additional or sub-questions can be used
when they add important nuance. However, the study inclusion criteria are intended to provide
the detailed information on specific drugs and outcome measures included.
        Draft Key Questions are brought to the Participating Organization group for discussion
and comment. A second draft of the Key Questions is then formulated and again discussed with
the Participating Organizations. Clinical experts, identified by the Participating Organizations,
are consulted via teleconference to provide assistance in refining the Key Questions. Following
modifications, this set of draft Key Questions is posted to the Drug Effectiveness Review Project
Website for public comment. Public comments and Oregon Evidence-based Practice Center
responses are documented in a spreadsheet and are discussed with the Participating
Organizations, and after any approved modifications, the final Key Questions are posted to the
Drug Effectiveness Review Project Website.

The Clinical Advisory Group

In general, the purpose of the Clinical Advisory Group is to provide insight and assistance to
Drug Effectiveness Review Project researchers and participants by offering clinically relevant
counsel throughout the stages of an original review. Currently, advisory groups are utilized for
updates of existing Drug Effectiveness Review Project reviews on a case-by-case basis
depending largely on whether a change in scope has occurred between the previous report and
the current update.
The Center for Evidence-based Policy identifies the potential clinical advisors based
initially on suggestions from the Participating Organizations, who are asked to recommend
clinical experts that best represent their constituencies and who also have significant recent
experience in providing direct patient care. The Center for Evidence-based Policy gathers
conflict of interest information from each clinician and coordinates the assembly of a balanced
Clinical Advisory Group. The composition of the Clinical Advisory Group and their conflicts of
interest declarations are reviewed and discussed by the Participating Organizations prior to any
clinician being contacted.
         Once the Clinical Advisory Group has been formalized, the Center for Evidence-based
Policy arranges a teleconference with the advisors and the Evidence-based Practice Center
researchers. In general, consultation with the Clinical Advisory Group focuses on the scope of
the initial Key Questions with regards to the relevant aspects of the population, including
identification of the most important subgroups, interventions, comparators, outcomes, and study
design. Further information regarding clinical experience with certain drug therapies and/or
disease-state management may also be discussed during the initial meeting and can occur
throughout the review process. For example, to assist with quality assessment, it may be useful to
seek guidance from the Clinical Advisory Group in the identification of the most important
baseline prognostic factors. Additionally, we consult with the Clinical Advisory Group members
to determine which subset of outcomes they would consider important enough to warrant formal
grading of the strength of evidence. Evidence-based Practice Center researchers consider
suggestions made by the Clinical Advisory Group members and any proposed modifications are
then discussed with the Participating Organizations.
         All Clinical Advisory Group members volunteer their time and expertise and are not
monetarily compensated by the Center for Evidence-based Policy, the Drug Effectiveness
Review Project, or Participating Organizations. Clinical Advisory Group members have the
option of becoming peer reviewers; however, this is not a required function of the group. Experts
who participated in the Clinical Advisory Group are listed on the Drug Effectiveness Review
Project Website.

Searching the Literature and Other Sources of Data

MEDLINE and other database searches
Searches for Drug Effectiveness Review Project reports are generally conducted in consultation
with a medical librarian. At a minimum, MEDLINE and the Cochrane Central Register of
Controlled Trials, Cochrane Database of Systematic Reviews, and Database of Abstracts of
Reviews of Effects are searched. Other databases (e.g., PsycInfo, CancerLit) may be searched
depending on the topic. Search strategies generally combine all included interventions (using
proprietary and generic names) and populations, see example below:

          Sample search strategy: Pegylated interferons for Hepatitis C infection
          (Numbers in parentheses represent number of studies retrieved)
          1 exp Hepatitis C/ or hepatitis C.mp. or hcv.mp. (36716)
          2 Pegasys.mp. (50)
          3 Peg-intron.mp. (25)
          4 peginterferon alfa 2a.mp. (679)
          5 peginterferon alfa 2b.mp. (521)
          6     Interferon Alfa 2a.mp. or exp Interferon Alfa-2a/ (3015)
          7     Interferon Alfa-2b.mp. or exp Interferon Alfa-2b/ (4167)
          8     6 or 7 (6677)
          9     exp Interferons/ (83358)
          10    2a.mp. (21935)
          11    2b.mp. (17454)
          12    9 and (10 or 11) (7636)
          13    exp Polyethylene Glycols/ (26140)
          14    pegylat$.mp. (2121)
          15    peginterferon$.mp. (967)
          16    13 or 14 or 15 (27324)
          17    (8 or 12) and 16 (988)
          18    2 or 3 or 4 or 5 or 17 (1027)
          19    ribavirin.mp. or exp Ribavirin/ (5099)
          20    1 and 18 and 19 (697)
          21    from 20 keep 1-697 (697)
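Strategies like the one above can also be composed programmatically before they are pasted into a database interface. The sketch below mirrors the boolean structure of the example strategy; the helper function and exact term lists are illustrative, not part of the Drug Effectiveness Review Project procedure.

```python
# Sketch: composing a boolean query with the same structure as the
# Ovid strategy above. Term lists mirror the example; the helper is
# illustrative only.

def any_of(terms):
    """OR-combine a list of search terms into one parenthesized group."""
    return "(" + " OR ".join(terms) + ")"

condition = any_of(["hepatitis C", "HCV"])                          # line 1
brands = any_of(["Pegasys", "Peg-Intron"])                          # lines 2-3
generics = any_of(["peginterferon alfa 2a",
                   "peginterferon alfa 2b"])                        # lines 4-5
interferons = any_of(["interferon alfa-2a", "interferon alfa-2b"])  # lines 6-8
pegylation = any_of(["pegylat*", "peginterferon*"])                 # lines 13-16

# lines 17-18: pegylated interferons, named directly or captured as
# (interferon AND pegylated)
drugs = f"({brands} OR {generics} OR ({interferons} AND {pegylation}))"
# line 20: restrict to the condition and the co-therapy of interest
query = f"{condition} AND {drugs} AND ribavirin"
print(query)
```

Building the query as data makes it easy to document, version, and rerun the strategy at the second search point before the draft report.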

      Databases are searched twice: once at the beginning of the review and again 2 to
3 months prior to submission of the draft report.

Dossier solicitation
The Center for Evidence-based Policy requests dossiers from all pharmaceutical companies that
manufacture any drug included in an individual report. Dossiers are intended to provide a
complete list of citations for all relevant studies of which the manufacturer is aware. We also
request unpublished study information and data, with the understanding that once the report is
published the public may obtain the information by requesting a copy of the dossier – in effect
making it public. Any dossier marked “confidential” is not accepted. A copy of the most recent
product label is also requested. Dossiers are reviewed by Drug Effectiveness Review Project
staff for relevant trials or other data that may not have been captured in MEDLINE or Web
searches and for unpublished data. An accounting of which companies provided dossiers is
included in the Results section of the report.

Web resources
At a minimum, the following Website must be searched for relevant information.
    • US Food and Drug Administration Center for Drug Evaluation and Research
       Drugs@FDA http://www.accessdata.fda.gov/scripts/cder/drugsatfda/
          o This site may be searched by drug name or active ingredient (not drug class) for
             statistical and medical reviews written by US Food and Drug Administration
             personnel examining information submitted by pharmaceutical companies to the
             US Food and Drug Administration for drug approval. However, the Website
             typically does not have documents related to older drugs and very new drugs.
             Reviews may be downloaded and hand searched for trials. The Center for Drug
             Evaluation and Research site also lists any postmarketing study commitments that
             are conducted after the US Food and Drug Administration has approved a product
             for marketing (e.g., studies requiring the sponsor to demonstrate clinical benefit of
             a product following accelerated approval).
           o The Medical and Statistical review documents contain information about trials
                 submitted as part of the New Drug Application and their results. Information
                 contained in the US Food and Drug Administration reviews is typically not
adequate to assess trial quality. However, these data are used to verify, or add to,
                 data obtained from published manuscripts of these trials. In addition, the studies
                 submitted to the US Food and Drug Administration are compared with those
                 found in the published literature and unpublished studies submitted by
                 manufacturers to identify any remaining unpublished studies. The results of the
                 trials reported in the US Food and Drug Administration documents are compared
                 to those reported in published reports of the same studies to identify variation in
                 outcome reporting. A summary of the findings of the search of US Food and Drug
                 Administration documents is included in the Results section of the report.

       At the discretion of the Lead Investigator, the following Websites may be searched for
relevant information. Other sites may also be included in the search if appropriate:

     •    ClinicalTrials.gov http://www.clinicaltrials.gov/
              o Information on planned and on-going trials, maintained by the National Institutes
                 of Health.
     •    Clinical Study Results http://www.clinicalstudyresults.org/
              o Sponsored by the Pharmaceutical Research and Manufacturers of America, and
                 provides clinical study results completed since October 2002, mostly from Phase
                 III and Phase IV studies. It includes a link to the electronic version of the drug
                 label, a bibliography of articles on the drug in question with links to the articles
                 where possible, and a complete summary of each hypothesis testing trial
                 (regardless of outcome) that has not been published in a peer-reviewed journal.
     •    Lilly Clinical Trials http://www.lillytrials.com/
              o One of several pharmaceutical manufacturers that have established clinical trial
                 registries for their own products. Trial results are searchable by therapeutic area
                 or product; new and ongoing trials are included.
     •    Current Controlled Trials http://www.controlled-trials.com/
              o Established to promote the exchange of information about ongoing randomized
                 controlled trials worldwide. Allows searching across multiple clinical trial
                 registers, including the National Health Service in England, United States
                 ClinicalTrials.gov, and direct access to Biomed Central.

Study Selection and Inclusion

Application of study design criteria
In order for any study report to be selected for inclusion in a Drug Effectiveness Review Project
review, it must meet all eligibility criteria for populations, interventions, outcomes, and study
designs, as explicitly specified, a priori, in the Key Questions determined by the Participating
Organizations. The reviewers, with approval of the Participating Organizations, set the study
design criteria. For effectiveness outcomes, the starting point for inclusion is controlled clinical
trials and good-quality systematic reviews, and for outcomes related to harms (adverse events),
these same designs are included, as well as cohort studies with a control group and case-control
studies. Within these study designs, direct comparisons (head-to-head studies) are the primary
focus of synthesis of the evidence. Determining eligibility of these studies is straightforward and
is based on the study reflecting a direct comparison of at least 2 drugs included in the Drug
Effectiveness Review Project report, and meeting population and outcome criteria.
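The head-to-head eligibility check described above is mechanical enough to sketch in code. The following is a minimal illustration under assumed field names; the drug set and study attributes are hypothetical, not the Drug Effectiveness Review Project data model.

```python
# Sketch of the head-to-head eligibility check described above.
# The included-drug set and study fields are hypothetical.
from dataclasses import dataclass

INCLUDED_DRUGS = {"drug_a", "drug_b", "drug_c"}  # scope set in the Key Questions

@dataclass
class Study:
    drugs: set           # drugs compared in the trial
    population_ok: bool  # meets the population criteria
    outcomes_ok: bool    # reports an included outcome measure

def is_head_to_head_eligible(study: Study) -> bool:
    """Eligible if the trial directly compares at least 2 included drugs
    and meets the population and outcome criteria."""
    return (len(study.drugs & INCLUDED_DRUGS) >= 2
            and study.population_ok
            and study.outcomes_ok)
```

A placebo-controlled trial of a single included drug fails this check; such trials are instead considered under the gap-filling criteria.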
        However, under the tenets of a best evidence approach, inclusion criteria for Drug
Effectiveness Review Project reports are written to allow inclusion of placebo-controlled trials,
active-control trials, single-arm cohort or open-label extension studies, and meta-analyses not
based on results of a systematic review (“pooled analyses”) when necessary to fill gaps in
evidence in instances when direct comparisons between drugs have not been made. This may be
extended to include situations where direct comparisons are available, but these studies do not
report outcomes important to the Drug Effectiveness Review Project Participating Organizations.
Determining the eligibility of good-quality systematic reviews can also be complicated and
depends on how similar the scope of the review is to the scope of the Drug Effectiveness Review
Project report, and how recent the evidence included in the review is.
        Evidence not meeting study design inclusion criteria may be included in the report if the
evidence is clearly identified as not meeting the criteria but is being included as a matter of
record. Examples of such information are US Food and Drug Administration MedWatch reports
and case series, especially those leading to black box warnings in the product label. These can be
reported in the introduction/background section or in the section discussing evidence on adverse
events.

Cut-off date for new drug inclusion
If a new drug is introduced to the market, the last date of inclusion of that drug in the report is 15
calendar days subsequent to the date the dossiers are due to be submitted by the pharmaceutical
manufacturers. Additionally, if the drug has not been approved at the time initial dossier
solicitations are sent out, the manufacturer must notify the Center for Evidence-based Policy of
the pending approval date and intent to submit a dossier prior to the dossier submission deadline.
This ensures that every pharmaceutical manufacturer is given a fair chance to submit dossiers
related to the newly approved drug.
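Under one reading of the rule above, the cut-off reduces to simple date arithmetic; the dates below are made up for illustration.

```python
# Sketch of the cut-off rule above: a newly marketed drug can be added
# until 15 calendar days after the dossier due date. Dates are made up,
# and this is one possible reading of the rule, not the DERP procedure.
from datetime import date, timedelta

dossier_due = date(2011, 1, 10)            # hypothetical submission deadline
cutoff = dossier_due + timedelta(days=15)  # last date for new drug inclusion

def drug_includable(approval_date: date) -> bool:
    """A new drug is included only if approved on or before the cut-off."""
    return approval_date <= cutoff
```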

Inclusion of active-control and placebo-controlled trials
In Drug Effectiveness Review Project reviews, good-quality, randomized controlled trials that
directly compare different drugs (head-to-head trials) provide the most valid evidence of their
comparative effectiveness. However, Drug Effectiveness Review Project reviewers often face
instances when direct comparisons between one or more included drugs have not been studied.
Or, even when direct comparative evidence is available, it may be limited in quality, quantity,
clinical impact, generalizability and/or other important elements.
        Limitations in quality of direct comparative evidence are determined based on objective
assessment of internal validity using predefined criteria. Limitations in quantity of direct
comparative evidence are determined based on consideration of the adequacy of the number of
studies and subject sample sizes. Regarding clinical impact, a common limitation found in
clinical trials in general is their under-reporting of important health outcomes such as quality of
life and functional capacity. Likewise, regarding generalizability, clinical trials are often
criticized overall for using narrowly defined populations and for their under-reporting of
treatment outcomes in subgroups based on age, sex, race, and common comorbidities.
In cases where such gaps in direct comparative evidence exist, trials that compare
included drugs to placebo are considered for their usefulness in providing a source for qualitative
or quantitative indirect comparisons of effectiveness and harms. Inclusion criteria for Drug
Effectiveness Review Project reviews are written to allow inclusion of placebo-controlled trials
to fill gaps in direct evidence. Judgments regarding what constitutes a gap are determined on a
case-by-case basis, but are based on principles of the strength of evidence, including the risk of
bias, consistency, precision, directness, and applicability of the direct evidence.(1) A description
of the rationale for judgments regarding sufficiency of head-to-head trial evidence and utilization
of placebo-controlled trials are provided in each Drug Effectiveness Review Project report. In
updates, as new head-to-head trials emerge that correspond to previously-defined gaps in
evidence, reviewers should consider removing placebo-controlled trials that may no longer be
useful and should revise the description of their rationale accordingly.
         As with head-to-head trials, the quality of any placebo-controlled trials that contribute
data to the synthesis is assessed using the same standardized criteria, and their data are abstracted
into evidence tables. On the other hand, for areas of a Drug Effectiveness Review Project review
where direct comparative evidence is deemed sufficient, evidence from placebo-controlled trials
may still be included, but data abstraction and quality assessment are not required.
         The method for synthesizing evidence from placebo-controlled trials is determined on a
case-by-case basis. Pursuit of qualitative or quantitative indirect comparison is never required, and
decisions to do so must depend on consideration of clinical, methodological, and statistical
heterogeneity levels across the individual studies. Guidance on methods for quantitative indirect
synthesis can be found elsewhere.(2) In many cases, when excess heterogeneity is present, a
general discussion of findings from placebo-controlled trials can be useful for identifying which
individual drugs have any evidence of effect in the gap areas compared with those that do not
even have basic efficacy data.
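For reference, one widely used quantitative approach, the Bucher adjusted indirect comparison, is a possible instance of the methods referenced above and can be sketched as follows; the numeric inputs are purely illustrative.

```python
# Sketch of a Bucher-style adjusted indirect comparison of drug A vs.
# drug B through a common placebo comparator. Inputs are illustrative.
import math

def bucher_indirect(log_or_a, se_a, log_or_b, se_b):
    """Indirect log odds ratio of A vs. B from each drug's
    placebo-controlled estimate; the variances add."""
    log_or_ab = log_or_a - log_or_b
    se_ab = math.sqrt(se_a ** 2 + se_b ** 2)
    ci = (log_or_ab - 1.96 * se_ab, log_or_ab + 1.96 * se_ab)
    return log_or_ab, se_ab, ci

# e.g., A vs. placebo log OR -0.5 (SE 0.20); B vs. placebo -0.2 (SE 0.15)
est, se, ci = bucher_indirect(-0.5, 0.20, -0.2, 0.15)
```

Note that the standard error of the indirect estimate exceeds that of either direct estimate, which is one reason indirect evidence is graded down for precision.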
         Trials that compare one of the drugs included in the review against a drug that is not
included are called “active-control” in the Drug Effectiveness Review Project. This is to
differentiate from trials with direct comparisons among the included drugs. These studies are
included only in specific, infrequent, situations. Where there is no direct evidence, and no or
very limited evidence from placebo-controlled trials, evidence from active-control trials may be
relevant. However, such indirect comparisons are typically only useful when the comparator (the
active-control drug) is the same across the included studies. If there is significant heterogeneity
in the comparators such studies are unlikely to provide good indirect evidence for comparing one
included drug to another.

Pooled analyses
A pooled analysis is a meta-analysis of a group of highly selected studies. There is typically no
search strategy to identify the articles (or, at most, a noncomprehensive one) and no quality
assessment of the included trials. Pooled analyses are not systematic in nature.
However, there are limited situations where this level of evidence may be useful and admissible
in a Drug Effectiveness Review Project report.
       Similar to the use of placebo-controlled trial evidence, pooled analyses are used to provide
evidence where no evidence exists or the existing evidence is insufficient. For example, a pooled
analysis may present data on subgroups that are not obtainable from the primary sources, or may
supplement information on outcomes not reported in the primary sources. In these cases, the
primary studies have been published or are available to the Drug Effectiveness Review Project
authors in sufficient detail to assess the quality of the study.
       However, pooled analyses of results already available to the Drug Effectiveness Review
Project authors from primary sources are not included. Because pooled analyses do not follow
a systematic approach to identifying and assessing the studies, Drug Effectiveness Review
Project authors undertake an independent analysis of these studies. If the pooled analysis is the
only source of data from component studies (e.g. results of the primary studies included in the
meta-analysis are not published), it can be included at the Drug Effectiveness Review Project
report author’s discretion, but the limitations are made clear – primarily the limited ability to
assess the quality of the component studies.

Systematic reviews
As part of a high-quality approach to evidence review, existing systematic reviews are
considered for inclusion in Drug Effectiveness Review Project reports along with other types of
evidence. The intention is to include reviews that directly address the Key Questions posed in the
report and that meet minimum standards for quality.
        In order for a review to be considered for inclusion in a Drug Effectiveness Review
Project report, the review must meet 2 criteria that indicate it is “systematic”. First, the
review must include a comprehensive search method for the evidence. This entails searching
multiple sources of information (electronic databases, reference lists, hand searching of journals,
etc.). Second, the review must provide (or at least describe) the search terms used to retrieve the
evidence. Other information, such as the reporting of study eligibility criteria and quality
assessment, is not required to determine whether a review qualifies as “systematic”, although this
information is useful. The review also must address questions that are similar enough to those
posed in the report to provide useful information to the report readers. Reviews that evaluate
“class effect” of drugs grouped together compared with other interventions are unlikely to be
useful in a Drug Effectiveness Review Project report. Moreover, reviews that compare a small
proportion of drugs in a large class may not be useful. An example would be if 2 of 7 drugs were
reviewed. The report authors determine the usefulness of the review in the larger context of the
drug class.
        Additional inclusion criteria are determined based on what is known about the underlying
evidence base. For example, in an area where there are many existing reviews over many years,
the authors may choose a cut-off date to examine only the most recent reviews for inclusion,
such as within 2 years of the Drug Effectiveness Review Project search dates. In other areas, this
may not be a reliable approach if the underlying literature is older and has not changed in recent
years.

Single-arm studies: Cohort or open-label extension of a trial
There are 2 types of studies considered here: observational studies of patients receiving a drug
included in the Drug Effectiveness Review Project report with no comparison group that is
relevant to the review and open-label extension studies of a randomized controlled trial.
       Single-group studies are included under the “best evidence” approach only if the study
adds important evidence on harms that is not available from other, higher quality, studies. This
means that the study must have exposure duration longer than the trials included and that no
comparative evidence is available. The minimum duration (e.g. 2 years of follow-up) is
determined a priori based on the current knowledge of the drugs’ potential adverse events and
taking account of the existing evidence from trials.
        Caveats in using open-label extension studies include that the study population is derived
from a clinical trial, where populations are typically highly selected (a narrow set of inclusion
criteria), and that patients continuing in the extension are often those who had an adequate
response and/or tolerated the drug during the trial period. There are no clearly agreed-upon
criteria for evaluating the quality of such studies.
        Single-group cohort studies are evaluated under the same criteria used to evaluate the
quality of cohort studies with a comparison group. It can be difficult to determine the mean or
minimum duration of follow-up (exposure to the drug) in these studies, which may make them
less useful in evaluating longer-term harms. However, this type of study may include a more
broadly defined group of patients than those included in trials and could potentially increase the
applicability of this evidence.

Unpublished studies or data
Unpublished studies may be identified through pharmaceutical manufacturer dossiers, US Food
and Drug Administration documents, or trial registries. Pharmaceutical manufacturer dossiers
may also contain previously unpublished supplemental data from published studies. Unpublished
data or studies cannot be submitted by pharmaceutical companies after the dossier process
timeline (e.g., not through the public comment process for draft reports).

Unpublished studies
Unpublished studies identified through pharmaceutical manufacturer dossiers, US Food and
Drug Administration documents, or trial registries are to be included only if the study meets the
inclusion criteria established in the key questions and sufficient detail is provided to assess the
study quality. At a minimum, information must be provided on the comparability of groups at
baseline, the number of patients analyzed, whether an intention-to-treat analysis was conducted,
and the type of statistical test used. If this information is not present in the dossier submission,
the study is not to be included.

Supplemental data
In situations where additional data are provided regarding a published study, such as additional
outcomes or subgroup data not included in the published manuscript, analyses of these data will
be included if details of the data analysis are provided. Specifically, the type of statistical test
used, numbers analyzed, and whether an intention-to-treat analysis was conducted must be
reported. In general, raw data will not be analyzed by the review team. Additional data on
subgroups will be included only for direct, head-to-head comparisons of included drugs, and it is
expected that any analyses conducted on the data will be adequately described for reviewer
evaluation. Study quality assessment will be based on the fully published study details.

Process for determining study eligibility
Overall, determination of study eligibility is based on reviewer judgment using a 2-step process.
In order to reduce potential bias, and ensure accuracy and reproducibility, all study reports
identified in searches are assessed for eligibility by at least 2 qualified reviewers (“dual review”)
and final selection decisions are made using a consensus process. Qualified reviewers are limited
to individuals with adequate training and experience to apply the inclusion criteria with
consistency and accuracy.
        The first step of the study selection process involves assessment of titles and abstracts
identified through literature searches for preliminary determination of study eligibility. Only
study reports with titles and abstracts that are unequivocally ineligible are rejected at this stage.
For all other reports, the full-text articles are obtained and read in detail for the second step in
determination of eligibility.
        Both steps in the study selection process should involve “dual review”, and it is up to the
reviewers to decide which of 2 “dual review” options to use. Where possible, it is desirable for
2 independent reviewers to complete eligibility assessments for each report in duplicate.
However, it is also acceptable to have a first reviewer complete eligibility assessments and a
second reviewer check the accuracy of the first reviewer’s assessment results. If the 2 reviewers
are unable to agree, a third party, a senior reviewer, is consulted.
        Results of eligibility assessments for all screened reports are displayed in a Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement-based
diagram.(3) The flow diagram depicts the flow of information through the different phases of a
systematic review. It maps out the number of records identified, included and excluded, and the
reasons for exclusions. Studies are excluded if they do not meet predetermined inclusion criteria
as defined in the Key Questions, for example when the population, intervention, comparator,
outcomes, or study designs do not meet eligibility requirements. Studies with results presented
only in an abstract of a conference proceeding are excluded and are described as not meeting
study design criteria. Similarly, systematic reviews are excluded if they are outdated or of poor
quality. Other reasons for exclusion include publication in a non-English language and failure to
retrieve the article despite exhausting all retrieval efforts.
        Finally, for reader convenience, all Drug Effectiveness Review Project reports contain an
Appendix that lists reasons for exclusion for all individual trials that were excluded at least at the
full-text level. For updates, this list is a cumulative list of trials but within a 3-5-page limit. If it
exceeds that page limit, the excluded trials list consists of trials for that specific update only and
readers are directed to the older versions of the reports available on the Drug Effectiveness
Review Project Website for trials excluded previously.

Quality Assessment of Individual Studies

For trials, we assess internal validity (quality) based on the predefined criteria. These criteria are
based on those used by the US Preventive Services Task Force and the National Health Service
Centre for Reviews and Dissemination (United Kingdom).(4, 5) We rate the internal validity of
each trial based on the methods used for randomization, allocation concealment, and blinding;
the similarity of compared groups at baseline; maintenance of comparable groups; adequate
reporting of dropouts, attrition, crossover, adherence, and contamination; loss to follow-up; and
the use of intention-to-treat analysis. Trials with a fatal flaw are rated poor quality; trials that
meet all criteria are rated good quality; the remainder are rated fair quality. As the fair-quality
category is broad, studies with this rating vary in their strengths and weaknesses: The results of
some fair-quality studies are likely to be valid, while others are only possibly valid. A poor-
quality trial is not valid; the results are at least as likely to reflect flaws in the study design as a
true difference between the compared drugs. A fatal flaw is reflected by failure to meet
combinations of items of the quality assessment checklist. A particular randomized trial might
receive 2 different ratings, one for effectiveness and another for adverse events. More detailed
descriptions of how each checklist item is assessed are presented in Table 1, below.

Table 1. Drug Effectiveness Review Project guidelines to assess quality of trials

1. Was the assignment to the treatment groups really random?
   Yes: Use of the term “randomized” alone is not sufficient for a judgment of “Yes”. An explicit
   description of the method of sequence generation must be provided. Adequate approaches include
   computer-generated random numbers and random numbers tables.
   No: Randomization was either not attempted or was based on an inferior approach (e.g.,
   alternation, case record number, birth date, or day of week).
   Unclear: Insufficient detail provided to make a judgment of yes or no.

2. Was the treatment allocation concealed?
   Yes: Adequate approaches to concealment of randomization: centralized or pharmacy-controlled
   randomization, serially numbered identical containers, or an on-site computer-based system with
   a randomization sequence that is not readable until allocation. Note: If a trial did not use
   adequate allocation concealment methods, the highest rating it can receive is “Fair”.
   No: Inferior approaches to concealment of randomization: use of alternation, case record
   number, birth date, or day of week; open random numbers lists; serially numbered envelopes.
   Unclear: No details about allocation methods. A statement that “allocation was concealed” is
   not sufficient; details must be provided.

3. Were groups similar at baseline in terms of prognostic factors?
   Yes: Parallel design: no clinically important differences. Crossover design: comparison of
   baseline characteristics must be made based on order of randomization. Prognostic factors are
   important to consider and are discussed a priori with clinical advisory groups. A statistically
   significant difference does not automatically constitute a clinically important difference.
   No: Clinically important differences.
   Unclear: Parallel design: statement of “no differences at baseline” but data not reported, data
   not reported by group, or no mention at all of baseline characteristics. Crossover design: only
   baseline characteristics of the overall group reported.

4. Were eligibility criteria specified?
   Yes: Eligibility criteria were specified a priori.
   No: Criteria not reported, or description of enrolled patients only.

5. Were outcome assessors blinded to treatment allocation?
6. Was the care provider blinded?
7. Was the patient blinded?
   Yes: Explicit statement(s) that outcome assessors/care provider/patient were blinded. Double-
   dummy designs and use of identically appearing treatments are also considered sufficient
   blinding methods for patients and care providers.
   No: No blinding used; open-label.
   Unclear, described as double-blind: Study described as double-blind but no details provided on
   how blinding was carried out or who was specifically blinded.
   Not reported: No information about blinding.

8. Did the article include an intention-to-treat analysis or provide the data needed to calculate
   it (that is, number assigned to each group, number of subjects who finished in each group, and
   their results)?
   Yes: All patients that were randomized were included in the analysis; imputation methods (e.g.,
   last observation carried forward) should be clearly described. OR: Exclusion of 5% of patients
   or less is acceptable, given that the reasons for exclusion are not related to outcome (e.g.,
   did not take study medication) and that the exclusions would not be expected to have an
   important impact on the effect size.
   No: Exclusion of greater than 5% of patients from analysis, or exclusion of less than 5% with
   reasons that may affect the outcome (e.g., adverse events, lack of efficacy) or reasons that may
   be due to bias (e.g., investigator decision).
   Unclear: Numbers analyzed are not reported.

9. Did the study maintain comparable groups?
   Yes: No attrition, or the groups analyzed remained similar in terms of their baseline prognostic
   factors.
   No: Groups analyzed had clinically important differences in important baseline prognostic
   factors.
   Unclear: There was attrition, but insufficient information to determine whether the groups
   analyzed had clinically important differences in important baseline prognostic factors.

10. Were levels of crossovers (≤ 5%), adherence (≤ 20%), and contamination (≤ 5%) acceptable?
   Yes: Levels of crossovers, adherence, and contamination were below the specified cut-offs.
   No: Levels of crossovers, adherence, and contamination were above the specified cut-offs.
   Unclear: Insufficient information provided to determine the level of crossovers, adherence, and
   contamination.

11. Was the rate of overall attrition and the difference between groups in attrition within
   acceptable levels?
   Overall attrition: There is no empirical evidence to support establishment of a specific level
   of attrition that is universally considered “important”. The level of attrition considered
   important will vary by review and is determined a priori by the review teams. Attrition refers
   to discontinuation for ANY reason, including loss to follow-up, lack of efficacy, adverse
   events, investigator decision, protocol violation, consent withdrawal, etc.
   Yes: The overall attrition rate was below the level established by the review team.
   No: The overall attrition rate was above the level established by the review team.
   Unclear: Insufficient information provided to determine the level of attrition.
   Differential attrition:
   Yes: The absolute difference between groups in rate of attrition was below 10%.
   No: The difference between groups in the overall attrition rate or in the rate of attrition for
   a specific reason (e.g., adverse events, protocol violations) was 10% or more.
   Unclear: Insufficient information provided to determine the level of attrition.
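Items 10 and 11 reduce to simple numeric checks once the counts are extracted from a trial report. The following is a minimal sketch, not code from any Drug Effectiveness Review Project report: the 20% overall attrition threshold is a hypothetical placeholder (the actual level is set a priori by each review team), while the 10% differential attrition cut-off is taken from item 11.

```python
# Hypothetical helpers illustrating the attrition checks in Table 1, item 11.
# The default overall threshold (20%) is a placeholder only; review teams set
# the actual level a priori. The 10% differential cut-off comes from the table.

def overall_attrition_ok(n_randomized: int, n_completed: int,
                         threshold: float = 0.20) -> bool:
    """'Yes' if the overall attrition rate is below the team's threshold."""
    attrition_rate = (n_randomized - n_completed) / n_randomized
    return attrition_rate < threshold

def differential_attrition_ok(rate_group_a: float, rate_group_b: float) -> bool:
    """'Yes' if the absolute between-group attrition difference is below 10%."""
    return abs(rate_group_a - rate_group_b) < 0.10

# 200 randomized, 170 completed: 15% overall attrition, below the 20% placeholder
print(overall_attrition_ok(200, 170))         # True
# 12% vs. 25% attrition: a 13% absolute difference, at or above the 10% cut-off
print(differential_attrition_ok(0.12, 0.25))  # False
```

Note that a trial can pass the overall check and still fail the differential check (or vice versa); the table treats them as separate judgments.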

Systematic Reviews

Included systematic reviews are rated for quality based on a clear statement of the question(s)
the review is intended to answer; reporting of inclusion criteria; methods used for identifying
literature (the search strategy), validity assessment, and synthesis of evidence; and details
provided about included studies. Reviews are categorized as good when all criteria are met.
Because there are different methods available for assessing the quality of systematic reviews, and
none has become the standard, reviewers can use one of the following: AMSTAR,(6-8) Oxman
and Guyatt,(9, 10) or the Centre for Reviews and Dissemination criteria (below).(4)

1. Is there a clear review question and inclusion/exclusion criteria reported relating to the
        primary studies?
             a. A good-quality review should focus on a well-defined question or set of
                questions, which ideally refers to the inclusion/exclusion criteria by which
                decisions are made on whether to include or exclude primary studies. The criteria
                 should relate to the 4 components of the study question: indications (patient
                 populations), interventions (drugs), outcomes of interest, and study designs. In
                 addition, details
                are reported relating to the process of decision-making, i.e., how many reviewers
                were involved, whether the studies were examined independently, and how
                disagreements between reviewers were resolved.
     2. Is there evidence of a substantial effort to search for all relevant research?
            a. This is usually the case if details of electronic database searches and other
                identification strategies are given. Details of the search terms used, date, and
                language restrictions are presented. In addition, descriptions of hand-searching,
                attempts to identify unpublished material, and any contact with authors, industry,
                and research institutes are provided. The appropriateness of the database(s)
                 searched by the authors should also be considered: for example, if only MEDLINE
                 is searched for a review looking at health education, it is unlikely that all
                 relevant studies have been located.
     3. Is the validity of included studies adequately assessed?
             a. A systematic assessment of the quality of primary studies should include an
                explanation of the criteria used (e.g., method of randomization, whether outcome
                assessment was blinded, whether analysis was on an intention-to-treat basis).
                Authors may use either a published checklist or scale, or one that they have
                designed specifically for their review. Again, the process relating to the
                assessment is reported (i.e. how many reviewers involved, whether the assessment
                was independent, and how discrepancies between reviewers were resolved).
     4. Is sufficient detail of the individual studies presented?
             a. The review should demonstrate that the studies included are suitable to answer the
                question posed and that a judgement on the appropriateness of the authors'
                conclusions can be made. If a paper includes a table giving information on the
                design and results of the individual studies, or includes a narrative description of
                the studies within the text, this criterion is usually fulfilled. If relevant, the tables
                or text should include information on study design, sample size in each study
                group, patient characteristics, description of interventions, settings, outcome
                measures, follow-up, drop-out rate (withdrawals), effectiveness results, and
                adverse events.
     5. Are the primary studies summarized appropriately?
             a. The authors should attempt to synthesize the results from individual studies. In all
                cases, there should be a narrative summary of results, which may or may not be
                accompanied by a quantitative summary (meta-analysis).
                         For reviews that use a meta-analysis, heterogeneity between studies should
                be assessed using statistical techniques. If heterogeneity is present, the possible
                reasons (including chance) should be investigated. In addition, the individual
                 evaluations should be weighted in some way (e.g., according to sample size or
                 inverse of the variance) so that studies that are considered to provide the most
                 reliable data have greater impact on the summary statistic.
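The inverse-variance weighting described in item 5 can be sketched in a few lines. This is a generic fixed-effect pooling calculation, not code from any Drug Effectiveness Review Project report, and the effect estimates and standard errors below are invented illustrative values.

```python
# Fixed-effect inverse-variance pooling: each study is weighted by 1/variance,
# so the most precise studies have the greatest impact on the summary statistic.

def inverse_variance_pool(estimates, std_errors):
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Invented per-study effect estimates (e.g., mean differences) and standard errors
estimates = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.05, 0.20]

pooled, pooled_se = inverse_variance_pool(estimates, std_errors)
print(f"pooled estimate {pooled:.3f}, standard error {pooled_se:.3f}")
```

With these numbers the second study, having the smallest standard error, carries a weight of 400 against 100 and 25 for the others, so it pulls the pooled estimate well toward 0.10 rather than the unweighted mean of the three estimates.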

Data Synthesis

For the Drug Effectiveness Review Project, evidence tables showing the study characteristics,
quality ratings, and results for all included studies are constructed. Studies are reviewed using a
hierarchy of evidence approach, where the best evidence is the focus of our synthesis for each
question, population, intervention, and outcome addressed. Studies that evaluate one drug against
another provide direct evidence of comparative effectiveness and harms. Where possible, these
data are the primary focus. Direct comparisons are preferred over indirect comparisons;
similarly, effectiveness and long-term or serious adverse event outcomes are preferred to
efficacy and short-term tolerability outcomes.
        In theory, trials that compare a drug with other drug classes or with placebo can also
provide evidence about effectiveness. This is known as an indirect comparison and can be
difficult to interpret for a number of reasons, primarily heterogeneity of trial populations,
interventions, and outcomes assessment across the studies. Data from indirect comparisons are
used to support direct comparisons, where they exist, and are used as the primary comparison
where no direct comparisons exist. Indirect comparisons are interpreted with caution.
        Quantitative analyses are conducted using meta-analyses of outcomes reported by a
sufficient number of studies that are homogeneous enough that combining their results could be
justified. In general, the Drug Effectiveness Review Project follows the guidance on meta-
analysis put forth for Evidence-based Practice Centers in the Evidence-based Practice Center
Methods Guide.(2) In order to determine whether meta-analysis can be meaningfully performed,
we consider the quality of the studies and the heterogeneity among studies in design, patient
population, interventions, and outcomes. The Q statistic and the I² statistic (the proportion of
variation in study estimates due to heterogeneity) are calculated to assess statistical heterogeneity
between studies.(11, 12) If significant heterogeneity is shown, potential sources are then
examined by analysis of subgroups of study design, study quality, patient population, and
variation in interventions. Meta-regression models may be used to formally test for differences
between subgroups with respect to outcomes.(13, 14) Random effects models to estimate pooled
effects are preferred in most cases, unless a case can be made that a fixed effect model is more
appropriate. Other analyses, including adjusted indirect meta-analysis and mixed treatment effects
models (network meta-analysis), are done in consultation with statisticians experienced in
conducting these analyses and using the most up-to-date and appropriate methods. When it is
determined that it is unwise to pool data from a group of studies, the data are summarized
qualitatively.
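The Q and I² statistics mentioned above can be computed directly from per-study estimates and standard errors. The following is a minimal sketch with invented illustrative numbers, not an analysis from any report; in practice these statistics are produced by standard meta-analysis software.

```python
# Cochran's Q: the weighted sum of squared deviations of study estimates from
# the fixed-effect pooled estimate. I^2 = (Q - df) / Q, floored at 0, is the
# proportion of variation across studies attributable to heterogeneity rather
# than chance.

def heterogeneity(estimates, std_errors):
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i_squared

# Invented effect estimates and standard errors for 4 studies
q, i_squared = heterogeneity([0.30, 0.10, 0.25, 0.45],
                             [0.10, 0.05, 0.20, 0.15])
print(f"Q = {q:.2f}, I^2 = {i_squared:.0%}")
```

For these illustrative inputs Q is roughly 7.3 on 3 degrees of freedom and I² is roughly 59%, a level of statistical heterogeneity that would ordinarily prompt the subgroup and meta-regression exploration described above.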
        When synthesizing unpublished evidence, reviewers will conduct sensitivity analyses
where possible to determine if there is an indication of bias when unpublished data are included.
The source of unpublished information will be clearly noted in the text of the report, stating
that these are unpublished data that have not undergone a medical journal’s peer review process
and should be interpreted cautiously.

Applicability

An assessment of applicability is undertaken in Drug Effectiveness Review Project reports. The
applicability assessment is tailored to the Key Questions, and if possible the population to whom
the questions are intended to apply. These are defined in advance with the help of the Clinical
Advisory Group and the Drug Effectiveness Review Project Participating Organizations. A
discussion of applicability appears immediately before the Summary Table in Drug Effectiveness
Review Project reports.

Grading the Strength of the Overall Body of Evidence

Strength of evidence is assessed based on the main outcomes for each Key Question, as
determined by the Participating Organizations and with input from the Clinical Advisory Group,
and generally follows the method used by the Evidence-based Practice Center program.(1)
Individual lead investigators may choose to use the GRADE approach to grading the strength of
evidence, if they determine they are more familiar with this system.(15-17) In either system, the
main domains considered in assessing the strength of a body of evidence for a given outcome
are: risk of bias of the included studies, directness of the studies in measuring the outcome and
comparison in question, and the consistency and precision of the results of the studies. Poor-quality
studies do not contribute to the assessment of overall risk of bias for a body of evidence
because their results are not synthesized with the fair and good quality study results. After
assessing each of these items for the group of studies, an overall assessment is made. The
evidence can be described as low, moderate, or high strength of evidence. In addition, when
there is either no evidence available or the evidence is too limited or too indirect to make
conclusions about comparative effectiveness, the evidence can be described as insufficient to
determine the strength of evidence. The Evidence-based Practice Center definitions of these
terms are listed in Table 2, below. The tables used to assess each outcome are presented in
an Appendix. A summary grade of the strength of evidence can be included in the summary table
(above). A paragraph describing the strength of evidence of each main outcome can also be used.
Tables showing the grading of individual domains for each outcome assessed are included as an
appendix in Drug Effectiveness Review Project reports; overall assessments (low, moderate,
high, or insufficient strength of evidence) are included as part of the Summary Table found at the
end of each report.

Table 2. Strength of evidence grades and definitions
High: High confidence that the evidence reflects the true effect. Further research is very
unlikely to change our confidence in the estimate of effect.
Moderate: Moderate confidence that the evidence reflects the true effect. Further research may
change our confidence in the estimate of effect and may change the estimate.
Low: Low confidence that the evidence reflects the true effect. Further research is likely to
change our confidence in the estimate of effect and is likely to change the estimate.
Insufficient: Evidence either is unavailable or does not permit estimation of an effect.
Source: Owens et al, 2009(18)

Summary Table
