Drug Effectiveness Review Project - Systematic Review Methods and Procedures
Drug Effectiveness
Review Project
Systematic Review Methods and
Procedures
Revised January 2011
Principal Investigator: Marian McDonagh, PharmD
Oregon Evidence-based Practice Center
Oregon Health & Science University
Mark Helfand, MD, MPH, Director
Copyright © 2011 by Oregon Health & Science University
Portland, Oregon 97239. All rights reserved.

TABLE OF CONTENTS
Introduction ................................................................................................................................. 3
Review Methods .......................................................................................................................... 3
Conflict of Interest Policy .......................................................................................................................... 3
Drug Effectiveness Review Project Topic Selection Process................................................................... 3
Formulating Key Questions ...................................................................................................................... 4
The Clinical Advisory Group ..................................................................................................................... 4
Searching the Literature and Other Sources of Data ............................................................................... 5
MEDLINE and other database searches .............................................................................................. 5
Dossier solicitation................................................................................................................................ 6
Web resources ..................................................................................................................................... 6
Study Selection and Inclusion................................................................................................................... 7
Application of study design criteria ....................................................................................................... 7
Cut-off date for new drug inclusion....................................................................................................... 8
Inclusion of active-control and placebo-controlled trials....................................................................... 8
Pooled analyses ............................................................................................................................... 9
Systematic reviews ........................................................................................................................ 10
Single-arm studies: Cohort or open-label extension of a trial ........................................................ 10
Unpublished studies or data .......................................................................................................... 11
Process for determining study eligibility ............................................................................................. 11
Quality Assessment of Individual Studies ............................................................................................... 12
Systematic Reviews ................................................................................................................................ 14
Data Synthesis ........................................................................................................................................ 16
Applicability ............................................................................................................................................. 16
Grading the Strength of the Overall Body of Evidence ........................................................................... 17
Summary Table....................................................................................................................................... 17
Peer Review and Public Comment ......................................................................................................... 18
Peer review......................................................................................................................................... 18
Public comment .................................................................................................................................. 18
Updating Reports .................................................................................................................................... 18
Single Drug Addendum to Reports ......................................................................................................... 19
Outline of a Typical Drug Effectiveness Review Project Report............................................................. 19
References................................................................................................................................. 20
Tables
Table 1. Drug Effectiveness Review Project guidelines to assess quality of trials ..................................... 13
Table 2. Strength of evidence grades and definitions ................................................................................. 17
Drug Effectiveness Review Project January 2011
Systematic Review Methods 2 of 21

Introduction
The methodology used by the Evidence-based Practice Centers in producing comparative
systematic reviews for the Drug Effectiveness Review Project is described here. The methods
follow the principles of “best evidence”, focusing on randomized controlled trials with direct
comparisons and health outcomes wherever possible. The methods we use evolve as international methods for evidence review evolve, incorporating newly developed methods as appropriate in support of our goal of producing high-quality systematic reviews that meet the needs of the Participating Organizations of the Drug Effectiveness Review Project (see “About DERP” for more information on the Participating Organizations that govern DERP).
Review Methods
Conflict of Interest Policy
Drug Effectiveness Review Project investigators and staff comply with a conflict-of-interest policy requiring a formal, written declaration that they hold no financial interests in any pharmaceutical company for the duration of their work on Drug Effectiveness Review Project projects. Prior to initiating work, all investigators and staff sign a form indicating they have no conflicts of interest. Any investigator or staff member continuing to work with the Drug Effectiveness Review Project declares annually that he or she remains free of financial interests in pharmaceutical companies.
For clinicians invited to participate in a Clinical Advisory Group, the Center for
Evidence-based Policy obtains declarations of conflicts of interest. The policy on these conflicts
is discussed in the section on Clinical Advisory Groups.
Drug Effectiveness Review Project Topic Selection Process
When new topics are considered, the Drug Effectiveness Review Project Participating
Organizations follow an explicit selection process over a 3-month period. This process ensures
that all organizations participate equally and that topics selected are relevant for the majority of
participants. This process is undertaken at various points throughout the 3-year Drug
Effectiveness Review Project contract cycle depending on the needs of the Participating
Organizations and funds available. When new topics are considered, the Center for Evidence-
based Policy solicits topics from each Participating Organization. The initial list is circulated
among Participating Organizations and if there are a large number of potential topics (e.g. 10 or
more) a vote is taken to narrow the list down to approximately 5 to receive further consideration.
The number of topics chosen for additional work depends on the number of new reports to be
initiated, but in general is not more than 5. The original topic submissions include the general
scope (drugs and populations) and the reasoning behind the proposed topic. After reviewing the
list of topics proposed, the Participating Organizations discuss the pros and cons of each
potential topic prior to having the Center for Evidence-based Policy proceed with the production
of briefing papers.
Briefing papers include original participant submissions, pros and cons, and an overview
of available evidence completed by the Oregon Evidence-based Practice Center. For each
proposed topic the Oregon Evidence-based Practice Center conducts a search of MEDLINE
using a search strategy designed specifically to identify systematic reviews. The Oregon
Evidence-based Practice Center also searches the Websites of the Agency for Healthcare Research
and Quality, Canadian Agency for Drugs and Technologies in Health, the Cochrane
Collaboration, Effective Healthcare, National Coordinating Center for Health Technology
Assessment, National Institute for Clinical Excellence, and the National Health Service Center
for Reviews and Dissemination to identify high quality systematic reviews relevant to the
proposed topics. Additionally, a MEDLINE search for randomized controlled trials pertaining to
the new topic is included to estimate the proposed topic size.
Formulating Key Questions
Based on the discussion held during topic selection, Key Questions are formulated and serve to
define the scope of a Drug Effectiveness Review Project report. In general, the Key Questions
follow this template:
1. What is the comparative effectiveness of [drugs] for treatment of [condition] in [population]?
2. What are the comparative harms of [drugs] for treatment of [condition] in [population]?
3. Does the comparative effectiveness or harms of [drugs] vary in patient subgroups defined by demographics (age, racial groups, gender, etc.), socioeconomic status, use of other medications, or presence of comorbidities?
The questions are modified to best suit the particular review and can include specific
outcomes of focus, such as mortality or symptom relief. Additional or sub-questions can be used
when they add important nuance. However, the study inclusion criteria are intended to provide
the detailed information on specific drugs and outcome measures included.
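As an illustration only, the template and scope can be thought of as structured data from which the Key Questions are rendered; the field values below are hypothetical placeholders, not an actual Drug Effectiveness Review Project topic:

```python
# Illustrative sketch: the Key Question template rendered from a
# structured scope. All values are placeholders, not a real DERP topic.
scope = {
    "drugs": "drug A and drug B",
    "condition": "condition X",
    "population": "adults with condition X",
}

template = [
    "What is the comparative effectiveness of {drugs} for treatment of "
    "{condition} in {population}?",
    "What are the comparative harms of {drugs} for treatment of "
    "{condition} in {population}?",
    "Does the comparative effectiveness or harms of {drugs} vary in "
    "patient subgroups?",
]

# Fill each template slot from the scope to produce the draft questions.
key_questions = [q.format(**scope) for q in template]
print(key_questions[0])
```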
Draft Key Questions are brought to the Participating Organization group for discussion
and comment. A second draft of the Key Questions is then formulated and again discussed with
the Participating Organizations. Clinical experts, identified by the Participating Organizations,
are consulted via teleconference to provide assistance in refining the Key Questions. Following
modifications, this set of draft Key Questions is posted to the Drug Effectiveness Review Project
Website for public comment. Public comments and Oregon Evidence-based Practice Center
responses are documented in a spreadsheet and are discussed with the Participating
Organizations, and after any approved modifications, the final Key Questions are posted to the
Drug Effectiveness Review Project Website.
The Clinical Advisory Group
In general, the purpose of the Clinical Advisory Group is to provide insight and assistance to
Drug Effectiveness Review Project researchers and participants by offering clinically relevant
counsel throughout the stages of an original review. Currently, advisory groups are utilized for
updates of existing Drug Effectiveness Review Project reviews on a case-by-case basis
depending largely on whether a change in scope has occurred between the previous report and
the current update.
The Center for Evidence-based Policy identifies the potential clinical advisors based
initially on suggestions from the Participating Organizations, who are asked to recommend
clinical experts that best represent their constituencies and who also have significant recent
experience in providing direct patient care. The Center for Evidence-based Policy gathers
conflict of interest information from each clinician and coordinates the assembly of a balanced
Clinical Advisory Group. The composition of the Clinical Advisory Group and their conflicts of
interest declarations are reviewed and discussed by the Participating Organizations prior to any
clinician being contacted.
Once the Clinical Advisory Group has been formalized, the Center for Evidence-based
Policy arranges a teleconference with the advisors and the Evidence-based Practice Center
researchers. In general, consultation with the Clinical Advisory Group focuses on the scope of
the initial Key Questions with regards to the relevant aspects of the population, including
identification of the most important subgroups, interventions, comparators, outcomes, and study
design. Further information regarding clinical experience with certain drug therapies and/or
disease-state management may also be discussed during the initial meeting and can occur
throughout the review process. For example, to assist with quality assessment, it may be useful to
seek guidance from the Clinical Advisory Group in the identification of the most important
baseline prognostic factors. Additionally, we consult with the Clinical Advisory Group members
to determine which subset of outcomes they would consider important enough to warrant formal
grading of the strength of evidence. Evidence-based Practice Center researchers consider
suggestions made by the Clinical Advisory Group members and any proposed modifications are
then discussed with the Participating Organizations.
All Clinical Advisory Group members volunteer their time and expertise and are not
monetarily compensated by the Center for Evidence-based Policy, the Drug Effectiveness
Review Project, or Participating Organizations. Clinical Advisory Group members have the
option of becoming peer reviewers; however, this is not a required function of the group. Experts
who participated in the Clinical Advisory Group are listed on the Drug Effectiveness Review
Project Website.
Searching the Literature and Other Sources of Data
MEDLINE and other database searches
Searches for Drug Effectiveness Review Project reports are generally conducted in consultation
with a medical librarian. At a minimum, MEDLINE and the Cochrane Central Register of
Controlled Trials, Cochrane Database of Systematic Reviews, and Database of Abstracts of
Reviews of Effects are searched. Other databases (e.g., PsycInfo, CancerLit) may be searched
depending on the topic. Search strategies generally combine all included interventions (using
proprietary and generic names) and populations, see example below:
Sample search strategy: Pegylated interferons for Hepatitis C infection
(Numbers in parentheses represent number of studies retrieved)
1 exp Hepatitis C/ or hepatitis C.mp. or hcv.mp. (36716)
2 Pegasys.mp. (50)
3 Peg-intron.mp. (25)
4 peginterferon alfa 2a.mp. (679)
5 peginterferon alfa 2b.mp. (521)
6 Interferon Alfa 2a.mp. or exp Interferon Alfa-2a/ (3015)
7 Interferon Alfa-2b.mp. or exp Interferon Alfa-2b/ (4167)
8 6 or 7 (6677)
9 exp Interferons/ (83358)
10 2a.mp. (21935)
11 2b.mp. (17454)
12 9 and (10 or 11) (7636)
13 exp Polyethylene Glycols/ (26140)
14 pegylat$.mp. (2121)
15 peginterferon$.mp. (967)
16 13 or 14 or 15 (27324)
17 (8 or 12) and 16 (988)
18 2 or 3 or 4 or 5 or 17 (1027)
19 ribavirin.mp. or exp Ribavirin/ (5099)
20 1 and 18 and 19 (697)
21 from 20 keep 1-697 (697)
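Each numbered Ovid line above retrieves a set of records, and later lines combine earlier ones with boolean AND/OR. Purely as an illustration of that combination logic (using toy record IDs, not real citations and not part of the DERP workflow), the strategy reduces to set algebra:

```python
# Illustrative sketch only: each Ovid line is a set of record IDs, and
# numbered combination lines are boolean set operations. Toy IDs used.
hep_c = {1, 2, 3, 4, 5}        # line 1: hepatitis C concept
pegasys = {2, 9}               # line 2: Pegasys (brand name)
peg_intron = {3, 10}           # line 3: Peg-Intron (brand name)
peg_interferon = {1, 2, 3, 6}  # lines 4-17 collapsed: pegylated interferon terms
ribavirin = {2, 3, 7}          # line 19: ribavirin concept

# line 18: any pegylated interferon term (OR across the drug-name sets)
line18 = pegasys | peg_intron | peg_interferon
# line 20: hepatitis C AND pegylated interferon AND ribavirin
line20 = hep_c & line18 & ribavirin
print(sorted(line20))
```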
Databases are searched twice: once at the beginning of the review and again 2 to 3 months prior to submission of the draft report.
Dossier solicitation
The Center for Evidence-based Policy requests dossiers from all pharmaceutical companies that
manufacture any drug included in an individual report. Dossiers are intended to provide a
complete list of citations for all relevant studies of which the manufacturer is aware. We also
request unpublished study information and data, with the understanding that once the report is
published the public may obtain the information by requesting a copy of the dossier – in effect
making it public. Dossiers marked “confidential” are not accepted. A copy of the most recent
product label is also requested. Dossiers are reviewed by Drug Effectiveness Review Project
staff for relevant trials or other data that may not have been captured in MEDLINE or Web
searches and for unpublished data. An accounting of which companies provided dossiers is
included in the Results section of the report.
Web resources
At a minimum, the following Website must be searched for relevant information.
• US Food and Drug Administration Center for Drug Evaluation and Research
Drugs@FDA http://www.accessdata.fda.gov/scripts/cder/drugsatfda/
o This site may be searched by drug name or active ingredient (not drug class) for
statistical and medical reviews written by US Food and Drug Administration
personnel examining information submitted by pharmaceutical companies to the
US Food and Drug Administration for drug approval. However, the Website
typically does not have documents related to older drugs and very new drugs.
Reviews may be downloaded and hand searched for trials. The Center for Drug
Evaluation and Research site also lists any postmarketing study commitments that
are conducted after the US Food and Drug Administration has approved a product
for marketing (e.g., studies requiring the sponsor to demonstrate clinical benefit of
a product following accelerated approval).
o The Medical and Statistical review documents contain information about trials
submitted as part of the New Drug Application and their results. Information
contained in the US Food and Drug Administration reviews is typically not
adequate to assess trial quality. However, these data are used to verify, or add to
data obtained from published manuscripts of these trials. In addition, the studies
submitted to the US Food and Drug Administration are compared with those
found in the published literature and unpublished studies submitted by
manufacturers to identify any remaining unpublished studies. The results of the
trials reported in the US Food and Drug Administration documents are compared
to those reported in published reports of the same studies to identify variation in
outcome reporting. A summary of the findings of the search of US Food and Drug
Administration documents is included in the Results section of the report.
At the discretion of the Lead Investigator, the following Websites may be searched for
relevant information. Other sites may also be included in the search if appropriate:
• ClinicalTrials.gov http://www.clinicaltrials.gov/
o Information on planned and on-going trials, maintained by the National Institutes
of Health.
• Clinical Study Results http://www.clinicalstudyresults.org/
o Sponsored by the Pharmaceutical Research and Manufacturers of America, and
provides clinical study results completed since October 2002, mostly from Phase
III and Phase IV studies. It includes a link to the electronic version of the drug
label, a bibliography of articles on the drug in question with links to the articles
where possible, and a complete summary of each hypothesis testing trial
(regardless of outcome) that has not been published in a peer-reviewed journal.
• Lilly Clinical Trials http://www.lillytrials.com/
o One of several pharmaceutical manufacturers that have established clinical trial
registries for their own products. Trial results are searchable by therapeutic area
or product; new and ongoing trials are included.
• Current Controlled Trials http://www.controlled-trials.com/
o Established to promote the exchange of information about ongoing randomized
controlled trials worldwide. Allows searching across multiple clinical trial
registers, including the National Health Service in England, United States
ClinicalTrials.gov, and direct access to Biomed Central.
Study Selection and Inclusion
Application of study design criteria
In order for any study report to be selected for inclusion in a Drug Effectiveness Review Project
review, it must meet all eligibility criteria for populations, interventions, outcomes, and study
designs, as explicitly specified, a priori, in the Key Questions determined by the Participating
Organizations. The reviewers, with approval of the Participating Organizations, set the study
design criteria. For effectiveness outcomes, the starting point for inclusion is controlled clinical
trials and good-quality systematic reviews, and for outcomes related to harms (adverse events),
these same designs are included, as well as cohort studies with a control group and case-control
studies. Within these study designs, direct comparisons (head-to-head studies) are the primary
focus of synthesis of the evidence. Determining eligibility of these studies is straightforward and
is based on the study reflecting a direct comparison of at least 2 drugs included in the Drug
Effectiveness Review Project report, and meeting population and outcome criteria.
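The selection logic just described — a study is included only if it meets every a priori criterion, and a head-to-head study must compare at least 2 drugs on the report's included list — can be sketched as a simple filter. The criterion names and drug names below are hypothetical placeholders, not an actual DERP protocol:

```python
# Hypothetical sketch of the eligibility check described in the text.
# Placeholder values throughout; a real report defines these a priori.
INCLUDED_DRUGS = {"drug_a", "drug_b", "drug_c"}
ELIGIBLE_DESIGNS = {"rct", "systematic_review"}  # effectiveness outcomes

def is_eligible(study: dict) -> bool:
    """A study must satisfy every criterion; any failure excludes it."""
    return (
        study["design"] in ELIGIBLE_DESIGNS
        and study["population_matches"]
        and study["outcome_matches"]
        and len(INCLUDED_DRUGS & set(study["drugs"])) >= 1
    )

def is_head_to_head(study: dict) -> bool:
    """Direct comparison: at least 2 included drugs compared."""
    return is_eligible(study) and len(INCLUDED_DRUGS & set(study["drugs"])) >= 2

trial = {"design": "rct", "population_matches": True,
         "outcome_matches": True, "drugs": ["drug_a", "drug_b"]}
print(is_head_to_head(trial))  # True: direct comparison of 2 included drugs
```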
However, under the tenets of a best evidence approach, inclusion criteria for Drug
Effectiveness Review Project reports are written to allow inclusion of placebo-controlled trials,
active-control trials, single-arm cohort or open-label extension studies, and meta-analyses not
based on results of a systematic review (“pooled analyses”) when necessary to fill gaps in
evidence in instances when direct comparisons between drugs have not been made. This may be
extended to include situations where direct comparisons are available, but these studies do not
report outcomes important to the Drug Effectiveness Review Project Participating Organizations.
Determining the eligibility of good-quality systematic reviews can also be complicated and
depends on how similar the scope of the review is to the scope of the Drug Effectiveness Review
Project report, and how recent the evidence included in the review is.
Evidence not meeting study design inclusion criteria may be included in the report if the
evidence is clearly identified as not meeting the criteria but is being included as a matter of
record. Examples of such information are US Food and Drug Administration MedWatch reports
and case series, especially those leading to black box warnings in the product label. These can be
reported in the introduction/background section or in the section discussing evidence on adverse
events.
Cut-off date for new drug inclusion
If a new drug is introduced to the market, the last date of inclusion of that drug in the report is 15
calendar days subsequent to the date the dossiers are due to be submitted by the pharmaceutical
manufacturers. Additionally, if the drug has not been approved at the time initial dossier
solicitations are sent out, the manufacturer must notify the Center for Evidence-based Policy of
the pending approval date and intent to submit a dossier prior to the dossier submission deadline.
This ensures that every pharmaceutical manufacturer is given a fair chance to submit dossiers related to the newly approved drug.
Inclusion of active-control and placebo-controlled trials
In Drug Effectiveness Review Project reviews, good-quality, randomized controlled trials that
directly compare different drugs (head-to-head trials) provide the most valid evidence of their
comparative effectiveness. However, Drug Effectiveness Review Project reviewers often face
instances when direct comparisons between one or more included drugs have not been studied.
Or, even when direct comparative evidence is available, it may be limited in quality, quantity,
clinical impact, generalizability and/or other important elements.
Limitations in quality of direct comparative evidence are determined based on objective
assessment of internal validity using predefined criteria. Limitations in quantity of direct
comparative evidence are determined based on consideration of the adequacy of the number of
studies and subject sample sizes. Regarding clinical impact, a common limitation found in
clinical trials in general is their under-reporting of important health outcomes such as quality of
life and functional capacity. Likewise, regarding generalizability, clinical trials are often
criticized overall for using narrowly defined populations and for their under-reporting of
treatment outcomes in subgroups based on age, sex, race, and common comorbidities.
In cases where such gaps in direct comparative evidence exist, trials that compare
included drugs to placebo are considered for their usefulness in providing a source for qualitative
or quantitative indirect comparisons of effectiveness and harms. Inclusion criteria for Drug
Effectiveness Review Project reviews are written to allow inclusion of placebo-controlled trials
to fill gaps in direct evidence. Judgments regarding what constitutes a gap are determined on a
case-by-case basis, but are based on principles of the strength of evidence, including the risk of
bias, consistency, precision, directness, and applicability of the direct evidence.(1) A description
of the rationale for judgments regarding sufficiency of head-to-head trial evidence and utilization
of placebo-controlled trials are provided in each Drug Effectiveness Review Project report. In
updates, as new head-to-head trials emerge that correspond to previously-defined gaps in
evidence, reviewers should consider removing placebo-controlled trials that may no longer be
useful and should revise the description of their rationale accordingly.
As with head-to-head trials, the quality of any placebo-controlled trials that contribute data to the synthesis is assessed using the same standardized criteria, and their data are abstracted into evidence tables. On the other hand, for areas of a Drug Effectiveness Review Project review where direct comparative evidence is deemed sufficient, evidence from placebo-controlled trials is not included, and data abstraction and quality assessment of those trials are not required.
The method of synthesizing evidence from placebo-controlled trials is determined on a case-by-case basis. Pursuit of qualitative or quantitative indirect comparison is never required and
decisions to do so must depend on consideration of clinical, methodological, and statistical
heterogeneity levels across the individual studies. Guidance on methods for quantitative indirect
synthesis can be found elsewhere.(2) In many cases, when excess heterogeneity is present, a
general discussion of findings from placebo-controlled trials can be useful for identifying which
individual drugs have any evidence of effect in the gap areas compared with those that do not
even have basic efficacy data.
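The cited methods guidance is not reproduced here, but as one concrete illustration of quantitative indirect synthesis, a widely used approach (the Bucher adjusted indirect comparison, offered only as a sketch of the general idea) contrasts two drugs through their common placebo comparator. The trial results below are hypothetical:

```python
import math

def bucher_indirect(log_or_a, se_a, log_or_b, se_b):
    """Adjusted indirect comparison of drug A vs. drug B via a common
    comparator (e.g., placebo): the difference of the two log odds
    ratios, with standard errors combined assuming independence."""
    log_or_ab = log_or_a - log_or_b
    se_ab = math.sqrt(se_a ** 2 + se_b ** 2)
    return log_or_ab, se_ab

# Hypothetical results: A vs. placebo (-0.7, SE 0.2), B vs. placebo (-0.4, SE 0.3)
log_or_ab, se_ab = bucher_indirect(-0.7, 0.2, -0.4, 0.3)
print(round(log_or_ab, 3), round(se_ab, 3))
```

Note the combined standard error is larger than either input, one reason indirect comparisons are weaker evidence than head-to-head trials.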
Trials that compare one of the drugs included in the review against a drug that is not
included are called “active-control” in the Drug Effectiveness Review Project. This is to
differentiate from trials with direct comparisons among the included drugs. These studies are
included only in specific, infrequent situations. Where there is no direct evidence, and no or
very limited evidence from placebo-controlled trials, evidence from active-control trials may be
relevant. However, such indirect comparisons are typically only useful when the comparator (the
active-control drug) is the same across the included studies. If there is significant heterogeneity
in the comparators such studies are unlikely to provide good indirect evidence for comparing one
included drug to another.
Pooled analyses
A pooled analysis is a meta-analysis of a group of highly selected studies. Typically there is no search strategy to identify the articles (or at most a noncomprehensive one) and no quality assessment of the included trials. Pooled analyses are not systematic in nature.
However, there are limited situations where this level of evidence may be useful and admissible
in a Drug Effectiveness Review Project report.
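For context, the arithmetic underlying such a meta-analysis is typically inverse-variance weighting of the selected studies' effect estimates. The sketch below uses hypothetical numbers and, deliberately, says nothing about whether the underlying study selection was systematic, which is precisely the limitation of a pooled analysis:

```python
import math

def inverse_variance_pool(estimates):
    """Fixed-effect inverse-variance pooling of (effect, standard error)
    pairs: each study is weighted by 1/SE^2. Illustrative only; the
    calculation cannot reveal how the studies were selected."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical mean differences with standard errors from three trials
pooled, pooled_se = inverse_variance_pool([(0.5, 0.1), (0.3, 0.2), (0.4, 0.1)])
print(round(pooled, 3), round(pooled_se, 3))
```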
Similar to the use of placebo-controlled trial evidence, pooled analyses are used to provide evidence where evidence is otherwise absent or insufficient. For example, a pooled analysis may present data on subgroups that are not obtainable from the primary sources, or may supplement information on outcomes not reported in the primary sources. In these cases, the
primary studies have been published or are available to the Drug Effectiveness Review Project
authors in sufficient detail to assess the quality of the study.
However, pooled analyses of results already available to the Drug Effectiveness Review
Project authors from primary sources are not included. Because pooled analyses do not follow
a systematic approach to identifying and assessing the studies, Drug Effectiveness Review
Project authors undertake an independent analysis of these studies. If the pooled analysis is the
only source of data from component studies (e.g. results of the primary studies included in the
meta-analysis are not published), it can be included at the Drug Effectiveness Review Project
report author’s discretion, but the limitations are made clear – primarily the limited ability to
assess the quality of the component studies.
Systematic reviews
As part of a high-quality approach to evidence review, existing systematic reviews are
considered for inclusion in Drug Effectiveness Review Project reports along with other types of
evidence. The intention is to include reviews that directly address the Key Questions posed in the
report and that meet minimum standards for quality.
In order for a review to be considered for inclusion into a Drug Effectiveness Review
Project report, the review must meet at least 2 criteria that indicate it is “systematic”. First, the
review must include a comprehensive search method for the evidence. This entails searching
multiple sources of information (electronic databases, reference lists, hand searching journals,
etc). Second, the review must provide (or at least describe) the search terms used to retrieve the
evidence. Other information such as the reporting of study eligibility criteria, quality assessment,
et cetera are not required to determine if a review qualifies as being “systematic”, although this
information is useful. The review also must address questions that are similar enough to those
posed in the report to provide useful information to the report readers. Reviews that evaluate
“class effect” of drugs grouped together compared with other interventions are unlikely to be
useful in a Drug Effectiveness Review Project report. Moreover, reviews that compare a small
proportion of drugs in a large class may not be useful. An example would be if 2 of 7 drugs were
reviewed. The report authors determine the usefulness of the review in the larger context of the
drug class.
Additional inclusion criteria are determined based on what is known about the underlying
evidence base. For example, in an area where there are many existing reviews over many years,
the authors may choose a cut-off date to examine only the most recent reviews for inclusion,
such as within 2 years of the Drug Effectiveness Review Project search dates. In other areas, this
may not be a reliable approach if the underlying literature is older and has not changed in recent
years.
Single-arm studies: Cohort or open-label extension of a trial
There are 2 types of studies considered here: (1) observational studies, with no comparison group relevant to the review, of patients receiving a drug included in the Drug Effectiveness Review Project report; and (2) open-label extension studies of a randomized controlled trial.
Single-group studies are included under the “best evidence” approach only if the study
adds important evidence on harms that is not available from other, higher quality, studies. This
means that the study must have exposure duration longer than the trials included and that no
comparative evidence is available. The minimum duration (e.g., 2 years of follow-up) is
determined a priori, based on current knowledge of a drug’s potential adverse events and
taking into account the existing evidence from trials.
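The “best evidence” rule for single-group studies amounts to a conjunction of conditions, which can be sketched as follows. This is an illustrative sketch only; the function and parameter names, and the 2-year default, are invented for the example rather than taken from Project procedures.

```python
# Illustrative sketch of the inclusion rule for single-group studies:
# include only when the study adds harms evidence not available from
# higher quality studies. All names and the 2-year default are invented.

def include_single_arm_study(exposure_years: float,
                             longest_trial_years: float,
                             comparative_evidence_available: bool,
                             minimum_years: float = 2.0) -> bool:
    """Apply the three conditions described in the text."""
    return (exposure_years >= minimum_years           # meets a priori minimum duration
            and exposure_years > longest_trial_years  # longer exposure than included trials
            and not comparative_evidence_available)   # no comparative evidence on the harm

print(include_single_arm_study(3.0, 1.0, False))  # True
print(include_single_arm_study(3.0, 1.0, True))   # False: comparative data exist
```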
Open-label extension studies carry important caveats: the study population is drawn from a
clinical trial, where populations are typically highly selected (a narrow set of inclusion
criteria), and patients continuing into the extension are often those who had an adequate
response and/or tolerated the drug during the trial period. There are no clearly agreed-upon
criteria for evaluating the quality of such studies.
Single-group cohort studies are evaluated under the same criteria used to evaluate the
quality of cohort studies with a comparison group. It can be difficult to determine the mean or
minimum duration of follow-up (exposure to the drug) in these studies, which may make them
less useful in evaluating longer-term harms. However, this type of study may include a more
broadly defined group of patients than those included in trials and could potentially increase the
applicability of this evidence.
Unpublished studies or data
Unpublished studies may be identified through pharmaceutical manufacturer dossiers, US Food
and Drug Administration documents, or trial registries. Pharmaceutical manufacturer dossiers
may also contain previously-unpublished supplemental data from published studies. Unpublished
data or studies cannot be submitted by pharmaceutical companies after the dossier process
timeline (e.g. not through the public comment process for draft reports).
Unpublished studies
Unpublished studies identified through pharmaceutical manufacturer dossiers, US Food and
Drug Administration documents, or trial registries are to be included only if the study meets the
inclusion criteria established in the key questions and sufficient detail is provided to assess the
study quality. At a minimum, information must be provided on the comparability of groups at
baseline, the number of patients analyzed, whether an intention-to-treat analysis was conducted,
and the type of statistical test used. If this information is not present in the dossier submission,
the study is not to be included.
Supplemental data
In situations where additional data are provided regarding a published study, such as additional
outcomes or subgroup data not included in the published manuscript, analyses of these data will
be included if details of the data analysis are provided. Specifically, the type of statistical test
used, numbers analyzed, and whether an intention-to-treat analysis was conducted must be
reported. In general, raw data will not be analyzed by the review team. Additional data on
subgroups will only be included for direct, head-to-head, comparisons of included drugs and it is
expected that any analyses conducted on the data will be adequately described for reviewer
evaluation. Study quality assessment will be based on the fully published study details.
Process for determining study eligibility
Overall, determination of study eligibility is based on reviewer judgment using a 2-step process.
In order to reduce potential bias, and ensure accuracy and reproducibility, all study reports
identified in searches are assessed for eligibility by at least 2 qualified reviewers (“dual review”)
and final selection decisions are made using a consensus process. Qualified reviewers are limited
to individuals with adequate training and experience to apply the inclusion criteria with
consistency and accuracy.
The first step of the study selection process involves assessment of titles and abstracts
identified through literature searches for preliminary determination of study eligibility. Only
study reports with titles and abstracts that are unequivocally ineligible are rejected at this stage.
For all other reports, the full-text articles are obtained and read in detail for the second step in
determination of eligibility.
Both steps in the study selection process should involve “dual review” and it is up to the
reviewers to decide which of 2 “dual review” options to use. If possible, it is desirable to
complete eligibility assessments for each report in duplicate by 2 independent reviewers.
However, it is also acceptable to have a first reviewer complete eligibility assessments and a
second reviewer check the accuracy of the first reviewer’s assessments. If 2 reviewers
are unable to agree, a third party, as senior reviewer, is consulted.
Results of eligibility assessments for all screened reports are displayed in a Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement-based
diagram.(3) The flow diagram depicts the flow of information through the different phases of a
systematic review. It maps out the number of records identified, included and excluded, and the
reasons for exclusions. Studies are excluded if they do not meet predetermined inclusion criteria
as defined in the Key Questions, for example when the population, intervention, comparator,
outcomes, or study designs do not meet eligibility requirements. Studies with results presented
only in an abstract of a conference proceeding are excluded and would be described as not
meeting study design criteria. Similarly, systematic reviews are excluded if they are outdated or
of poor quality. Other reasons for study exclusion are publication in a language other than
English, or failure to retrieve the article after all retrieval efforts were exhausted.
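As a rough illustration of the bookkeeping behind a PRISMA flow diagram, the sketch below reconciles counts through the screening stages; every number and exclusion tally here is invented for the example.

```python
# Invented counts illustrating how a PRISMA-style flow diagram reconciles:
# records identified -> screened -> full text assessed -> included.
records_identified = 1250
duplicates_removed = 250
title_abstract_screened = records_identified - duplicates_removed
title_abstract_excluded = 820          # unequivocally ineligible at step 1
full_text_assessed = title_abstract_screened - title_abstract_excluded
full_text_excluded = {                 # reasons mirror those listed in the text
    "population/intervention/comparator/outcome not eligible": 115,
    "conference abstract only": 25,
    "non-English language": 15,
    "article not retrievable": 5,
}
included = full_text_assessed - sum(full_text_excluded.values())
print(full_text_assessed, included)    # 180 20
```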
Finally, for reader convenience, all Drug Effectiveness Review Project reports contain an
Appendix that lists reasons for exclusion for all individual trials that were excluded at least at the
full-text level. For updates, this list is a cumulative list of trials but within a 3-5-page limit. If it
exceeds that page limit, the excluded trials list consists of trials for that specific update only and
readers are directed to the older versions of the reports available on the Drug Effectiveness
Review Project Website for trials excluded previously.
Quality Assessment of Individual Studies
For trials, we assess internal validity (quality) based on the predefined criteria. These criteria are
based on those used by the US Preventive Services Task Force and the National Health Service
Centre for Reviews and Dissemination (United Kingdom).(4, 5) We rate the internal validity of
each trial based on the methods used for randomization, allocation concealment, and blinding;
the similarity of compared groups at baseline; maintenance of comparable groups; adequate
reporting of dropouts, attrition, crossover, adherence, and contamination; loss to follow-up; and
the use of intention-to-treat analysis. Trials with a fatal flaw are rated poor quality; trials that
meet all criteria are rated good quality; the remainder are rated fair quality. As the fair-quality
category is broad, studies with this rating vary in their strengths and weaknesses: The results of
some fair-quality studies are likely to be valid, while others are only possibly valid. A poor-
quality trial is not valid; the results are at least as likely to reflect flaws in the study design as a
true difference between the compared drugs. A fatal flaw is reflected by failure to meet
combinations of items of the quality assessment checklist. A particular randomized trial might
receive 2 different ratings: one for effectiveness and another for adverse events. More detailed
descriptions of how each checklist item is assessed are presented in Table 1, below.
Table 1. Drug Effectiveness Review Project guidelines to assess quality of trials
1. Was the assignment to the treatment groups really random?
Yes: Use of the term “randomized” alone is not sufficient for a judgment of “Yes”. Explicit
description of the method for sequence generation must be provided. Adequate approaches
include computer-generated random numbers and random numbers tables.
No: Randomization was either not attempted or was based on an inferior approach (e.g.,
alternation, case record number, birth date, or day of week).
Unclear: Insufficient detail provided to make a judgment of yes or no.
2. Was the treatment allocation concealed?
Yes: Adequate approaches to concealment of randomization include centralized or
pharmacy-controlled randomization, serially-numbered identical containers, and an on-site
computer-based system with a randomization sequence that is not readable until allocation.
Note: If a trial did not use adequate allocation concealment methods, the highest rating it can
receive is “Fair”.
No: Inferior approaches to concealment of randomization include alternation, case record
number, birth date, or day of week; open random numbers lists; and serially numbered envelopes.
Unclear: No details about allocation methods. A statement that “allocation was concealed” is not
sufficient; details must be provided.
3. Were groups similar at baseline in terms of prognostic factors?
Yes: Parallel design: no clinically important differences. Crossover design: comparison of
baseline characteristics must be made based on order of randomization. Prognostic factors are
important to consider and are discussed a priori with clinical advisory groups. A statistically
significant difference does not automatically constitute a clinically important difference.
No: Clinically important differences.
Unclear: Parallel design: statement of “no differences at baseline” but data not reported, data
not reported by group, or no mention at all of baseline characteristics. Crossover design: only
baseline characteristics of the overall group reported.
4. Were eligibility criteria specified?
Yes: Eligibility criteria were specified a priori.
No: Criteria not reported, or description of enrolled patients only.
5. Were outcome assessors blinded to treatment allocation?
6. Was the care provider blinded?
7. Was the patient blinded?
Yes: Explicit statement(s) that outcome assessors/care provider/patient were blinded.
Double-dummy studies and use of identically appearing treatments are also considered
sufficient blinding methods for patients and care providers.
No: No blinding used; open-label.
Unclear, described as double-blind: Study described as double-blind but no details provided on
how blinding was carried out or who was specifically blinded.
Not reported: No information about blinding.
8. Did the article include an intention-to-treat analysis or provide the data needed to calculate it
(that is, number assigned to each group, number of subjects who finished in each group, and their results)?
Yes: All patients that were randomized were included in the analysis. Imputation methods (e.g.,
last observation carried forward) should be clearly described. OR: Exclusion of 5% of patients or
less is acceptable, given that the reasons for exclusion are not related to outcome (e.g., did not
take study medication) and that the exclusions would not be expected to have an important
impact on the effect size.
No: Exclusion of greater than 5% of patients from analysis, OR exclusion of less than 5% with
reasons that may affect the outcome (e.g., adverse events, lack of efficacy) or reasons that may
be due to bias (e.g., investigator decision).
Unclear: Numbers analyzed are not reported.
9. Did the study maintain comparable groups?
Yes: No attrition, OR the groups analyzed remained similar in terms of their baseline prognostic
factors.
No: Groups analyzed had clinically important differences in important baseline prognostic
factors.
Unclear: There was attrition, but insufficient information to determine whether the groups
analyzed had clinically important differences in important baseline prognostic factors.
10. Were levels of crossovers (≤ 5%), adherence (≤ 20%), and contamination (≤ 5%) acceptable?
Yes: Levels of crossovers, adherence, and contamination were below the specified cut-offs.
No: Levels of crossovers, adherence, and contamination were above the specified cut-offs.
Unclear: Insufficient information provided to determine the levels of crossovers, adherence,
and contamination.
11. Was the rate of overall attrition and the difference between groups in attrition within acceptable levels?
Overall attrition: There is no empirical evidence to support establishment of a specific level of attrition that is
universally considered “important”. The level of attrition considered important will vary by review and is
determined a priori by the review teams. Attrition refers to discontinuation for ANY reason, including loss to
follow-up, lack of efficacy, adverse events, investigator decision, protocol violation, consent withdrawal, etc.
Yes: The overall attrition rate was below the level that was established by the review team.
No: The overall attrition rate was above the level that was established by the review team.
Unclear: Insufficient information provided to determine the level of attrition.
Differential attrition
Yes: The absolute difference between groups in the rate of attrition was below 10%.
No: The difference between groups in the overall attrition rate, or in the rate of attrition for a
specific reason (e.g., adverse events, protocol violations, etc.), was 10% or more.
Unclear: Insufficient information provided to determine the level of attrition.
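The roll-up of Table 1 into a good/fair/poor rating, together with the 10% differential-attrition cut-off, can be sketched as below. This is a simplification for illustration only: the criterion names are invented, and the single fatal-flaw flag stands in for the judgment about combinations of checklist items described in the text.

```python
def differential_attrition_ok(attrition_a, attrition_b):
    """Absolute between-group difference in attrition must be below 10%."""
    return abs(attrition_a - attrition_b) < 0.10

def rate_trial(criteria, fatal_flaw):
    """criteria maps checklist item -> 'yes' / 'no' / 'unclear'."""
    if fatal_flaw:                 # fatal flaw -> poor quality
        return "poor"
    if all(answer == "yes" for answer in criteria.values()):
        return "good"              # all criteria met -> good quality
    return "fair"                  # everything in between -> fair quality

criteria = {
    "random_sequence_generation": "yes",
    "allocation_concealment": "yes",
    "groups_similar_at_baseline": "yes",
    "eligibility_criteria_specified": "yes",
    "blinding": "unclear",
    "intention_to_treat": "yes",
    "comparable_groups_maintained": "yes",
    "crossovers_adherence_contamination": "yes",
    "attrition_acceptable": "yes",
}
print(rate_trial(criteria, fatal_flaw=False))   # fair (one item is "unclear")
print(differential_attrition_ok(0.18, 0.05))    # False (13% difference)
```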
Systematic Reviews
Included systematic reviews are rated for quality based on a clear statement of the question(s)
the review is intended to answer; reporting of inclusion criteria; methods used for identifying
literature (the search strategy), validity assessment, and synthesis of evidence; and details
provided about included studies. Reviews are categorized as good when all criteria are met.
Because there are different methods available for assessing the quality of systematic reviews, and
none has become the standard, reviewers can use one of the following: AMSTAR,(6-8) Oxman
and Guyatt,(9, 10) or the Centre for Reviews and Dissemination criteria (below).(4)
1. Is there a clear review question and inclusion/exclusion criteria reported relating to the
primary studies?
a. A good-quality review should focus on a well-defined question or set of
questions, which ideally refers to the inclusion/exclusion criteria by which
decisions are made on whether to include or exclude primary studies. The criteria
should relate to the 4 components of study design: indications (patient
populations), interventions (drugs), outcomes of interest, and study designs. In addition, details
are reported relating to the process of decision-making, i.e., how many reviewers
were involved, whether the studies were examined independently, and how
disagreements between reviewers were resolved.
2. Is there evidence of a substantial effort to search for all relevant research?
a. This is usually the case if details of electronic database searches and other
identification strategies are given. Details of the search terms used, date, and
language restrictions are presented. In addition, descriptions of hand-searching,
attempts to identify unpublished material, and any contact with authors, industry,
and research institutes are provided. The appropriateness of the database(s)
searched by the authors should also be considered, for example if MEDLINE is
searched for a review looking at health education, then it is unlikely that all
relevant studies have been located.
3. Is the validity of included studies adequately assessed?
a. A systematic assessment of the quality of primary studies should include an
explanation of the criteria used (e.g., method of randomization, whether outcome
assessment was blinded, whether analysis was on an intention-to-treat basis).
Authors may use either a published checklist or scale, or one that they have
designed specifically for their review. Again, the process relating to the
assessment is reported (i.e. how many reviewers involved, whether the assessment
was independent, and how discrepancies between reviewers were resolved).
4. Is sufficient detail of the individual studies presented?
a. The review should demonstrate that the studies included are suitable to answer the
question posed and that a judgment on the appropriateness of the authors'
conclusions can be made. If a paper includes a table giving information on the
design and results of the individual studies, or includes a narrative description of
the studies within the text, this criterion is usually fulfilled. If relevant, the tables
or text should include information on study design, sample size in each study
group, patient characteristics, description of interventions, settings, outcome
measures, follow-up, drop-out rate (withdrawals), effectiveness results, and
adverse events.
5. Are the primary studies summarized appropriately?
a. The authors should attempt to synthesize the results from individual studies. In all
cases, there should be a narrative summary of results, which may or may not be
accompanied by a quantitative summary (meta-analysis).
For reviews that use a meta-analysis, heterogeneity between studies should
be assessed using statistical techniques. If heterogeneity is present, the possible
reasons (including chance) should be investigated. In addition, the individual
evaluations should be weighted in some way (e.g., according to sample size or
inverse of the variance) so that studies that are considered to provide the most
reliable data have greater impact on the summary statistic.
Data Synthesis
For the Drug Effectiveness Review Project, evidence tables showing the study characteristics,
quality ratings, and results for all included studies are constructed. Studies are reviewed using a
hierarchy of evidence approach, where the best evidence is the focus of our synthesis for each
question, population, intervention, and outcome addressed. Studies that evaluate one drug against
another provide direct evidence of comparative effectiveness and harms. Where possible, these
data are the primary focus. Direct comparisons are preferred over indirect comparisons;
similarly, effectiveness and long-term or serious adverse event outcomes are preferred to
efficacy and short-term tolerability outcomes.
In theory, trials that compare a drug with other drug classes or with placebo can also
provide evidence about effectiveness. This is known as an indirect comparison and can be
difficult to interpret for a number of reasons, primarily heterogeneity of trial populations,
interventions, and outcomes assessment across the studies. Data from indirect comparisons are
used to support direct comparisons, where they exist, and are used as the primary comparison
where no direct comparisons exist. Indirect comparisons are interpreted with caution.
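One common way to perform an adjusted indirect comparison is the Bucher method, sketched below; the reports themselves leave the choice of method to experienced statisticians, so this is illustrative only. Two placebo-controlled estimates are combined on the log scale, and every number in the example is invented.

```python
import math

def bucher_indirect(log_or_a, se_a, log_or_b, se_b):
    """Adjusted indirect comparison of A vs. B via a common comparator:
    difference of log odds ratios, with the variances added."""
    return log_or_a - log_or_b, math.sqrt(se_a ** 2 + se_b ** 2)

# Drug A vs. placebo: OR 0.60 (SE 0.15); drug B vs. placebo: OR 0.80 (SE 0.20)
log_or_ab, se_ab = bucher_indirect(math.log(0.60), 0.15, math.log(0.80), 0.20)
ci_low = math.exp(log_or_ab - 1.96 * se_ab)
ci_high = math.exp(log_or_ab + 1.96 * se_ab)
print(f"A vs. B: OR {math.exp(log_or_ab):.2f} "
      f"(95% CI {ci_low:.2f} to {ci_high:.2f})")
# → A vs. B: OR 0.75 (95% CI 0.46 to 1.22)
```

Note that the indirect standard error (0.25) exceeds either direct standard error, which is one reason indirect comparisons are interpreted with caution.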
Quantitative analyses are conducted using meta-analyses of outcomes reported by a
sufficient number of studies that are homogeneous enough that combining their results could be
justified. In general, the Drug Effectiveness Review Project follows the guidance on meta-
analysis put forth for Evidence-based Practice Centers in the Evidence-based Practice Center
Methods Guide.(2) In order to determine whether meta-analysis can be meaningfully performed,
we consider the quality of the studies and the heterogeneity among studies in design, patient
population, interventions, and outcomes. The Q statistic and the I² statistic (the proportion of
variation in study estimates due to heterogeneity) are calculated to assess statistical heterogeneity
between studies.(11, 12) If significant heterogeneity is shown, potential sources are then
examined by analysis of subgroups of study design, study quality, patient population, and
variation in interventions. Meta-regression models may be used to formally test for differences
between subgroups with respect to outcomes.(13, 14) Random effects models to estimate pooled
effects are preferred in most cases, unless a case can be made that a fixed effect model is more
appropriate. Other analyses, including adjusted indirect meta-analysis and mixed treatment effect
model (network meta-analysis) are done in consultation with statisticians experienced in
conducting these analyses and using the most up-to-date and appropriate methods. When it is
determined that it is unwise to pool data from a group of studies, the data are summarized
qualitatively.
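To make the heterogeneity statistics concrete, the sketch below computes Q, I², and a DerSimonian-Laird random-effects pooled estimate, one classic estimator; the reports do not prescribe a specific one, and all study data here are invented.

```python
import math

def q_statistic(effects, variances):
    """Cochran's Q: inverse-variance weighted squared deviations from the pooled mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))

def i_squared(q, n_studies):
    """Proportion of variation in study estimates that is due to heterogeneity."""
    df = n_studies - 1
    return max(0.0, (q - df) / q) if q > 0 else 0.0

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate and standard error (DerSimonian-Laird)."""
    weights = [1.0 / v for v in variances]
    q, df = q_statistic(effects, variances), len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)                  # between-study variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    return pooled, math.sqrt(1.0 / sum(re_weights))

effects = [-0.80, 0.10, -0.55, -0.05]   # log odds ratios from 4 invented trials
variances = [0.04, 0.06, 0.09, 0.05]
q = q_statistic(effects, variances)
print(f"Q = {q:.2f}, I2 = {100 * i_squared(q, len(effects)):.0f}%")  # Q = 10.68, I2 = 72%
pooled, se = dersimonian_laird(effects, variances)
print(f"Pooled log OR = {pooled:.3f} (SE {se:.3f})")
```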
When synthesizing unpublished evidence, reviewers will conduct sensitivity analyses
where possible to determine if there is an indication of bias when unpublished data are included.
The source of unpublished information will be clearly noted in the text of the report, with a
statement that the data are unpublished, have not undergone a medical journal’s peer review
process, and should be interpreted cautiously.
Applicability
An assessment of applicability is undertaken in Drug Effectiveness Review Project reports. The
applicability assessment is tailored to the Key Questions, and if possible the population to whom
the questions are intended to apply. These are defined in advance with the help of the Clinical
Advisory Group and the Drug Effectiveness Review Project Participating Organizations. A
discussion of applicability appears immediately before the Summary Table in Drug Effectiveness
Review Project reports.
Grading the Strength of the Overall Body of Evidence
Strength of evidence is assessed based on the main outcomes for each Key Question, as
determined by the Participating Organizations and with input from the Clinical Advisory Group,
and generally follows the method used by the Evidence-based Practice Center program.(1)
Individual lead investigators may choose instead to use the GRADE approach to grading the
strength of evidence if they are more familiar with that system.(15-17) In either system, the main
domains considered in assessing the strength of a body of evidence for a given outcome
are: risk of bias of the included studies, directness of the studies in measuring the outcome and
comparison in question, and the consistency and precision of the results of the studies. Poor-
quality studies do not contribute to the assessment of overall risk of bias for a body of evidence
because their results are not synthesized with the fair and good quality study results. After
assessing each of these items for the group of studies, an overall assessment is made. The
evidence can be described as low, moderate, or high strength of evidence. In addition, when
there is either no evidence available or the evidence is too limited or too indirect to make
conclusions about comparative effectiveness, the evidence can be described as insufficient to
determine the strength of evidence. The Evidence-based Practice Center definitions of these
terms are listed in Table 2, below. The tables used to assess each outcome are presented in
an Appendix. A summary grade of the strength of evidence can be included in the summary table
(above). A paragraph describing the strength of evidence of each main outcome can also be used.
Tables showing the grading of individual domains for each outcome assessed are included as an
appendix in Drug Effectiveness Review Project reports; overall assessments (low, moderate,
high, or insufficient strength of evidence) are included as part of the Summary Table found at the
end of each report.
Table 2. Strength of evidence grades and definitions
High: High confidence that the evidence reflects the true effect. Further research is very
unlikely to change our confidence in the estimate of effect.
Moderate: Moderate confidence that the evidence reflects the true effect. Further research may
change our confidence in the estimate of effect and may change the estimate.
Low: Low confidence that the evidence reflects the true effect. Further research is likely to
change our confidence in the estimate of effect and is likely to change the estimate.
Insufficient: Evidence either is unavailable or does not permit estimation of an effect.
Source: Owens et al, 2009(18)
Summary Table