Natural products in drug discovery: advances and opportunities - Nature

Page created by Nathan Erickson
 
CONTINUE READING
Reviews

                                   Natural products in drug discovery:
                                   advances and opportunities
                                   Atanas G. Atanasov 1,2,3,4 ✉, Sergey B. Zotchev2, Verena M. Dirsch                         2
                                                                                                                               , the International
                                   Natural Product Sciences Taskforce* and Claudiu T. Supuran 5 ✉
                                   Abstract | Natural products and their structural analogues have historically made a major
                                   contribution to pharmacotherapy, especially for cancer and infectious diseases. Nevertheless,
                                   natural products also present challenges for drug discovery, such as technical barriers to screening,
                                   isolation, characterization and optimization, which contributed to a decline in their pursuit by
                                   the pharmaceutical industry from the 1990s onwards. In recent years, several technological and
                                   scientific developments — including improved analytical tools, genome mining and engineering
                                   strategies, and microbial culturing advances — are addressing such challenges and opening up
                                   new opportunities. Consequently, interest in natural products as drug leads is being revitalized,
                                   particularly for tackling antimicrobial resistance. Here, we summarize recent technological
                                   developments that are enabling natural product-​based drug discovery, highlight selected
                                   applications and discuss key opportunities.

sp3 carbon atoms                  Historically, natural products (NPs) have played a key       ‘bioactive’ compounds covering a wider area of chemical
Tetravalent carbon atoms          role in drug discovery, especially for cancer and infec-     space compared with typical synthetic small-​molecule
forming single covalent bonds     tious diseases1,2, but also in other therapeutic areas,      libraries13.
with other atoms within the       including cardiovascular diseases (for example, statins)         Despite these advantages and multiple successful
molecular structure. A higher
fraction of sp3 carbons within
                                  and multiple sclerosis (for example, fingolimod)3–5.         drug discovery examples, several drawbacks of NPs have
molecules is a descriptor that        NPs offer special features in comparison with con-       led pharmaceutical companies to reduce NP-​based drug
indicates more complex            ventional synthetic molecules, which confer both advan-      discovery programmes. NP screens typically involve a
3D structures.                    tages and challenges for the drug discovery process.         library of extracts from natural sources (Fig. 1), which
1
 Institute of Genetics and        NPs are characterized by enormous scaffold diversity         may not be compatible with traditional target-​based
Animal Biotechnology of the       and structural complexity. They typically have a higher      assays14. Identifying the bioactive compounds of inter-
Polish Academy of Sciences,       molecular mass, a larger number of sp3 carbon atoms and      est can be challenging, and dereplication tools have to
Jastrzebiec, Poland.
                                  oxygen atoms but fewer nitrogen and halogen atoms,           be applied to avoid rediscovery of known compounds.
2
 Department of
                                  higher numbers of H-​bond acceptors and donors, lower        Accessing sufficient biological material to isolate and
Pharmacognosy, University
of Vienna, Vienna, Austria.       calculated octanol–water partition coefficients (cLogP       characterize a bioactive NP may also be challenging15.
3
 Institute of Neurobiology,       values, indicating higher hydrophilicity) and greater        Furthermore, gaining intellectual property (IP) rights
Bulgarian Academy of              molecular rigidity compared with synthetic compound          for (unmodified) NPs exhibiting relevant bioactivities
Sciences, Sofia, Bulgaria.        libraries1,6–9. These differences can be advantageous; for   can be a hurdle, since naturally occurring compounds
4
 Ludwig Boltzmann Institute       example, the higher rigidity of NPs can be valuable in       in their original form may not always be patented (legal
for Digital Health and Patient    drug discovery tackling protein–protein interactions10.      frameworks vary between countries and are evolving)16,
Safety, Medical University
                                  Indeed, NPs are a major source of oral drugs ‘beyond         although simple derivatives can be patent-​protected
of Vienna, Vienna, Austria.
                                  Lipinski’s rule of five’11. The increasing significance of   (Box 1). An additional layer of complexity relates to the
5
 Università degli Studi di
Firenze, NEUROFARBA               drugs not conforming to this rule is illustrated by the      regulations defining the need for benefit sharing with
Dept, Sezione di Scienze          increase in molecular mass of approved oral drugs over       countries of origin of the biological material, framed
Farmaceutiche, Florence, Italy.   the past 20 years12. NPs are structurally ‘optimized’        in the United Nations 1992 Convention on Biological
*A full list of members and       by evolution to serve particular biological functions1,      Diversity and the Nagoya Protocol, which entered into
affiliations is presented         including the regulation of endogenous defence mech-         force in 2014 (ref.17), as well as recent developments con-
at the end of the paper.
                                  anisms and the interaction (often competition) with          cerning benefit sharing linked to use of marine genetic
✉e-​mail: a.atanasov.
                                  other organisms, which explains their high relevance for     resources18.
mailbox@gmail.com;
claudiu.supuran@unifi.it          infectious diseases and cancer. Furthermore, their use           Although the complexity of NP structures can be
https://doi.org/10.1038/          in traditional medicine may provide insights regarding       advantageous, the generation of structural analogues to
s41573-020-00114-​z               efficacy and safety. Overall, the NP pool is enriched with   explore structure–activity relationships and to optimize

200 | March 2021 | volume 20                                                                                                        www.nature.com/nrd
Reviews

                                        Problem 1                          Solutions for 1 and 2
                                        Not possible to culture the        • New methods for culturing
                                        organism out of the natural        • New methods for in situ
                                        habitat                              analysis                                     Problem 6            Solution for 6
                                                                                                                          Unknown molecular    New methods
                                        Problem 2                          Solution for 2                                 targets for NPs      for molecular
                                        Relevant NPs are not produced      New methods for NP synthesis                   identified with       mode of action
                                        when the organism is taken out     induction and heterologous                     phenotypic assays    elucidation
                                        of the natural habitat             expression of biosynthetic genes

                                                                                                           Multiple
                                                                                                           rounds of
                                                                                                           fractionation

                                                                                   Crude                                           Single
                                                            Organisms                                         Fractions
                                                                                   extracts                                        compounds

                                        Problem 3                          Solution for 3                  Bioactivity
                                        Presence of NPs that are known     New methods for                 evaluation
                                                                           dereplication
                                        Problem 4
                                        Presence of NPs that do not        Solution for 4 and 5
                                        have drug-like properties          New methods for
                                                                           extraction and
                                        Problem 5                          pre-fractionation
                                        Bioactive NPs are present in       of extracts
                                        insufficient amounts
Lipinski’s rule of five
This guideline for the likelihood
of a compound having oral
bioavailability is based on          Fig. 1 | Outline of traditional bioactivity-guided isolation steps in natural product drug discovery. Steps in the
several characteristics              process are shown in purple boxes, with associated key limitations shown in red boxes and advances that are helping to
containing the number 5.             address these limitations in modern natural product (NP)-​based drug discovery shown in green boxes. The process begins
It predicts that a molecule is       with extraction of NPs from organisms such as bacteria. The choice of extraction method determines which compound
likely to have poor absorption       classes will be present in the extract (for example, the use of more polar solvents will result in a higher abundance of polar
or permeation if it has more         compounds in the crude extract). To maximize the diversity of the extracted NPs, the biological material can be subjected
than one of the following            to extraction with several solvents of different polarity. Following the identification of a crude extract with promising
characteristics: there are
                                     pharmacological activity, the next step is its (often multiple) consecutive bioactivity-​guided fractionation until the pure
>5 H-​bond donors and
>10 H-​bond acceptors; the
                                     bioactive compounds are isolated. A key limitation for the potential of this approach to identify novel NPs is that many
molecular weight is >500;            potential source organisms cannot be cultured or stop producing relevant NPs when taken out of their natural habitat.
or the partition coefficient         These limitations are being addressed through development of new methods for culturing, for in situ analysis, for NP
LogP is >5. Notably, natural         synthesis induction and for heterologous expression of biosynthetic genes. At the crude extract step, challenges include
products were identified as          the presence in the extracts of NPs that are already known, NPs that do not have drug-​like properties or insufficient
common exceptions at the             amounts of NPs for characterization. These challenges can be addressed through the development of methods for
time of publication in 1997.         dereplication, extraction and pre-​fractionation of extracts. Finally, at the last stage, when bioactive compounds are
                                     identified by phenotypic assays, significant time and effort are typically needed to identify the affected molecular targets.
Dereplication
                                     This challenge can be addressed by the development of methods for accelerated elucidation of molecular modes of action,
Pharmacological screening of
natural product extracts yields
                                     such as the nematic protein organization technique (NPOT), drug affinity responsive target stability (DARTS), stable isotope
hits potentially containing          labelling with amino acids in cell culture and pulse proteolysis (SILAC-​PP), the cellular thermal shift assay (CETSA) and an
multiple natural products that       extension known as thermal proteome profiling (TPP), stability of proteins from rates of oxidation (SPROX), the similarity
need to be considered for            ensemble approach (SEA) and bioinformatics-​based analysis of connectivity (connectivity map, CMAP)23,189–192.
further study to identify
the bioactive compounds.             NP leads can be challenging, particularly if synthetic          areas: analytical techniques, genome mining and engi-
Dereplication is the process of
recognizing and excluding from
                                     routes are difficult. Also, NP-​b ased drug leads are           neering, and cultivation systems. In the concluding sec-
further study such hit mixtures      often identified by phenotypic assays, and deconvolu-           tion, we highlight promising future directions for NP
that contain already known           tion of their molecular mechanisms of action can be             drug discovery.
bioactive compounds.                 time-​consuming19. Fortunately, there have been sub-
                                     stantial advances20 both in the development of screening        Application of analytical techniques
Phenotypic assays
Assays that rely on the ability      assays (for example, harnessing the potential of induced        Classical NP-​based drug research starts with biological
of tested compounds to exert         pluripotent stem cells and gene editing technologies)           screening of ‘crude’ extracts to identify a bioactive ‘hit’
desired phenotypic changes           and in strategies to identify the modes of action of active     extract, which is further fractionated to isolate the active
in cells, isolated tissues, organs   compounds (reviewed previously21–23).                           NPs. Bioactivity-​guided isolation is a laborious process
or animals. They offer a
complementary strategy
                                         Here, we discuss recent technological and scien-            with a number of limitations, but various strategies and
to target-​based assays for          tific advances that may help to overcome challenges in          technologies can be used to address some of them (Fig. 2).
identifying new potential drugs.     NP-​based drug discovery, with an emphasis on three             For example, to create libraries that are compatible

NaTure RevIewS | DRug DiScOvERy                                                                                                   volume 20 | March 2021 | 201
Reviews

 Box 1 | Natural products that activate the KEAP1/NRF2 pathway                                       with high-​throughput screening, crude extracts can be
                                                                                                     pre-​fractionated into sub-​fractions that are more suit-
 An example of a pathway affected by diverse natural products (NPs) is the KEAP1/NRF2                able for automated liquid handling systems. In addi-
 pathway. This pathway regulates the expression of networks of genes encoding proteins               tion, fractionation methods can be adjusted so that
 with versatile cytoprotective functions and has essential roles in the maintenance
                                                                                                     sub-​fractions preferentially contain compounds with
 of redox and protein homeostasis, mitochondrial biogenesis and the resolution of
                                                                                                     drug-​like properties (typically moderate hydrophilicity).
 inflammation196–199.
   Activation of this pathway can protect against damage by most types of oxidants and               Such approaches can increase the number of hits com-
 pro-​inflammatory agents, and it restores redox and protein homeostasis200. The pathway             pared with using crude extracts, as well as enabling more
 has therefore attracted attention for the development of drugs for the prevention                   efficient follow-​up of promising hits24.
 and treatment of complex diseases, including neurological conditions such as                            Metabolomics was developed as an approach to
 relapsing–remitting multiple sclerosis201 and autism spectrum disorder202.                          simultaneously analyse multiple metabolites in biologi­
   Dimethyl fumarate (DMF), the methyl ester of the NP fumarate (a tricarboxylic acid                cal samples. Enabled by technological developments in
 (TCA) cycle intermediate that is found in both animals and plants), is one of the earliest          chromatography and spectrometry, metabolomics was
 discovered inducers of the KEAP1/NRF2 pathway203,204. The origins of the development of             historically applied first in other research fields, such as
 DMF as a drug date back to the use in traditional medicine of the plant Fumaria officinalis.
                                                                                                     biomedical and agricultural sciences2. Advances in the
 Initially, fumaric acid derivatives were used for the treatment of psoriasis as it was
                                                                                                     analytical instrumentation used in NP research25,26, cou-
 thought that psoriasis is caused by a metabolic deficiency in the TCA cycle that could
 be compensated for by repletion of fumarate205. Despite this erroneous assumption,                  pled with computational approaches that can generate
 DMF is effective in treating psoriasis, both topically and orally, and is the active principle      plausible NP analogue structures and their respective
 of Fumaderm, which has been used clinically for several decades in the treatment of plaque          simulated spectra27, have also enabled application of
 psoriasis in Germany. More recently, a DMF formulation developed by Biogen has been                 ‘omics’ approaches such as metabolomics in NP-​based
 tested in other immunological disorders, with successful phase III trials in multiple               drug discovery. Metabolomics can provide accurate infor-
 sclerosis206,207 leading to its approval by the FDA and EMA in 2013.                                mation on the metabolite composition in NP extracts,
   The isothiocyanate sulforaphane, isolated from broccoli (Brassica oleracea)208, is among          thus helping to prioritize NPs for isolation, to accelerate
 the most potent naturally occurring inducers of the KEAP1/NRF2 pathway209 and has                   dereplication28,29 and to annotate unknown analogues
 protective effects in animal models of Parkinson210, Huntington211 and Alzheimer212
                                                                                                     and new NP scaffolds. Moreover, metabolomics can
 diseases, traumatic brain injury213, spinal cord contusion injury214, stroke215, depression216
                                                                                                     detect differences between metabolite compositions in
 and multiple sclerosis217. Sulforaphane-​rich broccoli extract preparations are being
 developed as preventive interventions in areas of the world with unavoidable exposure               various physiological states of producing organisms and
 to environmental pollutants, such as China; the initial results of a randomized clinical trial      enable the generation of hypotheses to explain them,
 showed rapid and sustained, statistically significant increases in the levels of excretion          and can also provide extensive metabolite profiles to
 of the glutathione-​derived conjugates of benzene and acrolein218, and a follow-​up trial           underpin phenotypic characterization at the molecular
 (NCT02656420) also demonstrated dose–response-​dependent benzene detoxification219.                 level30. Both options are very useful in understanding the
 In a placebo-​controlled, double-​blind, randomized clinical trial in young individuals             molecular mechanisms of action of NPs.
 (age 13–27 years) with autism spectrum disorder, sulforaphane reversed many of the                      For metabolite profiling, NP extracts are analysed by
 clinical abnormalities202; these encouraging findings led to a recently completed clinical          NMR spectroscopy or high-​resolution mass spectrom-
 trial in children (age 3–12 years) (NCT02561481; results of the trial are not yet publicly
                                                                                                     etry (HRMS), or respective combined methods involv-
 available). An α-​cyclodextrin complex of sulforaphane known as SFX-01 (developed by
                                                                                                     ing upstream liquid chromatography (LC)31,32, such as
 Evgen Pharma) is being clinically studied for its potential to reverse resistance to endocrine
 therapies in patients with ER+HER2- metastatic breast cancer (phase II trial completed220)          LC–HRMS, which can separate numerous isomers pres-
 and in patients with subarachnoid haemorrhage (phase II trial NCT02614742 recently                  ent in NP extracts33. Moreover, such combined methods
 completed; results not yet publicly available). Currently, a clinical trial of SFX-01 in patients   might integrate HRMS and NMR, allowing the simulta-
 hospitalized with COVID-19 is in its final stages of preparation.                                   neous use of the advantages of both techiques34,35. NMR
   Finally, the pentacyclic triterpenoids bardoxolone methyl (also known as RTA 402) and             analysis of NP extracts is simple and reproducible, and
 omaveloxolone (RTA 408), which are semi-​synthetic derivatives of the NP oleanolic acid,            provides direct quantitative information and detailed
 are the most potent (active at nanomolar concentrations) activators of the KEAP1/NRF2               structural information, although it has relatively low sen-
 pathway known to date221. These compounds have shown protective effects in numerous                 sitivity, meaning that it generally enables profiling only
 animal models of chronic disease222, and are currently in clinical trials for a wide range
                                                                                                     of major constituents33. The applications of NMR in NP
 of indications, such as chronic kidney disease in type 2 diabetes, pulmonary arterial
                                                                                                     research are versatile36 and the technique is used both
 hypertension, melanoma, radiation dermatitis, ocular inflammation and Friedreich’s
 ataxia200. Most recently, bardoxolone methyl has entered a clinical trial in patients               directly for metabolomics of unfractionated NP extracts
 hospitalized with confirmed COVID-19 (NCT04494646).                                                 and for structural characterization of compounds and
                          O                                                                          fractions obtained with appropriate separation methods,
                                                                O
             O                                                                           S           most often LC. HRMS is the gold standard for qualita-
                              O                                 S                    C
                                                                                 N                   tive and quantitative metabolite profiling33 and is most
                 O
                                                                                                     commonly applied in combination with LC. HRMS can
           Dimethyl fumarate                                        Sulforaphane                     also be used in the direct infusion mode (called DIMS)37,
                                                                                                     whereby samples are directly profiled by MS without a
                                                                                                     chromatography step, or in MS imaging (MSI)38, which
                     O
                         HH
                                                                        O
                                                                            HH           O
                                                                                                     enables determination of the spatial distribution of NPs
                                      O
                                                                                                     within living organisms. HRMS enables routine acqui-
  N                                                  N                               N
                                                                                     H
                                                                                                     sition of accurate molecular mass information, which
                                                                                                 F
                                  O                                                          F       together with appropriate heuristic filtering can pro-
       O                                                 O                                           vide unambiguous assignment of molecular formulae
                 H                                                  H                                for hundreds to thousands of metabolites within a sin-
      Bardoxolone methyl (RTA 402)                           Omaveloxolone (RTA 408)                 gle extract over a dynamic range that may exceed five

202 | March 2021 | volume 20                                                                                                               www.nature.com/nrd
Reviews

                       a
                        Bacterial strains isolated from
                        marine sediment from the                     Image-based phenotypic                      LC–HRMS-based metabolomics
                        coastal areas of Panama were                 bioactivity profiling of the                 data also recorded with the 234
                        used for the preparation of                  234 NP extracts in HeLa cells               NP extracts
                        234 NP extracts

                                                O               OH
                                                                                       R
                                                                                                                 Integration and clustering of the
                                      Me                  N                                                      biological and chemical datasets
                                           N          N                                                          revealed 13 unique clusters
                                           H
                                                O
                                                Quinocinnolinomycin A R =

                                                Quinocinnolinomycin B R =

                                                Quinocinnolinomycin C R =                                        One of the clusters was
                                                                                                                 prioritized for further study, and
                                                                                                                 a combination of LC–MS and NMR
                                                Quinocinnolinomycin D R =                                        analysis led to the identification
                                                                                                                 of quinocinnolinomycins A–D

                       b
                        156 FACs generated containing                 Selected 56 FACs predicted
                                                                                                                 56 FACs transformed into the
                        unique BGCs from three                        to contain uncharacterized
                                                                                                                 heterologous expression host
                        species from the Aspergillus                  BGCs (i.e. BGCs with no
                                                                                                                 (A. nidulans) and NP extracts
                        genus                                         known product or well-
                                                                                                                 prepared
                                                                      characterized homologue)

                        15 new metabolites and their                  Metabolomics data analysed
                        BGCs were characterized through               with FAC-Score algorithm
                                                                                                                 NP extracts subjected to
                        combination of gene deletions                 that filters out signals present
                                                                                                                 untargeted LC–HRMS
                        within the BGCs and additional                in host extracts or in more
                        LC–MS and NMR analysis                        than one FAC strain

                       Fig. 2 | Applications of advanced analytical technologies empowering modern natural product-based drug discovery.
                       a | An illustrative example of the application of liquid chromatography–high-​resolution mass spectrometry (LC–HRMS)
                       metabolomics in the screening of natural product (NP) extracts is the work of Kurita et al.58, in which 234 bacterial extracts
                       were subjected to image-​based phenotypic bioactivity screening and LC–HRMS metabolomics. Clustering of the resulting
                       data allowed prioritization of promising extracts for further analysis, resulting in the discovery of the new NPs, quinocin-
                       nolinomycins A–D. b | Another illustrative example of LC–HRMS screening of NP extracts is the work of Clevenger et al.85,
                       who obtained novel NP extracts through heterologous expression of fungal artificial chromosomes (FACs) containing
                       uncharacterized biosynthetic gene clusters (BGCs) from diverse fungal species in Aspergillus nidulans. Analysis of the
                       LC–HRMS metabolomics data with a FAC-​Score algorithm directed the simultaneous discovery of 15 new NPs and
                       the characterization of their BGCs.

                       orders of magnitude31,39. However, challenges remain in             In this respect, the Global Natural Products Social
                       data mining and in the unambiguous identification of            (GNPS) molecular networking platform developed
                       the metabolites using various workflows relying on open         in the Dorrestein laboratory is an important addition
                       web-​based tools40.                                             to the toolbox41. Molecular networking organizes thou-
                           Dereplication of secondary metabolites in bioactive         sands of sets of MS/MS data recorded from a given set
                       extracts includes the determination of molecular mass           of extracts and visualizes the relationship of the ana-
                       and formula and cross-​searching in the literature or           lytes as clusters of structurally related molecules. This
                       structural NP databases with taxonomic information,             improves the efficiency of dereplication by enabling
                       which greatly assists the identification process. Such          annotation of isomers and analogues of a given metab-
                       metadata, which are difficult to query in the literature,       olite in a cluster42. The recorded experimental spectra
                       are often compiled in proprietary databases, such as the        can be searched against putative structures and their
                       Dictionary of Natural Products, which encompasses               corresponding predicted MS/MS spectra generated
                       all NP structures reported with links to their biological       by tools such as competitive fragmentation modelling
                       sources (see Related links). However, a comprehensive           (CFM-​ID)43. Based on such approaches, vast databases
                       experimental tandem mass spectrometry (MS/MS) data-             of theoretical NP spectra have been created and applied
                       base of all NPs reported to date does not exist, and a          in dereplication44. The GNPS molecular networking
                       search for experimental spectra across various platforms        approach has limitations, however, such as better appli-
                       is hindered by the lack of standardized collision energy        cability to some classes of NPs than others and the
                       conditions for fragmentation in LC–MS/MS25.                     uncertainty of structural assignment among possible

NaTure RevIewS | DRug DiScOvERy                                                                                    volume 20 | March 2021 | 203
Reviews

                                   predicted candidates. Efforts to address such issues are      of biosynthetically prolific bacteria, which enabled the
                                   ongoing45–47, including overlaying molecular networks of      identification of new bioactive polyketides59.
                                   large NP extract libraries with taxonomic information to          Finally, advances in analytical technologies con-
                                   improve the confidence of annotation48. Overall, molec-       tinue to support the rigorous structure determination
                                   ular networking mainly allows better prioritization           of NPs of interest. The progressive development of
                                   of the isolation of unknown compounds by strengthen-          higher-​field NMR instruments and probe technology60,61
                                   ing the dereplication process and elucidating relation-       has enabled NP structure determination from very small
                                   ships between NP analogues, and rigorous structure            quantities (below 10 µg)62,63, which is important, as the
                                   elucidation for NPs of interest should not be neglected.      available quantities of NPs are often limited. In addi-
                                       Another useful platform for metabolite identifica-        tion, microcrystal electron diffraction (MicroED) has
                                   tion is METLIN49, which includes a high-​resolution           recently emerged as a cryo-​electron microscopy-​based
                                   MS/MS database with a fragment similarity search              technique for unambiguous structure determination
                                   function that is useful for identification of unknown         of small molecules64 and is already finding important
                                   compounds. Other databases and in silico tools such           applications in NP research65. The increased resolution
                                   as Compound Structure Identification (CSI): FingerID          and sensitivity of analytical equipment can also help
                                   and Input Output Kernel Regression (IOKR) can be used         address problems associated with ‘residual complex-
                                   to search available fragment ion spectra, as well as to       ity’ of isolated NPs; that is when biologically potent
                                   generate predicted spectra of fragment ions not present       but unidentified impurities in an isolated NP sample
                                   in current databases50. A novel computational platform        (which could include structurally related metabolites or
                                   for predicting the structural identity of metabolites         conformers) lead to an incorrect assignment of structure
                                   derived from any identified compound has also been            and/or activity66,67. To avoid futile downstream develop-
                                   recently reported51, which should increase the searchable     ment efforts, Pauli and colleagues recommended that
                                   chemical space of NPs.                                        lead NPs should undergo advanced purity analysis at an
                                       To accelerate the identification of bioactive NPs in      early stage using quantitative NMR and LC–MS67.
                                   extracts, metabolomics data can be matched to the bio-
                                   logical activities of these extracts52. Various chemometric   Genome mining and engineering
                                   methods such as multivariate data analysis can correlate      Advances in knowledge on biosynthetic pathways for NPs
                                   the measured activity with signals in the NMR and MS          and in developing tools for analysing and manipulating
                                   spectra, enabling the active compounds to be traced in        genomes are further key drivers for modern NP-​based
                                   complex mixtures with no need for further bioassays53–55.     drug discovery. Two key characteristics enable the iden-
                                   Furthermore, several analytical modules involving dif-        tification of biosynthetic genes in the genomes of the
                                   ferent bioassays and detection technologies can be            producing organisms. First, these genes are clustered in
                                   linked to allow simultaneous bioactivity evaluation and       the genomes of bacteria and filamentous fungi. Second,
                                   identification of compounds present in small amounts          many NPs are based on polyketide or peptide cores,
                                   (analytical scale) in complex compound mixtures34,35.         and their biosynthetic pathways involve enzymes —
                                       Metabolomics data can be integrated with data             polyketide synthases (PKSs) and nonribosomal peptide
                                   obtained by other omics techniques such as transcrip-         synthetases (NRPSs), respectively — that are encoded by
                                   tomics and proteomics and/or with imaging-​b ased             large genes with highly conserved modules68.
                                   screens. For example, Acharya et al. used this approach           ‘Genome mining’ is based on searches for genes
                                   to characterize NP-​mediated interactions between a           that are likely to govern biosynthesis of scaffold struc-
                                   Micromonospora species and a Rhodococcus species56.           tures, and can be used to identify NP biosynthetic gene
                                   In another interesting example, Kurita et al. developed a     clusters69–71. Prioritization of gene clusters for further
                                   compound activity mapping platform for the prediction         work is facilitated by advances in biosynthetic know­
                                   of identities and mechanisms of action of constituents        ledge and predictive bioinformatics tools, which can pro-
                                   from complex NP extract libraries by integrating cyto-        vide hints about whether the metabolic products of the
                                   logical profiling57 with untargeted metabolomics data         clusters have chemical scaffolds that are new or known,
                                   from a library of extracts58, and identified quinocinno-      thereby supporting dereplication72,73. Such predictive
                                   linomycins as a new family of NPs causing endoplasmic         tools for gene cluster analysis can be applied in combi-
                                   reticulum stress58 (Fig. 2a).                                 nation with spectroscopic techniques to accelerate the
                                       Analytical advances that enable the profiling of          identification of NPs65 and determine the stereochem-
                                   responses to bioactive molecules at the single-​c ell         istry of metabolic products66. Furthermore, to extend
                                   level can also accelerate NP-​based drug discovery. Irish,    genome mining from a single genome to entire genera,
                                   Bachmann, Earl and colleagues developed a high-​              microbiomes or strain collections, computational tools
Phylogenomic approach              throughput platform for metabolomic profiling of bio-         have been developed, such as BiG-​SCAPE, which enables
The use of genomic data to
                                   activity by integrating phospho-​specific flow cytometry,     sequence similarity analysis of biosynthetic gene clusters,
reveal evolutionary
relationships. In the context of   single-​cell chemical biology and cellular barcoding with     and CORASON, which uses a phylogenomic approach
natural product drug discovery,    metabolomic arrays (characterized chromatographic             to elucidate evolutionary relationships between gene
the use of phylogenomics is        microtitre arrays originating from biological extracts)59.    clusters74.
based on the assumption that       Using this platform, the authors studied the single-​cell         Phylogenetic studies of known groups of talented
organisms that have closer
evolutionary relationships are
                                   responses of bone marrow biopsy samples from patients         secondary metabolite producers can also empower
more likely to produce similar     with acute myeloid leukaemia following exposure to            discovery of novel NPs. Recently, a study comparing
natural products.                  microbial metabolomic arrays obtained from extracts           secondary metabolite profiles and phylogenetic data in

204 | March 2021 | volume 20                                                                                                          www.nature.com/nrd
Reviews

Taxonomic distance                myxobacteria demonstrated a correlation between the          To transfer a biosynthetic gene cluster of such size, a plat-
The distance of compared taxa     taxonomic distance and the production of distinct sec-       form based on transformation-​associated recombination
on a constructed phylogenetic     ondary metabolite families75. In filamentous fungi, it was   (TAR) cloning was developed. This platform enables
tree (also known as an            likewise shown that secondary metabolite profiles are        direct cloning and manipulation of large biosynthetic
evolutionary tree). Closer
distance of compared taxa
                                  closely correlated with their phylogeny76. These organ-      gene clusters in S. cerevisiae, maintenance and manipula-
indicates a closer evolutionary   isms are rich in secondary metabolites, as demonstrated      tion of the vector in E. coli, and heterologous expression
relationship.                     by LC–MS studies of their extracts under laboratory          of the cloned gene clusters in Actinobacteria (such as
                                  conditions77. Concurrent genomic and phylogenomic            S. coelicolor) following chromosomal integration88, and
                                  analyses implied that even the genomes of well-​studied      is an alternative to BACs for heterologous expression of
                                  organism groups harbour many gene clusters for sec-          large biosynthetic gene clusters.
                                  ondary metabolite biosynthesis with as yet unknown               Heterologous expression has limitations, such as
                                  functions78. The phylogeny of biosynthetic gene clusters,    the need to clone and manipulate very large genome
                                  together with analysis of the absence of known resistance    regions occupied by biosynthetic gene clusters and the
                                  determinants, was recently used to prioritize members of     difficulty of identifying a suitable host that provides all
                                  the glycopeptide antibiotic family that could have novel     conditions necessary for the production of the corre-
                                  activities. This led to the identification of the known      sponding NPs. These limitations can be circumvented
                                  antibiotic complestatin and the newly discovered cor-        by activating biosynthetic gene clusters directly in the
                                  bomycin as compounds that act through a previously           native microorganism through targeted genetic manip-
                                  uncharacterized mechanism involving inhibition of            ulations, generally involving the insertion of activating
                                  peptidoglycan remodelling79.                                 regulatory elements or deletion of inhibitory elements
                                      Many microorganisms cannot be cultured, or tools         such as repressors or their binding sites. For example,
                                  for their genetic manipulation are not sufficiently          a derepression strategy of deleting gbnR, a gene for a
                                  developed, which makes it more challenging to access         transcriptional repressor in Streptomyces venezuelae
                                  their NP-​producing potential. However, biosynthetic         ATCC 10712 was used by Sidda et al. in the discovery of
                                  gene clusters for NPs can be cloned and heterologously       gaburedins, a family of γ-​aminobutyrate-​derived ureas89.
                                  expressed in organisms that are well-​characterized          An example of the activator-​based strategy is the consti-
                                  and easier to culture and to genetically manipulate          tutive expression of the samR0484 gene in Streptomyces
                                  (such as Streptomyces coelicolor, Escherichia coli and       ambofaciens ATCC 23877, which led to the discovery
                                  Saccharomyces cerevisiae) 80. The aim is to achieve          of stambomycins A–D, 51-​membered cytotoxic glyco-
                                  higher production titres in the heterologous hosts           sylated macrolides72. Alternatively, silent biosynthetic
                                  than in wild-​type strains, improving the availability of    gene clusters can be activated using repressor decoys90,
                                  lead compounds80–82. Vectors that can carry large DNA        which have the same DNA nucleotide sequence as the
                                  inserts are needed for the cloning of complete NP bio-       binding sites for the repressors that prevent the expres-
                                  synthetic gene clusters. Cosmids (which can have inserts     sion of the clusters. When these decoys are introduced
                                  of 30–40 kb), fosmids (which can harbour 40–50 kb) and       into the bacteria, they sequester the respective repres-
                                  bacterial artificial chromosomes (BACs; which can have       sors, and the ‘endogenous’ binding sites in the genome
                                  inserts of 100 kb to >300 kb) have been developed83. For     remain unoccupied, leading to derepression of the pre-
                                  fungal gene clusters, self-​replicating fungal artificial    viously silent biosynthetic genes and production of the
                                  chromosomes (FACs) have been developed, which can            corresponding NPs. This approach has been applied to
                                  have inserts of >100 kb (ref.84). FACs in combination        activate eight silent biosynthetic gene clusters in multi-
                                  with metabolomic scoring were used to develop a scal-        ple streptomycetes and led to the characterization of a
                                  able platform, FAC-​MS, allowing the characterization        novel NP, oxazolepoxidomycin A90. The repressor decoy
                                  of fungal biosynthetic gene clusters and their respec-       strategy is simpler, easier and faster to perform than the
                                  tive NPs at unprecedented scale85. The application of        deletion of genes encoding regulatory factors. However,
                                  FAC-​MS for the screening of 56 biosynthetic gene clus-      it has the same limitation as other approaches that rely
                                  ters from different fungal species yielded the discovery     on the introduction of recombinant DNA molecules into
                                  of 15 new metabolites, including a new macrolactone,         cells: it is necessary to develop protocols for efficient
                                  valactamide A85 (Fig. 2b).                                   introduction of DNA into the targeted host strain, and
                                      Even in culturable microorganisms, many biosyn-          the decoy must be maintained on a high-​copy plasmid
                                  thetic gene clusters may not be expressed under con-         to ensure efficient repressor sequestration.
                                  ventional culture conditions, and these silent clusters          Another approach focused on exchange of regu­
                                  could represent a large untapped source of NPs with          latory elements is based on the CRISPR–Cas9 technol-
                                  drug-​like properties86. Several approaches can be pur-      ogy. The promise of this technique is exemplified in a
                                  sued to identify such NPs. One approach is sequencing,       recent work by Zhang et al., which demonstrated that
                                  bioinformatic analysis and heterologous expression           CRISPR–Cas9-​mediated targeted promoter introduction
                                  of silent biosynthetic gene clusters, which has already      can efficiently activate diverse biosynthetic gene clusters
                                  led to the discovery of several new NP scaffolds from        in multiple Streptomyces species, leading to the produc-
                                  cultivable strains87. Direct cloning and heterologous        tion of unique metabolites, including a novel polyketide
                                  expression was also used to discover the new antibiotic      in Streptomyces viridochromogenes91. The CRISPR–Cas9
                                  taromycin A, which was identified upon the transfer          technology was also used to knock out genes encoding
                                  of a silent 67 kb NRPS biosynthetic gene cluster from        two well-​k nown and frequently rediscovered anti­
                                  Saccharomonospora sp. CNQ-490 into S. coelicolor88.          biotics in several actinomycete strains, which led to the

NaTure RevIewS | DRug DiScOvERy                                                                                            volume 20 | March 2021 | 205
Reviews

                          a
                              DNA isolation from
                                                                     Sequencing                          Bioinformatics analysis                               NP chemical
                              specific microorganism/
                                                                                                         (genome mining)                                       synthesis
                              environment/microbiota

                                                                                                                  Heterologous                           Genetic manipulations
                                                                                                                  expression                             of the native host

                                                                                                                 NP isolation and
                                                                                                                 characterization

                          b
                                                                      Analysed sequences for BGCs                                               Selected a soil sample rich in BGCs
                           Sequenced                                  that code for lipopeptides                                                from a tree branch not associated
                           metagenomes of                             with calcium-binding motifs                                               with the BGCs for known
                           2,000 soil samples                         and clustered to create a                                                 calcium-binding antibiotics and
                                                                      phylogenetic tree                                                         cloned DNA in a cosmid library

                                                                                        O

                                                                                   HO                O
                                                                                    H                         H
                                                                                    N                         N        O
                                                                                                 N                          O
                                                                                                 H
                                                                                        O
                                                                           N        O                        HN            OH OH
                                                                                    O                                            OH                  Identified a predicted BGC
                                                                                                         O
                                                                                                                                                     and transferred into
                                                                              HN                         O        NH        O                        Streptomyces albus using
                                                                          O                          O                                               transformation-associated
                                                                H                           H                                                        recombination
                                                                N                           N
                                                                               N                         N
                                                                               H                         H
                                                            O                  OH       O

                                                                          O
                                                       R                                                                   NH2
                                                                               Malacidin A R = Me
                                                                               Malacidin B R = Et

                                                                Malacidins isolated from cultures and
                                                                their structures elucidated using a                                              Extracts from cultures of S. albus
                                                                combination of mass spectrometry and                                             harbouring the BGC found to
                                                                NMR data, supported by bioinformatics                                            show antibacterial activity
                                                                analysis of the BGC                                                              against Staphylococcus aureus

                          c
                           Analysed genomic sequence                                                                                                Chemically synthesized 25
                           data from human microbiome                                                                                               ‘synthetic–bioinformatic’
                           for gene clusters predicted                          Identified 57 unique
                                                                                                                                                    NP-like compounds predicted
                           to encode large (≥5 residues)                        NRPS gene clusters
                                                                                                                                                    to be encoded by the analysed
                           nonribosomal peptides                                                                                                    gene clusters

                                                                                            OH
                                                                               O                         O        R1              O        R2              O
                                                                      H                          H                          H                       H
                                                                      N                          N                          N                       N
                                                                                    N                        N                         N                       OH
                                                                                    H                        H                         H
                                                           OH    O                          O                          O                        O
                                                                                                                                  OH

                                                                                            OH                         OH

                                                                      Humimycin A R1 = L-Phe, R2 = L-Val
                                                                      Humimycin B R1 = L-Tyr, R2 = L-Ile

                                                                                                                                                        Tested the 25 NP-like
                                                                          Identified humimycins as                                                       compounds for activity
                                                                          new antibiotics active                                                        against a panel of common
                                                                          against methicillin-resistant                                                 human commensal and
                                                                          S. aureus                                                                     pathogenic bacteria

206 | March 2021 | volume 20                                                                                                                                        www.nature.com/nrd
Reviews

◀   Fig. 3 | Strategies for genome mining-driven discovery of natural products and                 was derived from a set of ~100 genes through variable
    natural product-like compounds. a | Genome mining-​based approaches to explore                 peptide processing96.
    the biosynthetic capacity of microorganisms rely on DNA extraction, sequencing and                 Some bioactive compounds initially isolated from
    bioinformatics analysis. The vast majority of microbes from different environments             marine organisms might be products of symbionts, and
    and microbiota communities have not been cultured, and their capacity to produce
                                                                                                   genome mining can facilitate the characterization of
    natural products (NPs) was largely inaccessible until recently. In the case of unculturable
    microorganisms, the bioinformatics analysis step can be followed by either targeted
                                                                                                   such NPs. For example, it has been shown that bio­active
    heterologous expression of biosynthetic gene clusters (BGCs) prioritized as being likely       compounds from the sponge Theonella swinhoei are pro-
    to yield relevant new NPs or direct chemical synthesis of ‘synthetic–bioinformatic’            duced by bacterial symbionts97, and characterization of
    NP-​like compounds. b,c | These two approaches are exemplified by the recent discoveries       the symbiont ‘Candidatus Entotheonella serta’ using
    of malacidins (panel b) and humimycins (panel c), respectively93,94. A major strength of the   single-​cell genomics led to the discovery of gene clus-
    ‘synthetic–bioinformatic’ approach is that it is entirely independent of microbial culture     ters for misakinolide and theonellamide biosynthesis98.
    and gene expression. Its limitations are the accuracy of computational chemical structure      Another example of a marine NP produced by a bacte-
    predictions and the feasibility of total chemical synthesis. NRPS, nonribosomal peptide        rial symbiont is ET-743 (trabectedin), originally isolated
    synthetase.                                                                                    from the tunicate Ecteinascidia turbinate. A meta-omics
                                                                                                   approach developed by Rath et al. revealed that the
                                  production of different rare and previously unknown              producer of this clinically used anticancer agent is the
                                  variants of antibiotics that were otherwise obscured,            bacterial symbiont ‘Candidatus Endoecteinascidia
                                  including amicetin, thiolactomycin, phenanthroviridin            frumentensis’99.
                                  and 5-​chloro-3-​formylindole92.                                     Similarly, plant microbiomes also represent a large
                                      Approaches that rely on sequencing, bioinformat-             reservoir for the identification of novel bioactive NPs
                                  ics and heterologous expression can also enable the              (such as the antitumour agents maytansine, paclitaxel
                                  identification of novel NPs from bacterial strains that          and camptothecin, which were initially isolated from
                                  have not yet been cultivated (Fig. 3a) . For example,            plants and later shown to be produced by microbial
                                  Hover et al. searched the metagenomes of 2,000 soil              endophytes)100 that can be tapped by genome mining
                                  samples for biosynthetic gene clusters for lipopeptides          approaches. An illustrative example is a recent work
                                  with calcium-​binding motifs. This led to the discov-            by Helfrich et al. that identified hundreds of novel bio­
                                  ery of malacidins, members of the calcium-​dependent             synthetic gene clusters by genome mining of 224 bac­
                                  antibiotic family, via heterologous expression of a 72 kb        terial strains isolated from Arabidopsis thaliana leaves101.
                                  biosynthetic gene cluster from a desert soil sample in a         A combination of bioactivity screening and imaging mass
                                  Streptomyces albus host strain93 (Fig. 3b). However, in          spectrometry was used to select a single species for fur-
                                  comparison with some of the other above-​discussed               ther genomic analysis and led to the isolation of a NP with
                                  strategies 72,89,90, this metagenome-​b ased discovery           an unprecedented structure, the trans-​acyltransferase
                                  approach is more suited to finding new members of                PKS-​derived antibiotic macrobrevin101.
                                  known NP classes rather than discovery of entirely                   Targeted genetic engineering of NP biosynthetic gene
                                  new classes. In another study, Chu et al. developed a            clusters can be of high value if the producing organism
                                  human microbiome-​based approach that identified                 is difficult to cultivate or the yield of a NP is too low
                                  nonribosomal linear heptapeptides called humimycins              to allow comprehensive NP characterization. Rational
                                  as novel antibiotics active against methicillin-​resistant       genetic engineering and heterologous expression con-
                                  Staphylococcus aureus (MRSA)94 (Fig. 3c). The structure          tributed to increase the production of vioprolides, a
                                  of the NPs was predicted via bioinformatics analysis of          depsi­peptide class of anticancer and antifungal NPs in
                                  gene clusters found in human commensal bacteria, fol-            the myxobacterium Cystobacter violaceus Cb vi35, by
                                  lowed by their chemical synthesis. A major strength of           several orders of magnitude. In addition, non-natural
                                  this innovative approach is that it is entirely independent      vio­prolide analogues were generated by this approach102.
                                  of microbial cultivation and heterologous gene expres-           Similarly, promoter engineering and heterologous
                                  sion. Nevertheless, there are limitations related to the         expres­sion of biosynthetic gene clusters was reported
                                  accuracy of computational chemical structure predic-             to result in a 7-​fold increase in the production of the
                                  tions and the feasibility of total chemical synthesis if         cytotoxic NP disorazol103, and a 328-​fold increase in
                                  structures are complex.                                          the production of spinosad, an insecticidal macrolide
                                      The genomes of plants or animals can also be mined           produced by the bacterium Saccharopolyspora spinosa104.
                                  for novel NPs. For example, mining of 116 plant genomes              Besides increasing NP yields, targeted gene manip-
                                  enabled by identification of a precursor gene for the            ulation can also be used to alter biosynthetic pathways
                                  biosynthesis of lyciumins, a class of branched cyclic            in a predictable manner to produce new NP analogues
                                  ribosomal peptides with hypotensive action produced              with improved pharmacological properties, such as
                                  by Lycium barbarum (popularly known as goji), identi-            higher specific activity, lower toxicity and better pharma­
                                  fied diverse novel lyciumin chemotypes in seven other            cokinetics. Such biosynthetic engineering approaches
                                  plants, including crops such as soybean, beet, quinoa and        depend on a solid understanding of the biosynthetic
                                  eggplant95. Genome mining in the animal kingdom is               pathway leading to a specific NP, access to the genes
                                  exemplified by the work of Dutertre et al., which used           specifying this pathway and the ability to manipulate
                                  an integrated transcriptomics and proteomics approach            them in either the original or a hetero­logous host.
                                  to discover thousands of novel venom peptides from               Recent advances in biosynthetic engineering have
                                  Conus marmoreus snails96. Proteomics analysis revea­             enabled faster and more efficient production of NP
                                  led that the vast majority of the conopeptide diversity          analogues, including the development of methods for

    NaTure RevIewS | DRug DiScOvERy                                                                                           volume 20 | March 2021 | 207
Reviews

                          accelerated engineering and recombination of modules          identification of specific growth factors, allowing expan-
                          of PKS gene clusters105, NRPSs106,107 and NRPS–PKS            sion of the number of species that can be successfully
                          assembly lines108, as well as elucidation of mechanisms       cultured. This strategy was used by D’Onofrio et al. for
                          for polyketide chain release that are contributing to         the identification of new acyl-​desferrioxamine sidero-
                          NP structural diversification109,110. Examples of bio­        phores (iron-​chelating compounds) as growth factors
                          synthetic engineering applied to several important NPs        produced by helper strains promoting the growth of
                          include the generation of analogues of the immuno­            previously uncultured isolates from marine sedi-
                          suppressant rapamycin 111, the antitumour agents              ment biofilm117,126. The siderophore-​assisted growth is
                          mithramycin112 and bleomycin113, and the antifungal           based on the property of these compounds to provide
                          agent nystatin114.                                            iron for microbes unable to autonomously produce
                             It should be noted that biosynthetic engineering has       siderophores themselves, and the application of this
                          limitations regarding the parts of the NP molecule that       approach led to the isolation of previously uncultivated
                          can be targeted for modifications, and the chemical           microorganisms126. The development of strategies to
                          groups that can be introduced or removed. Considering         cultivate microbial symbionts that produce NPs only
                          the complexity of many NPs, however, total synthesis          upon interaction with their hosts can promote access
                          may be prohibitively costly, and a combined approach          to new NPs. Microbial symbionts interacting with
                          of biosynthetic engineering and chemical modification         insects or other organisms are a highly promising reser-
                          can provide a viable alternative for identifying improved     voir for the discovery of novel bioactive NPs produced
                          drug candidates. For example, biosynthetic engineering        in a unique ecological context127–130. To stimulate NP
                          may create a ‘handle’ for addition of a beneficial chem-      production, culturing strategies can be developed that
                          ical group by synthetic chemistry, as demonstrated for        better mimic the native environment of microbial sym-
                          the biosynthetically engineered analogues of nystatin         bionts of insects, including the use of media containing
                          mentioned above; further synthetic chemistry modifi-          either lyophilized dead insects131 or l-​proline, a major
                          cations resulted in compounds with improved in vivo           constituent of insect haemolymph132.
                          pharmacotherapeutic characteristics compared with                  Strategies to mimic the natural environment even
                          amphotericin B115,116.                                        more closely by harnessing in situ incubation in the
                                                                                        environment from which the microorganism is sam-
                          Advances in microbial culturing systems                       pled have been developed, dating back to more than
                          The complex regulation of NP biosynthesis in response         20 years ago with the biotech companies OneCell and
                          to the environment means that the conditions under            Diversa. They developed platforms that allowed the
                          which producing organisms are cultivated can have a           growth of some previously uncultivated microbes from
                          major impact on the chance of identifying novel NPs87.        various environments based on diluting out and sus-
                          Several strategies have been developed to improve the         pension in a single drop of medium120,133. More recently,
                          likelihood of identifying novel NPs compared with             such strategies have been highlighted by the develop-
                          mono­c ulture under standard laboratory conditions            ment and application of a platform dubbed the iChip,
                          and to make ‘uncultured’ microorganisms grow in a             in which diluted soil samples are seeded in multiple
                          simulated natural environment117 (Fig. 4).                    small chambers separated from the environment with
                              One well-​established approach to promote the identi-     a semipermeable membrane134. After seeding, the iChip
                          fication of novel NPs is the modulation of culture condi-     is placed back into the soil from which the sample was
                          tions such as temperature, pH and nutrient sources. This      taken for an in situ incubation period, allowing the cul-
                          strategy may lead to activation of silent gene clusters,      tured microorganisms to be exposed to influences from
                          thereby promoting production of different NPs. The            their native environment. The power of this culturing
                          term ‘One Strain Many Compounds’ (OSMAC) was                  approach was demonstrated by the discovery of a new
                          coined for this approach about 20 years ago118, but the       antibiotic, teixobactin, produced by a previously uncul-
                          concept has a longer history119, with its use being routine   tured soil bacterium135,136 (Fig. 4a). This platform may be
                          in industrial microbiology since the 1960s120.                of great significance for NP drug discovery, given that
                              While OSMAC is still widely used for the identifi-        it has been estimated that only 1% of soil organisms
                          cation of new bioactive compounds121,122, this approach       have so far been successfully cultured using traditional
                          has limited capacity to mimic the complexities of natu-       culturing techniques137.
                          ral habitats. It is difficult to predict the combination of        The omics strategies discussed in previous sections
                          cues (which might also involve metabolites secreted by        can complement efforts to explore NPs produced upon
                          other members of the microbial community) to which            microbial interactions. The application of such a strategy
                          the microorganism has evolved to respond by switching         is illustrated in the work of Derewacz et al., who analysed
                          metabolic programmes. To account for such kinds of            the metabolome of a genome-​sequenced Nocardiopsis
                          interactions, co-​culturing using ‘helper’ strains can be     bacterium upon co-​culture with bacteria of the genera
                          applied123. This can enable the production and identifica-    Escherichia, Bacillus, Tsukamurella and Rhodococcus138.
                          tion of new NPs, as illustrated by recent studies in which    Around 14% of the metabolomic features found in
                          particular fungi were co-​cultured with Streptomcyes          co-​cultures were undetectable in monocultures, with
                          species124,125.                                               many of those being unique to specific co-​culture gen-
                              Study of the molecular mechanisms underlying              era, and the previously unreported polyketides ciro-
                          the ability of helper strains to increase the cultivabil-     micin A and B, which possess an unusual pyrrolidinol
                          ity of previously uncultured microbes can lead to the         substructure and displayed moderate and selective

208 | March 2021 | volume 20                                                                                                 www.nature.com/nrd
Reviews

                       a
                                                                                 Extracts from 10,000 iChip                                  An extract from a new
                        iChip device with diluted soil
                                                                                 isolates screened for                                       bacterial species, Eleftheria
                        samples incubated in soil to
                                                                                 antimicrobial activity                                      terrae, showed good
                        simultaneously grow and
                                                                                 against Staphylococcus                                      antimicrobial activity
                        isolate uncultured bacteria
                                                                                 aureus                                                      against S. aureus

                                                                                                                          O
                                                          O       NH2                                O
                                                                                                                 N
                                                                                                                 H                           Compound isolation from the
                                                                                                         O                                   E. terrae extract, and structure
                               O                 O                      O                   O                        HN
                           H                H                     H                    H                     H                    NH         elucidation by NMR and
                           N                N                     N                    N                     N                               advanced Marfey’s analysis
                                   N                  N                     N                    N                        O
                                   H                  H                     H                    H                            N        NH    yielded teixobactin, a new
                                        O                     O                    O                     O                    H              antibiotic with activity against
                                                 OH                                         OH
                                                                                                                                             Gram-positive bacteria
                                                                        Teixobactin

                       b
                        Microbial species selected                              Bioinformatics analysis                                     Antigens representative of the
                        from human oral microbiome                              identified membrane proteins                                 selected extracellular protein
                        for targeted isolation based                            with extracellular domains that                             domains designed, synthesized
                        on genomic sequence data                                could serve as antigens for                                 and injected into rabbits for
                                                                                antibody development                                        antibody production

                        Three species of Saccharibacterium                      Oral microbiota samples stained
                        isolated along with their interacting                   with fluorescently labelled                                  IgG purified from the rabbit
                        Actinobacterium hosts, as well as                       antibodies, followed by flow                                 serum and fluorescently
                        SR1 bacteria that are members of                        cytometry and cell sorting to                               labelled
                        a candidate phylum with no                              isolate the targeted bacterial
                        previously cultured representatives                     species

                       Fig. 4 | Application of advanced microbial culturing approaches to identify new natural products. New strategies
                       for isolating previously uncultured microorganisms can enable access to new natural products (NPs) produced by them.
                       a | To recapitulate the effect of complex signals coming from the native environment, microorganisms can be cultivated
                       directly in the environment from which they were isolated. This concept is used with the iChip platform, in which diluted
                       environmental samples are seeded in multiple small chambers separated from the native environment with a semipermeable
                       membrane. The potential of this approach is illustrated by the recent discovery of teixobactin, a new antibiotic with
                       activity against Gram-​positive bacteria134,135. b | Another important recent development involves obtaining information from
                       environmental samples using omics techniques such as metagenomics to identify and partially characterize microorganisms
                       present in a specific environment before culturing. An approach relying on such preliminary information was recently
                       used to engineer the capture of antibodies based on genetic information, which resulted in the successful cultivation of
                       previously uncultured bacteria from the human mouth145. This reverse genomics workflow was validated by the isolation
                       and cultivation of three species of Saccharibacteria (TM7) along with their interacting Actinobacteria hosts, as well as
                       SR1 bacteria that are members of a candidate phylum with no previously cultured representatives.

                       cytotoxicity, were identified138. Other examples include                      such ‘easy to culture’ microbes have already been charac-
                       a ‘culturomics’ approach that combines multiple culture                       terized, and conventional screening efforts tend to yield
                       conditions with MS profiling and 16S rRNA-​based tax-                         disappointing returns associated with frequent rediscov-
                       onomy to identify prokaryotic species from the human                          ery of known NPs and their closely related congeners.
                       gut139, and an ultrahigh-​throughput screening platform                       Therefore, culturing strategies aimed at previously unex-
                       based on microfluidic droplet single-​cell encapsulation                      plored (or under-​investigated) microbial groups, with
                       and cultivation followed by next-​generation sequenc-                         the potential to produce NPs with entirely new scaffolds
                       ing and LC–MS, which allows investigation of pairwise                         and bioactivities (such as Burkholderia, Clostridium and
                       interactions between target microorganisms140. The lat-                       Xenorhabdus) are of high interest141,142. Closthioamide,
                       ter approach enabled identification of a slow-​growing                        the first secondary metabolite from a strictly anaerobic
                       oral microbiota species that inhibits the growth of                           bacterium, was discovered from Clostridium cellulo-
                       S. aureus140.                                                                 lyticum by this approach143. Targeted isolation of such
                           Historically early-​a dopted microbial culturing                          species is important, and a genome-​guided approach
                       approaches led to a bias reflected in the predominant                         to achieve this goal has recently been demonstrated
                       discovery of NPs from microorganisms that are easy to                         for Burkholderia strains in environmental samples144.
                       cultivate (such as streptomycetes and some common fil-                        Another highly innovative approach to the isolation
                       amentous fungi). As a result, a vast number of NPs from                       and cultivation of previously uncultured bacteria was

NaTure RevIewS | DRug DiScOvERy                                                                                                             volume 20 | March 2021 | 209
You can also read