PIPELINE SOFTWARE GEOMX - NGS - BLOG | NANOSTRING

Page created by Sandra Pierce
 
GeoMx - NGS
Pipeline Software
User Manual

FOR RESEARCH USE ONLY. Not for use in diagnostic procedures.
© 2021 NanoString Technologies, Inc. All rights reserved.

MAN-10118-03 for v2.2 SW FEB 2021
GeoMx-NGS Pipeline Software User Manual                        MAN-10118-03 for software v2.2
Table of Contents

    GeoMx DSP NGS User Manual
    Conventions
    GeoMx DSP Workflow
    Introduction to GeoMx NGS Pipeline
     Input files needed
     Installation options for the GeoMx NGS Pipeline and GUI
     Choosing the option that is right for you
     System requirements
    Installing the GeoMx NGS Pipeline
     Installing on Linux or AWS server
    Running GeoMx NGS Pipeline
     Running the Pipeline using GUI on Windows or Mac
     Running the Pipeline on a server
     Running the Pipeline using CLI on a Linux or AWS server
    Appendix I: Setting up GeoMx NGS Pipeline on Amazon Web Services (AWS)

Conventions
The following conventions are used throughout this manual and are described for your reference.
Bold text is typically used to highlight a specific button, keystroke, or menu option. It may also be
used to highlight important text or terms.
Blue underlined text is typically used to highlight links and/or references to other sections of the
manual. It may also be used to highlight references to other manuals or instructional material.

 The gray box indicates general information that may be useful for improving assay performance. These
 notes may clarify other instructions or provide guidance to improve the efficiency of the assay workflow.

       IMPORTANT: This symbol indicates important information that is critical to ensure a successful
      assay. Following these instructions may help improve the quality of your data.

       WARNING: This symbol indicates the potential for bodily injury or damage to the instrument if the
      instructions are not followed correctly. Always carefully read and follow the instructions
      accompanied by this symbol to avoid potential hazards.

GeoMx DSP Workflow

FOR NGS PLATFORMS

The GeoMx® Digital Spatial Profiling (DSP) technology is a novel platform developed by
NanoString. This product relies upon antibody or RNA probes coupled to photocleavable
oligonucleotide tags. After the hybridization of probes to slide-mounted tissue sections, the
oligonucleotide tags are released from discrete regions of the tissue via UV exposure. Released
tags are quantitated in an Illumina NGS assay and counts are mapped back to tissue location,
yielding a spatially resolved digital profile of analyte abundance. The GeoMx DSP workflow
describes the process of accomplishing these steps.

•   Day 1: Slide Staining. During this phase, prepare slides and hybridize biological targets with
    UV-cleavable biological probes. For protein samples, label cell types of interest with
    fluorescent morphology markers.

•   Day 2: Process Slide on GeoMx DSP. For RNA samples, label cell types of interest with
    fluorescent morphology markers. Load prepared slides onto the GeoMx DSP instrument,
    enter identifying information for them, scan them to create fluorescent images, select regions
    of interest (ROIs), and then collect UV-cleaved oligos from these ROIs into the wells of a
    collection plate.

•   Day 3: Library Prep and Sequencing. Transfer the contents of the DSP collection plate to a
    PCR plate. The products will be pooled and purified, then sequenced on an Illumina NGS
    instrument.

•   Day 4: Process FASTQ sequencing files to DCC digital count files using NanoString's
    GeoMx NGS Pipeline software.

•   Day 5: Transfer the DCCs to the GeoMx DSP Data Analysis Suite and run platform and
    readout-specific quality control checks, perform data analysis, and generate analysis plots.

GEOMX DSP USER MANUALS AND OTHER USER DOCUMENTATION

•   All of the GeoMx DSP user documentation exists in the GeoMx DSP Online User Manual,
    accessible from the help icon on the GeoMx DSP Control Center and online at
    https://www.nanostring.com/geomx-online-user-manual.

•   PDF versions of GeoMx DSP documentation are also available for both nCounter and NGS
    readouts. The Slide Prep, Instrument, Readout, GeoMx NGS Pipeline (for NGS only),
    and Data Analysis user manuals are available for download from the GeoMx DSP Online
    User Manual (see above).

•   Illumina platform documentation can be found in their respective manuals at
    https://support.illumina.com/.

Introduction to GeoMx NGS Pipeline
The GeoMx NGS Pipeline, developed by NanoString, is an essential part of the GeoMx NGS
workflow. The Pipeline processes RNA-sequencing files (FASTQ files) from Illumina sequencers
according to parameters defined in the Configuration File (which is generated from the GeoMx
DSP run). The Pipeline processes information from these files and outputs .dcc files, which can
then be uploaded to the GeoMx DSP system for data analysis.
The Automated Data Processing Pipeline depicted here (see Figure 1) illustrates the steps
the Pipeline undertakes.

                               Figure 1: GeoMx NGS Pipeline pipeline of steps

Each photocleaved oligo in the GeoMx DSP collection plate contains a readout tag sequence
identifier (RTS ID) that identifies the target. It also includes a unique molecular identifier
(UMI), which allows for removal of PCR duplicates when converting reads to digital counts. Read 1
(SPR1) and Read 2 (SPR2) are binding sites for Illumina sequencing primers. The GeoMx Seq
Code primers that hybridize to SPR1 and SPR2 contain i5 or i7 indexing sequences as well as
P5 or P7 sequences for binding to Illumina flow cells. Depending on the sequencing platform
used, the i5 index will be read in either the forward (Workflow A; MiSeq, HiSeq 2000/2500, or
NovaSeq) or reverse (Workflow B; MiniSeq, NextSeq, or HiSeq 3000/4000x) direction.
The GeoMx NGS Pipeline performs a series of actions to process the sequence reads and output
digital code counts. In the first step, the raw reads (raw sequencing FASTQ files) are selected for a
pipeline run. Next, the raw reads are processed to retain high-quality reads, the adapters are removed
(resulting in trimmed reads), and the paired-end reads are merged (resulting in stitched reads).
In the third step, the reads are aligned to the RTS-ID barcodes, creating aligned reads. Then,
PCR duplicates are removed by matching on the unique molecular identifier (UMI), resulting in
deduplicated reads. Finally, the Digital Count Conversion (DCC) file is created. These DCC files are
presented as a .zip file in a folder that you designate and can then be uploaded into the DSP
Control Center for study creation in the DSP Data Analysis Suite.
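
The deduplication step described above can be pictured with a short Python sketch. This is not the Pipeline's actual implementation. It assumes that each aligned read has already been reduced to an (RTS ID, UMI) pair and that duplicates are detected by exact match on both values; the RTS IDs and UMI sequences are hypothetical.

```python
from collections import defaultdict


def deduplicated_counts(aligned_reads):
    """Count unique UMIs per RTS ID.

    aligned_reads: iterable of (rts_id, umi) pairs, one per aligned read.
    Reads that share both RTS ID and UMI are treated as PCR duplicates
    (exact matching is an assumption made for this sketch).
    """
    umis_per_target = defaultdict(set)
    for rts_id, umi in aligned_reads:
        umis_per_target[rts_id].add(umi)
    return {rts_id: len(umis) for rts_id, umis in umis_per_target.items()}


# Hypothetical reads: identifiers and sequences are made up for illustration.
reads = [
    ("RTS0001", "ACGTACGTACGTAC"),
    ("RTS0001", "ACGTACGTACGTAC"),  # PCR duplicate of the read above
    ("RTS0001", "TTGCATGCATGCAA"),
    ("RTS0002", "ACGTACGTACGTAC"),  # same UMI, different target: counted
]
counts = deduplicated_counts(reads)
```

In this sketch the digital count for a target is the number of distinct UMIs observed for its RTS ID, which is why the duplicated read contributes only once.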

Input files needed
After your GeoMx DSP run (see the GeoMx-NGS DSP Instrument User Manual), you will
download one .ZIP file containing the following (see Figure 2).

•   The Seq Code UDI Indices, which is a digital file with sample information to input into the
    Illumina software.

•   The Lab Worksheet, which is an Excel spreadsheet to use for guidance in setting up the library.

•   The GeoMx NGS Pipeline Config file, which contains information relating to each well of the
    collection plate. This is the critical GeoMx DSP file to input into the GeoMx NGS Pipeline
    software.

                                          Figure 2: Output files from GeoMx run

After your Illumina NGS run (see the GeoMx-NGS Readout Library Prep User Manual),
you will download a group of FASTQ files, which contain the sequencing data relating to each well
of the collection plate. These are the critical Illumina NGS files to input into the Pipeline software.
Save these files in an accessible location for input to the GeoMx NGS Pipeline software.

    Do not modify FASTQ filenames from the Illumina FASTQ file naming conventions. If possible, retain the
    naming from the sample sheets exported from DSP, which incorporate the plate barcode. If the sample
    name portions of your filenames differ from the default pipeline workflow, use a sample ID translator
    file to point the pipeline to the correct input FASTQ files before beginning your NGS processing
    pipeline run.
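
Before a run, you may want to confirm that filenames still follow the Illumina convention. The Python sketch below checks names against the standard bcl2fastq pattern (SampleName_S1_L001_R1_001.fastq.gz). It is an illustration only: it does not cover files written without the lane segment, it is not the check the Pipeline itself performs, and the example filenames are hypothetical.

```python
import re

# Standard Illumina FASTQ naming: <sample>_S<n>_L<lane>_<R1|R2|I1|I2>_001.fastq.gz
ILLUMINA_FASTQ = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[RI][12])_001\.fastq\.gz$"
)


def parse_fastq_name(filename):
    """Return the name components as a dict, or None if the name does not match."""
    match = ILLUMINA_FASTQ.match(filename)
    return match.groupdict() if match else None


# Hypothetical filenames, made up for illustration.
good = parse_fastq_name("DSP-1001660000001-A-A01_S1_L001_R1_001.fastq.gz")
bad = parse_fastq_name("my_renamed_sample_read1.fastq.gz")
```

A result of None flags a file whose name was modified and that may therefore need an entry in a sample ID translator file.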

Installation options for the GeoMx NGS Pipeline and GUI
The GeoMx NGS Pipeline software has two components to consider: the graphical user
interface (GUI) and the GeoMx Pipeline itself, usually run on either a Linux or Amazon Web
Services (AWS) server. Users have the option to use a command line interface (CLI) if they
choose not to use the GUI.
The following installation combinations are available:

         Operating System        User Interface        Pipeline location
         Windows                 GUI                   Local or Server (Linux/AWS)
         MacOS                   GUI                   Local or Server (Linux/AWS)
         Linux                   CLI                   Server (Linux/AWS)

Choosing the option that is right for you
Many users choose to install and run GeoMx NGS Pipeline on a server that has adequate computing
power and is connected to their computer. There are two ways to do this: use the Windows or
MacOS GUI to submit jobs to a Linux server where you have set up the Pipeline, or remotely log in
to your server and run the Pipeline from the command line interface (CLI) on the server.
Alternatively, for smaller datasets or if you have a fast CPU and a lot of RAM, you can use the
Windows, Mac or Linux versions to process files locally on your computer.

AWS

•   While you will be billed for services used on AWS, depending on the amount of processing you
    need, these costs should be considerably less than buying your own hardware.

•   You will need to go through a one-time setup process to prepare an AWS environment for your
    data processing; see Appendix I: Setting up GeoMx NGS Pipeline on Amazon Web
    Services (AWS).

•   You will need: an AWS account, GeoMx NGS Pipeline installed on an AWS virtual machine, and
    file transfer protocol (FTP) client software (such as WinSCP) or another way for your computer
    to send data to and receive data from AWS.

Windows/MacOS GUI connected to Linux server

•   Connecting to a Linux server using the Windows or MacOS GUI provides a few advantages: it is
    easier for users who are not familiar with the command line to submit their data for processing,
    and it allows users to specify the number of parallel processing threads for more control over
    the server.

•   You will need: a server with GeoMx NGS Pipeline installed. Users also need the Pipeline
    installed on their computers.

Remotely running on server using CLI

•   Using a remote connection to the server may be more convenient if the server already has direct
    access to your data.

•   You will need: GeoMx NGS Pipeline installed on the server, a way to remotely log in to the
    server, and server access to your data.

Running locally

•   For smaller datasets or if you have a fast CPU and a lot of RAM, you can use the Windows, Mac
    or Linux versions to process files locally on your computer. This may consume much of your
    system resources, and we do not recommend multitasking while processing.

•   You will need: GeoMx NGS Pipeline installed and access to your data (copied to your
    computer).

System requirements
    Local and remote runs             Interface  Pipeline location            OS                        CPU                          Memory (GB)
    Windows                           GUI        Local or server (Linux/AWS)  Windows 10                Intel Core i5-4750 3.20 GHz  16
    Mac                               GUI        Local or server (Linux/AWS)  MacOS Catalina V.10.15.5  Intel Core i5 2.60 GHz       16
    Linux                             CLI        Server (Linux/AWS)           Linux Ubuntu 18.04        AMD Phenom 8650 2.30 GHz     16
    AWS instance type: AWS t2.xlarge  CLI        Server (AWS)                 AWS Ubuntu                vCPU 4                       16

If running locally:
Files should be available locally. The specifications below reflect the needs of a pipeline run and
may be impacted by other programs running on the same machine. These resources are adequate for
runs containing up to 96 segments/ROIs (with up to ~50 million reads per segment/ROI).
For Apple OS:

•   MacBook Pro with a 1.4 GHz CPU or better

•   16 GB RAM

For Windows OS:

•   Intel® Core™ i5-8350U @ 1.70 GHz 1.90 GHz

•   16 GB RAM

•   64-bit Operating System, x64-based processor

If running GeoMx pipeline on a Linux server or AWS instance (larger experiments):
Files should be available on the server. The specifications below reflect the needs of a pipeline run
and may be impacted by other programs running on the same server. These resources are adequate
for runs containing segments/ROIs exceeding ~50 million reads per segment/ROI.
Linux server:

•   OS: Ubuntu 18 and up

•   16 GB RAM

•   Adequate storage for data files via EBS, EFS or attached NAS. We recommend at least 2 GB
    RAM per thread and at least 1 GB of available free memory.
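
The memory recommendation above can be turned into a rough upper bound on the number of threads. The Python sketch below is one possible reading of that guidance, not a NanoString formula: it assumes the 1 GB of free memory is held back before the per-thread allowance is applied, and it caps the result at the number of CPU cores. The function name and parameters are hypothetical.

```python
def max_threads(total_ram_gb, cpu_count, gb_per_thread=2, reserve_gb=1):
    """Estimate how many threads fit the memory guidance.

    Assumption: reserve_gb stays free, and each thread needs gb_per_thread
    of the remaining RAM. The cap at cpu_count is also an assumption.
    Always returns at least 1.
    """
    usable_gb = total_ram_gb - reserve_gb
    by_memory = int(usable_gb // gb_per_thread)
    return max(1, min(cpu_count, by_memory))


# 16 GB RAM and 4 vCPUs, as listed for the t2.xlarge row of the table.
threads_t2_xlarge = max_threads(16, 4)
```

Under these assumptions, a 16 GB machine has memory for up to 7 threads, so on a 4-vCPU instance the core count is the limiting factor.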

Installing the GeoMx NGS Pipeline

1. Download the installation file.

2. Right-click on the installation file and select Extract Here.

3. Double-click on the resulting installer application.

     •   Follow the instructions in the Wizard to install the GeoMx NGS Pipeline software.

     •   Read and accept the terms of GeoMx NGS Pipeline and wait until the Pipeline sets up the
         environment.

     •   If you plan on running the Pipeline on your local computer, check the box
         GeoMxNGSPipeline Local Server.

                                   Figure 3: User agreement

4. Once the GeoMx NGS Pipeline software has been installed, open the application.

5. (Optional) If you plan on using a remote server for processing, you will first need to install
   the GeoMx NGS Pipeline on your server. See Installing on Linux or AWS server.

     •   Once installed, add the server in the UI by clicking new server, entering the Public
         IPv4 address of the server (four integers separated by periods followed by :5000), then
         clicking Add.

                                   Figure 4: Adding a server

     •   Enter API server address, including port (insert your server name in lieu of the red text):
         http://:5000

     •   In the main GeoMx NGS Pipeline menu, ensure the toggle Run locally is switched to Run
         remotely, and the server you saved is selected from the adjacent dropdown menu (if not
         already selected by default).

     •   For all GeoMx NGS Pipeline runs moving forward, you can click the gear icon, select the
         server from the server address dropdown, click Save, then move the slider to run on the
         server.

     Proceed to Running the Pipeline using GUI on Windows or Mac.

Installing on Linux or AWS server

 The installation comes as a zip archive which contains the following files:

 •   GeoMxNGSPipeline_Linux_2.0.0.15.sh (or similar) – this is the installation script

 •   GeoMxNGSPipeline.tgz – this is the installation package file

 During the installation you will need to execute the installation script file. It will unpack the API
 server files to the proper location, update the configuration, and start the service.
 You will first need access to a Linux server running an Ubuntu or Amazon Linux (Ubuntu)
 distribution, as well as sudo user privileges on that server.

1. Use secure copy protocol (SCP) to copy the installation files to the server. You can use
   WinSCP or similar software. Unpack the installation zip archive and use SCP
   (WinSCP) to copy the GeoMxNGSPipeline_Linux_2.0.0.15.sh (or similar) and
   GeoMxNGSPipeline.tgz files to the home folder on the target server.

2. Using an SSH client (such as PuTTY), connect to the server. Make sure you are connecting as a
   user who has sudo privileges on that server.

3. Sometimes the execute (x) permission is lost during SCP (secure copy).

     •   To check this, navigate to your home folder (cd /home/{your user name}) and
         execute the following command: ls -l or ll. Check that you have execute (x)
         permissions for the GeoMxNGSPipeline_Linux_2.0.0.15.sh file.

     •   If the x permission is missing, you will see something like this: -rw-rw-r-- (which means no
         one can execute this script and you will get a permission denied error).

     •   Run the following command: sudo chmod +x GeoMxNGSPipeline_Linux_2.0.0.15.sh.
         This will add the execute (x) permission and the permission set will look like this:
         -rwxrwxr-x.

4. Run the installation script: sudo ./GeoMxNGSPipeline_Linux_2.0.0.15.sh.
     The installation script will ask you to specify the port. You can either specify the port on
     which the application will run or leave the default port – 5000. To keep the default port,
     press Enter.
     If you have already installed the GeoMxNGSPipeline API on this server and want to rerun the
     installation, the system will ask you whether you would like to override existing settings and
     whether you want to override folder mappings settings. In both cases type Y or y to confirm or
     any other character to reject.

5. After installation, check that the service is running and that the port you specified during
     installation is listening. To do this, run the following command: sudo netstat -tulpn. In
     the output, check that the GeoMxNGSPipeline service is listening on port 5000 (or on the port
     you specified).

             IMPORTANT: Depending on which distribution you are using (Ubuntu or Amazon Linux) the
             output of this command may be slightly different.

6. Configure folder mapping (optional). This step is required only if you plan to use the GUI to
   connect to a remote server for running the pipeline.

     •   Folder mapping enables you to navigate to and view folders and folder contents on the
         server you connect to.

     •   Folder mapping is discretionary, based on what you want to have visible in the GUI, and
         most likely where the data resides (i.e., if FASTQ files are in your /home directory, then the
         /home directory should be mapped).

     •   These particular lines provide examples of what the folder mapping could be, but should be
         modified based on your environment, preference, and organizational habits.

     To configure server folder mapping, you need to edit the runtime-settings.xml file. By default,
     this file has a mapping for the home folder (/home). You will need to use a Linux text editor
     such as mcedit (part of mc), vi or nano to edit server mappings.
     In this example, we will be using mcedit. Type sudo mc to open Midnight Commander.
     Navigate to the /var/GeoMxNGSPipeline folder and open the runtime-settings.xml file. Under the
     server_folders node, add folder mappings by adding/changing folder elements. Every folder
     element has 2 attributes:

     •   path – physical path on the server (which can also point to mapped EFS volumes)

     •   name – the name of this mapping.
     Also remove folder mappings which are incorrect. It is important to keep only valid folder
     mappings. Otherwise, the GUI will report an error while trying to connect to the server. Press F2
     to save your edits. You don’t need to restart the API server. The changes in
     runtime-settings.xml will be processed automatically.

7. Finally, you can try to connect to the newly installed API instance using the GeoMxNGSPipeline
   GUI.

 Proceed to Running the Pipeline using GUI on Windows or Mac.
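
Because invalid folder mappings cause connection errors in the GUI (step 6 above), it can help to list mappings whose paths do not exist before you save the file. The Python sketch below shows one way to do that. Only the server_folders node, the folder element, and the path and name attributes come from this manual; the root element name and the second mapping in the example are hypothetical, and the real settings file may contain other content.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <settings> root and the project1 mapping are
# made up for illustration and may not match the real file layout.
EXAMPLE = """
<settings>
  <server_folders>
    <folder path="/home" name="home" />
    <folder path="/mnt/efs/project1" name="project1" />
  </server_folders>
</settings>
"""


def folder_mappings(xml_text):
    """Return (name, path) pairs for every folder element under server_folders."""
    root = ET.fromstring(xml_text.strip())
    if root.tag == "server_folders":
        node = root
    else:
        node = root.find(".//server_folders")
    if node is None:
        return []
    return [(f.get("name"), f.get("path")) for f in node.findall("folder")]


def invalid_mappings(mappings, exists=os.path.isdir):
    """Return the mappings whose path is missing or is not an existing directory."""
    return [(name, path) for name, path in mappings if not path or not exists(path)]


mappings = folder_mappings(EXAMPLE)
```

Calling invalid_mappings(mappings) on the server itself would list entries to fix or remove; the exists parameter is there only so the check can be tried without touching the file system.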

Running GeoMx NGS Pipeline

Running the Pipeline using GUI on Windows or Mac

1. Save your GeoMx NGS Pipeline Config file (from the GeoMx DSP run) to your computer. Save
   your FASTQ files (from the Illumina NGS run) to your computer and ensure they are not
   compressed and are in a common directory.

                                   Figure 5: GeoMx NGS Pipeline run setup window

       The pipeline is designed to handle FASTQs stored on a server to which there is direct access. If you
       are using a VPN or local server for storage of FASTQ files and encounter errors, move files to a
       direct access server to improve performance.

2. Open the GeoMx NGS Pipeline software.

3. Select Run locally or Run remotely, depending on what is appropriate for your workplace.

   •   The Resources available section lists the processing power of your local computer (if Run
       locally is selected) or server (if Run remotely is selected).

   •   The Number of threads dropdown at the bottom of the window indicates the number of
       parallel processes possible given the available resources. To process as fast as
       available resources will allow, change this number to the maximum number of threads. The
       default is set to 1.

   •   Use the gear icon or dropdown next to Run on Server to select the server.

4. Create a Run name (see Figure 5).

5. Browse to your Input directory - the folder housing your Illumina Raw Data (FASTQ) files
   (GZ format).

6. Browse to the location of your Configuration file.

7. Browse to your Output directory - the location in which you would like the output files saved.

 8. (Optional) Browse to the Translation file (if applicable).

           The Sample ID Translator File can be used as input when FASTQ files have been named something
           other than the defaults from the DSP. This file has two columns: the first for the AOI list from
           the config file, and the second for the FASTQ file root name. This gives the software a key for
           translating.

 9. (Optional) Check the Create DCC metadata box, if desired.

           The optional DCC metadata file provides additional traceability for a pipeline run. This
           file shows the relationships between files as they transition from FASTQ to DCC for every AOI, and
           unique MD5 checksums for identification. This can be helpful for submission/publication of data.
           Keep in mind that producing this additional output will increase run execution time.

10. (Optional) Check the Keep interim files box, if desired.

11. Click Run.

12. Monitor the progress (see Figure 6).

       •   Click the Log, Error, Warning, or Processing Parameters icon to view the respective
           information.

       •   You may run the Pipeline on sequencing runs that have not concluded if your samples of
           interest are done. If some FASTQ files are missing or unrecognized, you will receive a
           warning that the system did not find FASTQ files for all samples listed in the config file.
           You may proceed and generate “empty” DCC files (no counts for any probes) for the
           ROIs/segments with missing sequence data. This allows you to continue with your data
           analysis without needing to ensure all sequencing data is complete for all samples. You can
           upload a set of DCC files later, but you will need to create a new study to access the
           updated counts in Data Analysis.

                                   Figure 6: GeoMx NGS Pipeline run monitoring window

13. When the process is complete, the status bar will read 100%. Click Done.

14. Open the output folder and locate the zipped DCC files subfolder. These files are ready to be
    uploaded to the GeoMx DSP Data Analysis Suite.

        IMPORTANT: Check DCC file sizes and summary.txt to ensure files were processed as
       expected.

  Proceed to the GeoMx-NGS Data Analysis User Manual.
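
For the file-size check recommended above, a short script can list the DCC files inside the output .zip without extracting it. The Python sketch below assumes the archive members use the .dcc extension; the function names are hypothetical. It only reports sizes, since this manual does not specify what size an "empty" DCC file has.

```python
import zipfile


def dcc_sizes(zip_path):
    """Return {member name: uncompressed size in bytes} for .dcc files in a zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return {
            info.filename: info.file_size
            for info in archive.infolist()
            if info.filename.lower().endswith(".dcc")
        }


def smallest_first(sizes):
    """Sort (name, size) pairs so unusually small files appear at the top."""
    return sorted(sizes.items(), key=lambda item: item[1])
```

For example, printing smallest_first(dcc_sizes(path_to_zip)) puts the smallest files first, which is where ROIs/segments with missing sequence data are most likely to show up.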

Running the Pipeline on a server

 We suggest two ways to run GeoMx NGS Pipeline on a central server:

1. Connect remotely to the server and run the Pipeline from a command line interface on the
   server. For this method, you need a way to remotely access the command line on the server;
   you can then follow the CLI instructions below (see Running the Pipeline using CLI on
   a Linux or AWS server).

2. Start GeoMx NGS Pipeline on a Linux server and submit the Pipeline processing jobs to it
   through the GUI. Once the run has started, you can close your local GeoMx NGS Pipeline GUI
   and the work will continue to process on the server. You will need to open the app again and
   establish a connection to receive your output files. See Running the Pipeline using GUI on
   Windows or Mac.

          IMPORTANT: Submitting jobs from the GUI to a server may result in errors if there is already
         a run processing on the server.

Running the Pipeline using CLI on a Linux or AWS server

 The following steps are for running the Pipeline on a remote Linux server. You may choose to
 use this if you are connecting to another computer running Linux for your Pipeline processing.
 Running the Pipeline from the CLI is similar to running it from the GUI in that you need to specify
 three main options: the location of the config file, the location of the FASTQ files, and the
 location of the output folder where you would like to receive your DCC files.

1. Ensure your files are copied to an acceptable location. You need the absolute path to the files
   on the computer where the Pipeline will be run. If you are remotely connected to a server, the
   file path must be accessible to the server.

       The pipeline is designed to handle FASTQs stored on a server to which there is direct access. If you
       are using a VPN or local server for storage of FASTQ files and encounter errors, move files to a
       direct access server to improve performance.

2. Log in to the server. If you are processing on a remote server, you need to run the command
   from the server.

3. Create a dropoff folder on the server for your config file and your FASTQ files. Create an
   output folder where you would like your DCCs saved.

4. Copy config and FASTQ files to server.

5. To call the geomxngspipeline command from any directory, either restart your SSH session
   (log out and log in) or run the following command:
   export PATH=$PATH:/var/GeoMxNGSPipeline

   •   In the event of a permissions error: if you have already installed the GeoMxNGSPipeline API
       on this server and you rerun the installation, check the ownership and permissions of the
       /var/tmp/.net/ subfolder. To do this, navigate to that directory and type ls -l or ll. If the
       ownership or permissions are incorrect, reset them with the following commands (shown here
       for the ubuntu user):
   sudo chgrp -R ubuntu /var/tmp/.net/
   sudo chown -R ubuntu /var/tmp/.net/
   sudo chmod 777 /var/tmp/.net/

   •   The CLI processing usage command is as follows:
   geomxngspipeline --in=INPUT_DIR_PATH --out=OUTPUT_DIR_PATH --ini=INI_CONFIG_PATH [OPTIONS]

   •   A CLI usage example run command is as follows:
   geomxngspipeline --in=/mnt/efs/project1/FASTQ --out=/mnt/efs/project1/results --ini=/mnt/efs/project1/project1_config.ini --save-interim-files=true --threads=4

   •   To see all available run command arguments, use the following help command:
   geomxngspipeline --help

6. When you have DCC files and a summary.txt file ready, you may copy them from your server
   to a local folder on your computer. Copy files using your usual method for interacting with your
   server, such as via a shared network drive or using secure copy protocol (scp).

           IMPORTANT: Check DCC file sizes and summary.txt to ensure files were processed as
          expected.
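
 The CLI steps above can be collected into a small wrapper script. The following is a minimal
 sketch, not part of the Pipeline software: the function names run_pipeline and check_outputs and
 the PIPELINE_BIN variable are hypothetical, and the output check assumes that DCC files use a
 .dcc extension and are written directly into the output folder next to summary.txt. Only the
 --in, --out, --ini, and --threads options shown earlier are used.

```shell
# Hypothetical wrapper around the documented CLI call.
# Set PIPELINE_BIN=echo for a dry run that only prints the arguments.
PIPELINE_BIN="${PIPELINE_BIN:-geomxngspipeline}"

# run_pipeline FASTQ_DIR OUTPUT_DIR CONFIG_INI [THREADS]
run_pipeline() {
    local in_dir="$1" out_dir="$2" ini_file="$3" threads="${4:-4}"
    local item
    # The Pipeline needs absolute paths that the server itself can reach.
    for item in "$in_dir" "$out_dir" "$ini_file"; do
        case "$item" in
            /*) ;;
            *) echo "error: not an absolute path: $item" >&2; return 2 ;;
        esac
    done
    [ -d "$in_dir" ] || { echo "error: FASTQ folder not found: $in_dir" >&2; return 2; }
    [ -f "$ini_file" ] || { echo "error: config file not found: $ini_file" >&2; return 2; }
    mkdir -p "$out_dir" || return 2
    "$PIPELINE_BIN" --in="$in_dir" --out="$out_dir" --ini="$ini_file" --threads="$threads"
}

# check_outputs OUTPUT_DIR
# Assumes DCC files end in .dcc and sit directly in OUTPUT_DIR (an assumption).
check_outputs() {
    local out_dir="$1" status=0 count=0 dcc
    if [ ! -s "$out_dir/summary.txt" ]; then
        echo "missing or empty: $out_dir/summary.txt"
        status=1
    fi
    for dcc in "$out_dir"/*.dcc; do
        [ -e "$dcc" ] || continue      # the glob matched nothing
        count=$((count + 1))
        if [ ! -s "$dcc" ]; then
            echo "empty DCC file: $dcc"
            status=1
        fi
    done
    if [ "$count" -eq 0 ]; then
        echo "no DCC files found in $out_dir"
        status=1
    fi
    return "$status"
}
```

 For example, calling run_pipeline /mnt/efs/project1/FASTQ /mnt/efs/project1/results
 /mnt/efs/project1/project1_config.ini 4 and then check_outputs /mnt/efs/project1/results mirrors
 the example run command and flags any empty output before you copy the files to your computer.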

 Proceed to the GeoMx-NGS Data Analysis User Manual.


Appendix I: Setting up GeoMx NGS Pipeline on Amazon
Web Services (AWS)

SET UP AMAZON WEB SERVICES (AWS)
NanoString uses AWS with the GeoMx NGS Pipeline software to efficiently process the Illumina
FASTQ files and produce DCCs, which can be read by the GeoMx DSP Data Analysis Suite.

Set up your AWS account
Follow the AWS instructions below, which are borrowed heavily from:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/get-set-up-for-amazon-ec2.html

SIGN UP FOR AWS
 When you sign up for Amazon Web Services (AWS), your AWS account is automatically
 signed up for all services in AWS, including Amazon EC2. You are charged only for the
 services that you use. If you are a new AWS customer, you can get started with Amazon EC2
 for free. For more information, see AWS Free Tier. If you already have an AWS account, skip
 to the next task. If you don't have an AWS account, use the following procedure to create one.
 To create an AWS account, open https://portal.aws.amazon.com/billing/signup and follow the
 online instructions.
 Part of the sign-up procedure involves receiving a phone call and entering a verification code
 on the phone keypad. Note your AWS account number, because you'll need it for the next
 task.

CREATE AN IDENTITY AND ACCESS MANAGEMENT (IAM) USER
 Services in AWS, such as Amazon EC2, require that you provide credentials when you
 access them, so that the service can determine whether you have permission to access its
 resources. The console requires your password. You can create access keys for your AWS
 account to access the command line interface or API. However, we don't recommend that you
 access AWS using the credentials for your AWS account; we recommend that you use AWS
 Identity and Access Management (IAM) instead. Create an IAM user, and then add the user to
 an IAM group with administrative permissions or grant this user administrative permissions.
 You can then access AWS using a special URL and the credentials for the IAM user. If you
 signed up for AWS but have not created an IAM user for yourself, you can create one using
 the IAM console. If you aren't familiar with using the console, see Working with the AWS
 Management Console for an overview.
 To create an administrator user for yourself and add the user to an administrators
 group (console)

 1. Use your AWS account email address and password to sign in as the AWS account root
    user to the IAM console at https://console.aws.amazon.com/iam/.

      We strongly recommend that you adhere to the best practice of using the Administrator IAM
      user below and securely lock away the root user credentials. Sign in as the root user only to
      perform a few account and service management tasks.

 2. In the navigation pane, choose Users and then choose Add user.

 3. For User name, enter Administrator.

 4. Select the check box next to AWS Management Console access. Then select Custom
    password, and then enter your new password in the text box.

 5. (Optional) By default, AWS requires the new user to create a new password when first
    signing in. You can clear the check box next to User must create a new password at
    next sign-in to allow the new user to reset their password after they sign in.

 6. Choose Next: Permissions.

 7. Under Set permissions, choose Add user to group.

 8. Choose Create group.

 9. In the Create group dialog box, for Group name enter Administrators.

10. Choose Filter policies, and then select AWS managed - job function to filter the table
    contents.

11. In the policy list, select the check box for AdministratorAccess. Then choose Create
    group.

      You must activate IAM user and role access to Billing before you can use the
      AdministratorAccess permissions to access the AWS Billing and Cost Management console. To
      do this, follow the instructions in step 1 of the tutorial about delegating access to the billing
      console.

12. Back in the list of groups, select the check box for your new group. Choose Refresh if
    necessary to see the group in the list.

13. Choose Next: Tags.


14. (Optional) Add metadata to the user by attaching tags as key-value pairs. For more
    information about using tags in IAM, see Tagging IAM Entities in the IAM User Guide.

15. Choose Next: Review to see the list of group memberships to be added to the new user.
    When you are ready to proceed, choose Create user.

  You can use this same process to create more groups and users and to give your users
  access to your AWS account resources. To learn about using policies that restrict user
  permissions to specific AWS resources, see Access Management and Example Policies.
  To sign in as this new IAM user, sign out of the AWS console, then use the following URL,
  where your_aws_account_id is your AWS account number without the hyphens (for example,
  if your AWS account number is 1234-5678-9012, your AWS account ID is 123456789012):

           https://your_aws_account_id.signin.aws.amazon.com/console/
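
  As a minimal sketch of the rule above, the following shell function strips the hyphens from an
  account number and prints the resulting sign-in URL. The function name iam_signin_url is
  hypothetical; the account number is the example value used in this section.

```shell
# Hypothetical helper: build the IAM sign-in URL from an AWS account number.
iam_signin_url() {
    # Remove the hyphens, for example 1234-5678-9012 becomes 123456789012.
    account_id=$(printf '%s' "$1" | tr -d '-')
    printf 'https://%s.signin.aws.amazon.com/console/\n' "$account_id"
}

iam_signin_url "1234-5678-9012"
# prints https://123456789012.signin.aws.amazon.com/console/
```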

  Enter the IAM user name (not your email address) and password that you just created. When
  you're signed in, the navigation bar displays "your_user_name @ your_aws_account_id".
  If you don't want the URL for your sign-in page to contain your AWS account ID, you can
  create an account alias. From the IAM console, choose Dashboard in the navigation pane.
  From the dashboard, choose Customize and enter an alias such as your company name. To
  sign in after you create an account alias, use the following URL:

           https://your_account_alias.signin.aws.amazon.com/console/

  To verify the sign-in link for IAM users for your account, open the IAM console and check
  under IAM users sign-in link on the dashboard.
  For more information about IAM, see IAM and Amazon EC2.


CREATE A KEY PAIR
AWS uses public-key cryptography to secure the login information for your instance. A Linux
instance has no password; you use a key pair to log in to your instance securely. You specify the
name of the key pair when you launch your instance, then provide the private key when you log
in using SSH.
 If you haven't created a key pair already, you can create one using the Amazon EC2 console.
 Note that if you plan to launch instances in multiple regions, you'll need to create a key pair in
 each region. For more information about regions, see Regions, Availability Zones, and Local
 Zones.
 To create a key pair

1. Sign in to AWS using the URL that you created in the previous
   section.

2. From the AWS dashboard, choose EC2 to open the Amazon EC2
   console.

3. From the navigation bar, select a region for the key pair. You can
   select any region that's available to you, regardless of your location.
   However, key pairs are specific to a region; for example, if you plan
   to launch an instance in the US East (Ohio) Region, you must
   create a key pair for the instance in the US East (Ohio) Region.

4. In the navigation pane, under NETWORK & SECURITY, choose Key Pairs. The navigation pane is on
   the left side of the console. If you do not see the pane, it might be minimized; choose the
   arrow to expand the pane. You may have to scroll down to see the Key Pairs link.

   Figure 7: Select a region
   Figure 8: Key Pairs

5. Choose Create Key Pair.

6. Enter a name for the new key pair in the Key pair name field of the Create Key Pair dialog
   box, and then choose Create. Use a name that is easy for you to remember, such as your
   IAM user name, followed by -key-pair, plus the region name. For example,
   me-key-pair-useast2.

7. The private key file is automatically downloaded by your browser. The base file name is the
   name you specified as the name of your key pair, and the file name extension is .pem. Save
   the private key file in a safe place.


         IMPORTANT: This is the only chance for you to save the private key file. You'll need to
        provide the name of your key pair when you launch an instance and the corresponding
        private key each time you connect to the instance.

After you launch your instance, if you use Windows, we recommend that you use PuTTY to
connect to your AWS EC2 Linux instance. First convert your .pem file to a .ppk file
using PuTTYgen (see below).
After you launch your instance, if you use macOS or Linux, you can connect using secure
shell (SSH) from your terminal. Before connecting with SSH, use the following command to set
the permissions of your private key file so that only you can read it:

               chmod 400 your_user_name-key-pair-region_name.pem

If you do not set these permissions, then you cannot connect to your instance using this key
pair. For more information, see Error: Unprotected Private Key File.
For more information, see Amazon EC2 Key Pairs.
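
The permission step above can be wrapped in a small check. This is a minimal sketch under stated
assumptions: the function name secure_key is hypothetical, and it simply applies chmod 400 and
prints the resulting permission string so that you can confirm only the owner can read the key.

```shell
# Hypothetical helper: restrict a private key file and show the resulting mode.
secure_key() {
    key_file="$1"
    [ -f "$key_file" ] || { echo "error: key file not found: $key_file" >&2; return 1; }
    chmod 400 "$key_file" || return 1
    # The first 10 characters of `ls -l` are the file type and permission bits.
    ls -l "$key_file" | cut -c1-10
}
```

For a regular key file, secure_key your_user_name-key-pair-region_name.pem prints -r--------,
which means the file is readable by its owner only.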


 To prepare to connect to a Linux instance from Windows using PuTTY

1. Download and install PuTTY from http://www.chiark.greenend.org.uk/~sgtatham/putty/.
   Be sure to install the entire suite.

2. Start PuTTYgen (for example, from the Start menu, choose All Programs > PuTTY >
   PuTTYgen).

3. Under Type of key to generate, choose RSA.

   Figure 9: Type of key to generate

4. Choose Load. By default, PuTTYgen displays only files with the extension .ppk. To locate your
   .pem file, select the option to display files of all types.

   Figure 10: Choose All Files

5. Select the private key file that you created in the previous procedure and choose Open.
   Choose OK to dismiss the confirmation dialog box.

6. Choose Save private key. PuTTYgen displays a warning about saving the key without a
   passphrase. Choose Yes.

7. Specify the same name for the key that you used for the key pair. PuTTY automatically adds
   the .ppk file extension.


CREATE A VIRTUAL PRIVATE CLOUD (VPC)
 Amazon VPC enables you to launch AWS resources into a virtual network that you've
 defined, known as a virtual private cloud (VPC). The newer EC2 instance types require that
 you launch your instances in a VPC. If you have a default VPC, you can skip this section and
 move to the next task, Create a Security Group. To determine whether you have a default
 VPC, open the Amazon EC2 console and look for Default VPC under Account Attributes on
 the dashboard. If you do not have a default VPC listed on the dashboard, you can create a
 nondefault VPC using the steps below.
 To create a nondefault VPC

1. Open the Amazon VPC console at https://console.aws.amazon.com/vpc/.

2. From the navigation bar, select a region for the VPC. VPCs are specific to a region, so you
   should select the same region in which you created your key pair.

3. On the VPC dashboard, choose Launch VPC Wizard.

4. On the Step 1: Select a VPC Configuration page, ensure that VPC with a Single
   Public Subnet is selected, and choose Select.

5. On the Step 2: VPC with a Single Public Subnet page, enter a friendly name for your
   VPC in the VPC name field. Leave the other default configuration settings, and choose
   Create VPC. On the confirmation page, choose OK.

 For more information about VPCs, see the Amazon VPC User Guide.


CREATE A SECURITY GROUP
 Security groups act as a firewall for associated instances, controlling both inbound and
 outbound traffic at the instance level. You must add rules to a security group that enable you to
 connect to your instance from your IP address using SSH. You can also add rules that allow
 inbound and outbound HTTP and HTTPS access from anywhere.
 Note that if you plan to launch instances in multiple regions, you'll need to create a security
 group in each region. For more information about regions, see Regions, Availability Zones,
 and Local Zones.
 Prerequisites
 You'll need the public IPv4 address of your local computer. The security group editor in the
 Amazon EC2 console can automatically detect the public IPv4 address for you. Alternatively,
 you can use the search phrase "what is my IP address" in an Internet browser, or use the
 following service: Check IP. If you are connecting through an Internet service provider (ISP) or
 from behind a firewall without a static IP address, you need to find out the range of IP
 addresses used by client computers.
 To create a security group with least privilege

1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

     Alternatively, you can use the Amazon VPC console to create a security group. However, the
     instructions in this procedure don't match the Amazon VPC console. Therefore, if you switched
     to the Amazon VPC console in the previous section, either switch back to the Amazon EC2
     console and use these instructions, or use the instructions in Set Up a Security Group for Your
     VPC in the Amazon VPC Getting Started Guide.


2. From the navigation bar, select a region for the security group. Security groups are specific
   to a region, so you should select the same region in which you created your key pair.

   Figure 11: Select a region

3. Choose Security Groups in the navigation pane.

4. Choose Create Security Group.

5. Enter a name for the new security group and a description. Use a name that is easy for you
   to remember, such as your IAM user name, followed by _SG_, plus the region name. For
   example, me_SG_uswest2.

6. In the VPC list, select your VPC. If you have a default VPC, it's the one that is marked with
   an asterisk (*).

7. On the Inbound tab, create the following rules (choose Add Rule for each new rule), and
   then choose Create:

   •   Choose HTTP from the Type list, and make sure that Source is set to Anywhere
       (0.0.0.0/0).

   •   Choose SSH from the Type list. In the Source box, choose My IP to automatically
       populate the field with the public IPv4 address of your local computer. Alternatively,
       choose Custom and specify the public IPv4 address of your computer or network in
       CIDR notation. To specify an individual IP address in CIDR notation, add the routing
       suffix /32, for example, 203.0.113.25/32. If your company allocates addresses from a
       range, specify the entire range, such as 203.0.113.0/24.

           IMPORTANT: For security reasons, we don't recommend that you allow SSH access
           from all IPv4 addresses (0.0.0.0/0) to your instance, except for testing purposes and only
           for a short time.


     For more information, see Amazon EC2 Security Groups for Linux Instances.
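
     The CIDR rule for a single address can be sketched in shell as follows. This is an
     illustration only: the function name to_cidr32 is hypothetical, and it covers just the
     individual-address case (/32), not address ranges such as 203.0.113.0/24.

```shell
# Hypothetical helper: turn one IPv4 address into a /32 CIDR block.
to_cidr32() {
    ip="$1"
    # Accept only four dot-separated groups of one to three digits.
    printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    # Walk through the octets and check that each one is in the range 0-255.
    rest=$ip
    while [ -n "$rest" ]; do
        octet=${rest%%.*}
        [ "$octet" -le 255 ] || return 1
        case "$rest" in
            *.*) rest=${rest#*.} ;;
            *) rest= ;;
        esac
    done
    printf '%s/32\n' "$ip"
}

to_cidr32 "203.0.113.25"
# prints 203.0.113.25/32
```

     The printed value is what you would enter in the Source box after choosing Custom for the
     SSH rule.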


 Launch an EC2-Instance
 This is the cloud-based server on which you will be running the GeoMx NGS Pipeline software.
 Follow the AWS instructions below, which are borrowed heavily from:
 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html
 Launch an Instance
 You can launch a Linux instance using the AWS Management Console as described in the
 following procedure. This tutorial is intended to help you launch your first instance quickly, so it
 doesn't cover all possible options. For more information about the advanced options, see
 Launching an Instance.

1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

2. From the console dashboard, choose Launch Instance.

3. The Choose an Amazon Machine Image (AMI) page displays a list of basic configurations,
   called Amazon Machine Images (AMIs), that serve as templates for your instance. Select an
   HVM version of Ubuntu 18.04. Notice that these AMIs are marked "Free tier eligible."

4. On the Choose an Instance Type page, you can select the hardware configuration of your
   instance. Select the t3.xlarge type. This type is not the default selection and is not
   eligible for the free tier, so standard Amazon EC2 charges apply.

5. Choose Review and Launch to let the wizard complete the other configuration settings for
   you.

6. On the Review Instance Launch page, under Security Groups, you'll see that the wizard
   created and selected a security group for you. You can use this security group, or alternatively
   you can select the security group that you created when getting set up using the following
   steps:

   •   Choose Edit security groups.

   •   On the Configure Security Group page, ensure that Select an existing security group is
       selected.

   •   Select your security group from the list of existing security groups, and then choose Review
       and Launch.

7. On the Review Instance Launch page, choose Launch.

8. When prompted for a key pair, select Choose an existing key pair, then select the key pair
   that you created when getting set up.


       Alternatively, you can create a new key pair. Select Create a new key pair, enter a name for
       the key pair, and then choose Download Key Pair. This is the only chance for you to save the
       private key file, so be sure to download it. Save the private key file in a safe place. You'll need
       to provide the name of your key pair when you launch an instance and the corresponding
       private key each time you connect to the instance.

              IMPORTANT: Don't select the Proceed without a key pair option. If you launch your
             instance without a key pair, then you can't connect to it.

       When you are ready, select the acknowledgment check box, and then choose Launch
       Instances.

 9. A confirmation page lets you know that your instance is launching. Choose View Instances to
    close the confirmation page and return to the console.

10. On the Instances screen, you can view the status of the launch. It takes a short time for an
    instance to launch. When you launch an instance, its initial state is pending. After the instance
    starts, its state changes to running and it receives a public DNS name. (If the Public DNS
    (IPv4) column is hidden, choose Show/Hide Columns (the gear-shaped icon) in the top right
    corner of the page and then select Public DNS (IPv4).)

11. It can take a few minutes for the instance to be ready so that you can connect to it. Check that
    your instance has passed its status checks; you can view this information in the Status
    Checks column.

  •    Instance specifics: t3.xlarge running Ubuntu 18.04


 Connect to your instance
 The instructions here use PuTTY to connect to the instance. Follow the AWS instructions below,
 which are borrowed heavily from:
 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html
 You need the .ppk file that you created for your private key. For more information, see Convert
 Your Private Key Using PuTTYgen in the preceding section. If you receive an error while
 attempting to connect to your instance, see Troubleshooting Connecting to Your Instance.
 To connect to your instance using PuTTY

1. Start PuTTY (from the Start menu, choose All Programs, PuTTY, PuTTY).

2. In the Category pane, choose Session and complete the following fields:

   •   In the Host Name box (Public DNS): To connect using your instance's public DNS, enter
       user_name@public_dns_name
       For information about how to get the public DNS name or IPv6 address of the instance, see
       Get Information About Your Instance. For user_name, be sure to specify the appropriate user
       name for your AMI. For example, for an Ubuntu AMI the user name is ubuntu. If neither that
       user name nor root works, check with the AMI provider.

       Figure 12: PuTTY configuration window

   •   Ensure that the Port value is 22.

   •   Under Connection type, select SSH.

3. (Optional) You can configure PuTTY to automatically send 'keepalive' data at regular intervals
   to keep the session active. This is useful to avoid disconnecting from your instance due to
   session inactivity. In the Category pane, choose Connection, and then enter the required
   interval in the Seconds between keepalives field. For example, if your session disconnects
   after 10 minutes of inactivity, enter 180 to configure PuTTY to send keepalive data every 3
   minutes.


4. In the Category pane, expand Connection, expand SSH, and then choose Auth. Complete the
   following:

     •   Choose Browse.

     •   Select the .ppk file that you generated for your key pair and choose Open.

     •   (Optional) If you plan to start this session again later, you can save the session
         information for future use. Under Category, choose Session, enter a name for the session
         in Saved Sessions, and then choose Save.

     •   Choose Open.

     Figure 13: Auth settings

5. If this is the first time you have connected to this instance, PuTTY displays a security alert
   dialog box that asks whether you trust the host to which you are connecting.

     •   (Optional) Verify that the fingerprint in the security alert dialog box matches the fingerprint
         that you previously obtained in (Optional) Get the Instance Fingerprint. If these fingerprints
         don't match, someone might be attempting a "man-in-the-middle" attack. If they match,
         continue to the next step.

     •   Choose Yes. A window opens and you are connected to your instance.

          If you specified a passphrase when you converted your private key to PuTTY's format, you must
          provide that passphrase when you log in to the instance.

 If you receive an error while attempting to connect to your instance, see Troubleshooting
 Connecting to Your Instance.
 To connect to your instance on macOS or Linux
 You need your private key: the .pem file that you downloaded in step 7 of "Create a key pair" and
 then restricted with chmod 400. You also need to know the path to this .pem file.
 In the AWS console web portal, open the EC2 menu, select your instance, and then choose Connect
 from the Actions menu. The Connect dialog shows instructions for connecting to your instance.
 Open your terminal or console and paste or type in the ssh command. You need to specify the path
 to your .pem file if it is not in the current directory.
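
 The general shape of that ssh command can be sketched as follows. This is an illustration under
 stated assumptions, not the exact text shown by the AWS console: the function name ssh_command
 and the key path are hypothetical, the default user name ubuntu applies to Ubuntu AMIs, and the
 OpenSSH ServerAliveInterval option is used here as the counterpart of the PuTTY keepalive setting
 described earlier. The function only prints the command so that you can review it before running it.

```shell
# Hypothetical helper: print the ssh command for an instance for review before use.
# ssh_command KEY_FILE PUBLIC_DNS_NAME [USER_NAME] [KEEPALIVE_SECONDS]
ssh_command() {
    key_file="$1"
    host="$2"
    user="${3:-ubuntu}"        # default user name on Ubuntu AMIs
    keepalive="${4:-180}"      # seconds between keepalive messages
    printf 'ssh -i "%s" -o ServerAliveInterval=%s %s@%s\n' \
        "$key_file" "$keepalive" "$user" "$host"
}

# Example with placeholder values; replace them with your key path and public DNS name.
ssh_command "$HOME/keys/me-key-pair-useast2.pem" "public_dns_name"
```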


                            Figure 14: Connecting to an instance


 Set up an EFS Drive
 This is the cloud-based storage location in which you will store data files. Follow the AWS
 instructions below, which are borrowed heavily from:
 https://aws.amazon.com/getting-started/tutorials/create-network-file-system/
 Create a File System
 You can easily create a highly available and scalable network file system from the Amazon EFS
 console.

1. Open the AWS Management Console.

     •   Enter your user name and password to get started.

     •   Find EFS under Storage, and click to open the EFS Console.

     Figure 15: AWS Services

2. In the Amazon EFS console, click Create file system.

   Figure 16: Create File System in EFS console


3. If the Default VPC is not selected in the
   VPC dropdown field, select the dropdown
   arrow and select the Default VPC. Accept
   all the defaults in Step 1: Configure file
   system access and click Next Step.

                                                          Figure 17: Configure file system access

4. Accept all the defaults in Step 2:
   Configure optional settings and click
   Next Step.

                                                          Figure 18: Configure optional settings

5. Accept all the defaults in Step 3: Review
   and create and click Create File System.

                                                              Figure 19: Review and create

 •   We recommend selecting Max I/O in the "Choose performance mode" section.

 •   You can name your EFS drive by selecting your EFS drive and tagging it. To do this: select
     your EFS drive, click Manage Tags, enter Name under Key and a unique identifier of your
     choice under Value.
