
Setup DESeq2 Public App

DESeq2 is a Bioconductor package for differential gene expression (DGE) analysis that fits a negative binomial model to count data. It requires a counts table as input, along with a phenotype file describing the experimental groups.

DESeq2 performs multiple steps including:

  • estimating size factors to account for differences in library depth
  • estimating gene-wise dispersions to generate accurate estimates of within-group variation
  • shrinking the dispersion estimates, which reduces false positives in the DGE analysis
  • hypothesis testing using the Wald test or Likelihood Ratio test
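
The first step in the list can be made concrete with a small sketch. DESeq2 estimates size factors with the median-of-ratios method by default; the JavaScript below is only an illustration of that idea on made-up counts, not DESeq2's R implementation. The function names and the toy matrix are assumptions introduced for this example.

```javascript
// Median-of-ratios size factors: an illustrative sketch, not DESeq2's code.
// counts[g][s] is the read count for gene g in sample s (toy numbers).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function sizeFactors(counts) {
  const nSamples = counts[0].length;
  // A gene with a zero count has no finite log geometric mean, so skip it.
  const usable = counts.filter((row) => row.every((c) => c > 0));
  // Per-gene log geometric mean across samples (the pseudo-reference).
  const logGeoMeans = usable.map(
    (row) => row.reduce((sum, c) => sum + Math.log(c), 0) / nSamples
  );
  const factors = [];
  for (let s = 0; s < nSamples; s++) {
    // Log ratio of each gene to the pseudo-reference, then take the median.
    const logRatios = usable.map((row, g) => Math.log(row[s]) - logGeoMeans[g]);
    factors.push(Math.exp(median(logRatios)));
  }
  return factors;
}

// Sample 2 was sequenced about twice as deeply as sample 1.
const toyCounts = [
  [10, 20],
  [100, 200],
  [5, 10],
  [0, 3], // ignored: contains a zero
];
const factors = sizeFactors(toyCounts);
console.log(factors);
```

Dividing each sample's counts by its size factor puts the samples on a comparable scale. In this toy case the second factor is twice the first, which matches the two-fold difference in depth.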

DESeq2 automatically removes outlier genes from the analysis using Cook's distance and filters out genes with low counts. Filtering improves detection power because fewer tests make the multiple testing adjustment of the p-values less severe. Refer to the DESeq2 vignette for a more detailed explanation, helpful suggestions, and examples.
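
To see why testing fewer genes softens the adjustment, here is a minimal sketch of the Benjamini-Hochberg procedure, the default p-value adjustment in DESeq2. The p-values are invented, and the "filter" simply drops the uninformative genes by hand. DESeq2 itself filters on mean normalized count, not on p-values, so treat this purely as an illustration of the arithmetic.

```javascript
// Benjamini-Hochberg adjusted p-values, returned in the input order.
function bhAdjust(pValues) {
  const n = pValues.length;
  // Indices sorted by ascending p-value.
  const order = pValues.map((p, i) => i).sort((a, b) => pValues[a] - pValues[b]);
  const adjusted = new Array(n);
  let runningMin = 1;
  // Walk from the largest p-value down so adjusted values stay monotone.
  for (let rank = n; rank >= 1; rank--) {
    const idx = order[rank - 1];
    runningMin = Math.min(runningMin, (pValues[idx] * n) / rank);
    adjusted[idx] = runningMin;
  }
  return adjusted;
}

// Toy p-values: three genes with signal, seven low-count genes with none.
const signal = [0.001, 0.01, 0.02];
const lowCount = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99];

const allGenes = bhAdjust([...signal, ...lowCount]); // 10 tests
const filtered = bhAdjust(signal); // 3 tests after filtering

console.log("adjusted, all genes:", allGenes.slice(0, 3));
console.log("adjusted, filtered: ", filtered);
```

The same three raw p-values receive smaller adjusted values when only three tests are performed, because each p-value is scaled by the number of tests.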

Cavatica offers DESeq2 as a standalone public app, which consists of a Common Workflow Language (CWL) wrapper around a script that uses functions from the DESeq2 package. In this lesson we learn to copy, edit, and set up the DESeq2 app in the project folder containing the cancer data files.

Terminology

  • Count data - represents the number of sequence reads that originated from a particular gene
  • Dispersion - a measure of spread or variability in the data. DESeq2 dispersion estimates are inversely related to the mean and directly related to variance
  • LFC - log2 fold change
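
As a quick worked example of the last term, the sketch below computes a log2 fold change from two group means. The numbers and the function name are made up for illustration. DESeq2 estimates LFC from its fitted model, not from a raw ratio like this, so the snippet only shows how to read the scale.

```javascript
// log2 fold change between two group means (toy normalized counts).
function log2FoldChange(treatedMean, controlMean) {
  return Math.log2(treatedMean / controlMean);
}

const up = log2FoldChange(400, 100); // four-fold higher in treated
const down = log2FoldChange(50, 100); // two-fold lower in treated
console.log(up, down);
```

A positive LFC means higher expression in the treated group, a negative LFC means lower expression, and each unit corresponds to a doubling or halving.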

Step 1: Search & copy DESeq2 app

Vidlets

We recommend watching the vidlets before following along with the step-by-step written instructions.

The first step is to obtain a copy of the DESeq2 app in the project folder.

  • Click the Apps tab, which is currently empty, and click the Add App button to open the list of Public Apps.
  • You can find the DESeq2 app by typing "DESEQ" in the search bar.
  • In the DESeq2 app box, select the Other versions drop-down box and click version 1.18.1.
  • This opens the app in a new tab, where you can click the icon in the right-hand corner and then click Copy.
  • Select the project folder cancer-dge (or the project name you have chosen) and click Copy.
  • Navigate to your project Dashboard using the Projects drop-down menu and view the app under the Apps tab. You can also click the project link in the popup box that appears at the top of the page.

Step 2: Edit DESeq2 app (Optional)

DESeq2 App Version

The IgnoreTxVersion bug was fixed in Revision 17 of the DESeq2 1.18.1 app, which is the default selection when you copy the app. Follow the steps in this section only if you are using an older revision of the DESeq2 1.18.1 app.

Older revisions of the DESeq2 app have a bug with the IgnoreTxVersion parameter that can be rectified by editing the app using the tool editor.

  • To do so, click on DESeq2 in the Apps tab. This opens the app page.
  • Click the Edit button in the upper right-hand corner. A popup box appears with a warning message about losing update notifications for the original app. Click Proceed to editing.
  • In the DESeq2 tool editor, find the IgnoreTxVersion input port and click on it.
  • In the Value transform field of the port, click on </>, enter the following code, and click Save.

    {
        if ($job.inputs.ignoreTxVersion) {
            return "TRUE"
        } else {
            return "FALSE"
        }
    }
    
  • Click the icon in the top right-hand corner to add a revision note.

  • On the app page, the revision history is updated to read Revision 1.
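
To check what the value transform from this step returns before saving it, the expression body can be wrapped in an ordinary function and run locally with Node.js. This harness is hypothetical: the function name and the mocked `$job` objects are assumptions for illustration, and on Cavatica the `$job` object is supplied by the platform. The expression converts the boolean input into the strings "TRUE" and "FALSE", which match R's logical literals. That is presumably why the wrapped R script expects them, though the lesson does not state it.

```javascript
// Hypothetical local harness for the Value transform expression.
// `$job` is mocked here; on the platform the tool editor provides it.
function ignoreTxVersionTransform($job) {
  if ($job.inputs.ignoreTxVersion) {
    return "TRUE";
  } else {
    return "FALSE";
  }
}

const whenChecked = ignoreTxVersionTransform({ inputs: { ignoreTxVersion: true } });
const whenUnchecked = ignoreTxVersionTransform({ inputs: { ignoreTxVersion: false } });
const whenUnset = ignoreTxVersionTransform({ inputs: {} }); // input left empty

console.log(whenChecked, whenUnchecked, whenUnset);
```

An input that is left unset is treated the same as an unchecked one, because `undefined` is falsy in JavaScript.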

Step 3: Obtain reference gene annotation

The DESeq2 app requires a reference gene annotation file in GTF format to summarize the transcript-level abundances contained in the Kallisto files for gene-level analysis. Internally, the app uses tximport, another Bioconductor package, to obtain the gene-level summary.
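
The sketch below illustrates the idea behind this summarization and why the IgnoreTxVersion option matters. It builds a transcript-to-gene map from GTF `transcript` lines and sums transcript counts per gene, optionally stripping the version suffix (the part after the `.`) from transcript IDs so that they match the IDs in the annotation. This is not tximport's implementation. The IDs, counts, and function names are invented, and the sketch silently drops unmatched transcripts.

```javascript
// Toy GTF transcript lines: tab-separated, attributes in the 9th column.
// All identifiers below are shortened, hypothetical examples.
const gtfLines = [
  "#!genome-build toy",
  ["1", "toy", "transcript", "100", "900", ".", "+", ".",
    'gene_id "ENSG0001"; transcript_id "ENST0001";'].join("\t"),
  ["1", "toy", "transcript", "150", "900", ".", "+", ".",
    'gene_id "ENSG0001"; transcript_id "ENST0002";'].join("\t"),
  ["2", "toy", "transcript", "500", "800", ".", "-", ".",
    'gene_id "ENSG0002"; transcript_id "ENST0003";'].join("\t"),
];

function buildTx2Gene(lines) {
  const tx2gene = new Map();
  for (const line of lines) {
    if (line.startsWith("#")) continue; // header comment
    const fields = line.split("\t");
    if (fields.length < 9 || fields[2] !== "transcript") continue;
    const gene = /gene_id "([^"]+)"/.exec(fields[8]);
    const tx = /transcript_id "([^"]+)"/.exec(fields[8]);
    if (gene && tx) tx2gene.set(tx[1], gene[1]);
  }
  return tx2gene;
}

function summarizeToGene(abundance, tx2gene, ignoreTxVersion) {
  const geneCounts = new Map();
  for (const { targetId, estCounts } of abundance) {
    // With ignoreTxVersion, "ENST0001.2" is looked up as "ENST0001".
    const key = ignoreTxVersion ? targetId.split(".")[0] : targetId;
    const gene = tx2gene.get(key);
    if (gene === undefined) continue; // unmatched transcript: dropped in this sketch
    geneCounts.set(gene, (geneCounts.get(gene) || 0) + estCounts);
  }
  return geneCounts;
}

// Quantification output with versioned transcript IDs (toy values).
const abundance = [
  { targetId: "ENST0001.2", estCounts: 10.5 },
  { targetId: "ENST0002.1", estCounts: 4.5 },
  { targetId: "ENST0003.5", estCounts: 7 },
];

const tx2gene = buildTx2Gene(gtfLines);
const withStripping = summarizeToGene(abundance, tx2gene, true);
const withoutStripping = summarizeToGene(abundance, tx2gene, false);
console.log(withStripping, withoutStripping);
```

When the version suffix is kept, none of the versioned IDs match the unversioned IDs in the annotation and the gene-level table comes out empty. This is the kind of mismatch the IgnoreTxVersion option is meant to avoid.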

  • Navigate to the Files tab and edit the metadata columns to show the Reference genome column. To do so, click on the icon and select Reference genome. All files in this dataset used the GRCh38 (hg38) Homo sapiens genome assembly released by the Genome Reference Consortium.
  • Click on the Data drop-down menu and click on Public Reference Files.
  • This takes you to a new page for Public Files.
  • Click on the Type: All button to bring up a drop-down list and select GTF.
  • From the results, select Homo_sapiens.GRCh38.84.gtf, which is the Ensembl Release 84 version of the human gene annotation in GTF format.
  • Click on Copy and select the project folder with the cancer files.
  • Select Copy in the popup window.
  • A notification will confirm that the file was copied successfully. Clicking on the project folder name takes you to the Files tab of that folder.
  • Check for the reference file by clicking the Type: All button and selecting GTF.

In our next lesson, we will learn to edit our previously downloaded phenotype file and upload it to Cavatica!


Last update: February 24, 2021