Differential Gene Expression Analysis on Cavatica Cloud Platform
RNA sequencing (RNA-Seq) is a high-throughput technique that provides qualitative and quantitative information about RNA biology, including transcriptome-wide expression quantification, discovery of novel genes and gene isoforms, and differential expression.
The goal of this tutorial is to enable you to:
- create virtual cancer cohorts using the NIH Common Fund-supported Gabriella Miller Kids First Data Portal (KF Portal).
- analyze differential gene expression (DGE) on Cavatica, an integrated cloud-based platform.
You will learn two different approaches to DGE analysis using open access human cancer data on Cavatica: (a) using a public workflow app and (b) running code from an analysis script on an instance with an RStudio computational environment.
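To make the idea of differential expression concrete, here is a minimal Python sketch on toy data. It is not the DESeq2 method used in the lessons: DESeq2 estimates size factors and fits a negative binomial model, whereas this sketch only scales counts by library size (counts per million) and compares group means. The gene names, sample groups, counts, and pseudocount are all hypothetical.

```python
import math

# Hypothetical raw read counts: columns 0-1 are tumor samples, 2-3 are normal samples.
counts = {
    "GENE_A": [400, 400, 100, 100],
    "GENE_B": [100, 100, 100, 100],
    "GENE_C": [500, 500, 800, 800],
}
tumor_idx = [0, 1]
normal_idx = [2, 3]


def counts_per_million(counts):
    """Scale each sample's counts by its total read count (library size)."""
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        gene: [1e6 * c / t for c, t in zip(row, totals)]
        for gene, row in counts.items()
    }


def log2_fold_change(values, case_idx, control_idx, pseudocount=1.0):
    """log2 ratio of mean case expression to mean control expression.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    case_mean = sum(values[i] for i in case_idx) / len(case_idx)
    control_mean = sum(values[i] for i in control_idx) / len(control_idx)
    return math.log2((case_mean + pseudocount) / (control_mean + pseudocount))


cpm = counts_per_million(counts)
lfc = {
    gene: log2_fold_change(values, tumor_idx, normal_idx)
    for gene, values in cpm.items()
}

for gene, value in lfc.items():
    print(f"{gene}\tlog2FC = {value:.2f}")
```

A positive log2 fold change means higher expression in the tumor group, a negative value means lower expression, and a value near zero means no change. A real analysis also needs a statistical test and multiple-testing correction, which the DESeq2-based lessons cover.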
Table of contents
Est. Time | Lesson Name | Description |
---|---|---|
10 mins | An Introduction to RNA-Seq | Background about RNA-Seq |
20 mins | Selecting Kids First Cancer Cohort | Select Kids First open access cancer RNA-Seq files and push to Cavatica |
20 mins | Cavatica - View, Filter, Tag and Download | Filter imported data, tag and download relevant metadata from Cavatica |
20 mins | Setup DESeq2 Public App | Setting up the workflow app based on DESeq2 on Cavatica |
15 mins | Phenotype File and Upload to Cavatica | Reformat metadata file and upload it to Cavatica |
50 mins | Analysis with DESeq2 Public App | Run the DESeq2 app with appropriate inputs and computational settings |
60 mins | Analysis using Data Cruncher | Analysis on an instance in the RStudio environment |
Learning Objectives
- learn to build virtual cohorts on the KF Portal
- learn to navigate the project folder and perform file operations on Cavatica
- learn to upload data to and download data from Cavatica
- learn to search, copy, and edit public workflow apps on Cavatica
- learn to perform differential gene expression (DGE) analysis using the DESeq2 app
- learn to set up an analysis environment and execute code for DGE analysis
Prerequisites
- Setup: Integrated login accounts on the Kids First Data Portal & Cavatica - Follow our lessons on account setup and connecting the two accounts.
Login Credentials
You do not need an eRA Commons ID to complete this lesson!
- Background: Knowledge of biology and rudimentary genetics.
- Technology: Basic knowledge of R and command line. Familiarity with RStudio is useful.
- Financial: Pilot funds ($100) are provided to every Cavatica user with a linked KF account.
- Time: Verification during initial account setup may take from a few hours to a day. Setting up an eRA Commons ID may take days, depending on your institution.
- DESeq2 app < $1.00
- Analysis with R < $1.00
Last update: December 10, 2021