Example Data Files
This tutorial offers two ways to obtain the example files. You can either use example files compiled from open access data in multiple repositories, or use open access data from Kids First, a Common Fund program.
General Example Files
Download the compressed archive and extract it to view the files.
curl -L https://osf.io/erj8n/download -o MIME_example_files.tar.gz
tar -zxvf MIME_example_files.tar.gz
The extracted example_files_MIME directory contains:
.
├── 6285633006_R03C01_Red.idat
├── ERR458493.fastq
├── HCM-BROD-0028-C71.json
├── MANIFEST.txt
├── TARGET_NBL_ClinicalData_Discovery_20200507.xlsx
├── TCGA.UCS.varscan.a35f9f72-e4da-4d93-ad40-11cbd417b037.DR-10.0.somatic.maf
├── annotations.txt
├── coatColor.pheno
├── commit-detached.svg
├── nationwidechildrens.org_clinical.TCGA-AA-A00L.xml
├── pruned_coatColor_maf_geno.vcf
├── wa.BG02.rep-1.J7.hg19.pks.bed
├── wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam
├── wgEncodeUwRepliSeqBg02esS3AlnRep1.bam.bai
└── wgEncodeUwRepliSeqBg02esS3PctSignalRep1.bigWig
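To confirm that the download and extraction worked, you could spot-check a few entries from the listing above. The following is a minimal sketch, not part of the original tutorial: `check_expected_files` is a hypothetical helper name, and the sketch assumes the archive extracted into `example_files_MIME` in the current working directory.

```shell
# Hypothetical helper: report any expected file that is missing from a directory.
# Usage: check_expected_files DIR FILE...
# The body runs in a subshell, so its variables do not leak into your session.
check_expected_files() (
    dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $name"
            missing=$((missing + 1))
        fi
    done
    echo "$missing file(s) missing from $dir"
    # Exit status is 0 only when every expected file was found.
    [ "$missing" -eq 0 ]
)

# Spot-check three names taken from the listing above.
# Assumption: the archive was extracted in the current directory.
if [ -d example_files_MIME ]; then
    check_expected_files example_files_MIME \
        ERR458493.fastq MANIFEST.txt coatColor.pheno || true
fi
```

Pass any other file names from the listing as extra arguments to check them as well.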
Kids First Files
For this tutorial, you can also use open access files from the Kids First Data Portal (KF Data Portal).
KF Data Portal Login
Downloading the files requires an account on the KF Data Portal. You can choose one of three available options (Google, ORCID, Facebook) to set up an account, although we recommend using ORCID if you have one. More details can be found in the Registration tutorial for Kids First.
Download the KF_File_ID.csv. Log in to the Kids First Data Portal, then select File Repository and File Filters. Scroll all the way down to the Search by File ID field and click on upload your list of ids.
You can either copy and paste the File IDs into the box or upload the KF_File_ID.csv. To obtain a local copy, select View Results and, for each individual file, click the download button under the Actions column.
The directory structure of KF_data with all the files should include:
.
├── 254ca234-b7f2-4305-ab93-d7e15f4a530d.rsem.isoforms.results
├── 32503692.gpr
├── 801039e0-8675-452f-86bf-474297f17cbf.mutect2_somatic.vep.maf
├── 80c5a306-15a0-4241-bedb-cf153a1fb866.rsem.isoforms.results
├── 86d75258-8a3a-4f6c-b108-fd26d1bcabf4.arriba.fusions.pdf
├── 9969477031_R02C01_Red.idat
├── ad45e585-d356-4d8b-97cd-4e050418754c.mutect2_somatic.vep.maf
├── ba6f060d-582b-46c1-833a-9c0d6e68ccc5.STAR.fusion_predictions.abridged.coding_effect.tsv
├── c133da79-1f7c-41e1-a081-e3aa0d39e764.rsem.isoforms.results
├── dfe951a5-0b00-42e3-81f2-99aaaf54b898.CNVs.p.value.txt
├── e7adf495-d7da-4b85-8f27-e1f7ecbe45a3.png
├── ee49988d-8ab8-4902-9e4e-dfc2face240d.somaticSV.vcf
├── f06987bc-aca3-4f7e-8e77-6e2cf9ce28a0.STAR.fusion_predictions.abridged.coding_effect.tsv
└── f6bc4b78-280b-420a-8af4-1a6d85304e37.png
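Because each Kids First file is downloaded individually, it is easy to miss one. As a rough check, you could count the files in the directory and compare the result with the 14 entries in the listing above. This is only a sketch: `count_files` is a hypothetical helper name, and it assumes you saved the downloads into a directory named `KF_data` in the current working directory.

```shell
# Hypothetical helper: count regular files directly inside a directory.
# Subdirectories are not descended into; hidden files are counted too.
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Assumption: downloads were saved to ./KF_data.
if [ -d KF_data ]; then
    echo "KF_data contains $(count_files KF_data) file(s); the listing above shows 14"
fi
```

A matching count does not prove the file names are correct, so compare the names against the listing if anything looks off.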