TCGA-RNASeq-tutorial

Tutorial for a Yale training session: TCGA RNA-seq data, download and analysis, all on your laptop.

See the slides used during the workshop here.

Go to TCGA data hub

Navigate the portal and add the files you need to the basket
Download the metadata and the manifest from the basket
Download the files with the GDC Data Transfer Tool (gdc-client)
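Before starting a long download, it can help to check what the manifest actually contains. The sketch below assumes the usual tab-separated manifest layout (id, filename, md5, size, state); the file IDs and sizes are made-up toy values, and gdc_manifest.txt is a placeholder name for your own manifest.

```shell
# Toy manifest standing in for the one downloaded from the basket (hypothetical IDs and sizes).
manifest=$(mktemp)
printf 'id\tfilename\tmd5\tsize\tstate\n' > "$manifest"
printf 'aaaa-1111\tsample1.FPKM.txt.gz\tmd5a\t500000\tvalidated\n' >> "$manifest"
printf 'bbbb-2222\tsample2.FPKM.txt.gz\tmd5b\t520000\tvalidated\n' >> "$manifest"

# Skip the header line, then count the files and add up the size column (column 4).
summary=$(awk -F'\t' 'NR > 1 { n++; bytes += $4 } END { printf "%d files, %d bytes", n, bytes }' "$manifest")
echo "$summary"

# With a real manifest, the download itself would then be started with:
#   gdc-client download -m gdc_manifest.txt
```

By default gdc-client creates one directory per file ID, which is the layout the matrix-building loop later in this tutorial relies on.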

Preprocess the metadata

Convert the metadata JSON to CSV using the online tool json-to-csv
Metadata description here
Choose and rename fields in a spreadsheet or an R script.

.. Note: To run the R script, you can install RStudio.
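If you would rather stay in the terminal, choosing and renaming fields can also be sketched with awk. This is only an illustration, not what preprocess_metadata.R does: the column names, the barcodes and the "old=new" pairs in keep are hypothetical, and the approach assumes a simple CSV with no commas inside quoted values.

```shell
# Toy metadata table (hypothetical column names and barcodes).
meta=$(mktemp)
cat > "$meta" <<'EOF'
file_id,file_name,cases.0.samples.0.submitter_id,data_type
aaaa-1111,sample1.FPKM.txt.gz,TCGA-XX-0001-01A,Gene Expression Quantification
bbbb-2222,sample2.FPKM.txt.gz,TCGA-XX-0001-11A,Gene Expression Quantification
EOF

# keep lists the wanted columns as old=new pairs, in output order.
selected=$(awk -F',' -v keep='file_id=fileId,cases.0.samples.0.submitter_id=barcode' '
  BEGIN {
    OFS = ","
    n = split(keep, pairs, ",")
    for (i = 1; i <= n; i++) { split(pairs[i], kv, "="); oldname[i] = kv[1]; newname[i] = kv[2] }
  }
  NR == 1 {
    # Remember the position of every column, then print the new header.
    for (c = 1; c <= NF; c++) col[$c] = c
    for (i = 1; i <= n; i++) printf "%s%s", newname[i], (i < n ? OFS : "\n")
    next
  }
  {
    # Print only the selected columns, looked up by their original names.
    for (i = 1; i <= n; i++) printf "%s%s", $(col[oldname[i]]), (i < n ? OFS : "\n")
  }
' "$meta")
echo "$selected"
```

For real metadata with quoted fields, a spreadsheet or the R script is the safer route.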

Preprocess the FPKM matrix

Convert the downloaded files to an FPKM matrix in a Unix shell/terminal

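# Run from the folder that holds one directory per downloaded file.
# gunzip -c is used instead of zcat because some macOS versions of zcat only accept .Z files.
for f in */*.gz; do
  id=$(dirname "$f")                      # the directory name is the file ID
  echo "$id" > "$id.tmp"                  # first line: the future column name
  gunzip -c "$f" | cut -f2 >> "$id.tmp"   # second column: the FPKM values
done
echo 'featureId' > tmp.index
# $f still holds the last file from the loop; every file is assumed to list the genes in the same order.
gunzip -c "$f" | cut -f1 >> tmp.index
paste tmp.index *.tmp > ../geneId_fileId_FPKM.txt
rm tmp.index *.tmp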

.. Note: To use a Linux-style shell, run Terminal on a Mac (OS X); on a PC (Windows), install and run babun.
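The paste step above only gives a correct matrix if every file lists its features in the same order. A minimal way to check that assumption before pasting is sketched below; the directory names, file names and gene IDs are toy values.

```shell
# Toy download folder: one directory per file ID, each with a gzipped two-column table.
work=$(mktemp -d)
cd "$work"
mkdir fileA fileB
printf 'ENSG01\t1.5\nENSG02\t0\n' | gzip > fileA/a.FPKM.txt.gz
printf 'ENSG01\t2.5\nENSG02\t7\n' | gzip > fileB/b.FPKM.txt.gz

ref=""
status="consistent"
for f in */*.gz; do
  gunzip -c "$f" | cut -f1 > ids.cur        # feature IDs of the current file
  if [ -z "$ref" ]; then
    mv ids.cur ids.ref                      # the first file becomes the reference
    ref="$f"
  elif ! cmp -s ids.ref ids.cur; then
    echo "feature order differs: $f vs $ref" >&2
    status="inconsistent"
  fi
done
rm -f ids.ref ids.cur
echo "$status"
```

If the check reports a difference, the files would need to be joined on the feature ID rather than pasted side by side.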

Description of the Barcode
Description of the pipeline
Download the GENCODE gene annotation file
Map the FPKM matrix to gene symbol and barcode with preprocess_count_matrix.R.
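To make the mapping step concrete, here is a rough shell sketch of replacing gene IDs with gene symbols. It is not a transcription of preprocess_count_matrix.R: the two-column lookup (gene_id,gene_name), the IDs and the symbols are hypothetical toy values, and the real annotation file may be laid out differently.

```shell
work=$(mktemp -d)
# Hypothetical lookup table and a toy FPKM matrix.
printf 'gene_id,gene_name\nENSG01.1,GENEA\nENSG02.3,GENEB\n' > "$work/geneInfo.csv"
printf 'featureId\tfileA\tfileB\nENSG01.1\t1.5\t2.5\nENSG02.3\t0\t7\nENSG99.9\t3\t4\n' > "$work/matrix.txt"

awk -F'\t' 'BEGIN { OFS = "\t" }
  FNR == NR { split($0, a, ","); sym[a[1]] = a[2]; next }  # first file: build ID -> symbol lookup
  FNR == 1  { print; next }                                # matrix header: keep as is
  { if ($1 in sym) $1 = sym[$1]; print }                   # swap the ID when a symbol is known
' "$work/geneInfo.csv" "$work/matrix.txt" > "$work/matrix_symbols.txt"

cat "$work/matrix_symbols.txt"
```

Rows without a match keep their original ID here, so nothing is silently dropped; whether to drop or keep them is a choice to make for your own analysis.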

Introduction of analyses in R

Use the script to:

Filter the genes and convert FPKM to log scale
Identify genes co-expressed with your gene of interest
Identify genes differentially expressed between paired normal and tumor samples
PCA plot
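The first step in the list, filtering and log conversion, can be illustrated without R. The sketch below is an assumption-laden example, not the method used in the R script: it keeps genes whose mean FPKM is at least min_mean (a hypothetical threshold of 1) and converts the values to log2(FPKM + 1). The gene names and values are toy data.

```shell
work=$(mktemp -d)
printf 'featureId\tfileA\tfileB\nGENEA\t1\t3\nGENEB\t0\t0\nGENEC\t7\t15\n' > "$work/fpkm.txt"

awk -F'\t' -v min_mean=1 'BEGIN { OFS = "\t" }
  NR == 1 { print; next }                              # keep the header unchanged
  {
    sum = 0
    for (c = 2; c <= NF; c++) sum += $c
    if (sum / (NF - 1) < min_mean) next                # drop genes with a low mean FPKM
    for (c = 2; c <= NF; c++) $c = sprintf("%.3f", log($c + 1) / log(2))  # log2(FPKM + 1)
    print
  }' "$work/fpkm.txt" > "$work/log2fpkm.txt"

cat "$work/log2fpkm.txt"
```

Adding 1 before taking the logarithm keeps zero FPKM values defined (they map to 0).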

Introduction of the analyses by FireHose

Gene
Cohort summary
Cohort data and workflow
Cohort analysis

FAQS

quick fix to get the miRNA FPM matrix for Queen Okoro

Convert the downloaded files to an FPM matrix in a Unix shell/terminal

# cd to the folder that holds one directory per sample, each containing a .txt file.
for f in */*.txt; do
  id=$(dirname "$f")
  echo "$id" > "$id.tmp"       # first line: the future column name
  cut -f3 "$f" >> "$id.tmp"    # third column: the future cell values
done
echo 'featureId' > tmp.index
cut -f1 "$f" >> tmp.index      # first column of the last file: the future row names
paste tmp.index *.tmp > ../featureId_fileId_FPKM.txt
rm tmp.index *.tmp
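After building a matrix this way, a quick shape check can catch truncated or ragged input files. The sketch below runs on a toy matrix with made-up values; with real data, point it at the file written by the paste step.

```shell
work=$(mktemp -d)
# Toy matrix in the same layout as the pasted output (values are made up).
printf 'featureId\tfileA\tfileB\nhsa-let-7a-1\t10.5\t12.1\nhsa-let-7a-2\t9.8\t11.0\n' > "$work/featureId_fileId_FPKM.txt"

shape=$(awk -F'\t' '
  NR == 1    { ncol = NF }   # the header defines the expected number of columns
  NF != ncol { bad++ }       # count rows with a different number of fields
  END { printf "%d rows x %d columns, %d ragged rows", NR, ncol, bad }
' "$work/featureId_fileId_FPKM.txt")
echo "$shape"
```

A non-zero count of ragged rows usually means one of the input files had a different number of lines than the others.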
