zz2liu/TCGA-RNASeq-tutorial

TCGA-RNASeq-tutorial

The tutorial for a Yale training session: TCGA RNA-seq data, download and analyses, all on your laptop.

See the slides used during the workshop here.

  • Navigate and select files to basket
  • Download metadata and manifest from basket
  • Download the files with the GDC Data Transfer Tool (gdc-client)
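
The download step can be sketched as a small shell function. This is only an illustration, not the workshop's own script: the function name `download_from_manifest` and the file and folder names in the usage line are placeholders, and it assumes the manifest is the tab-separated file with one header line that the GDC portal exports.

```shell
# Sketch: download every file listed in a GDC manifest.
# download_from_manifest is a hypothetical helper name.
download_from_manifest() {
  manifest=$1
  outdir=${2:-.}
  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi
  # Assumption: one header line, then one line per file.
  n=$(tail -n +2 "$manifest" | wc -l | tr -d ' ')
  echo "manifest lists $n files"
  if ! command -v gdc-client >/dev/null 2>&1; then
    echo "gdc-client is not on PATH" >&2
    return 2
  fi
  mkdir -p "$outdir"
  # -m: manifest file; -d: directory to save the files in
  gdc-client download -m "$manifest" -d "$outdir"
}

# Usage (placeholder names), once the manifest is in the current folder:
# download_from_manifest gdc_manifest.txt downloads
```

Each downloaded file lands in its own directory named after the file ID, which is the layout the preprocessing step below relies on.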

Preprocess the metadata

.. Note: To run the R script, you can install RStudio.

Preprocess the FPKM matrix

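  • Convert the downloaded files to an FPKM matrix in a Unix shell/terminal
# cd to the folder with one directory per downloaded file, each holding a .gz file.
for f in */*.gz; do
  id=$(dirname "$f"); echo "$id" > "$id.tmp"  # column name: the file ID
  gunzip -c "$f" | cut -f2 >> "$id.tmp"       # column values: FPKM (gunzip -c behaves like zcat)
done
echo 'featureId' > tmp.index
gunzip -c "$f" | cut -f1 >> tmp.index  # row names, taken from the last file read
# paste assumes every file lists the same genes in the same order
paste tmp.index *.tmp > ../geneId_fileId_FPKM.txt
rm tmp.index *.tmp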
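
The commands above glue columns together line by line, so they only produce a valid matrix if every file lists the same feature IDs in the same order. A minimal sketch of a check you could run first is shown here; the function name `check_same_features` is made up for illustration, and it assumes the same folder layout as above.

```shell
# Sketch: confirm every .gz file lists the same feature IDs in the same order.
# Run from the folder that holds one directory per downloaded file.
check_same_features() {
  ref=""
  for f in */*.gz; do
    [ -e "$f" ] || { echo "no .gz files found" >&2; return 2; }
    # Checksum of the first column (the feature IDs) of this file
    sum=$(gunzip -c "$f" | cut -f1 | cksum)
    if [ -z "$ref" ]; then
      ref=$sum                      # first file becomes the reference
    elif [ "$sum" != "$ref" ]; then
      echo "feature IDs differ in $f" >&2
      return 1
    fi
  done
  echo "all files share the same feature IDs"
}

# Usage:
# check_same_features && echo "safe to paste the columns together"
```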

.. Note: To use a Linux-style shell, run Terminal on a Mac (OS X); on a PC (Windows), install and run Babun.

Introduction to analyses in R

Use the script to:

  • Filter the genes and convert FPKM to log scale
  • Identify genes coexpressed with your gene of interest
  • Identify genes differentially expressed between paired normal and tumor samples
  • Draw a PCA plot
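
The tutorial performs these steps in its R script, which is not reproduced here. As a rough shell-only illustration of the first step, the sketch below drops genes with a low mean FPKM and writes log2(FPKM + 1) for the rest. The function name `filter_and_log`, the default threshold of 1, and the +1 pseudocount are all assumptions made for this example, not values taken from the script.

```shell
# Sketch: keep genes whose mean FPKM is at least min_mean, then log-transform.
# $1: tab-separated matrix with a header row; $2: min_mean (assumed default 1).
filter_and_log() {
  awk -F '\t' -v OFS='\t' -v min_mean="${2:-1}" '
    NR == 1 { print; next }                  # keep the header row as-is
    {
      sum = 0
      for (i = 2; i <= NF; i++) sum += $i
      if (NF < 2 || sum / (NF - 1) < min_mean) next   # drop low-expression genes
      for (i = 2; i <= NF; i++) $i = sprintf("%.4f", log($i + 1) / log(2))
      print
    }' "$1"
}

# Usage (the output file name is a placeholder):
# filter_and_log ../geneId_fileId_FPKM.txt 1 > ../geneId_fileId_log2FPKM.txt
```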

Introduction to the analyses by Firehose

  • Gene
  • Cohort summary
  • Cohort data and workflow
  • Cohort analysis

FAQs

Quick fix to get the miRNA FPM matrix for Queen Okoro

  • Convert the downloaded files to an FPM matrix in a Unix shell/terminal
# cd to the folder with all the txt files under each sample directory.
for f in */*.txt; do
  id=$(dirname "$f"); echo "$id" > "$id.tmp"  # column name to be
  cut -f3 "$f" >> "$id.tmp"                   # cell values to be
done
echo 'featureId' > tmp.index
cut -f1 "$f" >> tmp.index  # row names to be, taken from the last file read
paste tmp.index *.tmp > ../featureId_fileId_FPKM.txt
rm tmp.index *.tmp
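
Because `paste` pads shorter inputs with empty fields, a file with missing rows shows up as empty cells rather than as an error. The following sketch reports the shape of the resulting matrix and fails if any cell is empty; the function name `matrix_shape` is hypothetical.

```shell
# Sketch: print rows x columns of a tab-separated matrix and count empty cells.
# Returns non-zero if any cell is empty.
matrix_shape() {
  awk -F '\t' '
    { for (i = 1; i <= NF; i++) if ($i == "") empty++ }
    NR == 1 { ncol = NF }                    # column count from the header row
    END {
      printf "%d rows x %d columns, %d empty cells\n", NR, ncol, empty
      exit (empty > 0)
    }' "$1"
}

# Usage:
# matrix_shape ../featureId_fileId_FPKM.txt
```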
