Add current draft of thesis to examples

Ryan C. Thompson 5 years ago
Parent
Commit
1d4a0eb2a8

+ 2 - 0
examples/Salomon/CD4/README.mkdn

@@ -1,3 +1,5 @@
+<!-- TODO: Update this -->
+
 This is a series of example plots and tables from a combined
 RNA-seq/ChIP-seq study on differences between naive and memory T-cell
 activation. You can view the (old and messy) code for these plots

+ 3 - 2
examples/Salomon/CD4/index.html

@@ -1,9 +1,10 @@
+<!-- TODO: Update this -->
 <p>This is a series of example plots and tables from a combined RNA-seq/ChIP-seq study on differences between naive and memory T-cell activation. You can view the (old and messy) code for these plots <a href="https://github.com/DarwinAwardWinner/cd4-histone-paper-code">here</a>.</p>
 <ul>
 <li><a href="p-value%20distributions.pdf"><code>p-value distributions.pdf</code></a> is a series of p-value histograms for each of the contrasts tested. A contrast with no significant differential expression would exhibit a uniform distribution, while differential expression would be reflected by an excess of small p-values.</li>
 <li><a href="FPKM%20by%20Peak%20Status%20H3K4.pdf"><code>FPKM by Peak Status H3K4.pdf</code></a> shows the variation in gene expression based on the presence or absence of two histone marks in the gene promoters.</li>
 <li><a href="promoter-edger-topgenes3-ql.xlsx"><code>promoter-edger-topgenes3-ql.xlsx</code></a> is a spreadsheet of all promoters with differential histone modification in their promoters based on the ChIP-seq read counts.</li>
-<li><a href="Promoter%20Peak%20Distance%20Profile.pdf"><code>Promoter Peak Distance Profile.pdf</code></a> shows the distribution of distances from transcription start sites to the nearest peak for the three histone modifications studied. This was used to determine the &quot;promoter radius&quot; for read counting. Notably, the three histone marks do not all have the same promoter radius.</li>
+<li><a href="Promoter%20Peak%20Distance%20Profile.pdf"><code>Promoter Peak Distance Profile.pdf</code></a> shows the distribution of distances from transcription start sites to the nearest peak for the three histone modifications studied. This was used to determine the “promoter radius” for read counting. Notably, the three histone marks do not all have the same promoter radius.</li>
 <li><a href="rnaseq-MDSPlots.pdf"><code>rnaseq-MDSPlots.pdf</code></a> shows a series of MDS plots (similar to PCA plots) before and after correction of a known batch effect. Note the implausible zigzag-shaped progression over time before correction, compared to the more plausible cyclic time progression after.</li>
 <li><a href="rnaseq-edgeR-vs-limma.pdf"><code>rnaseq-edgeR-vs-limma.pdf</code></a> and <a href="rnaseq-limma-weighted-vs-uw.pdf"><code>rnaseq-limma-weighted-vs-uw.pdf</code></a> show comparisons of p-values for all genes in each contrast of the RNA-seq data, comparing edgeR and limma-voom with/without sample quality weights. The final choice of method was limma-voom with sample quality weights.</li>
 <li><a href="rnaseq-maplots-limma-sampleweights.pdf"><code>rnaseq-maplots-limma-sampleweights.pdf</code></a> shows the MA plot for each contrast of the RNA-seq data</li>
@@ -11,7 +12,7 @@
 <p>There are also some plots from an in-progress analysis of the same data based on sliding windows, rather than just analyzing promoter regions. You can view the code for generating these plots <a href="https://github.com/DarwinAwardWinner/CD4-csaw">here</a>, and you can view some presentation slides based on this analysis <a href="./ChIP-Seq%20presentation.pdf">here</a>.</p>
 <ul>
 <li><a href="CCF-plots.pdf"><code>CCF-plots.pdf</code></a> shows the cross-correlation functions of the ChIP-Seq data for 3 different histone marks, at several different levels of smoothing. This plot is used to determine the fragment size. You can also observe from the periodic wave-like pattern, indicating that multiple adjacent histones tend to share the same histone modification.</li>
-<li><a href="CCF-plots-noBL.pdf"><code>CCF-plots-noBL.pdf</code></a> shows the same plots as above, but without removing reads in so-called &quot;blacklist&quot; regions that typically contain high-coverage artifact signals. The result is a much messier plot, with many samples having an artifactual peak at the read length (dotted line) rather than the actual width of a histone (solid line).</li>
+<li><a href="CCF-plots-noBL.pdf"><code>CCF-plots-noBL.pdf</code></a> shows the same plots as above, but without removing reads in so-called “blacklist” regions that typically contain high-coverage artifact signals. The result is a much messier plot, with many samples having an artifactual peak at the read length (dotted line) rather than the actual width of a histone (solid line).</li>
 <li><a href="site-profile-plots.pdf"><code>site-profile-plots.pdf</code></a> shows plots of the relative coverage depth profiles around local coverage maxima in the ChIP-Seq data. This plot is used to determine the footprint size of the protein being imunoprecipitated. Since this is histone mark data, the footprint size should match the size of a nucleosome, about 147 bp.</li>
 <li><a href="D4659vsD5053_idrplots.pdf"><code>D4659vsD5053_idrplots.pdf</code></a> shows an example plot from the <a href="https://sites.google.com/site/anshulkundaje/projects/idr">Irreproducible Discovery Rate</a> analysis used to identify biologically reproducible peaks in the ChIP-Seq data. The plot shows the degree of consistency in the scores for overlapping peaks in two biological replicates. Peaks with consistently high-ranking scores in both replicates are considered reproducible.</li>
 <li>The following reports show QC and exploratory analysis for 3 histone marks and RNA-seq: <a href="reports/ChIP-seq/H3K4me3-exploration.html">H3K4me3</a>, <a href="reports/ChIP-seq/H3K4me2-exploration.html">H3K4me2</a>, <a href="reports/ChIP-seq/H3K27me3-exploration.html">H3K27me3</a>, <a href="reports/RNA-seq/salmon_hg38.analysisSet_ensembl.85-exploration.html">RNA-seq</a>. The purpose of these reports is to ensure that the modelling assumptions and strategies are appropriate for the data. Sometimes several strategies are tested against each other, and the best performer is chosen for the subsequent differential expression/modification analysis.</li>

+ 22 - 7
examples/Salomon/README.mkdn

@@ -1,4 +1,16 @@
-## Sub-folders
+## My thesis
+
+- [My thesis](thesis-final.pdf): *Bioinformatic analysis of complex,
+  high-throughput genomic and epigenomic data in the context of CD4⁺
+  T-cell differentiation and diagnosis and treatment of transplant
+  rejection*
+- [Slides for my defense talk](defense-slides.pdf): The slides for
+  my dissertation defense, mainly covering Chapter 2 of my thesis
+
+## Interesting sub-folders
+
+Each folder showcases a different project or aspect of my work in the
+Salomon lab.
 
 - [`Teaching`](Teaching): Teaching materials for my lecture and lab on
   introductory RNA-seq analysis
@@ -17,14 +29,17 @@
 
 ## Presentations
 
-- [`DGE Presentation.pdf`](DGE Presentation.pdf): A presentation
-  comparing edgeR, DESeq, and limma on both a conceptual and practical
-  level
-- [`Advanced RNA-seq Analysis.pdf`](Advanced RNA-seq Analysis.pdf): A
-  presentation on the advanced features of limma for RNA-seq analysis
-- [`Reproducible Workflow Presentation.pdf`](Reproducible Workflow Presentation.pdf):
+- [`DGE Presentation.pdf`](DGE Presentation.pdf): comparing edgeR,
+  DESeq, and limma on both a conceptual and practical level
+- [`Advanced RNA-seq Analysis.pdf`](Advanced RNA-seq Analysis.pdf): On
+  the advanced features provided by limma for RNA-seq analysis
+- [`Reproducible Workflow Presentation.pdf`](Reproducible Workflow
+  Presentation.pdf): Showcases the use of
+  [Snakemake](https://snakemake.readthedocs.io/en/stable/) for
+  building a reproducible analysis
 
 ## Other files
+
 - [`Classifier Math Write-up.pdf`](Classifier Math Write-up.pdf): A
   short write-up to explain the mathematical principles behind the
   classifier method used in a machine learning project 

BIN
examples/Salomon/defense-slides.pdf


+ 10 - 4
examples/Salomon/index.html

@@ -1,4 +1,10 @@
-<h2 id="sub-folders">Sub-folders</h2>
+<h2 id="my-thesis">My thesis</h2>
+<ul>
+<li><a href="thesis-final.pdf">My thesis</a>: <em>Bioinformatic analysis of complex, high-throughput genomic and epigenomic data in the context of CD4⁺ T-cell differentiation and diagnosis and treatment of transplant rejection</em></li>
+<li><a href="defense-slides.pdf">Slides for my defense talk</a>: The slides for my dissertation defense, mainly covering Chapter 2 of my thesis</li>
+</ul>
+<h2 id="interesting-sub-folders">Interesting sub-folders</h2>
+<p>Each folder showcases a different project or aspect of my work in the Salomon lab.</p>
 <ul>
 <li><a href="Teaching"><code>Teaching</code></a>: Teaching materials for my lecture and lab on introductory RNA-seq analysis</li>
 <li><a href="CD4"><code>CD4</code></a>: Examples relating to my CD4 memory RNA-seq &amp; ChIP-seq project</li>
@@ -9,9 +15,9 @@
 </ul>
 <h2 id="presentations">Presentations</h2>
 <ul>
-<li><a href="DGE%20Presentation.pdf"><code>DGE Presentation.pdf</code></a>: A presentation comparing edgeR, DESeq, and limma on both a conceptual and practical level</li>
-<li><a href="Advanced%20RNA-seq%20Analysis.pdf"><code>Advanced RNA-seq Analysis.pdf</code></a>: A presentation on the advanced features of limma for RNA-seq analysis</li>
-<li><a href="Reproducible%20Workflow%20Presentation.pdf"><code>Reproducible Workflow Presentation.pdf</code></a>:</li>
+<li><a href="DGE%20Presentation.pdf"><code>DGE Presentation.pdf</code></a>: comparing edgeR, DESeq, and limma on both a conceptual and practical level</li>
+<li><a href="Advanced%20RNA-seq%20Analysis.pdf"><code>Advanced RNA-seq Analysis.pdf</code></a>: On the advanced features provided by limma for RNA-seq analysis</li>
+<li><a href="Reproducible%20Workflow%20Presentation.pdf"><code>Reproducible Workflow Presentation.pdf</code></a>: Showcases the use of <a href="https://snakemake.readthedocs.io/en/stable/">Snakemake</a> for building a reproducible analysis</li>
 </ul>
 <h2 id="other-files">Other files</h2>
 <ul>

BIN
examples/Salomon/thesis-final.pdf


+ 1 - 1
ryan_thompson_resume.lyx

@@ -610,7 +610,7 @@ literal "false"
 \begin_inset CommandInset href
 LatexCommand href
 name "Slides"
-target "https://darwinawardwinner.github.io/resume/examples/Salomon/CD4/ChIP-Seq%20presentation.pdf"
+target "https://darwinawardwinner.github.io/resume/examples/Salomon/defense-slides.pdf"
 literal "false"
 
 \end_inset