  1. ---
  2. title: Bioinformatic analysis of complex, high-throughput genomic and epigenomic data in the context of $\mathsf{CD4}^{+}$ T-cell differentiation and diagnosis and treatment of transplant rejection
  3. author: |
  4. Ryan C. Thompson \
  5. Su Lab \
  6. The Scripps Research Institute
  7. date: October 24, 2019
  8. theme: Boadilla
  9. aspectratio: 169
  10. fontsize: 14pt
  11. ---
  12. ## Organ transplants are a life-saving treatment
  13. ::: incremental
  14. * 36,528 transplants performed in the USA in 2018[^organdonor]
  15. * 100 transplants every day!
  16. * Over 113,000 people on the national transplant waiting list as of
  17. July 2019
  18. :::
  19. [^organdonor]: [organdonor.gov](https://www.organdonor.gov/statistics-stories/statistics.html)
  20. ## Organ donation statistics for the USA in 2018[^organdonor]
  21. \centering
  22. ![](graphics/presentation/transplants-organ-CROP.pdf)
  23. ## Graft rejection is an adaptive immune response
  24. <!-- TODO: Need a graphic for this -->
  25. ::: incremental
  26. * The host's adaptive immune system identifies and attacks cells
  27. bearing non-self antigens
  28. * An allograft contains different genetic variants from the host,
  29. resulting in protein-coding differences
  30. * Left unchecked, the host immune system eventually notices these
  31. alloantigens and begins attacking (rejecting) the graft
  32. * Rejection is the major long-term threat to organ allografts
  33. :::
  34. ## Allograft rejection remains a major long-term problem
  35. ![Kidney allograft survival rates in children by transplant year[^kim-marks]](graphics/presentation/kidney-graft-survival.png){ height=65% }
  36. [^kim-marks]: [Kim & Marks. "Long-term outcomes of children after solid organ transplantation". In: Clinics (2014)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3884158/?report=classic)
  37. ## Rejection is treated with immune suppressive drugs
  38. <!-- TODO: Need a graphic, or maybe a table of common drugs +
  39. mechanisms, or a diagram for periodic checking. -->
  40. ::: incremental
  41. * To prevent rejection, a graft recipient must take immune suppressive
  42. drugs for the rest of their life
  43. * The graft is periodically checked for signs of rejection, and immune
  44. suppression dosage is adjusted accordingly
  45. * Immune suppression is a delicate balance: too much leads to immune
  46. compromise; too little leads to rejection.
  47. :::
  48. ## My thesis topics
  49. ### Topic 1: Immune memory
  50. Genome-wide epigenetic analysis of H3K4 and H3K27 methylation in naïve
  51. and memory $\mathsf{CD4}^{+}$ T-cell activation
  52. ### Topic 2: Diagnostics for rejection
  53. Improving array-based diagnostics for transplant rejection by
  54. optimizing data preprocessing
  55. ### Topic 3: Blood profiling during treatment
  56. Globin-blocking for more effective blood RNA-seq analysis in a primate
  57. animal model for experimental graft rejection treatment
  58. ## Today's focus
  59. ### \Large Topic 1: Immune memory
  60. \Large
  61. Genome-wide epigenetic analysis of H3K4 and H3K27 methylation in naïve
  62. and memory $\mathsf{CD4}^{+}$ T-cell activation
  63. ## Memory cells: faster, stronger, and more independent
  64. ![Naïve and memory T-cell responses to activation](graphics/presentation/T-cells-A-SVG.png)
  65. ## Memory cells: faster, stronger, and more independent
  66. ![Naïve and memory T-cell responses to activation](graphics/presentation/T-cells-B-SVG.png)
  67. ## Memory cells: faster, stronger, and more independent
  68. ![Naïve and memory T-cell responses to activation](graphics/presentation/T-cells-C-SVG.png)
  69. ## Memory cells: faster, stronger, and more independent
  70. ![Naïve and memory T-cell responses to activation](graphics/presentation/T-cells-D-SVG.png)
  71. ## Memory cells are a problem for immune suppression
  72. <!-- Need graphics? Or maybe just mark this slide as speaker notes for the previous one -->
  73. \large
  74. Compared to naïve cells, memory cells:
  75. \normalsize
  76. * respond to a lower antigen concentration
  77. * respond more strongly at any given antigen concentration
  78. * require less co-stimulation
  79. * are somewhat independent of some types of co-stimulation required by
  80. naïve cells
  81. * evolve over time to respond even more strongly to their antigen
  82. ## Memory cells are a problem for immune suppression
  83. \large
  84. Result:
  85. \normalsize
  86. * Memory cells require progressively higher doses of immune suppressive
  87. drugs
  88. * Dosage cannot be increased indefinitely without compromising the
  89. immune system's ability to fight infection
  90. ## We need a better understanding of immune memory
  91. * Cell surface markers of naïve and memory $\mathsf{CD4}^{+}$ T-cells
  92. are fairly well-characterized
  93. * But internal mechanisms that allow memory cells to respond
  94. differently to the same stimulus (antigen presentation) are not
  95. well-understood
  96. . . .
  97. * A reasonable hypothesis is that some of these mechanisms are
  98. epigenetic: using histone marks or DNA methylation to regulate the
  99. expression of certain genes
  100. * We can test this hypothesis by measuring gene expression (using
  101. RNA-seq) and histone methylation (using ChIP-seq) in naïve and
  102. memory T-cells before and after activation
  103. ## Experimental design
  104. * Separately isolate naïve and memory $\mathsf{CD4}^{+}$ T-cells from
  105. 4 donors
  106. * Activate with CD3/CD28 beads
  107. * Take samples at 4 time points: Day 0 (pre-activation), Day 1 (early
  108. activation), Day 5 (peak activation), and Day 14 (post-activation)
  109. * Do RNA-seq + ChIP-seq for 3 histone marks (H3K4me2, H3K4me3, &
  110. H3K27me3) for each sample.
  111. Data generated by Sarah LaMere, published in GEO as
  112. [GSE73214](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE73214)
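
## Sketch: sample layout implied by the design

The design above can be written out as a grid to see how many samples and sequencing libraries it implies. This is only a sketch: the donor labels are hypothetical, and it assumes a complete design in which every cell type, donor, and time point combination was assayed for all 4 data types, which the slide does not state.

```python
from itertools import product

# Factors taken from the slide; the label strings themselves are illustrative.
cell_types = ["naive", "memory"]
donors = [f"donor{i}" for i in range(1, 5)]  # 4 donors (hypothetical IDs)
days = [0, 1, 5, 14]                         # sampling time points
assays = ["RNA-seq", "H3K4me2", "H3K4me3", "H3K27me3"]

# One biological sample per (cell type, donor, day) combination.
samples = list(product(cell_types, donors, days))

# One library per sample per assay, ASSUMING a complete design.
libraries = [(assay, *sample) for assay, sample in product(assays, samples)]

print(len(samples))    # 2 * 4 * 4 = 32
print(len(libraries))  # 32 * 4 = 128
```

Any sample that failed quality control or was never sequenced would reduce these counts.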
  113. ## A few intermediate analysis steps are required
  114. ![Flowchart of workflow for data analysis](graphics/CD4-csaw/rulegraphs/rulegraph-all-RASTER100.png)
  115. ## Histone modifications occur on consecutive histones
  116. ![ChIP-seq coverage in IL2 gene[^lamerethesis]](graphics/presentation/LaMere-thesis-fig3.9-SVG-CROP.png){ height=65% }
  117. [^lamerethesis]: Sarah LaMere. "Dynamic epigenetic regulation of CD4 T cell activation and memory formation". PhD thesis. TSRI, 2015.
  118. ## Histone modifications occur on consecutive histones
  119. ![Strand cross-correlation plots](graphics/presentation/CCF-plots-A-SVG.png)
  120. ## Histone modifications occur on consecutive histones
  121. ![Strand cross-correlation plots](graphics/presentation/CCF-plots-B-SVG.png)
  122. ## Histone modifications occur on consecutive histones
  123. ![Strand cross-correlation plots](graphics/presentation/CCF-plots-C-SVG.png)
  124. ## SICER identifies enriched regions across the genome
  125. ![Finding "islands" of coverage with SICER[^sicer]](graphics/presentation/SICER-fig1-SVG.png)
  126. [^sicer]: [Zang et al. “A clustering approach for identification of enriched domains from histone modification ChIP-Seq data”. In: Bioinformatics 25.15 (2009)](https://doi.org/10.1093/bioinformatics/btp340)
  127. ## IDR identifies *reproducible* enriched regions
  128. ![Example irreproducible discovery rate[^idr] score consistency plot](graphics/presentation/IDR-example-CROP-RASTER.png){ height=65% }
  129. [^idr]: [Li et al. “Measuring reproducibility of high-throughput experiments”. In: AOAS (2011)](https://doi.org/10.1214/11-AOAS466)
  130. ## Finding enriched regions across the genome
  131. ![Peak-calling summary statistics](graphics/presentation/RCT-thesis-table2.2-SVG-CROP.png)
  132. ## Each histone mark has an "effective promoter radius"
  133. ![Enrichment of peaks near promoters](graphics/CD4-csaw/Promoter-Peak-Distance-Profile-PAGE1-CROP.pdf)
  134. ## Peaks in promoters correlate with gene expression
  135. ![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-A-SVG.png)
  136. ## Peaks in promoters correlate with gene expression
  137. ![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-B-SVG.png)
  138. ## Peaks in promoters correlate with gene expression
  139. ![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-C-SVG.png)
  140. ## Peaks in promoters correlate with gene expression
  141. ![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-D-SVG.png)
  142. ## Peaks in promoters correlate with gene expression
  143. ![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-Z-SVG.png)
  144. ## The story so far
  145. <!-- TODO: Left column: text; right column: flip through relevant image -->
  146. * H3K4me2, H3K4me3, and H3K27me3 occur on many consecutive histones in
  147. broad regions across the genome
  148. * These enriched regions occur more commonly within a certain radius
  149. of gene promoters
  150. * This "effective promoter radius" is consistent across all samples
  151. for a given histone mark, but differs between histone marks
  152. * Presence or absence of a peak within this radius is correlated with
  153. gene expression
  154. . . .
  155. Next: Does the position of a histone modification within a gene
  156. promoter matter to that gene's expression, or is it merely the
  157. presence or absence anywhere within the promoter?
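
## Sketch: calling a promoter peak within a radius

The "peak within the effective promoter radius" idea from this summary could be sketched as follows. The function names, the use of peak midpoints, the single-chromosome coordinates, and the 1000 bp radius are all assumptions made for illustration; the slides do not give the actual radii or the implementation.

```python
from bisect import bisect_left

def nearest_distance(sorted_peaks, tss):
    """Distance in bp from a TSS to the closest peak midpoint."""
    i = bisect_left(sorted_peaks, tss)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = sorted_peaks[max(i - 1, 0):i + 1]
    return min(abs(p - tss) for p in candidates)

def has_promoter_peak(peak_midpoints, tss_by_gene, radius):
    """Flag each gene whose TSS lies within `radius` bp of any peak."""
    peaks = sorted(peak_midpoints)
    if not peaks:
        return {gene: False for gene in tss_by_gene}
    return {
        gene: nearest_distance(peaks, tss) <= radius
        for gene, tss in tss_by_gene.items()
    }

# Toy coordinates on one chromosome (made up, not from the data set).
peaks = [1200, 5400, 20500]
tss = {"geneA": 1000, "geneB": 9000, "geneC": 20000}
print(has_promoter_peak(peaks, tss, radius=1000))
# {'geneA': True, 'geneB': False, 'geneC': True}
```

The resulting true/false flag per gene is the kind of grouping that the violin plots compare against expression.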
  158. ## H3K4me2 promoter neighborhood K-means clusters
  159. ![Cluster means for H3K4me2](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-clusters-CROP.png)
  160. ## H3K4me2 promoter neighborhood cluster PCA
  161. ::::: {.columns}
  162. ::: {.column width="50%"}
  163. ![Cluster means for H3K4me2](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-clusters-CROP.png)
  164. :::
  165. ::: {.column width="50%"}
  166. ![PCA plot of promoters](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-PCA-CROP.png)
  167. :::
  168. :::::
  169. ## H3K4me2 promoter neighborhood cluster expression
  170. ::::: {.columns}
  171. ::: {.column width="50%"}
  172. ![Cluster means for H3K4me2](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-clusters-CROP.png)
  173. :::
  174. ::: {.column width="50%"}
  175. ![Cluster expression distributions](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-expression-CROP-ROT90.png)
  176. :::
  177. :::::
  178. ## H3K4me3 promoter neighborhood K-means clusters
  179. ![Cluster means for H3K4me3](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-clusters-CROP.png)
  180. ## H3K4me3 promoter neighborhood cluster PCA
  181. ::::: {.columns}
  182. ::: {.column width="50%"}
  183. ![Cluster means for H3K4me3](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-clusters-CROP.png)
  184. :::
  185. ::: {.column width="50%"}
  186. ![PCA plot of promoters](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-PCA-CROP.png)
  187. :::
  188. :::::
  189. ## H3K4me3 promoter neighborhood cluster expression
  190. ::::: {.columns}
  191. ::: {.column width="50%"}
  192. ![Cluster means for H3K4me3](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-clusters-CROP.png)
  193. :::
  194. ::: {.column width="50%"}
  195. ![Cluster expression distributions](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-expression-CROP-ROT90.png)
  196. :::
  197. :::::
  198. ## H3K27me3 promoter neighborhood K-means clusters
  199. ![Cluster means for H3K27me3](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-clusters-CROP.png)
  200. ## H3K27me3 promoter neighborhood cluster PCA
  201. ::::: {.columns}
  202. ::: {.column width="50%"}
  203. ![Cluster means for H3K27me3](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-clusters-CROP.png)
  204. :::
  205. ::: {.column width="50%"}
  206. ![PCA plot of promoters](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-PCA-CROP.png)
  207. :::
  208. :::::
  209. ## H3K27me3 promoter neighborhood cluster expression
  210. ::::: {.columns}
  211. ::: {.column width="50%"}
  212. ![Cluster means for H3K27me3](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-clusters-CROP.png)
  213. :::
  214. ::: {.column width="50%"}
  215. ![Cluster expression distributions](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-expression-CROP-ROT90.png)
  216. :::
  217. :::::
  218. ## What have we learned?
  219. ### H3K4me2 & H3K4me3
  220. * Peak closer to promoter $\Rightarrow$ more likely gene is highly
  221. expressed
  222. * Slightly asymmetric in favor of peaks downstream of TSS
  223. . . .
  224. ### H3K27me3
  225. * Depletion of H3K27me3 at TSS associated with elevated gene
  226. expression
  227. * Enrichment of H3K27me3 upstream of TSS even more strongly associated
  228. with elevated expression
  229. * Other coverage profiles not associated with elevated expression
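
## Sketch: K-means on promoter neighborhood profiles

The clusters behind these conclusions group promoters by the shape of their coverage profile around the TSS. A minimal sketch of that idea with plain Lloyd's algorithm is shown here; the 5-bin profiles, the choice of `k = 2`, and the seeding from the first `k` profiles are simplifying assumptions, not the settings used in the analysis.

```python
def kmeans(profiles, k, n_iter=20):
    """Plain Lloyd's algorithm; the first k profiles seed the centroids."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum(
                    (x - m) ** 2 for x, m in zip(p, centroids[c])
                ),
            )
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy coverage in 5 bins from upstream to downstream of the TSS (made up).
profiles = [
    [0, 1, 8, 2, 0],  # coverage centered on the TSS
    [7, 2, 0, 0, 0],  # coverage upstream of the TSS
    [0, 2, 9, 3, 0],
    [8, 1, 0, 0, 1],
]
labels, centroids = kmeans(profiles, k=2)
print(labels)  # [0, 1, 0, 1]
```

Each centroid corresponds to one "cluster mean" curve, and comparing expression between the labeled groups gives the per-cluster expression distributions.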
  230. ## Differential modification disappears by Day 14
  231. ![Differential modification between naïve and memory samples at each time point](graphics/presentation/RCT-thesis-table2.4-A-SVG-CROP.png)
  232. ## Differential modification disappears by Day 14
  233. ![Differential modification between naïve and memory samples at each time point](graphics/presentation/RCT-thesis-table2.4-B-SVG-CROP.png)
  234. ## Convergence at Day 14 H3K4me2
  235. ![(Insert figure legend)](graphics/CD4-csaw/ChIP-seq/H3K4me2-promoter-PCA-group-CROP.png)
  236. ## Convergence at Day 14 H3K4me3
  237. ![(Insert figure legend)](graphics/CD4-csaw/ChIP-seq/H3K4me3-promoter-PCA-group-CROP.png)
  238. ## Convergence at Day 14 H3K27me3
  239. ![(Insert figure legend)](graphics/CD4-csaw/ChIP-seq/H3K27me3-promoter-PCA-group-CROP.png)
  240. ## Convergence at Day 14 RNA-seq (PC 2 & 3)
  241. ![(Insert figure legend)](graphics/CD4-csaw/RNA-seq/PCA-final-23-CROP.png)
  242. ## MOFA identifies shared variation across all 4 data sets
  243. ![(Insert figure legend)](graphics/CD4-csaw/MOFA-varExplaiend-matrix-CROP.png)
  244. ## MOFA shared variation captures convergence pattern
  245. ![(Insert figure legend)](graphics/CD4-csaw/MOFA-LF-scatter-small.png)
  246. ## What have we learned?
  247. * Almost no differential modification observed between naïve and
  248. memory at Day 14, despite plenty of differential modification at
  249. earlier time points.
  250. * RNA-seq data and ChIP-seq data for all 3 histone marks show
  251. "convergence" between naïve and memory by Day 14 in the first 2 or 3
  252. principal coordinates.
  253. * MOFA captures this convergence pattern in one of the latent factors,
  254. indicating that this is a shared pattern across all 4 data sets.
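
## Sketch: quantifying convergence

One simple way to put a number on the visual convergence described above is the distance between the naïve and memory group centroids at each time point in the reduced coordinate space. This is a hypothetical illustration rather than the method used in the analysis, and the coordinates below are made-up toy values, not results.

```python
from math import dist
from statistics import fmean

def centroid(points):
    """Mean position of a group of samples, one value per axis."""
    return [fmean(axis) for axis in zip(*points)]

def group_separation(coords):
    """coords maps (cell_type, day) -> list of per-sample coordinates."""
    days = sorted({day for _, day in coords})
    return {
        day: dist(centroid(coords[("naive", day)]),
                  centroid(coords[("memory", day)]))
        for day in days
    }

# Made-up 2D coordinates for illustration only.
coords = {
    ("naive", 0): [(-4.0, 1.0), (-4.0, -1.0)],
    ("memory", 0): [(4.0, 1.0), (4.0, -1.0)],
    ("naive", 14): [(0.5, 3.0), (0.5, 5.0)],
    ("memory", 14): [(-0.5, 3.0), (-0.5, 5.0)],
}
print(group_separation(coords))  # {0: 8.0, 14: 1.0}
```

A separation that shrinks toward Day 14 would correspond to the convergence pattern seen in the plots.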
  255. <!-- ## Slide -->
  256. <!-- ![(Insert figure legend)](graphics/CD4-csaw/LaMere2016_fig8.pdf) -->
  257. ## Takeaway 1: Each histone mark has an "effective promoter radius"
  258. * H3K4me2, H3K4me3, and H3K27me3 ChIP-seq reads are enriched in broad
  259. regions across the genome, representing areas where the histone
  260. modification is present
  261. * These enriched regions occur more commonly within a certain radius
  262. of gene promoters
  263. * This "effective promoter radius" is specific to each histone mark
  264. * Presence or absence of a peak within this radius is correlated with
  265. gene expression
  266. ## Takeaway 2: Peak position within the promoter is important
  267. * H3K4me2 and H3K4me3 peaks are more strongly associated with elevated
  268. gene expression the closer they are to the TSS, with a slight bias
  269. toward downstream peaks.
  270. * H3K27me3 depletion at the TSS and enrichment upstream are both
  271. associated with elevated expression, while other patterns are not.
  272. * For all 3 histone marks, the position of the modification within the
  273. promoter appears to be an important factor in its association with gene
  274. expression
  275. ## Takeaway 3: Expression & epigenetic state both converge at Day 14
  276. * At Day 14, almost no differential modification observed between
  277. naïve and memory cells
  278. * Naïve and memory converge visually in PCoA plots
  279. * Convergence is a shared pattern of variation across all 3 histone
  280. marks and gene expression
  281. * This is consistent with the hypothesis that the naïve cells have
  282. differentiated into a more memory-like phenotype by day 14.