---
title: Bioinformatic analysis of complex, high-throughput genomic and epigenomic data in the context of $\mathsf{CD4}^{+}$ T-cell differentiation and diagnosis and treatment of transplant rejection
author: |
  Ryan C. Thompson \
  Su Lab \
  The Scripps Research Institute
date: October 24, 2019
theme: Boadilla
aspectratio: 169
fontsize: 14pt
---

## Organ transplants are a life-saving treatment

::: incremental

* 36,528 transplants performed in the USA in 2018[^organdonor]
* 100 transplants every day!
* Over 113,000 people on the national transplant waiting list as of
  July 2019

:::

[^organdonor]: [organdonor.gov](https://www.organdonor.gov/statistics-stories/statistics.html)

## Organ donation statistics for the USA in 2018[^organdonor]

\centering

![](graphics/presentation/transplants-organ-CROP.pdf)

## Types of grafts

A graft is categorized based on the relationship between donor and recipient:

. . .

::: incremental

* **Autograft:** Donor and recipient are the *same individual*
* **Allograft:** Donor and recipient are *different individuals* of
  the *same species*
* **Xenograft:** Donor and recipient are *different species*

:::

## Recipient T-cells reject allogeneic MHCs

:::::::::: {.columns}
::: {.column width="55%"}

:::: incremental

<!-- Vertical alignment hacks -->
\rule{\linewidth}{0pt}
\vspace*{12pt}

* TCR binds to both antigen *and* MHC surface \vspace{10pt}
* HLA genes encoding MHC proteins are highly polymorphic \vspace{10pt}
* Variants in donor MHC can trigger the same T-cell response as a
  foreign antigen

\vspace*{12pt}
\rule{\linewidth}{0pt}

::::

:::
::: {.column width="40%"}

<!-- ![\footnotesize Janeway's Immunobiology (2012), Fig. 9.19](graphics/presentation/janeway-fig9.19-TCR.png){ height=70% } -->
![TCR binding to self (right) and allogeneic (left) MHC\footnotemark](graphics/presentation/tcr_mhc.jpg){ height=70% }

:::
::::::::::

\footnotetext[3]{\href{https://doi.org/10.1016/j.cell.2007.01.048}{Colf, Bankovich, et al. "How a Single T Cell Receptor Recognizes Both Self and Foreign MHC". In: Cell (2007)}}

## Allograft rejection is a major long-term problem

![Kidney allograft survival rates in children by transplant year[^kim-marks]](graphics/presentation/kidney-graft-survival.png){ height=65% }

[^kim-marks]: [Kim & Marks. "Long-term outcomes of children after solid organ transplantation". In: Clinics (2014)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3884158/?report=classic)

## Rejection is treated with immune suppressive drugs

<!-- TODO: Need a graphic, or maybe a table of common drugs +
mechanisms, or a diagram for periodic checking. -->

::: incremental

* Graft recipient must take immune suppressive drugs indefinitely
* Graft is monitored for rejection and dosage adjusted over time
* Immune suppression is a delicate balance: too much and too little
  are both problematic.

:::

## My thesis topics

<!-- TODO: Needs revision -->

### Topic 1: Immune memory

Genome-wide epigenetic analysis of H3K4 and H3K27 methylation in naïve
and memory $\mathsf{CD4}^{+}$ T-cell activation

### Topic 2: Diagnostics for rejection

Improving array-based diagnostics for transplant rejection by
optimizing data preprocessing

### Topic 3: Blood profiling during treatment

Globin-blocking for more effective blood RNA-seq analysis in primate
animal model for experimental graft rejection treatment

## Today's focus

### \Large Topic 1: Immune memory

\Large

Genome-wide epigenetic analysis of H3K4 and H3K27 methylation in naïve
and memory $\mathsf{CD4}^{+}$ T-cell activation

## Memory cells: faster, stronger, and more independent

![Naïve T-cell activated by APC](graphics/presentation/T-cells-A-SVG.png)

## Memory cells: faster, stronger, and more independent

![Naïve T-cell differentiates and proliferates into effector T-cells](graphics/presentation/T-cells-B-SVG.png)

## Memory cells: faster, stronger, and more independent

![Post-infection, some effector cells remain as memory cells](graphics/presentation/T-cells-C-SVG.png)

## Memory cells: faster, stronger, and more independent

![Memory T-cells respond more strongly to activation](graphics/presentation/T-cells-D-SVG.png)

::: notes

Compared to naïve cells, memory cells:

* respond to a lower antigen concentration
* respond more strongly at any given antigen concentration
* require less co-stimulation
* are somewhat independent of some types of co-stimulation required by
  naïve cells
* evolve over time to respond even more strongly to their antigen

Result:

\normalsize

* Memory cells require progressively higher doses of immune suppressive
  drugs
* Dosage cannot be increased indefinitely without compromising the
  immune system's ability to fight infection

:::

## We need a better understanding of immune memory

* Cell surface markers are fairly well characterized
* But internal mechanisms are poorly understood

. . .

\vfill

\large

**Hypothesis:** Epigenetic regulation of gene expression through
histone modification is involved in $\mathsf{CD4}^{+}$ T-cell
activation and memory.

## Experimental design

* Separately isolate naïve and memory $\mathsf{CD4}^{+}$ T-cells from
  4 donors
* Activate with CD3/CD28 beads
* Take samples at 4 time points: Day 0 (pre-activation), Day 1 (early
  activation), Day 5 (peak activation), and Day 14 (post-activation)
* Perform RNA-seq and ChIP-seq for 3 histone marks (H3K4me2, H3K4me3,
  and H3K27me3) on each sample

Data generated by Sarah LaMere, published in GEO as
[GSE73214](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE73214)

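::: notes

The design above can be enumerated as a full factorial layout. This is only a sketch: the labels (`naive`, `memory`, donor numbers 1 to 4) are placeholders, and it assumes every combination was actually sequenced, which the slides do not state.

```python
from itertools import product

# Placeholder labels; only the counts come from the design described above.
cell_types = ["naive", "memory"]
donors = [1, 2, 3, 4]
days = [0, 1, 5, 14]
assays = ["RNA-seq", "H3K4me2", "H3K4me3", "H3K27me3"]

# One biological sample per (cell type, donor, time point) combination.
samples = [
    {"cell_type": c, "donor": d, "day": t}
    for c, d, t in product(cell_types, donors, days)
]

# One sequencing library per sample per assay (assuming none were dropped).
libraries = [dict(s, assay=a) for s, a in product(samples, assays)]

print(len(samples))    # 32
print(len(libraries))  # 128
```

Under these assumptions the design yields 32 samples and 128 libraries; the real data set may contain fewer if any libraries failed quality control.

:::
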
## Time points capture phases of immune response

\centering

![](graphics/presentation/immune-response.png)<!-- { height=75% } -->

## Why study these histone marks?

::: incremental

* **H3K4me3:** "activating" mark associated with active transcription
* **H3K4me2:** Correlated with H3K4me3, hypothesized to mark a "poised" state
* **H3K27me3:** "repressive" mark associated with inactive genes
* All 3 involved in T-cell differentiation, but activation dynamics
  unexplored

:::

## ChIP-seq sequences DNA bound to marked histones[^chipseq]

\centering

![](graphics/presentation/NRG-chipseq.png){ height=70% }

[^chipseq]: [Furey. "ChIP-seq and beyond: New and improved methodologies to detect and characterize protein-DNA interactions". In: Nature Reviews Genetics (2012)](http://www.nature.com/articles/nrg3306)

## A few intermediate analysis steps are required

![Flowchart of workflow for data analysis](graphics/CD4-csaw/rulegraphs/rulegraph-all-RASTER100.png)

## Histone modifications occur on consecutive histones

![ChIP-seq coverage in IL2 gene[^lamerethesis]](graphics/presentation/LaMere-thesis-fig3.9-SVG-CROP.png){ height=65% }

[^lamerethesis]: Sarah LaMere. "Dynamic epigenetic regulation of CD4 T cell activation and memory formation". PhD thesis. TSRI, 2015.

## Histone modifications occur on consecutive histones

![Strand cross-correlation plots](graphics/presentation/CCF-plots-A-SVG.png)

## Histone modifications occur on consecutive histones

![Strand cross-correlation plots](graphics/presentation/CCF-plots-B-SVG.png)

## Histone modifications occur on consecutive histones

![Strand cross-correlation plots](graphics/presentation/CCF-plots-C-SVG.png)

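::: notes

For readers unfamiliar with strand cross-correlation, the sketch below shows one simple way such a profile could be computed: count read 5' ends per position on each strand, then correlate the forward-strand counts with the reverse-strand counts at a range of shifts. It is a toy illustration in plain Python, not the pipeline used for these plots; the function names and the toy read positions are assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def strand_cross_correlation(fwd_starts, rev_starts, length, max_shift):
    """Correlate forward-strand counts with reverse-strand counts at each shift."""
    fwd = [0] * length
    rev = [0] * length
    for p in fwd_starts:
        if 0 <= p < length:
            fwd[p] += 1
    for p in rev_starts:
        if 0 <= p < length:
            rev[p] += 1
    # At shift d, forward position i is compared with reverse position i + d.
    return {d: pearson(fwd[:length - d], rev[d:]) for d in range(max_shift + 1)}

# Toy data: every fragment is exactly 150 bp, so each reverse read sits
# 150 bp downstream of its forward mate.
fwd_starts = [100, 400, 400, 900, 1300]
rev_starts = [p + 150 for p in fwd_starts]

ccf = strand_cross_correlation(fwd_starts, rev_starts, length=2000, max_shift=300)
best_shift = max(ccf, key=ccf.get)
print(best_shift)  # 150
```

In this toy case the profile peaks at the fragment length. Real data are noisier, and the shape of the profile around and beyond that peak is what the plots on these slides display.

:::
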
## SICER identifies enriched regions across the genome

![Finding "islands" of coverage with SICER[^sicer]](graphics/presentation/SICER-fig1-SVG.png)

[^sicer]: [Zang et al. "A clustering approach for identification of enriched domains from histone modification ChIP-Seq data". In: Bioinformatics 25.15 (2009)](https://doi.org/10.1093/bioinformatics/btp340)

## IDR identifies *reproducible* enriched regions

![Example irreproducible discovery rate[^idr] score consistency plot](graphics/presentation/IDR-example-CROP-RASTER.png){ height=65% }

[^idr]: [Li et al. "Measuring reproducibility of high-throughput experiments". In: AOAS (2011)](https://doi.org/10.1214/11-AOAS466)

## Finding enriched regions across the genome

![Peak-calling summary statistics](graphics/presentation/RCT-thesis-table2.2-SVG-CROP.png)

## Each histone mark has an "effective promoter radius"

![Enrichment of peaks near promoters](graphics/CD4-csaw/Promoter-Peak-Distance-Profile-PAGE1-CROP.pdf)

## Peaks in promoters correlate with gene expression

![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-A-SVG.png)

## Peaks in promoters correlate with gene expression

![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-B-SVG.png)

## Peaks in promoters correlate with gene expression

![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-C-SVG.png)

## Peaks in promoters correlate with gene expression

![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-D-SVG.png)

## Peaks in promoters correlate with gene expression

![Expression distributions of genes with and without promoter peaks](graphics/presentation/FPKM-by-Peak-Violin-Plots-Z-SVG.png)

## The story so far

<!-- TODO: Left column: text; right column: flip through relevant image -->

* H3K4me2, H3K4me3, and H3K27me3 occur on many consecutive histones in
  broad regions across the genome
* These enriched regions occur more commonly within a certain radius
  of gene promoters
* This "effective promoter radius" is consistent across all samples
  for a given histone mark, but differs between histone marks
* Presence or absence of a peak within this radius is correlated with
  gene expression

. . .

Next: Does the position of a histone modification within a gene
promoter matter to that gene's expression, or is it merely the
presence or absence anywhere within the promoter?

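::: notes

The two questions on this slide (is there a peak within the radius, and where is it relative to the TSS) can be made concrete with a small sketch. The function names, the coordinates, and the 1000 bp radius are all hypothetical; the slides do not give the actual radius for any mark.

```python
def nearest_peak_offset(tss, strand, peaks):
    """Signed distance from the TSS to the nearest peak edge.

    Positive means downstream of the TSS in the direction of transcription,
    negative means upstream, 0 means a peak overlaps the TSS.
    Returns None if there are no peaks.
    """
    best = None
    for start, end in peaks:
        if start <= tss <= end:
            offset = 0
        else:
            edge = start if tss < start else end
            offset = edge - tss
            if strand == "-":
                offset = -offset  # flip so that positive is always downstream
        if best is None or abs(offset) < abs(best):
            best = offset
    return best

def has_promoter_peak(tss, strand, peaks, radius):
    """True if any peak lies within `radius` bp of the TSS."""
    offset = nearest_peak_offset(tss, strand, peaks)
    return offset is not None and abs(offset) <= radius

# Hypothetical gene on the + strand with two peaks on the same chromosome.
peaks = [(10500, 11200), (4000, 5000)]
print(has_promoter_peak(10000, "+", peaks, radius=1000))  # True
print(nearest_peak_offset(10000, "+", peaks))             # 500
```

The first function answers the position question and the second answers the presence question, so the same peak calls can feed both analyses.

:::
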
## H3K4me2 promoter neighborhood K-means clusters

![Cluster means for H3K4me2](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-clusters-CROP.png)

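::: notes

As a reminder of what K-means does here, the sketch below clusters toy "promoter neighborhood" profiles with plain Lloyd iterations. It assumes each promoter is summarized as a vector of coverage values in bins ordered from upstream to downstream of the TSS; that representation, the toy numbers, and the simple first-k initialization are illustrative assumptions, not the settings used for the figure.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, iterations=20):
    """Lloyd's algorithm; the first k profiles are used as initial centers."""
    centers = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in profiles
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy profiles: 4 bins from upstream to downstream of the TSS.
profiles = [
    [5, 4, 1, 0],  # coverage mostly upstream
    [0, 1, 4, 5],  # coverage mostly downstream
    [6, 5, 0, 1],  # upstream
    [1, 0, 5, 6],  # downstream
]
labels, centers = kmeans(profiles, k=2)
print(labels)   # [0, 1, 0, 1]
print(centers)  # [[5.5, 4.5, 0.5, 0.5], [0.5, 0.5, 4.5, 5.5]]
```

The cluster centers play the same role as the "cluster means" in the figure: each one is the average coverage profile of the promoters assigned to that cluster.

:::
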
## H3K4me2 promoter neighborhood cluster PCA

:::::::::: {.columns}
::: {.column width="50%"}

![Cluster means for H3K4me2](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-clusters-CROP.png)

:::
::: {.column width="50%"}

![PCA plot of promoters](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-PCA-CROP.png)

:::
::::::::::

## H3K4me2 promoter neighborhood cluster expression

:::::::::: {.columns}
::: {.column width="50%"}

![Cluster means for H3K4me2](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-clusters-CROP.png)

:::
::: {.column width="50%"}

![Cluster expression distributions](graphics/CD4-csaw/ChIP-seq/H3K4me2-neighborhood-expression-CROP-ROT90.png)

:::
::::::::::

## H3K4me3 promoter neighborhood cluster PCA

:::::::::: {.columns}
::: {.column width="50%"}

![Cluster means for H3K4me3](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-clusters-CROP.png)

:::
::: {.column width="50%"}

![PCA plot of promoters](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-PCA-CROP.png)

:::
::::::::::

## H3K4me3 promoter neighborhood cluster expression

:::::::::: {.columns}
::: {.column width="50%"}

![Cluster means for H3K4me3](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-clusters-CROP.png)

:::
::: {.column width="50%"}

![Cluster expression distributions](graphics/CD4-csaw/ChIP-seq/H3K4me3-neighborhood-expression-CROP-ROT90.png)

:::
::::::::::

## H3K27me3 promoter neighborhood cluster PCA

:::::::::: {.columns}
::: {.column width="50%"}

![Cluster means for H3K27me3](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-clusters-CROP.png)

:::
::: {.column width="50%"}

![PCA plot of promoters](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-PCA-CROP.png)

:::
::::::::::

## H3K27me3 promoter neighborhood cluster expression

:::::::::: {.columns}
::: {.column width="50%"}

![Cluster means for H3K27me3](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-clusters-CROP.png)

:::
::: {.column width="50%"}

![Cluster expression distributions](graphics/CD4-csaw/ChIP-seq/H3K27me3-neighborhood-expression-CROP-ROT90.png)

:::
::::::::::

## What have we learned?

### H3K4me2 & H3K4me3

* Peak closer to promoter $\Rightarrow$ gene more likely to be highly
  expressed
* Slightly asymmetric in favor of peaks downstream of TSS

. . .

### H3K27me3

* Depletion of H3K27me3 at TSS associated with elevated gene
  expression
* Enrichment of H3K27me3 upstream of TSS even more strongly associated
  with elevated expression
* Other coverage profiles not associated with elevated expression

## Differential modification disappears by Day 14

![Differential modification between naïve and memory samples at each time point](graphics/presentation/RCT-thesis-table2.4-A-SVG-CROP.png)

## Differential modification disappears by Day 14

![Differential modification between naïve and memory samples at each time point](graphics/presentation/RCT-thesis-table2.4-B-SVG-CROP.png)

## Convergence at Day 14 H3K4me2

![(Insert figure legend)](graphics/CD4-csaw/ChIP-seq/H3K4me2-promoter-PCA-group-CROP.png)

## Convergence at Day 14 H3K4me3

![(Insert figure legend)](graphics/CD4-csaw/ChIP-seq/H3K4me3-promoter-PCA-group-CROP.png)

## Convergence at Day 14 H3K27me3

![(Insert figure legend)](graphics/CD4-csaw/ChIP-seq/H3K27me3-promoter-PCA-group-CROP.png)

## Convergence at Day 14 RNA-seq (PC 2 & 3)

![(Insert figure legend)](graphics/CD4-csaw/RNA-seq/PCA-final-23-CROP.png)

## MOFA identifies shared variation across all 4 data sets

![(Insert figure legend)](graphics/presentation/MOFA-varExplained-matrix-A-CROP.png)

## MOFA identifies shared variation across all 4 data sets

![(Insert figure legend)](graphics/presentation/MOFA-varExplained-matrix-B-CROP.png)

## MOFA shared variation captures convergence pattern

![(Insert figure legend)](graphics/CD4-csaw/MOFA-LF-scatter-small.png)

## What have we learned?

* Almost no differential modification observed between naïve and
  memory at Day 14, despite plenty of differential modification at
  earlier time points.
* The RNA-seq data and the ChIP-seq data for all 3 histone marks show
  "convergence" between naïve and memory by Day 14 in the first 2 or 3
  principal coordinates.
* MOFA captures this convergence pattern in one of the latent factors,
  indicating that this is a shared pattern across all 4 data sets.

<!-- ## Slide -->

<!-- ![(Insert figure legend)](graphics/CD4-csaw/LaMere2016_fig8.pdf) -->

## Takeaway 1: Each histone mark has an "effective promoter radius"

* H3K4me2, H3K4me3, and H3K27me3 ChIP-seq reads are enriched in broad
  regions across the genome, representing areas where the histone
  modification is present
* These enriched regions occur more commonly within a certain radius
  of gene promoters
* This "effective promoter radius" is specific to each histone mark
* Presence or absence of a peak within this radius is correlated with
  gene expression

## Takeaway 2: Peak position within the promoter is important

* H3K4me2 and H3K4me3 peaks are more strongly associated with elevated
  gene expression the closer they are to the TSS, with a slight bias
  toward downstream peaks.
* H3K27me3 depletion at the TSS and enrichment upstream are both
  associated with elevated expression, while other patterns are not.
* For all histone marks, the position of the modification within the
  promoter appears to be an important factor in its association with
  gene expression

## Takeaway 3: Expression & epigenetic state both converge at Day 14

* At Day 14, almost no differential modification observed between
  naïve and memory cells
* Naïve and memory converge visually in PCoA plots
* Convergence is a shared pattern of variation across all 3 histone
  marks and gene expression
* This is consistent with the hypothesis that the naïve cells have
  differentiated into a more memory-like phenotype by Day 14.