- ---
- title: Bioinformatic analysis of complex, high-throughput genomic and epigenomic data in the context of $\mathsf{CD4}^{+}$ T-cell differentiation and diagnosis and treatment of transplant rejection
- author: |
- Ryan C. Thompson \
- Su Lab \
- The Scripps Research Institute
- date: October 24, 2019
- theme: Boadilla
- aspectratio: 169
- fontsize: 14pt
- ---
- ## Organ transplants are a life-saving treatment
- ::: incremental
- * 36,528 transplants performed in the USA in 2018[^organdonor]
- * 100 transplants every day!
- * Over 113,000 people on the national transplant waiting list as of
- July 2019
- :::
- [^organdonor]: [organdonor.gov](https://www.organdonor.gov/statistics-stories/statistics.html)
- ## Organ donation statistics for the USA in 2018[^organdonor]
- \centering
- 
- ## Types of grafts
- A graft is categorized based on the relationship between donor and recipient:
- . . .
- ::: incremental
- * **Autograft:** Donor and recipient are the *same individual*
- * **Allograft:** Donor and recipient are *different individuals* of
- the *same species*
- * **Xenograft:** Donor and recipient are *different species*
- :::
- ## Recipient T-cells reject allogeneic MHCs
- :::::::::: {.columns}
- ::: {.column width="55%"}
- :::: incremental
- * TCR binds to both antigen *and* MHC surface \vspace{10pt}
- * HLA genes encoding MHC proteins are highly polymorphic \vspace{10pt}
- * Variants in donor MHC can trigger the same T-cell response as a
- foreign antigen
- ::::
- :::
- ::: {.column width="40%"}
- <!-- { height=70% } -->
- { height=70% }
- :::
- ::::::::::
- \footnotetext[3]{\href{https://doi.org/10.1016/j.cell.2007.01.048}{Colf, Bankovich, et al. "How a Single T Cell Receptor Recognizes Both Self and Foreign MHC". In: Cell (2007)}}
- ## Allograft rejection is a major long-term problem
- ![Kidney allograft survival rates in children by transplant year[^kim-marks]](graphics/presentation/kidney-graft-survival.png){ height=65% }
- [^kim-marks]: [Kim & Marks. "Long-term outcomes of children after solid organ transplantation". In: Clinics (2014)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3884158/?report=classic)
- ## Rejection is treated with immune suppressive drugs
- <!-- TODO: Need a graphic, or maybe a table of common drugs +
- mechanisms, or a diagram for periodic checking. -->
- ::: incremental
- * Graft recipient must take immune suppressive drugs indefinitely
- * Graft is monitored for rejection and dosage adjusted over time
- * Immune suppression is a delicate balance: too much and too little
- are both problematic.
- :::
- ## My thesis topics
- <!-- TODO: Needs revision -->
- ### Topic 1: Immune memory
- Genome-wide epigenetic analysis of H3K4 and H3K27 methylation in naïve
- and memory $\mathsf{CD4}^{+}$ T-cell activation
- ### Topic 2: Diagnostics for rejection
- Improving array-based diagnostics for transplant rejection by
- optimizing data preprocessing
- ### Topic 3: Blood profiling during treatment
- Globin-blocking for more effective blood RNA-seq analysis in primate
- animal model for experimental graft rejection treatment
- ## Today's focus
- ### \Large Topic 1: Immune memory
- \Large
- Genome-wide epigenetic analysis of H3K4 and H3K27 methylation in naïve
- and memory $\mathsf{CD4}^{+}$ T-cell activation
- ## Memory cells: faster, stronger, and more independent
- 
- ## Memory cells: faster, stronger, and more independent
- 
- ## Memory cells: faster, stronger, and more independent
- 
- ## Memory cells: faster, stronger, and more independent
- 
- ::: notes
- Compared to naïve cells, memory cells:
- * respond to a lower antigen concentration
- * respond more strongly at any given antigen concentration
- * require less co-stimulation
- * are somewhat independent of some types of co-stimulation required by
- naïve cells
- * evolve over time to respond even more strongly to their antigen
- Result:
- \normalsize
- * Memory cells require progressively higher doses of immune suppressive
- drugs
- * Dosage cannot be increased indefinitely without compromising the
- immune system's ability to fight infection
- :::
- ## We need a better understanding of immune memory
- * Cell surface markers fairly well-characterized
- * But internal mechanisms poorly understood
-
- . . .
- \vfill
- \large
- **Hypothesis:** Epigenetic regulation of gene expression through
- histone modification is involved in $\mathsf{CD4}^{+}$ T-cell
- activation and memory.
-
- ## Experimental design
- * Separately isolate naïve and memory $\mathsf{CD4}^{+}$ T-cells from
- 4 donors
- * Activate with CD3/CD28 beads
- * Take samples at 4 time points: Day 0 (pre-activation), Day 1 (early
- activation), Day 5 (peak activation), and Day 14 (post-activation)
- * Do RNA-seq + ChIP-seq for 3 histone marks (H3K4me2, H3K4me3, &
- H3K27me3) for each sample.
- Data generated by Sarah LaMere, published in GEO as
- [GSE73214](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE73214)
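As a quick check on the size of this design, the sketch below enumerates the sample grid described above. The label strings are illustrative, and the library count assumes every assay was run successfully on every sample, which the slide does not state.

```python
from itertools import product

# Factors as listed on the slide; the label strings are illustrative.
CELL_TYPES = ["naive", "memory"]
DONORS = [1, 2, 3, 4]
TIME_POINTS = ["D0", "D1", "D5", "D14"]
ASSAYS = ["RNA-seq", "H3K4me2", "H3K4me3", "H3K27me3"]


def sample_grid():
    """Return one (cell_type, donor, time_point) tuple per biological sample."""
    return list(product(CELL_TYPES, DONORS, TIME_POINTS))


samples = sample_grid()
# One library per assay per sample, assuming a complete design (an assumption).
libraries = [(assay, *sample) for assay in ASSAYS for sample in samples]
print(len(samples), "samples,", len(libraries), "libraries in a complete design")
```

Under these assumptions the grid has 2 x 4 x 4 samples and four assays per sample.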
- ## Time points capture phases of immune response
- \centering
- <!-- { height=75% } -->
- ## Why study these histone marks?
- ::: incremental
- * **H3K4me3:** "activating" mark associated with active transcription
- * **H3K4me2:** Correlated with H3K4me3, hypothesized as a "poised" state
- * **H3K27me3:** "repressive" mark associated with inactive genes
- * All 3 involved in T-cell differentiation, but activation dynamics
- unexplored
- :::
- ## ChIP-seq sequences DNA bound to marked histones[^chipseq]
- \centering
- { height=70% }
- [^chipseq]: [Furey. "ChIP-seq and beyond: New and improved methodologies to detect and characterize protein-DNA interactions". In: Nature Reviews Genetics (2012)](http://www.nature.com/articles/nrg3306)
- ## A few intermediate analysis steps are required
- 
- ## Histone modifications occur on consecutive histones
- ![ChIP-seq coverage in IL2 gene[^lamerethesis]](graphics/presentation/LaMere-thesis-fig3.9-SVG-CROP.png){ height=65% }
- [^lamerethesis]: Sarah LaMere. "Dynamic epigenetic regulation of CD4 T cell activation and memory formation". PhD thesis. TSRI, 2015.
- ## Histone modifications occur on consecutive histones
- 
- ## Histone modifications occur on consecutive histones
- 
- ## Histone modifications occur on consecutive histones
- 
- ## SICER identifies enriched regions across the genome
- ![Finding "islands" of coverage with SICER[^sicer]](graphics/presentation/SICER-fig1-SVG.png)
- [^sicer]: [Zang et al. “A clustering approach for identification of enriched domains from histone modification ChIP-Seq data”. In: Bioinformatics 25.15 (2009)](https://doi.org/10.1093/bioinformatics/btp340)
- ## IDR identifies *reproducible* enriched regions
- ![Example irreproducible discovery rate[^idr] score consistency plot](graphics/presentation/IDR-example-CROP-RASTER.png){ height=65% }
- [^idr]: [Li et al. “Measuring reproducibility of high-throughput experiments”. In: AOAS (2011)](https://doi.org/10.1214/11-AOAS466)
- ## Finding enriched regions across the genome
- 
- ## Each histone mark has an "effective promoter radius"
- 
- ## Peaks in promoters correlate with gene expression
- 
- ## Peaks in promoters correlate with gene expression
- 
- ## Peaks in promoters correlate with gene expression
- 
- ## Peaks in promoters correlate with gene expression
- 
- ## Peaks in promoters correlate with gene expression
- 
- ## The story so far
- <!-- TODO: Left column: text; right column: flip through relevant image -->
- * H3K4me2, H3K4me3, and H3K27me3 occur on many consecutive histones in
- broad regions across the genome
- * These enriched regions occur more commonly within a certain radius
- of gene promoters
- * This "effective promoter radius" is consistent across all samples
- for a given histone mark, but differs between histone marks
- * Presence or absence of a peak within this radius is correlated with
- gene expression
-
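The presence-or-absence call in the last bullet can be sketched as follows. This is a minimal illustration, not the analysis code behind these slides: the function name, the single-chromosome coordinates, and the radius values in the usage lines are all hypothetical.

```python
def has_peak_near_tss(tss, peaks, radius):
    """Return True if any peak lies within `radius` bp of the TSS.

    `peaks` is a list of (start, end) intervals on the same chromosome
    as `tss`. A peak that overlaps the TSS has distance 0.
    """
    for start, end in peaks:
        if start <= tss <= end:
            return True
        # Distance from the TSS to the nearest edge of the peak.
        distance = min(abs(start - tss), abs(end - tss))
        if distance <= radius:
            return True
    return False


# Hypothetical example: one peak starting 2 kb downstream of the TSS.
peaks = [(12_000, 12_500)]
print(has_peak_near_tss(10_000, peaks, radius=1_000))
print(has_peak_near_tss(10_000, peaks, radius=2_500))
```

Because the radius differs between histone marks, the same peak set can yield different calls depending on which mark's radius is used.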
- . . .
- Next: Does the position of a histone modification within a gene
- promoter matter to that gene's expression, or is it merely the
- presence or absence anywhere within the promoter?
-
- ## H3K4me2 promoter neighborhood K-means clusters
- 
- ## H3K4me2 promoter neighborhood cluster PCA
- :::::::::: {.columns}
- ::: {.column width="50%"}
- 
- :::
- ::: {.column width="50%"}
- 
- :::
- ::::::::::
- ## H3K4me2 promoter neighborhood cluster expression
- :::::::::: {.columns}
- ::: {.column width="50%"}
- 
- :::
- ::: {.column width="50%"}
- 
- :::
- ::::::::::
- ## H3K4me3 promoter neighborhood cluster PCA
- :::::::::: {.columns}
- ::: {.column width="50%"}
- 
- :::
- ::: {.column width="50%"}
- 
- :::
- ::::::::::
- ## H3K4me3 promoter neighborhood cluster expression
- :::::::::: {.columns}
- ::: {.column width="50%"}
- 
- :::
- ::: {.column width="50%"}
- 
- :::
- ::::::::::
- ## H3K27me3 promoter neighborhood cluster PCA
- :::::::::: {.columns}
- ::: {.column width="50%"}
- 
- :::
- ::: {.column width="50%"}
- 
- :::
- ::::::::::
- ## H3K27me3 promoter neighborhood cluster expression
- :::::::::: {.columns}
- ::: {.column width="50%"}
- 
- :::
- ::: {.column width="50%"}
- 
- :::
- ::::::::::
- ## What have we learned?
- ### H3K4me2 & H3K4me3
- * Peak closer to promoter $\Rightarrow$ more likely gene is highly
- expressed
- * Slightly asymmetric in favor of peaks downstream of TSS
- . . .
- ### H3K27me3
- * Depletion of H3K27me3 at TSS associated with elevated gene
- expression
- * Enrichment of H3K27me3 upstream of TSS even more strongly associated
- with elevated expression
- * Other coverage profiles not associated with elevated expression
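"Upstream" and "downstream" here are relative to the direction of transcription, so the sign of a peak's offset depends on the gene's strand. The sketch below shows one way to compute a strand-aware offset. The function names and the use of the peak midpoint are assumptions made for illustration, not details taken from the analysis.

```python
def signed_offset(tss, strand, peak_start, peak_end):
    """Offset of the peak midpoint from the TSS, in bp.

    Positive values are downstream of the TSS in the direction of
    transcription, negative values are upstream.
    """
    midpoint = (peak_start + peak_end) // 2
    offset = midpoint - tss
    # On the minus strand, transcription runs toward lower coordinates.
    return offset if strand == "+" else -offset


def side(offset):
    """Classify a signed offset as upstream, downstream, or at the TSS."""
    if offset > 0:
        return "downstream"
    if offset < 0:
        return "upstream"
    return "at TSS"


# The same hypothetical peak is downstream of a plus-strand gene
# and upstream of a minus-strand gene with the same TSS coordinate.
print(side(signed_offset(1_000, "+", 1_200, 1_400)))
print(side(signed_offset(1_000, "-", 1_200, 1_400)))
```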
- ## Differential modification disappears by Day 14
- 
- ## Differential modification disappears by Day 14
- 
- ## Convergence at Day 14 H3K4me2
- 
- ## Convergence at Day 14 H3K4me3
- 
- ## Convergence at Day 14 H3K27me3
- 
- ## Convergence at Day 14 RNA-seq (PC 2 & 3)
- 
- ## MOFA identifies shared variation across all 4 data sets
- 
- ## MOFA identifies shared variation across all 4 data sets
- 
- ## MOFA shared variation captures convergence pattern
- 
- ## What have we learned?
- * Almost no differential modification observed between naïve and
- memory at Day 14, despite plenty of differential modification at
- earlier time points.
- * RNA-seq data and ChIP-seq data for all 3 histone marks show
- "convergence" between naïve and memory by Day 14 in the first 2 or 3
- principal coordinates.
- * MOFA captures this convergence pattern in one of the latent factors,
- indicating that this is a shared pattern across all 4 data sets.
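One simple way to put a number on "convergence" is the distance between the naïve and memory group centroids in the reduced space at each time point. The sketch below assumes that approach. It is not the method used for these slides, and the coordinates are made-up toy values chosen only to show the calculation.

```python
from math import dist
from statistics import fmean


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    return tuple(fmean(axis) for axis in zip(*points))


def group_separation(coords):
    """Distance between naive and memory centroids for each day.

    `coords` maps (cell_type, day) to a list of coordinate tuples,
    one per sample, e.g. positions on the first two principal coordinates.
    """
    days = sorted({day for _, day in coords})
    return {
        day: dist(centroid(coords[("naive", day)]),
                  centroid(coords[("memory", day)]))
        for day in days
    }


# Toy coordinates (invented for illustration, not real data).
toy = {
    ("naive", 0): [(0.0, 0.0), (0.0, 2.0)],
    ("memory", 0): [(3.0, 4.0), (3.0, 6.0)],
    ("naive", 14): [(1.0, 1.0), (1.0, 3.0)],
    ("memory", 14): [(1.0, 2.5), (1.0, 2.5)],
}
separation = group_separation(toy)
print(separation)
```

A separation that shrinks from the early time points to Day 14 would match the visual convergence in the PCoA plots.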
- <!-- ## Slide -->
- <!--  -->
- ## Takeaway 1: Each histone mark has an "effective promoter radius"
- * H3K4me2, H3K4me3, and H3K27me3 ChIP-seq reads are enriched in broad
- regions across the genome, representing areas where the histone
- modification is present
- * These enriched regions occur more commonly within a certain radius
- of gene promoters
- * This "effective promoter radius" is specific to each histone mark
- * Presence or absence of a peak within this radius is correlated with
- gene expression
-
- ## Takeaway 2: Peak position within the promoter is important
- * H3K4me2 and H3K4me3 peaks are more strongly associated with elevated
- gene expression the closer they are to the TSS, with a slight bias
- toward downstream peaks.
- * H3K27me3 depletion at the TSS and enrichment upstream are both
- associated with elevated expression, while other patterns are not.
- * For all histone marks, the position of the modification within the
- promoter appears to be an important factor in its association with
- gene expression
- ## Takeaway 3: Expression & epigenetic state both converge at Day 14
- * At Day 14, almost no differential modification observed between
- naïve and memory cells
- * Naïve and memory cells converge visually in PCoA plots
- * Convergence is a shared pattern of variation across all 3 histone
- marks and gene expression
- * This is consistent with the hypothesis that the naïve cells have
- differentiated into a more memory-like phenotype by Day 14.