|
@@ -487,7 +487,20 @@ Reproducible genome-wide epigenetic analysis of H3K4 and H3K27 methylation
|
|
|
status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Author list: Me, Sarah, Dan
|
|
|
+Chapter author list: Me, Sarah, Dan
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Need better section titles throughout the chapter
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -691,44 +704,6 @@ Promoter counts in sliding windows around each gene's highest-expressed
|
|
|
TSS to investigate coverage distribution within promoters
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Section
|
|
|
-Results
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Note Note
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-Focus on what hypotheses were tested, then select figures that show how
|
|
|
- those hypotheses were tested, even if the result is a negative.
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-Not every interesting result needs to be in here.
|
|
|
- Chapter should tell a story.
|
|
|
-
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-Maybe reorder these sections to do RNA-seq, then ChIP-seq, then combined
|
|
|
- analyses?
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-
|
|
|
-\end_layout
|
|
|
-
|
|
|
\begin_layout Subsection
|
|
|
RNA-seq align+quant method comparison
|
|
|
\end_layout
|
|
@@ -750,7 +725,7 @@ Maybe fix up the excessive axis ranges for these plots?
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -775,10 +750,6 @@ Comparison of STAR quantification between Ensembl and Entrez gene identifiers
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -790,7 +761,7 @@ Comparison of STAR quantification between Ensembl and Entrez gene identifiers
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -827,7 +798,7 @@ Comparison of Salmon+Shoal quantification between Ensembl and Entrez gene
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -863,7 +834,7 @@ Comparison of quantification between STAR and HISAT2 for identical annotation
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -899,7 +870,7 @@ Comparison of quantification between STAR and Salmon for identical annotation
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -936,7 +907,7 @@ n
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -978,6 +949,22 @@ Ultimately selected shoal as quantification, Ensembl as annotation.
|
|
|
an informed decision.
|
|
|
\end_layout
|
|
|
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset ERT
|
|
|
+status collapsed
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+
|
|
|
+
|
|
|
+\backslash
|
|
|
+FloatBarrier
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
+\end_layout
|
|
|
+
|
|
|
\begin_layout Subsection
|
|
|
RNA-seq has a large confounding batch effect
|
|
|
\end_layout
|
|
@@ -986,7 +973,7 @@ RNA-seq has a large confounding batch effect
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\begin_inset Flex TODO Note (inline)
|
|
@@ -1032,10 +1019,6 @@ RNA-seq sample weights, grouped by experimental and technical covariates
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1047,7 +1030,7 @@ RNA-seq sample weights, grouped by experimental and technical covariates
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -1091,7 +1074,7 @@ RNA-seq PCoA plot showing clear batch effect
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\begin_inset Flex TODO Note (inline)
|
|
@@ -1148,7 +1131,7 @@ RNA-seq PCoA plot showing clear batch effect
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
@@ -1208,18 +1191,15 @@ Figures showing p-value histograms for within-batch and cross-batch contrasts,
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Subsection
|
|
|
-H3K4 and H3K27 methylation occur in broad regions and are enriched near
|
|
|
- promoters
|
|
|
-\end_layout
|
|
|
-
|
|
|
\begin_layout Standard
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+\begin_inset ERT
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Replace these figures with a single table of # of peaks called at chosen
|
|
|
- IDR threshold, showing that SICER has more
|
|
|
+
|
|
|
+
|
|
|
+\backslash
|
|
|
+FloatBarrier
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1227,19 +1207,23 @@ Replace these figures with a single table of # of peaks called at chosen
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
+\begin_layout Subsection
|
|
|
+ChIP-seq blacklisting is important
|
|
|
+\end_layout
|
|
|
+
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
-\end_layout
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/csaw/CCF-plots-PAGE2-CROP.pdf
|
|
|
+ lyxscale 50
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1250,15 +1234,7 @@ Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
\begin_inset Caption Standard
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-
|
|
|
-\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:IDR-RC-H3K4me2"
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-Irreproducible Discovery Rate consistency plots for H3K4me2
|
|
|
+Cross-correlation plots with blacklisted reads removed
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1275,15 +1251,15 @@ Irreproducible Discovery Rate consistency plots for H3K4me2
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
-\end_layout
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/csaw/CCF-plots-noBL-PAGE2-CROP.pdf
|
|
|
+ lyxscale 50
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1294,15 +1270,12 @@ Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
\begin_inset Caption Standard
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-
|
|
|
-\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:IDR-RC-H3K4me3"
|
|
|
+Cross-correlation plots without removing blacklisted reads
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-Irreproducible Discovery Rate consistency plots for H3K4me3
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1310,6 +1283,18 @@ Irreproducible Discovery Rate consistency plots for H3K4me3
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
+\begin_layout Subsection
|
|
|
+ChIP-seq normalization
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Maybe just one of these figures and then say the other 2 were similar
|
|
|
+\end_layout
|
|
|
+
|
|
|
\end_inset
|
|
|
|
|
|
|
|
@@ -1322,12 +1307,12 @@ sideways false
|
|
|
status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
-\end_layout
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K4me2-sample-MAplot-bins-CROP.png
|
|
|
+ lyxscale 25
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1340,13 +1325,7 @@ Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
\begin_layout Plain Layout
|
|
|
|
|
|
\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:IDR-RC-H3K27me3"
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-Irreproducible Discovery Rate consistency plots for H3K27me3
|
|
|
+MA plot of H3K4me2 read counts in 10kb bins for two arbitrary samples
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1360,23 +1339,18 @@ Irreproducible Discovery Rate consistency plots for H3K27me3
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
-\begin_inset Float table
|
|
|
+\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-Need
|
|
|
-\emph on
|
|
|
-median
|
|
|
-\emph default
|
|
|
- peak width, not mean
|
|
|
-\end_layout
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K4me3-sample-MAplot-bins-CROP.png
|
|
|
+ lyxscale 25
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1384,204 +1358,137 @@ median
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Tabular
|
|
|
-<lyxtabular version="3" rows="4" columns="5">
|
|
|
-<features tabularvalignment="middle">
|
|
|
-<column alignment="center" valignment="top">
|
|
|
-<column alignment="center" valignment="top">
|
|
|
-<column alignment="center" valignment="top">
|
|
|
-<column alignment="center" valignment="top">
|
|
|
-<column alignment="center" valignment="top">
|
|
|
-<row>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_inset Caption Standard
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Histone Mark
|
|
|
+
|
|
|
+\series bold
|
|
|
+MA plot of H3K4me3 read counts in 10kb bins for two arbitrary samples
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-# Peaks
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-Mean peak width
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Float figure
|
|
|
+wide false
|
|
|
+sideways false
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-genome coverage
|
|
|
-\end_layout
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K27me3-sample-MAplot-bins-CROP.png
|
|
|
+ lyxscale 25
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-read coverage
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-</row>
|
|
|
-<row>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-H3K4me2
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
-
|
|
|
\begin_layout Plain Layout
|
|
|
-14965
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_inset Caption Standard
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-3970
|
|
|
+
|
|
|
+\series bold
|
|
|
+MA plot of H3K27me3 read counts in 10kb bins for two arbitrary samples
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-1.92%
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-14.2%
|
|
|
-\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-</row>
|
|
|
-<row>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-H3K4me3
|
|
|
+\begin_layout Subsection
|
|
|
+ChIP-seq must be corrected for hidden confounding factors
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-6163
|
|
|
+Consolidate these into 1 2x3 grid.
|
|
|
+ For now, just refer to them as if they were a single figure.
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-2946
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Float figure
|
|
|
+wide false
|
|
|
+sideways false
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-0.588%
|
|
|
-\end_layout
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-raw-CROP.png
|
|
|
+ lyxscale 25
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-6.57%
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-</row>
|
|
|
-<row>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_layout Plain Layout
|
|
|
+\begin_inset Caption Standard
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-H3K27me3
|
|
|
-\end_layout
|
|
|
+
|
|
|
+\series bold
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:PCoA-H3K4me2-bad"
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-18139
|
|
|
+PCoA plot of H3K4me2 windows, before subtracting surrogate variables
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-18967
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-11.1%
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
|
|
|
-\begin_inset Text
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Float figure
|
|
|
+wide false
|
|
|
+sideways false
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-22.5%
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-</cell>
|
|
|
-</row>
|
|
|
-</lyxtabular>
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-SVsub-CROP.png
|
|
|
+ lyxscale 25
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1596,11 +1503,11 @@ H3K27me3
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "tab:peak-calling-summary"
|
|
|
+name "fig:PCoA-H3K4me2-good"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-SICER+IDR peak-calling summary
|
|
|
+PCoA plot of H3K4me2 windows, after subtracting surrogate variables
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1614,71 +1521,62 @@ SICER+IDR peak-calling summary
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
-Figures
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "fig:IDR-RC-H3K4me2"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
+\begin_inset Float figure
|
|
|
+wide false
|
|
|
+sideways false
|
|
|
+status collapsed
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+\align center
|
|
|
+\begin_inset Graphics
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-raw-CROP.png
|
|
|
+ lyxscale 25
|
|
|
+ width 100col%
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-,
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "fig:IDR-RC-H3K4me3"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
+
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+\begin_inset Caption Standard
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+
|
|
|
+\series bold
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:PCoA-H3K4me3-bad"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-, and
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "fig:IDR-RC-H3K27me3"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
+PCoA plot of H3K4me3 windows, before subtracting surrogate variables
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
- show the IDR rank-consistency plots for peaks called in an arbitrarily-chosen
|
|
|
- pair of donors.
|
|
|
- For all 3 histone marks, when the peaks for each donor are ranked according
|
|
|
- to their scores, SICER produces much more reproducible results between
|
|
|
- donors.
|
|
|
- This is consistent with SICER's stated goal of identifying broad peaks,
|
|
|
- in contrast to MACS, which is designed for identifying sharp peaks.
|
|
|
- Based on this observation, the SICER peak calls were used for all downstream
|
|
|
- analyses that involved ChIP-seq peaks.
|
|
|
- Table
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "tab:peak-calling-summary"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
+
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
- gives a summary of the peak calling statistics for each histone mark.
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/Promoter Peak Distance Profile-PAGE1-CROP.pdf
|
|
|
- lyxscale 50
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-SVsub-CROP.png
|
|
|
+ lyxscale 25
|
|
|
width 100col%
|
|
|
- groupId colwidth
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1693,20 +1591,16 @@ status open
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:effective-promoter-radius"
|
|
|
+name "fig:PCoA-H3K4me3-good"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-Enrichment of peaks in promoter neighborhoods.
|
|
|
+PCoA plot of H3K4me3 windows, after subtracting surrogate variables
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1714,58 +1608,38 @@ Enrichment of peaks in promoter neighborhoods.
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
-Each histone mark is enriched within a certain radius of gene TSS positions,
|
|
|
- but that radius is different for each mark (figure
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "fig:effective-promoter-radius"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-, previously in
|
|
|
-\begin_inset CommandInset citation
|
|
|
-LatexCommand cite
|
|
|
-key "LaMere2016"
|
|
|
-literal "false"
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
- Fig.
|
|
|
- S2)
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Subsection
|
|
|
-ChIP-seq blacklisting is important
|
|
|
-\end_layout
|
|
|
-
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/csaw/CCF-plots-PAGE2-CROP.pdf
|
|
|
- lyxscale 50
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-raw-CROP.png
|
|
|
+ lyxscale 25
|
|
|
width 100col%
|
|
|
- groupId colwidth
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Caption Standard
|
|
|
+\begin_layout Plain Layout
|
|
|
+\begin_inset Caption Standard
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+
|
|
|
+\series bold
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:PCoA-H3K27me3-bad"
|
|
|
+
|
|
|
+\end_inset
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-Cross-correlation plots with blacklisted reads removed
|
|
|
+PCoA plot of H3K27me3 windows, before subtracting surrogate variables
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1782,15 +1656,15 @@ Cross-correlation plots with blacklisted reads removed
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status open
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/csaw/CCF-plots-noBL-PAGE2-CROP.pdf
|
|
|
- lyxscale 50
|
|
|
+ filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-SVsub-CROP.png
|
|
|
+ lyxscale 25
|
|
|
width 100col%
|
|
|
- groupId colwidth
|
|
|
+ groupId colwidth-raster
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -1801,7 +1675,15 @@ status open
|
|
|
\begin_inset Caption Standard
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Cross-correlation plots without removing blacklisted reads
|
|
|
+
|
|
|
+\series bold
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:PCoA-H3K27me3-good"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+PCoA plot of H3K27me3 windows, after subtracting surrogate variables
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1814,21 +1696,29 @@ Cross-correlation plots without removing blacklisted reads
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Subsection
|
|
|
-ChIP-seq normalization
|
|
|
+\begin_layout Itemize
|
|
|
+Figures showing BCV plots with and without SVA for each histone mark.
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+\begin_inset ERT
|
|
|
+status collapsed
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Maybe just one of these figures and then say the other 2 were similar
|
|
|
+
|
|
|
+
|
|
|
+\backslash
|
|
|
+FloatBarrier
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
+
|
|
|
+\end_layout
|
|
|
|
|
|
+\begin_layout Subsection
|
|
|
+MOFA recovers biologically relevant variation from blind analysis by correlating
|
|
|
+ across datasets
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
@@ -1840,7 +1730,7 @@ status open
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K4me2-sample-MAplot-bins-CROP.png
|
|
|
+ filename graphics/CD4-csaw/MOFA-varExplaiend-matrix-CROP.png
|
|
|
lyxscale 25
|
|
|
width 100col%
|
|
|
groupId colwidth-raster
|
|
@@ -1856,7 +1746,13 @@ status open
|
|
|
\begin_layout Plain Layout
|
|
|
|
|
|
\series bold
|
|
|
-MA plot of H3K4me2 read counts in 10kb bins for two arbitrary samples
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:mofa-varexplained"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+Variance explained in each data set by each latent factor estimated by MOFA.
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1869,16 +1765,43 @@ MA plot of H3K4me2 read counts in 10kb bins for two arbitrary samples
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
+\begin_layout Itemize
|
|
|
+Figure
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "fig:mofa-varexplained"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+ shows that LF1, 4, and 5 explain substantial variance in all data sets
|
|
|
+\end_layout
|
|
|
+
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
status open
|
|
|
|
|
|
+\begin_layout Plain Layout
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Maybe drop this one
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
+\end_layout
|
|
|
+
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K4me3-sample-MAplot-bins-CROP.png
|
|
|
+ filename graphics/CD4-csaw/MOFA-LF-distributions-CROP.png
|
|
|
lyxscale 25
|
|
|
width 100col%
|
|
|
groupId colwidth-raster
|
|
@@ -1894,7 +1817,13 @@ status open
|
|
|
\begin_layout Plain Layout
|
|
|
|
|
|
\series bold
|
|
|
-MA plot of H3K4me3 read counts in 10kb bins for two arbitrary samples
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:mofa-lf-dist"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+Sample distribution for each latent factor estimated by MOFA.
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1913,10 +1842,23 @@ wide false
|
|
|
sideways false
|
|
|
status open
|
|
|
|
|
|
+\begin_layout Plain Layout
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Talk about how this supports the convergence hypothesis
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
+\end_layout
|
|
|
+
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K27me3-sample-MAplot-bins-CROP.png
|
|
|
+ filename graphics/CD4-csaw/MOFA-LF-scatter-CROP.png
|
|
|
lyxscale 25
|
|
|
width 100col%
|
|
|
groupId colwidth-raster
|
|
@@ -1932,7 +1874,13 @@ status open
|
|
|
\begin_layout Plain Layout
|
|
|
|
|
|
\series bold
|
|
|
-MA plot of H3K27me3 read counts in 10kb bins for two arbitrary samples
|
|
|
+\begin_inset CommandInset label
|
|
|
+LatexCommand label
|
|
|
+name "fig:mofa-lf-scatter"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+Scatter plots of specific pairs of MOFA latent factors.
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -1945,33 +1893,45 @@ MA plot of H3K27me3 read counts in 10kb bins for two arbitrary samples
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Subsection
|
|
|
-ChIP-seq must be corrected for hidden confounding factors
|
|
|
-\end_layout
|
|
|
+\begin_layout Itemize
|
|
|
+Figures
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "fig:mofa-lf-dist"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+\end_inset
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-Consolidate these into 1 2x3 grid
|
|
|
-\end_layout
|
|
|
+ and
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "fig:mofa-lf-scatter"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
+ show that those same 3 LFs, (1, 4, & 5) also correlate best with the experiment
|
|
|
+al factors (cell type & time point)
|
|
|
+\end_layout
|
|
|
|
|
|
+\begin_layout Itemize
|
|
|
+LF2 is clearly the RNA-seq batch effect
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status collapsed
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-raw-CROP.png
|
|
|
+ filename graphics/CD4-csaw/MOFA-batch-correct-CROP.png
|
|
|
lyxscale 25
|
|
|
width 100col%
|
|
|
groupId colwidth-raster
|
|
@@ -1989,11 +1949,16 @@ status collapsed
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:PCoA-H3K4me2-bad"
|
|
|
+name "fig:mofa-batchsub"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-PCoA plot of H3K4me2 windows, before subtracting surrogate variables
|
|
|
+Result of RNA-seq batch-correction using MOFA latent factors
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2001,48 +1966,117 @@ PCoA plot of H3K4me2 windows, before subtracting surrogate variables
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
+\begin_layout Itemize
|
|
|
+Attempting to remove the effect of LF2 (Figure
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "fig:mofa-batchsub"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+) results in batch correction comparable to ComBat (Figure
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "fig:RNA-PCA-ComBat-batchsub"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
+
|
|
|
\end_inset
|
|
|
|
|
|
+)
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Itemize
|
|
|
+MOFA was able to do this batch subtraction without directly using the sample
|
|
|
+ labels (sample labels were used implicitly to select which factor to subtract)
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Itemize
|
|
|
+Similarity of results shows that batch correction can't get much better
|
|
|
+ than ComBat (despite ComBat ignoring time point)
|
|
|
+\end_layout
|
|
|
|
|
|
+\begin_layout Subsection
|
|
|
+MOFA separates biological from technical variation but is mostly confirmatory in this context
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
-\begin_inset Float figure
|
|
|
-wide false
|
|
|
-sideways false
|
|
|
-status collapsed
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-SVsub-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+MOFA should be a footnote to something else, not its own point
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Caption Standard
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-
|
|
|
-\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:PCoA-H3K4me2-good"
|
|
|
+Combine with previous subsection
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-PCoA plot of H3K4me2 windows, after subtracting surrogate variables
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
+\begin_layout Itemize
|
|
|
+MOFA shows great promise for accelerating discovery of major biological
|
|
|
+ effects in multi-omics datasets
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_deeper
|
|
|
+\begin_layout Itemize
|
|
|
+MOFA successfully separates biologically relevant patterns of variation
|
|
|
+ from technical confounding factors without knowing the sample labels, by
|
|
|
+ finding latent factors that explain variation across multiple data sets.
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Itemize
|
|
|
+MOFA was added to this analysis late and played primarily a confirmatory
|
|
|
+ role, but it was able to confirm earlier conclusions with much less prior
|
|
|
+ information (no sample labels) and much less analyst effort/input
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Itemize
|
|
|
+Less input from the analyst means less opportunity to introduce unwanted bias
|
|
|
+ into results
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Itemize
|
|
|
+MOFA confirmed that the batch correction already implemented in the RNA-seq
|
|
|
+ data was performing as well as possible given the limitations of
|
|
|
+ the data
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_deeper
|
|
|
+\begin_layout Section
|
|
|
+Results
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Note Note
|
|
|
+status open
|
|
|
|
|
|
+\begin_layout Plain Layout
|
|
|
+Focus on what hypotheses were tested, then select figures that show how
|
|
|
+ those hypotheses were tested, even if the result is a negative.
|
|
|
+\end_layout
|
|
|
|
|
|
+\begin_layout Plain Layout
|
|
|
+Not every interesting result needs to be in here.
|
|
|
+ Chapter should tell a story.
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2051,42 +2085,31 @@ PCoA plot of H3K4me2 windows, after subtracting surrogate variables
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
|
-\begin_inset Float figure
|
|
|
-wide false
|
|
|
-sideways false
|
|
|
-status collapsed
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-raw-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+Maybe reorder these sections to do RNA-seq, then ChIP-seq, then combined
|
|
|
+ analyses?
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Caption Standard
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-
|
|
|
-\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:PCoA-H3K4me3-bad"
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-PCoA plot of H3K4me3 windows, before subtracting surrogate variables
|
|
|
+\begin_layout Subsection
|
|
|
+H3K4 and H3K27 methylation occur in broad regions and are enriched near
|
|
|
+ promoters
|
|
|
\end_layout
|
|
|
|
|
|
-\end_inset
|
|
|
-
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
|
|
|
+\begin_layout Plain Layout
|
|
|
+Replace these figures with a single table of # of peaks called at chosen
|
|
|
+ IDR threshold, showing that SICER has more
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2098,15 +2121,15 @@ PCoA plot of H3K4me3 windows, before subtracting surrogate variables
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status collapsed
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-SVsub-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -2121,11 +2144,11 @@ status collapsed
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:PCoA-H3K4me3-good"
|
|
|
+name "fig:IDR-RC-H3K4me2"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-PCoA plot of H3K4me3 windows, after subtracting surrogate variables
|
|
|
+Irreproducible Discovery Rate consistency plots for H3K4me2
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2142,15 +2165,15 @@ PCoA plot of H3K4me3 windows, after subtracting surrogate variables
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status collapsed
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-raw-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -2165,11 +2188,11 @@ status collapsed
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:PCoA-H3K27me3-bad"
|
|
|
+name "fig:IDR-RC-H3K4me3"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-PCoA plot of H3K27me3 windows, before subtracting surrogate variables
|
|
|
+Irreproducible Discovery Rate consistency plots for H3K4me3
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2186,15 +2209,15 @@ PCoA plot of H3K27me3 windows, before subtracting surrogate variables
|
|
|
\begin_inset Float figure
|
|
|
wide false
|
|
|
sideways false
|
|
|
-status collapsed
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-SVsub-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Re-generate IDR rank consistency plots for SICER and MACS side-by-side
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -2209,11 +2232,11 @@ status collapsed
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:PCoA-H3K27me3-good"
|
|
|
+name "fig:IDR-RC-H3K27me3"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-PCoA plot of H3K27me3 windows, after subtracting surrogate variables
|
|
|
+Irreproducible Discovery Rate consistency plots for H3K27me3
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2226,35 +2249,23 @@ PCoA plot of H3K27me3 windows, after subtracting surrogate variables
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
-Figures showing BCV plots with and without SVA for each histone mark.
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Itemize
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Float table
|
|
|
+wide false
|
|
|
+sideways false
|
|
|
status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Can I do supplementary data on a thesis? This is a lot of plots for this
|
|
|
- section.
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Subsection
|
|
|
-H3K4 and H3K27 promoter methylation has broadly the expected correlation
|
|
|
- with gene expression
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Standard
|
|
|
+\align center
|
|
|
\begin_inset Flex TODO Note (inline)
|
|
|
status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-This section can easily be cut, especially if I can't find those plots.
|
|
|
+Need
|
|
|
+\emph on
|
|
|
+median
|
|
|
+\emph default
|
|
|
+ peak width, not mean
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2262,181 +2273,205 @@ This section can easily be cut, especially if I can't find those plots.
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
-H3K4 is correlated with higher expression, and H3K27 is correlated with
|
|
|
- lower expression genome-wide
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+\begin_layout Plain Layout
|
|
|
+\align center
|
|
|
+\begin_inset Tabular
|
|
|
+<lyxtabular version="3" rows="4" columns="5">
|
|
|
+<features tabularvalignment="middle">
|
|
|
+<column alignment="center" valignment="top">
|
|
|
+<column alignment="center" valignment="top">
|
|
|
+<column alignment="center" valignment="top">
|
|
|
+<column alignment="center" valignment="top">
|
|
|
+<column alignment="center" valignment="top">
|
|
|
+<row>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Grr, gotta find these figures.
|
|
|
- Maybe in the old analysis?
|
|
|
+Histone Mark
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Itemize
|
|
|
-Figures showing these correlations: box/violin plots of expression distributions
|
|
|
- with every combination of peak presence/absence in promoter
|
|
|
+\begin_layout Plain Layout
|
|
|
+# Peaks
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
-Appropriate statistical tests showing significant differences in expected
|
|
|
- directions
|
|
|
-\end_layout
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-\begin_layout Subsection
|
|
|
-MOFA recovers biologically relevant variation from blind analysis by correlating
|
|
|
- across datasets
|
|
|
+\begin_layout Plain Layout
|
|
|
+Mean peak width
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Float figure
|
|
|
-wide false
|
|
|
-sideways false
|
|
|
-status open
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/MOFA-varExplaiend-matrix-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+Genome coverage
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+Read coverage
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Caption Standard
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+</row>
|
|
|
+<row>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-
|
|
|
-\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:mofa-varexplained"
|
|
|
+H3K4me2
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-Variance explained in each data set by each latent factor estimated by MOFA.
|
|
|
+\begin_layout Plain Layout
|
|
|
+14965
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+3970
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+1.92%
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
-Figure
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "fig:mofa-varexplained"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
-
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
- shows that LF1, 4, and 5 explain substantial var in all data sets
|
|
|
+\begin_layout Plain Layout
|
|
|
+14.2%
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Float figure
|
|
|
-wide false
|
|
|
-sideways false
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+</row>
|
|
|
+<row>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Maybe drop this one
|
|
|
+H3K4me3
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
-
|
|
|
-
|
|
|
-\end_layout
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/MOFA-LF-distributions-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+6163
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+2946
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-\begin_inset Caption Standard
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-
|
|
|
-\series bold
|
|
|
-\begin_inset CommandInset label
|
|
|
-LatexCommand label
|
|
|
-name "fig:mofa-lf-dist"
|
|
|
+0.588%
|
|
|
+\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-Sample distribution for each latent factor estimated by MOFA.
|
|
|
+\begin_layout Plain Layout
|
|
|
+6.57%
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+</row>
|
|
|
+<row>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+H3K27me3
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+18139
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Float figure
|
|
|
-wide false
|
|
|
-sideways false
|
|
|
-status open
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
+18967
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Talk about how this supports the convergence hypothesis
|
|
|
+11.1%
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
+</cell>
|
|
|
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
|
|
|
+\begin_inset Text
|
|
|
|
|
|
-
|
|
|
+\begin_layout Plain Layout
|
|
|
+22.5%
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Plain Layout
|
|
|
-\align center
|
|
|
-\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/MOFA-LF-scatter-CROP.png
|
|
|
- lyxscale 25
|
|
|
- width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+\end_inset
|
|
|
+</cell>
|
|
|
+</row>
|
|
|
+</lyxtabular>
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -2451,11 +2486,11 @@ Talk about how this supports the convergence hypothesis
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:mofa-lf-scatter"
|
|
|
+name "tab:peak-calling-summary"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-Scatter plots of specific pairs of MOFA latent factors.
|
|
|
+SICER+IDR peak-calling summary
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2468,33 +2503,57 @@ Scatter plots of specific pairs of MOFA latent factors.
|
|
|
|
|
|
\end_layout
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
+\begin_layout Standard
|
|
|
Figures
|
|
|
\begin_inset CommandInset ref
|
|
|
LatexCommand ref
|
|
|
-reference "fig:mofa-lf-dist"
|
|
|
+reference "fig:IDR-RC-H3K4me2"
|
|
|
plural "false"
|
|
|
caps "false"
|
|
|
noprefix "false"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
- and
|
|
|
+,
|
|
|
\begin_inset CommandInset ref
|
|
|
LatexCommand ref
|
|
|
-reference "fig:mofa-lf-scatter"
|
|
|
+reference "fig:IDR-RC-H3K4me3"
|
|
|
plural "false"
|
|
|
caps "false"
|
|
|
noprefix "false"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
- show that those same 3 LFs, (1, 4, & 5) also correlate best with the experiment
|
|
|
-al factors (cell type & time point)
|
|
|
-\end_layout
|
|
|
+, and
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "fig:IDR-RC-H3K27me3"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
|
|
|
-\begin_layout Itemize
|
|
|
-LF2 is clearly the RNA-seq batch effect
|
|
|
+\end_inset
|
|
|
+
|
|
|
+ show the IDR rank-consistency plots for peaks called in an arbitrarily chosen
|
|
|
+ pair of donors.
|
|
|
+ For all 3 histone marks, when the peaks for each donor are ranked according
|
|
|
+ to their scores, SICER produces much more reproducible results between
|
|
|
+ donors than MACS.
|
|
|
+ This is consistent with SICER's stated goal of identifying broad peaks,
|
|
|
+ in contrast to MACS, which is designed for identifying sharp peaks.
|
|
|
+ Based on this observation, the SICER peak calls were used for all downstream
|
|
|
+ analyses that involved ChIP-seq peaks.
|
|
|
+ Table
|
|
|
+\begin_inset CommandInset ref
|
|
|
+LatexCommand ref
|
|
|
+reference "tab:peak-calling-summary"
|
|
|
+plural "false"
|
|
|
+caps "false"
|
|
|
+noprefix "false"
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+ gives a summary of the peak calling statistics for each histone mark.
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Standard
|
|
@@ -2506,10 +2565,10 @@ status open
|
|
|
\begin_layout Plain Layout
|
|
|
\align center
|
|
|
\begin_inset Graphics
|
|
|
- filename graphics/CD4-csaw/MOFA-batch-correct-CROP.png
|
|
|
- lyxscale 25
|
|
|
+ filename graphics/CD4-csaw/Promoter Peak Distance Profile-PAGE1-CROP.pdf
|
|
|
+ lyxscale 50
|
|
|
width 100col%
|
|
|
- groupId colwidth-raster
|
|
|
+ groupId colwidth
|
|
|
|
|
|
\end_inset
|
|
|
|
|
@@ -2524,16 +2583,20 @@ status open
|
|
|
\series bold
|
|
|
\begin_inset CommandInset label
|
|
|
LatexCommand label
|
|
|
-name "fig:mofa-batchsub"
|
|
|
+name "fig:effective-promoter-radius"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-Result of RNA-seq batch-correction using MOFA latent factors
|
|
|
+Enrichment of peaks in promoter neighborhoods.
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -2542,37 +2605,74 @@ Result of RNA-seq batch-correction using MOFA latent factors
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Itemize
|
|
|
-Attempting to remove the effect of LF2 (Figure
|
|
|
+Each histone mark is enriched within a certain radius of gene TSS positions,
|
|
|
+ but that radius is different for each mark (Figure
|
|
|
\begin_inset CommandInset ref
|
|
|
LatexCommand ref
|
|
|
-reference "fig:mofa-batchsub"
|
|
|
+reference "fig:effective-promoter-radius"
|
|
|
plural "false"
|
|
|
caps "false"
|
|
|
noprefix "false"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-) results in batch correction comparable to ComBat (Figure
|
|
|
-\begin_inset CommandInset ref
|
|
|
-LatexCommand ref
|
|
|
-reference "fig:RNA-PCA-ComBat-batchsub"
|
|
|
-plural "false"
|
|
|
-caps "false"
|
|
|
-noprefix "false"
|
|
|
+, previously in
|
|
|
+\begin_inset CommandInset citation
|
|
|
+LatexCommand cite
|
|
|
+key "LaMere2016"
|
|
|
+literal "false"
|
|
|
|
|
|
\end_inset
|
|
|
|
|
|
-)
|
|
|
+ Fig.
|
|
|
+ S2)
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Subsection
|
|
|
+H3K4 and H3K27 promoter methylation has broadly the expected correlation
|
|
|
+ with gene expression
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+This section can easily be cut, especially if I can't find those plots.
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Itemize
|
|
|
-MOFA was able to do this batch subtraction without directly using the sample
|
|
|
- labels (sample labels were used implicitly to select which factor to subtract)
|
|
|
+H3K4 is correlated with higher expression, and H3K27 is correlated with
|
|
|
+ lower expression genome-wide
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Standard
|
|
|
+\begin_inset Flex TODO Note (inline)
|
|
|
+status open
|
|
|
+
|
|
|
+\begin_layout Plain Layout
|
|
|
+Grr, gotta find these figures.
|
|
|
+ Maybe in the old analysis?
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\end_inset
|
|
|
+
|
|
|
+
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Itemize
|
|
|
-Similarity of results shows that batch correction can't get much better
|
|
|
- than ComBat (despite ComBat ignoring time point)
|
|
|
+Figures showing these correlations: box/violin plots of expression distributions
|
|
|
+ with every combination of peak presence/absence in promoter
|
|
|
+\end_layout
|
|
|
+
|
|
|
+\begin_layout Itemize
|
|
|
+Appropriate statistical tests showing significant differences in expected
|
|
|
+ directions
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Subsection
|
|
@@ -2995,46 +3095,23 @@ Try to boil it down to 3 main messages to get across
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Itemize
|
|
|
-"Promoter radius" is not constant and must be defined empirically for a
|
|
|
- given data set.
|
|
|
- Coverage within promoter radius has an expression correlation as well
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Standard
|
|
|
-\begin_inset Flex TODO Note (inline)
|
|
|
-status open
|
|
|
-
|
|
|
-\begin_layout Plain Layout
|
|
|
-MOFA should be a footnote to something else, not its own point
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\end_inset
|
|
|
-
|
|
|
-
|
|
|
-\end_layout
|
|
|
-
|
|
|
-\begin_layout Itemize
|
|
|
-MOFA shows great promise for accelerating discovery of major biological
|
|
|
- effects in multi-omics datasets
|
|
|
+3 Main points
|
|
|
\end_layout
|
|
|
|
|
|
\begin_deeper
|
|
|
\begin_layout Itemize
|
|
|
-MOFA successfully separates biologically relevant patterns of variation
|
|
|
- from technical confounding factors without knowing the sample labels, by
|
|
|
- finding latent factors that explain variation across multiple data sets.
|
|
|
+"Promoter radius" is not constant and must be defined empirically for a
|
|
|
+ given data set.
|
|
|
+ Coverage within promoter radius has an expression correlation as well
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Itemize
|
|
|
-MOFA was added to this analysis late and played primarily a confirmatory
|
|
|
- role, but it was able to confirm earlier conclusions with much less prior
|
|
|
- information (no sample labels) and much less analyst effort
|
|
|
+Naive-to-memory convergence in certain data sets but not others implies
|
|
|
+ which marks are involved in memory differentiation
|
|
|
\end_layout
|
|
|
|
|
|
\begin_layout Itemize
|
|
|
-MOFA confirmed that the already-implemented batch correction in the RNA-seq
|
|
|
- data was already performing as well as possible given the limitations of
|
|
|
- the data
|
|
|
+TSS positional coverage: hints of something interesting but no clear conclusions
|
|
|
\end_layout
|
|
|
|
|
|
\end_deeper
|
|
@@ -3169,7 +3246,7 @@ Improving array-based analyses of transplant rejection by optimizing data
|
|
|
status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Author list: Me, Sunil, Tom, Padma, Dan
|
|
|
+Chapter author list: Me, Sunil, Tom, Padma, Dan
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|
|
@@ -3530,7 +3607,7 @@ literal "true"
|
|
|
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Flex TODO Note (inline)
|
|
|
-status collapsed
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
Find appropriate GEO identifiers if possible.
|
|
@@ -3583,10 +3660,11 @@ on of TX and AR samples was considered.
|
|
|
|
|
|
\begin_layout Standard
|
|
|
\begin_inset Flex TODO Note (inline)
|
|
|
-status collapsed
|
|
|
+status open
|
|
|
|
|
|
\begin_layout Plain Layout
|
|
|
-Summarize the get.best.threshold algorithm for PAM threshold selection
|
|
|
+Summarize the get.best.threshold algorithm for PAM threshold selection, or
|
|
|
+ just put the code online?
|
|
|
\end_layout
|
|
|
|
|
|
\end_inset
|