Restructure Chapter 2 according to Andrew's advice

Ryan C. Thompson, 5 years ago
Parent
Commit
96cdef6477
1 file changed, with 701 insertions and 623 deletions

+ 701 - 623
thesis.lyx

@@ -487,7 +487,20 @@ Reproducible genome-wide epigenetic analysis of H3K4 and H3K27 methylation
 status open
 
 \begin_layout Plain Layout
-Author list: Me, Sarah, Dan
+Chapter author list: Me, Sarah, Dan
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Need better section titles throughout the chapter
 \end_layout
 
 \end_inset
@@ -691,44 +704,6 @@ Promoter counts in sliding windows around each gene's highest-expressed
  TSS to investigate coverage distribution within promoters
 \end_layout
 
-\begin_layout Section
-Results
-\end_layout
-
-\begin_layout Standard
-\begin_inset Note Note
-status open
-
-\begin_layout Plain Layout
-Focus on what hypotheses were tested, then select figures that show how
- those hypotheses were tested, even if the result is a negative.
-\end_layout
-
-\begin_layout Plain Layout
-Not every interesting result needs to be in here.
- Chapter should tell a story.
- 
-\end_layout
-
-\end_inset
-
-
-\end_layout
-
-\begin_layout Standard
-\begin_inset Flex TODO Note (inline)
-status open
-
-\begin_layout Plain Layout
-Maybe reorder these sections to do RNA-seq, then ChIP-seq, then combined
- analyses?
-\end_layout
-
-\end_inset
-
-
-\end_layout
-
 \begin_layout Subsection
 RNA-seq align+quant method comparison
 \end_layout
@@ -750,7 +725,7 @@ Maybe fix up the excessive axis ranges for these plots?
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -775,10 +750,6 @@ Comparison of STAR quantification between Ensembl and Entrez gene identifiers
 \end_inset
 
 
-\end_layout
-
-\begin_layout Plain Layout
-
 \end_layout
 
 \end_inset
@@ -790,7 +761,7 @@ Comparison of STAR quantification between Ensembl and Entrez gene identifiers
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -827,7 +798,7 @@ Comparison of Salmon+Shoal quantification between Ensembl and Entrez gene
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -863,7 +834,7 @@ Comparison of quantification between STAR and HISAT2 for identical annotation
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -899,7 +870,7 @@ Comparison of quantification between STAR and Salmon for identical annotation
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -936,7 +907,7 @@ n
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -978,6 +949,22 @@ Ultimately selected shoal as quantification, Ensembl as annotation.
  an informed decision.
 \end_layout
 
+\begin_layout Standard
+\begin_inset ERT
+status collapsed
+
+\begin_layout Plain Layout
+
+
+\backslash
+FloatBarrier
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
 \begin_layout Subsection
 RNA-seq has a large confounding batch effect
 \end_layout
@@ -986,7 +973,7 @@ RNA-seq has a large confounding batch effect
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \begin_inset Flex TODO Note (inline)
@@ -1032,10 +1019,6 @@ RNA-seq sample weights, grouped by experimental and technical covariates
 \end_inset
 
 
-\end_layout
-
-\begin_layout Plain Layout
-
 \end_layout
 
 \end_inset
@@ -1047,7 +1030,7 @@ RNA-seq sample weights, grouped by experimental and technical covariates
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -1091,7 +1074,7 @@ RNA-seq PCoA plot showing clear batch effect
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \begin_inset Flex TODO Note (inline)
@@ -1148,7 +1131,7 @@ RNA-seq PCoA plot showing clear batch effect
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
@@ -1208,18 +1191,15 @@ Figures showing p-value histograms for within-batch and cross-batch contrasts,
 
 \end_layout
 
-\begin_layout Subsection
-H3K4 and H3K27 methylation occur in broad regions and are enriched near
- promoters
-\end_layout
-
 \begin_layout Standard
-\begin_inset Flex TODO Note (inline)
-status open
+\begin_inset ERT
+status collapsed
 
 \begin_layout Plain Layout
-Replace these figures with a single table of # of peaks called at chosen
- IDR threshold, showing that SICER has more
+
+
+\backslash
+FloatBarrier
 \end_layout
 
 \end_inset
@@ -1227,19 +1207,23 @@ Replace these figures with a single table of # of peaks called at chosen
 
 \end_layout
 
+\begin_layout Subsection
+ChIP-seq blacklisting is important
+\end_layout
+
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
-status open
-
-\begin_layout Plain Layout
-\begin_inset Flex TODO Note (inline)
-status open
+status collapsed
 
 \begin_layout Plain Layout
-Re-generate IDR rank consistency plots for SICER and MACS side-by-side
-\end_layout
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/csaw/CCF-plots-PAGE2-CROP.pdf
+	lyxscale 50
+	width 100col%
+	groupId colwidth
 
 \end_inset
 
@@ -1250,15 +1234,7 @@ Re-generate IDR rank consistency plots for SICER and MACS side-by-side
 \begin_inset Caption Standard
 
 \begin_layout Plain Layout
-
-\series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:IDR-RC-H3K4me2"
-
-\end_inset
-
-Irreproducible Discovery Rate consistency plots for H3K4me2
+Cross-correlation plots with blacklisted reads removed
 \end_layout
 
 \end_inset
@@ -1275,15 +1251,15 @@ Irreproducible Discovery Rate consistency plots for H3K4me2
 \begin_inset Float figure
 wide false
 sideways false
-status open
-
-\begin_layout Plain Layout
-\begin_inset Flex TODO Note (inline)
-status open
+status collapsed
 
 \begin_layout Plain Layout
-Re-generate IDR rank consistency plots for SICER and MACS side-by-side
-\end_layout
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/csaw/CCF-plots-noBL-PAGE2-CROP.pdf
+	lyxscale 50
+	width 100col%
+	groupId colwidth
 
 \end_inset
 
@@ -1294,15 +1270,12 @@ Re-generate IDR rank consistency plots for SICER and MACS side-by-side
 \begin_inset Caption Standard
 
 \begin_layout Plain Layout
-
-\series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:IDR-RC-H3K4me3"
+Cross-correlation plots without removing blacklisted reads
+\end_layout
 
 \end_inset
 
-Irreproducible Discovery Rate consistency plots for H3K4me3
+
 \end_layout
 
 \end_inset
@@ -1310,6 +1283,18 @@ Irreproducible Discovery Rate consistency plots for H3K4me3
 
 \end_layout
 
+\begin_layout Subsection
+ChIP-seq normalization
+\end_layout
+
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Maybe just one of these figures and then say the other 2 were similar
+\end_layout
+
 \end_inset
 
 
@@ -1322,12 +1307,12 @@ sideways false
 status open
 
 \begin_layout Plain Layout
-\begin_inset Flex TODO Note (inline)
-status open
-
-\begin_layout Plain Layout
-Re-generate IDR rank consistency plots for SICER and MACS side-by-side
-\end_layout
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/ChIP-seq/H3K4me2-sample-MAplot-bins-CROP.png
+	lyxscale 25
+	width 100col%
+	groupId colwidth-raster
 
 \end_inset
 
@@ -1340,13 +1325,7 @@ Re-generate IDR rank consistency plots for SICER and MACS side-by-side
 \begin_layout Plain Layout
 
 \series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:IDR-RC-H3K27me3"
-
-\end_inset
-
-Irreproducible Discovery Rate consistency plots for H3K27me3
+MA plot of H3K4me2 read counts in 10kb bins for two arbitrary samples
 \end_layout
 
 \end_inset
@@ -1360,23 +1339,18 @@ Irreproducible Discovery Rate consistency plots for H3K27me3
 \end_layout
 
 \begin_layout Standard
-\begin_inset Float table
+\begin_inset Float figure
 wide false
 sideways false
 status open
 
 \begin_layout Plain Layout
 \align center
-\begin_inset Flex TODO Note (inline)
-status open
-
-\begin_layout Plain Layout
-Need 
-\emph on
-median
-\emph default
- peak width, not mean
-\end_layout
+\begin_inset Graphics
+	filename graphics/CD4-csaw/ChIP-seq/H3K4me3-sample-MAplot-bins-CROP.png
+	lyxscale 25
+	width 100col%
+	groupId colwidth-raster
 
 \end_inset
 
@@ -1384,204 +1358,137 @@ median
 \end_layout
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Tabular
-<lyxtabular version="3" rows="4" columns="5">
-<features tabularvalignment="middle">
-<column alignment="center" valignment="top">
-<column alignment="center" valignment="top">
-<column alignment="center" valignment="top">
-<column alignment="center" valignment="top">
-<column alignment="center" valignment="top">
-<row>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
+\begin_inset Caption Standard
 
 \begin_layout Plain Layout
-Histone Mark
+
+\series bold
+MA plot of H3K4me3 read counts in 10kb bins for two arbitrary samples
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-# Peaks
+
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-Mean peak width
+
 \end_layout
 
-\end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
+\begin_layout Standard
+\begin_inset Float figure
+wide false
+sideways false
+status open
 
 \begin_layout Plain Layout
-genome coverage
-\end_layout
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/ChIP-seq/H3K27me3-sample-MAplot-bins-CROP.png
+	lyxscale 25
+	width 100col%
+	groupId colwidth-raster
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-read coverage
-\end_layout
-
-\end_inset
-</cell>
-</row>
-<row>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-H3K4me2
 \end_layout
 
-\end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
-
 \begin_layout Plain Layout
-14965
-\end_layout
-
-\end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
+\begin_inset Caption Standard
 
 \begin_layout Plain Layout
-3970
+
+\series bold
+MA plot of H3K27me3 read counts in 10kb bins for two arbitrary samples
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-1.92%
+
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-14.2%
-\end_layout
 
-\end_inset
-</cell>
-</row>
-<row>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
+\end_layout
 
-\begin_layout Plain Layout
-H3K4me3
+\begin_layout Subsection
+ChIP-seq must be corrected for hidden confounding factors
 \end_layout
 
-\end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
 
 \begin_layout Plain Layout
-6163
+Consolidate these into 1 2x3 grid.
+ For now, just refer to them as if they were a single figure.
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-2946
+
 \end_layout
 
-\end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
-\begin_inset Text
+\begin_layout Standard
+\begin_inset Float figure
+wide false
+sideways false
+status collapsed
 
 \begin_layout Plain Layout
-0.588%
-\end_layout
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-raw-CROP.png
+	lyxscale 25
+	width 100col%
+	groupId colwidth-raster
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-6.57%
+
 \end_layout
 
-\end_inset
-</cell>
-</row>
-<row>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
+\begin_layout Plain Layout
+\begin_inset Caption Standard
 
 \begin_layout Plain Layout
-H3K27me3
-\end_layout
+
+\series bold
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:PCoA-H3K4me2-bad"
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-18139
+PCoA plot of H3K4me2 windows, before subtracting surrogate variables
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-18967
+
 \end_layout
 
 \end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
-\begin_inset Text
 
-\begin_layout Plain Layout
-11.1%
+
 \end_layout
 
-\end_inset
-</cell>
-<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
-\begin_inset Text
+\begin_layout Standard
+\begin_inset Float figure
+wide false
+sideways false
+status collapsed
 
 \begin_layout Plain Layout
-22.5%
-\end_layout
-
-\end_inset
-</cell>
-</row>
-</lyxtabular>
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-SVsub-CROP.png
+	lyxscale 25
+	width 100col%
+	groupId colwidth-raster
 
 \end_inset
 
@@ -1596,11 +1503,11 @@ H3K27me3
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "tab:peak-calling-summary"
+name "fig:PCoA-H3K4me2-good"
 
 \end_inset
 
-SICER+IDR peak-calling summary
+PCoA plot of H3K4me2 windows, after subtracting surrogate variables
 \end_layout
 
 \end_inset
@@ -1614,71 +1521,62 @@ SICER+IDR peak-calling summary
 \end_layout
 
 \begin_layout Standard
-Figures 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "fig:IDR-RC-H3K4me2"
-plural "false"
-caps "false"
-noprefix "false"
+\begin_inset Float figure
+wide false
+sideways false
+status collapsed
+
+\begin_layout Plain Layout
+\align center
+\begin_inset Graphics
+	filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-raw-CROP.png
+	lyxscale 25
+	width 100col%
+	groupId colwidth-raster
 
 \end_inset
 
-, 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "fig:IDR-RC-H3K4me3"
-plural "false"
-caps "false"
-noprefix "false"
+
+\end_layout
+
+\begin_layout Plain Layout
+\begin_inset Caption Standard
+
+\begin_layout Plain Layout
+
+\series bold
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:PCoA-H3K4me3-bad"
 
 \end_inset
 
-, and 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "fig:IDR-RC-H3K27me3"
-plural "false"
-caps "false"
-noprefix "false"
+PCoA plot of H3K4me3 windows, before subtracting surrogate variables
+\end_layout
 
 \end_inset
 
- show the IDR rank-consistency plots for peaks called in an arbitrarily-chosen
- pair of donors.
- For all 3 histone marks, when the peaks for each donor are ranked according
- to their scores, SICER produces much more reproducible results between
- donors.
- This is consistent with SICER's stated goal of identifying broad peaks,
- in contrast to MACS, which is designed for identifying sharp peaks.
- Based on this observation, the SICER peak calls were used for all downstream
- analyses that involved ChIP-seq peaks.
- Table 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "tab:peak-calling-summary"
-plural "false"
-caps "false"
-noprefix "false"
+
+\end_layout
 
 \end_inset
 
- gives a summary of the peak calling statistics for each histone mark.
+
 \end_layout
 
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/Promoter Peak Distance Profile-PAGE1-CROP.pdf
-	lyxscale 50
+	filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-SVsub-CROP.png
+	lyxscale 25
 	width 100col%
-	groupId colwidth
+	groupId colwidth-raster
 
 \end_inset
 
@@ -1693,20 +1591,16 @@ status open
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:effective-promoter-radius"
+name "fig:PCoA-H3K4me3-good"
 
 \end_inset
 
-Enrichment of peaks in promoter neighborhoods.
+PCoA plot of H3K4me3 windows, after subtracting surrogate variables
 \end_layout
 
 \end_inset
 
 
-\end_layout
-
-\begin_layout Plain Layout
-
 \end_layout
 
 \end_inset
@@ -1714,58 +1608,38 @@ Enrichment of peaks in promoter neighborhoods.
 
 \end_layout
 
-\begin_layout Itemize
-Each histone mark is enriched within a certain radius of gene TSS positions,
- but that radius is different for each mark (figure 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "fig:effective-promoter-radius"
-plural "false"
-caps "false"
-noprefix "false"
-
-\end_inset
-
-, previously in 
-\begin_inset CommandInset citation
-LatexCommand cite
-key "LaMere2016"
-literal "false"
-
-\end_inset
-
- Fig.
- S2)
-\end_layout
-
-\begin_layout Subsection
-ChIP-seq blacklisting is important
-\end_layout
-
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/csaw/CCF-plots-PAGE2-CROP.pdf
-	lyxscale 50
+	filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-raw-CROP.png
+	lyxscale 25
 	width 100col%
-	groupId colwidth
+	groupId colwidth-raster
 
 \end_inset
 
 
 \end_layout
 
-\begin_layout Plain Layout
-\begin_inset Caption Standard
+\begin_layout Plain Layout
+\begin_inset Caption Standard
+
+\begin_layout Plain Layout
+
+\series bold
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:PCoA-H3K27me3-bad"
+
+\end_inset
 
-\begin_layout Plain Layout
-Cross-correlation plots with blacklisted reads removed
+PCoA plot of H3K27me3 windows, before subtracting surrogate variables
 \end_layout
 
 \end_inset
@@ -1782,15 +1656,15 @@ Cross-correlation plots with blacklisted reads removed
 \begin_inset Float figure
 wide false
 sideways false
-status open
+status collapsed
 
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/csaw/CCF-plots-noBL-PAGE2-CROP.pdf
-	lyxscale 50
+	filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-SVsub-CROP.png
+	lyxscale 25
 	width 100col%
-	groupId colwidth
+	groupId colwidth-raster
 
 \end_inset
 
@@ -1801,7 +1675,15 @@ status open
 \begin_inset Caption Standard
 
 \begin_layout Plain Layout
-Cross-correlation plots without removing blacklisted reads
+
+\series bold
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:PCoA-H3K27me3-good"
+
+\end_inset
+
+PCoA plot of H3K27me3 windows, after subtracting surrogate variables
 \end_layout
 
 \end_inset
@@ -1814,21 +1696,29 @@ Cross-correlation plots without removing blacklisted reads
 
 \end_layout
 
-\begin_layout Subsection
-ChIP-seq normalization
+\begin_layout Itemize
+Figures showing BCV plots with and without SVA for each histone mark.
 \end_layout
 
 \begin_layout Standard
-\begin_inset Flex TODO Note (inline)
-status open
+\begin_inset ERT
+status collapsed
 
 \begin_layout Plain Layout
-Maybe just one of these figures and then say the other 2 were similar
+
+
+\backslash
+FloatBarrier
 \end_layout
 
 \end_inset
 
+ 
+\end_layout
 
+\begin_layout Subsection
+MOFA recovers biologically relevant variation from blind analysis by correlating
+ across datasets
 \end_layout
 
 \begin_layout Standard
@@ -1840,7 +1730,7 @@ status open
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K4me2-sample-MAplot-bins-CROP.png
+	filename graphics/CD4-csaw/MOFA-varExplaiend-matrix-CROP.png
 	lyxscale 25
 	width 100col%
 	groupId colwidth-raster
@@ -1856,7 +1746,13 @@ status open
 \begin_layout Plain Layout
 
 \series bold
-MA plot of H3K4me2 read counts in 10kb bins for two arbitrary samples
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:mofa-varexplained"
+
+\end_inset
+
+Variance explained in each data set by each latent factor estimated by MOFA.
 \end_layout
 
 \end_inset
@@ -1869,16 +1765,43 @@ MA plot of H3K4me2 read counts in 10kb bins for two arbitrary samples
 
 \end_layout
 
+\begin_layout Itemize
+Figure 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "fig:mofa-varexplained"
+plural "false"
+caps "false"
+noprefix "false"
+
+\end_inset
+
+ shows that LF1, 4, and 5 explain substantial var in all data sets
+\end_layout
+
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
 status open
 
+\begin_layout Plain Layout
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Maybe drop this one
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K4me3-sample-MAplot-bins-CROP.png
+	filename graphics/CD4-csaw/MOFA-LF-distributions-CROP.png
 	lyxscale 25
 	width 100col%
 	groupId colwidth-raster
@@ -1894,7 +1817,13 @@ status open
 \begin_layout Plain Layout
 
 \series bold
-MA plot of H3K4me3 read counts in 10kb bins for two arbitrary samples
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:mofa-lf-dist"
+
+\end_inset
+
+Sample distribution for each latent factor estimated by MOFA.
 \end_layout
 
 \end_inset
@@ -1913,10 +1842,23 @@ wide false
 sideways false
 status open
 
+\begin_layout Plain Layout
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Talk about how this supports the convergence hypothesis
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K27me3-sample-MAplot-bins-CROP.png
+	filename graphics/CD4-csaw/MOFA-LF-scatter-CROP.png
 	lyxscale 25
 	width 100col%
 	groupId colwidth-raster
@@ -1932,7 +1874,13 @@ status open
 \begin_layout Plain Layout
 
 \series bold
-MA plot of H3K27me3 read counts in 10kb bins for two arbitrary samples
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:mofa-lf-scatter"
+
+\end_inset
+
+Scatter plots of specific pairs of MOFA latent factors.
 \end_layout
 
 \end_inset
@@ -1945,33 +1893,45 @@ MA plot of H3K27me3 read counts in 10kb bins for two arbitrary samples
 
 \end_layout
 
-\begin_layout Subsection
-ChIP-seq must be corrected for hidden confounding factors
-\end_layout
+\begin_layout Itemize
+Figures 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "fig:mofa-lf-dist"
+plural "false"
+caps "false"
+noprefix "false"
 
-\begin_layout Standard
-\begin_inset Flex TODO Note (inline)
-status open
+\end_inset
 
-\begin_layout Plain Layout
-Consolidate these into 1 2x3 grid
-\end_layout
+ and 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "fig:mofa-lf-scatter"
+plural "false"
+caps "false"
+noprefix "false"
 
 \end_inset
 
+ show that those same 3 LFs, (1, 4, & 5) also correlate best with the experiment
+al factors (cell type & time point)
+\end_layout
 
+\begin_layout Itemize
+LF2 is clearly the RNA-seq batch effect
 \end_layout
 
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
-status collapsed
+status open
 
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-raw-CROP.png
+	filename graphics/CD4-csaw/MOFA-batch-correct-CROP.png
 	lyxscale 25
 	width 100col%
 	groupId colwidth-raster
@@ -1989,11 +1949,16 @@ status collapsed
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:PCoA-H3K4me2-bad"
+name "fig:mofa-batchsub"
 
 \end_inset
 
-PCoA plot of H3K4me2 windows, before subtracting surrogate variables
+Result of RNA-seq batch-correction using MOFA latent factors
+\end_layout
+
+\end_inset
+
+
 \end_layout
 
 \end_inset
@@ -2001,48 +1966,117 @@ PCoA plot of H3K4me2 windows, before subtracting surrogate variables
 
 \end_layout
 
+\begin_layout Itemize
+Attempting to remove the effect of LF2 (Figure 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "fig:mofa-batchsub"
+plural "false"
+caps "false"
+noprefix "false"
+
+\end_inset
+
+) results in batch correction comparable to ComBat (Figure 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "fig:RNA-PCA-ComBat-batchsub"
+plural "false"
+caps "false"
+noprefix "false"
+
 \end_inset
 
+)
+\end_layout
+
+\begin_layout Itemize
+MOFA was able to do this batch subtraction without directly using the sample
+ labels (sample labels were used implicitly to select which factor to subtract)
+\end_layout
+
+\begin_layout Itemize
+Similarity of results shows that batch correction can't get much better
+ than ComBat (despite ComBat ignoring time point)
+\end_layout
 
+\begin_layout Subsection
+MOFA does some interesting stuff but is mostly confirmatory in this context
 \end_layout
 
 \begin_layout Standard
-\begin_inset Float figure
-wide false
-sideways false
-status collapsed
+\begin_inset Flex TODO Note (inline)
+status open
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K4me2-PCA-SVsub-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+MOFA should be a footnote to something else, not its own point
+\end_layout
 
 \end_inset
 
 
 \end_layout
 
-\begin_layout Plain Layout
-\begin_inset Caption Standard
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
 
 \begin_layout Plain Layout
-
-\series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:PCoA-H3K4me2-good"
+Combine with previous subsection
+\end_layout
 
 \end_inset
 
-PCoA plot of H3K4me2 windows, after subtracting surrogate variables
+
 \end_layout
 
-\end_inset
+\begin_layout Itemize
+MOFA shows great promise for accelerating discovery of major biological
+ effects in multi-omics datasets
+\end_layout
+
+\begin_deeper
+\begin_layout Itemize
+MOFA successfully separates biologically relevant patterns of variation
+ from technical confounding factors without knowing the sample labels, by
+ finding latent factors that explain variation across multiple data sets.
+\end_layout
+
+\begin_layout Itemize
+MOFA was added to this analysis late and played primarily a confirmatory
+ role, but it was able to confirm earlier conclusions with much less prior
+ information (no sample labels) and much less analyst effort/input
+\end_layout
+
+\begin_layout Itemize
+Less input from analyst means less opportunity to introduce unwanted bias
+ into results
+\end_layout
+
+\begin_layout Itemize
+MOFA confirmed that the already-implemented batch correction in the RNA-seq
+ data was already performing as well as possible given the limitations of
+ the data
+\end_layout
+
+\end_deeper
+\begin_layout Section
+Results
+\end_layout
+
+\begin_layout Standard
+\begin_inset Note Note
+status open
 
+\begin_layout Plain Layout
+Focus on what hypotheses were tested, then select figures that show how
+ those hypotheses were tested, even if the result is a negative.
+\end_layout
 
+\begin_layout Plain Layout
+Not every interesting result needs to be in here.
+ Chapter should tell a story.
+ 
 \end_layout
 
 \end_inset
@@ -2051,42 +2085,31 @@ PCoA plot of H3K4me2 windows, after subtracting surrogate variables
 \end_layout
 
 \begin_layout Standard
-\begin_inset Float figure
-wide false
-sideways false
-status collapsed
+\begin_inset Flex TODO Note (inline)
+status open
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-raw-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+Maybe reorder these sections to do RNA-seq, then ChIP-seq, then combined
+ analyses?
+\end_layout
 
 \end_inset
 
 
 \end_layout
 
-\begin_layout Plain Layout
-\begin_inset Caption Standard
-
-\begin_layout Plain Layout
-
-\series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:PCoA-H3K4me3-bad"
-
-\end_inset
-
-PCoA plot of H3K4me3 windows, before subtracting surrogate variables
+\begin_layout Subsection
+H3K4 and H3K27 methylation occur in broad regions and are enriched near
+ promoters
 \end_layout
 
-\end_inset
-
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
 
+\begin_layout Plain Layout
+Replace these figures with a single table of # of peaks called at chosen
+ IDR threshold, showing that SICER has more
 \end_layout
 
 \end_inset
@@ -2098,15 +2121,15 @@ PCoA plot of H3K4me3 windows, before subtracting surrogate variables
 \begin_inset Float figure
 wide false
 sideways false
-status collapsed
+status open
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K4me3-PCA-SVsub-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Re-generate IDR rank consistency plots for SICER and MACS side-by-side
+\end_layout
 
 \end_inset
 
@@ -2121,11 +2144,11 @@ status collapsed
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:PCoA-H3K4me3-good"
+name "fig:IDR-RC-H3K4me2"
 
 \end_inset
 
-PCoA plot of H3K4me3 windows, after subtracting surrogate variables
+Irreproducible Discovery Rate consistency plots for H3K4me2
 \end_layout
 
 \end_inset
@@ -2142,15 +2165,15 @@ PCoA plot of H3K4me3 windows, after subtracting surrogate variables
 \begin_inset Float figure
 wide false
 sideways false
-status collapsed
+status open
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-raw-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Re-generate IDR rank consistency plots for SICER and MACS side-by-side
+\end_layout
 
 \end_inset
 
@@ -2165,11 +2188,11 @@ status collapsed
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:PCoA-H3K27me3-bad"
+name "fig:IDR-RC-H3K4me3"
 
 \end_inset
 
-PCoA plot of H3K27me3 windows, before subtracting surrogate variables
+Irreproducible Discovery Rate consistency plots for H3K4me3
 \end_layout
 
 \end_inset
@@ -2186,15 +2209,15 @@ PCoA plot of H3K27me3 windows, before subtracting surrogate variables
 \begin_inset Float figure
 wide false
 sideways false
-status collapsed
+status open
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/ChIP-seq/H3K27me3-PCA-SVsub-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Re-generate IDR rank consistency plots for SICER and MACS side-by-side
+\end_layout
 
 \end_inset
 
@@ -2209,11 +2232,11 @@ status collapsed
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:PCoA-H3K27me3-good"
+name "fig:IDR-RC-H3K27me3"
 
 \end_inset
 
-PCoA plot of H3K27me3 windows, after subtracting surrogate variables
+Irreproducible Discovery Rate consistency plots for H3K27me3
 \end_layout
 
 \end_inset
@@ -2226,35 +2249,23 @@ PCoA plot of H3K27me3 windows, after subtracting surrogate variables
 
 \end_layout
 
-\begin_layout Itemize
-Figures showing BCV plots with and without SVA for each histone mark.
-\end_layout
-
-\begin_layout Itemize
-\begin_inset Flex TODO Note (inline)
+\begin_layout Standard
+\begin_inset Float table
+wide false
+sideways false
 status open
 
 \begin_layout Plain Layout
-Can I do supplementary data on a thesis? This is a lot of plots for this
- section.
-\end_layout
-
-\end_inset
-
-
-\end_layout
-
-\begin_layout Subsection
-H3K4 and H3K27 promoter methylation has broadly the expected correlation
- with gene expression
-\end_layout
-
-\begin_layout Standard
+\align center
 \begin_inset Flex TODO Note (inline)
 status open
 
 \begin_layout Plain Layout
-This section can easily be cut, especially if I can't find those plots.
+Need 
+\emph on
+median
+\emph default
+ peak width, not mean
 \end_layout
 
 \end_inset
@@ -2262,181 +2273,205 @@ This section can easily be cut, especially if I can't find those plots.
 
 \end_layout
 
-\begin_layout Itemize
-H3K4 is correlated with higher expression, and H3K27 is correlated with
- lower expression genome-wide
-\end_layout
-
-\begin_layout Standard
-\begin_inset Flex TODO Note (inline)
-status open
+\begin_layout Plain Layout
+\align center
+\begin_inset Tabular
+<lyxtabular version="3" rows="4" columns="5">
+<features tabularvalignment="middle">
+<column alignment="center" valignment="top">
+<column alignment="center" valignment="top">
+<column alignment="center" valignment="top">
+<column alignment="center" valignment="top">
+<column alignment="center" valignment="top">
+<row>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-Grr, gotta find these figures.
- Maybe in the old analysis?
+Histone Mark
 \end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-
-\end_layout
-
-\begin_layout Itemize
-Figures showing these correlations: box/violin plots of expression distributions
- with every combination of peak presence/absence in promoter
+\begin_layout Plain Layout
+# Peaks
 \end_layout
 
-\begin_layout Itemize
-Appropriate statistical tests showing significant differences in expected
- directions
-\end_layout
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-\begin_layout Subsection
-MOFA recovers biologically relevant variation from blind analysis by correlating
- across datasets
+\begin_layout Plain Layout
+Mean peak width
 \end_layout
 
-\begin_layout Standard
-\begin_inset Float figure
-wide false
-sideways false
-status open
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/MOFA-varExplaiend-matrix-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+Genome coverage
+\end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+Read coverage
 \end_layout
 
-\begin_layout Plain Layout
-\begin_inset Caption Standard
+\end_inset
+</cell>
+</row>
+<row>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-
-\series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:mofa-varexplained"
+H3K4me2
+\end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-Variance explained in each data set by each latent factor estimated by MOFA.
+\begin_layout Plain Layout
+14965
 \end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+3970
 \end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+1.92%
 \end_layout
 
-\begin_layout Itemize
-Figure 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "fig:mofa-varexplained"
-plural "false"
-caps "false"
-noprefix "false"
-
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
+\begin_inset Text
 
- shows that LF1, 4, and 5 explain substantial var in all data sets
+\begin_layout Plain Layout
+14.2%
 \end_layout
 
-\begin_layout Standard
-\begin_inset Float figure
-wide false
-sideways false
-status open
-
-\begin_layout Plain Layout
-\begin_inset Flex TODO Note (inline)
-status open
+\end_inset
+</cell>
+</row>
+<row>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-Maybe drop this one
+H3K4me3
 \end_layout
 
 \end_inset
-
-
-\end_layout
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/MOFA-LF-distributions-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+6163
+\end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+2946
 \end_layout
 
-\begin_layout Plain Layout
-\begin_inset Caption Standard
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-
-\series bold
-\begin_inset CommandInset label
-LatexCommand label
-name "fig:mofa-lf-dist"
+0.588%
+\end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
+\begin_inset Text
 
-Sample distribution for each latent factor estimated by MOFA.
+\begin_layout Plain Layout
+6.57%
 \end_layout
 
 \end_inset
+</cell>
+</row>
+<row>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+H3K27me3
 \end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+18139
 \end_layout
 
-\begin_layout Standard
-\begin_inset Float figure
-wide false
-sideways false
-status open
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-\begin_inset Flex TODO Note (inline)
-status open
+18967
+\end_layout
+
+\end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
+\begin_inset Text
 
 \begin_layout Plain Layout
-Talk about how this supports the convergence hypothesis
+11.1%
 \end_layout
 
 \end_inset
+</cell>
+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
+\begin_inset Text
 
-
+\begin_layout Plain Layout
+22.5%
 \end_layout
 
-\begin_layout Plain Layout
-\align center
-\begin_inset Graphics
-	filename graphics/CD4-csaw/MOFA-LF-scatter-CROP.png
-	lyxscale 25
-	width 100col%
-	groupId colwidth-raster
+\end_inset
+</cell>
+</row>
+</lyxtabular>
 
 \end_inset
 
@@ -2451,11 +2486,11 @@ Talk about how this supports the convergence hypothesis
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:mofa-lf-scatter"
+name "tab:peak-calling-summary"
 
 \end_inset
 
-Scatter plots of specific pairs of MOFA latent factors.
+SICER+IDR peak-calling summary
 \end_layout
 
 \end_inset
@@ -2468,33 +2503,57 @@ Scatter plots of specific pairs of MOFA latent factors.
 
 \end_layout
 
-\begin_layout Itemize
+\begin_layout Standard
 Figures 
 \begin_inset CommandInset ref
 LatexCommand ref
-reference "fig:mofa-lf-dist"
+reference "fig:IDR-RC-H3K4me2"
 plural "false"
 caps "false"
 noprefix "false"
 
 \end_inset
 
- and 
+, 
 \begin_inset CommandInset ref
 LatexCommand ref
-reference "fig:mofa-lf-scatter"
+reference "fig:IDR-RC-H3K4me3"
 plural "false"
 caps "false"
 noprefix "false"
 
 \end_inset
 
- show that those same 3 LFs, (1, 4, & 5) also correlate best with the experiment
-al factors (cell type & time point)
-\end_layout
+, and 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "fig:IDR-RC-H3K27me3"
+plural "false"
+caps "false"
+noprefix "false"
 
-\begin_layout Itemize
-LF2 is clearly the RNA-seq batch effect
+\end_inset
+
+ show the IDR rank-consistency plots for peaks called in an arbitrarily chosen
+ pair of donors.
+ For all three histone marks, when the peaks for each donor are ranked according
+ to their scores, SICER produces much more reproducible results between
+ donors than MACS.
+ This is consistent with SICER's stated goal of identifying broad peaks,
+ in contrast to MACS, which is designed for identifying sharp peaks.
+ Based on this observation, the SICER peak calls were used for all downstream
+ analyses that involved ChIP-seq peaks.
+ Table 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "tab:peak-calling-summary"
+plural "false"
+caps "false"
+noprefix "false"
+
+\end_inset
+
+ gives a summary of the peak calling statistics for each histone mark.
 \end_layout
 
 \begin_layout Standard
@@ -2506,10 +2565,10 @@ status open
 \begin_layout Plain Layout
 \align center
 \begin_inset Graphics
-	filename graphics/CD4-csaw/MOFA-batch-correct-CROP.png
-	lyxscale 25
+	filename graphics/CD4-csaw/Promoter Peak Distance Profile-PAGE1-CROP.pdf
+	lyxscale 50
 	width 100col%
-	groupId colwidth-raster
+	groupId colwidth
 
 \end_inset
 
@@ -2524,16 +2583,20 @@ status open
 \series bold
 \begin_inset CommandInset label
 LatexCommand label
-name "fig:mofa-batchsub"
+name "fig:effective-promoter-radius"
 
 \end_inset
 
-Result of RNA-seq batch-correction using MOFA latent factors
+Enrichment of peaks in promoter neighborhoods.
 \end_layout
 
 \end_inset
 
 
+\end_layout
+
+\begin_layout Plain Layout
+
 \end_layout
 
 \end_inset
@@ -2542,37 +2605,74 @@ Result of RNA-seq batch-correction using MOFA latent factors
 \end_layout
 
 \begin_layout Itemize
-Attempting to remove the effect of LF2 (Figure 
+Each histone mark is enriched within a certain radius of gene TSS positions,
+ but that radius is different for each mark (Figure 
 \begin_inset CommandInset ref
 LatexCommand ref
-reference "fig:mofa-batchsub"
+reference "fig:effective-promoter-radius"
 plural "false"
 caps "false"
 noprefix "false"
 
 \end_inset
 
-) results in batch correction comparable to ComBat (Figure 
-\begin_inset CommandInset ref
-LatexCommand ref
-reference "fig:RNA-PCA-ComBat-batchsub"
-plural "false"
-caps "false"
-noprefix "false"
+, previously in 
+\begin_inset CommandInset citation
+LatexCommand cite
+key "LaMere2016"
+literal "false"
 
 \end_inset
 
-)
+ Fig.
+ S2)
+\end_layout
+
+\begin_layout Subsection
+H3K4 and H3K27 promoter methylation has broadly the expected correlation
+ with gene expression
+\end_layout
+
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+This section can easily be cut, especially if I can't find those plots.
+\end_layout
+
+\end_inset
+
+
 \end_layout
 
 \begin_layout Itemize
-MOFA was able to do this batch subtraction without directly using the sample
- labels (sample labels were used implicitly to select which factor to subtract)
+H3K4 is correlated with higher expression, and H3K27 is correlated with
+ lower expression genome-wide
+\end_layout
+
+\begin_layout Standard
+\begin_inset Flex TODO Note (inline)
+status open
+
+\begin_layout Plain Layout
+Grr, gotta find these figures.
+ Maybe in the old analysis?
+\end_layout
+
+\end_inset
+
+
 \end_layout
 
 \begin_layout Itemize
-Similarity of results shows that batch correction can't get much better
- than ComBat (despite ComBat ignoring time point)
+Figures showing these correlations: box/violin plots of expression distributions
+ with every combination of peak presence/absence in promoter
+\end_layout
+
+\begin_layout Itemize
+Appropriate statistical tests showing significant differences in expected
+ directions
 \end_layout
 
 \begin_layout Subsection
@@ -2995,46 +3095,23 @@ Try to boil it down to 3 main messages to get across
 \end_layout
 
 \begin_layout Itemize
-"Promoter radius" is not constant and must be defined empirically for a
- given data set.
- Coverage within promoter radius has an expression correlation as well
-\end_layout
-
-\begin_layout Standard
-\begin_inset Flex TODO Note (inline)
-status open
-
-\begin_layout Plain Layout
-MOFA should be a footnote to something else, not its own point
-\end_layout
-
-\end_inset
-
-
-\end_layout
-
-\begin_layout Itemize
-MOFA shows great promise for accelerating discovery of major biological
- effects in multi-omics datasets
+3 Main points
 \end_layout
 
 \begin_deeper
 \begin_layout Itemize
-MOFA successfully separates biologically relevant patterns of variation
- from technical confounding factors without knowing the sample labels, by
- finding latent factors that explain variation across multiple data sets.
+"Promoter radius" is not constant and must be defined empirically for a
+ given data set.
+ Coverage within promoter radius has an expression correlation as well
 \end_layout
 
 \begin_layout Itemize
-MOFA was added to this analysis late and played primarily a confirmatory
- role, but it was able to confirm earlier conclusions with much less prior
- information (no sample labels) and much less analyst effort
+Naive-to-memory convergence in certain data sets but not others implies
+ which marks are involved in memory differentiation
 \end_layout
 
 \begin_layout Itemize
-MOFA confirmed that the already-implemented batch correction in the RNA-seq
- data was already performing as well as possible given the limitations of
- the data
+TSS positional coverage: hints of something interesting but no clear conclusions
 \end_layout
 
 \end_deeper
@@ -3169,7 +3246,7 @@ Improving array-based analyses of transplant rejection by optimizing data
 status open
 
 \begin_layout Plain Layout
-Author list: Me, Sunil, Tom, Padma, Dan
+Chapter author list: Me, Sunil, Tom, Padma, Dan
 \end_layout
 
 \end_inset
@@ -3530,7 +3607,7 @@ literal "true"
 
 \begin_layout Standard
 \begin_inset Flex TODO Note (inline)
-status collapsed
+status open
 
 \begin_layout Plain Layout
 Find appropriate GEO identifiers if possible.
@@ -3583,10 +3660,11 @@ on of TX and AR samples was considered.
 
 \begin_layout Standard
 \begin_inset Flex TODO Note (inline)
-status collapsed
+status open
 
 \begin_layout Plain Layout
-Summarize the get.best.threshold algorithm for PAM threshold selection
+Summarize the get.best.threshold algorithm for PAM threshold selection, or
+ just put the code online?
 \end_layout
 
 \end_inset