6 years ago · 40b0e7f13e
--- a/ROC-TXvsAR-external-AUC.xlsx
+++ b/ROC-TXvsAR-external-AUC.xlsx
--- a/ROC-TXvsAR-internal-AUC.xlsx
+++ b/ROC-TXvsAR-internal-AUC.xlsx
--- a/graphics/PAM/README.md
+++ b/graphics/PAM/README.md
@@ -0,0 +1,26 @@
 
				+(This was written back in 2013, and I can't necessarily vouch for any
			
 
				+of the claims within it.)
			
 
				+
			
 
				+# Questions
			
 
				+
			
 
				+* Overarching question: Can we accurately distinguish AR from TX?
			
 
				+* Can we work well in "clinical" mode, i.e. classifying single samples?
			
 
				+  * How to normalize new sample with training set?
			
 
				+  * How to avoid recalculating classifier for each sample?
			
 
				+* Can we perform well on an external validation set (GEO data)?
			
 
				+  * Are the same genes predictive in both datasets?
			
 
				+  * Can a classifier trained on our data perform well on GEO data?
			
 
				+
			
 
				+# Experiments
			
 
				+
			
 
				+* pam-analysis.R 
			
 
				+    * How important is it to normalize to the training set? (RMA separate vs together)
			
 
				+    * Conclusion: must normalize together. Separate normalization introduced bias
			
 
				+      toward one class or the other.
			
 
				+    * Question: how to do it with a single sample?
			
 
				+* pam-analysis-norm.R
			
 
				+    * Can single-channel normalization improve classification results? Yes.
			
 
				+    * Try PAM with RMA and two single-channel normalizations
			
 
				+    * fRMA improves cross-dataset accuracy from 65% to 71%.
			
 
				+* limma-analysis-norm.R
			
 
				+    * What is the source of the variation
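
The README's conclusion that training and test arrays must be normalized together follows from how quantile normalization (one step of RMA) works: each array is mapped onto a reference distribution computed from whichever arrays happen to be in the batch. The sketch below is a minimal illustration in Python, not the code from `pam-analysis.R`. The function names are hypothetical, and background correction and probe summarization are ignored. It contrasts batch-dependent quantile normalization with a fixed ("frozen") reference, which is the idea fRMA relies on.

```python
def apply_reference(array, reference):
    """Map one array onto a fixed reference distribution by rank.

    `reference` must be sorted ascending and have the same length as `array`.
    Ties are broken by position, which is enough for an illustration.
    """
    order = sorted(range(len(array)), key=lambda i: array[i])
    out = [0.0] * len(array)
    for rank, idx in enumerate(order):
        out[idx] = reference[rank]
    return out


def quantile_normalize(arrays):
    """Batch quantile normalization.

    The reference is the mean of the sorted intensities across ALL arrays
    passed in, so the result for one array depends on its batch mates.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[i] for s in sorted_arrays) / len(sorted_arrays)
        for i in range(n_probes)
    ]
    return [apply_reference(a, reference) for a in arrays]


# The same sample normalized alongside two different companion arrays.
sample = [1.0, 5.0, 3.0]
with_batch_a = quantile_normalize([[2.0, 4.0, 6.0], sample])[1]
with_batch_b = quantile_normalize([[10.0, 20.0, 30.0], sample])[1]

# A frozen reference gives the same answer regardless of other arrays.
frozen_reference = [0.0, 1.0, 2.0]  # hypothetical precomputed values
frozen_result = apply_reference(sample, frozen_reference)
```

Here `with_batch_a` and `with_batch_b` differ even though the input sample is identical, whereas `frozen_result` depends only on the sample and the stored reference. This is why a single new clinical sample can be normalized without revisiting the training set.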
			
--- a/graphics/PAM/external-roc-frma.pdf
+++ b/graphics/PAM/external-roc-frma.pdf
--- a/graphics/frma-pax-bx/M-BX-violin.pdf
+++ b/graphics/frma-pax-bx/M-BX-violin.pdf
--- a/graphics/frma-pax-bx/M-PAX-violin.pdf
+++ b/graphics/frma-pax-bx/M-PAX-violin.pdf
--- a/graphics/frma-pax-bx/MA-BX-RMA.fRMA.pdf
+++ b/graphics/frma-pax-bx/MA-BX-RMA.fRMA.pdf
--- a/graphics/frma-pax-bx/MA-BX-fRMA.fRMA.pdf
+++ b/graphics/frma-pax-bx/MA-BX-fRMA.fRMA.pdf
--- a/graphics/frma-pax-bx/MA-PAX-RMA.fRMA.pdf
+++ b/graphics/frma-pax-bx/MA-PAX-RMA.fRMA.pdf
--- a/graphics/frma-pax-bx/MA-PAX-fRMA.fRMA.pdf
+++ b/graphics/frma-pax-bx/MA-PAX-fRMA.fRMA.pdf
--- a/refs.bib
+++ b/refs.bib
--- a/thesis.lyx
+++ b/thesis.lyx
@@ -890,7 +890,7 @@ The choice of pre-processing algorithms used in the analysis of an array
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Subsection
			
 
				-Frozen RMA for clinical microarray classifiers
			
 
				+Normalization for clinical microarray classifiers must be single-channel
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Subsubsection
			
@@ -941,10 +941,19 @@ exist
 
				  This would ensure that each array's normalization is independent of every
			
 
				  other array, and that arrays normalized separately can still be compared
			
 
				  to each other without bias.
			
 
				+ Such a normalization is commonly referred to as 
			
 
				+\begin_inset Quotes eld
			
 
				+\end_inset
			
 
				+
			
 
				+single-channel normalization
			
 
				+\begin_inset Quotes erd
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Subsubsection
			
 
				-Frozen RMA satisfies clinical normalization requirements
			
 
				+Several strategies are available to meet clinical normalization requirements
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
@@ -985,16 +994,33 @@ One important limitation of fRMA is that it requires a separate reference
 
				  samples on that platform 
			
 
				 \begin_inset CommandInset citation
			
 
				 LatexCommand cite
			
 
				-key "HudsonK.&RemediosC.2010"
			
 
				+key "McCall2011"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+One other option is the aptly-named Single Channel Array Normalization (SCAN),
			
 
				+ which adapts a normalization method originally designed for tiling arrays
			
 
				+ 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "Piccolo2012"
			
 
				 literal "false"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				 .
			
 
				+ SCAN is truly single-channel in that it does not require a set of normalization
			
 
				+ parameters estimated from an external set of reference samples like fRMA
			
 
				+ does.
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Subsection
			
 
				-Adapting voom to model heteroskedasticity in methylation array data
			
 
				+Heteroskedasticity must be accounted for in methylation array data 
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Subsubsection
			
@@ -1156,13 +1182,14 @@ Methods
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Subsection
			
 
				-fRMA
			
 
				+Evaluation of classifier performance with different normalization methods
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
 
				-For testing RMA against fRMA, a data set of 157 hgu133plus2 arrays was used,
			
 
				- consisting of blood samples from kidney transplant patients whose grafts
			
 
				- had been graded as TX, AR, or ADNR via biopsy and histology 
			
 
				+For testing different normalizations, a data set of 157 hgu133plus2 arrays
			
 
				+ was used, consisting of blood samples from kidney transplant patients whose
			
 
				+ grafts had been graded as TX, AR, or ADNR via biopsy and histology (46
			
 
				+ TX, 69 AR, 42 ADNR) 
			
 
				 \begin_inset CommandInset citation
			
 
				 LatexCommand cite
			
 
				 key "Kurian2014"
			
@@ -1171,10 +1198,9 @@ literal "true"
 
				 \end_inset
			
 
				 
			
 
				 .
			
 
				- These were split into a training set (23 TX, 35 AR, 21 ADNR) and a validation
			
 
				- set (23 TX, 34 AR, 21 ADNR).
			
 
				- Additionally, an external validation was gathered from public GEO data
			
 
				- (37 TX, 38 AR, no ADNR).
			
 
				+ Additionally, an external validation set of 75 samples was gathered from
			
 
				+ public GEO data (37 TX, 38 AR, no ADNR).
			
 
				+ 
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
@@ -1192,20 +1218,154 @@ Find appropriate GEO identifiers if possible.
 
				 
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Expression array normalization for detecting acute rejection
			
 
				+\begin_layout Standard
			
 
				+To evaluate the effect of each normalization on classifier performance,
			
 
				+ the same classifier training and validation procedure was used after each
			
 
				+ normalization method.
			
 
				+ The PAM package was used to train a nearest shrunken centroid classifier
			
 
				+ on the training set and select the appropriate threshold for centroid shrinking.
			
 
				+ Then the trained classifier was used to predict the class probabilities
			
 
				+ of each validation sample.
			
 
				+ From these class probabilities, ROC curves and area-under-curve (AUC) values
			
 
				+ were generated 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "Turck2011"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ Each normalization was tested on two different sets of training and validation
			
 
				+ samples.
			
 
				+ For internal validation, the 115 TX and AR arrays in the internal set were
			
 
				+ split at random into two approximately equal-sized sets, one for training
			
 
				+ and one for validation, with the TX and AR samples divided as evenly as
			
 
				+ possible between the two sets.
			
 
				+ For external validation, the full set of 115 TX and AR samples was used
			
 
				+ as a training set, and the 75 external TX and AR samples were used as the
			
 
				+ validation set.
			
 
				+ Thus, 2 ROC curves and AUC values were generated for each normalization
			
 
				+ method: one internal and one external.
			
 
				+ Because the external validation set contains no ADNR samples, only classificati
			
 
				+on of TX and AR samples was considered.
			
 
				+ The ADNR samples were included during normalization but excluded from all
			
 
				+ classifier training and validation.
			
 
				+ This ensures that the performance on internal and external validation sets
			
 
				+ is directly comparable.
			
 
				 \end_layout
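
The evaluation procedure in the paragraph above (a class-balanced random split, followed by an AUC computed from predicted class probabilities) can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the thesis uses the PAM package in R, the function names here are hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) definition rather than with whatever ROC package the analysis scripts call.

```python
import random


def stratified_half_split(labels, seed=0):
    """Split sample indices into two halves with each class divided evenly.

    When a class has an odd number of samples, the extra one goes to the
    training half.
    """
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        half = (len(idx) + 1) // 2
        train.extend(idx[:half])
        valid.extend(idx[half:])
    return sorted(train), sorted(valid)


def auc_from_scores(scores, labels):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half. `labels` holds 1 (e.g. AR) or 0."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Class sizes taken from the text: 46 TX and 69 AR internal samples.
internal_labels = ["TX"] * 46 + ["AR"] * 69
train_idx, valid_idx = stratified_half_split(internal_labels)
```

With 46 TX and 69 AR arrays, this split yields 58 training and 57 validation samples, since the 69 AR arrays cannot be halved exactly.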
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Use frozen RMA, a single-channel variant of RMA
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Flex TODO Note (inline)
			
 
				+status collapsed
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+Summarize the get.best.threshold algorithm for PAM threshold selection
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Generate custom fRMA normalization vectors for each tissue (biopsy, blood)
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Subsubsection
			
 
				-Methylation arrays
			
 
				+\begin_layout Standard
			
 
				+Six different normalization strategies were evaluated.
			
 
				+ First, 2 well-known non-single-channel normalization methods were considered:
			
 
				+ RMA and dChip 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "Li2001,Irizarry2003a"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ Since RMA produces expression values on a log2 scale and dChip does not,
			
 
				+ the values from dChip were log2 transformed after normalization.
			
 
				+ Next, RMA and dChip followed by Global Rank-invariant Set Normalization
			
 
				+ (GRSN) were tested 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "Pelz2008"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ Post-processing with GRSN does not turn RMA or dChip into single-channel
			
 
				+ methods, but it may help mitigate batch effects and is therefore useful
			
 
				+ as a benchmark.
			
 
				+ Lastly, the two single-channel normalization methods, fRMA and SCAN, were
			
 
				+ tested 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "McCall2010,Piccolo2012"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ When evaluating internal validation performance, only the 157 internal samples
			
 
				+ were normalized; when evaluating external validation performance, all 157
			
 
				+ internal samples and 75 external samples were normalized together.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+For demonstrating the problem with separate normalization of training and
			
 
				+ validation data, one additional normalization was performed: the internal
			
 
				+ and external sets were each normalized separately using RMA, and the normalized
			
 
				+ data for each set were combined into a single set with no further attempts
			
 
				+ at normalizing between the two sets.
			
 
				+ This represents approximately how RMA would have to be used in a clinical
			
 
				+ setting, where the samples to be classified are not available at the time
			
 
				+ the classifier is trained.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsection
			
 
				+Generating custom fRMA vectors for hthgu133pluspm array platform
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+In order to enable fRMA normalization for the hthgu133pluspm array platform,
			
 
				+ custom fRMA normalization vectors were trained using the frmaTools package
			
 
				+ 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "McCall2011"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ Separate vectors were created for two types of samples: kidney graft biopsy
			
 
				+ samples and blood samples from graft recipients.
			
 
				+ For training, 341 kidney biopsy samples from 2 data sets and 965 blood
			
 
				+ samples from 5 data sets were used as the reference set.
			
 
				+ Arrays were grouped into batches based on unique combinations of sample
			
 
				+ type (blood or biopsy), diagnosis (TX, AR, etc.), data set, and scan date.
			
 
				+ Thus, each batch represents arrays of the same kind that were run together
			
 
				+ on the same day.
			
 
				+ For estimating the probe inverse variance weights, frmaTools requires equal-siz
			
 
				+ed batches, which means a batch size must be chosen, and then batches smaller
			
 
				+ than that size must be ignored, while batches larger than the chosen size
			
 
				+ must be downsampled.
			
 
				+ This downsampling is performed randomly, so the sampling process was repeated
			
 
				+ 5 times and the resulting normalizations were compared to each other.
			
 
				+\end_layout
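
The batch-equalization step described in the paragraph above (ignore batches smaller than the chosen size, randomly downsample larger ones, and repeat with different random draws) can be sketched as follows. This is a hypothetical Python illustration of the selection logic only. It does not call frmaTools, and the function name, batch names, and array identifiers are made up for the example.

```python
import random


def equalize_batches(batches, batch_size, seed=0):
    """Return a dict of batches that all contain exactly `batch_size` arrays.

    Batches with fewer arrays are dropped; larger batches are downsampled
    at random without replacement.
    """
    rng = random.Random(seed)
    selected = {}
    for name, arrays in batches.items():
        if len(arrays) < batch_size:
            continue  # too small to contribute an equal-sized batch
        selected[name] = sorted(rng.sample(arrays, batch_size))
    return selected


# Hypothetical batches keyed by (tissue, diagnosis, data set, scan date).
example_batches = {
    "blood/TX/set1/day1": ["a1", "a2"],
    "blood/AR/set1/day1": ["b1", "b2", "b3", "b4", "b5"],
    "blood/TX/set2/day3": ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"],
}

# Repeat the random selection with 5 different seeds, mirroring the
# 5 repeated samplings described in the text.
repeated_selections = [
    equalize_batches(example_batches, batch_size=5, seed=s) for s in range(5)
]
```

Each entry of `repeated_selections` would then be used to train one set of normalization vectors, so that the effect of the random draw can be assessed by comparing the resulting normalizations.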
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+To evaluate the consistency of the generated normalization vectors, the
			
 
				+ 5 fRMA vector sets generated from 5 random batch samplings were each used
			
 
				+ to normalize the same 20 randomly selected samples from each tissue.
			
 
				+ Then the normalized expression values for each probe on each array were
			
 
				+ compared across all normalizations.
			
 
				+ Each fRMA normalization was also compared against the normalized expression
			
 
				+ values obtained by normalizing the same 20 samples with ordinary RMA.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsection
			
 
				+Modeling methylation array M-value heteroskedasticity with modified voom
			
 
				+ implementation
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Itemize
			
@@ -1238,15 +1398,981 @@ Improve subsection titles in this section
 
				 \end_inset
			
 
				 
			
 
				 
			
 
				-\end_layout
			
 
				-
			
 
				-\begin_layout Subsection
			
 
				-fRMA eliminates unwanted dependence of classifier training on normalization
			
 
				- strategy caused by RMA
			
 
				-\end_layout
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsection
			
 
				+fRMA eliminates unwanted dependence of classifier training on normalization
			
 
				+ strategy caused by RMA
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsubsection
			
 
				+Separate normalization with RMA introduces unwanted biases in classification
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				+status collapsed
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/PAM/predplot.pdf
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:Classifier-probabilities-RMA"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\series bold
			
 
				+Classifier probabilities on validation samples when normalized with RMA
			
 
				+ together vs.
			
 
				+ separately.
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+To demonstrate the problem with non-single-channel methods, we considered
			
 
				+ the problem of training a classifier to distinguish TX from AR using the
			
 
				+ samples from the internal set as training data, evaluating performance
			
 
				+ on the external set.
			
 
				+ First, training and evaluation were performed after normalizing all array
			
 
				+ samples together as a single set using RMA, and second, the internal samples
			
 
				+ were normalized separately from the external samples and the training and
			
 
				+ evaluation were repeated.
			
 
				+ For each sample in the validation set, the classifier probabilities from
			
 
				+ both classifiers were plotted against each other (Fig.
			
 
				+ 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:Classifier-probabilities-RMA"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+).
			
 
				+ As expected, separate normalization biases the classifier probabilities,
			
 
				+ resulting in several misclassifications.
			
 
				+ In this case, the bias from separate normalization causes the classifier
			
 
				+ to assign a lower probability of AR to every sample.
			
 
				+ 
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsubsection
			
 
				+fRMA and SCAN maintain classification performance while eliminating
			
 
				+ dependence on normalization strategy
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				+status collapsed
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/PAM/ROC-TXvsAR-internal.pdf
			
 
				+	width 100col%
			
 
				+	groupId colwidth
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:ROC-PAM-int"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ROC curves for PAM on internal validation data using different normalization
			
 
				+ strategies
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float table
			
 
				+wide false
			
 
				+sideways false
			
 
				+status collapsed
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Tabular
			
 
				+<lyxtabular version="3" rows="7" columns="4">
			
 
				+<features tabularvalignment="middle">
			
 
				+<column alignment="center" valignment="top">
			
 
				+<column alignment="center" valignment="top">
			
 
				+<column alignment="center" valignment="top">
			
 
				+<column alignment="center" valignment="top">
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+Normalization
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+Single-channel
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+Internal Validation AUC
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+External Validation AUC
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+RMA
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+No
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.852
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.713
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+dChip
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+No
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.891
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.657
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+RMA + GRSN
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+No
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.816
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.750
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+dChip + GRSN
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+No
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.875
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.642
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+fRMA
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+Yes
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.863
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.718
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+<row>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+SCAN
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+Yes
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.853
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
			
 
				+\begin_inset Text
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+
			
 
				+\family roman
			
 
				+\series medium
			
 
				+\shape up
			
 
				+\size normal
			
 
				+\emph off
			
 
				+\bar no
			
 
				+\strikeout off
			
 
				+\xout off
			
 
				+\uuline off
			
 
				+\uwave off
			
 
				+\noun off
			
 
				+\color none
			
 
				+0.689
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+</cell>
			
 
				+</row>
			
 
				+</lyxtabular>
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "tab:AUC-PAM"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\series bold
			
 
				+AUC values for internal and external validation with 6 different normalization
			
 
				+ strategies.
			
 
				+
			
 
				+\series default
			
 
				+ Only fRMA and SCAN are single-channel normalizations.
			
 
				+ The other 4 normalizations are for comparison.
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+For internal validation, the 6 methods' AUC values ranged from 0.816 to 0.891,
			
 
				+ as shown in Table 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "tab:AUC-PAM"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ Among the non-single-channel normalizations, dChip outperformed RMA, while
			
 
				+ GRSN reduced the AUC values for both dChip and RMA.
			
 
				+ Both single-channel methods, fRMA and SCAN, slightly outperformed RMA,
			
 
				+ with fRMA ahead of SCAN.
			
 
				+ However, the difference between RMA and fRMA is still quite small.
			
 
				+ Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:ROC-PAM-int"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ shows that the ROC curves for RMA, dChip, and fRMA look very similar and
			
 
				+ relatively smooth, while both GRSN curves and the curve for SCAN have a
			
 
				+ more jagged appearance.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				+status collapsed
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/PAM/ROC-TXvsAR-external.pdf
			
 
				+	width 100col%
			
 
				+	groupId colwidth
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:ROC-PAM-ext"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ROC curve for PAM on external validation data using different normalization
			
 
				+ strategies
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+For external validation, as expected, all the AUC values are lower than
			
 
				+ in the internal validation, ranging from 0.642 to 0.750 (Table 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "tab:AUC-PAM"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+).
			
 
				+ With or without GRSN, RMA shows its dominance over dChip in this more challengi
			
 
				+ng test.
			
 
				+ Unlike in the internal validation, GRSN actually improves the classifier
			
 
				+ performance for RMA, although it does not for dChip.
			
 
				+ Once again, both single-channel methods perform about on par with RMA,
			
 
				+ with fRMA performing slightly better and SCAN performing a bit worse.
			
 
				+ Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:ROC-PAM-ext"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ shows the ROC curves for the external validation test.
			
 
				+ As expected, none of them are as clean-looking as the internal validation
			
 
				+ ROC curves.
			
 
				+ The curves for RMA, RMA+GRSN, and fRMA all look similar, while the other
			
 
				+ curves look more divergent.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsection
			
 
				+fRMA with custom-generated vectors enables normalization on hthgu133pluspm
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				+status open
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/frma-pax-bx/batchsize_batches.pdf
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:batch-size-batches"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\series bold
			
 
				+Effect of batch size selection on number of batches included in fRMA probe
			
 
				+ weight learning.
			
 
				+ 
			
 
				+\series default
			
 
				+For batch sizes ranging from 3 to 15, the number of batches with at least
			
 
				+ that many samples was plotted for biopsy (BX) and blood (PAX) samples.
			
 
				+ The selected batch size, 5, is marked with a dotted vertical line.
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				+status open
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/frma-pax-bx/batchsize_samples.pdf
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:batch-size-samples"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\series bold
			
 
				+Effect of batch size selection on number of samples included in fRMA probe
			
 
				+ weight learning.
			
 
				+ 
			
 
				+\series default
			
 
				+For batch sizes ranging from 3 to 15, the number of samples included in
			
 
				+ probe weight training was plotted for biopsy (BX) and blood (PAX) samples.
			
 
				+ The selected batch size, 5, is marked with a dotted vertical line.
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+In order to enable use of fRMA to normalize hthgu133pluspm, a custom set
			
 
				+ of fRMA vectors was created.
			
 
				+ First, an appropriate batch size was chosen by looking at the number of
			
 
				+ batches and number of samples included as a function of batch size (Figures
			
 
				+ 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:batch-size-batches"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ and 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:batch-size-samples"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+, respectively).
			
 
				+ For a given batch size, all batches with fewer samples than the chosen
			
 
				+ size must be ignored during training, while larger batches must be randomly
			
 
				+ downsampled to the chosen size.
			
 
				+ Hence, the number of samples included for a given batch size equals the
			
 
				+ batch size times the number of batches with at least that many samples.
			
 
				+ From Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:batch-size-samples"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+, it is apparent that a batch size of 8 maximizes the number of samples
			
 
				+ included in training.
			
 
				+ Increasing the batch size beyond this causes too many smaller batches to
			
 
				+ be excluded, reducing the total number of samples for both tissue types.
			
 
				+ However, a batch size of 8 is not necessarily optimal.
			
 
				+ The article introducing frmaTools concluded that it was highly advantageous
			
 
				+ to use a smaller batch size in order to include more batches, even at the
			
 
				+ expense of including fewer total samples in training 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "McCall2011"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				 
			
 
				-\begin_layout Subsubsection
			
 
				-Separate normalization with RMA introduces unwanted biases in classification
			
 
				+.
			
 
				+ To strike an appropriate balance between more batches and more samples,
			
 
				+ a batch size of 5 was chosen.
			
 
				+ For both blood and biopsy samples, this increased the number of batches
			
 
				+ included by 10, with only a modest reduction in the number of samples compared
			
 
				+ to a batch size of 8.
			
 
				+ With a batch size of 5, 26 batches of biopsy samples and 46 batches of
			
 
				+ blood samples were available.
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
@@ -1257,7 +2383,9 @@ status collapsed
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset Graphics
			
 
				-	filename graphics/PAM/predplot.pdf
			
 
				+	filename graphics/frma-pax-bx/M-BX-violin.pdf
			
 
				+	lyxscale 30
			
 
				+	groupId m-violin
			
 
				 
			
 
				 \end_inset
			
 
				 
			
@@ -1270,15 +2398,19 @@ status collapsed
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset CommandInset label
			
 
				 LatexCommand label
			
 
				-name "fig:Classifier-probabilities-RMA"
			
 
				+name "fig:m-bx-violin"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				 
			
 
				 \series bold
			
 
				-Classifier probabilities on validation samples when normalized with RMA
			
 
				- together vs.
			
 
				- separately.
			
 
				+Violin plot of log ratios between normalizations for 20 biopsy samples.
			
 
				+ 
			
 
				+\series default
			
 
				+Each of 20 randomly selected biopsy samples was normalized with RMA and
			
 
				+ with 5 different sets of fRMA vectors.
			
 
				+ This shows the distribution of log ratios between normalized expression
			
 
				+ values, aggregated across all 20 arrays.
			
 
				 \end_layout
			
 
				 
			
 
				 \end_inset
			
@@ -1292,63 +2424,78 @@ Classifier probabilities on validation samples when normalized with RMA
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
 
				-The initial data set for testing fRMA consisted of 157 hgu133plus2 arrays,
			
 
				- split into a training set (23 TX, 35 AR, 21 ADNR) and a validation set
			
 
				- (23 TX, 34 AR, 21 ADNR), along with an external validation set gathered
			
 
				- from public GEO data (37 TX, 38 AR, no ADNR) 
			
 
				-\begin_inset CommandInset citation
			
 
				-LatexCommand cite
			
 
				-key "Kurian2014"
			
 
				-literal "true"
			
 
				-
			
 
				-\end_inset
			
 
				-
			
 
				-.
			
 
				- To demonstrate the problem, we considered the problem of training a classifier
			
 
				- to distinguish TX from AR using the TX and AR samples from the training
			
 
				- set and validation set as training data, evaluating performance on the
			
 
				- external validation set.
			
 
				- First, training and evaluation were performed after normalizing all array
			
 
				- samples together as a single set using RMA, and second, the internal samples
			
 
				- were normalized separately from the external samples and the training and
			
 
				- evaluation were repeated.
			
 
				- For each sample in the validation set, the classifier probabilities from
			
 
				- both classifiers were plotted against each other (Fig.
			
 
				- 
			
 
				+Since fRMA training requires equal-size batches, larger batches are downsampled
			
 
				+ randomly.
			
 
				+ This introduces a nondeterministic step in the generation of normalization
			
 
				+ vectors.
			
 
				+ To show that this randomness does not substantially change the outcome,
			
 
				+ the random downsampling and subsequent vector learning were repeated 5 times,
			
 
				+ with a different random seed each time.
			
 
				+ 20 samples were selected at random as a test set and normalized with each
			
 
				+ of the 5 sets of fRMA normalization vectors as well as ordinary RMA, and
			
 
				+ the normalized expression values were compared across normalizations.
			
 
				+ Figure 
			
 
				 \begin_inset CommandInset ref
			
 
				 LatexCommand ref
			
 
				-reference "fig:Classifier-probabilities-RMA"
			
 
				+reference "fig:m-bx-violin"
			
 
				 plural "false"
			
 
				 caps "false"
			
 
				 noprefix "false"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				-).
			
 
				- As expected, separate normalization biases the classifier probabilities,
			
 
				- resulting in several misclassifications.
			
 
				- In this case, the bias from separate normalization causes the classifier
			
 
				- to assign a lower probability of AR to every sample.
			
 
				- Because it is not feasible to normalize all samples together in a clinical
			
 
				- context, this shows that an alternative to RMA is required.
			
 
				-\end_layout
			
 
				-
			
 
				-\begin_layout Subsubsection
			
 
				-fRMA achieves equal classification performance while eliminating dependence
			
 
				- on normalization strategy
			
 
				+ shows a summary of these comparisons for biopsy samples.
			
 
				+ Comparing RMA to each of the 5 fRMA normalizations, the distribution of
			
 
				+ log ratios is somewhat wide, indicating that the normalizations disagree
			
 
				+ on the expression values of a fair number of probe sets.
			
 
				+ In contrast, in comparisons of fRMA against fRMA, the vast majority of probe
			
 
				+ sets have very small log ratios, indicating a very high agreement between
			
 
				+ the normalized values generated by the two normalizations.
			
 
				+ This shows that the fRMA normalization's behavior is not very sensitive
			
 
				+ to the random downsampling of larger batches during training.
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
 
				-\begin_inset Flex TODO Note (inline)
			
 
				-status open
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				+status collapsed
			
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				-Cite ROCR: bioinformatics.oxfordjournals.org/cgi/content/abstract/21/20/3940
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/frma-pax-bx/MA-BX-RMA.fRMA.pdf
			
 
				+	lyxscale 50
			
 
				+	groupId ma-frma
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				-Or maybe pROC? https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-21
			
 
				-05-12-77
			
 
				+\begin_inset Caption Standard
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:ma-bx-rma-frma"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				+\series bold
			
 
				+Representative MA plot comparing RMA against fRMA for 20 biopsy samples.
			
 
				+ 
			
 
				+\series default
			
 
				+Averages and log ratios were computed for every probe in each of 20 biopsy
			
 
				+ samples between RMA normalization and fRMA.
			
 
				+ Density of points is represented by darkness of shading, and individual
			
 
				+ outlier points are plotted.
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+
			
 
				 \end_layout
			
 
				 
			
 
				 \end_inset
			
@@ -1360,11 +2507,13 @@ Or maybe pROC? https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471
 
				 \begin_inset Float figure
			
 
				 wide false
			
 
				 sideways false
			
 
				-status open
			
 
				+status collapsed
			
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset Graphics
			
 
				-	filename graphics/PAM/external-roc-frma.pdf
			
 
				+	filename graphics/frma-pax-bx/MA-BX-fRMA.fRMA.pdf
			
 
				+	lyxscale 50
			
 
				+	groupId ma-frma
			
 
				 
			
 
				 \end_inset
			
 
				 
			
@@ -1377,12 +2526,20 @@ status open
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset CommandInset label
			
 
				 LatexCommand label
			
 
				-name "fig:ROC-curve-PAM"
			
 
				+name "fig:ma-bx-frma-frma"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				-ROC curve for PAM on external validation data, normalizing with RMA and
			
 
				- fRMA
			
 
				+
			
 
				+\series bold
			
 
				+Representative MA plot comparing different fRMA vectors for 20 biopsy samples.
			
 
				+ 
			
 
				+\series default
			
 
				+Averages and log ratios were computed for every probe in each of 20 biopsy
			
 
				+ samples between fRMA normalizations using vectors from two different batch
			
 
				+ samplings.
			
 
				+ Density of points is represented by darkness of shading, and individual
			
 
				+ outlier points are plotted.
			
 
				 \end_layout
			
 
				 
			
 
				 \end_inset
			
@@ -1395,45 +2552,98 @@ ROC curve for PAM on external validation data, normalizing with RMA and
 
				 
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-fRMA eliminates this issue by normalizing each sample independently to the
			
 
				- same quantile distribution and summarizing probes using the same weights.
			
 
				-\end_layout
			
 
				+\begin_layout Standard
			
 
				+Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:ma-bx-rma-frma"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Classifier performance on validation set is identical for 
			
 
				-\begin_inset Quotes eld
			
 
				 \end_inset
			
 
				 
			
 
				-RMA together
			
 
				-\begin_inset Quotes erd
			
 
				+ shows an MA plot of the RMA-normalized values against the fRMA-normalized
			
 
				+ values for the same probe sets and arrays, corresponding to the first row
			
 
				+ of Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:m-bx-violin"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				 \end_inset
			
 
				 
			
 
				- and fRMA, so switching to clinically applicable normalization does not
			
 
				- sacrifice accuracy
			
 
				-\end_layout
			
 
				+.
			
 
				+ This MA plot shows that not only is there a wide distribution of M-values,
			
 
				+ but the trend of M-values is dependent on the average normalized intensity.
			
 
				+ This is expected, since the overall trend represents the differences in
			
 
				+ the quantile normalization step.
			
 
				+ When running RMA, only the quantiles for these specific 20 arrays are used,
			
 
				+ while for fRMA the quantile distribution is taken from all arrays used
			
 
				+ in training.
			
 
				+ Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:ma-bx-frma-frma"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				 
			
 
				-\begin_layout Standard
			
 
				-\begin_inset Flex TODO Note (inline)
			
 
				-status open
			
 
				+\end_inset
			
 
				 
			
 
				-\begin_layout Plain Layout
			
 
				-Check the published paper for any other possibly relevant figures to include
			
 
				- here.
			
 
				-\end_layout
			
 
				+ shows a similar MA plot comparing 2 different fRMA normalizations, correspondin
			
 
				+g to the 6th row of Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:m-bx-violin"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				+.
			
 
				+ The MA plot is very tightly centered around zero with no visible trend.
			
 
				+ Figures 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:m-pax-violin"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				 
			
 
				-\end_layout
			
 
				+\end_inset
			
 
				 
			
 
				-\begin_layout Subsection
			
 
				-fRMA with custom-generated vectors
			
 
				-\end_layout
			
 
				+, 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:MA-PAX-rma-frma"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Non-standard platform hthgu133pluspm - no pre-built fRMA vectors available,
			
 
				- so custom vectors must be learned from in-house data
			
 
				+\end_inset
			
 
				+
			
 
				+, and 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:MA-PAX-frma-frma"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ show exactly the same information for the blood samples, once again comparing
			
 
				+ the normalized expression values between normalizations for all probe sets
			
 
				+ across 20 randomly selected test arrays.
			
 
				+ Once again, there is a wider distribution of log ratios between RMA-normalized
			
 
				+ values and fRMA-normalized values, and a much tighter distribution when comparing
			
 
				+ different fRMA normalizations to each other, indicating that the fRMA training
			
 
				+ process is robust to random batch downsampling for the blood samples as
			
 
				+ well.
			
 
				 \end_layout
			
 
				 
			
 
				 \begin_layout Standard
			
@@ -1444,7 +2654,9 @@ status collapsed
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset Graphics
			
 
				-	filename graphics/frma-pax-bx/batchsize_batches.pdf
			
 
				+	filename graphics/frma-pax-bx/M-PAX-violin.pdf
			
 
				+	lyxscale 30
			
 
				+	groupId m-violin
			
 
				 
			
 
				 \end_inset
			
 
				 
			
@@ -1457,12 +2669,19 @@ status collapsed
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset CommandInset label
			
 
				 LatexCommand label
			
 
				-name "fig:batch-size-batches"
			
 
				+name "fig:m-pax-violin"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				-Effect of batch size selection on number of batches included in fRMA probe
			
 
				- weight learning
			
 
				+
			
 
				+\series bold
			
 
				+Violin plot of log ratios between normalizations for 20 blood samples.
			
 
				+ 
			
 
				+\series default
			
 
				+Each of 20 randomly selected blood samples was normalized with RMA and with
			
 
				+ 5 different sets of fRMA vectors.
			
 
				+ This shows the distribution of log ratios between normalized expression
			
 
				+ values, aggregated across all 20 arrays.
			
 
				 \end_layout
			
 
				 
			
 
				 \end_inset
			
@@ -1483,7 +2702,9 @@ status collapsed
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset Graphics
			
 
				-	filename graphics/frma-pax-bx/batchsize_samples.pdf
			
 
				+	filename graphics/frma-pax-bx/MA-PAX-RMA.fRMA.pdf
			
 
				+	lyxscale 50
			
 
				+	groupId ma-frma
			
 
				 
			
 
				 \end_inset
			
 
				 
			
@@ -1496,12 +2717,19 @@ status collapsed
 
				 \begin_layout Plain Layout
			
 
				 \begin_inset CommandInset label
			
 
				 LatexCommand label
			
 
				-name "fig:batch-size-samples"
			
 
				+name "fig:MA-PAX-rma-frma"
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				-Effect of batch size selection on number of samples included in fRMA probe
			
 
				- weight learning
			
 
				+
			
 
				+\series bold
			
 
				+Representative MA plot comparing RMA against fRMA for 20 blood samples.
			
 
				+ 
			
 
				+\series default
			
 
				+Averages and log ratios were computed for every probe in each of 20 blood
			
 
				+ samples between RMA normalization and fRMA.
			
 
				+ Density of points is represented by darkness of shading, and individual
			
 
				+ outlier points are plotted.
			
 
				 \end_layout
			
 
				 
			
 
				 \end_inset
			
@@ -1509,71 +2737,57 @@ Effect of batch size selection on number of samples included in fRMA probe
 
				 
			
 
				 \end_layout
			
 
				 
			
 
				-\end_inset
			
 
				-
			
 
				+\begin_layout Plain Layout
			
 
				 
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Large body of data available for training fRMA: 341 kidney graft biopsy
			
 
				- samples, 965 blood samples from graft recipients
			
 
				-\end_layout
			
 
				+\end_inset
			
 
				 
			
 
				-\begin_deeper
			
 
				-\begin_layout Itemize
			
 
				-But not all samples can be used (see trade-off figure)
			
 
				-\end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Figure showing trade-off between more samples per group and fewer groups
			
 
				- with that may samples, to justify choice of number of samples per group
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-pre-generated normalization vectors use ~850 samples
			
 
				-\begin_inset Flex TODO Note (Margin)
			
 
				+\begin_layout Standard
			
 
				+\begin_inset Float figure
			
 
				+wide false
			
 
				+sideways false
			
 
				 status collapsed
			
 
				 
			
 
				 \begin_layout Plain Layout
			
 
				-Look up the exact numbers
			
 
				-\end_layout
			
 
				+\begin_inset Graphics
			
 
				+	filename graphics/frma-pax-bx/MA-PAX-fRMA.fRMA.pdf
			
 
				+	lyxscale 50
			
 
				+	groupId ma-frma
			
 
				 
			
 
				 \end_inset
			
 
				 
			
 
				 
			
 
				-\begin_inset CommandInset citation
			
 
				-LatexCommand cite
			
 
				-key "McCall2010"
			
 
				-literal "false"
			
 
				+\end_layout
			
 
				 
			
 
				-\end_inset
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset Caption Standard
			
 
				 
			
 
				-, but are designed to be general across all tissues.
			
 
				- The samples we have are suitable for tissue-specific normalization vectors.
			
 
				-\end_layout
			
 
				+\begin_layout Plain Layout
			
 
				+\begin_inset CommandInset label
			
 
				+LatexCommand label
			
 
				+name "fig:MA-PAX-frma-frma"
			
 
				 
			
 
				-\end_deeper
			
 
				-\begin_layout Itemize
			
 
				-Figure: MA plot, RMA vs fRMA, to show that the normalization is appreciably
			
 
				- and non-linearly different
			
 
				-\end_layout
			
 
				+\end_inset
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-Figure MA plot, fRMA vs fRMA with different randomly-chosen sample subsets
			
 
				- to show consistency
			
 
				-\end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-custom fRMA normalization improved cross-validated classifier performance
			
 
				+\series bold
			
 
				+Representative MA plot comparing different fRMA vectors for 20 blood samples.
			
 
				+ 
			
 
				+\series default
			
 
				+Averages and log ratios were computed for every probe in each of 20 blood
			
 
				+ samples between fRMA normalizations using vectors from two different batch
			
 
				+ samplings.
			
 
				+ Density of points is represented by darkness of shading, and individual
			
 
				+ outlier points are plotted.
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Standard
			
 
				-\begin_inset Flex TODO Note (inline)
			
 
				-status open
			
 
				+\end_inset
			
 
				+
			
 
				 
			
 
				-\begin_layout Plain Layout
			
 
				-Get a figure from Tom showing classifier performance improvement (compared
			
 
				- to all-sample RMA, I guess?), if possible
			
 
				 \end_layout
			
 
				 
			
 
				 \end_inset
			
@@ -1617,17 +2831,110 @@ Figure and/or table showing improved p-value historgrams/number of significant
 
				 Discussion
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_layout Itemize
			
 
				-fRMA enables classifying new samples without re-normalizing the entire data
			
 
				- set
			
 
				+\begin_layout Subsection
			
 
				+fRMA achieves clinically applicable normalization without sacrificing classifica
			
 
				+tion performance
			
 
				 \end_layout
			
 
				 
			
 
				-\begin_deeper
			
 
				-\begin_layout Itemize
			
 
				-Critical for translating a classifier into clinical practice
			
 
				+\begin_layout Standard
			
 
				+As shown in Figure 
			
 
				+\begin_inset CommandInset ref
			
 
				+LatexCommand ref
			
 
				+reference "fig:Classifier-probabilities-RMA"
			
 
				+plural "false"
			
 
				+caps "false"
			
 
				+noprefix "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+, improper normalization, particularly separate normalization of training
			
 
				+ and test samples, leads to unwanted biases in classification.
			
 
				+ In a controlled experimental context, it is always possible to correct
			
 
				+ this issue by normalizing all experimental samples together.
			
 
				+ However, because it is not feasible to normalize all samples together in
			
 
				+ a clinical context, a single-channel normalization is required.
			
 
				+ 
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+The major concern in using a single-channel normalization is that non-single-cha
			
 
				+nnel methods can share information between arrays to improve the normalization,
			
 
				+ and single-channel methods risk sacrificing the gains in normalization
			
 
				+ accuracy that come from this information sharing.
			
 
				+ In the case of RMA, this information sharing is accomplished through quantile
			
 
				+ normalization and median polish steps.
			
 
				+ The need for information sharing in quantile normalization can easily be
			
 
				+ removed by learning a fixed set of quantiles from external data and normalizing
			
 
				+ each array to these fixed quantiles, instead of the quantiles of the data
			
 
				+ itself.
			
 
				+ As long as the fixed quantiles are reasonable, the result will be similar
			
 
				+ to standard RMA.
			
 
				+ However, there is no analogous way to eliminate cross-array information
			
 
				+ sharing in the median polish step, so fRMA replaces this with a weighted
			
 
				+ average of probes on each array, with the weights learned from external
			
 
				+ data.
			
 
				+ This step of fRMA has the greatest potential to diverge from RMA in undesirable
			
 
				+ ways.
			
 
				+\end_layout
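The two single-array steps described above can be sketched roughly as follows. This is a minimal illustration, not the fRMA implementation: the function names `normalize_to_fixed_quantiles` and `weighted_probe_summary` are hypothetical, the inputs are toy numbers, and it omits background correction and any other adjustments the real method applies. It only shows that both steps can be computed from one array plus values learned in advance from external data.

```python
def normalize_to_fixed_quantiles(intensities, reference_quantiles):
    """Replace each intensity with the fixed reference quantile of the same rank.

    Assumption: reference_quantiles has one entry per probe and was learned
    beforehand from external data, so no other arrays are needed here.
    """
    if len(intensities) != len(reference_quantiles):
        raise ValueError("need one reference quantile per intensity")
    reference = sorted(reference_quantiles)
    # Indices of the intensities in ascending order (ties keep input order).
    order = sorted(range(len(intensities)), key=lambda i: intensities[i])
    normalized = [0.0] * len(intensities)
    for rank, index in enumerate(order):
        normalized[index] = reference[rank]
    return normalized


def weighted_probe_summary(probe_values, probe_weights):
    """Summarize one probe set on one array as a weighted average of its probes.

    Assumption: probe_weights were learned beforehand from external data.
    """
    if len(probe_values) != len(probe_weights):
        raise ValueError("need one weight per probe")
    total_weight = sum(probe_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(v * w for v, w in zip(probe_values, probe_weights))
    return weighted_sum / total_weight


# Toy example: a three-probe "array" normalized against fixed quantiles.
fixed_quantiles = [10.0, 20.0, 30.0]
array = [5.0, 1.0, 3.0]
print(normalize_to_fixed_quantiles(array, fixed_quantiles))  # [30.0, 10.0, 20.0]
print(weighted_probe_summary([2.0, 4.0], [1.0, 3.0]))        # 3.5
```

Because neither function looks at any other array, a new clinical sample can be processed on its own once the fixed quantiles and weights exist.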
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+However, when run on real data, fRMA performed at least as well as RMA in
			
 
				+ both the internal validation and external validation tests.
			
 
				+ This shows that fRMA can be used to normalize individual clinical samples
			
 
				+ in a class prediction context without sacrificing the classifier performance
			
 
				+ that would be obtained by using the better-established RMA for normalization.
			
 
				+ The other single-channel normalization method considered, SCAN, showed
			
 
				+ some loss of AUC in the external validation test.
			
 
				+ Based on these results, fRMA is the preferred normalization for clinical
			
 
				+ samples in a class prediction context.
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Subsection
			
 
				+Robust fRMA vectors can be generated for new array platforms
			
 
				+\end_layout
			
 
				+
			
 
				+\begin_layout Standard
			
 
				+The published fRMA normalization vectors for the hgu133plus2 platform were
			
 
				+ generated from a set of about 850 samples 
			
 
				+\begin_inset Flex TODO Note (Margin)
			
 
				+status collapsed
			
 
				+
			
 
				+\begin_layout Plain Layout
			
 
				+Look up the exact numbers
			
 
				+\end_layout
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+ chosen from a wide range of tissues, which the authors determined was sufficien
			
 
				+t to generate a robust set of normalization vectors that could be applied
			
 
				+ across all tissues 
			
 
				+\begin_inset CommandInset citation
			
 
				+LatexCommand cite
			
 
				+key "McCall2010"
			
 
				+literal "false"
			
 
				+
			
 
				+\end_inset
			
 
				+
			
 
				+.
			
 
				+ Since we only had hthgu133pluspm data for 2 tissues of interest, our needs were
			
 
				+ more modest.
			
 
				+ Even using only 130 samples in 26 batches of 5 samples each for kidney
			
 
				+ biopsies, we were able to train a robust set of fRMA normalization vectors
			
 
				+ that were not meaningfully affected by the random selection of 5 samples
			
 
				+ from each batch.
			
 
				+ As expected, the training process was just as robust for the blood samples
			
 
				+ with 230 samples in 46 batches of 5 samples each.
			
 
				+ Because these vectors were each generated using training samples from a
			
 
				+ single tissue, they are not suitable for general use, unlike the vectors
			
 
				+ provided with fRMA itself.
			
 
				+ They are purpose-built for normalizing a specific type of sample on a specific
			
 
				+ platform.
			
 
				+\end_layout
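The per-batch random selection described above might look something like the sketch below. It is illustrative only: the helper `sample_per_batch`, the batch and sample identifiers, and the choice to skip batches with fewer than 5 samples are all assumptions, not details taken from the actual training scripts. Repeating the draw with different seeds is one way to check that the resulting vectors do not depend on which 5 samples were picked.

```python
import random


def sample_per_batch(batches, per_batch=5, seed=None):
    """Randomly pick `per_batch` samples from every batch that is large enough.

    `batches` maps a batch identifier to a list of sample identifiers.
    Assumption: batches with fewer than `per_batch` samples are skipped.
    """
    rng = random.Random(seed)
    selected = {}
    for batch_id, samples in batches.items():
        if len(samples) < per_batch:
            continue  # too small to contribute a full set of samples
        selected[batch_id] = rng.sample(samples, per_batch)
    return selected


# Toy example: 26 hypothetical batches of 8 samples each yield 26 * 5 = 130 samples.
toy_batches = {
    f"batch{b:02d}": [f"batch{b:02d}_s{s}" for s in range(8)] for b in range(26)
}
draw = sample_per_batch(toy_batches, per_batch=5, seed=1)
print(sum(len(samples) for samples in draw.values()))  # 130
```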
			
 
				+
			
 
				+\begin_layout Subsection
			
 
				+voom
			
 
				 \end_layout
			
 
				 
			
 
				-\end_deeper
			
 
				 \begin_layout Itemize
			
 
				 Methods like voom designed for RNA-seq can also help with array analysis
			
 
				 \end_layout
			
@@ -4031,19 +5338,9 @@ Also look at other types lymphocytes: CD8 T-cells, B-cells, NK cells
 
				 
			
 
				 \end_deeper
			
 
				 \begin_layout Itemize
			
 
				-Investigate epigenetic regulation of lifespan extension in 
			
 
				-\emph on
			
 
				-C.
			
 
				- elegans
			
 
				-\end_layout
			
 
				-
			
 
				-\begin_deeper
			
 
				-\begin_layout Itemize
			
 
				-ChIP-seq of important transcriptional regulators to see how transcriptional
			
 
				- drift is prevented
			
 
				+Use CV or bootstrap to better evaluate classifiers
			
 
				 \end_layout
			
 
				 
			
 
				-\end_deeper
			
 
				 \begin_layout Standard
			
 
				 \begin_inset ERT
			
 
				 status open