% Frozen RMA implementation for TGI PAX and BX samples
% Ryan Thompson
% Thu Mar 19, 2015


# Results #

The scripts below were used to evaluate the consistency of the fRMA
normalization vectors. The training process was repeated with five
different random samples of arrays, and a random selection of arrays
was then normalized with each of the five trained vector sets as well
as with ordinary RMA. [This folder](fRMA_consistency_results) shows
the results.

# Scripts #

There are two pairs of scripts. The first pair, `train.R` and
`test.R`, respectively train the main fRMA vectors and check that they
work by normalizing all the data with them. The second pair,
`consistency-train.R` and `consistency-evaluate.R`, respectively train
five separate fRMA vector sets and test their consistency.

## [`train.R`](train.R): Creating the fRMA vectors ##

This script reads the sample metadata tables, assembles the full file
lists for BX and PAX tissues, and trains a set of fRMA vectors for
each tissue. It exports each of these vector sets to an installable R
package.

## [`test.R`](test.R): Testing the fRMA vectors ##

This script loads all the arrays and normalizes them using the
appropriate fRMA vectors generated by `train.R`. It exists only to
check that the fRMA vectors work, and it should be run after
installing the packages produced by `train.R`.

## [`consistency-train.R`](consistency-train.R): Train several vector sets for each tissue ##

This script does essentially the same thing as `train.R`, but repeats
the training five times, each time with a different random subsample
of the arrays. It saves the resulting five fRMA vector sets together
in an R data file.
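
The repeated subsampling can be sketched in base R as follows. This is
a minimal illustration, not the code in `consistency-train.R`: the
helper `sample_balanced`, the toy file list, the batch labels, and the
sizes are all hypothetical, and it assumes that training wants the
same number of arrays from every batch.

```r
# Hypothetical helper: pick `n_batches` batches at random, then draw
# `n_per_batch` arrays at random from each, so batches are balanced.
sample_balanced <- function(files, batch, n_batches, n_per_batch) {
    sizes <- table(batch)
    # Only batches with enough arrays can contribute to a subsample
    eligible <- names(sizes)[sizes >= n_per_batch]
    stopifnot(length(eligible) >= n_batches)
    chosen <- sample(eligible, n_batches)
    idx <- unlist(lapply(chosen, function(b) {
        candidates <- which(batch == b)
        candidates[sample.int(length(candidates), n_per_batch)]
    }))
    data.frame(file = files[idx], batch = batch[idx],
               stringsAsFactors = FALSE)
}

# Toy metadata standing in for the real sample tables (made-up names)
set.seed(1)
toy <- data.frame(
    file = sprintf("array%03d.CEL", 1:60),
    batch = rep(paste0("batch", 1:10),
                times = c(8, 7, 6, 6, 6, 6, 6, 5, 5, 5)),
    stringsAsFactors = FALSE
)

# Five independent subsamples; in a real run each one would be passed
# to the fRMA training step instead of being saved directly.
subsamples <- lapply(1:5, function(i) {
    sample_balanced(toy$file, toy$batch, n_batches = 5, n_per_batch = 5)
})

# Store all five results together in one R data file
out_file <- file.path(tempdir(), "consistency-subsamples.RData")
save(subsamples, file = out_file)
```

Drawing each subsample independently means the five training sets may
overlap, which is one reasonable reading of "different subsamplings";
the actual script may choose its subsets differently.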

## [`consistency-evaluate.R`](consistency-evaluate.R): Verify consistency of fRMA vectors ##

This script loads the data file from `consistency-train.R`, then loads
20 random arrays from each tissue and normalizes them with each of the
five fRMA vector sets as well as with ordinary RMA. It then produces
plots of M vs A for every pair of normalizations. Unlike regular MA
plots, these do *not* compare two different arrays. Instead, each plot
compares an array against itself as normalized by two different
methods, so if two normalizations were perfectly consistent, the MA
plot would be a flat horizontal line at M=0. The script also produces
boxplots and violin plots showing the M distribution for each of the
pairwise comparisons.
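
To make the M and A quantities concrete, here is a small base R
sketch. It assumes that every normalization returns log2 expression
values for the same probesets of the same array. The function name
`ma_values`, the normalization labels, and the simulated numbers are
illustrative and are not taken from `consistency-evaluate.R`.

```r
# M is the difference and A is the mean of two log2 expression vectors
# obtained for the same array under two different normalizations.
ma_values <- function(x1, x2) {
    stopifnot(length(x1) == length(x2))
    data.frame(M = x1 - x2, A = (x1 + x2) / 2)
}

# Five fRMA vector sets plus RMA give choose(6, 2) = 15 pairs
norm_names <- c(paste0("fRMA", 1:5), "RMA")
pairs <- combn(norm_names, 2, simplify = FALSE)

# Simulated log2 values for one array: a shared signal plus a little
# normalization-specific noise (made-up numbers, illustration only)
set.seed(42)
signal <- runif(1000, min = 4, max = 14)
exprs_by_norm <- lapply(setNames(norm_names, norm_names), function(n) {
    signal + rnorm(length(signal), sd = 0.05)
})

# One data frame of M and A values per pairwise comparison
ma_by_pair <- lapply(pairs, function(p) {
    ma_values(exprs_by_norm[[p[1]]], exprs_by_norm[[p[2]]])
})
names(ma_by_pair) <- vapply(pairs, paste, character(1), collapse = ".vs.")

# Two identical normalizations give M = 0 everywhere (the flat line)
identical_ma <- ma_values(signal, signal)
```

From here, `plot(M ~ A, data = ma_by_pair[[1]])` would draw one
self-versus-self MA plot, and
`boxplot(lapply(ma_by_pair, function(d) d$M))` would summarize the M
distribution of each comparison.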