% Frozen RMA implementation for TGI PAX and BX samples
% Ryan Thompson
% Thu Mar 19, 2015

# Results #

The scripts below were used to evaluate the consistency of the fRMA normalization vectors by repeating the training process with 5 different random samples and then comparing a random selection of arrays normalized by all five trained vectors as well as by ordinary RMA. [This folder](fRMA_consistency_results) shows the results.

# Scripts #

There are two pairs of scripts. The first pair, `train.R` and `test.R`, handle the tasks of (respectively) generating/training the main fRMA vectors and ensuring that they work by normalizing all the data with them. The second pair, `consistency-train.R` and `consistency-evaluate.R`, handle (respectively) training five separate fRMA vector sets and testing their consistency.

## [`train.R`](train.R): Creating the fRMA vectors ##

This script reads the sample metadata tables, assembles the full file lists for BX and PAX tissues, and trains a set of fRMA vectors for each tissue. It exports each of these vector sets to an installable R package.

## [`test.R`](test.R): Testing the fRMA vectors ##

This script simply loads all the arrays and normalizes them using the appropriate fRMA vectors that were generated by `train.R`. It should be run after installing the packages produced by `train.R`. It is simply used for testing to make sure the fRMA vectors work.

## [`consistency-train.R`](consistency-train.R): Train several vector sets for each tissue ##

This script essentially does the same thing as `train.R`, only it does it five times with five different subsamplings of the arrays to generate five different fRMA vector sets and saves them all in an R data file.

## [`consistency-evaluate.R`](consistency-evaluate.R): Verify consistency of fRMA vectors ##

This script loads the data file from `consistency-train.R`, then loads 20 random arrays from each tissue and normalizes them with all five fRMA vector sets, and also by ordinary RMA.
It then produces plots of M vs A for every pair of normalizations. Unlike regular MA plots, these do *not* plot arrays against each other. Instead, each plot compares an array against itself, normalized using two different methods. So if two normalizations were perfectly consistent, the MA plot would be a flat horizontal line at M=0. It also produces boxplots and violin plots showing the M distribution for each of the pairwise comparisons.
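
To make the M and A quantities concrete, here is a minimal sketch of how they could be computed for one array normalized two ways. It is written in Python for illustration only and is not taken from `consistency-evaluate.R`, which is an R script. It assumes the usual MA-plot convention on log2-scale expression values (M is the difference, A is the average). The function name `ma_values` and the numbers are hypothetical.

```python
def ma_values(expr_x, expr_y):
    """Return (M, A) lists for one array normalized by two methods.

    Both inputs are assumed to hold log2-scale expression values for
    the same probesets, in the same order.
    """
    if len(expr_x) != len(expr_y):
        raise ValueError("both normalizations must cover the same probesets")
    # M: per-probeset difference between the two normalizations.
    m = [x - y for x, y in zip(expr_x, expr_y)]
    # A: per-probeset average of the two normalizations.
    a = [(x + y) / 2 for x, y in zip(expr_x, expr_y)]
    return m, a


# Made-up log2 expression values for a single array, normalized with
# two hypothetical fRMA vector sets.
frma_set1 = [7.0, 9.5, 11.0, 5.5]
frma_set2 = [7.0, 9.0, 11.5, 5.5]

m_vals, a_vals = ma_values(frma_set1, frma_set2)
```

Under this convention, an array normalized identically by both methods gives M = 0 for every probeset, which is the flat horizontal line described above. The spread of `m_vals` around zero is what the boxplots and violin plots summarize.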