Hematopoietic Lineage Transcriptome Stability and Representation in PAXgene Collected Peripheral Blood Utilising SPIA Single-Stranded cDNA Probes for Microarray.

Peripheral blood as a surrogate tissue for transcriptome profiling holds great promise for the discovery of diagnostic and prognostic disease biomarkers, particularly when target tissues of disease are not readily available. To maximize the reliability of gene expression data generated from clinical blood samples, both the sample collection and the microarray probe generation methods should be optimized to provide stabilized, reproducible and representative gene expression profiles faithfully representing the transcriptional profiles of the constituent blood cell types present in the circulation. Given the increasing innovation in this field in recent years, we investigated a combination of methodological advances in both RNA stabilisation and microarray probe generation with the goal of achieving robust, reliable and representative transcriptional profiles from whole blood. To assess the whole blood profiles, the transcriptomes of purified blood cell types were measured and compared with the global transcriptomes measured in whole blood. The results demonstrate that a combination of PAXgene™ RNA stabilising technology and single-stranded cDNA probe generation afforded by the NuGEN Ovation RNA amplification system V2™ enables an approach that yields faithful representation of specific hematopoietic cell lineage transcriptomes in whole blood without the necessity for prior sample fractionation, cell enrichment or globin reduction. Storage stability assessments of the PAXgene™ blood samples also advocate a short, fixed room temperature storage time for all PAXgene™ blood samples collected for the purposes of global transcriptional profiling in clinical studies.


Introduction
Development of high-quality biomarkers for disease progression, target engagement, pharmacodynamic effects and compound efficacy is central to translational research efforts aimed at establishing personalised medicine. However, identification of such biomarkers is frequently hampered by lack of access to target tissues with direct biological involvement. As such, peripheral blood (PB) has become an increasingly attractive source of surrogate material in place of target tissue as a focus for biomarker discovery and application, representing one of the most widely available and practical sources of biological material collected within the clinic and permitting a largely non-invasive method of repeated analyses within the same patient.
In diseases such as autoimmune disorders and leukaemias, where PB lineages are directly involved in the disease process, disease-associated transcriptional profiles have been well established (Alizadeh et al. 2000; Baechler et al. 2003; Bullinger et al. 2004; Crow and Wohlgemuth, 2003; Golub et al. 1999; Han et al. 2003; Maas et al. 2002; McLean et al. 2004; Moos et al. 2002; Nowicki et al. 2003; Valk et al. 2004; Bennett et al. 2003). In recent years interest has also turned to the potential of PB as a surrogate tissue for the discovery of additional types of biomarkers in other scenarios. Given the highly dynamic cellular nature of the blood and its circulation throughout the body, it is likely that systemic responses to a disease state can be captured within the PB transcriptome. Several examples of successful biomarker discovery in transcriptome data derived from PB have been reported from pilot studies in neurological disorders (Borovecki et al. 2005; Hershey et al. 2004; Tang et al. 2005; Moore et al. 2005), oncology settings (Twine et al. 2003; Burczynski et al. 2005), SARS and gastrointestinal disorders. This accumulating body of evidence supports the utility of peripheral blood transcriptional profiling for identification of transcripts that may function as biomarkers of disease, evidence of pharmacodynamic effect, or even predictors of clinical outcomes and risk of toxicity. Whilst peripheral blood transcriptional profiling appears to hold great promise, several limitations to the use of this matrix for gene expression studies should be carefully considered in any study if the resulting disease or pharmacodynamic biomarkers are to hold any clinical utility (Mohr and Liew, 2007).
Transcriptional profiling from PB can be achieved using a number of methodologies and associated workflows; for example a) whole blood, b) enriched mononuclear fractions (PBMCs) or c) selected/enriched specific hematopoietic lineages. Naturally, each of these approaches possesses distinct advantages and disadvantages. Hematopoietic lineage representation in PBMCs or selected/enriched fractions, whilst not requiring introduction of globin reduction strategies prior to microarray experiments, is restricted to the lineages under study, and several cell lineages are consequently omitted from further analysis. Global profiling of whole blood RNA, although preferable in order to capture the blood transcriptome in its entirety, has traditionally required globin reduction strategies prior to probe generation for microarray analysis in order to reduce the interference of globin transcripts with array probes. In addition to increasing processing steps, these procedures add the risk of artefactual modulation of the transcriptome, and the utility of globin reduction strategies has been questioned in several reports (Li et al. 2008; Dumeaux et al. 2008).
The consistency of gene expression profiling data of whole blood is highly dependent on the sample stabilisation method used at phlebotomy or during storage (Debey et al. 2006; Rainen et al. 2002). Technologies for whole blood sample collection and storage have been designed in an attempt to overcome problems associated with clinical collection and PB transcriptome stability. One of these, the PAXgene™ Blood RNA system (PreAnalytiX, Hombrechtikon, Switzerland), consists of an evacuated PAXgene™ RNA tube for blood collection and a processing kit for isolation of total RNA from whole blood. The PAXgene™ collection tube contains a proprietary reagent that is reported to immediately stabilise intracellular RNA for up to 3 days at room temperature. The potential to minimize the requirement for urgent sample processing by increasing the clinical sample transcriptome stability in this matrix is central to the discovery of robust biomarkers.
Several groups have demonstrated the impact of PAXgene™ blood collection on RNA transcript stabilisation using quantitative PCR (qPCR) analyses. These studies have demonstrated the utility of the PAXgene™ system in restricting both initial and longer term (several days) ex vivo gene expression changes occurring after phlebotomy compared to conventional anticoagulant methods for blood collection (Muller et al. 2002; Pahl and Brune, 2002; Rainen et al. 2002; Thorn et al. 2005). The impact of different PAXgene™ storage protocols on RNA quantity and quality has also been investigated, with several reports obtaining high quality RNA samples over a range of storage times at room temperature (2 hrs, 9 hrs, 24 hrs and 5 days) (Chai et al. 2005; Thach et al. 2003; Wang et al. 2004). When generating gene expression data using qPCR of selected transcripts as the biological readout, good agreement between replicate samples has been shown, although variability between samples increases with longer incubation periods (Wang et al. 2004). Gene expression levels of a limited number of transcripts in PAXgene™-collected whole blood following 5 days of room temperature storage have shown no alteration compared to 24 hour storage (Chai et al. 2005). Conversely, a reduction in RNA integrity after storage in PAXgene™ at room temperature from 1 to 7 days has been reported and, even without apparent reduction in RNA integrity, specific transcript instability has been reported even with storage at +4 °C rather than room temperature (Kagedal et al. 2005). Considering the potential clinical impact of obtaining high quality and faithful gene expression profiling from peripheral blood, it is crucial to establish a consistent, robust and practical sample collection method for clinical blood samples.
Due to the high level of reticulocytes in peripheral blood there is a predominance of globin mRNA transcripts with the potential to (a) result in under-representation of non-globin transcripts and (b) impact on microarray data quality as a result of extensive non-specific cross-hybridisation to non-globin probes, reducing visualisation of non-globin transcripts. Therefore, the quality and accuracy of gene expression profiles obtained from peripheral blood are highly reliant on the effectiveness of globin reduction carried out prior to microarray probe generation. Technologies available to reduce globin mRNA have been shown to efficiently increase the sensitivity of transcript detection (Field et al. 2007; Li et al. 2008; Liu et al. 2006) but can also reduce the signal intensities achieved for some genes (Field et al. 2007). Importantly, methods of globin reduction have also been shown to introduce changes in the transcriptome profile observed (Feezor et al. 2004; Liu et al. 2006). A potential solution to the problem of globin reduction prior to gene expression profiling of whole blood has been launched by NuGEN Technologies™ (California, U.S.A.) in the form of the Ovation™ RNA amplification system V2. The Ovation system utilises a single primer, isothermal linear amplification (SPIA) method (Kurn et al. 2005) to generate single-stranded cDNA (sscDNA) microarray probes suitable for use with Affymetrix GeneChips™. Reproducibility studies and comparison to other microarray methods have illustrated a high degree of consistency and greater hybridisation specificity when exploiting sscDNA:DNA hybridisation compared to cRNA:DNA on microarrays (Barker et al. 2005). The NuGEN™ Ovation Whole Blood System does not require globin reduction strategies. Whilst globin transcripts are still converted to sscDNA, their high abundance does not present an issue in microarray hybridisations due to the increased specificity of sscDNA probes compared to cRNA probes.
This system therefore has the potential to allow whole blood transcriptional profiling with decreased sample processing steps and circumvention of potential artefactual modulation of the transcriptome associated with globin reduction strategies. Ultimately, the utility of any whole blood profiling approach is dependent on its ability to provide a faithful representation of the individual hematopoietic lineage transcriptomes present in blood.
In the present study we therefore sought to identify a workflow which potentially allows (a) efficient and immediate stabilisation of collected peripheral whole blood and (b) microarray probe generation without the requirement for globin reduction. In addition to establishing the robustness and reproducibility of such methods, we also gauged the representation of specific hematopoietic lineage transcriptomes afforded by the evaluated approaches in order to identify an optimal workflow.

Sample collection and processing
Blood samples were collected from an appropriately consented healthy donor using the PAXgene™ collection tube system (PreAnalytiX, Hombrechtikon, Switzerland). Blood was drawn by standard phlebotomy procedures, with 2.5 ml dispensed into each of 6 PAXgene™ tubes and processed according to the manufacturer's directions (Fig. 1). Briefly, half of the samples collected (n = 3) were maintained at room temperature for 2 hrs and the other half for 24 hrs, both within the manufacturer's stated period of storage stability of 2-72 hrs. All samples were inverted 10 times and frozen at −20 °C overnight and then moved to −80 °C for storage until RNA isolation (Experiment 1). This same experimental procedure was repeated using a fresh blood draw obtained from the same donor on a separate day (Experiment 2). Additional blood samples from the same subject were collected into K3-EDTA tubes and processed for full blood counts on a Sysmex XE 2100 fully automated full blood count analyser.

Total RNA isolation
Frozen PAXgene™ tubes were thawed at room temperature for 2 hours followed by total RNA isolation according to the manufacturer's instructions (PreAnalytiX, Hombrechtikon, Switzerland). Briefly, PAXgene™ blood tubes were centrifuged for 10 minutes at 3000 × g, the supernatant removed and the cell pellet resuspended in 4 ml of RNase-free water. Samples were re-centrifuged for 10 minutes at 3000 × g and the supernatant removed. The cell pellet was resuspended in 350 μl Buffer BR1 and, following the addition of 300 μl Buffer BR2 and 40 μl Proteinase K, incubated for 10 minutes at 55 °C while shaking at 400 rpm. The resultant lysate was centrifuged through a PAXgene™ Shredder spin column at 16,300 × g for 3 minutes. The supernatant of the flow-through was added to 350 μl of 100% ethanol, mixed, applied to a PAXgene™ RNA spin column and centrifuged for 1 minute at 16,300 × g. 350 μl of Buffer BR3 was added to the PAXgene™ RNA spin column and centrifuged for 1 minute at 16,300 × g. A DNase I incubation mix was prepared by combining 10 μl of DNase I and 70 μl of Buffer RDD per sample. DNase treatment was performed by pipetting 80 μl of DNase I incubation mix directly on to the PAXgene™ RNA spin column membrane and incubating at room temperature for 15 minutes. To wash the PAXgene™ RNA spin column, 350 μl of Buffer BR3 was added and centrifuged for 1 minute at 16,300 × g. 500 μl of Buffer BR4 was added to the PAXgene™ RNA spin column and centrifuged for 1 minute at 16,300 × g. The addition of Buffer BR4 was repeated with a 3 minute 16,300 × g centrifugation followed by additional centrifugation for 1 minute at 16,300 × g to dry the column. RNA was eluted into a fresh 1.5 ml microcentrifuge tube using 40 μl of Buffer BR5 followed by centrifugation for 1 minute at 16,300 × g. Elution was repeated with 40 μl of fresh Buffer BR5, giving a total elution volume of 80 μl. Eluted RNA was denatured for downstream applications by incubation at 65 °C for 5 minutes.
Total RNA concentration was measured by Nanodrop ND-8000 (Nanodrop Technologies, Wilmington, DE, U.S.A.), and the integrity of the extracted total RNA was assessed using RNA Nano Chips on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, U.S.A.). Agilent Bioanalyzer 2100 electropherograms were analysed by the Biosizing software 2100 Expert Version B.02.03.SI307 (Agilent Technologies, 2100 Bioanalyzer) to generate RNA integrity numbers (RIN) as an RNA quality control metric.

Hematopoietic cell lineages
Hematopoietic cell lineage transcript expression data were obtained from a number of sources (see Table 4). RNA derived in-house (B-cells, T-cells and dendritic cells) for microarray hybridisation was obtained from frozen cell preparations purchased from Lonza Biosciences (Basel, Switzerland). Transcript expression data for a wide range of other cell lineages were obtained from public datasets deposited in the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/), accession numbers GSE1133 and GSE3982 (Jeffrey et al. 2006; Su et al. 2004). Taken together, this dataset provided a source of mRNA expression data (derived from Affymetrix GeneChips™) from a wide range of hematopoietic cell lineages prepared by either density gradient, positive or negative paramagnetic bead isolation, and with or without stimulated differentiation in vitro.

Microarray probe generation
Preparation of single-stranded cDNA (sscDNA) SPIA probes for microarray hybridisation, without the requirement for globin reduction pretreatment, was carried out with 50 ng of total PAXgene™ RNA using the NuGEN Technologies (California, U.S.A.) Ovation™ RNA amplification system V2 according to the manufacturer's instructions. Synthesised sscDNA probe concentration was measured by Nanodrop ND-8000 (Nanodrop Technologies, Wilmington, DE, U.S.A.). The NuGEN Technologies FL-Ovation™ cDNA Biotin Module V2 was used for sscDNA fragmentation and biotin labelling of 4.4 μg of the amplified sscDNA for subsequent hybridisation to the microarrays (Affymetrix, Santa Clara, CA, U.S.A.). The quality of the sscDNA probes was assessed before and after fragmentation with the Agilent 2100 Bioanalyzer using RNA Nano chips (Agilent Technologies, Santa Clara, CA, U.S.A.).
RNA (2 μg) prepared in-house from B-cells, T-cells and dendritic cells was converted to cRNA, further processed according to Affymetrix (Santa Clara, CA, U.S.A.) recommendations and quality assessed by Agilent 2100 Bioanalyzer.

Affymetrix GeneChips
For in-house prepared B-cell, T-cell and dendritic cell data, 15 μg of fragmented cRNA was hybridised to Affymetrix HG-U133A GeneChips, according to standard Affymetrix procedures. Data from GSE1133 and GSE3982 also resulted from hybridisation to HG-U133A GeneChips.
sscDNA probes generated from RNA isolated from PAXgene™-collected blood were hybridised to Affymetrix Human Genome U133 Plus 2.0 Arrays (Affymetrix, Santa Clara, CA, U.S.A.) as described in the Affymetrix Expression Analysis Technical Manual. Briefly, 4.4 μg of fragmented and labelled sscDNA, together with spiked hybridisation controls (GeneChip Expression 3' Amplification Reagents-hybridisation controls), was hybridised for 18 hrs at 45 °C in a rotating oven. Following hybridisation, GeneChip washing and staining were performed using the GeneChip Hybridisation Wash and Stain kit (Affymetrix, Santa Clara, CA, U.S.A.) on an Affymetrix GeneChip Fluidics Station 450 using the appropriate fluidic script for the U133 Plus 2.0 microarrays with sscDNA (FS450-0004). GeneChips were scanned immediately following staining in an Affymetrix GeneChip Scanner 3000 (Affymetrix, Santa Clara, CA, U.S.A.).

Data analysis
Report files summarising the quality of target and control detection for each microarray were generated by GeneChip Operating Software Version 1.4 (GCOS) using the MAS5.0 algorithm (Affymetrix, Santa Clara, CA, U.S.A.). All parameters (noise factor, background, and scaling factor) were acceptable by Affymetrix recommendations (Table 2).
Assessment of PAXgene™/NuGEN™ sscDNA whole blood data was performed using a combination of Affymetrix MAS5.0 feature extraction and RMA (Irizarry et al. 2003). Briefly, those probesets determined by MAS5.0 to be present in at least 25% of the PAXgene™ whole blood samples were selected for downstream analyses. Array files were subsequently processed by RMA, filtered on those MAS5.0-selected probesets. Data were log2-transformed and those probesets with values greater than 6.5 in at least one sample were selected for further analysis. Comparison between 2 hrs and 24 hrs PAXgene™ storage was performed by SAM analysis (Tusher et al. 2001) implemented in TM4 MEV (multi-experiment viewer) (Saeed et al. 2003), using all samples from both experiments 1 and 2 together, with a False Discovery Rate (FDR) of 0.05. For assessment of any 3' or 5' bias of probeset signal intensity, the chromosomal locations of probesets were mapped from Affymetrix NetAffx annotation, allowing assessment of bias of signal intensity for those transcripts containing two or more probesets at distinct positions of the corresponding transcript.
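The two-stage probeset filter described above can be sketched as follows. This is a minimal illustration only, assuming MAS5.0 detection calls and log2 RMA signals are already available as per-probeset lists; the function name, data layout and probeset identifiers are hypothetical and are not taken from the study's own analysis scripts.

```python
def filter_probesets(present_calls, rma_log2,
                     min_present_fraction=0.25, min_log2=6.5):
    """Return probeset IDs passing both filters described in the text.

    present_calls: dict of probeset ID -> list of bool, one MAS5.0
                   "present" call per whole blood sample (assumed layout).
    rma_log2:      dict of probeset ID -> list of float, log2 RMA signal
                   per sample (assumed layout).
    """
    kept = []
    for probeset, calls in present_calls.items():
        # Filter 1: called present in at least 25% of samples.
        if sum(calls) / len(calls) < min_present_fraction:
            continue
        # Filter 2: log2 signal greater than 6.5 in at least one sample.
        if any(value > min_log2 for value in rma_log2[probeset]):
            kept.append(probeset)
    return kept


# Toy data with hypothetical probeset IDs.
example_calls = {
    "ps_a": [True, True, False, False],
    "ps_b": [False, False, False, False],
    "ps_c": [True, True, True, True],
}
example_signal = {
    "ps_a": [6.0, 7.1, 5.9, 6.2],
    "ps_b": [9.0, 9.1, 8.8, 9.2],
    "ps_c": [5.0, 6.5, 6.4, 6.1],
}
example_kept = filter_probesets(example_calls, example_signal)
```

In this toy input only `ps_a` survives: `ps_b` is never called present, and `ps_c` never exceeds the 6.5 threshold.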
To identify probesets most indicative of specific hematopoietic lineages, SAM analysis was performed with an FDR of 0 to select probesets significantly over-represented in one cell type compared to all other cell types within the same dataset. Following identification of probesets significantly over-represented in one particular lineage we applied a second filtering criterion, excluding those probesets with linear expression levels of less than 200, to maximise the likelihood of identified probesets being "diagnostic" of specific hematopoietic lineages. CEL files for PAXgene™ data and internally generated hematopoietic lineage microarray data are available from Array Express (http://www.ebi.ac.uk/microarray-as/aer), accession number E-MEXP-1600.

RNA yield and quality
In order to assess the yield and quality of the RNA obtained following PAXgene™ RNA stabilisation and isolation according to the workflow outlined in Figure 1, quality control analyses were carried out and yields are presented in Table 1. Analysis of total RNA quality by Agilent Bioanalyzer showed high RIN (RNA integrity numbers) for all samples (Table 1), revealing good quality RNA above the level traditionally deemed acceptable for microarray probe generation (RIN = 7.0) (Fig. 2, A). No significant differences in yield or quality of RNA obtained from samples collected (a) in different experiments or (b) as a result of PAXgene™ storage time were noted.

Efficacy of sscDNA probe preparation from PAXgene™ whole blood RNA
All RNA samples were used to synthesize sscDNA probes for use in microarray gene expression profiling. Yields of sscDNA within the same experiments were tightly grouped and no significant difference between experiments or as a result of incubation times was observed (Table 1). sscDNA yields were consistently in excess of the 4.4 μg required for further processing for array hybridisations. Quality assessment by Agilent Bioanalyzer showed that the sscDNA fragments generated from whole blood RNA were of broad distribution in length, averaging approximately 1000 nucleotides, suggesting efficient sscDNA synthesis (Fig. 2, B). sscDNAs were fragmented and biotin-labelled in preparation for hybridisation and all resulted in fragments of ~50-200 nt (data not shown), similar to results previously reported using this method (Barker et al. 2005).

Quality of microarray measurements
Favourable quality control metrics for both RNA and sscDNA indicated successful sample preparation, and all samples were subsequently hybridised to Affymetrix Human Genome U133 Plus 2.0 Arrays. All arrays within the study generated percentage present calls of between 60% and 63%. Average background was 32.80 ± 0.31 (mean ± standard error), within the typical range of 20-100, the average noise factor was 1.43 ± 0.04 and the average scaling factor for all arrays was 1.56 ± 0.08. Finally, the β-actin 3'-5' ratios were consistent between all arrays, suggesting no significant degradation of samples. Neither experiment nor incubation period of PAXgene™ whole blood contributed to any variation in quality control metrics (Table 2).

Hematopoietic transcriptome stability in PAXgene™ Whole Blood samples
Following data normalisation, transformation and filtering, as outlined in materials and methods, Principal Component Analysis (PCA) of the remaining probesets was performed (Fig. 3a). As would be expected, distinct groupings of data according to experimental day were evident. Similarly, the PCA model revealed a significant contribution of PAXgene™ whole blood incubation period to variance within the dataset, for both experiments. SAM analysis was used with an FDR of 0.05 to identify those probesets differing in expression level in PAXgene™ whole blood samples stored for 24 hrs relative to 2 hrs. Those probesets fulfilling these criteria and which differ in expression level by more than 1.5-fold in 24 hr samples relative to 2 hr samples are represented graphically in Figure 3b. This analysis identified a total of 3311 probesets differentially represented in PAXgene™-collected whole blood incubated at room temperature for 24 hrs compared to 2 hrs. The vast majority of these, 3050 (92.1%), represent probesets with lower signal intensities in 24 hr relative to 2 hr samples. Whilst a large proportion of these (78.6%) are relatively modest 1.5 to 2-fold changes, a significant number are differentially represented at levels greater than 2-fold, with only 4 of these representing probesets over-represented in 24 hr samples. The complete annotated list of these probesets is available as Supplementary online Table 1.
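The fold-change classification applied to SAM-significant probesets can be sketched as below. This is an illustrative sketch only: it assumes mean log2 signals per storage time are supplied for probesets that have already passed SAM at an FDR of 0.05 (the SAM step itself is not reproduced), and the function name and probeset identifiers are hypothetical.

```python
def classify_fold_changes(mean_log2_2h, mean_log2_24h, threshold=1.5):
    """Split probesets into under- and over-represented at 24 hrs.

    Both inputs are dicts of probeset ID -> mean log2 signal (assumed
    layout). A probeset is reported only if its linear 24 hr / 2 hr
    ratio differs by more than `threshold`-fold in either direction.
    """
    under, over = [], []
    for probeset, early in mean_log2_2h.items():
        # Difference of log2 means converted back to a linear ratio.
        ratio = 2 ** (mean_log2_24h[probeset] - early)
        if ratio > threshold:
            over.append(probeset)
        elif ratio < 1 / threshold:
            under.append(probeset)
    return under, over


# Toy data with hypothetical probeset IDs.
toy_2h = {"ps1": 10.0, "ps2": 8.0, "ps3": 7.0}
toy_24h = {"ps1": 9.0, "ps2": 8.2, "ps3": 8.0}
toy_under, toy_over = classify_fold_changes(toy_2h, toy_24h)

# Proportion of under-represented probesets from the counts in the text.
pct_under = round(100 * 3050 / 3311, 1)
```

The final line recomputes the proportion quoted in the text from the reported counts (3050 of 3311 probesets).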
We first considered whether those probesets under-represented in 24 hr samples were indicative of canonical degradation of mRNAs. To assess whether these could be due to 5'-3' degradation of the mRNA transcripts we exploited the fact that several transcripts are represented by 2 or more probesets on the Affymetrix Human Genome U133 Plus 2.0 Arrays. The Affymetrix annotation file allowed us to map the chromosomal location of these multiple probesets for the genes showing differential expression between 2 hrs and 24 hrs storage. This in turn allowed identification of which probeset was responsible for reduced signal intensity in the 24 hr samples and subsequent assessment of whether this was due to a loss of the 3' or 5' probesets. Most genes were excluded from these analyses due to the presence of only one probeset or multiple probesets mapping to overlapping chromosomal regions. In total 117 transcripts were utilised to investigate a potential polarity effect (i.e. 3' or 5' degradation). Of these 117 genes, 94 displayed preferential loss of a 3' probeset and only 23 showed loss of a 5' probeset, suggesting highly significant (P < 1 × 10⁻⁷, repeat random sampling) 3' degradation.
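The 94 versus 23 split can be cross-checked with an exact one-sided binomial (sign) test. Note that this is a substituted technique, not the repeat random sampling procedure used in the study, and it assumes that under the null hypothesis a 3' or 5' probeset is equally likely to be the one lost; the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Exact one-sided probability P(X >= successes) for X ~ Bin(trials, p)."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Counts reported in the text: 94 of 117 genes lost a 3' probeset.
p_three_prime_bias = binomial_upper_tail(94, 117)
```

Under these assumptions the tail probability can be compared directly with the bound of 1 × 10⁻⁷ reported from random sampling.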

Faithfulness of hematopoietic cell lineage transcriptome representation in PAXgene™ whole blood expression profiles
It is essential that any workflow utilising whole blood provides a faithful representation of individual cell lineage transcriptomes. We wished to have an objective measurement of the probesets we had identified as "present" in our PAXgene™ analyses compared to probesets independently identified, from samples where specific hematopoietic cell lineages had been isolated, as being indicative of the identity of that particular lineage. We pursued this approach with the additional intention of investigating whether the marked under-representation of a large number of probesets following 24 hr PAXgene™ whole blood storage resulted from preferential reduction in signal intensity of probes accounted for by a particular hematopoietic lineage. "Signature probesets" indicative of a number of individual hematopoietic cell lineages were estimated from both in-house and public data sets (Jeffrey et al. 2006; Su et al. 2004). These datasets were generated utilising a wide range of enrichment techniques: positive and/or negative selection, with or without in vitro stimulation and culture, from either cord blood or peripheral blood (Table 4). Signature probesets were identified following SAM analysis with an FDR of 0, as indicated in materials and methods, from those found to be significantly over-represented in the cell type of choice compared to all other cell types in that dataset. These data are summarised in Table 4 and the resultant annotated list of lineage-enriched probesets is available as Supplementary online Table 2.
To safeguard against any differences in hematopoietic cell lineages observed between our PAXgene™ dataset and others being due to abnormal levels of a particular cell lineage in the individual donor used in this study, a full haematological analysis of blood samples from this individual was carried out and values were shown to be within healthy boundaries (Table 3).
Probesets determined to be hematopoietic cell lineage enriched remain well represented in the PAXgene™/sscDNA 2 hr/24 hr combined dataset (Table 4, Fig. 4), with several reaching 100% representation. Macrophage and dendritic cell transcript "signatures" appear most under-represented in the PAXgene™/sscDNA whole blood expression data (71% and 75% respectively). Notably, those probesets determined to be sensitive to PAXgene™ whole blood incubation period do not appear to be contributed by any particular individual lineage under study. Whilst the high degree of representation of individual hematopoietic lineage transcriptomes in PAXgene™/sscDNA data was encouraging, it was important to establish that this was not due solely to high abundance transcripts being interrogated. We therefore investigated the relationship between the abundance of transcripts identified as being indicative of a particular lineage and the transcript abundance observed in the PAXgene™/sscDNA data (Fig. 4). The small number of lineage-enriched probesets not represented in the PAXgene™/sscDNA dataset appear over a range of abundances but are mostly lowly expressed in both the original and PAXgene™ datasets. Importantly, several low abundance genes within the isolated cell data sets are robustly detected in the PAXgene™ dataset (Fig. 4). Interestingly, there is a general concordance between the levels of expression in the original cell lineage-specific data and in our PAXgene™ whole blood data set. There is only low representation (0%-2%) of any of these lineage-specific probesets among those determined to be unstable as a result of storage time; therefore no loss of a particular cell lineage accounts for these changes.
Table 3. Donor hematological full blood counts.
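The per-lineage representation figures discussed above amount to set overlaps, which can be sketched as follows. This is a hedged illustration assuming that probeset identifiers are directly comparable between the lineage datasets and the whole blood dataset; the function name and all identifiers are hypothetical, and no values from Table 4 are reproduced.

```python
def lineage_representation(signature, detected, storage_sensitive):
    """Summarise how a lineage signature is represented in whole blood.

    signature:         probeset IDs enriched in one lineage
    detected:          probeset IDs retained in the whole blood dataset
    storage_sensitive: probeset IDs differing between 2 hr and 24 hr storage
    """
    signature = set(signature)
    n_detected = len(signature & set(detected))
    n_sensitive = len(signature & set(storage_sensitive))
    return {
        "n_signature": len(signature),
        "pct_detected": 100 * n_detected / len(signature),
        "pct_storage_sensitive": 100 * n_sensitive / len(signature),
    }


# Toy data with hypothetical probeset IDs.
toy_summary = lineage_representation(
    signature=["ps1", "ps2", "ps3", "ps4"],
    detected=["ps1", "ps2", "ps3", "ps9"],
    storage_sensitive=["ps9"],
)
```

Running this once per lineage would give one row of a summary comparable in structure to Table 4.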

Discussion
The ability to robustly detect and measure biological variation associated with disease or pharmacologic intervention in whole blood clinical samples is key to the future discovery of surrogate disease biomarkers leading to translational research and personalised medicine. To this end we have assessed a workflow which allows microarray gene expression analysis of whole blood using PAXgene™ RNA stabilisation and subsequent sscDNA microarray probe generation without the requirement for globin reduction practices.
Despite claims that PAXgene™ technology efficiently stabilises whole blood samples for up to 3 days at room temperature, reports of reduced RNA quality with longer PAXgene™ storage and, importantly, of instability of a small number of studied transcripts despite retention of RNA integrity (Kagedal et al. 2005), within these recommended storage limits, warranted further investigation of transcriptome stability in PAXgene™-collected whole blood. To assess the stability of the entire whole blood transcriptome under storage conditions reflecting likely sample collection schema in clinical environments and within the recommended limits of the PAXgene™ technology, we used Affymetrix GeneChip array technology and PAXgene™-collected whole blood stored at room temperature for either 2 or 24 hrs.
Our data demonstrate that PAXgene™-stabilised peripheral blood samples stored at room temperature for either 2 hrs or 24 hrs yield high quality RNA and generate ample sscDNA microarray probes for gene expression analysis, with subsequent arrays being of high quality with high present calls (60%-62%), despite the omission of any globin interference reduction strategies. As would be expected, further analysis of array datasets revealed significant contribution of (a) experimental day and (b) incubation period to overall variance within the data. However, despite the preservation of RNA quality, storage for 24 hrs at room temperature resulted in significant transcriptome alterations compared to storage for 2 hrs, with the majority (92.1%) representing probesets under-represented in 24 hr samples at levels greater than 1.5-fold. We interpret this to indicate that intracellular RNA stabilisation is not 100% complete following 2 hrs storage at room temperature. Those transcripts for which we were able to further investigate polarity of any degradation effect substantiated this notion, with significantly lower signal intensity being attributed to the 3' end of transcripts. Whilst this is in agreement with incomplete RNA stabilisation, it also indicates that this does not necessarily result in canonical RNA degradation, with these data indicating loss of poly(A) sequences, the effects of which would be expected to manifest most profoundly in procedures utilising poly(A) tails for reverse transcription. The clinical implications of these findings are significant, as variation in sample storage or transportation time will lead to ex vivo transcriptome alterations which may mask any clinically relevant biological variation present within those transcripts affected by storage artefacts. However, it is important to note that Affymetrix expression array GeneChips are focussed on the 3' end of transcripts and this should be considered when evaluating our data in context with competitor microarray platforms. Whilst size, nucleotide composition, secondary structure properties and abundance of transcripts could all conceivably account in part for the storage time dependent transcriptome alterations noted here, we also considered it possible, albeit unlikely after more than 2 hours, that this could indicate differential lysis of a particular hematopoietic lineage within the PAXgene™ lysis matrix, thereby fostering RNA degradation within those cells.
Table 4 legend. The number of probesets determined, as outlined in the materials and methods section, to be most indicative of specific hematopoietic cell lineages from in-house generated array data and from GSE3982 (Jeffrey et al. 2006) and GSE1133 (Su et al. 2004) is indicated. The number or proportion of these present in processed and filtered data derived from hybridisation of sscDNA probes derived from RNA isolated from PAXgene™-collected whole blood is indicated. The number of those probesets for each lineage which were also identified to be incubation period sensitive is also presented. NK cells: Natural Killer cells.
Notably, signal intensities of the vast majority (74%) of those transcripts identified as being PAXgene™ storage period sensitive at 24 hrs are within the upper 50% of the signal intensity distribution for those probesets obtained at 2 hrs (data not shown), indicating that increased periods of whole blood storage in the PAXgene™ system predominantly affect transcripts of highest abundance.
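The two summary figures discussed above (the share of storage-sensitive probesets that are under-represented at 24 hrs, and the share of those probesets lying in the upper half of the 2 hr signal distribution) can be illustrated with a minimal Python sketch. The probeset identifiers, signal values and function names are hypothetical and are not taken from the study data. The sketch assumes positive, linear-scale signals, applies only the 1.5-fold threshold quoted in the text, and omits the statistical testing used in the actual analysis.

```python
from statistics import median

FOLD_THRESHOLD = 1.5  # fold-change cut-off quoted in the text


def classify_storage_sensitive(signal_2h, signal_24h, threshold=FOLD_THRESHOLD):
    """Label probesets whose signal changes by more than `threshold`-fold.

    Assumes positive, linear-scale (not log-transformed) signals.
    Returns {probeset: "under" | "over"}, where "under" means the
    signal is lower at 24 hrs than at 2 hrs.
    """
    labels = {}
    for probeset, s2 in signal_2h.items():
        s24 = signal_24h[probeset]
        if s2 / s24 > threshold:
            labels[probeset] = "under"
        elif s24 / s2 > threshold:
            labels[probeset] = "over"
    return labels


def fraction_under_represented(labels):
    """Share of storage-sensitive probesets that are lower at 24 hrs."""
    return sum(1 for v in labels.values() if v == "under") / len(labels)


def fraction_in_upper_half(labels, signal_2h):
    """Share of storage-sensitive probesets above the median 2 hr signal."""
    cutoff = median(signal_2h.values())
    return sum(1 for p in labels if signal_2h[p] > cutoff) / len(labels)


# Hypothetical toy signals for six probesets (illustration only).
signal_2h = {"p1": 800, "p2": 600, "p3": 100, "p4": 50, "p5": 400, "p6": 30}
signal_24h = {"p1": 400, "p2": 300, "p3": 95, "p4": 20, "p5": 390, "p6": 60}

labels = classify_storage_sensitive(signal_2h, signal_24h)
under_fraction = fraction_under_represented(labels)
upper_fraction = fraction_in_upper_half(labels, signal_2h)
print(labels, under_fraction, upper_fraction)
```

In this toy example four probesets pass the threshold, three of which are under-represented at 24 hrs; the corresponding proportions reported in the text (92.1% and 74%) come from the full array data, not from this sketch.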
By comparing our PAXgene™ sample data set with expression data for transcripts determined to be indicative of specific lineage populations, we demonstrated that the 24 hr under-represented transcripts appeared to be cell-lineage independent. Similarly, pathway mapping analysis did not reveal any obvious networks that could be attributed to cell-specific biological functions (data not shown). The underlying factors influencing transcript-specific degradation and significant storage time dependent under-representation within the PAXgene™ samples are therefore undetermined at this time.
Our data are concordant with several reports that RNA integrity is maintained after extended room temperature storage of PAXgene™ collected blood samples (Chai et al. 2005; Thach et al. 2003; Wang et al. 2004). However, we have shown degradation and significant under-representation of a large number of transcripts following just 24 hrs room temperature storage, supporting previous reports (Kagedal et al. 2005) of sample instability, but contrary to others showing no change in transcript levels up to 5 days of room temperature storage (Chai et al. 2005). It is likely that this latter discrepancy is due to the use of qPCR methods to analyse only a small number of specific transcripts (Chai et al. 2005; Thach et al. 2003; Wang et al. 2004), rather than global profiling approaches. Reports have also compared longer storage times (24 hrs and 5 days), demonstrating no alterations in transcript representation (Chai et al. 2005). However, without shorter time points it is not possible to determine whether the abundance of these transcripts had already altered from basal levels at the time of collection.
We report the generation of high quality global gene expression data from PAXgene™ peripheral blood samples, which we believe will be vital for future biomarker discovery and interrogation within clinical samples. However, considering the transcriptome alterations observed within the storage time range recommended by the manufacturers, it will be imperative to instigate a fixed storage time for all clinical samples intended for direct comparison within a given study.
A further aim of this study was to assess the suitability of a single-primer isothermal linear amplification (SPIA) method for application with Affymetrix GeneChip microarrays for whole blood transcriptome analysis, potentially mitigating the need for globin reduction. Globin reduction practices in whole blood RNA sample processing have previously been shown to alter the global gene expression profiles obtained (Feezor et al. 2004; Liu et al. 2006) and to significantly increase sample handling. Using this SPIA technology we synthesised high quality sscDNA from PAXgene™ samples in ample quantity for hybridisation to microarrays. sscDNA probes for microarrays have previously been shown to provide high specificity of hybridisation and consequently increased percentage present calls, as a result of reduced cross-hybridisation to mismatch control probes, and with this workflow we achieved a minimum of 60 percent present calls on Affymetrix GeneChips. Therefore, we have demonstrated that NuGEN Technologies Whole Blood Ovation™ RNA amplification for probe generation permits analysis of gene expression profiles from whole blood with minimal handling steps and without the potentially transcriptome-modifying globin reduction practices that were previously required and recommended by the Best Practices Blood Handling for Array Based Expression Working Group (www.affymetrix.com/community/standards/blood_protocol.affx). Importantly, the gene expression data sets obtained from this workflow provide excellent representation of individual hematopoietic cell lineages across a wide spectrum of transcript abundances, thereby supporting the notion that efficient whole blood expression profiling studies can be achieved without requirement for lineage or cell population enrichment strategies and without loss of data contributed by specific lineages.
Whilst the high levels of representation of transcripts indicative of specific cell lineages in the PAXgene™ / sscDNA dataset were remarkable, as expected several of these did not reach 100%, and we consider this a likely reflection of the methods employed for individual lineage enrichment. Notably, those lineages with the lowest representation (approximately 70%) result from enrichment schemes which include subsequent in vitro stimulation and culture, and whose transcriptome profiles are therefore likely to be least reflective of those lineages in vivo.
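The per-lineage representation discussed above amounts to a set overlap between each lineage signature and the probesets detected in whole blood. The following Python sketch illustrates that calculation under stated assumptions: the function name, probeset identifiers and set contents are hypothetical, and the selection of signature probesets described in the materials and methods section is not reproduced here.

```python
def lineage_representation(signatures, detected, storage_sensitive):
    """Summarise how well each lineage signature is represented.

    signatures:        {lineage: set of signature probesets}
    detected:          set of probesets present in the filtered whole blood data
    storage_sensitive: set of probesets flagged as incubation period sensitive
    """
    summary = {}
    for lineage, probesets in signatures.items():
        present = probesets & detected
        summary[lineage] = {
            "signature_size": len(probesets),
            "percent_present": 100.0 * len(present) / len(probesets),
            "storage_sensitive": len(present & storage_sensitive),
        }
    return summary


# Hypothetical toy inputs (illustration only).
signatures = {
    "NK cells": {"nk1", "nk2", "nk3", "nk4"},
    "B cells": {"b1", "b2"},
}
detected = {"nk1", "nk2", "nk3", "b1", "b2", "x1"}
storage_sensitive = {"nk2", "x1"}

summary = lineage_representation(signatures, detected, storage_sensitive)
for lineage, row in summary.items():
    print(lineage, row)
```

A lineage whose signature was derived from stimulated or cultured cells would be expected to show a lower `percent_present` in such a table, in line with the interpretation given above.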
We conclude that robust global gene expression profiling from whole blood, without the requirement for globin reduction processing schemes and with maximal representation of individual hematopoietic lineages, can be achieved by combining PAXgene™ RNA stabilisation procedures, albeit with careful consideration given to fixed incubation periods for all samples within a clinical collection, with the NuGEN Technologies Whole Blood Ovation™ RNA amplification system for microarray probe generation.

Disclosure
The authors report no conflicts of interest.