Prospects for Proteomics Directed Genomic and Genetic Analyses in Disease Discoveries

Abstract: Proteomic discoveries are usually made using database searches to identify the proteins in a sample derived from cells or tissues. High-throughput searches leave a number of peptides unanalyzed for a variety of reasons, such as a posttranslational modification or a mutation that results in a change in the peptide that is not present in databases. Such mutations may be critically important in causing disease conditions. Accounts from ocular diseases are presented in which the search often returned results from non-conventional databases (such as a structural database instead of a protein database) because those databases held information about a mutant peptide. We contemplate that better algorithms, and the ability to determine the probabilities of different amino acids in the available sequence, may permit combinatorial analysis with genomics, which may help identify new disease-associated mutations directly from the sequence of the captured peptides. In addition, de novo analysis of the spectra of unidentified peptides may provide mutation or polymorphism information, enabling additional insight into the disease association of a mutation or posttranslational modification.


Introduction
Since the 1990s, mass spectrometry has gained momentum for the analysis of cell and tissue proteomes [1]. Identification of proteins through mass spectrometry of trypsin-digested peptides using a solid-phase matrix-assisted laser desorption/ionization (MALDI) device utilizes time of flight (TOF) as a measure of the mass-to-charge ratio. Calibration is performed using the masses of synthetic peptides, and the accurate masses of the measured peptides are then used to identify protein matches in databases. The lower and upper extremes of the spectrum of experimentally determined peptide masses are usually inaccurate, as calibration standards do not extrapolate in a linear fashion. This approach, often referred to as MS1 analysis (or simply MS analysis), often leads to inaccurate and erroneous identification. In this type of analysis, a number of peptide masses are often left unidentified, even within the well-calibrated region of the spectrum, when the search is restricted to specific organisms or databases. This is due to posttranslational modifications, unrecorded haplotypes or disease mutations. Despite the explosive growth of mass spectrometry and the fast pace at which new technology is becoming available in this area, a large number of biologists still apply MALDI-TOF methodology, in which this lack of identification is inherent and valuable information is left behind.
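The MS1 matching step described above can be illustrated with a minimal sketch. It assumes standard monoisotopic residue masses, a simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and singly protonated peptides as typically observed in MALDI. The function names, the 50 ppm tolerance and the toy sequence are hypothetical choices for illustration, not the procedure of any particular search engine.

```python
# Sketch of MS1 peptide mass matching (illustrative assumptions throughout).
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON


def match_masses(observed, protein, tol_ppm=50.0):
    """Map each observed [M+H]+ value to a tryptic peptide within tol_ppm."""
    hits = {}
    for pep in tryptic_digest(protein):
        theo = mh_plus(pep)
        for m in observed:
            if abs(m - theo) / theo * 1e6 <= tol_ppm:
                hits[m] = pep
    return hits


if __name__ == "__main__":
    # Toy sequence and masses; 500.0 stays unmatched, as discussed in the text.
    print(match_masses([278.153, 500.0], "AGKPLRMK"))
```

Any observed mass absent from the returned mapping corresponds to the unidentified peptide masses discussed in the text: a mutation or modification shifts the mass away from every theoretical value in the restricted database.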
Superior mass spectrometers, based on matrix ionization or electrospray ionization, utilize MS-MS (MS2) and, of late, often MSn where n is greater than 2. In MS-MS devices, the selected precursor is broken down in a collision cell with an inert gas in a chemically defined manner, from which amino acid sequences can be determined [1,2]. Database searches that utilize these peptide sequence matches suffer from some of the same limitations. That is, if the algorithm makes no allowance for a possible mutation in one or more amino acids relative to the existing information for the peptide, or for unknown posttranslational modifications of the peptide, then that peptide is not aligned against a sequence and is left aside as unanalyzed.
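To make concrete how fragment masses yield sequence, the following minimal sketch computes singly charged b- and y-ion masses for a peptide and then reads residues back from the differences between consecutive b-ions. It assumes standard monoisotopic residue masses and ideal, noise-free fragmentation; the function names, the 0.01 Da tolerance and the test peptide are hypothetical. Leucine and isoleucine are isobaric, so both are reported when either fits.

```python
# Sketch of b/y fragment ions and ladder reading (idealized, illustrative only).
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Return singly charged b- and y-ion masses (b1..b(n-1), y1..y(n-1))."""
    masses = [RESIDUE_MASS[a] for a in peptide]
    n = len(peptide)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


def residues_for_delta(delta, tol=0.01):
    """All residues whose mass equals delta within tol (I and L are isobaric)."""
    return sorted(a for a, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)


def read_ladder(ions, tol=0.01):
    """Call residues from the differences between consecutive ladder ions."""
    return [residues_for_delta(hi - lo, tol) for lo, hi in zip(ions, ions[1:])]


if __name__ == "__main__":
    b, y = fragment_ions("GASK")
    print(read_ladder(b))  # residue calls between consecutive b-ions
```

An empty list at some step of the ladder would mark a mass difference that no unmodified residue explains, which is the situation in which a modification or an unexpected residue would need to be considered.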
Unrecorded mutations and posttranslational modifications are thus two of the reasons that a number of peptides obtained in mass spectrometry experiments remain unidentified. Some search programs (for example, some versions of ProteinLynx™ software) also look into structural databases and find a match because the peptide sequence is present in a structure database rather than in the regular protein or gene sequence databases. However, improving search algorithms and incorporating the possibility of mutation may enable a mutation to be captured directly from the peptide, opening the way to further investigation at the genomic level and to targeted genetic analysis.
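One simple way to incorporate the possibility of mutation into a mass-based search is sketched below: for an observed peptide mass that matches nothing, each database peptide is tested for a single amino acid substitution whose mass shift explains the difference. This is only an illustration under stated assumptions (monoisotopic masses, one substitution per peptide, a hypothetical 0.01 Da tolerance, and no check for substitutions that create or remove a trypsin cleavage site); the function names are hypothetical and it is not the algorithm of any existing search tool. The example uses the well-known sickle hemoglobin substitution (Glu to Val) in the N-terminal tryptic peptide of β-globin purely as a familiar test case.

```python
# Sketch of a single-substitution mass search (illustrative assumptions only).
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON


def single_substitutions(observed_mh, db_peptides, tol_da=0.01):
    """List (peptide, position, original, replacement) tuples whose
    substitution mass shift explains observed_mh within tol_da.
    Positions are 1-based within the peptide."""
    candidates = []
    for pep in db_peptides:
        delta = observed_mh - mh_plus(pep)
        for pos, orig in enumerate(pep, start=1):
            for new, mass in RESIDUE_MASS.items():
                shift = mass - RESIDUE_MASS[orig]
                if new != orig and abs(shift - delta) <= tol_da:
                    candidates.append((pep, pos, orig, new))
    return candidates


if __name__ == "__main__":
    # Simulate the observed mass of the variant peptide, then search the
    # reference peptide for substitutions that would explain it.
    observed = mh_plus("VHLTPVEK")
    for hit in single_substitutions(observed, ["VHLTPEEK"]):
        print(hit)
```

Mass alone cannot say which of the two glutamate residues is replaced, so both positions are returned; MS-MS fragment data or targeted genomic sequencing would be needed to localize and confirm the change.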

Identification of Mutant Proteins in Ocular Diseases

Glaucoma
Proteomic analysis of trabecular meshwork (TM) tissue (Fig. 1), a region that offers increased resistance to aqueous outflow in the anterior eye chamber, identified a number of mutant proteins (Table 1) in glaucomatous TM tissue compared to normal controls [3]. A number of the identified mutant proteins were associated with the circulatory system, consistent with the vascular theory of glaucoma. Two proteins were identified from the structure database: human lysozyme mutant A96L, W109H (Accession number: 1BB4) and hemoglobin A chain (1C7D). This is perhaps because a match was readily identifiable in the structure database but not in the protein database. The other identified mutant proteins are hbbm fused globin (Accession number: Q14477), Hb-S-Wake (Accession number: AAN11320), mutant keratin 9 (O00109) and kerato-epithelin (Q15582; Table 1). One interesting aspect of these mutations is that they occur with high frequency among African-Americans [3], which is consistent with the clinical finding that African-Americans have a relatively higher prevalence of glaucoma [4].

Pseudoexfoliation syndrome
[6][7] Antibody patterns in the aqueous humor from glaucoma and pseudoexfoliation syndrome patients have also been subjected to mass spectrometric pattern determination [8]. Variant peptides of lysyl oxidase-like 1 (LOXL1) were detected directly in the pseudoexfoliation syndrome material [7].

Concluding Remarks
Proteomics is applied to identify the proteins in cells and tissues. The peptide sequence of a given protein may differ from the translated product of the gene sequence in a database for a variety of reasons, such as gene mutation, gene splicing and posttranslational modification. De novo sequencing of unidentified peptide masses, or better algorithms that incorporate mutation or posttranslational modification, may aid in searches. In the future, as a result of such efforts, direct identification of a mutant peptide may shorten the path to genomic and genetic discoveries linking the mutant product with disease or pathogenic conditions.

Figure 1. A schematic diagram of the eye and an enlargement of the anterior chamber angle. The regions of the eye are as indicated.

Table 1. Mutant proteins identified in glaucomatous TM. SwissProt, NCBI protein and NCBI structure accession numbers are in regular, italic and bold italic type, respectively.