The strikingly wide and bimodal gene distribution exhibited by the human genome has prompted us to study the correlations between EST-counts (expression levels) and base composition of genes, especially since existing data are contradictory. Here we investigate how cDNA library preparation affects the GC distributions of ESTs and/or genes found in the library, and address consequences for expression studies. We observe that strongly anomalous GC distributions often indicate experimental biases or deficits during their preparation. We propose the use of compositional distributions of raw ESTs from a cDNA library, and/or of the genes they represent, as a simple and effective tool for quality control.
All Science Journal Classification (ASJC) codes
- Structural Biology
- Molecular Biology
- Cell Biology