Compositional gene landscapes in vertebrates

Stéphane Cruveiller, Kamel Jabbari, Oliver Clay, Giorgio Bernardi

Research output: Contribution to journal, Article, peer-review

16 Scopus citations


The existence of a well-conserved linear relationship between GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In human, well-curated coding sequences now cover at least 15%-30% of the estimated total gene set. Our analysis of the landscape defined by this gene set revealed not only the well-documented linear crest, but also the presence of several peaks and valleys along that crest, a property that was also indicated in two other warm-blooded vertebrates represented by large gene databases, that is, mouse and chicken. GC2 is the sum of eight amino acid frequencies, whereas GC3 is linearly related to the GC level of the chromosomal region containing the gene. The landscapes therefore portray relations between proteins and the DNA environments of the genes that encode them.
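As a rough illustration of the two variables, the sketch below computes GC2 and GC3 for an in-frame coding sequence and bins a set of genes into a coarse two-dimensional count grid. It is not the authors' pipeline: the function names `gc_by_codon_position` and `landscape`, the bin count, and the decision to keep every codon supplied (including a stop codon, if present) are assumptions made for this example only.

```python
from collections import Counter


def gc_by_codon_position(cds: str) -> tuple[float, float]:
    """Return (GC2, GC3) as fractions for an in-frame coding sequence.

    Hypothetical helper: trailing bases that do not fill a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    second = seq[1:usable:3]  # second base of every codon
    third = seq[2:usable:3]   # third base of every codon

    def gc_fraction(bases: str) -> float:
        return sum(base in "GC" for base in bases) / len(bases)

    return gc_fraction(second), gc_fraction(third)


def landscape(genes, bins: int = 10) -> Counter:
    """Count genes per (GC2 bin, GC3 bin) cell: a coarse joint distribution."""
    counts = Counter()
    for cds in genes:
        gc2, gc3 = gc_by_codon_position(cds)
        # A value of exactly 1.0 is clamped into the last bin.
        i = min(int(gc2 * bins), bins - 1)
        j = min(int(gc3 * bins), bins - 1)
        counts[(i, j)] += 1
    return counts
```

Plotting the resulting counts as a surface or contour map over the (GC2, GC3) plane would give a landscape in the sense described above; the bin width chosen here is arbitrary.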

Original language: English (US)
Pages (from-to): 886-892
Number of pages: 7
Journal: Genome Research
Issue number: 5
State: Published - May 2004
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Genetics
  • Genetics (clinical)

