An exon is any part of a gene that will encode a part of the final mature RNA produced by that gene after introns have been removed by RNA splicing. The term exon refers to both the DNA sequence within a gene and to the corresponding sequence in RNA transcripts. In RNA splicing, introns are removed and exons are covalently joined to one another as part of generating the mature messenger RNA. Just as the entire set of genes for a species constitutes the genome, the entire set of exons constitutes the exome.
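The splicing step described above can be illustrated with a toy sketch. The sequence, the exon coordinates, and the function name `splice` are all hypothetical, chosen only to show introns being dropped and exons joined in order; real splicing depends on splice-site recognition that is not modeled here.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in transcript order, dropping everything between them.

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    ordered = sorted(exons)
    # Exons must not overlap, otherwise sequence would be duplicated.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("exons overlap")
    return "".join(pre_mrna[start:end] for start, end in ordered)


# Toy pre-mRNA: exon 1, one intron, exon 2 (invented sequence).
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_mrna, exons)
print(mature)
```

In this toy case the 12-nucleotide intron is removed and the two 6-nucleotide exons are concatenated into a 12-nucleotide mature sequence.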

The term exon derives from the expressed region and was coined by American biochemist Walter Gilbert in 1978: "The notion of the cistron… must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger – which I suggest we call introns (for intragenic regions) – alternating with regions which will be expressed – exons."

This definition was originally made for protein-coding transcripts that are spliced before being translated. The terminology was later extended to rRNA and tRNA, whose precursors also contain sequences that are removed, and to RNA molecules that originate from different parts of the genome and are then ligated by trans-splicing.

Although unicellular eukaryotes such as yeast have either no introns or very few, metazoan and especially vertebrate genomes have a large fraction of non-coding DNA. For instance, in the human genome only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. The small exonic fraction can provide a practical advantage in omics-aided health care (such as precision medicine) because it makes commercialized whole exome sequencing a smaller and less expensive challenge than commercialized whole genome sequencing. The large variation in genome size and C-value across life forms has posed an interesting challenge called the C-value enigma.
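To give a sense of scale, the percentages above can be turned into approximate base-pair counts. This is a rough sketch: the genome size of 3.1 billion base pairs is an assumption not stated in the text, and the names `GENOME_BP`, `FRACTIONS`, and `region_sizes` are illustrative only.

```python
# Assumed haploid human genome size in base pairs (not from the text above).
GENOME_BP = 3_100_000_000

# Fractions quoted in the text: exons 1.1%, introns 24%, intergenic 75%.
FRACTIONS = {"exons": 0.011, "introns": 0.24, "intergenic": 0.75}


def region_sizes(genome_bp: int, fractions: dict[str, float]) -> dict[str, int]:
    """Return the approximate number of base pairs in each region class."""
    return {name: round(genome_bp * frac) for name, frac in fractions.items()}


sizes = region_sizes(GENOME_BP, FRACTIONS)

# How many times smaller the exome target is than the whole genome.
fold_reduction = GENOME_BP / sizes["exons"]

for name, bp in sizes.items():
    print(f"{name}: {bp / 1e6:.1f} Mb")
print(f"genome / exome: about {fold_reduction:.0f}-fold")
```

Under these assumptions the exome is on the order of tens of megabases, roughly two orders of magnitude smaller than the genome. Sequencing cost does not necessarily scale linearly with target size, so the ratio indicates only the difference in how much sequence must be covered.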

As of 2002, eukaryotic genes in GenBank had, on average, 5.48 exons per gene, and the average exon encoded 30 to 36 amino acids. While the longest exon in the human genome is 11,555 bp long, several exons have been found to be only 2 bp long. A single-nucleotide exon has been reported from the Arabidopsis genome.
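These figures relate to one another through the three-nucleotide codon. The short sketch below converts between amino acids and nucleotides; the helper names are hypothetical, and the codon counts are upper bounds that ignore reading-frame phase and any untranslated sequence within an exon.

```python
NT_PER_CODON = 3  # each amino acid is specified by one three-nucleotide codon


def coding_nt(amino_acids: int) -> int:
    """Nucleotides needed to encode the given number of amino acids."""
    return amino_acids * NT_PER_CODON


def max_whole_codons(exon_bp: int) -> int:
    """Upper bound on complete codons that fit in an exon of this length."""
    return exon_bp // NT_PER_CODON


# Average exon from the text: 30 to 36 amino acids.
average_exon_nt = (coding_nt(30), coding_nt(36))

# Extremes from the text: an 11555 bp exon and 2 bp exons.
longest_codons = max_whole_codons(11555)
shortest_codons = max_whole_codons(2)

print(average_exon_nt, longest_codons, shortest_codons)
```

An exon shorter than three nucleotides cannot hold a complete codon on its own, so any codon it contributes to must be completed by the neighboring exons.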

In protein-coding genes, the exons include both the protein-coding sequence and the 5′- and 3′-untranslated regions (UTRs). Often the first exon includes both the 5′-UTR and the first part of the coding sequence, but exons containing only regions of 5′-UTR or (more rarely) 3′-UTR occur in some genes; that is, the UTRs may themselves contain introns. Some non-coding RNA transcripts also have exons and introns.
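The relationship between exons, coding sequence, and UTRs can be sketched by splitting each exon at the coding-sequence boundaries. Everything here is hypothetical: the coordinates are invented, the function `annotate_exons` is not from any library, and the gene is assumed to lie on the plus strand with 0-based, end-exclusive coordinates.

```python
def annotate_exons(exons, cds_start, cds_end):
    """Split each exon into its 5'UTR, CDS, and 3'UTR portions.

    exons: list of (start, end) pairs; cds_start and cds_end bound the coding
    sequence on the same coordinate axis. Plus strand is assumed.
    """
    annotated = []
    for start, end in sorted(exons):
        candidates = (
            ("5'UTR", (start, min(end, cds_start))),
            ("CDS", (max(start, cds_start), min(end, cds_end))),
            ("3'UTR", (max(start, cds_end), end)),
        )
        # Keep only the portions that are non-empty for this exon.
        annotated.append({label: (s, e) for label, (s, e) in candidates if s < e})
    return annotated


# Invented four-exon gene whose coding sequence runs from 120 to 280.
exons = [(0, 50), (100, 180), (250, 300), (400, 450)]
parts = annotate_exons(exons, cds_start=120, cds_end=280)

for exon, portion in zip(exons, parts):
    print(exon, portion)
```

In this invented gene the first exon is entirely 5′-UTR, the second mixes 5′-UTR with coding sequence, the third mixes coding sequence with 3′-UTR, and the fourth is entirely 3′-UTR, so both UTRs are interrupted by introns.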

This page was last edited on 15 May 2018, at 21:47.
Reference: https://en.wikipedia.org/wiki/Exons under CC BY-SA license.
