The structure of the gene encoding chain c of the hemoglobin of the earthworm, Lumbricus terrestris

J Biol Chem. 1989 Nov 15;264(32):19003-8.

Abstract

The complete nucleotide sequence of the gene for chain c of the hemoglobin of the earthworm Lumbricus terrestris has been determined. The sequence of 4037 base pairs (bp) includes about 310 bp of 5'-flanking sequence and 110 bp 3' to the poly(A) site. Comparison of cDNA and genomic sequences shows four silent differences in codons, which suggest the presence of at least two genes. The coding sequence is split by two introns of 1344 and 1169 bp at highly conserved positions (Jhiang, S. M., Garey, J. R., and Riggs, A. F. (1988) Science 240, 334-336). The first intron possesses the unusual 5' splice junction sequence GC instead of GT. Many tandem triplet repeats based on (GAT) and (CCT) are present in the first intron. The second intron has nine tandem repeats based on the consensus sequence AAGGAAGGAGGTC. Each intron has several exact inverted repeats of 9-10 bp that might result in loops of 78-140 nucleotides in the RNA prior to splicing. The sequence at positions 2423-2644 in the second intron is about 65% identical to parts of several genes found in yeast mitochondria and in DNA from several other organisms.
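The inverted-repeat observation can be illustrated with a minimal sketch. It is not the authors' method, only one plausible way to scan a sequence for exact inverted repeats whose arms are separated by a loop of a given size. The function names `reverse_complement` and `find_inverted_repeats`, the 0-based coordinates, and the default parameters (9 bp arms, loops of 78-140 nucleotides, taken from the figures quoted above) are assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq: str, arm_len: int = 9,
                          min_loop: int = 78, max_loop: int = 140):
    """Find exact inverted repeats: an arm followed downstream by its
    reverse complement, with min_loop..max_loop nucleotides in between.

    Returns a list of (left_start, right_start, left_arm) tuples using
    0-based positions (hypothetical convention, not the paper's numbering).
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for i in range(n - arm_len + 1):
        arm = seq[i:i + arm_len]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in "ACGT" for base in arm):
            continue
        target = reverse_complement(arm)
        # The right arm may start only where the loop length is in range.
        first = i + arm_len + min_loop
        last = min(i + arm_len + max_loop, n - arm_len)
        for j in range(first, last + 1):
            if seq[j:j + arm_len] == target:
                hits.append((i, j, arm))
    return hits


# Toy sequence (invented, not from GenBank J05161): a 9 bp arm,
# an 80-nucleotide loop, then the arm's reverse complement.
toy = "GATTACAGC" + "A" * 80 + "GCTGTAATC"
hits = find_inverted_repeats(toy)
```

Running the same function with `arm_len=10` would cover the longer arms mentioned in the abstract. The scan is quadratic in the worst case, which is acceptable for introns of roughly 1.2-1.3 kb.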

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • DNA / genetics
  • Gene Library
  • Genes*
  • Hemoglobins / genetics*
  • Introns
  • Macromolecular Substances
  • Molecular Sequence Data
  • Oligochaeta / genetics*
  • Restriction Mapping
  • Sequence Homology, Nucleic Acid

Substances

  • Hemoglobins
  • Macromolecular Substances
  • DNA

Associated data

  • GENBANK/J05161