The 3' end of the 20-kb genome of the Purdue strain of
porcine transmissible gastroenteritis coronavirus (TGEV) was copied into
cDNA after priming with
oligo(dT), and the double-stranded product was cloned into the PstI site of the pUC9 vector. One 2.0-kb clone contained part of the
poly(A) tail and was sequenced in its entirety by the chemical method of Maxam and Gilbert. A second clone, of 0.7 kb, also contained part of the
poly(A) tail and was sequenced in part to confirm the primary structure of the extreme 3' end of the genome. Examination of open reading frames identified two potential, nonoverlapping genes within the 3'-terminal 1663-base sequence. The first gene encodes a 382-amino
acid protein of 43,426 mol wt that is the apparent
nucleocapsid protein on the basis of size, chemical properties, and amino acid sequence homology with other
coronavirus nucleocapsid proteins. It is flanked on its 5' side by at least part of the matrix
protein gene. The second gene encodes a hypothetical 78-amino
acid protein of 9,101 mol wt that is hydrophobic at both ends. A 3'-proximal noncoding sequence of 276 bases was also determined, and a conserved stretch of 9
nucleotides near the
poly(A) tail was found to be common among TGEV, the mouse
hepatitis coronavirus, and the avian infectious
bronchitis coronavirus.
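
The examination of open reading frames mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this work: it is a naive forward-strand scan of a cDNA sequence for ATG-initiated reading frames, and the function name `find_orfs` and the `min_codons` threshold are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: a naive forward-strand ORF scan.
# find_orfs and min_codons are hypothetical names, not taken from the source.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_codons=50):
    """Return a sorted list of (start, end, n_codons) for ATG-initiated ORFs.

    start and end are 0-based and end-exclusive; the span includes the stop
    codon. n_codons counts coding codons only (the stop codon is excluded).
    Only the given strand is scanned, in all three reading frames.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # close this ORF and look for the next ATG
    return sorted(orfs)


# Toy sequence (made up, not TGEV data): ATG, five GCT codons, then TAA.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(find_orfs(toy, min_codons=3))
```

Under this convention a gene encoding a 382-amino acid protein would be reported with `n_codons` equal to 382 and a span of 1149 bases (382 × 3, plus 3 for the stop codon).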