Protein folds in the worm genome.

M. Gerstein, J. Lin, H. Hegyi

Research output: Chapter in Book/Report/Conference proceeding › Chapter

16 Citations (Scopus)

Abstract

We survey the protein folds in the worm genome using pairwise and multiple-sequence comparison methods (i.e., FASTA and PSI-BLAST). Overall, we find that approximately 250 folds match approximately 8000 domains in approximately 4500 ORFs: about 32 matches per fold, involving a quarter of the total worm ORFs. We compare the folds in the worm genome to those in other model organisms, in particular yeast and E. coli, and find that the worm shares more folds with the phylogenetically closer yeast than with E. coli. There appear to be 36 folds unique to the worm compared to these two model organisms, and many of these are clearly implicated in aspects of multicellularity. The most common fold in the worm genome is the immunoglobulin fold, and many of the common folds are repeated in various combinations and permutations in multidomain proteins. In addition, we present an approach for identifying "sure" and "marginal" membrane proteins. Applied to the worm genome, it reveals a much greater relative prevalence of proteins with seven transmembrane helices than in the other completely sequenced genomes, none of which are from metazoans. Combining these analyses with some other simple filters allows one to identify ORFs that potentially code for soluble proteins of unknown fold, which may be promising targets for experimental investigation in structural genomics. A regularly updated worm fold analysis will be available from bioinfo.mbb.yale.edu/genome/worm.
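The summary figures in the abstract can be cross-checked with simple arithmetic. The sketch below uses only the rounded numbers quoted above. The variable names are illustrative, and the implied total ORF count is a back-calculation from "a quarter", not a figure stated in the abstract.

```python
# Approximate figures quoted in the abstract (all rounded by the authors).
folds = 250              # folds with at least one match
domains = 8000           # matched domains
matched_orfs = 4500      # ORFs containing at least one matched domain
fraction_of_orfs = 0.25  # "a quarter of the total worm ORFs"

# Average number of domain matches per fold.
matches_per_fold = domains / folds

# Average number of matched domains per matched ORF.
domains_per_matched_orf = domains / matched_orfs

# Total ORF count implied by the rounded fraction (an inference, not a quoted value).
implied_total_orfs = matched_orfs / fraction_of_orfs

print(f"matches per fold: {matches_per_fold:.0f}")
print(f"domains per matched ORF: {domains_per_matched_orf:.2f}")
print(f"implied total ORFs: {implied_total_orfs:.0f}")
```

The first ratio reproduces the "about 32 matches per fold" in the abstract. The second suggests that matched ORFs carry, on average, somewhat fewer than two recognized domains, which is consistent with the remark about multidomain proteins.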

Original language: English
Title of host publication: Pacific Symposium on Biocomputing
Pages: 30-41
Number of pages: 12
Publication status: Published - 2000

Fingerprint

Genome
Open Reading Frames
Proteins
Yeasts
Escherichia coli
Genomics
Immunoglobulins
Membrane Proteins

Cite this

Gerstein, M., Lin, J., & Hegyi, H. (2000). Protein folds in the worm genome. In Pacific Symposium on Biocomputing (pp. 30-41).
