Protein databases are heavily contaminated with erroneous (mispredicted, abnormal and incomplete) sequences and these erroneous data significantly distort the conclusions drawn from genome-scale protein sequence analyses. In our earlier work we described the MisPred resource that serves to identify erroneous sequences; here we present the FixPred computational pipeline that automatically corrects sequences identified by MisPred as erroneous. The current version of the associated FixPred database contains corrected UniProtKB/Swiss-Prot and NCBI/RefSeq sequences from Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Danio rerio, Fugu rubripes, Ciona intestinalis, Branchostoma floridae, Drosophila melanogaster and Caenorhabditis elegans; future releases of the FixPred database will include corrected sequences of additional Metazoan species. The FixPred computational pipeline and database (http://www.fixpred.com) are easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in a variety of formats.
ASJC Scopus subject areas
- Information Systems
- Biochemistry, Genetics and Molecular Biology(all)
- Agricultural and Biological Sciences(all)