Background noise is often a major concern in metagenomic studies, and its removal requires extensive post-analytic bioinformatic filtering. This matters because significant signals may be lost when the signal-to-noise ratio is low. Plasmid residues, which are frequently present in reagents as contaminants, have not been investigated so far but may introduce a substantial bias. Here we show that plasmid sequences from different sources are omnipresent in molecular biology reagents. Using a metagenomic approach, we identified the polymerase gene (pol) of equine infectious anemia virus in human samples and traced it back to the expression plasmid used to produce a commercial reverse transcriptase. We found fragments of multiple other expression plasmids in human samples as well as in commercial polymerase preparations. Sources of plasmid contamination included the production chain of molecular biology reagents as well as contamination of reagents from the environment or from human handling of samples and reagents. Retrospective analyses of published metagenomic studies revealed inaccurate signal-to-noise differentiation. Hence, the plasmid sequences that seem to be omnipresent in molecular biology reagents may misguide conclusions derived from genomic and metagenomic datasets, and thus also clinical interpretations. Critical appraisal of metagenomic datasets for possible plasmid background noise is required to identify reliable and significant signals.