Background: Alternative polyadenylation is commonly examined using cDNA sequencing, which is known to be affected by template-switching artifacts. However, the effects of such template-switching artifacts on alternative polyadenylation are generally disregarded, while alternative polyadenylation artifacts are attributed to internal priming. Results: Here, we analyzed both long-read cDNA sequencing and direct RNA sequencing data of two organisms, generated by different sequencing platforms. We developed a filtering algorithm which takes into consideration that template-switching can be a source of artifactual polyadenylation when filtering out spurious polyadenylation sites. The algorithm outperformed the conventional internal priming filters based on comparison to direct RNA sequencing data. We also showed that the polyadenylation artifacts arise in cDNA sequencing at consecutive stretches of as few as three adenines. There was no substantial difference between the lengths of poly(A) tails at the artifactual and the true transcriptional end sites even though it is expected that internal priming artifacts have shorter poly(A) tails than genuine polyadenylated reads. Conclusions: Our findings suggest that template switching plays an important role in the generation of spurious polyadenylation and support the need for more rigorous filtering of artifactual polyadenylation sites in cDNA data, or that alternative polyadenylation should be annotated using native RNA sequencing.
ASJC Scopus subject areas