Independent validation of induced overexpression efficiency across 242 experiments shows a success rate of 39%

Gyöngyi Munkácsy, Péter Herman, B. Györffy

Research output: Contribution to journal › Article

Abstract

Although numerous studies containing induced gene expression have already been published, independent authentication of their results has not yet been performed. Here, we utilized available transcriptomic data to validate the achieved efficiency in overexpression studies. Microarray data of experiments containing cell lines with induced overexpression in one or more genes were analyzed. Altogether, 342 studies were processed; these include 242 different genes overexpressed in 184 cell lines. The final database includes 4,755 treatment-control sample pairs. Successful gene induction (fold change induction over 1.44) was validated in 39.3% of all genes at p < 0.05. The number of repetitions within a study (p < 0.0001) and the type of vector used (p = 0.023) had a significant impact on overexpression efficacy. In summary, over 60% of studies failed to deliver a reproducible overexpression. To achieve higher efficiency, robust and strict study design with multi-level quality control will be necessary.
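The success criterion quoted in the abstract (fold change over 1.44) can be illustrated with a minimal sketch. The abstract does not say how fold change was computed or which statistical test produced the p-values, so the ratio-of-means definition, the function names, and the expression values below are assumptions for illustration only, and the significance test is left out entirely.

```python
from statistics import mean

FOLD_CHANGE_CUTOFF = 1.44  # threshold quoted in the abstract


def fold_change(treated, control):
    """Ratio of mean treated to mean control expression (assumed definition)."""
    return mean(treated) / mean(control)


def is_induced(treated, control, cutoff=FOLD_CHANGE_CUTOFF):
    """True when the fold change exceeds the cutoff; no significance test here."""
    return fold_change(treated, control) > cutoff


# Hypothetical expression values (treated, control), not data from the study.
examples = {
    "gene_a": ([300.0, 320.0, 280.0], [100.0, 110.0, 90.0]),
    "gene_b": ([105.0, 98.0, 101.0], [100.0, 102.0, 99.0]),
}

# Classify each gene and compute the share that passes the cutoff.
results = {name: is_induced(t, c) for name, (t, c) in examples.items()}
success_rate = sum(results.values()) / len(results)
print(results, success_rate)
```

A full validation along the lines described would also need a significance test across the treatment-control pairs of each gene, which this sketch does not attempt.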

Original language: English
Article number: 343
Journal: Scientific Reports
Volume: 9
Issue number: 1
DOI: 10.1038/s41598-018-36122-8
Publication status: Published - Dec 1 2019

Fingerprint

Genes
Cell Line
Quality Control
Databases
Gene Expression

ASJC Scopus subject areas

  • General

Cite this

Independent validation of induced overexpression efficiency across 242 experiments shows a success rate of 39%. / Munkácsy, Gyöngyi; Herman, Péter; Györffy, B.

In: Scientific Reports, Vol. 9, No. 1, 343, 01.12.2019.

@article{b1b2ef630a6a4990a39b036155890489,
title = "Independent validation of induced overexpression efficiency across 242 experiments shows a success rate of 39{\%}",
abstract = "Although numerous studies containing induced gene expression have already been published, independent authentication of their results has not yet been performed. Here, we utilized available transcriptomic data to validate the achieved efficiency in overexpression studies. Microarray data of experiments containing cell lines with induced overexpression in one or more genes were analyzed. All together 342 studies were processed, these include 242 different genes overexpressed in 184 cell lines. The final database includes 4,755 treatment-control sample pairs. Successful gene induction (fold change induction over 1.44) was validated in 39.3{\%} of all genes at p < 0.05. Number of repetitions within a study (p < 0.0001) and type of used vector (p = 0.023) had significant impact on successful overexpression efficacy. In summary, over 60{\%} of studies failed to deliver a reproducible overexpression. To achieve higher efficiency, robust and strict study design with multi-level quality control will be necessary.",
author = "Gy{\"o}ngyi Munk{\'a}csy and P{\'e}ter Herman and B. Gy{\"o}rffy",
year = "2019",
month = "12",
day = "1",
doi = "10.1038/s41598-018-36122-8",
language = "English",
volume = "9",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
number = "1",
pages = "343",
}

TY - JOUR

T1 - Independent validation of induced overexpression efficiency across 242 experiments shows a success rate of 39%

AU - Munkácsy, Gyöngyi

AU - Herman, Péter

AU - Györffy, B.

PY - 2019/12/1

Y1 - 2019/12/1

AB - Although numerous studies containing induced gene expression have already been published, independent authentication of their results has not yet been performed. Here, we utilized available transcriptomic data to validate the achieved efficiency in overexpression studies. Microarray data of experiments containing cell lines with induced overexpression in one or more genes were analyzed. All together 342 studies were processed, these include 242 different genes overexpressed in 184 cell lines. The final database includes 4,755 treatment-control sample pairs. Successful gene induction (fold change induction over 1.44) was validated in 39.3% of all genes at p < 0.05. Number of repetitions within a study (p < 0.0001) and type of used vector (p = 0.023) had significant impact on successful overexpression efficacy. In summary, over 60% of studies failed to deliver a reproducible overexpression. To achieve higher efficiency, robust and strict study design with multi-level quality control will be necessary.

UR - http://www.scopus.com/inward/record.url?scp=85060365531&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060365531&partnerID=8YFLogxK

U2 - 10.1038/s41598-018-36122-8

DO - 10.1038/s41598-018-36122-8

M3 - Article

C2 - 30674897

AN - SCOPUS:85060365531

VL - 9

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

IS - 1

M1 - 343

ER -