Prediction and characterization of human ageing-related proteins by using machine learning

Csaba Kerepesi, Bálint Daróczy, Ádám Sturm, T. Vellai, András Benczúr

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Ageing has a huge impact on human health and the economy, but its molecular basis (regulation and mechanism) is still poorly understood. To date, more than three hundred genes, almost all of them protein-coding, have been related to human ageing. Although individual ageing-related genes and small subsets of these genes have been intensively studied, their analysis as a whole has been highly limited. To fill this gap, we extracted 21,000 features for each human protein from various databases and, using these data as input to state-of-the-art machine learning methods, classified human proteins as ageing-related or non-ageing-related. We found a simple classification model based on only 36 protein features, such as the "number of ageing-related interaction partners", "response to oxidative stress", "damaged DNA binding", "rhythmic process" and "extracellular region". The predicted values of the model quantify the relevance of a given protein to the regulation or mechanisms of the human ageing process. Furthermore, we identified new candidate proteins with strong computational evidence of an important role in ageing. Some of them, such as Cytochrome b-245 light chain (CY24A) and Endoribonuclease ZC3H12A (ZC12A), have no previous ageing-associated annotations.
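The classification idea in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not name the learning algorithm, so plain logistic regression is used here as a stand-in, and the three features, their values and the labels are invented toy data (the feature names are borrowed from the abstract). The model's output for a protein plays the role of the "predicted value" that the abstract describes as quantifying relevance to ageing.

```python
import math

# Hypothetical toy data, NOT taken from the paper.
# Each row: ([number of ageing-related interaction partners,
#             has "response to oxidative stress" annotation (0/1),
#             has "damaged DNA binding" annotation (0/1)],
#            label), where label 1 = ageing-related.
TRAIN = [
    ([5.0, 1.0, 0.0], 1),
    ([3.0, 0.0, 1.0], 1),
    ([4.0, 1.0, 1.0], 1),
    ([0.0, 0.0, 0.0], 0),
    ([1.0, 0.0, 0.0], 0),
    ([0.0, 1.0, 0.0], 0),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Return a score in [0, 1]; higher means more likely ageing-related."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(rows, epochs=2000, lr=0.1):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in rows:
            err = predict(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


weights, bias = train(TRAIN)

# Score two hypothetical, unlabeled proteins.
score_high = predict(weights, bias, [6.0, 1.0, 1.0])  # many ageing-related partners
score_low = predict(weights, bias, [0.0, 0.0, 0.0])   # no supporting features
print(f"many partners: {score_high:.3f}, no features: {score_low:.3f}")
```

In this toy setting, ranking unlabeled proteins by score is what would surface new candidates; a model with tens of features rather than thousands stays easy to inspect because each learned weight can be read directly.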

Original language: English
Article number: 4094
Journal: Scientific Reports
Volume: 8
Issue number: 1
DOI: 10.1038/s41598-018-22240-w
Publication status: Published - Dec 1 2018

Fingerprint

Proteins
Endoribonucleases
Genes
Machine Learning
Oxidative Stress
Databases
Light
DNA
Health
cytochrome b245

ASJC Scopus subject areas

  • General

Cite this

Prediction and characterization of human ageing-related proteins by using machine learning. / Kerepesi, Csaba; Daróczy, Bálint; Sturm, Ádám; Vellai, T.; Benczúr, András.

In: Scientific Reports, Vol. 8, No. 1, 4094, 01.12.2018.

@article{c6cbc5850b064197b7072a2a071e124f,
title = "Prediction and characterization of human ageing-related proteins by using machine learning",
abstract = "Ageing has a huge impact on human health and the economy, but its molecular basis (regulation and mechanism) is still poorly understood. To date, more than three hundred genes, almost all of them protein-coding, have been related to human ageing. Although individual ageing-related genes and small subsets of these genes have been intensively studied, their analysis as a whole has been highly limited. To fill this gap, we extracted 21,000 features for each human protein from various databases and, using these data as input to state-of-the-art machine learning methods, classified human proteins as ageing-related or non-ageing-related. We found a simple classification model based on only 36 protein features, such as the {"}number of ageing-related interaction partners{"}, {"}response to oxidative stress{"}, {"}damaged DNA binding{"}, {"}rhythmic process{"} and {"}extracellular region{"}. The predicted values of the model quantify the relevance of a given protein to the regulation or mechanisms of the human ageing process. Furthermore, we identified new candidate proteins with strong computational evidence of an important role in ageing. Some of them, such as Cytochrome b-245 light chain (CY24A) and Endoribonuclease ZC3H12A (ZC12A), have no previous ageing-associated annotations.",
author = "Csaba Kerepesi and B{\'a}lint Dar{\'o}czy and {\'A}d{\'a}m Sturm and T. Vellai and Andr{\'a}s Bencz{\'u}r",
year = "2018",
month = "12",
day = "1",
doi = "10.1038/s41598-018-22240-w",
language = "English",
volume = "8",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - Prediction and characterization of human ageing-related proteins by using machine learning

AU - Kerepesi, Csaba

AU - Daróczy, Bálint

AU - Sturm, Ádám

AU - Vellai, T.

AU - Benczúr, András

PY - 2018/12/1

Y1 - 2018/12/1

AB - Ageing has a huge impact on human health and the economy, but its molecular basis (regulation and mechanism) is still poorly understood. To date, more than three hundred genes, almost all of them protein-coding, have been related to human ageing. Although individual ageing-related genes and small subsets of these genes have been intensively studied, their analysis as a whole has been highly limited. To fill this gap, we extracted 21,000 features for each human protein from various databases and, using these data as input to state-of-the-art machine learning methods, classified human proteins as ageing-related or non-ageing-related. We found a simple classification model based on only 36 protein features, such as the "number of ageing-related interaction partners", "response to oxidative stress", "damaged DNA binding", "rhythmic process" and "extracellular region". The predicted values of the model quantify the relevance of a given protein to the regulation or mechanisms of the human ageing process. Furthermore, we identified new candidate proteins with strong computational evidence of an important role in ageing. Some of them, such as Cytochrome b-245 light chain (CY24A) and Endoribonuclease ZC3H12A (ZC12A), have no previous ageing-associated annotations.

UR - http://www.scopus.com/inward/record.url?scp=85043272476&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85043272476&partnerID=8YFLogxK

U2 - 10.1038/s41598-018-22240-w

DO - 10.1038/s41598-018-22240-w

M3 - Article

VL - 8

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

IS - 1

M1 - 4094

ER -