Tree-Based Methods as an Alternative to Logistic Regression in Revealing Risk Factors of Crib-Biting in Horses

Krisztina Nagy, J. Reiczigel, Andrea Harnos, Anikó Schrott, P. Kabai

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Determining the risk factors of crib-biting might help in designing its prevention. Logistic regression is a commonly used statistical method for finding risk factors, but tree-based methods are becoming more popular. An important difference between the two approaches is that logistic regression makes a number of assumptions about the underlying data, whereas tree-based methods do not. Another difference is that logistic regression can be used to derive odds ratios for the significant risk factors, whereas tree-based methods build a tree whose branching points represent the risk factors, with a probability of occurrence assigned to the end of each branch. Data on horses used for noncompetition purposes were analyzed with three statistical approaches (logistic regression, classification tree, and conditional inference tree methods) in order to compare their advantages and disadvantages. No difference was found between the two tree-based methods in the structure or prediction accuracy of the trees. Compared with them, logistic regression revealed fewer risk factors and correctly classified fewer stereotypic horses. The representation produced by tree-based methods is closer to medical reasoning, and high-order interactions among the risk factors can easily be visualized. Our results suggest that tree-based methods can be a new alternative for revealing risk factors, whether used alone or together with logistic regression.
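
The contrast the abstract draws between the two kinds of output (an odds ratio from logistic regression, a probability at each end of a branch from a tree) can be illustrated with a minimal sketch. The counts below are hypothetical and invented for illustration only; they are not data from the study, and the function names are likewise assumptions. With a single binary risk factor, the logistic regression coefficient is the log of the 2x2 odds ratio, while a one-split tree simply reports the proportion of affected horses in each leaf.

```python
import math

# Hypothetical counts, invented for illustration only (not data from the study).
# A binary risk factor is either present or absent; each horse is either a
# crib-biter (case) or not (control).
exposed_cases, exposed_controls = 12, 48
unexposed_cases, unexposed_controls = 6, 134


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table: (a/b) / (c/d)."""
    return (a * d) / (b * c)


def leaf_probability(cases, controls):
    """Proportion of cases among the animals that fall into one leaf."""
    return cases / (cases + controls)


# Logistic-regression style summary: with one binary predictor, the
# maximum-likelihood coefficient equals the log of the 2x2 odds ratio.
or_value = odds_ratio(exposed_cases, exposed_controls,
                      unexposed_cases, unexposed_controls)
beta = math.log(or_value)

# Tree style summary: a single split on the same factor gives two leaves,
# each labelled with the observed probability of the outcome.
p_exposed = leaf_probability(exposed_cases, exposed_controls)
p_unexposed = leaf_probability(unexposed_cases, unexposed_controls)

print(f"odds ratio = {or_value:.2f}, coefficient = {beta:.3f}")
print(f"leaf probability (factor present) = {p_exposed:.2f}")
print(f"leaf probability (factor absent)  = {p_unexposed:.2f}")
```

Both summaries describe the same table: converting the two leaf probabilities to odds and taking their ratio recovers the odds ratio. The difference the abstract points to appears with several factors, where a tree shows interactions as nested splits rather than as additional model terms.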

Original language: English
Pages (from-to): 21-26
Number of pages: 6
Journal: Journal of Equine Veterinary Science
Volume: 30
Issue number: 1
DOI: 10.1016/j.jevs.2009.11.005
Publication status: Published - Jan 2010

Keywords

  • Classification tree
  • Conditional inference tree
  • Crib-biting
  • Logistic regression
  • Risk factors

ASJC Scopus subject areas

  • Equine

Cite this

Tree-Based Methods as an Alternative to Logistic Regression in Revealing Risk Factors of Crib-Biting in Horses. / Nagy, Krisztina; Reiczigel, J.; Harnos, Andrea; Schrott, Anikó; Kabai, P.

In: Journal of Equine Veterinary Science, Vol. 30, No. 1, 01.2010, p. 21-26.

@article{e106a696707e45fea722ca53b4ea464c,
title = "Tree-Based Methods as an Alternative to Logistic Regression in Revealing Risk Factors of Crib-Biting in Horses",
abstract = "Determining the risk factors might help in designing prevention of crib-biting. Logistic regression is a commonly used statistical method for finding risk factors, but tree-based methods are also getting more popular. An important difference between these two statistical approaches is that logistic regression makes a number of assumptions about the underlying data, whereas tree-based methods do not. Another difference is that logistic regression can be used to derive odds ratios for the significant risk factors, whereas tree-based methods create a tree where the ramifications represent the risk factors. The probability of occurrence is assigned to each end of branch in the tree. Data of horses used for noncompetition purposes were analyzed with three statistical approaches: logistic regression, classification tree, and conditional inference tree methods. By this, we compared the advantages and disadvantages of these statistical methods. No difference was found between the two tree-based methods regarding the structure and prediction accuracy of the trees. Compared to them, logistic regression revealed fewer risk factors, and also the number of the stereotypic horses classified correctly by the model was less. The representation of the tree-based methods is closer to medical reasoning and also high-order interaction of the risk-factors can easily be visualized. Our results suggest that tree-based methods can be a new alternative in revealing risk factors, even if used alone or together with logistic regression.",
keywords = "Classification tree, Conditional inference tree, Crib-biting, Logistic regression, Risk factors",
author = "Krisztina Nagy and J. Reiczigel and Andrea Harnos and Anik{\'o} Schrott and P. Kabai",
year = "2010",
month = "1",
doi = "10.1016/j.jevs.2009.11.005",
language = "English",
volume = "30",
pages = "21--26",
journal = "Journal of Equine Veterinary Science",
issn = "0737-0806",
publisher = "W.B. Saunders Ltd",
number = "1",
}

TY - JOUR

T1 - Tree-Based Methods as an Alternative to Logistic Regression in Revealing Risk Factors of Crib-Biting in Horses

AU - Nagy, Krisztina

AU - Reiczigel, J.

AU - Harnos, Andrea

AU - Schrott, Anikó

AU - Kabai, P.

PY - 2010/1

Y1 - 2010/1

AB - Determining the risk factors might help in designing prevention of crib-biting. Logistic regression is a commonly used statistical method for finding risk factors, but tree-based methods are also getting more popular. An important difference between these two statistical approaches is that logistic regression makes a number of assumptions about the underlying data, whereas tree-based methods do not. Another difference is that logistic regression can be used to derive odds ratios for the significant risk factors, whereas tree-based methods create a tree where the ramifications represent the risk factors. The probability of occurrence is assigned to each end of branch in the tree. Data of horses used for noncompetition purposes were analyzed with three statistical approaches: logistic regression, classification tree, and conditional inference tree methods. By this, we compared the advantages and disadvantages of these statistical methods. No difference was found between the two tree-based methods regarding the structure and prediction accuracy of the trees. Compared to them, logistic regression revealed fewer risk factors, and also the number of the stereotypic horses classified correctly by the model was less. The representation of the tree-based methods is closer to medical reasoning and also high-order interaction of the risk-factors can easily be visualized. Our results suggest that tree-based methods can be a new alternative in revealing risk factors, even if used alone or together with logistic regression.

KW - Classification tree

KW - Conditional inference tree

KW - Crib-biting

KW - Logistic regression

KW - Risk factors

UR - http://www.scopus.com/inward/record.url?scp=73649138933&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=73649138933&partnerID=8YFLogxK

U2 - 10.1016/j.jevs.2009.11.005

DO - 10.1016/j.jevs.2009.11.005

M3 - Article

AN - SCOPUS:73649138933

VL - 30

SP - 21

EP - 26

JO - Journal of Equine Veterinary Science

JF - Journal of Equine Veterinary Science

SN - 0737-0806

IS - 1

ER -