Presentation of hidden knowledge from a local breast cancer dataset by the classification and regression trees

AUTHORS

Hadi LotfnezhadAfshar 1, Lili Rahmatnejad 2, Bahlol Rahimi 1, Hamid Reza Khalkhali 3,*

1 Department of Health Information Management, Health Information Technology Department, School of Paramedicine, Urmia University of Medical Sciences, Urmia, Iran

2 Department of Midwifery, School of Nursing & Midwifery, Urmia University of Medical Sciences, Urmia, Iran

3 Department of Biostatistics, Patient Safety Research Center, School of Medicine, Urmia University of Medical Sciences, Urmia, Iran

How to Cite: LotfnezhadAfshar H, Rahmatnejad L, Rahimi B, Khalkhali HR. Presentation of hidden knowledge from a local breast cancer dataset by the classification and regression trees. J Clin Res Paramed Sci. 2017;6(2):e81268.

ARTICLE INFORMATION

Journal of Clinical Research in Paramedical Sciences: 6 (2); e81268
Published Online: July 24, 2017
Article Type: Research Article
Received: November 20, 2016
Accepted: May 03, 2017

Abstract

Introduction: The use of standard knowledge discovery methods, such as decision trees, has been studied in the context of breast cancer. Decision trees are popular because they present previously undiscovered relationships in the data in accessible forms such as visualizations and formulas. The current study applied an algorithm from this group that had not been used in previously published papers.

Methods: A dataset comprising the records of 569 patients from 2007 to 2010 was used. Missing data were handled by multiple imputation (MI). IBM SPSS Statistics 21 was used to run MI and to develop the model. The developed model was evaluated on accuracy, sensitivity, and specificity.
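The three evaluation criteria named above follow directly from a confusion matrix. The sketch below is illustrative only: the class `ConfusionCounts`, the function names, and the example counts are assumptions made for this example and are not taken from the study's data or software.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary confusion matrix (hypothetical container)."""
    tp: int  # positives correctly predicted as positive
    fn: int  # positives wrongly predicted as negative
    tn: int  # negatives correctly predicted as negative
    fp: int  # negatives wrongly predicted as positive


def sensitivity(c: ConfusionCounts) -> float:
    # Share of actual positives that the model identifies.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Share of actual negatives that the model identifies.
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Share of all cases classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Made-up counts for illustration; these are NOT the study's results.
example = ConfusionCounts(tp=90, fn=10, tn=30, fp=20)
print(sensitivity(example), specificity(example), accuracy(example))
```

Reporting all three together matters because a model can reach high accuracy and sensitivity while still having low specificity when the classes are imbalanced.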

Results: The model produced a decision tree with seventeen nodes. A set of clinically meaningful if-then rules was derived from nine of these nodes. The rules showed that cancer stage was the most important variable for predicting the survival probability of breast cancer patients. The sensitivity, specificity, and accuracy of the model were 93.5%, 53%, and 80.3%, respectively.
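If-then rules of this kind are typically read off a decision tree by following each path from the root to a leaf. The sketch below shows that general idea on a toy tree. The `Node` class, the feature names `feature_a` and `feature_b`, the thresholds, and the outcome labels are all hypothetical placeholders; they do not reproduce the tree or the rules obtained in the study.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A binary tree node: internal nodes split, leaves carry a label."""
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # branch taken when feature <= threshold
    right: Optional["Node"] = None   # branch taken when feature > threshold
    label: Optional[str] = None      # set only on leaf nodes


def extract_rules(node: Node, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Return one if-then rule per leaf, built from the path conditions."""
    if node.label is not None:
        premise = " and ".join(conditions) if conditions else "always"
        return [f"if {premise} then {node.label}"]
    left_cond = f"{node.feature} <= {node.threshold}"
    right_cond = f"{node.feature} > {node.threshold}"
    return (extract_rules(node.left, conditions + (left_cond,))
            + extract_rules(node.right, conditions + (right_cond,)))


# Toy tree with placeholder names and thresholds (not from the study).
toy_tree = Node(
    feature="feature_a", threshold=2,
    left=Node(label="outcome_1"),
    right=Node(
        feature="feature_b", threshold=0.5,
        left=Node(label="outcome_2"),
        right=Node(label="outcome_1"),
    ),
)

for rule in extract_rules(toy_tree):
    print(rule)
```

In a real tree, each rule would usually also carry the number of cases and the class proportions in its leaf, which is what allows the clinically meaningful rules to be separated from the rest.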

Conclusion: The model created in the current study, presented as the first model of survival probability in breast cancer, revealed practical, previously undiscovered rules from a relatively small dataset.

Keywords

Breast Neoplasms; Survival; Machine Learning; Regression Analysis

© 2017, Journal of Clinical Research in Paramedical Sciences. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits copying and redistributing the material for noncommercial use only, provided the original work is properly cited.

Fulltext

The full text of this article is available on PDF.

