TY - GEN
T1 - An efficient decision tree construction for large datasets
AU - Van, Uyen Nguyen Thi
AU - Tae, Choong Chung
N1 - Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2007
Y1 - 2007
N2 - In this paper, we propose a new data structure and a new framework for building decision tree classifiers that is especially suitable for large datasets. The most prominent feature of our algorithm is that it needs only one scan over the entire database to build a decision tree. Compared with previous methods, which scan the whole database once at each level of the tree, our algorithm is much more efficient. Moreover, our algorithm sorts numeric attributes only once, which significantly reduces the sorting cost and hence the overall execution time. The experimental results show that our algorithm outperforms the RainForest algorithm, a well-known and efficient algorithm for decision tree construction, in execution time. This indicates that our algorithm can be applied to large datasets efficiently.
AB - In this paper, we propose a new data structure and a new framework for building decision tree classifiers that is especially suitable for large datasets. The most prominent feature of our algorithm is that it needs only one scan over the entire database to build a decision tree. Compared with previous methods, which scan the whole database once at each level of the tree, our algorithm is much more efficient. Moreover, our algorithm sorts numeric attributes only once, which significantly reduces the sorting cost and hence the overall execution time. The experimental results show that our algorithm outperforms the RainForest algorithm, a well-known and efficient algorithm for decision tree construction, in execution time. This indicates that our algorithm can be applied to large datasets efficiently.
UR - http://www.scopus.com/inward/record.url?scp=50249141952&partnerID=8YFLogxK
U2 - 10.1109/IIT.2007.4430464
DO - 10.1109/IIT.2007.4430464
M3 - Conference contribution
AN - SCOPUS:50249141952
SN - 9781424418411
T3 - Innovations'07: 4th International Conference on Innovations in Information Technology, IIT
SP - 21
EP - 25
BT - Innovations'07
PB - IEEE Computer Society
T2 - Innovations'07: 4th International Conference on Innovations in Information Technology, IIT
Y2 - 18 November 2007 through 20 November 2007
ER -