An efficient decision tree construction for large datasets

Uyen Nguyen Thi Van, Choong Chung Tae

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

In this paper, we propose a new data structure and a new framework for building decision tree classifiers that is especially suitable for large datasets. The most prominent feature of our algorithm is that it needs only one scan over the entire database to build a decision tree. Previous methods scan the whole database once at each level of the tree, so our algorithm requires far fewer passes over the data. Moreover, our algorithm performs a one-time sort of the numeric attributes, which significantly reduces the sorting cost and hence the overall execution time. The experimental results show that our algorithm outperforms the RainForest algorithm, a well-known and efficient algorithm for decision tree construction, in execution time. This indicates that our algorithm can be applied efficiently to large datasets.

Original language: English
Title of host publication: Innovations'07
Subtitle of host publication: 4th International Conference on Innovations in Information Technology, IIT
Publisher: IEEE Computer Society
Pages: 21-25
Number of pages: 5
ISBN (Print): 9781424418411
Publication status: Published - 2007
Event: Innovations'07: 4th International Conference on Innovations in Information Technology, IIT - Dubai, United Arab Emirates
Duration: 18 Nov 2007 to 20 Nov 2007

Publication series

Name: Innovations'07: 4th International Conference on Innovations in Information Technology, IIT

Conference

Conference: Innovations'07: 4th International Conference on Innovations in Information Technology, IIT
Country/Territory: United Arab Emirates
City: Dubai
Period: 18/11/07 to 20/11/07
