Study on Data Stream Mining Algorithm Based on Decision Tree
Sun Chaoli (孙超利)
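Abstract:
Because data streams flow rapidly and computer memory is limited, designing a good data stream mining algorithm is difficult. In recent years, a series of algorithms for mining data streams have been proposed. This paper surveys the decision-tree-based data stream mining algorithms within the classification family: traditional incremental decision tree classification, VFDT (which is based on the Hoeffding tree), the adaptive variant of VFDT (CVFDT), and decision tree classification algorithms that use ensemble techniques. Through analysis and comparison, it summarizes the main characteristics of each algorithm as a reference for researchers in China.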
Keywords: decision tree; SLIQ; SPRINT; VFDT; CVFDT
Foundation:
Author: Sun Chaoli (孙超利)
References:
- [1] Quinlan J R. Induction of decision trees [J]. Machine Learning, 1986, 81-106.
- [2] Quinlan J R. C4.5: Programs for Machine Learning [M]. San Mateo, CA: Morgan Kaufmann Publishers, 1993, 17-26.
- [3] Mehta M, Agrawal R, Rissanen J. SLIQ: A Fast Scalable Classifier for Data Mining [R]. IBM Almaden Research Center, San Jose, California, 1995.
- [4] Shafer J, Agrawal R, Mehta M. SPRINT: A Scalable Parallel Classifier for Data Mining [R]. IBM Almaden Research Center, San Jose, California, 1996.
- [5] Domingos P, Hulten G. Mining High-Speed Data Streams [C]. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining. Boston, 2000.
- [6] Hulten G, Spencer L, Domingos P. Mining Time-Changing Data Streams [C]. Proc. of the 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining. San Francisco, 2001.
- [7] Wang H, Fan W, Yu P S, Han J. Mining Concept-Drifting Data Streams using Ensemble Classifiers [C]. Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, Washington, D.C., Aug. 2003.