Classification learning can be treated as a regression problem in which the dependent variable takes only the values 0 and 1. Reducing classification to the search for numerical dependencies lets us apply the powerful regression methods implemented in the PolyAnalyst data mining system. The resulting regression functions can be interpreted as fuzzy membership indicators for the recognized class. To obtain classification rules, we find for each function the threshold value that minimizes the number of misclassified cases. We show that this approach handles the overfitting problem satisfactorily and yields results at least as good as those of the most popular decision tree algorithms.
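The two-step idea described above (regress on 0/1 labels, then choose a cut-off on the fitted function) can be illustrated with a minimal sketch. It is not the PolyAnalyst implementation: ordinary least squares on a single feature stands in for the system's regression methods, and the function names `fit_line` and `best_threshold` and the toy data are all hypothetical, introduced only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x (a stand-in regression method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def best_threshold(scores, labels):
    """Return (threshold, errors) minimizing misclassified cases.

    The classifying rule is: predict class 1 when score >= threshold.
    """
    ordered = sorted(set(scores))
    # Candidate cut-offs: one below all scores, the midpoints between
    # adjacent distinct scores, and one above all scores.
    candidates = [ordered[0] - 1.0]
    candidates += [(lo + hi) / 2 for lo, hi in zip(ordered, ordered[1:])]
    candidates.append(ordered[-1] + 1.0)

    best = None
    for t in candidates:
        errors = sum((s >= t) != bool(y) for s, y in zip(scores, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Toy data (made up for this sketch): one feature, 0/1 class labels.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_line(xs, ys)                # step 1: regression on 0/1 targets
scores = [a + b * x for x in xs]       # fitted values act as membership scores
threshold, errors = best_threshold(scores, ys)  # step 2: pick the cut-off
predicted = [int(s >= threshold) for s in scores]

print("threshold:", threshold)
print("misclassified:", errors)
```

The fitted values are not restricted to the interval [0, 1], which is why they are read here as membership scores rather than probabilities; only their ordering relative to the threshold matters for the final rule.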
To download this document, please visit https://link.springer.com/chapter/10.1007/3-540-63223-9_113.
Cite this paper as: Kiselev M.V., Ananyan S.M., Arseniev S.B. (1997) Regression-based classification methods and their comparison with decision tree algorithms. In: Komorowski J., Zytkow J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence), vol 1263. Springer, Berlin, Heidelberg
Copyright Information: © Springer-Verlag Berlin Heidelberg 1997