Classification and Regression Tree algorithms for Apache Spark

This repository focuses on a possible implementation of decision trees for Apache Spark.

It contains packages for building two kinds of tree models, Classification Trees and Regression Trees, in a scalable environment on the Spark platform.

In particular, the Regression Tree is implemented with the CART algorithm of Breiman et al., which was introduced in 1984. The Classification Tree comes in two types: Binary Classification Tree (implemented with CART) and Multi-way Classification Tree (built with ID3).
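To make the two split criteria concrete, here is a minimal, single-machine sketch in plain Python. It is not this repository's API and does not use Spark; the function names (`sse`, `best_regression_split`, `entropy`, `information_gain`) are hypothetical and chosen only for illustration. It assumes the usual textbook formulations: CART picks the binary threshold that minimizes the summed squared error of the two children, and ID3 picks the categorical feature with the highest information gain, creating one branch per distinct value.

```python
import math
from collections import Counter, defaultdict


def sse(ys):
    """Sum of squared errors of ys around their mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_regression_split(xs, ys):
    """CART-style binary split on one numeric feature (illustrative only).

    Returns (threshold, total_sse) for the split `x <= threshold` that
    minimizes the summed SSE of the two children, or None if no split exists.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical feature values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """ID3-style gain of a multi-way split: one branch per distinct value."""
    branches = defaultdict(list)
    for value, label in zip(values, labels):
        branches[value].append(label)
    n = len(labels)
    remainder = sum(len(b) / n * entropy(b) for b in branches.values())
    return entropy(labels) - remainder
```

A tree builder would apply one of these criteria to every candidate feature at a node, split on the best one, and recurse on the children. A distributed version would compute the same statistics per partition and combine them, which this sketch does not attempt.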

In addition, you can use cross-validation to prune tree models, or use Random Forest to aggregate several trees and improve prediction accuracy.
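The aggregation idea behind Random Forest can be sketched as follows. This is an assumption-laden illustration in plain Python, not the interface of this repository: a "tree" is modeled as any callable that maps an input to a prediction, and the helper names (`bootstrap_sample`, `aggregate_regression`, `aggregate_classification`) are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the sample one tree is trained on)."""
    return [rng.choice(rows) for _ in rows]


def aggregate_regression(trees, x):
    """Average the predictions of all trees for input x."""
    return sum(tree(x) for tree in trees) / len(trees)


def aggregate_classification(trees, x):
    """Return the label predicted by the largest number of trees for input x."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Stand-in "trees": in practice each would be trained on its own bootstrap sample.
regression_trees = [lambda x: 1.0, lambda x: 2.0, lambda x: 3.0]
classification_trees = [lambda x: "a", lambda x: "b", lambda x: "a"]

mean_prediction = aggregate_regression(regression_trees, x=None)
majority_label = aggregate_classification(classification_trees, x=None)
sample = bootstrap_sample([1, 2, 3, 4], random.Random(0))
```

Averaging (for regression) or majority voting (for classification) over trees trained on different bootstrap samples tends to reduce the variance of a single tree, which is where the accuracy gain usually comes from.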

You can find some information about our work here:

  1. Overview of decision trees

  2. Regression Tree with CART algorithm

  3. How to use Regression Tree in this repository