DAY 16-100 DAYS MLCODE: Decision Trees

My Tech World

November 25, 2018 100-Days-Of-ML-Code blog

In the last three posts, we discussed SVMs. In this post, we walk through a simple example of the Decision Tree algorithm. Like SVMs, Decision Trees are powerful machine learning algorithms that can perform both regression and classification tasks.

Decision Trees can handle classification or regression on complex data, and they are also the fundamental building block of the well-known Random Forest algorithm.

You can perform a regression task using the DecisionTreeRegressor class of Scikit-Learn, and a classification task using the DecisionTreeClassifier class.
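Since the rest of this post focuses on classification, here is a minimal sketch of the regression side. The synthetic data, the max_depth=2 setting and the random_state values are assumptions chosen only for illustration, not part of the original example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D data (an assumption for illustration): y = x^2 plus a little noise
rng = np.random.RandomState(42)
X_reg = rng.rand(200, 1)
y_reg = X_reg[:, 0] ** 2 + 0.05 * rng.randn(200)

# max_depth=2 keeps the tree small; the value is arbitrary
tree_reg = DecisionTreeRegressor(max_depth=2, random_state=42)
tree_reg.fit(X_reg, y_reg)

# A depth-2 tree has at most 4 leaves, so it can output at most 4 distinct values
predictions = tree_reg.predict([[0.1], [0.5], [0.9]])
print(predictions)
```

Each prediction is the mean target value of the training samples that fall into the same leaf, which is why a regression tree produces a step-shaped curve rather than a smooth one.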

Let's start the classification task by loading the data from the Scikit-Learn library:

from sklearn.datasets import load_wine
wine_data = load_wine()
X = wine_data.data
y = wine_data.target

The above code loads the data into X (features) and y (class labels). Now let's train the model using the DecisionTreeClassifier class of Scikit-Learn:

from sklearn.tree import DecisionTreeClassifier
tree_clf = DecisionTreeClassifier()
tree_clf.fit(X,y)

The above code trains the classifier. The output will look like this:

DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None, max_features=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, presort=False, random_state=None, splitter='best')
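Depending on the installed Scikit-Learn version, the printed output may be much shorter, because recent releases show only the parameters that differ from their defaults. A small sketch of how to list the hyperparameters explicitly with get_params(), whatever the printed form looks like (the selection of parameter names below is just an example):

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier

wine_data = load_wine()
tree_clf = DecisionTreeClassifier()
tree_clf.fit(wine_data.data, wine_data.target)

# get_params() returns all hyperparameters as a dict, independent of how the model is printed
params = tree_clf.get_params()
for name in ("criterion", "max_depth", "min_samples_leaf", "min_samples_split", "splitter"):
    print(name, "=", params[name])
```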

Let’s visualize the tree using export_graphviz:

from sklearn.tree import export_graphviz

export_graphviz(
    tree_clf,
    out_file="wine_tree.dot",
    feature_names=wine_data.feature_names,
    class_names=wine_data.target_names,
    rounded=True,
    filled=True,
)

This writes the file “wine_tree.dot” to the current working directory. We can now convert it to a PNG image by running the command below in a terminal (this requires Graphviz to be installed):

dot -Tpng wine_tree.dot -o wine_tree.png

The above command converts the .dot file into a PNG image. Now open wine_tree.png to review the tree.

Figure: Decision Tree
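If Graphviz is not available, the tree can also be printed as plain text. The sketch below assumes a Scikit-Learn version that provides export_text (0.21 or later). The max_depth=2 and random_state=42 settings are arbitrary choices to keep the printout short and repeatable, so the tree shown will be smaller than the one in the figure.

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier, export_text

wine_data = load_wine()

# A shallow tree (assumed depth limit) so the text output stays readable
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(wine_data.data, wine_data.target)

# export_text returns the split rules as an indented string, one line per node
rules = export_text(tree_clf, feature_names=list(wine_data.feature_names))
print(rules)
```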

Now, if you want to classify (predict) a new instance, you traverse the tree to find its class. Start at the root node, which checks whether “proline” is less than or equal to 755: if it is, move to the left child node, otherwise move to the right child node. Keep going this way until you reach a leaf node, which gives the class of the instance.
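The traversal described above can be traced in code. This is a rough sketch that uses the first row of the wine data as a stand-in for a new instance and fixes random_state=42 (both assumptions made for illustration). It relies on decision_path() and the fitted tree_ attribute to list the tests met on the way from the root to a leaf.

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier

wine_data = load_wine()
X, y = wine_data.data, wine_data.target
tree_clf = DecisionTreeClassifier(random_state=42).fit(X, y)

sample = X[:1]  # first row, used here as a stand-in for a new instance
node_ids = tree_clf.decision_path(sample).indices  # ids of the visited nodes, root first
tree = tree_clf.tree_

for node_id in node_ids:
    if tree.children_left[node_id] == -1:  # -1 marks a leaf node, so the walk ends here
        break
    feature_index = tree.feature[node_id]
    threshold = tree.threshold[node_id]
    value = sample[0, feature_index]
    direction = "left" if value <= threshold else "right"
    print(f"{wine_data.feature_names[feature_index]} = {value:.2f}, "
          f"threshold {threshold:.2f}: go {direction}")

predicted = tree_clf.predict(sample)[0]
print("predicted class:", wine_data.target_names[predicted])
```

In everyday use, calling predict() is enough. The explicit walk is only meant to show which comparisons lead to the final class.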

In conclusion, the Decision Tree algorithm is a powerful algorithm and the basis of other powerful algorithms such as Random Forest. In this post, we have seen just a simple example of how Scikit-Learn helps to implement a classifier using the DecisionTreeClassifier class. In the next post, we’ll go into more detail about this algorithm. You can find today’s code here.