<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head><title>Python: module DecisionTreeJuntadoMarbille</title>
<meta charset="utf-8">
</head><body bgcolor="#f0f0f8">
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="heading">
<tr bgcolor="#7799ee">
<td valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"> <br><big><big><strong>DecisionTreeJuntadoMarbille</strong></big></big></font></td
><td align=right valign=bottom
><font color="#ffffff" face="helvetica, arial"><a href=".">index</a><br><a href="file:/home/marbille/UP/CS 180/mp2/DecisionTreeJuntadoMarbille.py">/home/marbille/UP/CS 180/mp2/DecisionTreeJuntadoMarbille.py</a></font></td></tr></table>
<p><tt>@author: Marbille Juntado<br>
Copyright: 2017<br>
<br>
This program performs Decision Tree learning on the dataset provided by tic-tac-toe.data.<br>
It is based on the ID3 algorithm. Two experiments were performed that output several<br>
files consisting of the node trace, decision tree, confusion matrix, and accuracy results.</tt></p>
<p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#aa55cc">
<td colspan=3 valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Modules</strong></big></font></td></tr>
<tr><td bgcolor="#aa55cc"><tt> </tt></td><td> </td>
<td width="100%"><table width="100%" summary="list"><tr><td width="25%" valign=top><a href="math.html">math</a><br>
</td><td width="25%" valign=top><a href="random.html">random</a><br>
</td><td width="25%" valign=top><a href="sys.html">sys</a><br>
</td><td width="25%" valign=top></td></tr></table></td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ee77aa">
<td colspan=3 valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Classes</strong></big></font></td></tr>
<tr><td bgcolor="#ee77aa"><tt> </tt></td><td> </td>
<td width="100%"><dl>
<dt><font face="helvetica, arial"><a href="DecisionTreeJuntadoMarbille.html#TreeNode">TreeNode</a>
</font></dt></dl>
<p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#ffc8d8">
<td colspan=3 valign=bottom> <br>
<font color="#000000" face="helvetica, arial"><a name="TreeNode">class <strong>TreeNode</strong></a></font></td></tr>
<tr><td bgcolor="#ffc8d8"><tt> </tt></td><td> </td>
<td width="100%">Methods defined here:<br>
<dl><dt><a name="TreeNode-__init__"><strong>__init__</strong></a>(self, name)</dt><dd><tt>Construct a new '<a href="#TreeNode">TreeNode</a>' object.<br>
<br>
:param name: The name of the node</tt></dd></dl>
<dl><dt><a name="TreeNode-predictResults"><strong>predictResults</strong></a>(self, cases, a)</dt><dd><tt>Returns a list containing the predicted outcomes</tt></dd></dl>
<dl><dt><a name="TreeNode-predictResultsRecurse"><strong>predictResultsRecurse</strong></a>(self, case, a)</dt><dd><tt>Recursively returns the predicted classification at the leaf (bottom-most) nodes</tt></dd></dl>
<dl><dt><a name="TreeNode-visualizeTree"><strong>visualizeTree</strong></a>(self)</dt><dd><tt>Visualizes the tree</tt></dd></dl>
<dl><dt><a name="TreeNode-visualizeTreeRecurse"><strong>visualizeTreeRecurse</strong></a>(self, level)</dt><dd><tt>Includes a log/trace of each node as the tree builds itself recursively</tt></dd></dl>
</td></tr></table></td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#eeaa77">
<td colspan=3 valign=bottom> <br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Functions</strong></big></font></td></tr>
<tr><td bgcolor="#eeaa77"><tt> </tt></td><td> </td>
<td width="100%"><dl><dt><a name="-buildDTree"><strong>buildDTree</strong></a>(examples, targetAttribute, attributes)</dt><dd><tt>Returns the root node of the decision tree<br>
<br>
:param examples: Each line of the training set</tt></dd></dl>
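The recursive structure behind <tt>buildDTree</tt> can be sketched as follows. This is an illustration of the general ID3 skeleton, not the module's actual code; the dict-based example shape, the <tt>choose</tt> callback, and the name <tt>id3_sketch</tt> are all assumptions.

```python
def id3_sketch(examples, attributes, choose, label_key="Classification"):
    """Recursive ID3 skeleton (illustrative; names are assumptions).

    examples: list of dicts mapping attribute name -> value.
    choose:   callable picking the best attribute to split on.
    """
    labels = [e[label_key] for e in examples]
    # Pure node: every example shares one label, so return it as a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to split on: fall back to the majority label.
    if not attributes:
        return max(set(labels), key=labels.count)
    best = choose(examples, attributes)
    tree = {best: {}}
    # Recurse on each distinct value of the chosen attribute.
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3_sketch(subset, remaining, choose, label_key)
    return tree
```

In the real module, <tt>choose</tt> would correspond to <tt>returnAttributeHighestInfoGain</tt>; any attribute-selection heuristic with the same shape plugs in here.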
<dl><dt><a name="-constructTreeFromFile"><strong>constructTreeFromFile</strong></a>(filepath)</dt><dd><tt>Builds a decision tree from the training data set file</tt></dd></dl>
<dl><dt><a name="-entropy"><strong>entropy</strong></a>(p, e)</dt><dd><tt>Calculates entropy<br>
<br>
:param p: list of probabilities for each value <br>
:param e: list of information gain for each value</tt></dd></dl>
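The entropy calculation over a list of probabilities can be sketched as below. Note this is a plain Shannon-entropy illustration and does not reproduce the module's two-argument <tt>entropy(p, e)</tt> signature; the function name is an assumption.

```python
import math

def entropy_sketch(probabilities):
    """Shannon entropy of a discrete distribution (illustrative sketch).

    Terms with zero probability are skipped, since lim p*log2(p) -> 0.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```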
<dl><dt><a name="-gather_data"><strong>gather_data</strong></a>(filename)</dt><dd><tt>This function is used in reading the data from the original data set file</tt></dd></dl>
<dl><dt><a name="-getAttributesFromFile"><strong>getAttributesFromFile</strong></a>(filepath)</dt><dd><tt>The first line of the test file contains the attributes (categories)</tt></dd></dl>
<dl><dt><a name="-getMostCommonLabel"><strong>getMostCommonLabel</strong></a>(nodes)</dt><dd><tt>Returns the most dominant classification in the list of nodes</tt></dd></dl>
<dl><dt><a name="-getMostCommonValue"><strong>getMostCommonValue</strong></a>(attr, examples, values)</dt><dd><tt>Returns the value with the highest frequency of the given attribute</tt></dd></dl>
<dl><dt><a name="-header"><strong>header</strong></a>(data)</dt><dd><tt>Useful for data sets without any attribute names. Generically labels<br>
each attribute as 'Attribute + &lt;number&gt;' to accommodate different datasets.<br>
The last attribute is named 'Classification' (+/-).</tt></dd></dl>
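A minimal sketch of such a generic-header helper is shown below; the exact naming scheme (<tt>Attribute1</tt>, <tt>Attribute2</tt>, …) and the list-of-rows input shape are assumptions, not the module's code.

```python
def make_header(data):
    """Generic attribute names for a dataset with no header row (sketch).

    data: list of rows; the last column is taken as the class label.
    """
    n_columns = len(data[0])
    # Name every column but the last generically, e.g. "Attribute1".
    names = ["Attribute" + str(i + 1) for i in range(n_columns - 1)]
    names.append("Classification")
    return names
```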
<dl><dt><a name="-infoGain"><strong>infoGain</strong></a>(count1, count2)</dt><dd><tt>Returns the information gain at any particular level of tree construction<br>
<br>
:param count1: Contains the number of positively-classified training examples<br>
:param count2: Contains the number of negatively-classified training examples</tt></dd></dl>
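The counts-to-entropy step that such a function relies on can be sketched as follows. This is an illustration of the standard binary-entropy formula; the name <tt>binary_entropy_from_counts</tt> and its handling of empty splits are assumptions, not the module's implementation.

```python
import math

def binary_entropy_from_counts(count1, count2):
    """Binary entropy from positive/negative example counts (sketch)."""
    total = count1 + count2
    if total == 0:
        return 0.0  # an empty split carries no information
    result = 0.0
    for c in (count1, count2):
        if c:
            p = c / total
            result -= p * math.log2(p)
    return result
```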
<dl><dt><a name="-isNegative"><strong>isNegative</strong></a>(word)</dt><dd><tt>Boolean function that determines whether a word is negative.<br>
Used in the classification of training examples.</tt></dd></dl>
<dl><dt><a name="-isPositive"><strong>isPositive</strong></a>(word)</dt><dd><tt>Boolean function that determines whether a word is positive.<br>
Used in the classification of training examples.<br>
:param word: any string</tt></dd></dl>
<dl><dt><a name="-parseTestCases"><strong>parseTestCases</strong></a>(filepath)</dt><dd><tt>Parses the test cases from the test data set file</tt></dd></dl>
<dl><dt><a name="-returnAttributeHighestInfoGain"><strong>returnAttributeHighestInfoGain</strong></a>(attributes, examples)</dt><dd><tt>Returns the attribute with the highest information gain and the corresponding value<br>
<br>
:param attributes: The attributes (categories) of the data<br>
:param examples: The training examples from the data set</tt></dd></dl>
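The attribute-selection step described above can be sketched as follows: compute the information gain of each attribute and keep the maximum. The dict-based example shape, the "+"/"-" labels, and the helper names are assumptions for illustration, not the module's code.

```python
import math

def _binary_entropy(pos, neg):
    """Binary entropy from positive/negative counts."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for c in (pos, neg):
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h

def best_attribute(examples, attributes, label_key="Classification"):
    """Return the attribute with the highest information gain (sketch)."""
    def gain(attr):
        pos = sum(1 for e in examples if e[label_key] == "+")
        neg = len(examples) - pos
        base = _binary_entropy(pos, neg)
        remainder = 0.0
        # Weighted entropy of each subset induced by the attribute's values.
        for v in {e[attr] for e in examples}:
            subset = [e for e in examples if e[attr] == v]
            sp = sum(1 for e in subset if e[label_key] == "+")
            remainder += len(subset) / len(examples) * _binary_entropy(sp, len(subset) - sp)
        return base - remainder
    return max(attributes, key=gain)
```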
<dl><dt><a name="-split_data"><strong>split_data</strong></a>(data)</dt><dd><tt>Randomly divides the original data set into two equal sets:<br>
Training and Testing data sets<br>
<br>
:param data: The original data set</tt></dd></dl>
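A shuffle-then-halve split like the one described can be sketched as below; the seed parameter and function name are assumptions added for reproducibility, not part of the module's signature.

```python
import random

def split_data_sketch(data, seed=None):
    """Randomly divide rows into two equal halves: training and testing (sketch)."""
    rows = list(data)
    rng = random.Random(seed)  # seeded RNG so the split is reproducible
    rng.shuffle(rows)
    mid = len(rows) // 2
    return rows[:mid], rows[mid:]
```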
<dl><dt><a name="-uniqueValues"><strong>uniqueValues</strong></a>(attrIndex, examples)</dt><dd><tt>Returns list of the distinct values of the current attribute</tt></dd></dl>
</td></tr></table><p>
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="section">
<tr bgcolor="#55aa55">
<td colspan=3 valign=bottom>&nbsp;<br>
<font color="#ffffff" face="helvetica, arial"><big><strong>Data</strong></big></font></td></tr></table>
</body></html>