Decision Trees Presentation

On GitHub: rcliao/decision-trees

Why Am I Here?

For fun.

To present another way to predict Titanic survival.

Review

Types of Learning

  • Supervised Learning

    Teacher provides labeled examples

  • Unsupervised Learning

    No teacher to provide examples

  • Reinforcement Learning

    Agent only receives a reward for taking an action

Learning Algorithm

  • Batch Learning Algorithm

    Receives a large set of data all at once and selects a single hypothesis

  • Incremental Learning Algorithm

    Receives one example at a time and continually modifies its hypothesis

Decision Trees

Data structures that provide an agent with a means of classifying examples

Trees as Rules

A decision tree is just a compiled set of rules:

    if name == 'Daniel Soto':
        idiot
    else:
        not idiot
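
More realistically, a tree learned from the Titanic data might compile to rules like these (the attribute names and outcomes below are hypothetical, for illustration only):

    # Hypothetical compiled rules for Titanic survival -- not the actual
    # tree learned later in this presentation.
    def predict_survival(sex, pclass):
        if sex == 'female':
            return 'survived'
        elif pclass == 1:
            return 'survived'
        else:
            return 'did not survive'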

Examples

Here we will use the Titanic data set as an example

Recall: The Titanic Data Set

Challenges

Choosing Useful Test Attribute

We would like attributes that "split" the training set into nearly pure subsets

We can construct a decision tree recursively:

Base cases:

  • If all examples are positive or all are negative, we are done.

  • If there are no examples left, then we have not seen an instance of this classification, so we use the majority classification of the parent.

  • If there are no attributes left to test, then we have instances with the same description but different classifications (an insufficient description, noisy data, or a nondeterministic domain); use a majority vote.

Recursive step:

  • Otherwise, pick the best attribute to split on and recursively construct the subtrees.

Choosing Useful Attribute

How do we know which test attribute provides the most information?

Information Theory

You can compute the information content by the following formula:

$$\sum_{i=1}^{n} -P(v_i)\,\log_2 P(v_i)$$

Example of a coin flip: $-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1$

Sometimes Information Content is also called Entropy
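
As a sanity check on the formula, here is a minimal Python sketch (the entropy function is my own illustration, not code from the presentation):

    import math

    def entropy(probabilities):
        """Information content of a distribution, in bits."""
        return sum(-p * math.log2(p) for p in probabilities if p > 0)

    print(entropy([0.5, 0.5]))  # fair coin flip -> 1.0
    print(entropy([1.0]))       # certain outcome -> 0.0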

Using Information Theory

We want to know how valuable each possible test is

  • We can estimate the probabilities of the possible answers from the training set.
  • Usually, a single test will not be enough.
  • So we also need to measure how much better off we are after asking another question.
  • That measure is called information gain.

Information Gain

If the training set has $p$ positive examples and $n$ negative examples, the information in a correct answer is:

$$I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$

The remainder after testing attribute $A$ is the weighted sum of the information in each of the resulting subsets:

$$\mathrm{Remainder}(A) = \sum_{i=1}^{v} \frac{p_i+n_i}{p+n}\, I\left(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i}\right)$$

Information Gain

The difference between the original information (before the test) and the remaining information (after the test):

$$\mathrm{Gain}(A) = I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \mathrm{Remainder}(A)$$

Choose the test attribute with the largest information gain.
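
To make these formulas concrete, here is a small Python sketch; the info and gain helpers and the example counts are invented for illustration:

    import math

    def info(p, n):
        """I(p/(p+n), n/(p+n)): information in a correct answer, in bits."""
        result = 0.0
        for count in (p, n):
            if count > 0:
                q = count / (p + n)
                result -= q * math.log2(q)
        return result

    def gain(p, n, subsets):
        """Information gain of a test that splits (p, n) examples into
        subsets, given as a list of (p_i, n_i) pairs."""
        remainder = sum((pi + ni) / (p + n) * info(pi, ni)
                        for pi, ni in subsets)
        return info(p, n) - remainder

    # Hypothetical test on 10 positive / 10 negative examples that sends
    # (8, 2) down one branch and (2, 8) down the other:
    print(gain(10, 10, [(8, 2), (2, 8)]))  # about 0.278 bits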

Pseudocode

    def makeTree(dataset):
        if all examples are in the same class:
            return a single Node with that classification
        if there are no attributes left to test:
            return a single Node with the majority classification
        else:
            select the attribute that produces the largest information gain
            split the dataset according to the values of this attribute to
                create $v$ smaller datasets
            create a new Node - each child is created by calling makeTree
                with one of the $v$ subsets
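
For reference, here is a minimal runnable sketch of this pseudocode; the nested-dict tree representation and the toy data are my own choices, and the empty-subset base case is omitted because the sketch only splits on attribute values that actually occur:

    import math
    from collections import Counter

    def entropy(labels):
        """Information content of a list of class labels, in bits."""
        total = len(labels)
        return sum(-c / total * math.log2(c / total)
                   for c in Counter(labels).values())

    def make_tree(examples, attributes):
        """examples: list of (features_dict, label) pairs. Returns a nested
        dict {attribute: {value: subtree}}, with bare labels at the leaves."""
        labels = [label for _, label in examples]
        if len(set(labels)) == 1:            # all in the same class
            return labels[0]
        if not attributes:                   # no attributes left to test
            return Counter(labels).most_common(1)[0][0]  # majority vote

        def gain(attr):
            remainder = 0.0
            for value in {f[attr] for f, _ in examples}:
                subset = [lab for f, lab in examples if f[attr] == value]
                remainder += len(subset) / len(examples) * entropy(subset)
            return entropy(labels) - remainder

        best = max(attributes, key=gain)     # largest information gain
        node = {best: {}}
        for value in {f[best] for f, _ in examples}:
            subset = [(f, lab) for f, lab in examples if f[best] == value]
            node[best][value] = make_tree(
                subset, [a for a in attributes if a != best])
        return node

    # Toy data in the spirit of the Titanic example:
    data = [({'sex': 'female', 'pclass': 1}, 'survived'),
            ({'sex': 'female', 'pclass': 3}, 'survived'),
            ({'sex': 'male',   'pclass': 1}, 'survived'),
            ({'sex': 'male',   'pclass': 3}, 'died')]
    print(make_tree(data, ['sex', 'pclass']))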

link to python file

TL;DR

    from sklearn import tree

    header, data = create_numpy_array_from_csv(csv_file_path='train.csv')

    S = data[0::, 2:]  # features: cut out the survived column
    # tons of processing in between
    Y = data[0::, 0]   # labels: the survived column

    decision_tree_classifier = tree.DecisionTreeClassifier(max_depth=6)
    decision_tree_classifier = decision_tree_classifier.fit(S, Y)
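
Once fitted, the classifier can report its accuracy on the training set or predict survival for new passengers; test_data below is a hypothetical array that would have to be preprocessed the same way as S:

    # Accuracy on the training set
    print(decision_tree_classifier.score(S, Y))

    # Predicting for new passengers (test_data is hypothetical and must be
    # preprocessed exactly like S):
    # predictions = decision_tree_classifier.predict(test_data)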

Overfitting

In decision trees, overfitting is dealt with through pruning.

Pruning

If a node does not provide any information, then remove the node and move the examples to the parent.
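
One concrete option, if using scikit-learn 0.22 or later, is minimal cost-complexity pruning via the ccp_alpha parameter. This is a different mechanism from the node-removal rule above, shown here only as a sketch:

    # Minimal cost-complexity pruning sketch (scikit-learn 0.22+).
    # Larger ccp_alpha values prune more aggressively; picking the
    # second-largest alpha here is just an arbitrary example.
    path = decision_tree_classifier.cost_complexity_pruning_path(S, Y)
    pruned = tree.DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2])
    pruned = pruned.fit(S, Y)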

Features of Decision Trees

  • Work well with symbolic data
  • Use information gain to determine which questions are most effective at classifying
  • Produce a human-understandable hypothesis
  • Need extensions to deal with noise or missing data
  • Can represent all Boolean functions