Decision Tree Learning methods (!1) · Merge requests · Anuchitanukul, Atijit / cw1_decision_tree

Five methods are developed here that contribute to the automatic construction of the decision tree.

load_txt_data: This method loads the data from the specified .txt file (file_name). The file can be located anywhere inside the repository directory.
calculate_entropy: This method calculates the value of entropy (entropy) given the label attribute column (label_attribute).
calculate_info_gain: This method calculates the information gain (info_gain) given the label attribute column (label_attribute), input attribute column (input_attribute) and the threshold of the input attribute (threshold).
find_split: This method finds the best split threshold (threshold) that maximises the information gain (info_gain) and its corresponding input attribute column.
decision_tree_learning: This method builds the decision tree recursively and stores the tree in a single dictionary. Further explanation of the logic and the dictionary keys is provided in the code comments.
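
For readers without the code open, the following is a minimal, standard-library-only sketch of how the last four methods could fit together. It is not the code in this merge request: the function names follow the list above, but the argument order, the `<=` / `>` threshold convention, the use of midpoints between sorted values as candidate thresholds, and the dictionary keys (`leaf`, `label`, `attribute`, `threshold`, `left`, `right`) are all assumptions made for illustration.

```python
import math
from collections import Counter


def calculate_entropy(label_attribute):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(label_attribute)
    if total == 0:
        return 0.0
    counts = Counter(label_attribute)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def calculate_info_gain(label_attribute, input_attribute, threshold):
    """Entropy reduction from splitting on input_attribute <= threshold.

    Assumes both sequences are non-empty and of equal length.
    """
    pairs = list(zip(input_attribute, label_attribute))
    left = [y for x, y in pairs if x <= threshold]
    right = [y for x, y in pairs if x > threshold]
    total = len(label_attribute)
    remainder = (len(left) / total) * calculate_entropy(left) \
        + (len(right) / total) * calculate_entropy(right)
    return calculate_entropy(label_attribute) - remainder


def find_split(rows, labels):
    """Return (attribute index, threshold, info_gain) of the best split.

    Candidate thresholds are midpoints between consecutive distinct values
    (an assumption). Returns (None, None, 0.0) if no split has positive gain.
    """
    best = (None, None, 0.0)
    for col in range(len(rows[0])):
        column = [row[col] for row in rows]
        values = sorted(set(column))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            gain = calculate_info_gain(labels, column, threshold)
            if gain > best[2]:
                best = (col, threshold, gain)
    return best


def decision_tree_learning(rows, labels):
    """Recursively build a tree stored as nested dictionaries.

    The dictionary keys used here are hypothetical, not the project's.
    """
    if len(set(labels)) == 1:
        return {"leaf": True, "label": labels[0]}
    col, threshold, _ = find_split(rows, labels)
    if col is None:
        # No split improves purity: fall back to the majority label.
        return {"leaf": True, "label": Counter(labels).most_common(1)[0][0]}
    left = [(r, y) for r, y in zip(rows, labels) if r[col] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[col] > threshold]
    return {
        "leaf": False,
        "attribute": col,
        "threshold": threshold,
        "left": decision_tree_learning([r for r, _ in left], [y for _, y in left]),
        "right": decision_tree_learning([r for r, _ in right], [y for _, y in right]),
    }


# Toy data: column 0 is constant, column 1 separates the two classes.
rows = [[5, 1], [5, 2], [5, 3], [5, 4]]
labels = [1, 1, 2, 2]
tree = decision_tree_learning(rows, labels)
```

On this toy data the best split is on column 1 at threshold 2.5 with an information gain of 1 bit, so the tree has a single decision node and two leaves. Stopping when no split has positive gain is a simplification; a real implementation may handle that case differently.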

Next steps: Implement 10-fold cross-validation on both the clean and noisy datasets, along with accuracy metrics and tree pruning.
