Choose a Dataset
A dataset can be visualized as a table in which each row represents an instance and each column, an attribute.
Once you log in, you will find a list of the datasets available on Branch. You can explore these datasets and use them to build trees and other classifiers.
Building a Decision Tree
A decision tree is built by picking a gene or non-gene feature on which to split the dataset. There are two basic types of features that can be picked.
Training and Testing Datasets
You can choose the training and testing datasets from the sidebar. Available options:
- Training Set: Use the same dataset for both training and testing.
- Supplied Test Set: A list of compatible datasets is shown. Pick one to use as the test dataset.
- Percentage Split: Use a percentage of the dataset you picked for training and the rest for testing.
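To make the Percentage Split option concrete, here is a minimal sketch of the idea in Python. It is not Branch's own code: the function name `percentage_split`, the `train_fraction` and `seed` parameters, and the shuffling step are all assumptions made for illustration.

```python
import random

def percentage_split(rows, train_fraction=0.66, seed=0):
    """Shuffle the rows and split them into (train, test) by fraction.

    Hypothetical helper; how Branch actually partitions data is not
    described in this tutorial.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(10))                     # stand-in for ten dataset instances
train, test = percentage_split(rows, train_fraction=0.8)
```

With ten instances and a fraction of 0.8, eight instances go to the training set and the remaining two to the test set. Every instance ends up in exactly one of the two sets.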
Use the zoom controls to zoom in and out. Uncheck Fit To Screen if you don't want the tree to fit within the screen.
Accuracy, Area Under Curve and the Confusion Matrix
The decision tree is evaluated on the test set you pick, and the resulting accuracy, area under the curve, and confusion matrix are shown in the right sidebar.
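For readers unfamiliar with these three measures, the sketch below computes them for a two-class problem in plain Python. The labels, scores, and function names are invented for illustration, and the tutorial does not say how Branch itself computes these values. The area under the curve is calculated here as the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as half.

```python
def confusion_matrix(actual, predicted, positive):
    """Return (tp, fp, fn, tn) counts for a two-class problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def accuracy(actual, predicted):
    """Fraction of samples whose predicted label matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def area_under_curve(actual, scores, positive):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test-set results: true labels and classifier scores.
actual = ["case", "case", "control", "control"]
scores = [0.9, 0.4, 0.6, 0.2]
predicted = ["case" if s >= 0.5 else "control" for s in scores]

matrix = confusion_matrix(actual, predicted, positive="case")
acc = accuracy(actual, predicted)
auc = area_under_curve(actual, scores, positive="case")
```

In this toy example one sample falls into each cell of the confusion matrix, so the accuracy is 0.5, while the area under the curve is 0.75 because three of the four case/control pairs are ranked correctly.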
Saving Trees Into Your Collections
Use the save button to save your tree. You have the option to save the tree along with a comment. These comments will come in handy if you need to find your tree later on. If you want to keep your tree private, you can save it as a private tree, and it will not be visible to other users.
You can view your saved trees in your profile, which you can access by clicking on your name in the right corner of the top bar. There you can view your collections and trees built by other users. To view other users' trees, click the corresponding icon. The nodes you add are linked to your profile, as shown by the user thumbnail in the top right corner of each node.
You can use the pathway search to query for genes and then drag and drop them to build a tree. You can also use the pathway search to find genes for a custom classifier, which is explained in the next tutorial.
Build & Add Classifier
A split node can use a classifier like a C4.5 tree or a support vector machine to create a split using the output of the classifier.
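The idea of splitting on a classifier's output can be sketched as follows. This is an illustration only: `NearestCentroid` is a deliberately tiny stand-in classifier (it is neither C4.5 nor a support vector machine), and `split_by_classifier` is a hypothetical helper, not part of Branch.

```python
from collections import defaultdict

class NearestCentroid:
    """Tiny stand-in classifier: predicts the class with the closest mean."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for features, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(features)
            sums[label] = [s + f for s, f in zip(sums[label], features)]
            counts[label] += 1
        # Mean feature vector per class.
        self.centroids = {
            label: [s / counts[label] for s in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, features):
        def squared_distance(centroid):
            return sum((f - c) ** 2 for f, c in zip(features, centroid))
        return min(self.centroids, key=lambda label: squared_distance(self.centroids[label]))

def split_by_classifier(classifier, X):
    """Group sample indices into branches keyed by the predicted label."""
    branches = defaultdict(list)
    for index, features in enumerate(X):
        branches[classifier.predict(features)].append(index)
    return dict(branches)

# Hypothetical two-attribute training data and new samples to route.
X_train = [[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]]
y_train = ["low", "low", "high", "high"]
X_new = [[0.9, 1.1], [4.1, 3.9], [1.0, 0.5]]

model = NearestCentroid().fit(X_train, y_train)
branches = split_by_classifier(model, X_new)
```

Each predicted label becomes one branch of the split node, so samples 0 and 2 follow the "low" branch and sample 1 follows the "high" branch.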
Plot & Create Split
A split node can be built by manually selecting samples to create a leaf node. You can use the visual plotting tool to plot samples using two attributes and then select samples using a polygon select tool.
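Selecting samples with a polygon comes down to testing whether each plotted point lies inside the polygon. The sketch below uses the standard ray-casting test. The sample records, the attribute names `geneA` and `geneB`, and both function names are assumptions for illustration; the tutorial does not describe how Branch implements its selection tool.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]      # wrap around to close the polygon
        if (y1 > y) != (y2 > y):           # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                # crossing lies to the right of the point
                inside = not inside
    return inside

def select_samples(samples, x_attr, y_attr, polygon):
    """Return the ids of samples whose (x_attr, y_attr) point falls inside the polygon."""
    return [
        s["id"] for s in samples
        if point_in_polygon(s[x_attr], s[y_attr], polygon)
    ]

# Hypothetical samples plotted on two attributes.
samples = [
    {"id": "s1", "geneA": 1.0, "geneB": 1.0},
    {"id": "s2", "geneA": 3.0, "geneB": 3.0},
    {"id": "s3", "geneA": 0.5, "geneB": 1.5},
]
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
selected = select_samples(samples, "geneA", "geneB", square)
```

Here samples s1 and s3 fall inside the square and would form the selected group, while s2 lies outside it.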
Build & Add Custom Feature
Multiple attributes from the dataset can be combined together using a mathematical equation to create a new attribute.
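As an illustration of a custom feature, the sketch below derives a new attribute from two existing ones. The attribute names `geneA` and `geneB`, the log-ratio formula, and the helper `add_custom_feature` are assumptions chosen for the example, not something Branch prescribes.

```python
import math

def add_custom_feature(rows, name, formula):
    """Return copies of the rows with one extra attribute computed by `formula`."""
    return [dict(row, **{name: formula(row)}) for row in rows]

# Hypothetical instances with two numeric attributes.
rows = [
    {"geneA": 8.0, "geneB": 2.0},
    {"geneA": 3.0, "geneB": 3.0},
]

# New attribute: log2 ratio of the two existing attributes.
with_ratio = add_custom_feature(
    rows, "log_ratio", lambda r: math.log2(r["geneA"] / r["geneB"])
)
```

The original rows are left unchanged, and each returned row carries the new `log_ratio` attribute alongside the attributes it was computed from.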