What are Extra Trees?
The Extra Trees (Extremely Randomized Trees) algorithm builds a large number of unpruned decision trees and combines their predictions. Like a Random Forest, Extra Trees is an ensemble of decision trees, but the two algorithms differ in two key ways.
First, a Random Forest uses bagging, training each tree on a bootstrap sample of the data, whereas Extra Trees by default fits each tree on the whole training dataset. Second, a Random Forest uses a greedy search for the optimal split point at each node of a decision tree, while Extra Trees selects the split point at each node at random.
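Both differences are visible in scikit-learn's default parameters. A small sketch (assuming scikit-learn is installed) that inspects the relevant defaults rather than fitting anything:

```python
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

# A Random Forest bootstraps samples by default; Extra Trees does not,
# so each extra tree sees the whole training dataset.
rf = RandomForestClassifier()
et = ExtraTreesClassifier()
print(rf.bootstrap)  # True
print(et.bootstrap)  # False

# The underlying single-tree estimators also differ in how they split:
# a regular decision tree searches for the best split, an extra tree
# picks the split point at random.
dt = DecisionTreeClassifier()
xt = ExtraTreeClassifier()
print(dt.splitter)  # "best"
print(xt.splitter)  # "random"
```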
For a regression problem, Extra Trees averages the predictions of all the decision trees. For a classification problem, it selects the class that receives the most votes from the decision trees.
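The averaging behavior for regression can be checked directly: the ensemble's prediction equals the mean of the individual trees' predictions. A sketch on hypothetical toy data (y roughly equal to 2x):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Hypothetical toy data for illustration: y = 2x plus a little noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2 * X.ravel() + rng.normal(scale=0.1, size=200)

regressor = ExtraTreesRegressor(n_estimators=100, random_state=0)
regressor.fit(X, y)

# Collect each individual tree's prediction for one new point
x_new = np.array([[5.0]])
tree_preds = [tree.predict(x_new)[0] for tree in regressor.estimators_]

# The ensemble prediction is the average of the trees' predictions
print(regressor.predict(x_new)[0])
print(np.mean(tree_preds))
```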
Extra Trees Classifier using sklearn
We can use the following Python code to solve classification problems using Extra Trees.
import seaborn
from sklearn.model_selection import train_test_split
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score

# Load the iris dataset and separate the features from the target column
df = seaborn.load_dataset("iris")
df_features = df.drop(labels=["species"], axis=1)
df_target = df.filter(items=["species"])

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(df_features, df_target["species"], shuffle=True, random_state=1)

# Fit an Extra Trees classifier and evaluate it on the test set
classifier = ExtraTreesClassifier(n_estimators=100)
classifier.fit(X_train, y_train)
y_test_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_test_pred)
print("Accuracy Score: ", accuracy)
Here, we first read the "iris" dataset using seaborn. The iris dataset contains four feature columns (sepal length, sepal width, petal length, and petal width) and a "species" column that serves as the classification target.
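Once fitted, the ensemble also exposes a feature_importances_ attribute, one score per input column, which indicates how much each measurement contributes to the predictions. A sketch using scikit-learn's bundled copy of the iris data (so no download is needed):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier

# Load iris from scikit-learn's bundled datasets
iris = load_iris()

clf = ExtraTreesClassifier(n_estimators=100, random_state=1)
clf.fit(iris.data, iris.target)

# One importance score per feature column; the scores sum to 1
for name, score in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {score:.3f}")
```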