petal length, and petal width. Based on these features, a machine learning model can predict the species of flowers. The last column, or the “species” column, is the target variable here.
df = seaborn.load_dataset("iris")
df_features = df.drop(labels=["species"], axis=1)
df_target = df.filter(items=["species"])
Next, we separate the features from the target variable. The last column, “species”, is the target, so we drop it with drop() to obtain the features and select it with filter() to obtain the target. Note that filter() returns a one-column DataFrame, which is why the next step passes df_target["species"] rather than df_target itself.
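To make the difference between drop(), filter(), and column indexing concrete, here is a minimal sketch. It assumes pandas is installed and uses three hand-typed rows as a stand-in for the real 150-row dataset, so nothing is downloaded; the name target_series is introduced only for illustration.

```python
import pandas as pd

# Hand-typed stand-in rows (assumption: same column names as the seaborn iris dataset).
df = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "petal_length": [1.4, 4.7, 6.0],
    "petal_width": [0.2, 1.4, 2.5],
    "species": ["setosa", "versicolor", "virginica"],
})

# drop() removes the target column and keeps the four feature columns.
df_features = df.drop(labels=["species"], axis=1)

# filter() keeps only the listed columns and still returns a DataFrame.
df_target = df.filter(items=["species"])

# Indexing by column name turns the one-column DataFrame into a Series.
target_series = df_target["species"]

print(df_features.shape)                                   # (3, 4)
print(type(df_target).__name__, df_target.shape)           # DataFrame (3, 1)
print(type(target_series).__name__, target_series.shape)   # Series (3,)
```

Passing the Series rather than the one-column DataFrame gives the classifier a one-dimensional target, which is the shape scikit-learn estimators expect for y.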
X_train, X_test, y_train, y_test = train_test_split(df_features, df_target["species"], shuffle=True, random_state=1)
Now, we split the dataset into training and test sets. The shuffle=True argument, which is also the default, shuffles the data before splitting, and the random_state argument seeds the pseudo-random number generator used for shuffling so that the split is reproducible. Because test_size is not specified, 25% of the rows go to the test set by default.
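The effect of the default test size and of random_state can be checked with a short sketch. To stay self-contained it loads the same iris data through scikit-learn's load_iris() rather than seaborn (an assumption made for this example only), and the *_again variable names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load iris as a DataFrame of features and a Series of targets (150 rows).
X, y = load_iris(return_X_y=True, as_frame=True)

# Same arguments as in the article: default test_size, shuffled, fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, shuffle=True, random_state=1
)

# Repeat the split with the same seed to confirm it is reproducible.
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, shuffle=True, random_state=1
)

print(len(X_train), len(X_test))                  # 112 38
print(X_test.index.equals(X_test_again.index))    # True
```

With 150 rows and the default 25% test share, the test set is rounded up to 38 rows, leaving 112 for training.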
classifier = ExtraTreesClassifier(n_estimators=100)
classifier.fit(X_train, y_train)
y_test_pred = classifier.predict(X_test)
Now, we initialize the classifier using the ExtraTreesClassifier class. The n_estimators argument sets the number of decision trees in the forest. We then call fit() to train the model on the training set, and after that we call predict() to predict the target for the test set.
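The classifier above is created without a random_state, so the trees it builds can differ from run to run. The sketch below shows one way to make the predictions repeatable by also seeding the classifier. The seed value, the fit_and_predict helper, and the use of load_iris() instead of seaborn are assumptions made for this example, not part of the original program.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, shuffle=True, random_state=1
)

def fit_and_predict(seed):
    """Hypothetical helper: train a seeded forest and predict on the test set."""
    clf = ExtraTreesClassifier(n_estimators=100, random_state=seed)
    clf.fit(X_train, y_train)
    return clf, clf.predict(X_test)

# Two independent fits with the same seed build identical forests.
clf_a, pred_a = fit_and_predict(seed=1)
clf_b, pred_b = fit_and_predict(seed=1)

print(np.array_equal(pred_a, pred_b))   # True

# Relative importance of each of the four features (values sum to 1).
print(clf_a.feature_importances_.round(3))
```

Seeding both the split and the classifier is what makes a reported score reproducible; seeding only the split fixes which rows are tested but not how the trees are grown.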
accuracy = accuracy_score(y_test, y_test_pred)
print("Accuracy Score: ", accuracy)
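Here, we use the accuracy score (see: What is the accuracy score in machine learning?) to evaluate the performance of the model. Because the classifier is created without a random_state, the exact score may vary slightly between runs. The output of the above program will be similar to: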
Accuracy Score: 0.9736842105263158
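The accuracy score is simply the fraction of test samples whose predicted label matches the true label. The sketch below spells that out in plain Python; the accuracy function and the toy label lists are illustrative and are not scikit-learn's implementation.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Toy labels for illustration: 3 of the 4 predictions are correct.
y_true = ["setosa", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "versicolor", "versicolor", "virginica"]
print(accuracy(y_true, y_pred))   # 0.75

# With a 38-row test set, 37 correct predictions give the score shown above.
score_37_of_38 = 37 / 38
print(score_37_of_38)
```

Read this way, the score printed above is consistent with the model misclassifying a single flower out of the 38 in the test set.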







































