Random forest is an ensemble learning method in machine learning that combines the predictions of many decision trees. It can be used for both classification and regression: a random forest regressor solves regression problems, and a random forest classifier solves classification problems.
In the case of a regression problem, the random forest regressor returns the mean of the predictions made by the individual trees in the forest. Readers who want to know more about how random forests work can refer to this YouTube video: https://www.youtube.com/watch?v=J4Wdy0Wc_xQ
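The averaging behaviour can be checked directly. The following is a minimal sketch, assuming a small synthetic dataset generated with make_regression (the dataset and the choice of 50 trees are illustrative assumptions, not part of the original example). It compares the forest's prediction with a manual average over the fitted trees, which sklearn exposes through the estimators_ attribute.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, used here only for illustration (an assumption)
X, y = make_regression(n_samples=200, n_features=2, noise=10.0, random_state=1)

forest = RandomForestRegressor(n_estimators=50, random_state=1)
forest.fit(X, y)

# Predict the first five samples with every individual tree
# Resulting shape: (number of trees, number of samples)
tree_predictions = np.array([tree.predict(X[:5]) for tree in forest.estimators_])

# Average the per-tree predictions by hand
manual_average = tree_predictions.mean(axis=0)

# Prediction returned by the forest itself
forest_prediction = forest.predict(X[:5])

print("Manual average:   ", manual_average)
print("Forest prediction:", forest_prediction)
print("Match:", np.allclose(manual_average, forest_prediction))
```

If the two arrays match, the forest's output for a regression problem is indeed the mean of what its trees predict.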
How to solve regression problems using a random forest regressor in sklearn?
A random forest regressor is well suited to regression problems where the relation between the features and the target variable is non-linear. In this article, we will read the “tips” dataset and try to predict the tip amount from the total bill amount and the party size, using a random forest regressor.
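To illustrate the point about non-linear relations, here is a small sketch that compares a linear regression with a random forest regressor on synthetic data where the target is roughly the square of the feature. The data, the noise level, and the variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic non-linear data: y is approximately x squared plus noise (an assumption)
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.3, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Fit both models on the same training data
linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(random_state=1).fit(X_train, y_train)

# Compare R2 scores on the held-out test data
linear_r2 = r2_score(y_test, linear.predict(X_test))
forest_r2 = r2_score(y_test, forest.predict(X_test))

print("Linear regression R2:", linear_r2)
print("Random forest R2:    ", forest_r2)
```

On data like this, a straight line cannot follow the curve, so the linear model should score much lower than the forest. Real datasets are rarely this clear-cut, so it is worth comparing both models before choosing one.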
The following Python code reads the “tips” dataset, trains a random forest regressor on it, and reports the R2 score and the RMSE on the test data:
import seaborn
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

# Load the "tips" dataset and show a summary of its columns
df = seaborn.load_dataset("tips")
df.info()

# Split the columns into features and target variable
df_features = df.filter(items=["total_bill", "size"])
df_target = df.filter(items=["tip"])
print(df_features.head())
print(df_target.head())

# Hold out 20% of the rows as test data
X_train, X_test, y_train, y_test = train_test_split(df_features, df_target["tip"], test_size=0.2, shuffle=True, random_state=1)

# Train the random forest regressor and predict on the test data
regressor = RandomForestRegressor(random_state=1)
regressor.fit(X_train, y_train)
y_test_pred = regressor.predict(X_test)

# Evaluate the predictions
r2 = r2_score(y_test, y_test_pred)
# RMSE is the square root of the MSE (this avoids the squared=False argument,
# which newer scikit-learn releases no longer accept)
rmse = mean_squared_error(y_test, y_test_pred) ** 0.5
print("R2 Score: ", r2)
print("RMSE: ", rmse)
Here, we are first reading the “tips” dataset and splitting the dataset into features and target variable. The df_features …