In one of our previous articles, we discussed linear regression. Linear regression works well when the target variable is linearly related to the features. But when there are many features and the relationship between the features and the target variable is non-linear, we can use a regression tree instead.
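To make the contrast concrete, here is a minimal sketch that fits both models to a non-linear relationship. The data is synthetic and purely illustrative (a noisy quadratic generated with NumPy), and the variable names are our own assumptions, not part of the example used later in this article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic, illustrative data: y depends on x quadratically, plus a little noise
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.3, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Fit a straight line and a regression tree on the same training data
linear = LinearRegression().fit(X_train, y_train)
tree = DecisionTreeRegressor(random_state=1).fit(X_train, y_train)

linear_r2 = r2_score(y_test, linear.predict(X_test))
tree_r2 = r2_score(y_test, tree.predict(X_test))

print("Linear regression R2:", linear_r2)
print("Regression tree R2:", tree_r2)
```

On data like this, the straight line cannot follow the curve, so its R2 score should stay close to zero, while the tree can approximate the curve piece by piece.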
Decision tree learning is a supervised learning approach. A decision tree is used as a predictive model to draw conclusions about the target variable. A decision tree can be a regression tree or a classification tree. We use a regression tree to solve regression problems, where the target is a continuous value, and a classification tree to solve classification problems, where the target is a class label.
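The difference between the two kinds of trees can be sketched with sklearn's two estimators, DecisionTreeRegressor and DecisionTreeClassifier. The tiny dataset below is made up for illustration, and max_depth=1 is an assumption chosen only to keep the regression tree to a single split.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Made-up, illustrative data: one feature, two clearly separated groups
X = [[1], [2], [3], [10], [11], [12]]
y_numeric = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]                 # continuous target
y_labels = ["low", "low", "low", "high", "high", "high"]   # class target

# Regression tree: each leaf predicts the mean target of its training rows
regressor = DecisionTreeRegressor(max_depth=1, random_state=1)
regressor.fit(X, y_numeric)
reg_pred = regressor.predict([[2], [11]])

# Classification tree: each leaf predicts a class label
classifier = DecisionTreeClassifier(random_state=1)
classifier.fit(X, y_labels)
cls_pred = classifier.predict([[2], [11]])

print("Regression tree predictions:", reg_pred)
print("Classification tree predictions:", cls_pred)
```

The regression tree returns numbers (here, the average of each group), while the classification tree returns one of the labels it saw during training.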
In this article, we will show with an example how to solve regression problems using a regression tree. Readers who want to know more about how regression trees work can refer to this YouTube video: https://www.youtube.com/watch?v=g9c66TUylZ4
How to solve regression problems using a regression tree in sklearn?
Let’s read the “tips” dataset from the seaborn library and try to predict the tip amount from the total bill amount and the size of the party (the “total_bill” and “size” columns). We will use a regression tree here.
We can use the following Python code to solve the regression problem:
import seaborn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score, mean_squared_error

# Load the "tips" dataset and print a summary of its columns
df = seaborn.load_dataset("tips")
df.info()

# Separate the features from the target
df_features = df.filter(items=["total_bill", "size"])
df_target = df.filter(items=["tip"])
print(df_features.head())
print(df_target.head())

# Hold out 20% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(df_features, df_target["tip"], test_size=0.2, shuffle=True, random_state=1)

# Fit the regression tree on the training set
regressor = DecisionTreeRegressor(random_state=1)
regressor.fit(X_train, y_train)

# Predict tips for the test set and evaluate the predictions
y_test_pred = regressor.predict(X_test)
r2 = r2_score(y_test, y_test_pred)
# RMSE is the square root of the MSE; the squared=False argument
# is deprecated in recent scikit-learn releases
rmse = mean_squared_error(y_test, y_test_pred) ** 0.5
print("R2 Score: ", r2)
print("RMSE: ", rmse)
First, we load the “tips” dataset using the seaborn library. After that, the dataset is split into features and …
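The two metrics printed at the end of the code, the R2 score and the RMSE, can also be computed by hand, which may help in reading them. The sketch below uses four made-up values (not taken from the “tips” dataset) and compares the manual results with sklearn's functions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up, illustrative values: actual tips and predicted tips
y_true = np.array([3.0, 2.0, 4.0, 5.0])
y_pred = np.array([2.5, 2.0, 4.5, 4.0])

# RMSE: square root of the mean of the squared errors
rmse_manual = np.sqrt(np.mean((y_true - y_pred) ** 2))

# R2: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# The same metrics from sklearn, for comparison
rmse_sklearn = mean_squared_error(y_true, y_pred) ** 0.5
r2_sklearn = r2_score(y_true, y_pred)

print("RMSE:", rmse_manual, rmse_sklearn)
print("R2 Score:", r2_manual, r2_sklearn)
```

An R2 score closer to 1 means the predictions explain more of the variation in the target, and the RMSE is expressed in the same unit as the target, so a lower value means the predicted tips are closer to the actual tips.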