   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64
dtypes: category(4), float64(2), int64(1)
memory usage: 7.4 KB
Next, we look at the relationship between the total bill and the tip. The scatterplot shown earlier already visualizes this relationship.
X_train, X_test, y_train, y_test = train_test_split(df[["total_bill"]], df["tip"], train_size=0.8, shuffle=True, random_state=1)
We now use the train_test_split() function to split the dataset into a training set and a test set. The machine learning model is trained on the training set and then run on the test set, which it has not seen during training, to measure its performance.
The train_size=0.8 parameter keeps 80% of the dataset as the training set, and the remaining 20% becomes the test set. The shuffle=True parameter shuffles the data before splitting, and random_state=1 seeds the random number generator used for the shuffling, so the same split is reproduced on every run.
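To make the effect of these parameters concrete, here is a minimal sketch that applies the same train_test_split() call to a synthetic stand-in with 244 rows. The arrays X and y are made-up placeholders, not the tips data. It prints the resulting set sizes and checks that repeating the call with the same random_state gives the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with 244 rows (assumption: values are made up,
# only the row count matches the dataset above).
X = np.arange(244).reshape(-1, 1)  # one feature column, like df[["total_bill"]]
y = np.arange(244) * 0.1           # target values, like df["tip"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, shuffle=True, random_state=1
)

# 80% of 244 is 195.2, which is rounded down to 195 training rows;
# the remaining 49 rows form the test set.
print(len(X_train), len(X_test))  # prints: 195 49

# Repeating the call with the same random_state reproduces the same split.
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, train_size=0.8, shuffle=True, random_state=1
)
print(np.array_equal(X_train, X_train_again))  # prints: True
```

Changing random_state to another integer would generally produce a different, but again reproducible, split.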
Next, we initialize the linear regressor and fit it on the training set.
linear_regressor = LinearRegression()
linear_regressor.fit(X_train, y_train)
Please note that calling linear_regressor.fit() on the training set is the step in which the model learns the coefficients of the linear regression. With a single feature, these are the slope for total_bill and the intercept.
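To illustrate what "learning the coefficients" means for a single feature, the sketch below computes the ordinary least squares slope and intercept by hand. This shows the underlying formula and is not scikit-learn's internal implementation. The helper name fit_simple_ols and the four data points are hypothetical, chosen only for the example. In scikit-learn, the corresponding fitted values are available after fit() as linear_regressor.coef_ and linear_regressor.intercept_.

```python
from statistics import mean

# Made-up (total_bill, tip) pairs for illustration only;
# these are not rows from the tips dataset.
bills = [10.0, 20.0, 30.0, 40.0]
tips = [2.0, 3.0, 5.0, 6.0]


def fit_simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of products of deviations (proportional to the covariance of x and y).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sum of squared deviations of x (proportional to the variance of x).
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_simple_ols(bills, tips)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

# A prediction is then just the fitted line evaluated at a new bill amount.
predicted_tip = slope * 25.0 + intercept
print(f"predicted tip for a bill of 25.0: {predicted_tip:.2f}")
```

For these four points the fitted line is roughly tip = 0.14 * total_bill + 0.5, so the slope reads as about 14 cents of tip per additional unit of bill.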
Now, it is time to run the linear regression model on the test set and measure the performance…





