df = df.dropna()
Now, we can split the dataset into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(df[["horsepower", "weight", "acceleration"]], df["mpg"], train_size=0.8, shuffle=True, random_state=1)
Note that shuffle=True shuffles the dataset before it is split, and random_state=1 seeds the random number generator used for the shuffling, so the same split is produced on every run. With train_size=0.8, we keep 80% of the data as the training set and the remaining 20% as the test set.
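As a minimal sketch of how these parameters behave, the snippet below splits a small synthetic array (a hypothetical stand-in for the real dataset, not the data used in this article) twice with the same random_state and checks that both splits are identical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 samples with 3 features each.
X = np.arange(30).reshape(10, 3)
y = np.arange(10)

# Two splits with identical settings, including the same seed.
split_a = train_test_split(X, y, train_size=0.8, shuffle=True, random_state=1)
split_b = train_test_split(X, y, train_size=0.8, shuffle=True, random_state=1)

X_train, X_test, y_train, y_test = split_a
print(X_train.shape, X_test.shape)  # (8, 3) (2, 3)

# The same random_state yields the same shuffled split.
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same)  # True
```

Changing random_state to another integer would generally give a different, but equally reproducible, split.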
Now, we execute the following Python statements:
ridge_regressor = Ridge()
ridge_regressor.fit(X_train, y_train)
We first initialize the Ridge regressor. After that, we use the training set to train the model.
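Calling Ridge() without arguments uses the default regularization strength, alpha=1.0. As a rough illustration of what that parameter does, the sketch below fits Ridge models with several alpha values on synthetic data (the data and the chosen alpha values are assumptions for illustration only) and prints the norm of the learned coefficients, which shrinks as alpha grows.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical synthetic data: 100 samples, 3 features, known linear signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

default_alpha = Ridge().alpha  # 1.0 when no argument is given

# A larger alpha means a stronger L2 penalty and smaller coefficients.
norms = []
for alpha in (0.1, 1.0, 100.0):
    model = Ridge(alpha=alpha).fit(X, y)
    norms.append(np.linalg.norm(model.coef_))
    print(alpha, norms[-1])
```

In practice, alpha is usually chosen by cross-validation rather than fixed in advance.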
Now, we can use the model to predict the output for the test set, compare the predictions with the actual values, and measure the performance of the model.
y_test_predicted = ridge_regressor.predict(X_test)
r2 = r2_score(y_test, y_test_predicted)
rmse = mean_squared_error(y_test, y_test_predicted) ** 0.5  # square root of the MSE; the squared=False argument was deprecated in scikit-learn 1.4 and later removed
print("R2: ", r2)
print("RMSE: ", rmse)
This model gives the following R-squared score and Root Mean Square Error (RMSE).
R2: 0.719642160467646
RMSE: 4.408519403150911
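To make the two metrics concrete, here is a small sketch that computes them by hand with NumPy from their standard definitions: R-squared is one minus the ratio of the residual sum of squares to the total sum of squares, and RMSE is the square root of the mean squared residual. The arrays are made-up values for illustration, not outputs of the model above.

```python
import numpy as np

# Hypothetical actual and predicted values.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 8.0])

residuals = y_true - y_pred

# R-squared: 1 - SS_res / SS_tot
ss_res = np.sum(residuals ** 2)                  # 1.5
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # 20.0
r2_manual = 1 - ss_res / ss_tot

# RMSE: square root of the mean squared residual
rmse_manual = np.sqrt(np.mean(residuals ** 2))

print("R2:", r2_manual)      # approximately 0.925
print("RMSE:", rmse_manual)  # approximately 0.612
```

An R-squared closer to 1 and an RMSE closer to 0 both indicate predictions that track the actual values more closely. RMSE is expressed in the units of the target, which here is mpg.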







































