variable.
Next, we split the dataset into a training set and a test set. The test_size argument specifies the fraction of the data held out for testing (0.3 here, i.e. 30%). The shuffle argument specifies that the data is shuffled before splitting, and random_state seeds the pseudo-random number generator used for the shuffle, so the split is reproducible.
Then, we initialize the classifier using the LogisticRegression class and train it on the training set. After that, we use pickle to save the model to a file.
If we need to load the model later and use it, we can do so with the following Python code.
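To make the roles of test_size, shuffle, and random_state concrete, here is a minimal pure-Python sketch of a seeded, shuffled split. The helper name simple_split and its rounding rule are assumptions for illustration; this is not scikit-learn's implementation, only the same idea in miniature.

```python
import math
import random


def simple_split(samples, test_size=0.3, shuffle=True, random_state=None):
    """Hypothetical helper: split samples into (train, test) lists."""
    indices = list(range(len(samples)))
    if shuffle:
        # A seeded generator makes the shuffle, and so the split, repeatable.
        random.Random(random_state).shuffle(indices)
    # Assumption: round the test-set size up to a whole number of samples.
    n_test = math.ceil(test_size * len(samples))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [samples[i] for i in train_idx]
    test = [samples[i] for i in test_idx]
    return train, test


samples = list(range(10))
train_a, test_a = simple_split(samples, test_size=0.3, random_state=1)
train_b, test_b = simple_split(samples, test_size=0.3, random_state=1)
print(len(train_a), len(test_a))   # 7 3
print(test_a == test_b)            # True: same seed, same split
```

Calling the helper twice with the same random_state returns the same split, which is why fixing the seed lets a later script rebuild the same test set.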
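The save step relies on pickle.dump(), and loading the file later relies on pickle.load(). The sketch below shows that round trip in isolation. A plain dictionary of made-up parameters stands in for the fitted classifier, and the file name toy_model.sav is hypothetical; a fitted estimator can be passed to dump() in the same way.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model: made-up parameters, not real training output.
toy_model = {"weights": [0.5, -1.25], "bias": 0.1}

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "toy_model.sav")

# Serialize the object to disk; the with-block closes the file.
with open(path, "wb") as f:
    pickle.dump(toy_model, f)

# Later (possibly in another program): read the object back.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == toy_model)  # True
```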
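from pickle import load
import pandas
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the dataset; the last column is the target, the rest are features.
data = pandas.read_csv("diabetes.csv")
D = data.values
X = D[:, :-1]
y = D[:, -1]

# Split the data; a fixed random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=True, random_state=1)

# Load the saved model; the with-block closes the file afterwards.
with open("diabetes_classification.sav", "rb") as f:
    model = load(f)

# Evaluate the loaded model on the test set.
y_pred = model.predict(X_test)
print("Accuracy: ", accuracy_score(y_test, y_pred))
score = model.score(X_test, y_test)
print("Score: ", score)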
Here, we first load the model from the saved file using the load() function. After that, we evaluate the model on the test set.
with open("diabetes_classification.sav", "rb") as f:
    model = load(f)
y_pred = model.predict(X_test)
print("Accuracy: ", accuracy_score(y_test, y_pred))
score = model.score(X_test, y_test)
print("Score: ", score)
Note that the predict() function returns the predicted values, and the score() function evaluates the performance of the model. By default, score() returns the accuracy score for a classification problem.
The output of the given program will be the following:
Accuracy: 0.7748917748917749
Score: 0.7748917748917749
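The two values match because, for a classifier, score() reports the same quantity as accuracy_score(): the fraction of test samples whose predicted label equals the true label. A minimal pure-Python sketch of that computation, using a hypothetical helper name and made-up labels:

```python
def accuracy(y_true, y_pred):
    """Hypothetical helper: fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 0.75
```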