predictor variables such as the number of pregnancies the patient has had, the BMI, insulin level, age, etc. A machine learning model can learn from the dataset and predict whether the patient has diabetes based on these predictor variables.
D = data.values
X = D[:, :-1]
y = D[:, -1]
Next, we split the columns of the dataset into features and the target variable. X contains all the features (every column except the last), and y contains the target variable (the last column).
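To see what the slicing does, here is a minimal sketch on a made-up three-column frame. The name toy and its values are hypothetical and only stand in for the real dataset; the sketch assumes, as above, that the target sits in the last column.

```python
import pandas as pd

# Hypothetical toy frame: two feature columns and a binary target in the last column.
toy = pd.DataFrame({
    "bmi": [33.6, 26.6, 23.3, 28.1],
    "age": [50, 31, 32, 21],
    "outcome": [1, 0, 1, 0],
})

D_toy = toy.values       # NumPy array of shape (4, 3); mixed columns are upcast to float
X_toy = D_toy[:, :-1]    # all rows, every column except the last
y_toy = D_toy[:, -1]     # all rows, only the last column

print(X_toy.shape)  # (4, 2)
print(y_toy)        # [1. 0. 1. 0.]
```

Because .values returns a single NumPy array, the integer columns are converted to floats alongside the BMI column, which is why the target prints as 1. and 0.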
k_fold = KFold(n_splits=10, shuffle=True, random_state=1)
Now, we initialize the k-fold cross-validation. n_splits is the number of folds the data is divided into. shuffle=True shuffles the data before it is split into folds. random_state seeds the pseudo-random number generator used for the shuffling, so the same folds are produced on every run.
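The following sketch shows what a KFold object produces when it is asked to split data. It uses ten hypothetical samples and five folds (smaller than the ten folds above, purely to keep the printout short) and prints the row indices assigned to training and testing in each fold.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten hypothetical samples with two features each; only the row count matters here.
X_demo = np.arange(20).reshape(10, 2)

kf = KFold(n_splits=5, shuffle=True, random_state=1)

test_indices = []
for fold, (train_idx, test_idx) in enumerate(kf.split(X_demo), start=1):
    # Each fold holds out a different 2 of the 10 rows for testing.
    print(f"fold {fold}: train={train_idx}, test={test_idx}")
    test_indices.extend(test_idx.tolist())

# Across all folds, every row is used for testing exactly once.
print(sorted(test_indices))
```

Rerunning the sketch with the same random_state should give the same folds; changing it or dropping it gives a different shuffle.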
classifier = LinearDiscriminantAnalysis()
We are now initializing the classifier using the LinearDiscriminantAnalysis class.
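As a rough illustration of how such a classifier behaves once it is fitted, the sketch below trains LinearDiscriminantAnalysis on two synthetic, well-separated clusters. The data, the names X_demo, y_demo and lda, and the cluster centres are assumptions made for the example and have nothing to do with the diabetes dataset.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic data (an assumption for illustration): two Gaussian clusters far apart.
rng = np.random.default_rng(0)
class0 = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
class1 = rng.normal(loc=5.0, scale=1.0, size=(50, 2))
X_demo = np.vstack([class0, class1])
y_demo = np.array([0] * 50 + [1] * 50)

lda = LinearDiscriminantAnalysis()
lda.fit(X_demo, y_demo)

# Query one point at each cluster centre; each should be assigned to its own cluster.
pred = lda.predict([[0.0, 0.0], [5.0, 5.0]])
print(pred)
```

In the tutorial itself we never call fit() directly, because cross_val_score() fits a fresh copy of the classifier on each fold.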
results = cross_val_score(classifier, X, y, cv=k_fold, scoring="accuracy")
mean_score = results.mean()
print("Accuracy:", mean_score)
Now, we use the cross_val_score() function to evaluate the performance of the model. It calculates the accuracy score, that is, the fraction of test samples classified correctly, for each iteration of the k-fold cross-validation. We then take the average of the ten accuracy scores.
The output of the above program will be:
Accuracy: 0.7759569377990431
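To make clear what cross_val_score() does for us, here is a sketch that repeats the same evaluation by hand and compares it with the library call. It runs on synthetic data from make_classification rather than the diabetes dataset, so its numbers will differ from the output above; X_demo, y_demo and manual_scores are names chosen for the example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (an assumption): 200 samples, 8 features, binary target.
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=1)

k_fold = KFold(n_splits=10, shuffle=True, random_state=1)
classifier = LinearDiscriminantAnalysis()

manual_scores = []
for train_idx, test_idx in k_fold.split(X_demo):
    model = clone(classifier)                        # fresh, unfitted copy per fold
    model.fit(X_demo[train_idx], y_demo[train_idx])  # train on the other 9 folds
    y_pred = model.predict(X_demo[test_idx])         # predict on the held-out fold
    manual_scores.append(accuracy_score(y_demo[test_idx], y_pred))

library_scores = cross_val_score(classifier, X_demo, y_demo, cv=k_fold, scoring="accuracy")

# Both approaches use the same folds, so the means should agree.
print(np.mean(manual_scores), library_scores.mean())
```

Because random_state is fixed, k_fold yields the same folds in the manual loop and inside cross_val_score(), which is why the two results can be compared directly.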







































