from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import AdaBoostClassifier
import pandas
data = pandas.read_csv("diabetes.csv")
D = data.values
X = D[:, :-1]
y = D[:, -1]
kfold = KFold(n_splits=10, shuffle=True, random_state=1)
model = AdaBoostClassifier(n_estimators=20, random_state=1)
result = cross_val_score(model, X, y, scoring="accuracy", cv=kfold)
print("Accuracy: ", result.mean())
Here, we are first reading the Pima Indians Diabetes dataset and splitting its columns into features and the target variable. The last column of the dataset holds the target, so X contains all the feature columns and y contains the target variable.
data = pandas.read_csv("diabetes.csv")
D = data.values
X = D[:, :-1]
y = D[:, -1]
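As a quick optional sanity check, the short sketch below (assuming diabetes.csv uses the standard Pima layout of 8 feature columns followed by one outcome column) confirms that the slicing did what we expect:
# Optional sanity check (assumption: standard Pima layout of
# 8 feature columns followed by one outcome column).
print(X.shape)           # expected (768, 8): 768 rows, 8 features
print(y.shape)           # expected (768,): one target value per row
print(data.columns[-1])  # name of the target column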
Now, we are initializing the k-fold cross-validation.
kfold = KFold(n_splits=10, shuffle=True, random_state=1)
Here, n_splits=10 creates 10 folds. The shuffle=True argument indicates that the data is shuffled before it is split into folds, and random_state initializes the pseudo-random number generator that is used for shuffling.
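To see what the splitter actually produces, here is a small illustrative sketch (not part of the original script) that iterates over the folds and prints how many rows land in each train/test split:
# Illustrative only: kfold.split(X) yields (train_indices, test_indices)
# pairs, one pair per fold; each row serves as test data exactly once.
for fold, (train_idx, test_idx) in enumerate(kfold.split(X)):
    print("Fold", fold, "- train rows:", len(train_idx), "test rows:", len(test_idx))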
model = AdaBoostClassifier(n_estimators=20, random_state=1)
Now, we are initializing the AdaBoost classifier. Please note that when we do not specify the argument estimator in the AdaBoostClassifier() constructor, a one-level DecisionTreeClassifier is used as the base estimator.
The argument n_estimators specifies the number of stumps, or one-level decision trees, used by the model. And the argument random_state initializes the pseudo-random number generator, which controls the randomness given to each estimator at each boosting iteration.
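To make that default explicit, here is an equivalent construction, a sketch assuming scikit-learn 1.2 or later (where the keyword is named estimator rather than the older base_estimator):
from sklearn.tree import DecisionTreeClassifier
# Passing the default base estimator by hand: a decision stump
# (a one-level tree, max_depth=1). This should behave the same as
# AdaBoostClassifier(n_estimators=20, random_state=1).
stump = DecisionTreeClassifier(max_depth=1)
model = AdaBoostClassifier(estimator=stump, n_estimators=20, random_state=1)
Either form can then be passed to cross_val_score exactly as in the full script above: the model is fit on nine folds and scored on the held-out fold, the process repeats for all ten folds, and result.mean() averages the ten accuracy scores.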