We can use the make_regression() function in sklearn to create a synthetic dataset for regression and then train a machine learning model on that dataset. The dataset has a specified number of features and target variables, and the target variables are continuous values. The function returns two ndarrays: one contains all the features and the other contains the target variables.
We can use the following Python code to create ndarrays containing data for regression using the make_regression() function.
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, n_targets=2, shuffle=True, random_state=1)
print(X.shape)
print(y.shape)
Here, the argument n_samples indicates the number of samples or records in the dataset. The arguments n_features and n_targets indicate the number of features and target variables, respectively. We can specify more than one target variable through the n_targets argument and use the created dataset for multioutput regression problems, in which a model predicts more than one target value for each sample.
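To illustrate how n_targets affects the returned target array, the following short sketch creates a dataset with a single target. The variable names X_single and y_single are chosen here only for illustration. With n_targets=1, y comes back as a one-dimensional array rather than a two-dimensional one.

```python
from sklearn.datasets import make_regression

# Same settings as in the article, but with a single target variable.
X_single, y_single = make_regression(n_samples=200, n_features=5, n_targets=1, random_state=1)

print(X_single.shape)  # (200, 5)
print(y_single.shape)  # (200,)
```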
The argument shuffle=True indicates that the samples and the features are shuffled when the dataset is created. The argument random_state seeds the pseudo-random number generator used for the randomization, so passing the same integer produces the same dataset on every run.
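The effect of random_state can be checked with a small experiment. The sketch below, which uses illustrative variable names, generates the dataset twice with the same seed and once with a different seed, and then compares the resulting arrays.

```python
import numpy as np
from sklearn.datasets import make_regression

# Two calls with the same seed and one call with a different seed.
X1, y1 = make_regression(n_samples=200, n_features=5, n_targets=2, shuffle=True, random_state=1)
X2, y2 = make_regression(n_samples=200, n_features=5, n_targets=2, shuffle=True, random_state=1)
X3, y3 = make_regression(n_samples=200, n_features=5, n_targets=2, shuffle=True, random_state=2)

# The same seed reproduces the same features and targets.
same_seed_equal = np.array_equal(X1, X2) and np.array_equal(y1, y2)
# A different seed produces different features.
different_seed_equal = np.array_equal(X1, X3)

print(same_seed_equal)       # True
print(different_seed_equal)  # False
```

Fixing random_state is useful when results need to be reproducible, for example when comparing models on the same generated data.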
The function make_regression() returns two variables X and y. X contains all the features, and y contains the target variables.
The output of the given program will be the following:
(200, 5)
(200, 2)
Here, we create 200 records with 5 features and 2 target variables, so the shape of the ndarray X is (200, 5) and the shape of the ndarray y is (200, 2).
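As a possible next step, the generated arrays can be passed directly to a regression model. The following is a minimal sketch, assuming LinearRegression as the model and a 75/25 train/test split; neither choice comes from the article, and any regressor that supports multiple targets could be used instead.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=5, n_targets=2, shuffle=True, random_state=1)

# Hold out 25% of the samples for evaluation (an assumption made for this sketch).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

# LinearRegression handles a two-dimensional y, so both targets are fitted together.
model = LinearRegression()
model.fit(X_train, y_train)

predictions = model.predict(X_test)
score = model.score(X_test, y_test)  # R^2 averaged over the two targets

print(predictions.shape)  # (50, 2)
print(score)
```

Because make_regression() adds no noise by default (noise=0.0), a linear model should fit this data almost perfectly. Passing a positive noise value makes the problem more realistic.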







































