+1 vote
in Programming Languages by (51.8k points)

I want to generate synthetic data with only non-negative values using the sklearn.datasets.make_classification() function. The function has no parameter for setting the range of the values, and by default the synthetic data contain both positive and negative values. Is there any way to get only non-negative values?

1 Answer

+2 votes
by (281k points)
Best answer

The sklearn.datasets.make_classification() function has no parameter for setting the range of the feature values, so the synthetic data it generates contain both positive and negative values. If you want only non-negative values, you can use sklearn.preprocessing.MinMaxScaler to rescale each feature to the range you want.

Here is an example that scales the feature values to the range 0-5.

>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=10, n_features=5, n_classes=2, random_state=1)
>>> X
array([[-0.19183555,  1.05492298, -0.7290756 , -1.14651383,  1.44634283],
       [-1.11731035,  0.79495321,  3.11651775, -2.85961623, -1.52637437],
       [ 0.2344157 , -1.92617151,  2.43027958,  1.49509867, -3.42524143],
       [-0.67124613,  0.72558433,  1.73994406, -2.00875146, -0.60483688],
       [-0.0126646 ,  0.14092825,  2.41932059, -1.52320683, -1.60290743],
       [ 1.6924546 ,  0.0230103 , -1.07460638,  0.55132541,  0.78712117],
       [ 0.74204416, -1.91437196,  3.84266872,  0.70896364, -4.42287433],
       [-0.74715829, -0.36632248, -2.17641632,  1.72073855,  1.23169963],
       [-0.88762896,  0.59936399, -1.18938753, -0.22942496,  1.37496472],
       [ 1.65980218, -1.04052679,  0.89368622,  1.03584131, -1.55118469]])

>>> from sklearn.preprocessing import MinMaxScaler
>>> scaler = MinMaxScaler(feature_range=(0, 5))
>>> X1 = scaler.fit_transform(X)
>>> X1
array([[1.64689007, 5.        , 1.20229296, 1.87005427, 5.        ],
       [0.        , 4.56396926, 4.39679289, 0.        , 2.46753518],
       [2.4054077 , 0.        , 3.82674099, 4.75368733, 0.84988583],
       [0.79377497, 4.44762126, 3.25328547, 0.92881972, 3.25259515],
       [1.96572626, 3.46701483, 3.81763746, 1.45884922, 2.40233648],
       [5.        , 3.26923856, 0.91526364, 3.72344698, 4.43840751],
       [3.30873675, 0.01979063, 5.        , 3.8955278 , 0.        ],
       [0.65868865, 2.61623547, 0.        , 5.        , 4.81714495],
       [0.40871993, 4.23591992, 0.81991597, 2.87116544, 4.93919282],
       [4.94189474, 1.4854355 , 2.55030666, 4.2523535 , 2.4463992 ]])
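
Note that MinMaxScaler rescales each feature (column) independently, so every column ends up spanning exactly 0 to 5. The script below is a minimal sketch that repeats the scaling as a standalone program and checks the result. It also shows a second option that is not part of the answer above and is only an assumption about what you might want: subtracting each column's minimum, which makes the data non-negative while keeping each feature's original spread. The names X_scaled and X_shifted are just illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import MinMaxScaler

# Same synthetic data as in the example above.
X, y = make_classification(n_samples=10, n_features=5, n_classes=2, random_state=1)

# Option 1: rescale every column independently to the range [0, 5].
X_scaled = MinMaxScaler(feature_range=(0, 5)).fit_transform(X)

# Option 2 (assumption, not from the answer above): shift every column by its
# own minimum. The smallest value in each column becomes 0 and the distances
# between values are left unchanged.
X_shifted = X - X.min(axis=0)

# Per-column minima and maxima, to confirm there are no negative values left.
print("scaled  min:", X_scaled.min(axis=0), "max:", X_scaled.max(axis=0))
print("shifted min:", X_shifted.min(axis=0))
```

Which option fits better depends on whether the relative scale of the features matters for your model: min-max scaling forces all features onto the same range, while shifting only moves them.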


...