
I am applying scikit-learn's LogisticRegression to imbalanced data. How can I use the class_weight parameter to assign different weights to the classes?

1 Answer

The class_weight parameter of scikit-learn's LogisticRegression accepts a dictionary, the string 'balanced', or None (the default, which gives every class a weight of 1). If you want each class to contribute equally to the loss despite the imbalance, use class_weight='balanced'.

>>> from sklearn.linear_model import LogisticRegression

>>> clf = LogisticRegression(random_state=0, class_weight='balanced')

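When 'balanced' is used, the class weights are set inversely proportional to the class frequencies in the input data, using the formula n_samples / (n_classes * np.bincount(y)).
The following example shows how the weights are computed for a label array with seven samples of class 0 and four of class 1.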
>>> import numpy as np
>>> y = np.asarray([1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1])
>>> y
array([1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1])
>>> n_samples = len(y)
>>> n_classes = 2
>>> n_samples / (n_classes * np.bincount(y))
array([0.78571429, 1.375     ])
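
The same numbers can be cross-checked against scikit-learn's own helper, compute_class_weight from sklearn.utils.class_weight. This is a minimal sketch that reuses the label array from the example above; the variable names are only illustrative.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.asarray([1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1])
classes = np.unique(y)  # the distinct labels, here 0 and 1

# Weights scikit-learn derives for class_weight='balanced'
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)

# Manual formula from the example above
manual = len(y) / (len(classes) * np.bincount(y))

# Map each class label to its weight for easier reading
print(dict(zip(classes.tolist(), weights.tolist())))
print(np.allclose(weights, manual))
```

The rarer class (1) receives the larger weight, which is what "inversely proportional to frequency" means in practice.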

If you want to assign different weights to the classes manually, pass a dictionary that maps each class label to its weight.

>>> clf = LogisticRegression(random_state=0, class_weight={0:2,1:3})

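In the above example, the ratio of the weight of class 0 to the weight of class 1 is 2:3, so each class 1 sample counts 1.5 times as much in the loss as a class 0 sample.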
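
To see how the dictionary form relates to 'balanced', here is a small sketch on synthetic data. The dataset, the seed, and all variable names are made up for illustration. It passes the balanced weights explicitly as a dictionary and compares the fitted coefficients with those from class_weight='balanced'. It also counts how many samples each model assigns to the minority class.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced toy data: 90 samples of class 0, 10 of class 1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(90, 2)),
               rng.normal(1.5, 1.0, size=(10, 2))])
y = np.array([0] * 90 + [1] * 10)

# Build the dictionary that 'balanced' corresponds to for this y
classes = np.unique(y)
w = compute_class_weight(class_weight="balanced", classes=classes, y=y)
weight_dict = dict(zip(classes.tolist(), w.tolist()))

clf_plain = LogisticRegression(random_state=0).fit(X, y)
clf_balanced = LogisticRegression(random_state=0, class_weight="balanced").fit(X, y)
clf_dict = LogisticRegression(random_state=0, class_weight=weight_dict).fit(X, y)

# The explicit dictionary and 'balanced' should yield the same model
same_fit = np.allclose(clf_balanced.coef_, clf_dict.coef_)

# Number of training samples predicted as the minority class
n_pos_plain = int(clf_plain.predict(X).sum())
n_pos_balanced = int(clf_balanced.predict(X).sum())
print(weight_dict, same_fit, n_pos_plain, n_pos_balanced)
```

A dictionary is therefore the more general option: 'balanced' is a shortcut for one particular choice of weights, while a hand-written dictionary lets you pick any ratio, such as the 2:3 used above.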