Best answer

You need to pass the indices of the non-numeric features to CatBoost's **fit()** function; otherwise, it will throw an error. The parameter for those indices is "**cat_features**", and it accepts a list or a NumPy array. CatBoost internally converts the non-numeric features into numeric ones, which is why it needs to know their indices.

fit(X, y=None, cat_features=None, text_features=None, ...)

For example:

If your data has both numeric and non-numeric features, you need to find the indices of the non-numeric ones.

Let's assume that the non-numeric features in your data are at indices 2, 7, 9, and 10. The fit() call will then look like this:

cat_features_indx = [2, 7, 9, 10]

fit(X, y=None, cat_features=cat_features_indx, ...)
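If you would rather not work out the indices by hand, they can be collected programmatically. The following is only a minimal sketch: the helper `non_numeric_column_indices` and the toy data are hypothetical and not part of CatBoost, and the fit is skipped when the `catboost` package is not installed.

```python
from numbers import Number


def non_numeric_column_indices(rows):
    """Return the indices of columns holding at least one non-numeric value.

    Hypothetical helper for illustration; it is not a CatBoost function.
    """
    n_cols = len(rows[0])
    return [
        j for j in range(n_cols)
        if any(not isinstance(row[j], Number) for row in rows)
    ]


# Made-up toy data: columns 0 and 2 are numeric, columns 1 and 3 are strings.
X = [
    [1.0, "red", 3.5, "yes"],
    [2.0, "blue", 1.2, "no"],
    [3.0, "red", 4.8, "yes"],
    [4.0, "green", 0.7, "no"],
    [5.0, "blue", 2.9, "yes"],
    [6.0, "green", 3.1, "no"],
]
y = [1, 0, 1, 0, 1, 0]

cat_features_indx = non_numeric_column_indices(X)
print(cat_features_indx)  # [1, 3]

try:
    from catboost import CatBoostClassifier
except ImportError:
    CatBoostClassifier = None  # catboost is not installed; skip the fit

if CatBoostClassifier is not None:
    # Small iteration count only to keep the sketch fast.
    model = CatBoostClassifier(iterations=10, verbose=False, allow_writing_files=False)
    model.fit(X, y, cat_features=cat_features_indx)
```

If your data is in a pandas DataFrame, you can apply the same idea to the column dtypes instead of inspecting individual values.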

You can find a CatBoost classifier example with categorical mushroom data **here**.