Anchor contains all the features #50

Open
laramdemajo opened this issue May 26, 2020 · 2 comments

Comments

@laramdemajo

I am using the HELOC dataset, which can be downloaded from https://community.fico.com/s/explainable-machine-learning-challenge?tabset-3158a=2.

I am using an XGBoost model as the classifier and trying to use Anchors as an explainability technique on top of it.

I am using the code below to compute anchors. However, the resulting anchors contain all the features for most instances in the test data, which makes them hard to read and therefore not very interpretable. Moreover, with a threshold of 0.8, the precision of the full anchor is only 0.33.

import numpy as np
import pandas as pd
import xgboost as xgb
from anchor import anchor_tabular

# Build the explainer from the training data; no categorical features are declared.
explainer = anchor_tabular.AnchorTabularExplainer(
    class_names=['Bad', 'Good'],
    feature_names=dfTrain.columns,
    train_data=np.array(dfTrain),
    categorical_names={})

idx = 100
np.random.seed(1)
# model, dfTrain, dfTest and yTest are defined earlier (not shown in this issue).
predict_fn = lambda x: model.predict(xgb.DMatrix(pd.DataFrame(x, columns=list(dfTest.columns)), label = [yTest[idx]]))
print('Prediction: ', explainer.class_names[int(round(predict_fn(dfTest.iloc[[idx],:])[0]))])
exp = explainer.explain_instance(np.array(dfTest.iloc[[idx],:]), predict_fn, threshold=0.8)

print('Anchor: %s' % (' AND '.join(exp.names())))
print('Precision: %.2f' % exp.precision())
print('Coverage: %.2f' % exp.coverage())
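
One detail in the snippet above may be worth checking: `predict_fn` is rounded to a class index for the printed prediction, but it is passed to `explain_instance` unrounded. If the model was trained with a probability objective such as `binary:logistic` (an assumption, the issue does not say), `model.predict` returns probabilities rather than labels. An explainer that scores samples by checking whether their prediction equals the instance's prediction would then almost never see a match. The toy below uses a hypothetical `fake_proba` in place of the XGBoost model to illustrate the effect, together with a hypothetical wrapper `as_labels` that returns integer classes.

```python
import math
import random

def fake_proba(rows):
    """Hypothetical stand-in for model.predict: returns P(class 1) per row."""
    return [1.0 / (1.0 + math.exp(-(r[0] - 0.5) * 4.0)) for r in rows]

def as_labels(proba_fn, cutoff=0.5):
    """Wrap a probability function so that it returns integer class labels."""
    def label_fn(rows):
        return [int(p >= cutoff) for p in proba_fn(rows)]
    return label_fn

def agreement(predict, instance, samples):
    """Fraction of samples whose prediction equals the instance's prediction."""
    target = predict([instance])[0]
    preds = predict(samples)
    return sum(p == target for p in preds) / len(samples)

rng = random.Random(1)
instance = [0.9]
# Perturbations that stay close to the instance, all on the same side of the cutoff.
samples = [[0.9 + rng.uniform(-0.05, 0.05)] for _ in range(1000)]

# Comparing raw probabilities for exact equality almost never succeeds.
raw_agreement = agreement(fake_proba, instance, samples)
# Comparing thresholded labels succeeds for every nearby sample.
label_agreement = agreement(as_labels(fake_proba), instance, samples)
print(raw_agreement, label_agreement)
```

If this applies, a label-returning wrapper around the real `predict_fn` (for example thresholding at 0.5 and casting to `int`) would be the analogous change; whether it resolves the low precision reported here is not confirmed in this thread.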

Here is a screenshot of a sample anchor:
[Image: Screen Shot 2020-05-26 at 21 16 12]

Is there anything I can do on my end to improve this?

Thanks,
Lara

@marcotcr
Owner

marcotcr commented Jun 5, 2020

If precision is still not 1 with all the features in the anchor, the discretization is too coarse. Try using discretizer='decile' or providing your own discretizer.
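
To see why a coarse discretization can cap precision, here is a toy illustration using only the standard library. It bins a single synthetic feature into quartiles and into deciles, then measures how pure the bin containing the instance is under a hypothetical threshold model (`toy_model`, not the HELOC classifier). The data, the 0.8 threshold, and the helper `bin_purity` are all assumptions made for illustration. In the anchor package, the corresponding change would be passing `discretizer='decile'` when constructing `AnchorTabularExplainer`, as suggested above.

```python
import bisect
import random
import statistics

def toy_model(x):
    """Hypothetical classifier: class 1 when the feature exceeds 0.8."""
    return int(x > 0.8)

def bin_purity(values, n_bins, instance):
    """Share of values in the instance's bin that get the instance's label."""
    edges = statistics.quantiles(values, n=n_bins)  # n_bins - 1 cut points
    target_bin = bisect.bisect_right(edges, instance)
    in_bin = [v for v in values if bisect.bisect_right(edges, v) == target_bin]
    same = sum(toy_model(v) == toy_model(instance) for v in in_bin)
    return same / len(in_bin)

rng = random.Random(0)
values = [rng.random() for _ in range(5000)]  # synthetic feature, uniform on [0, 1)
instance = 0.95

# Top quartile spans roughly 0.75 to 1.0, so it straddles the 0.8 decision boundary.
quartile_purity = bin_purity(values, 4, instance)
# Top decile spans roughly 0.9 to 1.0, entirely on one side of the boundary.
decile_purity = bin_purity(values, 10, instance)
print(quartile_purity, decile_purity)
```

With quartiles, the rule "feature is in the top bin" mixes both classes, so no anchor built from that bin can reach a precision of 1; with deciles the same rule becomes pure.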

@marcotcr
Owner

marcotcr commented Jun 5, 2020

It may very well be the case that the model is too 'jumpy', in which case a full anchor is the right thing even if it's not useful (this is a limitation of anchors as discussed in the paper). But it sounds like the problem here is potentially the discretization.
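
The 'jumpy' case can be illustrated with a toy model whose prediction flips many times inside a single bin, so that even a rule pinning the feature to the instance's bin cannot reach high precision. `smooth_model` and `jumpy_model` are hypothetical, and precision is estimated here simply as the fraction of samples satisfying the rule that receive the same label as the instance; this is a sketch of the idea, not the library's sampling procedure.

```python
import random

def smooth_model(x):
    """Hypothetical classifier with a single decision boundary at 0.5."""
    return int(x > 0.5)

def jumpy_model(x):
    """Hypothetical classifier whose label alternates every 0.01 units."""
    return int(x * 100) % 2

def rule_precision(model, instance, low, high, n=2000, seed=0):
    """Estimate precision of the rule low <= x <= high for the given instance."""
    rng = random.Random(seed)
    target = model(instance)
    hits = sum(model(rng.uniform(low, high)) == target for _ in range(n))
    return hits / n

instance = 0.905
# The rule fixes the feature to the bin [0.9, 1.0] that contains the instance.
smooth_precision = rule_precision(smooth_model, instance, 0.9, 1.0)
jumpy_precision = rule_precision(jumpy_model, instance, 0.9, 1.0)
print(smooth_precision, jumpy_precision)
```

For the smooth model the rule is a perfect anchor, while for the jumpy model it is right only about half the time, and adding more features to the rule would not help.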
