
v0.2.0

@cdpierse cdpierse released this 08 Mar 14:13
· 182 commits to master since this release
deebcf8

Support Attributions for Multiple Embedding Types

  • SequenceClassificationExplainer now supports word attributions for both word_embeddings and position_embeddings for models where position_ids are part of the model's forward method. The embedding type used for attribution can be selected via the class's __call__ method.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a RoBERTa-based IMDB sentiment model; RoBERTa's forward method
# accepts position_ids, so position embedding attributions are available.
tokenizer = AutoTokenizer.from_pretrained("aychang/roberta-base-imdb")
model = AutoModelForSequenceClassification.from_pretrained("aychang/roberta-base-imdb")

Word Embedding Attributions

from transformers_interpret import SequenceClassificationExplainer
cls_explainer = SequenceClassificationExplainer(
    "This was a really good film I enjoyed it a lot", 
    model, 
    tokenizer)
attributions = cls_explainer(embedding_type=0) # 0 = word

>>> attributions.word_attributions 
[('<s>', 0.0),
 ('This', -0.3240508614377356),
 ('was', 0.1438011922867732),
 ('a', 0.2243325698743557),
 ('really', 0.2303368793560317),
 ('good', -0.0600901206724276),
 ('film', 0.01613507050261139),
 ('I', 0.002752767414682212),
 ('enjoyed', 0.36666383287176274),
 ('it', 0.46981294407030466),
 ('a', 0.15187907852049023),
 ('lot', 0.6235539369814076),
 ('</s>', 0.0)]

Position Embedding Attributions

from transformers_interpret import SequenceClassificationExplainer
cls_explainer = SequenceClassificationExplainer(
    "This was a really good film I enjoyed it a lot", 
    model, 
    tokenizer)
attributions = cls_explainer(embedding_type=1) # 1 = position
>>> attributions.word_attributions 
[('<s>', 0.0),
 ('This', -0.011571866816239364),
 ('was', 0.9746020664206717),
 ('a', 0.06633740353266766),
 ('really', 0.007891184021722232),
 ('good', 0.11340512797772889),
 ('film', -0.1035443669783489),
 ('I', -0.030966387400513003),
 ('enjoyed', -0.07312861129345115),
 ('it', -0.062475007741951326),
 ('a', 0.05681161636240444),
 ('lot', 0.04342110477675596),
 ('</s>', 0.08154160609887448)]

Additional Functionality Added To Base Explainer

To support multiple embedding types in the classification explainer, a number of handlers were added to the BaseExplainer so that this functionality can be added easily to future explainers.

  • BaseExplainer inspects the signature of a model's forward function and determines whether it accepts position_ids and token_type_ids. For example, BERT models take both as optional parameters, whereas DistilBERT takes neither.
  • Based on this inspection, the available embedding types are set in the BaseExplainer rather than in the explainers that inherit from it.
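The inspection described above can be sketched with Python's inspect module. The helper and forward signatures below are hypothetical illustrations of the idea, not the library's actual code:

```python
import inspect


def supported_embedding_types(forward_fn):
    """Check which optional inputs a model's forward signature accepts,
    and therefore which embedding types can be attributed."""
    params = inspect.signature(forward_fn).parameters
    return {
        "position_embeddings": "position_ids" in params,
        "token_type_embeddings": "token_type_ids" in params,
    }


# Hypothetical forward signatures mirroring BERT vs. DistilBERT
def bert_like_forward(input_ids, attention_mask=None,
                      token_type_ids=None, position_ids=None):
    ...


def distilbert_like_forward(input_ids, attention_mask=None):
    ...


# A BERT-style forward accepts both optional inputs;
# a DistilBERT-style forward accepts neither.
print(supported_embedding_types(bert_like_forward))
print(supported_embedding_types(distilbert_like_forward))
```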

Misc

  • Updated tests: many of the tests in the suite now exercise three different architectures (BERT, DistilBERT, and GPT-2). This helps iron out issues with the slight variations between these models.