Zero Shot Classification Explainer Improvements (#49)
- Changes the default behavior of the zero shot classification explainer: attributions are now calculated for every label by default, and the visualization displays the attributions for each label. This required a major reorganization of the former implementation; see the sketch below.
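A minimal sketch of the new default behavior, assuming the ZeroShotClassificationExplainer API from the project README; the model checkpoint, input text, and labels are illustrative.

```python
# Sketch: attributions are computed for every candidate label by default,
# and visualize() renders the attributions for each label.
# Model name, text, and labels below are illustrative examples.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import ZeroShotClassificationExplainer

model_name = "facebook/bart-large-mnli"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

zero_shot_explainer = ZeroShotClassificationExplainer(model, tokenizer)

# Attributions are now calculated for each label by default
word_attributions = zero_shot_explainer(
    "Today Apple released the new Macbook showing off a range of new features",
    labels=["finance", "technology", "sports"],
)

# The visualization displays the attributions for every label
zero_shot_explainer.visualize("zero_shot_attributions.html")
```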
Memory Optimizations (#54)
- Every explainer instance can now take an optional parameter internal_batch_size.
- This helps prevent OOM errors that occurred because all of the steps (50 by default) used to calculate the attributions were batched together.
- For large models such as Longformer, it is recommended to select very low values (1 or 2) for internal_batch_size (#51); see the Longformer sketch after the example below.
- This addition has been extremely helpful in stabilizing the performance of the Streamlit demo app, which was crashing frequently prior to this update. Lowering internal_batch_size should greatly reduce memory overhead in situations where more than a single batch of gradients would cause OOM.
Internal Batch Size Example
cls_explainer('A very short 100 character text here!', internal_batch_size=1)
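For a large model, the full setup might look like the following sketch. The checkpoint name and input text are illustrative, and the SequenceClassificationExplainer API is assumed from the project README.

```python
# Sketch: using a low internal_batch_size with a large model such as Longformer.
# The checkpoint name is illustrative; a fine-tuned classification model
# would normally be used here.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

model_name = "allenai/longformer-base-4096"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

cls_explainer = SequenceClassificationExplainer(model, tokenizer)

# internal_batch_size=1 keeps only one batch of gradients in memory at a time
word_attributions = cls_explainer(
    "A very long document that would otherwise trigger OOM errors",
    internal_batch_size=1,
)
```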
Custom Steps For Attribution (#54)
- Explainer instances can now also accept another optional parameter, n_steps. The default value for n_steps in Captum is 50.
- n_steps controls the number of steps used to calculate the approximate attributions from the baseline inputs to the true inputs.
- Higher values for n_steps should result in less noisy approximations than lower values, but longer calculation times.
- If n_steps is set to a particularly high value, it is highly recommended to also set internal_batch_size to a low value to prevent OOM issues; a combined sketch follows the example below.
N Steps Example
cls_explainer('A very short 100 character text here!', n_steps=100)
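Since both parameters are optional keyword arguments on the same call, they can be combined when a high n_steps is needed; the specific values below are illustrative.

```python
# Sketch: a higher n_steps for a smoother approximation, paired with a low
# internal_batch_size to keep memory usage bounded.
word_attributions = cls_explainer(
    "A very short 100 character text here!",
    n_steps=200,
    internal_batch_size=2,
)
```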