Commit 0e318ce

Add rescale and readme changes
gsarti committed Jun 27, 2024
1 parent 067351a commit 0e318ce
Showing 5 changed files with 23 additions and 4 deletions.
CHANGELOG.md (8 changes: 6 additions & 2 deletions)
@@ -9,15 +9,19 @@
## 🔧 Fixes and Refactoring

- Fix the issue in the attention implementation from [#268](https://github.com/inseq-team/inseq/issues/268) where non-terminal positions in the tensor were set to nan if they were zeros ([#269](https://github.com/inseq-team/inseq/pull/269)).

- Fix the pad token in cases where it is not specified by default in the loaded model (e.g. for Qwen models) ([#269](https://github.com/inseq-team/inseq/pull/269)).

- Fix bug reported in [#266](https://github.com/inseq-team/inseq/issues/266) making `value_zeroing` unusable for SDPA attention. This enables using the method on models using SDPA attention as default (e.g. `GemmaForCausalLM`) without passing `model_kwargs={'attn_implementation': 'eager'}` ([#267](https://github.com/inseq-team/inseq/pull/267)).

+ - Fix multi-device support and duplicate BOS for chat template models ([#280](https://github.com/inseq-team/inseq/pull/280)).

+ - Add a `rescale_attributions` option to Inseq CLI commands to enable `rescale=True` during aggregation ([#280](https://github.com/inseq-team/inseq/pull/280)).

## 📝 Documentation and Tutorials

*No changes*

## 💥 Breaking Changes

*No changes*
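As a practical illustration of the `value_zeroing` fix noted in the changelog above ([#267](https://github.com/inseq-team/inseq/pull/267)), here is a minimal Python sketch of the workaround that is no longer required. The `google/gemma-2b` checkpoint and the prompt are illustrative choices; only the `inseq.load_model` API and the `model_kwargs` workaround are taken from the changelog entry.

```python
import inseq

# Before the fix in #267, SDPA-default models (e.g. GemmaForCausalLM) had to be
# forced to eager attention for value_zeroing to work:
#   model = inseq.load_model(
#       "google/gemma-2b", "value_zeroing",
#       model_kwargs={"attn_implementation": "eager"},
#   )
# After the fix, the default SDPA attention implementation works directly:
model = inseq.load_model("google/gemma-2b", "value_zeroing")  # checkpoint is illustrative
out = model.attribute("The capital of France is")
out.show()
```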
README.md (7 changes: 5 additions & 2 deletions)
@@ -280,7 +280,7 @@ Our vision for Inseq is to create a centralized, comprehensive and robust set of

## Citing Inseq

- If you use Inseq in your research we suggest to include a mention to the specific release (e.g. v0.4.0) and we kindly ask you to cite our reference paper as:
+ If you use Inseq in your research we suggest to include a mention to the specific release (e.g. v0.6.0) and we kindly ask you to cite our reference paper as:

```bibtex
@inproceedings{sarti-etal-2023-inseq,
@@ -308,7 +308,7 @@ If you use Inseq in your research we suggest to include a mention to the specifi
Inseq has been used in various research projects. A list of known publications that use Inseq to conduct interpretability analyses of generative models is shown below.

> [!TIP]
- > Last update: May 2024. Please open a pull request to add your publication to the list.
+ > Last update: June 2024. Please open a pull request to add your publication to the list.
<details>
<summary><b>2023</b></summary>
@@ -331,6 +331,9 @@ Inseq has been used in various research projects. A list of known publications t
<li><a href="https://arxiv.org/abs/2402.00794">ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models</a> (Zhao et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2404.02421">Revisiting subword tokenization: A case study on affixal negation in large language models</a> (Truong et al., 2024)</li>
<li><a href="https://hal.science/hal-04581586">Exploring NMT Explainability for Translators Using NMT Visualising Tools</a> (Gonzalez-Saez et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2405.14899">DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning</a> (Zhou et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2406.06399">Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue</a> (Alghisi et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2406.13663">Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation</a> (Qi, Sarti et al., 2024)</li>
</ol>

</details>
inseq/commands/attribute/attribute.py (3 changes: 3 additions & 0 deletions)
@@ -11,12 +11,14 @@ def aggregate_attribution_scores(
    selectors: Optional[list[int]] = None,
    aggregators: Optional[list[str]] = None,
    normalize_attributions: bool = False,
+   rescale_attributions: bool = False,
) -> FeatureAttributionOutput:
    if selectors is not None and aggregators is not None:
        for select_idx, aggregator_fn in zip(selectors, aggregators):
            out = out.aggregate(
                aggregator=aggregator_fn,
                normalize=normalize_attributions,
+               rescale=rescale_attributions,
                select_idx=select_idx,
                do_post_aggregation_checks=False,
            )
@@ -79,6 +81,7 @@ def attribute(input_texts, generated_texts, args: AttributeExtendedArgs):
        selectors=args.attribution_selectors,
        aggregators=args.attribution_aggregators,
        normalize_attributions=args.normalize_attributions,
+       rescale_attributions=args.rescale_attributions,
    )
    print(f"Saving {'aggregated ' if args.aggregate_output else ''}attributions to {args.save_path}")
    out.save(args.save_path, overwrite=True)
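For reference, a minimal sketch of how the `rescale` flag threaded through `aggregate_attribution_scores` can be exercised from the Python API. The saved file name and the `"scores"` aggregator name are assumptions for illustration; the `normalize`/`rescale` keyword arguments mirror the call in the diff above.

```python
import inseq

# Load a previously saved attribution output (file name is illustrative).
out = inseq.FeatureAttributionOutput.load("attributions.json")

# Mirrors one selector/aggregator step of aggregate_attribution_scores:
aggregated = out.aggregate(
    aggregator="scores",  # assumption: the basic score-level aggregator name
    normalize=False,      # normalize and rescale are alternative scalings
    rescale=True,         # what the new rescale_attributions option enables
)
aggregated.show()
```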
inseq/commands/attribute/attribute_args.py (8 changes: 8 additions & 0 deletions)
@@ -61,6 +61,14 @@ class AttributeBaseArgs:
"for each context are normalized to sum up to 1, providing a relative notion of input salience."
),
)
rescale_attributions: bool = cli_arg(
default=False,
help=(
"Whether to rescale the attribution scores for each context. If ``True``, the attribution scores "
"for each context are rescaled to sum up to the number of tokens in the input, providing an absolute"
" notion of input salience."
),
)
model_kwargs: dict = cli_arg(
default_factory=dict,
help="Additional keyword arguments passed to the model constructor in JSON format.",
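The two help texts describe complementary scalings: normalization makes the scores sum to 1, while rescaling makes them sum to the number of input tokens, so a score of 1.0 marks average importance. A toy sketch of that arithmetic on plain tensors (not Inseq internals; the values are made up):

```python
import torch

# Raw attribution scores, one per input token (toy values).
scores = torch.tensor([1.0, 3.0, 0.5, 0.5])

# normalize_attributions: scores sum to 1 (relative salience).
normalized = scores / scores.sum()      # tensor([0.2000, 0.6000, 0.1000, 0.1000])

# rescale_attributions: scores sum to the number of tokens
# (absolute salience, where 1.0 corresponds to average importance).
rescaled = normalized * scores.numel()  # tensor([0.8000, 2.4000, 0.4000, 0.4000])

print(normalized.sum().item())  # ~1.0
print(rescaled.sum().item())    # ~4.0
```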
inseq/commands/attribute_context/attribute_context.py (1 change: 1 addition & 0 deletions)
@@ -211,6 +211,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
        selectors=args.attribution_selectors,
        aggregators=args.attribution_aggregators,
        normalize_attributions=args.normalize_attributions,
+       rescale_attributions=args.rescale_attributions,
    )[0]
    if args.show_intermediate_outputs:
        cci_attrib_out.show(do_aggregation=False)
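The same flag also reaches the context attribution entry point above. A heavily hedged sketch of a programmatic call: the import path and the signature of `attribute_context_with_model` come from this diff, while the `AttributeContextArgs` fields shown, their defaults, and the model and prompt choices are assumptions.

```python
import inseq

# Import path inferred from the file touched in this diff; treat as illustrative.
from inseq.commands.attribute_context.attribute_context import (
    AttributeContextArgs,
    attribute_context_with_model,
)

model = inseq.load_model("gpt2", "saliency")  # model and method are illustrative
args = AttributeContextArgs(
    model_name_or_path="gpt2",
    input_context_text="George was sick yesterday.",   # assumed field names
    input_current_text="His colleagues asked him how",
    rescale_attributions=True,  # the option introduced in this commit
)
out = attribute_context_with_model(args, model)
```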
