Commit e3145b6

Merge pull request #1 from torrvision/fazl-pub
Adding Fazl papers from 2023-2024
2 parents 15642bd + 57aadf7 commit e3145b6

7 files changed, 52 additions and 0 deletions
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+---
+title: "Measuring Value Alignment"
+year: 2023
+pdf_url: "https://arxiv.org/pdf/2312.15241"
+author_list: "Fazl Barez, Philip Torr"
+pub_in: arxiv
+---
@@ -0,0 +1,7 @@
+---
+title: "The Alan Turing Institute’s response to the House of Lords Large Language Models Call for Evidence"
+year: 2023
+pdf_url: "https://www.turing.ac.uk/news/publications/alan-turing-institutes-response-house-lords-large-language-models-call-evidence"
+author_list: "Fazl Barez, Philip H. S. Torr, Aleksandar Petrov, Carolyn Ashurst, Jennifer Ding, Ardi Janjeva, Alexander Babuta, Morgan Briggs, Jonathan Bright, Stephanie Cairns, Miranda Cross, David Leslie, Helen Margetts, Deborah Morgan, Jacob Pratt, Vincent Straub, Christopher Thomas, Sophie Arana, Christopher Burr, Cassandra Gould Van Praag, Kalle Westerling, Kirstie Whitaker, Arielle Bennett, Malvika Sharan, Bastian Greshake Tzovaras, Ashley Van De Casteele, Matt Fuller"
+pub_in: "The Alan Turing Institute"
+---
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+title: "Interpreting Learned Feedback Patterns in Large Language Models"
+year: 2023
+pdf_url: "https://openreview.net/pdf?id=xUoNgR1Byy"
+author_list: "Luke Marks, Amir Abdullah, Luna Mendez, Rauno Arike, David Krueger, Philip Torr, Fazl Barez"
+pub_in: "NeurIPS 2024"
+---
+
@@ -0,0 +1,9 @@
+---
+title: "PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning"
+year: 2024
+pdf_url: "https://arxiv.org/pdf/2410.08811"
+author_list: "Tingchen Fu, Mrinank Sharma, Philip Torr, Shay B Cohen, David Krueger, Fazl Barez"
+pub_in: "arxiv"
+---
+
+
@@ -0,0 +1,7 @@
+---
+title: "Sparse autoencoders reveal universal feature spaces across large language models"
+year: 2024
+pdf_url: "https://arxiv.org/pdf/2410.06981"
+author_list: "Michael Lan, Philip Torr, Austin Meek, Ashkan Khakzar, David Krueger, Fazl Barez"
+pub_in: arxiv
+---
@@ -0,0 +1,7 @@
+---
+title: "Towards Interpreting Visual Information Processing in Vision-Language Models"
+year: 2024
+pdf_url: "https://arxiv.org/pdf/2410.07149"
+author_list: "Clement Neo, Luke Ong, Philip Torr, Mor Geva, David Krueger, Fazl Barez"
+pub_in: "arxiv"
+---
@@ -0,0 +1,7 @@
+---
+title: "Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models"
+year: 2024
+pdf_url: "https://aclanthology.org/2024.emnlp-main.699.pdf"
+author_list: "Michael Lan, Philip Torr, Fazl Barez"
+pub_in: "EMNLP 2024"
+---
