|
24 | 24 |
|
25 | 25 | ## Chinese Version
|
26 | 26 |
|
27 |
| -To facilitate the reading of our (English-verison) survey, we also employ LLMs + some human checking to generate a [**Chinese version**](assets/LLM_Survey__Chinese_V1.pdf) for this survey. While, since it is mainly generated by LLMs, please don't forward or post its content on the Web. |
| 27 | +To facilitate the reading of our (English-version) survey, we have also translated this survey into a [**Chinese version**](assets/LLM_Survey_Chinese.pdf). We will continue to update the Chinese version. |
28 | 28 |
|
29 | 29 |
|
30 | 30 |
|
@@ -663,6 +663,7 @@ Please click [here](Experiments/README.md) to view more detailed information.
|
663 | 663 | **Attention**
|
664 | 664 | 1. <u>Multi-query attention</u>: **"Fast Transformer Decoding: One Write-Head is All You Need"**. *Noam Shazeer*. arXiv 2019. [[paper](https://arxiv.org/abs/1911.02150)]
|
665 | 665 | 1. <u>FlashAttention</u>: **"FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"**. *Tri Dao et al*. NeurIPS 2022. [[paper](https://arxiv.org/abs/2205.14135)]
|
| 666 | +1. <u>PagedAttention</u>: **"vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention"**. *Woosuk Kwon et al*. 2023. Paper (stay tuned). [[Official Website](https://vllm.ai/)] |
666 | 667 |
|
667 | 668 | ##### Analysis
|
668 | 669 |
|
@@ -749,7 +750,7 @@ Please click [here](Experiments/README.md) to view more detailed information.
|
749 | 750 | 1. **"Scaling Laws for Reward Model Overoptimization"**. *Leo Gao et al*. arXiv 2022. [[Paper](https://arxiv.org/abs/2210.10760)]
|
750 | 751 | 1. **"The Wisdom of Hindsight Makes Language Models Better Instruction Followers"**. *Tianjun Zhang et al*. arXiv 2023. [[Paper](https://arxiv.org/abs/2302.05206)]
|
751 | 752 | 1. **"RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment"**. *Hanze Dong et al*. arXiv 2023. [[Paper](https://arxiv.org/abs/2304.06767)]
|
752 |
| - |
| 753 | +1. **"Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment"**. *Rishabh Bhardwaj et al*. arXiv 2023. [[Paper](https://arxiv.org/abs/2308.09662)] |
753 | 754 |
|
754 | 755 | #### Parameter-Efficient Model Adaptation
|
755 | 756 | 1. **"Parameter-Efficient Transfer Learning for NLP"**. *Neil Houlsby et al*. ICML 2019. [[Paper](https://arxiv.org/abs/1902.00751)] [[GitHub](https://github.com/google-research/adapter-bert)]
|
@@ -953,6 +954,10 @@ Please click [here](Experiments/README.md) to view more detailed information.
|
953 | 954 | 81. **"Yes but.. Can ChatGPT Identify Entities in Historical Documents?"**. *Carlos-Emiliano González-Gallardo et al.* arXiv 2023. [[Paper](https://arxiv.org/abs/2303.17322v1)]
|
954 | 955 | 82. **"Uncovering ChatGPT's Capabilities in Recommender Systems"**. *Sunhao Dai et al.* arXiv 2023. [[Paper](https://arxiv.org/abs/2305.02182)]
|
955 | 956 | 83. **"Editing Large Language Models: Problems, Methods, and Opportunities"**. *Yunzhi Yao et al.* arXiv 2023. [[Paper](https://arxiv.org/abs/2305.13172)]
|
| 957 | +84. **"Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity"**. *Terry Yue Zhuo et al.* arXiv 2023. [[Paper](https://arxiv.org/abs/2301.12867)] |
| 958 | +85. **"On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex"**. *Terry Yue Zhuo et al.* EACL 2023. [[Paper](https://arxiv.org/abs/2301.12868)] |
| 959 | +86. **"A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets"**. *Laskar et al.* ACL 2023. [[Paper](https://arxiv.org/abs/2305.18486)] |
| 960 | +87. **"Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment"**. *Rishabh Bhardwaj et al*. arXiv 2023. [[Paper](https://arxiv.org/abs/2308.09662)] |
956 | 961 |
|
957 | 962 | ### The Team
|
958 | 963 |
|
|