Dev (#239)
MING-ZCH authored May 8, 2024
2 parents 6c870d3 + 91ec1b5 commit ad3a1ce
Showing 12 changed files with 34 additions and 47 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -172,12 +172,12 @@
- [🔗Framework diagram](#框架图)
- [Contents](#目录)
- [Pre-development configuration requirements](#开发前的配置要求)
- [**User guide**](#使用指南)
- [User guide](#使用指南)
- [🍪Quick start](#快速体验)
- [📌Data construction](#数据构建)
- [🎨Fine-tuning guide](#微调指南)
- [🔧Deployment guide](#部署指南)
- [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag检索增强生成pipeline)
- [⚙RAG (Retrieval Augmented Generation)](#rag检索增强生成)
- [Frameworks used](#使用到的框架)
- [How to participate in this project](#如何参与本项目)
- [Authors (in no particular order)](#作者排名不分先后)
@@ -192,7 +192,7 @@

- Hardware: A100 40G (only for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization)

###### **User guide**
###### User guide

1. Clone the repo

@@ -211,7 +211,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git

### 🍪Quick start

- See [Quick start](docs/quick_start.md)
- See [Quick start](quick_start/quick_start.md)
- Hands-on quick start: [Baby EmoLLM](quick_start/Baby_EmoLLM.ipynb)


### 📌Data construction
@@ -229,9 +230,9 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git
- Demo deployment: see the [deployment guide](demo/README.md)
- Quantized deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy.md)

### ⚙RAG (Retrieval Augmented Generation) Pipeline
### ⚙RAG (Retrieval Augmented Generation)

- See [RAG](./rag/)
- See [RAG](rag/README.md)

<details>
<summary>More details</summary>
@@ -307,11 +308,10 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git

### Acknowledgments

- [Sanbu](https://github.com/sanbuphy)
- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/)
- [闻星 (assistant)](https://github.com/vansin)
- [扫地升 (WeChat official account promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A)
- [闻星 (浦语 assistant)](https://github.com/vansin)
- 阿布 (M.A. in Psychology, Peking University)
- [Sanbu](https://github.com/sanbuphy)
- [HatBoy](https://github.com/hatboy)

<!-- links -->
20 changes: 10 additions & 10 deletions README_EN.md
@@ -173,12 +173,12 @@ The Model aims to fully understand and promote the mental health of individuals,
- [Roadmap](#roadmap)
- [Contents](#contents)
- [Pre-development Configuration Requirements.](#pre-development-configuration-requirements)
- [**User Guide**](#user-guide)
- [User Guide](#user-guide)
- [🍪Quick start](#quick-start)
- [📌Data Construction](#data-construction)
- [🎨Fine-tuning Guide](#fine-tuning-guide)
- [🔧Deployment Guide](#deployment-guide)
- [⚙RAG (Retrieval Augmented Generation) Pipeline](#rag-retrieval-augmented-generation-pipeline)
- [⚙RAG (Retrieval Augmented Generation)](#rag-retrieval-augmented-generation)
- [Frameworks Used](#frameworks-used)
- [How to participate in this project](#how-to-participate-in-this-project)
- [Version control](#version-control)
@@ -193,7 +193,7 @@ The Model aims to fully understand and promote the mental health of individuals,

- A100 40G (specifically for InternLM2_7B_chat + qlora fine-tuning + deepspeed zero2 optimization)

###### **User Guide**
###### User Guide

1. Clone the repo

@@ -211,7 +211,8 @@ git clone https://github.com/SmartFlowAI/EmoLLM.git


### 🍪Quick start
- Please read [Quick Start](docs/quick_start_EN.md) to see.
- Please read [Quick Start](quick_start/quick_start_EN.md) to see.
- Quick coding: [Baby EmoLLM](quick_start/Baby_EmoLLM.ipynb)

### 📌Data Construction

@@ -228,9 +229,9 @@ For details, see the [fine-tuning guide](xtuner_config/README_EN.md)
- Demo deployment: see [deployment guide](./demo/README_EN.md) for details.
- Quantitative deployment based on [LMDeploy](https://github.com/InternLM/lmdeploy/): see [deploy](./deploy/lmdeploy_EN.md)

### ⚙RAG (Retrieval Augmented Generation) Pipeline
### ⚙RAG (Retrieval Augmented Generation)

- See [RAG](./rag/)
- See [RAG](rag/README_EN.md)

<details>
<summary>Additional Details</summary>
@@ -297,11 +298,10 @@ The project is licensed under the MIT License. Please refer to the details

### Acknowledgments

- [Sanbu](https://github.com/sanbuphy)
- [Shanghai Artificial Intelligence Laboratory](https://www.shlab.org.cn/)
- [Vanin](https://github.com/vansin)
- [Bloom up (WeChat Official Account Promotion)](https://mp.weixin.qq.com/s/78lrRl2tlXEKUfElnkVx4A)
- Abu (M.A. in Psychology, Peking University)
- [Vansin](https://github.com/vansin)
- A.bu (M.A. in Psychology, Peking University)
- [Sanbuphy](https://github.com/sanbuphy)
- [HatBoy](https://github.com/hatboy)

<!-- links -->
21 changes: 0 additions & 21 deletions datasets/LICENSE

This file was deleted.

2 changes: 1 addition & 1 deletion datasets/README.md
@@ -2,7 +2,7 @@

* Datasets fall into two categories by use: **General** and **Role-play**
* Data comes in two formats: **QA** and **Conversation**
* Summary: General (**6 datasets**); Role-play (**5 datasets**)
* Summary: General (**8 datasets**); Role-play (**5 datasets**)

## Dataset categories

2 changes: 1 addition & 1 deletion datasets/README_EN.md
@@ -2,7 +2,7 @@

* Category of dataset: **General** and **Role-play**
* Type of data: **QA** and **Conversation**
* Summary: General(**6 datasets**), Role-play(**5 datasets**)
* Summary: General(**8 datasets**), Role-play(**5 datasets**)

## Category
* **General**: generic dataset, including psychological Knowledge, counseling technology, etc.
16 changes: 12 additions & 4 deletions datasets/processed/Book_QA_Process.md
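@@ -1,8 +1,15 @@
## There are two .py files in total: Book_QA_process_Step_1.py and Book_QA_process_Step_2.py
# Book_QA_process

There are two Python files: Book_QA_process_Step_1.py and Book_QA_process_Step_2.py

### Book_QA_process_Step_1.py
This script converts the QA-pair data we generated from JSONL to JSON format

* This script converts the QA-pair data we generated from JSONL to JSON format

### Book_QA_process_Step_2.py
This script converts the JSON data produced in step 1 into a format usable for instruction fine-tuning and adds a `system` field, i.e.:
* This script converts the JSON data produced in step 1 into a format usable for instruction fine-tuning and adds a `system` field, i.e.: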

```json
{
"conversation": [
{
@@ -11,4 +18,5 @@
"output": "Answer"
}
]
}
}
```
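
The two steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the repository's actual scripts: the function names `jsonl_to_records` and `to_conversations`, the source keys `question` and `answer`, and the `input` key (hidden in the collapsed part of the diff) are hypothetical. Only `conversation`, `system`, and `output` appear in the text above.

```python
import json


def jsonl_to_records(lines):
    """Step 1 (sketch): parse JSONL lines into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def to_conversations(records, system_prompt):
    """Step 2 (sketch): wrap each QA pair in a single-turn conversation entry.

    The "question"/"answer" source keys and the "input" target key are
    assumptions made for illustration.
    """
    return [
        {
            "conversation": [
                {
                    "system": system_prompt,
                    "input": record["question"],
                    "output": record["answer"],
                }
            ]
        }
        for record in records
    ]


if __name__ == "__main__":
    sample_lines = [
        '{"question": "Question", "answer": "Answer"}',
        "",
    ]
    records = jsonl_to_records(sample_lines)
    converted = to_conversations(records, "You are a helpful assistant.")
    # ensure_ascii=False keeps non-ASCII text readable in the output
    print(json.dumps(converted, ensure_ascii=False, indent=2))
```

When writing the result to disk, passing `ensure_ascii=False` to `json.dump` keeps Chinese text human-readable instead of escaping it.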
@@ -2,7 +2,7 @@

# Open the JSON file and read its contents

file_name = 'ruozhiba_raw.jsonl'
file_name = '../ruozhiba_raw.jsonl'

# with open(f'data/{file_name}', 'r', encoding='utf-8') as file:
# data = json.load(file)
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
