Commit c5e5ab8

Merged upstream fixes
2 parents 43b2f41 + 3c1ff4f commit c5e5ab8

10 files changed, +154 -100 lines changed

.github/workflows/push_format.yml

+2 -1
@@ -40,7 +40,8 @@ jobs:
 - name: Create Pull Request
   if: steps.commitback.outcome == 'success'
   continue-on-error: true
-  uses: peter-evans/create-pull-request@v4
+  uses: peter-evans/create-pull-request@v5
   with:
     body: Apply Code Formatter Change
+    title: Apply Code Formatter Change
     commit-message: Automatic code format

Changelog_KO.md

+23 -9
@@ -1,3 +1,26 @@
+### Update of June 18, 2023
+
+- Added new 32k and 48k pretrained models for v2.
+- Fixed inference errors in non-f0 models.
+- When the training set exceeds 1 hour, the index-building step uses minibatch-kmeans to speed up training.
+- vocal2guitar is available on [huggingface](https://huggingface.co/spaces/lj1995/vocal2guitar).
+- Outliers are removed automatically during data processing.
+- Added an ONNX export option tab.
+
+Things we tried that did not make it into this update:
+
+- Feature retrieval with an added temporal dimension: no meaningful effect.
+- Feature retrieval with added PCA dimensionality reduction: no meaningful effect.
+- ONNX inference support failed, because nsf generation requires PyTorch.
+- Random augmentation of the input during training (pitch, gender, EQ, noise, etc.): no meaningful effect.
+
+Upcoming updates:
+
+- Integrate Vocos-RVC (a small vocoder).
+- Support Crepe for pitch recognition during training.
+- Support syncing Crepe precision with REC-config.
+- Support an F0 editor.
+
 ### Update of May 28, 2023
 
 - Added a v2 jupyter notebook, added a Korean changelog, and fixed some dependency modules.
@@ -8,15 +31,6 @@
 - For batch voice conversion and UVR5 vocal separation, users can now manually choose the export format of the output audio.
 - Dropped support for training 32k models.
 
-Upcoming updates:
-
-- Feature retrieval: add temporal feature retrieval.
-- Feature retrieval: add a pre-kmeans option.
-- Feature retrieval: add PCAR dimensionality reduction.
-- Add onnx inference support.
-- Random data augmentation during training: pitch, gender, eq, noise.
-- Add pretrained models for v2.
-
 ### Update of May 13, 2023
 
 - Removed redundant code (infer_pack and uvr5_pack) from the old runtime in the one-click package.
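
Note on the changelog entry about minibatch-kmeans: the index-building code is not part of this diff, so the following is only a minimal pure-Python sketch of the general mini-batch k-means idea (update centers from small random batches instead of the full set on every pass). The function names, the parameters, and the choice to seed the centers with the first `k` points are all assumptions made for illustration, not the repository's implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def minibatch_kmeans(points, k, batch_size=32, n_iter=50, seed=0):
    """Hypothetical helper: cluster `points` into `k` centers.

    Each iteration draws a small random batch, assigns every batch point
    to its nearest center, then moves that center toward the point with a
    per-center learning rate of 1 / (points seen by that center so far).
    """
    rng = random.Random(seed)
    # Assumption for simplicity: the first k points seed the centers.
    centers = [list(p) for p in points[:k]]
    counts = [0] * k
    for _ in range(n_iter):
        batch = [rng.choice(points) for _ in range(batch_size)]
        # Assign the whole batch using the centers as they were
        # at the start of this iteration.
        assignments = [nearest_center(p, centers) for p in batch]
        for point, idx in zip(batch, assignments):
            counts[idx] += 1
            eta = 1.0 / counts[idx]
            centers[idx] = [(1.0 - eta) * c + eta * x
                            for c, x in zip(centers[idx], point)]
    return centers
```

Because each iteration touches only `batch_size` points, the cost per update does not grow with the size of the training set, which is the usual motivation for this variant on large feature sets.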

docs/README.en.md

+6 -4
@@ -20,17 +20,19 @@ An easy-to-use Voice Conversion framework based on VITS.<br><br>
 
 [**English**](./README.en.md) | [**中文简体**](../README.md) | [**日本語**](./README.ja.md) | [**한국어**](./README.ko.md) ([**韓國語**](./README.ko.han.md))
 
-:fire: A online demo using RVC that convert Vocal to Acoustic Guitar audio:fire:https://huggingface.co/spaces/lj1995/vocal2guitar
 
-:fire: Vocal2Guitar demo video:fire:https://www.bilibili.com/video/BV19W4y1D7tT/
+Check our [Demo Video](https://www.bilibili.com/video/BV1pm4y1z7Gm/) here!
 
-> Check our [Demo Video](https://www.bilibili.com/video/BV1pm4y1z7Gm/) here!
+Realtime Voice Conversion Software using RVC : [w-okada/voice-changer](https://github.com/w-okada/voice-changer)
 
-> Realtime Voice Conversion Software using RVC : [w-okada/voice-changer](https://github.com/w-okada/voice-changer)
+> A online demo using RVC that convert Vocal to Acoustic Guitar audio:https://huggingface.co/spaces/lj1995/vocal2guitar
+
+> Vocal2Guitar demo video:https://www.bilibili.com/video/BV19W4y1D7tT/
 
 > The dataset for the pre-training model uses nearly 50 hours of high quality VCTK open source dataset.
 
 > High quality licensed song datasets will be added to training-set one after another for your use, without worrying about copyright infringement.
+
 ## Summary
 This repository has the following features:
 + Reduce tone leakage by replacing source feature to training-set feature using top1 retrieval;

i18n/en_US.json

+1 -13
@@ -122,17 +122,5 @@
 "开始音频转换": "Start audio conversion",
 "停止音频转换": "Stop audio conversion",
 "推理时间(ms):": "Inference time (ms):",
-"人声伴奏分离批量处理, 使用UVR5模型。 <br>合格的文件夹路径格式举例: E:\\codes\\py39\\vits_vc_gpu\\白鹭霜华测试样例(去文件管理器地址栏拷就行了)。 <br>模型分为三类: <br>1、保留人声:不带和声的音频选这个,对主人声保留比HP5更好。内置HP2和HP3两个模型,HP3可能轻微漏伴奏但对主人声保留比HP2稍微好一丁点; <br>2、仅保留主人声:带和声的音频选这个,对主人声可能有削弱。内置HP5一个模型; <br> 3、去混响、去延迟模型(by FoxJoy):<br>  (1)MDX-Net(onnx_dereverb):对于双通道混响是最好的选择,不能去除单通道混响;<br>&emsp;(234)DeEcho:去除延迟效果。Aggressive比Normal去除得更彻底,DeReverb额外去除混响,可去除单声道混响,但是对高频重的板式混响去不干净。<br>去混响/去延迟,附:<br>1、DeEcho-DeReverb模型的耗时是另外2个DeEcho模型的接近2倍;<br>2、MDX-Net-Dereverb模型挺慢的;<br>3、个人推荐的最干净的配置是先MDX-Net再DeEcho-Aggressive。":"Batch processing for vocal accompaniment separation using the UVR5 model.<br>Example of a valid folder path format: D:\\path\\to\\input\\folder (copy it from the file manager address bar).<br>The model is divided into three categories:<br>1. Preserve vocals: Choose this option for audio without harmonies. It preserves vocals better than HP5. It includes two built-in models: HP2 and HP3. HP3 may slightly leak accompaniment but preserves vocals slightly better than HP2.<br>2. Preserve main vocals only: Choose this option for audio with harmonies. It may weaken the main vocals. It includes one built-in model: HP5.<br>3. De-reverb and de-delay models (by FoxJoy):<br>  (1) MDX-Net: The best choice for stereo reverb removal but cannot remove mono reverb;<br>&emsp;(234) DeEcho: Removes delay effects. Aggressive mode removes more thoroughly than Normal mode. DeReverb additionally removes reverb and can remove mono reverb, but not very effectively for heavily reverberated high-frequency content.<br>De-reverb/de-delay notes:<br>1. The processing time for the DeEcho-DeReverb model is approximately twice as long as the other two DeEcho models.<br>2. The MDX-Net-Dereverb model is quite slow.<br>3. The recommended cleanest configuration is to apply MDX-Net first and then DeEcho-Aggressive.",
-"人声伴奏分离批量处理, 使用UVR5模型。": "Batch processing for vocal accompaniment separation using the UVR5 model.",
-"合格的文件夹路径格式举例: E:\\codes\\py39\\vits_vc_gpu\\白鹭霜华测试样例(去文件管理器地址栏拷就行了)。": "Example of a valid folder path format: D:\\path\\to\\input\\folder (copy it from the file manager address bar).",
-"模型分为三类:": "The model is divided into three categories:",
-"1、保留人声:不带和声的音频选这个,对主人声保留比HP5更好。内置HP2和HP3两个模型,HP3可能轻微漏伴奏但对主人声保留比HP2稍微好一丁点;": "1. Preserve vocals: Choose this option for audio without harmonies. It preserves vocals better than HP5. It includes two built-in models: HP2 and HP3. HP3 may slightly leak accompaniment but preserves vocals slightly better than HP2.",
-"2、仅保留主人声:带和声的音频选这个,对主人声可能有削弱。内置HP5一个模型;": "2. Preserve main vocals only: Choose this option for audio with harmonies. It may weaken the main vocals. It includes one built-in model: HP5.",
-"3、去混响、去延迟模型(by FoxJoy):": "3. De-reverb and de-delay models (by FoxJoy):",
-"(1)MDX-Net:对于双通道混响是最好的选择,不能去除单通道混响;": "(1) MDX-Net: The best choice for stereo reverb removal but cannot remove mono reverb;",
-"(234)DeEcho:去除延迟效果。Aggressive比Normal去除得更彻底,DeReverb额外去除混响,可去除单声道混响,但是对高频重的板式混响去不干净。": "(234) DeEcho: Removes delay effects. Aggressive mode removes more thoroughly than Normal mode. DeReverb additionally removes reverb and can remove mono reverb, but not very effectively for heavily reverberated high-frequency content.",
-"去混响/去延迟,附:": "De-reverb/de-delay notes:",
-"1、DeEcho-DeReverb模型的耗时是另外2个DeEcho模型的接近2倍;": "1. The processing time for the DeEcho-DeReverb model is approximately twice as long as the other two DeEcho models.",
-"2、MDX-Net-Dereverb模型挺慢的;": "2. The MDX-Net-Dereverb model is quite slow.",
-"3、个人推荐的最干净的配置是先MDX-Net再DeEcho-Aggressive。": "3. The recommended cleanest configuration is to apply MDX-Net first and then DeEcho-Aggressive."
+"人声伴奏分离批量处理, 使用UVR5模型。 <br>合格的文件夹路径格式举例: E:\\codes\\py39\\vits_vc_gpu\\白鹭霜华测试样例(去文件管理器地址栏拷就行了)。 <br>模型分为三类: <br>1、保留人声:不带和声的音频选这个,对主人声保留比HP5更好。内置HP2和HP3两个模型,HP3可能轻微漏伴奏但对主人声保留比HP2稍微好一丁点; <br>2、仅保留主人声:带和声的音频选这个,对主人声可能有削弱。内置HP5一个模型; <br> 3、去混响、去延迟模型(by FoxJoy):<br>  (1)MDX-Net(onnx_dereverb):对于双通道混响是最好的选择,不能去除单通道混响;<br>&emsp;(234)DeEcho:去除延迟效果。Aggressive比Normal去除得更彻底,DeReverb额外去除混响,可去除单声道混响,但是对高频重的板式混响去不干净。<br>去混响/去延迟,附:<br>1、DeEcho-DeReverb模型的耗时是另外2个DeEcho模型的接近2倍;<br>2、MDX-Net-Dereverb模型挺慢的;<br>3、个人推荐的最干净的配置是先MDX-Net再DeEcho-Aggressive。":"Batch processing for vocal accompaniment separation using the UVR5 model.<br>Example of a valid folder path format: D:\\path\\to\\input\\folder (copy it from the file manager address bar).<br>The model is divided into three categories:<br>1. Preserve vocals: Choose this option for audio without harmonies. It preserves vocals better than HP5. It includes two built-in models: HP2 and HP3. HP3 may slightly leak accompaniment but preserves vocals slightly better than HP2.<br>2. Preserve main vocals only: Choose this option for audio with harmonies. It may weaken the main vocals. It includes one built-in model: HP5.<br>3. De-reverb and de-delay models (by FoxJoy):<br>  (1) MDX-Net: The best choice for stereo reverb removal but cannot remove mono reverb;<br>&emsp;(234) DeEcho: Removes delay effects. Aggressive mode removes more thoroughly than Normal mode. DeReverb additionally removes reverb and can remove mono reverb, but not very effectively for heavily reverberated high-frequency content.<br>De-reverb/de-delay notes:<br>1. The processing time for the DeEcho-DeReverb model is approximately twice as long as the other two DeEcho models.<br>2. The MDX-Net-Dereverb model is quite slow.<br>3. The recommended cleanest configuration is to apply MDX-Net first and then DeEcho-Aggressive."
 }
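
Note on this hunk: it keeps only the single combined help-text key and drops the trailing comma on what is now the last entry. The lookup code is not part of this diff, so the sketch below only illustrates two general points under stated assumptions: a JSON object rejects a trailing comma after its last member, and a key-based lookup that falls back to the source string (assumed behaviour here) only matches when the whole key is identical. The helper names `is_valid_json` and `translate` and the sample map are hypothetical.

```python
import json


def is_valid_json(text):
    """Return True when `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def translate(language_map, key):
    """Hypothetical lookup: return the translation for `key`,
    or `key` itself when no exact match exists (assumed fallback)."""
    return language_map.get(key, key)


# Small sample map in the same shape as the i18n file:
# source string -> translated string.
sample = (
    '{"开始音频转换": "Start audio conversion", '
    '"停止音频转换": "Stop audio conversion"}'
)
language_map = json.loads(sample)
```

With a lookup of this kind, a UI that requests the full multi-sentence help string as one key would never hit per-sentence entries, which would make the removed split entries dead weight.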
