Audio Demos for "HiFi-SVC: Fast High Fidelity Cross-Domain Singing Voice Conversion"
Abstract
This paper presents HiFi-SVC, a small cross-domain singing voice conversion model that generates high-fidelity singing voices at 22.05 kHz.
Effects of Using Pitch Adjustment
| Reference sample |
| Source | w/o Pitch Adjustment | w/ Pitch Adjustment |
Any-to-One Cross-domain (A2O-CD) singing voice conversion
Target speech reference samples from LJ-Speech.
| Reference sample |
| Source | FastSVC | HiFi-SVC |
Any-to-Many Cross-domain (A2M-CD) singing voice conversion
| Source | References (VCTK) | FastSVC | HiFi-SVC |
Any-to-Many In-domain (A2M-ID) singing voice conversion
Female source singer
| Source sample from ADIZ (NUS-48E) |
| Reference | FastSVC | HiFi-SVC |
Male source singer
| Source sample from VKOW (NUS-48E) |
| Reference | FastSVC | HiFi-SVC |
Cross-lingual (CL) singing voice conversion
Female source singer
| Chinese source sample |
| Reference | FastSVC | HiFi-SVC |
Male source singer
| Chinese source sample |
| Reference | FastSVC | HiFi-SVC |