Audio Demos for "HiFi-SVC: Fast High Fidelity Cross-Domain Singing Voice Conversion"

Abstract

This paper presents HiFi-SVC, a small cross-domain singing voice conversion model for generating high fidelity 22.05 kHz singing voices.

Effects of Using Pitch Adjustment

Reference sample  
Source w/o Pitch Adjustment w/ Pitch Adjustment

Any-to-One Cross-domain (A2O-CD) singing voice conversion

Target speech reference samples from LJ-Speech.

Reference sample
Source FastSVC HiFi-SVC

Any-to-Many Cross-domain (A2M-CD) singing voice conversion

Source References (VCTK) FastSVC HiFi-SVC

Any-to-Many In-domain (A2M-ID) singing voice conversion

Female source singer

Source sample from ADIZ (NUS-48E)  
reference FastSVC HiFi-SVC

Male source singer

Source sample from VKOW (NUS-48E)  
reference FastSVC HiFi-SVC

Cross-lingual (CL) singing voice conversion

Female source singer

Chinese source sample  
reference FastSVC HiFi-SVC

Male source singer

Chinese source sample  
reference FastSVC HiFi-SVC