Audio Demos for "HiFi-SVC: Fast High Fidelity Cross-Domain Singing Voice Conversion"

Abstract

This paper presents HiFi-SVC, a small cross-domain singing voice conversion model for generating high-fidelity 22.05 kHz singing voices.

Effects of Using Pitch Adjustment

Reference sample
Source	w/o Pitch Adjustment	w/ Pitch Adjustment

Any-to-One Cross-domain (A2O-CD) singing voice conversion

Target speech reference samples from LJ-Speech.

Reference sample
Source	FastSVC	HiFi-SVC

Any-to-Many Cross-domain (A2M-CD) singing voice conversion

Source	References (VCTK)	FastSVC	HiFi-SVC

Any-to-Many In-domain (A2M-ID) singing voice conversion

Female source singer

Source sample from ADIZ (NUS-48E)
Reference	FastSVC	HiFi-SVC

Male source singer

Source sample from VKOW (NUS-48E)
Reference	FastSVC	HiFi-SVC

Cross-lingual (CL) singing voice conversion

Female source singer

Chinese source sample
Reference	FastSVC	HiFi-SVC

Male source singer

Chinese source sample
Reference	FastSVC	HiFi-SVC