Demos for "CycleGAN with Dual Adversarial Loss for Bone-Conducted Speech Enhancement"

 
Authors: Qing Pan, Teng Gao, Jian Zhou, Huabin Wang, Liang Tao, and Hon Keung Kwan
 
Abstract: Compared with air-conducted speech, bone-conducted speech has the unique advantage of shielding against background noise. Enhancing bone-conducted speech helps to improve its quality and intelligibility. In this paper, a novel CycleGAN with dual adversarial loss (CycleGAN-DAL) is proposed for bone-conducted speech enhancement. The proposed method uses the adversarial loss and the cycle-consistent loss simultaneously to learn the forward and cyclic mappings, where the adversarial loss is replaced with a classification adversarial loss and a defect adversarial loss to consolidate the forward mapping. Compared with conventional baseline methods, it can learn the feature mapping between bone-conducted speech and target speech without additional air-conducted speech assistance. Moreover, the proposed method avoids the over-smoothing problem that commonly occurs in conventional statistical-based models. Experimental results show that the proposed method outperforms baseline methods such as CycleGAN, GMM, and BLSTM.

 


Ground truth target samples

Speakers: female, male, 01, 02, 03, 04, 05, 06

Samples per speaker: source, target

Speakers "female" and "male" belong to the AEUCHSAC&BC-2017 corpus. The paper is available here.

Speakers "01", "02", "03", "04", "05", and "06" are taken from the paper here.

 

Comparison of the proposed method with baseline methods

Speakers: female, male, 01, 02, 03, 04, 05, 06

Samples per speaker: GMM_w, GMM_wo, BLSTM, NMC, PMC, PMCD

"NMC" denotes nonparallel CycleGAN, "PMC" denotes parallel CycleGAN, and "PMCD" denotes parallel CycleGAN with dual adversarial loss.