Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Sharon Gannot and Radu Horaud
We got a paper accepted at IEEE ICASSP’17: An EM algorithm for joint source separation and diarisation of multichannel convolutive speech mixtures.
Abstract: We present a statistical model for joint source separation and diarization of multichannel convolutive speech mixtures. We build upon the local Gaussian model (LGM) framework with non-negative matrix factorization (NMF). The diarization is introduced as a temporal labeling of each source in the mixture as active or inactive at the short-term frame level. We devise an EM algorithm in which the source separation process is aided by the diarization state, since the latter indicates the sources actually present in the mixture. The diarization state is tracked with a Hidden Markov Model (HMM) with emission probabilities calculated from the estimated source signals. The proposed EM algorithm achieves separation performance comparable to a state-of-the-art LGM NMF method, while outperforming a state-of-the-art speaker diarization pipeline.
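To make the HMM part of the abstract concrete, here is a minimal sketch of tracking a frame-level diarization state with a discrete HMM. It is not the implementation from the paper. It assumes that the diarization state is an on/off pattern over all sources (so 2^J states for J sources), that the transition matrix is a simple "sticky" matrix, and that the emission log-likelihoods are toy numbers; in the paper they are computed from the estimated source signals inside the EM algorithm. The function names `diarization_states` and `forward_backward` are hypothetical.

```python
import itertools
import numpy as np

def diarization_states(num_sources):
    """All 2**num_sources on/off activity patterns, one row per state (assumed state space)."""
    return np.array(list(itertools.product([0, 1], repeat=num_sources)))

def forward_backward(log_emission, transition, initial):
    """Per-frame posterior state probabilities for a discrete HMM.

    log_emission: (T, K) log-likelihood of frame t under state k
    transition:   (K, K) row-stochastic, transition[i, j] = p(next = j | current = i)
    initial:      (K,) prior over the state of the first frame
    """
    T, K = log_emission.shape
    # Subtract the per-frame maximum before exponentiating for numerical stability.
    emission = np.exp(log_emission - log_emission.max(axis=1, keepdims=True))
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    # Forward pass, renormalised at every frame.
    alpha[0] = initial * emission[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ transition) * emission[t]
        alpha[t] /= alpha[t].sum()
    # Backward pass, renormalised at every frame.
    for t in range(T - 2, -1, -1):
        beta[t] = transition @ (emission[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Toy example with two sources; all numbers below are illustrative assumptions.
states = diarization_states(2)            # rows: [0,0], [0,1], [1,0], [1,1]
K = len(states)
transition = np.full((K, K), 0.1 / (K - 1))
np.fill_diagonal(transition, 0.9)         # sticky: states tend to persist
initial = np.full(K, 1.0 / K)
log_emission = np.full((10, K), -5.0)
log_emission[:5, 2] = 0.0                 # first 5 frames favour "only source 1 active"
log_emission[5:, 1] = 0.0                 # last 5 frames favour "only source 2 active"

posterior = forward_backward(log_emission, transition, initial)
active = states[posterior.argmax(axis=1)] # (T, 2) on/off decision per source and frame
```

Each row of `active` says which sources are judged present in that frame. In a joint scheme like the one described in the abstract, such posteriors would feed back into the separation step, but that coupling is not shown here.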