ICCVW 2017 paper on Exploiting the Complementarity of Audio-Visual Data for Probabilistic Multi-Speaker Tracking



Yutong Ban, Laurent Girin, Xavier Alameda-Pineda and Radu Horaud


We’ve had a paper accepted at the ICCV 2017 Workshop on Computer Vision for Audio-Visual Media: “Exploiting the Complementarity of Audio-Visual Data for Probabilistic Multi-Speaker Tracking” [1].

Abstract: Multi-speaker tracking is a central problem in human-robot interaction. In this context, exploiting auditory and visual information is gratifying and challenging at the same time. Gratifying because the complementary nature of auditory and visual information allows us to be more robust against noise and outliers than uni-modal approaches. Challenging because how to properly fuse auditory and visual information for multi-speaker tracking is far from being a solved question. In this paper we propose a probabilistic generative model that tracks multiple speakers by jointly exploiting auditory and visual features in their natural representation spaces. Importantly, the method is robust to missing data and it is thus able to track when only one of the modalities is present. Quantitative and qualitative results on the AVDIAR dataset are reported.
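To give a feel for why a probabilistic formulation copes naturally with a missing modality, here is a toy sketch. It is not the model from the paper: it tracks a single 1-D position with a Gaussian belief, and every name and noise value in it (`fuse`, `track_step`, `process_var`, `visual_var`, `audio_var`) is an assumption made up for illustration. The point is only that an absent observation contributes no likelihood term, so the tracker keeps running on whichever modality is available.

```python
from typing import Optional, Tuple


def fuse(mean: float, var: float,
         obs: Optional[float], obs_var: float) -> Tuple[float, float]:
    """Bayesian update of a 1-D Gaussian belief with one Gaussian observation.

    If the observation is missing (None), the belief is returned unchanged.
    """
    if obs is None:
        return mean, var
    gain = var / (var + obs_var)           # weight given to the observation
    new_mean = mean + gain * (obs - mean)  # pull the belief towards it
    new_var = (1.0 - gain) * var           # uncertainty shrinks
    return new_mean, new_var


def track_step(mean: float, var: float,
               visual: Optional[float], audio: Optional[float],
               process_var: float = 0.5,
               visual_var: float = 1.0,
               audio_var: float = 4.0) -> Tuple[float, float]:
    """One predict/update step; the noise values are hypothetical."""
    var = var + process_var                          # predict: random-walk motion
    mean, var = fuse(mean, var, visual, visual_var)  # visual update, if present
    mean, var = fuse(mean, var, audio, audio_var)    # audio update, if present
    return mean, var


if __name__ == "__main__":
    print(track_step(0.0, 1.0, visual=2.0, audio=2.0))    # both modalities
    print(track_step(0.0, 1.0, visual=None, audio=2.0))   # audio only
    print(track_step(0.0, 1.0, visual=None, audio=None))  # nothing observed
```

With both observations the variance shrinks the most, with audio alone it shrinks less, and with neither it simply grows by the process noise. The actual method handles multiple speakers and keeps audio and visual features in their own representation spaces, which this sketch does not attempt.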

References:

  1. Y. Ban, L. Girin, X. Alameda-Pineda, and R. Horaud, “Exploiting the Complementarity of Audio-Visual Data for Probabilistic Multi-Speaker Tracking,” in IEEE ICCV Workshop on Computer Vision for Audio-Visual Media, Venice, Italy, 2017.
    @inproceedings{Ban-CVAVM-2017,
      author = {Yutong Ban and Laurent Girin and Xavier Alameda-Pineda and Radu Horaud},
      title = {Exploiting the Complementarity of Audio-Visual Data for Probabilistic Multi-Speaker Tracking},
      booktitle = {IEEE ICCV Workshop on Computer Vision for Audio-Visual Media},
      year = {2017},
      address = {Venice, Italy},
      pdf = {http://xavirema.eu/wp-content/papercite-data/pdf/Ban-CVAVM-2017.pdf}
    }
