ICCVW 2017 paper: "Exploiting the Complementarity of Audio-Visual Data for Probabilistic Multi-Speaker Tracking"