Deep audio-visual speech recognition