no code implementations • 4 Nov 2023 • Veera Raghavendra Elluru, Devang Kulshreshtha, Rohit Paturi, Sravan Bodapati, Srikanth Ronanki
Our multimodal training approach improves the accuracy of zero-shot intent classification on unseen intents of SLURP by 2.75% and 18.2% for the SLURP and internal goal-oriented dialog datasets, respectively, compared to audio-only training.