no code implementations • 21 Aug 2024 • Sharath Turuvekere Sreenivas, Saurav Muralidharan, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary, Pavlo Molchanov, Mohammad Shoeybi, Jan Kautz, Ameya Sunil Mahabaleshwarkar, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, Zijia Chen, Yoshi Suhara, Shizhe Diao, Chenhan Yu, Wei-Chun Chen, Hayley Ross, Daniel Korzekwa, Oluwatobi Olabiyi, Ashwath Aithal, Bryan Catanzaro
We present a comprehensive report on compressing the Llama 3.1 8B and Mistral NeMo 12B models to 4B and 8B parameters, respectively, using pruning and distillation.
no code implementations • 3 Sep 2019 • Oluwatobi Olabiyi, Erik T. Mueller, Christopher Larson, Tarek Lahlou
Our experiments show that adversarial bootstrapping is effective at addressing exposure bias, leading to improvements in response relevance and coherence.
no code implementations • WS 2020 • Oluwatobi Olabiyi, Erik T. Mueller
Neural dialogue models, despite their successes, still suffer from a lack of relevance, diversity, and, in many cases, coherence in their generated responses.
no code implementations • NAACL 2019 • Oluwatobi Olabiyi, Anish Khazane, Alan Salimov, Erik T. Mueller
In this paper, we extend the persona-based sequence-to-sequence (Seq2Seq) neural network conversation model to a multi-turn dialogue scenario by modifying the state-of-the-art hredGAN architecture to simultaneously capture utterance attributes such as speaker identity, dialogue topic, and speaker sentiment.
no code implementations • WS 2019 • Oluwatobi Olabiyi, Alan Salimov, Anish Khazane, Erik T. Mueller
We propose an adversarial learning approach for generating multi-turn dialogue responses.
no code implementations • 7 Jun 2017 • Oluwatobi Olabiyi, Eric Martinson, Vijay Chintalapudi, Rui Guo
In this paper, we formulate driver action prediction as a time-series anomaly prediction problem.