no code implementations • 7 Feb 2025 • Yue Zhao, Fuzhao Xue, Scott Reed, Linxi Fan, Yuke Zhu, Jan Kautz, Zhiding Yu, Philipp Krähenbühl, De-An Huang
We validate the effectiveness of QLIP for multimodal understanding and text-conditioned image generation with a single model.
1 code implementation • 23 Dec 2024 • Qi Jia, Siyu Ren, Ziheng Qin, Fuzhao Xue, Jinjie Ni, Yang You
In contrast, the diversity score is defined over the samples' responses, taking their informativeness into account.
no code implementations • 17 Oct 2024 • Jinjie Ni, YiFan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Shieh
Perceiving and generating diverse modalities are crucial for AI models to effectively learn from and engage with real-world signals, necessitating reliable evaluations for their development.
1 code implementation • 19 Aug 2024 • Yukang Chen, Fuzhao Xue, Dacheng Li, Qinghao Hu, Ligeng Zhu, Xiuyu Li, Yunhao Fang, Haotian Tang, Shang Yang, Zhijian Liu, Ethan He, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Linxi Fan, Yuke Zhu, Yao Lu, Song Han
We introduce the long-context Multi-Modal Sequence Parallelism (MM-SP) system that efficiently parallelizes long video training and inference, enabling 2M context length training on 256 GPUs without any gradient checkpointing (the underlying sequence-sharding idea is sketched below).
Ranked #9 on Video Question Answering on NExT-QA
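The core idea behind MM-SP is to shard the activations of a very long multimodal token sequence across GPUs so that no single device ever holds the full context. Below is a minimal sketch of that sharding step; the function name, shapes, and padding scheme are illustrative assumptions, not the released MM-SP implementation.

```python
# Illustrative sketch of sequence-level sharding (not the released MM-SP code).
# A long multimodal token sequence is split into contiguous chunks, one per GPU,
# so activation memory per device stays bounded as context length grows.
from typing import List
import torch

def shard_sequence(tokens: torch.Tensor, world_size: int) -> List[torch.Tensor]:
    """Split a (seq_len, hidden) activation tensor into `world_size` contiguous chunks.

    Each chunk would live on a different GPU; attention across chunks then needs
    collective communication (e.g. gathering keys/values), which is where a
    system like MM-SP spends its engineering effort.
    """
    seq_len = tokens.shape[0]
    chunk = -(-seq_len // world_size)          # ceil division
    pad = chunk * world_size - seq_len
    if pad:                                     # pad so every rank gets an equal slice
        tokens = torch.cat([tokens, tokens.new_zeros(pad, tokens.shape[1])])
    return list(tokens.split(chunk, dim=0))

# Example: a 2M-token context split over 256 devices -> ~8K tokens per device.
chunks = shard_sequence(torch.zeros(2_000_000, 8), world_size=256)
print(len(chunks), chunks[0].shape)            # 256 chunks of shape (7813, 8)
```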
no code implementations • 26 Jul 2024 • Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich, Jonah Philion, Xinshuo Weng, Fuzhao Xue, Andrew Tao, Ming-Yu Liu, Sanja Fidler, Boris Ivanovic, Trevor Darrell, Jitendra Malik, Song Han, Marco Pavone
Finally, we establish a benchmark for video captioning and introduce a leaderboard, aiming to accelerate advancements in video understanding, captioning, and data alignment.
no code implementations • 3 Jun 2024 • Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You
Our benchmarks' advantages lie in (1) a 0.96 model ranking correlation with Chatbot Arena arising from the highly impartial query distribution and grading mechanism, (2) fast, cheap, and reproducible execution (6% of the time and cost of MMLU), and (3) dynamic evaluation enabled by the rapid and stable data update pipeline.
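The reported 0.96 is a correlation between how models rank on this benchmark and how they rank on Chatbot Arena. As a hedged illustration only, the snippet below computes such a ranking correlation with SciPy's Spearman coefficient on made-up scores; the paper's exact correlation measure and data may differ.

```python
# Hedged illustration: how a model-ranking correlation like the reported 0.96
# could be computed. Spearman's rank correlation is used here purely as an
# example, and the scores are made up.
from scipy.stats import spearmanr

arena_scores = {"model_a": 1250, "model_b": 1180, "model_c": 1100, "model_d": 1020}
bench_scores = {"model_a": 0.81, "model_b": 0.74, "model_c": 0.70, "model_d": 0.62}

models = sorted(arena_scores)                       # fixed ordering of model names
rho, _ = spearmanr([arena_scores[m] for m in models],
                   [bench_scores[m] for m in models])
print(f"ranking correlation: {rho:.2f}")            # 1.00 for this toy, monotone data
```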
1 code implementation • 29 Jan 2024 • Fuzhao Xue, Zian Zheng, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou, Yang You
To help the open-source community have a better understanding of Mixture-of-Experts (MoE) based large language models (LLMs), we train and release OpenMoE, a series of fully open-sourced and reproducible decoder-only MoE LLMs, ranging from 650M to 34B parameters and trained on up to over 1T tokens.
1 code implementation • NeurIPS 2023 • Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You
By leveraging this information, we introduce an efficient sequence scheduling technique that groups queries with similar response lengths into micro-batches.
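The sketch below illustrates the general idea of length-aware micro-batching: sort queries by a predicted response length so each micro-batch contains responses of similar length and wastes fewer padded decoding steps. The predictor, batch size, and function names are illustrative assumptions rather than the paper's exact scheduler.

```python
# Minimal sketch of length-aware micro-batching (not the paper's exact scheduler).
# Queries are sorted by a predicted response length so each micro-batch contains
# similar-length responses, reducing wasted decoding steps from padding.
from typing import Callable, List

def schedule(queries: List[str],
             predict_len: Callable[[str], int],
             micro_batch_size: int) -> List[List[str]]:
    ordered = sorted(queries, key=predict_len)
    return [ordered[i:i + micro_batch_size]
            for i in range(0, len(ordered), micro_batch_size)]

# Toy usage with a dummy length predictor (a real one would be a learned model).
queries = ["summarize this book", "hi", "explain attention", "ok?"]
batches = schedule(queries, predict_len=lambda q: len(q), micro_batch_size=2)
print(batches)   # short queries batched together, long ones together
```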
1 code implementation • Tiny Papers @ ICLR 2023 • Xiao Liu, Jian Zhang, Heng Zhang, Fuzhao Xue, Yang You
We evaluate our model on various dialogue understanding tasks including dialogue relation extraction, dialogue emotion recognition, and dialogue act classification.
Ranked #1 on Dialog Relation Extraction on DialogRE
1 code implementation • 30 Jan 2023 • Fuzhao Xue, Valerii Likhosherstov, Anurag Arnab, Neil Houlsby, Mostafa Dehghani, Yang You
However, most standard neural networks have a fixed function type and computation budget regardless of the sample's nature or difficulty.
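One common way to make the computation budget depend on the sample, shown purely as a generic illustration (and not necessarily this paper's mechanism), is confidence-based early exit: easy inputs stop after a few layers, hard inputs use the full depth. The thresholds and layer counts below are made up.

```python
# Generic illustration of sample-dependent compute (NOT this paper's specific
# method): an early-exit loop that stops running further layers once the
# intermediate prediction is confident enough.
import torch
import torch.nn as nn

class EarlyExitMLP(nn.Module):
    def __init__(self, dim: int = 32, n_classes: int = 10, depth: int = 6):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))
        self.heads = nn.ModuleList(nn.Linear(dim, n_classes) for _ in range(depth))

    def forward(self, x: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
        for block, head in zip(self.blocks, self.heads):
            x = torch.relu(block(x))
            probs = head(x).softmax(dim=-1)
            if probs.max() > threshold:        # easy sample: stop computing early
                return probs
        return probs                            # hard sample: used the full depth

probs = EarlyExitMLP()(torch.randn(1, 32))
```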
no code implementations • 21 May 2022 • Fuzhao Xue, Jianghai Chen, Aixin Sun, Xiaozhe Ren, Zangwei Zheng, Xiaoxin He, Yongming Chen, Xin Jiang, Yang You
In this paper, we revisit these conventional configurations.
Ranked #102 on Image Classification on ImageNet
1 code implementation • 13 Apr 2022 • Zangwei Zheng, Pengtai Xu, Xuan Zou, Da Tang, Zhen Li, Chenguang Xi, Peng Wu, Leqi Zou, Yijie Zhu, Ming Chen, Xiangzhuo Ding, Fuzhao Xue, Ziheng Qin, Youlong Cheng, Yang You
Our experiments show that previous scaling rules fail in the training of CTR prediction neural networks.
1 code implementation • CVPR 2022 • Wangbo Zhao, Kai Wang, Xiangxiang Chu, Fuzhao Xue, Xinchao Wang, Yang You
Text-based video segmentation aims to segment the target object in a video based on a sentence describing that object.
Ranked #10 on Referring Expression Segmentation on A2D Sentences
Optical Flow Estimation, Referring Expression Segmentation, +4
no code implementations • 26 Jan 2022 • Fuzhao Xue, Xiaoxin He, Xiaozhe Ren, Yuxuan Lou, Yang You
Mixture-of-experts (MoE) is a powerful sparse architecture comprising multiple expert networks, of which only a few are activated for each input.
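A minimal top-1 gated MoE layer looks roughly like the sketch below; the dimensions, expert count, and routing details are illustrative, and production MoE LLMs add load-balancing losses, capacity limits, and expert parallelism.

```python
# Minimal top-1 gated Mixture-of-Experts layer (illustrative only).
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, dim: int = 64, n_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)                   # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:        # x: (tokens, dim)
        scores = self.gate(x).softmax(dim=-1)                   # (tokens, n_experts)
        weight, idx = scores.max(dim=-1)                        # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e                                     # tokens routed to expert e
            if mask.any():
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

out = MoELayer()(torch.randn(8, 64))    # 8 tokens, each handled by its top-1 of 4 experts
```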
no code implementations • 1 Nov 2021 • Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You
Deep learning has achieved promising results on a wide spectrum of AI applications.
no code implementations • 5 Sep 2021 • Yuxuan Lou, Fuzhao Xue, Zangwei Zheng, Yang You
Mixture-of-Experts (MoE), a conditional computation architecture, has achieved promising performance by scaling the local module (i.e., the feed-forward network) of the Transformer.
no code implementations • 10 Aug 2021 • Andrew Koh, Fuzhao Xue, Eng Siong Chng
In this paper, we examine the use of transfer learning with Pretrained Audio Neural Networks (PANNs) and propose an architecture that better leverages the acoustic features provided by PANNs for the automated audio captioning task.
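As a rough, hedged sketch of this transfer-learning setup, a frozen pretrained audio encoder supplies acoustic features that condition a caption decoder. The loader below is a hypothetical placeholder, not the actual PANNs API, and the real architecture and dimensions differ.

```python
# Hedged sketch: frozen pretrained audio encoder feeding a caption decoder.
# `load_pretrained_audio_encoder` is a hypothetical placeholder for a PANN.
import torch
import torch.nn as nn

def load_pretrained_audio_encoder() -> nn.Module:              # placeholder for a PANN
    return nn.Sequential(nn.Conv1d(64, 256, kernel_size=3, padding=1), nn.ReLU())

class AudioCaptioner(nn.Module):
    def __init__(self, vocab_size: int = 5000, d_model: int = 256):
        super().__init__()
        self.encoder = load_pretrained_audio_encoder()
        for p in self.encoder.parameters():                     # transfer learning:
            p.requires_grad = False                             # keep the encoder frozen
        self.embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, mel: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        memory = self.encoder(mel).transpose(1, 2)              # (B, T, d_model) features
        hidden = self.decoder(self.embed(tokens), memory)       # caption tokens attend to audio
        return self.lm_head(hidden)                             # next-token logits

logits = AudioCaptioner()(torch.randn(2, 64, 100), torch.randint(0, 5000, (2, 12)))
```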
1 code implementation • 25 Jul 2021 • Fuzhao Xue, Ziji Shi, Futao Wei, Yuxuan Lou, Yong liu, Yang You
To achieve better performance with fewer trainable parameters, recent methods propose going shallower via parameter sharing or model compression along the depth dimension (a minimal parameter-sharing sketch follows this entry).
Ranked #723 on Image Classification on ImageNet
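A minimal illustration of the parameter-sharing route to "going shallower" is to reuse one Transformer block at every depth, so trainable parameters stay constant as effective depth grows. The sketch below is illustrative only, not the paper's exact scheme.

```python
# Minimal sketch of cross-layer parameter sharing: one Transformer block reused
# at every depth, so trainable parameters do not grow with the number of layers.
import torch
import torch.nn as nn

class SharedDepthEncoder(nn.Module):
    def __init__(self, d_model: int = 128, depth: int = 12):
        super().__init__()
        self.depth = depth
        self.block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.depth):          # same weights applied `depth` times
            x = self.block(x)
        return x

model = SharedDepthEncoder()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} trainable parameters for an effective depth of {model.depth}")
```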
no code implementations • 26 May 2021 • Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You
That is, with sparse attention, our sequence parallelism enables us to train Transformers on infinitely long sequences.
no code implementations • 10 May 2021 • Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Erik Cambria
To the best of our knowledge, this survey is the most comprehensive and up-to-date one at present for deep learning-based dialogue systems, extensively covering the popular techniques.
1 code implementation • 27 Dec 2020 • Fuzhao Xue, Aixin Sun, Hao Zhang, Jinjie Ni, Eng Siong Chng
Dialogue relation extraction (RE) aims to predict the relation type between two entities mentioned in a dialogue.
Ranked #9 on Dialog Relation Extraction on DialogRE
1 code implementation • 12 Dec 2020 • Fuzhao Xue, Aixin Sun, Hao Zhang, Eng Siong Chng
Recent advances on the RE task come from BERT-based sequence modeling and graph-based modeling of relationships among the tokens in the sequence (a graph message-passing sketch follows this entry).
Ranked #4 on Dialog Relation Extraction on DialogRE (F1c (v1) metric)
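The graph-based half of this line of work can be sketched as message passing over token or entity representations linked by an adjacency matrix. The layer below is a generic, illustrative example; the papers' actual graph construction and node types differ.

```python
# Minimal sketch of graph-based modeling over token/entity representations:
# one round of mean-aggregation message passing along an adjacency matrix.
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (nodes, dim) node features, adj: (nodes, nodes) 0/1 edges
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1)
        msg = adj @ h / deg                         # mean of neighbour features
        return torch.relu(self.proj(msg) + h)       # residual update

h = torch.randn(5, 64)                              # e.g. 2 entity + 3 utterance nodes
adj = torch.ones(5, 5)                              # fully connected toy graph
h = GraphLayer()(h, adj)
```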
no code implementations • ICML 2020 • Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang
Lying at the core of human intelligence, relational thinking is characterized by initially relying on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge, consequently becoming a recognizable concept or object through coupling and transformation of these percepts.
Automatic Speech Recognition (ASR), +1