1 code implementation • 27 Dec 2024 • DeepSeek-AI, Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jiawei Wang, Jin Chen, Jingchang Chen, Jingyang Yuan, Junjie Qiu, Junlong Li, Junxiao Song, Kai Dong, Kai Hu, Kaige Gao, Kang Guan, Kexin Huang, Kuai Yu, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Litong Wang, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qiancheng Wang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruisong Zhang, Ruizhe Pan, Runji Wang, Runxin Xu, Ruoyu Zhang, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Shuting Pan, T. Wang, Tao Yun, Tian Pei, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wanjia Zhao, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wenqin Yu, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaokang Zhang, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Cheng, Xin Liu, Xin Xie, Xingchao Liu, Xingkai Yu, Xinnan Song, Xinxia Shan, Xinyi Zhou, Xinyu Yang, Xinyuan Li, Xuecheng Su, Xuheng Lin, Y. K. Li, Y. Q. Wang, Y. X. Wei, Y. X. Zhu, Yang Zhang, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Yu, Yi Zheng, Yichao Zhang, Yifan Shi, Yiliang Xiong, Ying He, Ying Tang, Yishi Piao, Yisong Wang, Yixuan Tan, Yiyang Ma, Yiyuan Liu, Yongqiang Guo, Yu Wu, Yuan Ou, Yuchen Zhu, Yuduan Wang, Yue Gong, Yuheng Zou, Yujia He, Yukun Zha, Yunfan Xiong, Yunxian Ma, Yuting Yan, Yuxiang Luo, Yuxiang You, Yuxuan Liu, Yuyang Zhou, Z. F. Wu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhean Xu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhengyan Zhang, Zhewen Hao, Zhibin Gou, Zhicheng Ma, Zhigang Yan, Zhihong Shao, Zhipeng Xu, Zhiyu Wu, Zhongyu Zhang, Zhuoshu Li, Zihui Gu, Zijia Zhu, Zijun Liu, Zilin Li, Ziwei Xie, Ziyang Song, Ziyi Gao, Zizheng Pan
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
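For readers unfamiliar with sparse activation: a learned router selects a few experts per token, so only a small fraction of the total parameters participates in each forward pass. Below is a minimal, generic top-k MoE layer for illustration only; the dimensions, `n_experts`, and `top_k` are made up, and this is not DeepSeek-V3's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k routed MoE layer (illustrative sketch only)."""
    def __init__(self, dim=512, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)   # token -> expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                          # x: (n_tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                # only top_k experts run per token
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out
```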
1 code implementation • 13 Dec 2024 • Zhiyu Wu, Xiaokang Chen, Zizheng Pan, Xingchao Liu, Wen Liu, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, Zhenda Xie, Yu Wu, Kai Hu, Jiawei Wang, Yaofeng Sun, Yukun Li, Yishi Piao, Kang Guan, Aixin Liu, Xin Xie, Yuxiang You, Kai Dong, Xingkai Yu, Haowei Zhang, Liang Zhao, Yisong Wang, Chong Ruan
We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL, through two major upgrades.
Ranked #1 on Referring Expression Comprehension on RefCOCOg-test
1 code implementation • 12 Nov 2024 • Yiyang Ma, Xingchao Liu, Xiaokang Chen, Wen Liu, Chengyue Wu, Zhiyu Wu, Zizheng Pan, Zhenda Xie, Haowei Zhang, Xingkai Yu, Liang Zhao, Yisong Wang, Jiaying Liu, Chong Ruan
To further improve the performance of our unified model, we adopt two key strategies: (i) decoupling the understanding and generation encoders, and (ii) aligning their representations during unified training.
Ranked #197 on Visual Question Answering on MM-Vet
1 code implementation • 17 Oct 2024 • Chengyue Wu, Xiaokang Chen, Zhiyu Wu, Yiyang Ma, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan, Ping Luo
In this paper, we introduce Janus, an autoregressive framework that unifies multimodal understanding and generation.
Ranked #164 on Visual Question Answering on MM-Vet
1 code implementation • 17 Jul 2024 • Yuanzhi Zhu, Xingchao Liu, Qiang Liu
The rectified flow framework trains one-step generative models using two operations, reflow and distillation.
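Reflow retrains the flow on couplings produced by simulating its own ODE, which straightens the learned trajectories; distillation then collapses the straightened flow into a single step. A minimal sketch of the reflow pair-generation step, assuming a `velocity_model(z, t)` network and plain Euler integration (both placeholders, not the paper's released code):

```python
import torch

@torch.no_grad()
def generate_reflow_pairs(velocity_model, n_pairs, dim, n_steps=100):
    """Simulate the current flow's ODE from noise to obtain (z0, z1) pairs.

    Retraining on these couplings (instead of independent noise/data
    pairings) straightens the trajectories -- the 'reflow' operation.
    """
    z0 = torch.randn(n_pairs, dim)        # starting noise
    z = z0.clone()
    dt = 1.0 / n_steps
    for i in range(n_steps):              # Euler integration of dz/dt = v(z, t)
        t = torch.full((n_pairs, 1), i * dt)
        z = z + dt * velocity_model(z, t)
    return z0, z                          # train v so the path from z0 to z is straight
```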
1 code implementation • 2 Jul 2024 • Ling Yang, Zixiang Zhang, Zhilong Zhang, Xingchao Liu, Minkai Xu, Wentao Zhang, Chenlin Meng, Stefano Ermon, Bin Cui
Additionally, we propose a multi-segment training approach for Consistency-FM to enhance expressiveness, achieving a better trade-off between sampling quality and speed.
1 code implementation • 14 May 2024 • Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, Zheng Fang, Weiyan Wang, Jinbao Xue, Yangyu Tao, Jianchen Zhu, Kai Liu, Sihuan Lin, Yifu Sun, Yun Li, Dongdong Wang, Mingtao Chen, Zhichao Hu, Xiao Xiao, Yan Chen, Yuhong Liu, Wei Liu, Di Wang, Yong Yang, Jie Jiang, Qinglin Lu
For fine-grained language understanding, we train a Multimodal Large Language Model to refine the captions of the images.
1 code implementation • 13 May 2024 • Hanshu Yan, Xingchao Liu, Jiachun Pan, Jun Hao Liew, Qiang Liu, Jiashi Feng
We present Piecewise Rectified Flow (PeRFlow), a flow-based method for accelerating diffusion models.
no code implementations • 25 Mar 2024 • Shujian Zhang, Lemeng Wu, Chengyue Gong, Xingchao Liu
Extensive experiments and ablation studies demonstrate that our method is general, effective, and beneficial across many NLP tasks.
1 code implementation • 6 Feb 2024 • Xixi Hu, Bo Liu, Xingchao Liu, Qiang Liu
To address this challenge, we propose AdaFlow, an imitation learning framework based on flow-based generative modeling.
2 code implementations • 12 Sep 2023 • Xingchao Liu, Xiwen Zhang, Jianzhu Ma, Jian Peng, Qiang Liu
Leveraging our new pipeline, we create, to the best of our knowledge, the first one-step diffusion-based text-to-image generator with SD-level image quality, achieving an FID (Fréchet Inception Distance) of $23.3$ on MS COCO 2017-5k, surpassing the previous state-of-the-art technique, progressive distillation, by a significant margin ($37.2 \rightarrow 23.3$ in FID).
no code implementations • 4 May 2023 • Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments from data processing to model architecture, hyperparameter tuning, and predicted training log.
no code implementations • CVPR 2023 • Xingchao Liu, Lemeng Wu, Shujian Zhang, Chengyue Gong, Wei Ping, Qiang Liu
To further accelerate the computation of back-propagation, we propose to use a non-uniform discretization to approximate the ODE trajectory, where we measure how straight the trajectory is and gather the straight parts into one discretization step.
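One way to read this: simulate a fine-grained trajectory once, and merge consecutive steps whose velocity barely changes into a single coarse step. The sketch below is our loose interpretation under that assumption, not the paper's algorithm; `velocity_model(z, t)` is a placeholder taking a batch and a scalar time.

```python
import torch

@torch.no_grad()
def nonuniform_grid(velocity_model, z0, n_fine=128, tol=1e-2):
    """Merge nearly-straight stretches of a fine Euler trajectory into
    single coarse steps, yielding a non-uniform time discretization."""
    ts = torch.linspace(0.0, 1.0, n_fine + 1)
    z = z0.clone()
    v_anchor = velocity_model(z, ts[0])
    grid = [0.0]
    for i in range(n_fine):
        v = velocity_model(z, ts[i])
        z = z + (ts[i + 1] - ts[i]) * v     # advance along the fine trajectory
        # if the velocity drifted from the segment's anchor, close the segment
        if (v - v_anchor).norm() / (v_anchor.norm() + 1e-8) > tol:
            grid.append(float(ts[i]))
            v_anchor = v
    grid.append(1.0)
    return grid                             # reuse this coarse grid for fast sampling
```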
1 code implementation • CVPR 2023 • Lemeng Wu, Dilin Wang, Chengyue Gong, Xingchao Liu, Yunyang Xiong, Rakesh Ranjan, Raghuraman Krishnamoorthi, Vikas Chandra, Qiang Liu
We perform evaluations on multiple 3D tasks and find that our PSF performs comparably to the standard diffusion model, outperforming other efficient 3D point cloud generation methods.
no code implementations • 2 Nov 2022 • Shujian Zhang, Chengyue Gong, Xingchao Liu
Experiments on different tasks across open question answering, dialogue conversation, and fact verification show that our method consistently outperforms its baselines.
no code implementations • 6 Oct 2022 • Yan Zheng, Lemeng Wu, Xingchao Liu, Zhen Chen, Qiang Liu, QiXing Huang
We first propose a diffusion-based generative model to tackle this problem by generating voxelized shapes with close-to-reality outlines and structures.
6 code implementations • 7 Sep 2022 • Xingchao Liu, Chengyue Gong, Qiang Liu
The idea of rectified flow is to learn the ODE to follow the straight paths connecting the points drawn from $\pi_0$ and $\pi_1$ as much as possible.
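Concretely, the training objective regresses the model's velocity onto the straight-line direction between paired endpoints. A minimal sketch of this loss, assuming a `velocity_model(x_t, t)` interface (our placeholder, not the released code's API):

```python
import torch

def rectified_flow_loss(velocity_model, x0, x1):
    """Rectified flow objective: along the straight interpolation
    x_t = (1 - t) * x0 + t * x1, regress the model's velocity onto the
    constant line direction x1 - x0."""
    t = torch.rand(x0.shape[0], 1)         # one time per sample, t ~ U[0, 1]
    x_t = (1 - t) * x0 + t * x1            # point on the straight path
    target = x1 - x0                       # straight-line velocity
    return ((velocity_model(x_t, t) - target) ** 2).mean()
```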
no code implementations • 2 Sep 2022 • Lemeng Wu, Chengyue Gong, Xingchao Liu, Mao Ye, Qiang Liu
AI-based molecule generation provides a promising approach to a broad range of problems in biomedical science and engineering, such as antibody design, hydrolase engineering, and vaccine development.
no code implementations • 31 Aug 2022 • Xingchao Liu, Lemeng Wu, Mao Ye, Qiang Liu
Diffusion-based generative models have achieved promising results recently, but raise an array of open questions in terms of conceptual understanding, theoretical analysis, algorithm improvement and extensions to discrete, structured, non-Euclidean domains.
1 code implementation • 20 Jun 2022 • Ruqi Zhang, Xingchao Liu, Qiang Liu
We propose discrete Langevin proposal (DLP), a simple and scalable gradient-based proposal for sampling complex high-dimensional discrete distributions.
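The proposal factorizes over coordinates, combining a first-order Taylor term with a quadratic locality penalty, so all coordinates can be updated in parallel. A minimal sketch for binary variables, without the optional Metropolis-Hastings correction; the `grad_f` interface (gradient of the log-density, relaxed to the reals) is our assumption:

```python
import torch

def dlp_step(x, grad_f, alpha=0.1):
    """One uncorrected discrete Langevin proposal step for x in {0,1}^d.

    Per-coordinate logits combine the Taylor term grad_f * (x' - x) with
    a locality penalty (x' - x)^2 / (2 * alpha) -- a discrete analogue
    of Langevin dynamics."""
    diff0 = 0.0 - x                                  # move coordinate to value 0
    diff1 = 1.0 - x                                  # move coordinate to value 1
    logit0 = 0.5 * grad_f * diff0 - diff0 ** 2 / (2 * alpha)
    logit1 = 0.5 * grad_f * diff1 - diff1 ** 2 / (2 * alpha)
    p1 = torch.sigmoid(logit1 - logit0)              # per-coordinate distribution
    return torch.bernoulli(p1)                       # all coordinates in parallel
```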
no code implementations • Findings (NAACL) 2022 • Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou
Active learning, which effectively collects informative unlabeled data for annotation, reduces the demand for labeled data.
1 code implementation • 2 Dec 2021 • Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, Qiang Liu
We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (GAN), optimizing in the latent space of the GAN to find images that achieve the maximum CLIP score for the given input text; a sketch of this recipe appears below.
Ranked #46 on Text-to-Image Generation on MS COCO
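The underlying recipe is gradient ascent on the CLIP score through the frozen generator, typically with multiple latent restarts. A minimal sketch under that reading; `generator`, `clip_score`, and `latent_dim` are hypothetical placeholders, not the released API:

```python
import torch

def optimize_latent(generator, clip_score, text, steps=200, lr=0.05, n_init=16):
    """Search a GAN's latent space for the image maximizing the CLIP
    image-text score.  `generator(z)` -> images and
    `clip_score(images, text)` -> per-image scalars are placeholders."""
    z = torch.randn(n_init, generator.latent_dim, requires_grad=True)  # restarts
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -clip_score(generator(z), text).sum()   # ascend the CLIP score
        loss.backward()
        opt.step()
    scores = clip_score(generator(z), text)
    return generator(z[scores.argmax()].unsqueeze(0))  # keep the best candidate
```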
no code implementations • NeurIPS 2021 • Chengyue Gong, Xingchao Liu, Qiang Liu
In this work, we consider constrained optimization as a more principled approach for trading off two losses, with a special emphasis on lexicographic optimization, a degenerated limit of constrained optimization which optimizes a secondary loss inside the optimal set of the main loss.
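In symbols, lexicographically optimizing a secondary loss $\mathcal{L}_2$ inside the optimal set of a main loss $\mathcal{L}_1$ can be written as the $\epsilon \to 0^{+}$ limit of a constrained problem (our notation, not necessarily the paper's):

```latex
\min_{\theta} \; \mathcal{L}_2(\theta)
\quad \text{s.t.} \quad
\mathcal{L}_1(\theta) \le \min_{\theta'} \mathcal{L}_1(\theta') + \epsilon,
\qquad \epsilon \to 0^{+}
```

so the secondary loss is only traded off within the (near-)optimal set of the main loss.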
1 code implementation • NeurIPS 2021 • Xingchao Liu, Xin Tong, Qiang Liu
In this work, we propose a family of constrained sampling algorithms which generalize Langevin Dynamics (LD) and Stein Variational Gradient Descent (SVGD) to incorporate a moment constraint specified by a general nonlinear function.
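One common way to realize such a scheme is a primal-dual loop: run Langevin dynamics on a tilted potential while a Lagrange multiplier adapts to enforce the moment constraint. The sketch below illustrates only that generic idea under our assumptions; it is not the paper's exact algorithm:

```python
import torch

def constrained_langevin(grad_logp, g, grad_g, x, n_iter=1000, step=1e-3, lam_lr=1e-2):
    """Langevin dynamics with an adaptive multiplier penalizing violation
    of a moment constraint E[g(x)] <= 0 (loose primal-dual sketch)."""
    lam = torch.zeros(1)
    for _ in range(n_iter):
        drift = grad_logp(x) - lam * grad_g(x)                 # tilted score
        x = x + step * drift + (2 * step) ** 0.5 * torch.randn_like(x)
        lam = (lam + lam_lr * g(x).mean()).clamp(min=0.0)      # dual ascent
    return x
```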
1 code implementation • NeurIPS 2021 • Xingchao Liu, Xin Tong, Qiang Liu
Finding diverse and representative Pareto solutions from the Pareto front is a key challenge in multi-objective optimization (MOO).
4 code implementations • NeurIPS 2021 • Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, Qiang Liu
The goal of multi-task learning is to enable more efficient learning than single-task learning by sharing model structures across a diverse set of tasks.
no code implementations • 17 Feb 2021 • Lemeng Wu, Xingchao Liu, Qiang Liu
Self-attention, the key building block of transformers, is a powerful mechanism for extracting features from the inputs.
Ranked #672 on Image Classification on ImageNet
no code implementations • 1 Jan 2021 • Chengyue Gong, Xingchao Liu, Qiang Liu
We apply our method to the recently proposed MoCo, SimCLR, and SwAV, and find that we can reduce the computational cost with little loss of performance on ImageNet linear classification and other downstream tasks.
1 code implementation • NeurIPS 2020 • Xingchao Liu, Xing Han, Na Zhang, Qiang Liu
In this work, we propose to certify the monotonicity of general piecewise-linear neural networks by solving a mixed-integer linear programming (MILP) problem. This provides a new, general approach for learning monotonic neural networks with arbitrary model structures.
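For intuition, consider a one-hidden-layer ReLU network $f(x) = w^{\top}\sigma(Wx + b)$ on a box $\mathcal{X}$: its partial derivative in coordinate $i$ is $\sum_j w_j W_{ji} z_j$, where $z_j$ indicates whether unit $j$ is active, and the standard big-M encoding turns certification into a MILP (an illustrative formulation, not necessarily the paper's exact one):

```latex
\min_{x \in \mathcal{X},\; z \in \{0,1\}^m} \; \sum_{j=1}^{m} w_j W_{ji}\, z_j
\quad \text{s.t.} \quad
W_j x + b_j \le M z_j, \qquad
W_j x + b_j \ge -M (1 - z_j), \qquad j = 1, \dots, m
```

Here $f$ is certified monotonically non-decreasing in $x_i$ over $\mathcal{X}$ exactly when the optimal value is nonnegative.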
no code implementations • 20 Feb 2020 • Xingchao Liu, Mao Ye, Dengyong Zhou, Qiang Liu
We propose multipoint quantization, a method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this contrasts with typical quantization methods, which approximate each weight with a single low-precision number.
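A simple way to realize this is greedy residual fitting: repeatedly quantize the current residual to a low-bit grid and solve for the least-squares scale. The sketch below illustrates that greedy variant; the paper's actual selection procedure may differ:

```python
import torch

def multipoint_quantize(w, n_points=3, n_bits=4):
    """Approximate w by s_1*q_1 + ... + s_K*q_K, where each q_k is a
    low-bit integer vector fitted greedily to the current residual."""
    levels = 2 ** n_bits
    residual, terms = w.clone(), []
    for _ in range(n_points):
        scale = residual.abs().max() / (levels // 2) + 1e-12       # symmetric grid
        q = (residual / scale).round().clamp(-(levels // 2), levels // 2 - 1)
        s = (residual * q).sum() / (q * q).sum().clamp(min=1e-12)  # least-squares scale
        terms.append((float(s), q))
        residual = residual - s * q                                # fit the leftover
    return terms   # w ~= sum(s * q for s, q in terms)
```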
no code implementations • 27 Sep 2018 • Xingchao Liu, Tongzhou Mu, Hao Su
In this paper, we investigate the problem of transfer learning across environments with different dynamics while accomplishing the same task in the continuous control domain.