Search Results for author: Yuchen Zhu

Found 15 papers, 5 papers with code

Plain-Det: A Plain Multi-Dataset Object Detector

1 code implementation 14 Jul 2024 Cheng Shi, Yuchen Zhu, Sibei Yang

Recent advancements in large-scale foundational models have sparked widespread interest in training highly proficient large vision models.

Object object-detection +1

Token-Mol 1.0: Tokenized drug design with large language model

no code implementations 10 Jul 2024 Jike Wang, Rui Qin, Mingyang Wang, Meijing Fang, Yangyang Zhang, Yuchen Zhu, Qun Su, Qiaolin Gou, Chao Shen, Odin Zhang, Zhenxing Wu, Dejun Jiang, Xujun Zhang, Huifeng Zhao, Xiaozhe Wan, Zhourui Wu, Liwei Liu, Yu Kang, Chang-Yu Hsieh, Tingjun Hou

This model encodes all molecular information, including 2D and 3D structures, as well as molecular property data, into tokens, which transforms classification and regression tasks in drug discovery into probabilistic prediction problems, thereby enabling learning through a unified paradigm.

Drug Discovery Language Modelling +5

Structured Learning of Compositional Sequential Interventions

no code implementations 9 Jun 2024 Jialin Yu, Andreas Koukorinis, Nicolò Colombo, Yuchen Zhu, Ricardo Silva

We consider sequential treatment regimes where each unit is exposed to combinations of interventions over time.

Matrix Completion

Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups

no code implementations 25 May 2024 Yuchen Zhu, Tianrong Chen, Lingkai Kong, Evangelos A. Theodorou, Molei Tao

However, our trivialization technique creates a new momentum variable that stays in a simple $\textbf{fixed vector space}$.


DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

1 code implementation 7 May 2024 DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jin Chen, Jingyang Yuan, Junjie Qiu, Junxiao Song, Kai Dong, Kaige Gao, Kang Guan, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruizhe Pan, Runxin Xu, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Size Zheng, T. Wang, Tian Pei, Tian Yuan, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Liu, Xin Xie, Xingkai Yu, Xinnan Song, Xinyi Zhou, Xinyu Yang, Xuan Lu, Xuecheng Su, Y. Wu, Y. K. Li, Y. X. Wei, Y. X. Zhu, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Zheng, Yichao Zhang, Yiliang Xiong, Yilong Zhao, Ying He, Ying Tang, Yishi Piao, Yixin Dong, Yixuan Tan, Yiyuan Liu, Yongji Wang, Yongqiang Guo, Yuchen Zhu, Yuduan Wang, Yuheng Zou, Yukun Zha, Yunxian Ma, Yuting Yan, Yuxiang You, Yuxuan Liu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhewen Hao, Zhihong Shao, Zhiniu Wen, Zhipeng Xu, Zhongyu Zhang, Zhuoshu Li, Zihan Wang, Zihui Gu, Zilin Li, Ziwei Xie

MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

Language Modelling Reinforcement Learning (RL)

A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization

no code implementations 18 Apr 2024 Yuchen Zhu, Yufeng Zhang, Zhaoran Wang, Zhuoran Yang, Xiaohong Chen

Under this regime, the stochastic gradient descent-ascent corresponds to a Wasserstein gradient flow over the space of probability measures defined over the space of neural network parameters.

Representation Learning

Quantum State Generation with Structure-Preserving Diffusion Model

no code implementations 9 Apr 2024 Yuchen Zhu, Tianrong Chen, Evangelos A. Theodorou, Xie Chen, Molei Tao

This article considers the generative modeling of the (mixed) states of quantum systems and proposes an approach based on denoising diffusion models.


Meaningful Causal Aggregation and Paradoxical Confounding

no code implementations 23 Apr 2023 Yuchen Zhu, Kailash Budhathoki, Jonas Kuebler, Dominik Janzing

On the positive side, we show that cause-effect relations can be aggregated when the macro interventions are such that the distribution of micro states is the same as in the observational distribution; we term these natural macro interventions.

Causal Effect Inference for Structured Treatments

2 code implementations NeurIPS 2021 Jean Kaddour, Yuchen Zhu, Qi Liu, Matt J. Kusner, Ricardo Silva

We address the estimation of conditional average treatment effects (CATEs) for structured treatments (e.g., graphs, images, texts).

Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction

2 code implementations 10 May 2021 Afsaneh Mastouri, Yuchen Zhu, Limor Gultchin, Anna Korba, Ricardo Silva, Matt J. Kusner, Arthur Gretton, Krikamol Muandet

In particular, we provide a unifying view of two-stage and moment restriction approaches for solving this problem in a nonlinear setting.

Vocal Bursts Valence Prediction

Model Rectification via Unknown Unknowns Extraction from Deployment Samples

no code implementations 8 Feb 2021 Bruno Abrahao, Zheng Wang, Haider Ahmed, Yuchen Zhu

Model deficiency that results from incomplete training data is a form of structural blindness that leads to costly errors, oftentimes with high confidence.

Active Learning

EQuANt (Enhanced Question Answer Network)

1 code implementation 24 Jun 2019 François-Xavier Aubet, Dominic Danks, Yuchen Zhu

By training and evaluating EQuANt on SQuAD 2, we show that it is indeed possible to extend QANet to the unanswerable domain.

Machine Reading Comprehension Multi-Task Learning +1
