Search Results for author: Di Zhang

Found 42 papers, 10 papers with code

Inductive-Deductive Strategy Reuse for Multi-Turn Instructional Dialogues

no code implementations 17 Apr 2024 Jiao Ou, Jiayu Wu, Che Liu, Fuzheng Zhang, Di Zhang, Kun Gai

Existing methods target instructions from real instruction dialogues as a learning goal and fine-tune a user simulator for posing instructions.

UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark

no code implementations 15 Apr 2024 Zhaokun Zhou, Qiulin Wang, Bin Lin, Yiwei Su, Rui Chen, Xin Tao, Amin Zheng, Li Yuan, Pengfei Wan, Di Zhang

To further evaluate the IAA capability of MLLMs, we construct the UNIAA-Bench, which consists of three aesthetic levels: Perception, Description, and Assessment.

Language Modelling Large Language Model

End-to-end training of Multimodal Model and ranking Model

2 code implementations 9 Apr 2024 Xiuqi Deng, Lu Xu, Xiyao Li, Jinkai Yu, Erpeng Xue, Zhongyuan Wang, Di Zhang, Zhaojie Liu, Guorui Zhou, Yang Song, Na Mou, Shen Jiang, Han Li

In this paper, we propose an industrial multimodal recommendation framework named EM3: End-to-end training of Multimodal Model and ranking Model, which sufficiently utilizes multimodal information and allows personalized ranking tasks to directly train the core modules in the multimodal model to obtain more task-oriented content features, without overburdening resource consumption.

Contrastive Learning Multimodal Recommendation

Motion Inversion for Video Customization

no code implementations 29 Mar 2024 Luozhou Wang, Guibao Shen, Yixun Liang, Xin Tao, Pengfei Wan, Di Zhang, Yijun Li, Yingcong Chen

In this research, we present a novel approach to motion customization in video generation, addressing the widespread gap in the thorough exploration of motion representation within video generative models.

Video Generation

DouRN: Improving DouZero by Residual Neural Networks

no code implementations 21 Mar 2024 Yiquan Chen, Yingchao Lyu, Di Zhang

Deep reinforcement learning has made significant progress in games with imperfect information, but its performance in the card game Doudizhu (Chinese Poker/Fight the Landlord) remains unsatisfactory.

DragAnything: Motion Control for Anything using Entity Representation

2 code implementations 12 Mar 2024 Weijia Wu, Zhuang Li, YuChao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang

We introduce DragAnything, which utilizes an entity representation to achieve motion control for any object in controllable video generation.

Object Video Generation

A Survey on Applications of Reinforcement Learning in Spatial Resource Allocation

no code implementations 6 Mar 2024 Di Zhang, Moyang Wang, Joseph Mango, Xiang Li, Xianrui Xu

Given these advancements, there has been a surge in novel methods employing reinforcement learning to tackle spatial resource allocation problems.

Decision Making reinforcement-learning

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

1 code implementation 26 Feb 2024 Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang

The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but there is still no comprehensive approach for detecting safety issues within LLMs' responses in an aligned, customizable and explainable manner.

Enhancing Role-playing Systems through Aggressive Queries: Evaluation and Improvement

no code implementations 16 Feb 2024 Yihong Tang, Jiao Ou, Che Liu, Fuzheng Zhang, Di Zhang, Kun Gai

Experiments on models improved by RoleAD indicate that our adversarial dataset ameliorates this deficiency, with the improvements demonstrating a degree of generalizability in ordinary scenarios.

Dialogue Generation

ChemLLM: A Chemical Large Language Model

no code implementations 10 Feb 2024 Di Zhang, Wei Liu, Qian Tan, Jingdan Chen, Hang Yan, Yuliang Yan, Jiatong Li, Weiran Huang, Xiangyu Yue, Dongzhan Zhou, Shufei Zhang, Mao Su, Hansen Zhong, Yuqiang Li, Wanli Ouyang

ChemLLM beats GPT-3.5 on all three principal tasks in chemistry, i.e., name conversion, molecular caption, and reaction prediction, and surpasses GPT-4 on two of them.

Language Modelling Large Language Model +2

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

1 code implementation 5 Feb 2024 Yang Jin, Zhicheng Sun, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu

In light of recent advances in multimodal Large Language Models (LLMs), there is increasing attention to scaling them from image-text data to more informative real-world videos.

Video Understanding Visual Question Answering

Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion

no code implementations 5 Feb 2024 Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao

In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.

Object Video Generation

Rethinking Cross-Attention for Infrared and Visible Image Fusion

no code implementations 22 Jan 2024 Lihua Jian, Songlei Xiong, Han Yan, Xiaoguang Niu, Shaowu Wu, Di Zhang

The DIIM is designed by modifying the vanilla cross-attention mechanism, which can promote the extraction of the discrepancy information of the source images.

Infrared And Visible Image Fusion

Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint

1 code implementation 11 Jan 2024 Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen

To address it, we propose a new RL method named RLMEC that incorporates a generative model as the reward model, which is trained by the erroneous solution rewriting task under the minimum editing constraint, and can produce token-level rewards for RL training.

Question Answering Reinforcement Learning (RL)

I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models

no code implementations 27 Dec 2023 Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma

I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.

Video Generation

Paragraph-to-Image Generation with Information-Enriched Diffusion Model

1 code implementation 24 Nov 2023 Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang, Zhongyuan Wang

In this paper, we introduce an information-enriched diffusion model for the paragraph-to-image generation task, termed ParaDiffusion, which delves into the transference of the extensive semantic comprehension capabilities of large language models to the task of image generation.

Image Generation Language Modelling +1

Ask One More Time: Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios

no code implementations 14 Nov 2023 Lei Lin, Jiayi Fu, Pengli Liu, Qingyang Li, Yan Gong, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Although chain-of-thought (CoT) prompting combined with language models has achieved encouraging results on complex reasoning tasks, the naive greedy decoding used in CoT prompting usually causes repetitiveness and local optimality.

Language Modelling

DialogBench: Evaluating LLMs as Human-like Dialogue Systems

no code implementations 3 Nov 2023 Jiao Ou, Junda Lu, Che Liu, Yihong Tang, Fuzheng Zhang, Di Zhang, Kun Gai

In this paper, we propose DialogBench, a dialogue evaluation benchmark that contains 12 dialogue tasks to probe the capabilities that LLMs should have as human-like dialogue systems.

Dialogue Evaluation

USDC: Unified Static and Dynamic Compression for Visual Transformer

no code implementations 17 Oct 2023 Huan Yuan, Chao Liao, Jianchao Tan, Peng Yao, Jiyuan Jia, Bin Chen, Chengru Song, Di Zhang

To alleviate the disadvantages of both categories of methods, we propose to unify the static and dynamic compression techniques to obtain an input-adaptive compressed model, which can better balance the total compression ratio and the model performance.

Model Compression

ASP: Automatic Selection of Proxy dataset for efficient AutoML

no code implementations 17 Oct 2023 Peng Yao, Chao Liao, Jiyuan Jia, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang

Deep neural networks have gained great success due to the increasing amounts of data, and diverse effective neural network designs.

Neural Architecture Search

Parrot: Enhancing Multi-Turn Chat Models by Learning to Ask Questions

no code implementations 11 Oct 2023 Yuchong Sun, Che Liu, Jinwen Huang, Ruihua Song, Fuzheng Zhang, Di Zhang, Zhongyuan Wang, Kun Gai

In this paper, we address these challenges by introducing Parrot, a highly scalable solution designed to automatically generate high-quality instruction-tuning data, which are then used to enhance the effectiveness of chat models in multi-turn conversations.

Attribute Instruction Following

KwaiYiiMath: Technical Report

no code implementations 11 Oct 2023 Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, ShengNan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning.

Ranked #87 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +1

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

1 code implementation 9 Sep 2023 Yang Jin, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu

Specifically, we introduce a well-designed visual tokenizer to translate the non-linguistic image into a sequence of discrete tokens, like a foreign language that an LLM can read.

Language Modelling Large Language Model +1

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

1 code implementation 9 Aug 2023 Jue Chen, Huan Yuan, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang

We propose an improved end-to-end Minimax optimization method for this sparse learning problem to better balance the model performance and the computation efficiency.

Model Compression Sparse Learning

Cyclic Delay-Doppler Shift: A Simple Transmit Diversity Technique for Delay-Doppler Waveforms in Doubly Selective Channels

no code implementations 22 Feb 2023 Haoran Yin, Jiaojiao Xiong, Yu Zhou, Chi Zhang, Di Zhang, Xizhang Wei, Yanqun Tang

Delay-Doppler waveform design has been considered as a promising solution to achieve reliable communication under high-mobility channels for the space-air-ground-integrated networks (SAGIN).

ClusterLog: Clustering Logs for Effective Log-based Anomaly Detection

no code implementations 19 Jan 2023 Chris Egersdoerfer, Dong Dai, Di Zhang

With the increasing prevalence of scalable file systems in the context of High Performance Computing (HPC), the importance of accurate anomaly detection on runtime logs is increasing.

Anomaly Detection Clustering +2

Optimal Settings for Cryptocurrency Trading Pairs

no code implementations 20 Oct 2022 Di Zhang, Youzhou Zhou

2) It satisfies the connectivity constraint, that is, all currencies are guaranteed to be tradable.

Management

Modeling Randomly Walking Volatility with Chained Gamma Distributions

no code implementations 4 Jul 2022 Di Zhang, Qiang Niu, Youzhou Zhou

2) If variational inference (VI) is used for state estimation, it runs much faster than Monte Carlo (MC) methods since the calculation of the posterior uses only basic arithmetic operations.

Time Series Analysis Variational Inference

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

1 code implementation 11 Apr 2022 Yuanxing Zhang, Langshi Chen, Siran Yang, Man Yuan, Huimin Yi, Jie Zhang, Jiamang Wang, Jianbo Dong, Yunlong Xu, Yue Song, Yong Li, Di Zhang, Wei Lin, Lin Qu, Bo Zheng

However, we observe that GPU devices are underutilized when training recommender systems, and they cannot attain the throughput improvement that has been achieved in the CV and NLP areas.

Marketing Recommendation Systems

AMCAD: Adaptive Mixed-Curvature Representation based Advertisement Retrieval System

no code implementations 28 Mar 2022 Zhirong Xu, Shiyang Wen, Junshan Wang, Guojun Liu, Liang Wang, Zhi Yang, Lei Ding, Yan Zhang, Di Zhang, Jian Xu, Bo Zheng

Moreover, to deploy AMCAD in Taobao, one of the largest e-commerce platforms with hundreds of millions of users, we design an efficient two-layer online retrieval framework for the task of graph-based advertisement retrieval.

Graph Embedding Information Retrieval +1

Juvenile state hypothesis: What we can learn from lottery ticket hypothesis researches?

no code implementations 8 Sep 2021 Di Zhang

The original lottery ticket hypothesis performs pruning and weight resetting after training convergence, exposing it to the problem of forgotten learning knowledge and potential high cost of training.

M6-T: Exploring Sparse Expert Models and Beyond

no code implementations 31 May 2021 An Yang, Junyang Lin, Rui Men, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Jiamang Wang, Yong Li, Di Zhang, Wei Lin, Lin Qu, Jingren Zhou, Hongxia Yang

Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computation cost, and thus they have become a trend in model scaling.

Playing the Game of 2048

Graph Intention Network for Click-through Rate Prediction in Sponsored Search

no code implementations 30 Mar 2021 Feng Li, Zhenrui Chen, Pengjie Wang, Yi Ren, Di Zhang, Xiaoyu Zhu

Moreover, it is difficult for users to jump out of their specific historical behaviors for possible interest exploration, namely the weak generalization problem.

Click-Through Rate Prediction Graph Learning

SCMA Codebook Design Based on Uniquely Decomposable Constellation Groups

no code implementations 27 Feb 2021 Xuewan Zhang, Dalong Zhang, Liuqing Yang, Gangtao Han, Hsiao-Hwa Chen, Di Zhang

Thus, BER performance of the proposed codebook design approach outperforms that of the existing codebook design schemes in both uncoded and coded SCMA systems, especially for large-size codebooks.

Tunable ferroelectricity in hBN intercalated twisted double-layer graphene

no code implementations 24 Feb 2021 Yibo Wang, Siqi Jiang, Jingkuan Xiao, Xiaofan Cai, Di Zhang, Ping Wang, Guodong Ma, Yaqing Han, Jiabei Huang, Kenji Watanabe, Takashi Taniguchi, Alexander S. Mayorov, Geliang Yu

Van der Waals (vdW) assembly of two-dimensional materials has been long recognized as a powerful tool to create unique systems with properties that cannot be found in natural compounds.

Mesoscale and Nanoscale Physics Materials Science

Singlino-dominated dark matter in $Z_3$-NMSSM

no code implementations 10 Feb 2021 Haijing Zhou, Junjie Cao, Jingwei Lian, Di Zhang

Approximate analytical formulas describing the dark matter abundance and cross section in the scattering with nucleons are used to illustrate a dependence on theoretical parameters in neutralino and Higgs sectors.

High Energy Physics - Phenomenology

Radiative Decays of Charged Leptons in the Seesaw Effective Field Theory with One-loop Matching

no code implementations 9 Feb 2021 Di Zhang, Shun Zhou

For the first time, the Wilson coefficients of all the relevant six-dimensional operators are computed by carrying out the one-loop matching between the effective theory and full seesaw model, and applied to calculate the total rates of radiative decays of charged leptons.

High Energy Physics - Phenomenology High Energy Physics - Experiment

A bi-diffusion based layer-wise sampling method for deep learning in large graphs

no code implementations 25 Sep 2019 Yu He, Shiyang Wen, Wenjin Wu, Yan Zhang, Siran Yang, Yuan Wei, Di Zhang, Guojie Song, Wei Lin, Liang Wang, Bo Zheng

The Graph Convolutional Network (GCN) and its variants are powerful models for graph representation learning and have recently achieved great success on many graph-based applications.

Graph Representation Learning

Projecting "better than randomly": How to reduce the dimensionality of very large datasets in a way that outperforms random projections

no code implementations 3 Jan 2019 Michael Wojnowicz, Di Zhang, Glenn Chisholm, Xuan Zhao, Matt Wolff

However, the recent development of randomized principal component analysis (RPCA) has opened up the possibility of obtaining approximate principal components on very large datasets.

Dimensionality Reduction General Classification +1

Gear Training: A new way to implement high-performance model-parallel training

no code implementations 11 Jun 2018 Hao Dong, Shuai Li, Dongchang Xu, Yi Ren, Di Zhang

The training of Deep Neural Networks usually needs tremendous computing resources.
