Search Results for author: Yudi Zhang

Found 16 papers, 3 papers with code

Consistent Image Layout Editing with Diffusion Models

no code implementations • 9 Mar 2025 • Tao Xia, Yudi Zhang, Ting Liu, Lei Zhang

Despite the great success of large-scale text-to-image diffusion models in image generation and image editing, existing methods still struggle to edit the layout of real images.

Image Generation

Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones?

no code implementations • 26 Feb 2025 • Yudi Zhang, Lu Wang, Meng Fang, Yali Du, Chenghua Huang, Jun Wang, Qingwei Lin, Mykola Pechenizkiy, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

Our method generates pseudo-rewards through a self-supervised mechanism that leverages the inherent structure of both teacher and student responses, enabling reward learning without explicit external evaluation.

GSM8K, MMLU, +1

FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling

no code implementations • 20 Feb 2025 • Weilin Zhao, Tengyu Pan, Xu Han, Yudi Zhang, Ao Sun, Yuxiang Huang, Kaihuo Zhang, Weilun Zhao, YuXuan Li, Jianyong Wang, Zhiyuan Liu, Maosong Sun

Speculative sampling has emerged as an important technique for accelerating the auto-regressive generation process of large language models (LLMs) by utilizing a draft-then-verify mechanism to produce multiple tokens per forward pass.

Language Modeling

Collaborative Deterministic-Diffusion Model for Probabilistic Urban Spatiotemporal Prediction

no code implementations • 16 Feb 2025 • Zhi Sheng, Yuan Yuan, Yudi Zhang, Depeng Jin, Yong Li

Existing spatiotemporal prediction models are predominantly deterministic, focusing on primary spatiotemporal patterns.

Prediction

Large Action Models: From Inception to Implementation

1 code implementation • 13 Dec 2024 • Lu Wang, Fangkai Yang, Chaoyun Zhang, Junting Lu, Jiaxu Qian, Shilin He, Pu Zhao, Bo Qiao, Ray Huang, Si Qin, Qisheng Su, Jiayi Ye, Yudi Zhang, Jian-Guang Lou, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

As AI continues to advance, there is a growing demand for systems that go beyond language-based assistance and move toward intelligent agents capable of performing real-world actions.

Action Generation

RuAG: Learned-rule-augmented Generation for Large Language Models

no code implementations • 4 Nov 2024 • Yudi Zhang, Pei Xiao, Lu Wang, Chaoyun Zhang, Meng Fang, Yali Du, Yevgeniy Puzyrev, Randolph Yao, Si Qin, Qingwei Lin, Mykola Pechenizkiy, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

In-context learning (ICL) and Retrieval-Augmented Generation (RAG) have gained attention for their ability to enhance LLMs' reasoning by incorporating external knowledge, but they suffer from limited context window size, leading to insufficient information injection.

Decision Making, In-Context Learning, +1

Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network

no code implementations • 7 Jun 2024 • Xinquan Yang, Xuguang Li, Xiaoling Luo, Leilei Zeng, Yudi Zhang, Linlin Shen, Yongqiang Deng

Inspired by the video grounding task, which localizes the start and end times of a target video segment, we simplify implant depth prediction as video grounding and develop a Texture Perceive Implant Depth Prediction Network (TPNet), which enables us to directly output the implant depth without complex measurements of oral bone.

Depth Estimation, Depth Prediction, +2

DragTex: Generative Point-Based Texture Editing on 3D Mesh

no code implementations • 4 Mar 2024 • Yudi Zhang, Qi Xu, Lei Zhang

Creating 3D textured meshes using generative artificial intelligence has garnered significant attention recently.

Decoder, Texture Synthesis

A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models

no code implementations • 17 Feb 2024 • Jie Liu, Wenxuan Wang, Yihang Su, Jingyuan Huan, WenTing Chen, Yudi Zhang, Cheng-Yi Li, Kao-Jung Chang, Xiaohan Xin, Linlin Shen, Michael R. Lyu

The significant breakthroughs of Medical Multi-Modal Large Language Models (Med-MLLMs) renovate modern healthcare with robust information synthesis and medical decision support.

Diagnostic, Visual Question Answering (VQA)

Large Language Models Are Neurosymbolic Reasoners

1 code implementation • 17 Jan 2024 • Meng Fang, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, Jun Wang

A wide range of real-world applications are characterized by their symbolic nature, necessitating a strong capability for symbolic reasoning.

Common Sense Reasoning, Math, +2

Multimodal Molecular Pretraining via Modality Blending

no code implementations • 12 Jul 2023 • Qiying Yu, Yudi Zhang, Yuyan Ni, Shikun Feng, Yanyan Lan, Hao Zhou, Jingjing Liu

Self-supervised learning has recently gained growing interest in molecular modeling for scientific tasks such as AI-assisted drug discovery.

Drug Discovery, Molecular Representation, +3

Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach

no code implementations • NeurIPS 2023 • Yudi Zhang, Yali Du, Biwei Huang, Ziyan Wang, Jun Wang, Meng Fang, Mykola Pechenizkiy

While the majority of current approaches construct the reward redistribution in an uninterpretable manner, we propose to explicitly model the contributions of state and action from a causal perspective, resulting in an interpretable reward redistribution and preserving policy invariance.

Reinforcement Learning

RSPT: Reconstruct Surroundings and Predict Trajectories for Generalizable Active Object Tracking

no code implementations • 7 Apr 2023 • Fangwei Zhong, Xiao Bi, Yudi Zhang, Wei Zhang, Yizhou Wang

Building a generalizable active tracker that works robustly across different scenarios remains a challenge, especially in unstructured environments with cluttered obstacles and diverse layouts.

Autonomous Driving, Object Tracking
