Search Results for author: Yu Li

Found 218 papers, 101 papers with code

Improving Conversational Recommendation Systems’ Quality with Context-Aware Item Meta-Information

no code implementations Findings (NAACL) 2022 Bowen Yang, Cong Han, Yu Li, Lei Zuo, Zhou Yu

In this paper, we propose a simple yet effective architecture comprising a pre-trained language model (PLM) and an item metadata encoder to integrate the recommendation and the dialog generation better.

Conversational Recommendation Knowledge Graphs +3

基于层次化语义框架的知识库属性映射方法(Property Mapping in Knowledge Base Under the Hierarchical Semantic Framework)

no code implementations CCL 2020 Yu Li, Guangyou Zhou

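Knowledge-base question answering is an important task in natural language processing that aims to give concise, accurate answers to questions users pose in natural language. At present, factors such as the lack of datasets and inconsistent features make it difficult to achieve domain-specific knowledge-base question answering with general-purpose data and methods. This paper therefore treats "question intent" as a feature that question answering in different domains may share, and takes the mapping from a "question" to a "relation predicate" in a triple-based knowledge base as the core of question answering. To take multiple levels of semantics into account and avoid losing important information, we fuse "deep semantics based on gated convolution" and "shallow semantics based on an interactive attention mechanism" through a gated perception mechanism. Experiments on the NLPCC-ICCPOL 2016 KBQA dataset show that the proposed method clearly outperforms existing CDSSM- and BDSSM-based methods. In addition, by constructing an astronomy common-sense knowledge base, we transfer the question-to-relation-predicate mapping model to a specific domain and, combined with a Bi-LSTM-CRF model, build an automatic question answering system for astronomy common sense.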

AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning

no code implementations 16 Feb 2025 Yuanfei Wang, Xiaojie Zhang, Ruihai Wu, Yu Li, Yan Shen, Mingdong Wu, Zhaofeng He, Yizhou Wang, Hao Dong

To enhance the diversity and complexity of adaptive manipulation mechanisms, we build a novel articulated object manipulation environment and equip it with 9 categories of objects.

TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction

no code implementations 16 Feb 2025 Yunfei Liu, Lei Zhu, Lijian Lin, Ye Zhu, Ailing Zhang, Yu Li

3D facial reconstruction from a single in-the-wild image is a crucial task in human-centered computer vision.

SWA-LDM: Toward Stealthy Watermarks for Latent Diffusion Models

no code implementations 14 Feb 2025 Zhonghao Yang, Linye Lyu, Xuanhang Chang, Daojing He, Yu Li

In the rapidly evolving landscape of image generation, Latent Diffusion Models (LDMs) have emerged as powerful tools, enabling the creation of highly realistic images.

Improving TCM Question Answering through Tree-Organized Self-Reflective Retrieval with LLMs

no code implementations 13 Feb 2025 Chang Liu, Ying Chang, Jianmin Li, Yiqian Qu, Yu Li, Lingyong Cao, Shuyuan Lin

Results: By coupling with GPT-4, the framework can improve the best performance on the TCM MLE benchmark by 19.85% in absolute accuracy, and improve recall accuracy from 27% to 38% on CCE datasets.

Question Answering RAG +1

MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction

1 code implementation 2 Feb 2025 Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To

Bone density prediction via CT scans to estimate T-scores is crucial, providing a more precise assessment of bone health compared to traditional methods like X-ray bone density tests, which lack spatial resolution and the ability to detect localized changes.

Prediction

AlphaAdam:Asynchronous Masked Optimization with Dynamic Alpha for Selective Updates

no code implementations 30 Jan 2025 Da Chang, Yu Li, Ganzhao Yuan

To achieve efficient parameter updates, existing methods usually achieve performance comparable to full parameter updates through methods such as low-dimensional decomposition or layer-wise selective updates.

Computational Efficiency

Harnessing Diverse Perspectives: A Multi-Agent Framework for Enhanced Error Detection in Knowledge Graphs

1 code implementation 27 Jan 2025 Yu Li, Yi Huang, Guilin Qi, Junlan Feng, Nan Hu, Songlin Zhai, Haohan Xue, Yongrui Chen, Ruoyan Shen, Tongtong Wu

For specific industrial scenarios, our framework can facilitate the training of specialized agents using domain-specific knowledge graphs for error detection, which highlights the potential industrial application value of our framework.

Decision Making Knowledge Graphs

Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning

no code implementations 11 Jan 2025 Maomao Li, Lijian Lin, Yunfei Liu, Ye Zhu, Yu Li

Specifically, we consider a design principle of "animation for editing", and train Qffusion as a general animation framework from two still reference images while we can use it for portrait video editing easily by applying modified start and end frames as references during inference.

Video Editing Video Generation

Identity-Preserving Video Dubbing Using Motion Warping

no code implementations 8 Jan 2025 Runzhen Liu, Qinjie Lin, Yunfei Liu, Lijian Lin, Ye Zhu, Yu Li, Chuhua Xian, Fa-Ting Hong

To address these limitations, we propose IPTalker, a novel and robust framework for video dubbing that achieves seamless alignment between driving audio and reference identity while ensuring both lip-sync accuracy and high-fidelity identity preservation.

Exploring Optimal Latent Trajetory for Zero-shot Image Editing

no code implementations 7 Jan 2025 Maomao Li, Yu Li, Yunfei Liu, Dong Xu

Then, we propose a ZigZag process to perform mild target guiding on this pivot, which fulfills denoising and inversion iteratively, approaching the target while still holding fidelity.

Denoising

From thermodynamics to protein design: Diffusion models for biomolecule generation towards autonomous protein engineering

no code implementations 5 Jan 2025 Wen-ran Li, Xavier F. Cadet, David Medina-Ortiz, Mehdi D. Davari, Ramanathan Sowdhamini, Cedric Damour, Yu Li, Alain Miranville, Frederic Cadet

In this review, we first give the definition and characteristics of diffusion models and then focus on two strategies: Denoising Diffusion Probabilistic Models and Score-based Generative Models, where DDPM is the discrete form of SGM.

Denoising Drug Discovery +1

Adaptive$^2$: Adaptive Domain Mining for Fine-grained Domain Adaptation Modeling

no code implementations 11 Dec 2024 Wenxuan Sun, Zixuan Yang, Yunli Wang, Zhen Zhang, Zhiqiang Wang, Yu Li, Jian Yang, Yiming Yang, Shiyang Wen, Peng Jiang, Kun Gai

To the best of our knowledge, Adaptive$^2$ is the first approach to automatically learn both domain identification and adaptation in online advertising, opening new research directions for this area.

Domain Adaptation

Multimodal large language model for wheat breeding: a new exploration of smart breeding

no code implementations 20 Nov 2024 Guofeng Yang, Yu Li, Yong He, Zhenjiang Zhou, Lingzhen Ye, Hui Fang, Yiqi Luo, Xuping Feng

UAV remote sensing technology has become a key technology in crop breeding, which can achieve high-throughput and non-destructive collection of crop phenotyping data.

Language Modeling Language Modelling +2

Toward Robust and Accurate Adversarial Camouflage Generation against Vehicle Detectors

no code implementations 15 Nov 2024 Jiawei Zhou, Linye Lyu, Daojing He, Yu Li

However, existing methods often struggle to capture environmental characteristics during the rendering process or produce adversarial textures that can precisely map to the target vehicle.

Neural Rendering

A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL

3 code implementations 13 Nov 2024 Yingqi Gao, Yifu Liu, Xiaoxia Li, Xiaorong Shi, Yin Zhu, Yiming Wang, Shiqi Li, Wei Li, Yuntao Hong, Zhiling Luo, Jinyang Gao, Liyu Mou, Yu Li

On the other hand, we implement the ICL approach with an example selection method based on named entity recognition to prevent overemphasis on entities.

Diversity In-Context Learning +3

Urban Flood Mapping Using Satellite Synthetic Aperture Radar Data: A Review of Characteristics, Approaches and Datasets

no code implementations 6 Nov 2024 Jie Zhao, Ming Li, Yu Li, Patrick Matgen, Marco Chini

Besides, we evaluated the Technology Readiness Levels (TRLs) of urban flood mapping techniques to identify challenges and future research areas.

Vector Quantization Prompting for Continual Learning

1 code implementation 27 Oct 2024 Li Jiao, Qiuxia Lai, Yu Li, Qiang Xu

In this way, VQ-Prompt can optimize the prompt selection process with task loss and meanwhile achieve effective abstraction of task knowledge for continual learning.

Continual Learning Quantization

MoMQ: Mixture-of-Experts Enhances Multi-Dialect Query Generation across Relational and Non-Relational Databases

no code implementations 24 Oct 2024 Zhisheng Lin, Yifu Liu, Zhiling Luo, Jinyang Gao, Yu Li

Additionally, a shared expert group is introduced to address data imbalance, facilitating the transfer of common knowledge from high-resource dialects to low-resource ones.

Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models

1 code implementation 23 Oct 2024 He Cao, Weidi Luo, Yu Wang, Zijing Liu, Bing Feng, Yuan YAO, Yu Li

With the extensive deployment of Large Language Models (LLMs), ensuring their safety has become increasingly critical.

Efficient Antibody Structure Refinement Using Energy-Guided SE(3) Flow Matching

no code implementations 22 Oct 2024 Jiying Zhang, Zijing Liu, Shengyuan Bai, He Cao, Yu Li, Lei Zhang

In this paper, we develop a novel antibody structure refinement method termed FlowAB based on energy-guided flow matching.

Specificity

SMILES-Prompting: A Novel Approach to LLM Jailbreak Attacks in Chemical Synthesis

1 code implementation 21 Oct 2024 Aidan Wong, He Cao, Zijing Liu, Yu Li

The increasing integration of large language models (LLMs) across various fields has heightened concerns about their potential to propagate dangerous information.

LLM Jailbreak Red Teaming

CAPE: A Chinese Dataset for Appraisal-based Emotional Generation using Large Language Models

no code implementations 18 Oct 2024 June M. Liu, He Cao, Renliang Sun, Rui Wang, Yu Li, Jiaxing Zhang

Generating emotionally appropriate responses in conversations with large language models presents a significant challenge due to the complexities of human emotions and cognitive processes, which remain largely underexplored in their critical role in social interactions.

AgentSquare: Automatic LLM Agent Search in Modular Design Space

1 code implementation 8 Oct 2024 Yu Shang, Yu Li, Keyu Zhao, Likai Ma, Jiahe Liu, Fengli Xu, Yong Li

We believe that the modular design space and AgentSquare search framework offer a platform for fully exploiting the potential of prior successful designs and consolidating the collective efforts of the research community.

L-C4: Language-Based Video Colorization for Creative and Consistent Color

no code implementations 7 Oct 2024 Zheng Chang, Shuchen Weng, Huan Ouyang, Yu Li, Si Li, Boxin Shi

Automatic video colorization is inherently an ill-posed problem because each monochrome frame has multiple optional color candidates.

Colorization Image Colorization

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

2 code implementations 7 Oct 2024 Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li

The attention mechanism is a fundamental component of the Transformer model, contributing to interactions among distinct tokens, in contrast to earlier feed-forward neural networks.

From Pixels to Personas: Investigating and Modeling Self-Anthropomorphism in Human-Robot Dialogues

no code implementations 4 Oct 2024 Yu Li, Devamanyu Hazarika, Di Jin, Julia Hirschberg, Yang Liu

Self-anthropomorphism in robots manifests itself through their display of human-like characteristics in dialogue, such as expressing preferences and emotions.

Reblurring-Guided Single Image Defocus Deblurring: A Learning Framework with Misaligned Training Pairs

1 code implementation 26 Sep 2024 Xinya Shu, Yu Li, Dongwei Ren, Xiaohe Wu, Jin Li, WangMeng Zuo

Then, to effectively learn the baseline defocus deblurring network with misaligned training pairs, our reblurring module ensures spatial consistency between the deblurred image, the reblurred image and the input blurry image by reconstructing spatially variant isotropic blur kernels.

Deblurring Image Defocus Deblurring

CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors

1 code implementation 26 Sep 2024 Linye Lyu, Jiawei Zhou, Daojing He, Yu Li

By sampling the optimal texture image from the diffusion model with a user-specific text prompt, our method can generate natural and customizable adversarial camouflage while maintaining high attack performance.

How Well Do LLMs Handle Cantonese? Benchmarking Cantonese Capabilities of Large Language Models

1 code implementation 29 Aug 2024 Jiyue Jiang, Pengan Chen, Liheng Chen, Sheng Wang, Qinghang Bao, Lingpeng Kong, Yu Li, Chuan Wu

The rapid evolution of large language models (LLMs) has transformed the competitive landscape in natural language processing (NLP), particularly for English and other data-rich languages.

Benchmarking General Knowledge

TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition

1 code implementation 16 Aug 2024 Jianhua Zhu, Wenqi Zhao, Yu Li, Xingjian Hu, Liangcai Gao

TAMER combines the advantages of both sequence decoding and tree decoding models by jointly optimizing sequence prediction and tree structure prediction tasks, which enhances the model's understanding and generalization of complex mathematical expression structures.

Handwritten Mathematical Expression Recognition Prediction

The mean-variance portfolio selection based on the average and current profitability of the risky asset

no code implementations 15 Aug 2024 Yu Li, Yuhan Wu, Shuhua Zhang

We study the continuous-time pre-commitment mean-variance portfolio selection in a time-varying financial market.

Parameter-Efficient Fine-Tuning via Circular Convolution

no code implementations 27 Jul 2024 Aochuan Chen, Jiashun Cheng, Zijing Liu, Ziqi Gao, Fugee Tsung, Yu Li, Jia Li

Low-Rank Adaptation (LoRA) has gained popularity for fine-tuning large foundation models, leveraging low-rank matrices $\mathbf{A}$ and $\mathbf{B}$ to represent weight changes (i.e., $\Delta \mathbf{W} = \mathbf{B} \mathbf{A}$).

parameter-efficient fine-tuning
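
The low-rank update mentioned in the snippet above, $\Delta \mathbf{W} = \mathbf{B} \mathbf{A}$, can be illustrated with a small sketch. This is not the paper's circular-convolution method, only the generic LoRA-style factorization it refers to; the helper names (`matmul`, `lora_delta`, `param_counts`), the shapes, and the numeric values are all assumptions chosen for illustration.

```python
# Illustrative sketch of a LoRA-style update, Delta W = B A (pure Python).
# Helper names, shapes, and values are assumptions, not taken from the paper.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def lora_delta(B, A):
    """Return the low-rank weight change Delta W = B A."""
    return matmul(B, A)

def param_counts(d_out, d_in, rank):
    """Trainable parameters: full d_out x d_in update vs. the two factors."""
    full = d_out * d_in
    low_rank = rank * (d_out + d_in)
    return full, low_rank

# B is d_out x r and A is r x d_in; here d_out = 3, d_in = 4, r = 1.
B = [[1.0], [2.0], [0.0]]
A = [[0.5, 0.0, -1.0, 2.0]]
delta_W = lora_delta(B, A)  # a 3 x 4 matrix of rank at most 1
```

For a hypothetical 4096 x 4096 layer with rank 8, `param_counts(4096, 4096, 8)` compares the size of a full update against the two low-rank factors, which is the saving that motivates this family of methods.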

Unlocking the Potential: Benchmarking Large Language Models in Water Engineering and Research

no code implementations 22 Jul 2024 Boyan Xu, Liang Wen, Zihao Li, Yuxing Yang, Guanlan Wu, Xiongpeng Tang, Yu Li, Zihao Wu, Qingxian Su, Xueqing Shi, Yue Yang, Rui Tong, How Yong Ng

Overall, this study pioneered evaluating LLMs in water engineering and research by introducing the WaterER benchmark to assess the trustworthiness of their predictions.

Benchmarking

LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer

no code implementations 21 Jul 2024 Yu Li, Yifan Chen, Gongye Liu, Fei Yin, Qingyan Bai, Jie Wu, Hongfa Wang, Ruihang Chu, Yujiu Yang

To address these challenges, we introduce LayoutDiT, an effective framework that balances content and graphic features to generate high-quality, visually appealing layouts.

Blocking

From 2015 to 2023: How Machine Learning Aids Natural Product Analysis

no code implementations 18 Jul 2024 Suwen Shi, Ziwei Huang, Xingxin Gu, Xu Lin, Chaoying Zhong, Junjie Hang, Jianli Lin, Claire Chenwen Zhong, Lin Zhang, Yu Li, JunJie Huang

In recent years, conventional chemistry techniques have faced significant challenges due to their inherent limitations, struggling to cope with the increasing complexity and volume of data generated in contemporary research endeavors.

LIONs: An Empirically Optimized Approach to Align Language Models

1 code implementation 9 Jul 2024 Xiao Yu, Qingyang Wu, Yu Li, Zhou Yu

Alignment is a crucial step to enhance the instruction-following and conversational abilities of language models.

Instruction Following

Data Augmentation of Multi-turn Psychological Dialogue via Knowledge-driven Progressive Thought Prompting

no code implementations 24 Jun 2024 Jiyue Jiang, Liheng Chen, Sheng Wang, Lingpeng Kong, Yu Li, Chuan Wu

The thought generated by the progressive thought generator serves as a prompt to prevent the generated dialogue from having significant semantic deviations, while the psychology knowledge generator produces psychological knowledge to serve as the dialogue history for the LLM, guiding the dialogue generator to create multi-turn psychological dialogue.

Data Augmentation Dialogue Generation

PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes

1 code implementation 19 Jun 2024 He Cao, Yanjun Shao, Zhiyuan Liu, Zijing Liu, Xiangru Tang, Yuan YAO, Yu Li

Current approaches, however, often neglect the critical role of multiple molecule graph interaction in understanding chemical reactions, leading to suboptimal performance in synthetic chemistry tasks.

cross-modal alignment

SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation

no code implementations 17 Jun 2024 Minda Hu, Licheng Zong, Hongru Wang, Jingyan Zhou, Jingjing Li, Yichen Gao, Kam-Fai Wong, Yu Li, Irwin King

By combining the reasoning capabilities of LLMs with the effectiveness of tree search, SeRTS boosts the zero-shot performance of retrieving high-quality and informative results for RAG.

Question Answering RAG +1

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models

1 code implementation 26 May 2024 Yongxian Wei, Zixuan Hu, Li Shen, Zhenyi Wang, Yu Li, Chun Yuan, DaCheng Tao

By encouraging a gradient direction suitable for all tasks, the meta-model captures shared representations that generalize across tasks.

Meta-Learning

DAPE: Data-Adaptive Positional Encoding for Length Extrapolation

2 code implementations 23 May 2024 Chuanyang Zheng, Yihang Gao, Han Shi, Minbin Huang, Jingyao Li, Jing Xiong, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li

Positional encoding plays a crucial role in transformers, significantly impacting model performance and length generalization.

SubGDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning

1 code implementation 9 May 2024 Jiying Zhang, Zijing Liu, Yu Wang, Yu Li

We propose a novel diffusion model termed SubGDiff for involving the molecular subgraph information in diffusion.

Denoising Drug Discovery +2

A weighted multilevel Monte Carlo method

no code implementations 6 May 2024 Yu Li, Antony Ware

The Multilevel Monte Carlo (MLMC) method has been applied successfully in a wide range of settings since its first introduction by Giles (2008).

DLoRA-TrOCR: Mixed Text Mode Optical Character Recognition Based On Transformer

no code implementations 19 Apr 2024 Da Chang, Yu Li

With the continuous development of Optical Character Recognition (OCR) and the expansion of application fields, text recognition in complex scenes has become a key challenge.

Optical Character Recognition Optical Character Recognition (OCR)

RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models

1 code implementation 17 Apr 2024 Han Huang, Ziqian Lin, Dongchen He, Liang Hong, Yu Li

A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem.

Graph Neural Network

DESTEIN: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation Fusion

1 code implementation 16 Apr 2024 Yu Li, Han Jiang, Chuanyang Gong, Zhihua Wei

Despite the remarkable achievements of language models (LMs) across a broad spectrum of tasks, their propensity for generating toxic outputs remains a prevalent concern.

Diversity

HeGTa: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding

no code implementations 28 Mar 2024 Rihui Jin, Yu Li, Guilin Qi, Nan Hu, Yuan-Fang Li, Jiaoyan Chen, Jianan Wang, Yongrui Chen, Dehai Min, Sheng Bi

Table understanding (TU) has achieved promising advancements, but it faces the challenges of the scarcity of manually labeled tables and the presence of complex table structures. To address these challenges, we propose HGT, a framework with a heterogeneous graph (HG)-enhanced large language model (LLM) to tackle few-shot TU tasks. It leverages the LLM by aligning the table semantics with the LLM's parametric knowledge through soft prompts and instruction tuning and deals with complex tables by a multi-task pre-training scheme involving three novel multi-granularity self-supervised HG pre-training objectives. We empirically demonstrate the effectiveness of HGT, showing that it outperforms the SOTA for few-shot complex TU on several benchmarks.

Language Modeling Language Modelling +1

Enhancing Trust and Privacy in Distributed Networks: A Comprehensive Survey on Blockchain-based Federated Learning

no code implementations 28 Mar 2024 Ji Liu, Chunlu Chen, Yu Li, Lin Sun, Yulun Song, Jingbo Zhou, Bo Jing, Dejing Dou

While centralized servers pose a risk of being a single point of failure, decentralized approaches like blockchain offer a compelling solution by implementing a consensus mechanism among multiple entities.

Distributed Computing Federated Learning +1

MATEval: A Multi-Agent Discussion Framework for Advancing Open-Ended Text Evaluation

1 code implementation 28 Mar 2024 Yu Li, Shenyu Zhang, Rui Wu, Xiutian Huang, Yongrui Chen, Wenhao Xu, Guilin Qi, Dehai Min

Experimental results show that our framework outperforms existing open-ended text evaluation methods and achieves the highest correlation with human evaluation, which confirms the effectiveness and advancement of our framework in addressing the uncertainties and instabilities in evaluating LLMs-generated text.

Compress3D: a Compressed Latent Space for 3D Generation from a Single Image

no code implementations 20 Mar 2024 BoWen Zhang, Tianyu Yang, Yu Li, Lei Zhang, Xi Zhao

In this paper, we present a triplane autoencoder, which encodes 3D models into a compact triplane latent space to effectively compress both the 3D geometry and texture information.

3D Generation 3D geometry

DEE: Dual-stage Explainable Evaluation Method for Text Generation

no code implementations 18 Mar 2024 Shenyu Zhang, Yu Li, Rui Wu, Xiutian Huang, Yongrui Chen, Wenhao Xu, Guilin Qi

Automatic methods for evaluating machine-generated texts hold significant importance due to the expanding applications of generative systems.

Hallucination Text Generation

AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting

1 code implementation 14 Mar 2024 Yu Wang, Xiaogeng Liu, Yu Li, Muhao Chen, Chaowei Xiao

However, with the integration of additional modalities, MLLMs are exposed to new vulnerabilities, rendering them prone to structure-based jailbreak attacks, where semantic content (e.g., "harmful text") has been injected into the images to mislead MLLMs.

MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension

1 code implementation 13 Mar 2024 Xingyu Lu, He Cao, Zijing Liu, Shengyuan Bai, Leqing Chen, Yuan YAO, Hai-Tao Zheng, Yu Li

Large language models are playing an increasingly significant role in molecular research, yet existing models often generate erroneous information, posing challenges to accurate molecular comprehension.

Question Answering

A Framework for Cost-Effective and Self-Adaptive LLM Shaking and Recovery Mechanism

no code implementations 12 Mar 2024 Zhiyu Chen, Yu Li, Suochao Zhang, Jingbo Zhou, Jiwen Zhou, Chenfu Bao, dianhai yu

As Large Language Models (LLMs) gain great success in real-world applications, an increasing number of users are seeking to develop and deploy their customized LLMs through cloud services.

Privacy Preserving

Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

no code implementations 2 Mar 2024 Fanzhe Yan, Gang Yang, Yu Li, Aiping Liu, Xun Chen

To overcome these limitations, we propose a Dual Graph Attention based Disentanglement Multi-instance Learning (DGA-DMIL) framework for improving brain age estimation.

Age Estimation Disentanglement +2

RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation

1 code implementation 24 Feb 2024 Jiawei Zhou, Linye Lyu, Daojing He, Yu Li

However, existing methods often struggle to capture environmental characteristics during the rendering process or produce adversarial textures that can precisely map to the target vehicle, resulting in suboptimal attack performance.

Adversarial Attack Neural Rendering

AlgoFormer: An Efficient Transformer Framework with Algorithmic Structures

no code implementations 21 Feb 2024 Yihang Gao, Chuanyang Zheng, Enze Xie, Han Shi, Tianyang Hu, Yu Li, Michael K. Ng, Zhenguo Li, Zhaoqiang Liu

Furthermore, some theoretical and empirical results are presented to show that the designed transformer has the potential to perform algorithm representation and learning.

Machine Translation text-classification +1

Progress and Opportunities of Foundation Models in Bioinformatics

no code implementations 6 Feb 2024 Qing Li, Zhihang Hu, YiXuan Wang, Lei LI, Yimin Fan, Irwin King, Le Song, Yu Li

Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs.

Synergy-of-Thoughts: Eliciting Efficient Reasoning in Hybrid Language Models

no code implementations 4 Feb 2024 Yu Shang, Yu Li, Fengli Xu, Yong Li

If these intuitive thoughts exhibit conflicts, SoT will invoke the reflective reasoning of scaled-up language models to emulate the intervention of System 2, which will override the intuitive thoughts and rectify the reasoning results.

GPAvatar: Generalizable and Precise Head Avatar from Image(s)

2 code implementations 18 Jan 2024 Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu, Tatsuya Harada

Head avatar reconstruction, crucial for applications in virtual reality, online meetings, gaming, and film industries, has garnered substantial attention within the computer vision community.

Neural Rendering Novel View Synthesis

Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering

no code implementations 15 Jan 2024 Qing Li, Lei LI, Yu Li

Central to our focus is the use of language models and multimodal paradigms for medical question answering, aiming to guide the research community in selecting appropriate mechanisms for their specific medical research requirements.

Cross-Modal Retrieval Medical Diagnosis +3

MEAOD: Model Extraction Attack against Object Detectors

no code implementations 22 Dec 2023 Zeyu Li, Chenghui Shi, Yuwen Pu, Xuhong Zhang, Yu Li, Jinbao Li, Shouling Ji

The widespread use of deep learning technology across various industries has made deep neural network models highly valuable and, as a result, attractive targets for potential attackers.

Active Learning model +4

A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing

1 code implementation CVPR 2024 Maomao Li, Yu Li, Tianyu Yang, Yunfei Liu, Dongxu Yue, Zhihui Lin, Dong Xu

This paper presents a video inversion approach for zero-shot video editing, which models the input video with low-rank representation during the inversion process.

Video Editing

InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery

1 code implementation 27 Nov 2023 He Cao, Zijing Liu, Xingyu Lu, Yuan YAO, Yu Li

The rapid evolution of artificial intelligence in drug discovery encounters challenges with generalization and extensive training, yet Large Language Models (LLMs) offer promise in reshaping interactions with complex molecular data.

Drug Discovery Molecule Captioning

Large-Scale and Multi-Perspective Opinion Summarization with Diverse Review Subsets

1 code implementation 20 Oct 2023 Han Jiang, Rui Wang, Zhihua Wei, Yu Li, Xinpeng Wang

Furthermore, our in-depth analysis verifies that the advanced selection of review subsets and the two-stage training scheme are vital to boosting the summarization performance.

Opinion Summarization

Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

no code implementations 18 Oct 2023 Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan

Recent text-to-3D generation methods achieve impressive 3D content creation capacity thanks to the advances in image diffusion models and optimizing strategies.

3D Generation Text to 3D

Curriculum-Driven Edubot: A Framework for Developing Language Learning Chatbots Through Synthesizing Conversational Data

no code implementations 28 Sep 2023 Yu Li, Shang Qu, Jili Shen, Shangchao Min, Zhou Yu

Chatbots have become popular in educational settings, revolutionizing how students interact with material and how teachers teach.

Chatbot

Lyra: Orchestrating Dual Correction in Automated Theorem Proving

1 code implementation 27 Sep 2023 Chuanyang Zheng, Haiming Wang, Enze Xie, Zhengying Liu, Jiankai Sun, Huajian Xin, Jianhao Shen, Zhenguo Li, Yu Li

In addition, we introduce Conjecture Correction, an error feedback mechanism designed to interact with the prover to refine formal proof conjectures with prover error messages.

Ranked #2 on Automated Theorem Proving on miniF2F-valid (Pass@100 metric)

Automated Theorem Proving Hallucination

Effective Whole-body Pose Estimation with Two-stages Distillation

1 code implementation 29 Jul 2023 Zhendong Yang, Ailing Zeng, Chun Yuan, Yu Li

Different from the previous self-knowledge distillation, this stage finetunes the student's head with only 20% training time as a plug-and-play training strategy.

Ranked #4 on 2D Human Pose Estimation on COCO-WholeBody (using extra training data)

2D Human Pose Estimation Pose Estimation +1

Enhancing the Protein Tertiary Structure Prediction by Multiple Sequence Alignment Generation

2 code implementations 2 Jun 2023 Le Zhang, Jiayang Chen, Tao Shen, Yu Li, Siqi Sun

The field of protein folding research has been greatly advanced by deep learning methods, with AlphaFold2 (AF2) demonstrating exceptional performance and atomic-level precision.

Language Modeling Language Modelling +3

Progressive-Hint Prompting Improves Reasoning in Large Language Models

1 code implementation 19 Apr 2023 Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li

The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability.

Arithmetic Reasoning GSM8K +2

User Adaptive Language Learning Chatbots with a Curriculum

no code implementations 11 Apr 2023 Kun Qian, Ryan Shea, Yu Li, Luke Kutszik Fryer, Zhou Yu

Along with the development of systems for natural language understanding and generation, dialog systems have been widely adopted for language learning and practicing.

Natural Language Understanding

One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

1 code implementation CVPR 2023 Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li

It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions.

3D Human Pose Estimation 3D Human Reconstruction +2

FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

1 code implementation 24 Mar 2023 Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen

This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV).

Image Outpainting Semantic Segmentation

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

1 code implementation ICCV 2023 Zhendong Yang, Ailing Zeng, Zhe Li, Tianke Zhang, Chun Yuan, Yu Li

We decompose the KD loss and find the non-target loss from it forces the student's non-target logits to match the teacher's, but the sum of the two non-target logits is different, preventing them from being identical.

Self-Knowledge Distillation

Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding

no code implementations ICCV 2023 Ziyang Yuan, Yiming Zhu, Yu Li, Hongyu Liu, Chun Yuan

We leverage the inherent properties of EG3D's latent space to design a discriminator and a background depth regularization.

3D geometry

Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family

2 code implementations 14 Mar 2023 Yiming Tan, Dehai Min, Yu Li, Wenbo Li, Nan Hu, Yongrui Chen, Guilin Qi

ChatGPT is a powerful large language model (LLM) that covers knowledge resources such as Wikipedia and supports natural language question answering using its own knowledge.

Knowledge Base Question Answering Language Modeling +4

POSGen: Personalized Opening Sentence Generation for Online Insurance Sales

no code implementations 10 Feb 2023 Yu Li, Yi Zhang, Weijia Wu, Zimu Zhou, Qiang Li

Such personalized opening sentence generation is challenging because (i) there are limited historical samples for conversation topic recommendation in online insurance sales and (ii) existing text generation schemes often fail to support customized topic ordering based on user preferences.

Chatbot Management +2

On Function-Coupled Watermarks for Deep Neural Networks

no code implementations 8 Feb 2023 Xiangyu Wen, Yu Li, Wei Jiang, Qiang Xu

Various watermarking techniques are proposed to protect such intellectual properties (IPs), wherein the DNN providers implant secret information into the model so that they can later claim IP ownership by retrieving their embedded watermarks with some dedicated trigger inputs.

Image Classification

LipFormer: Learning to Lipread Unseen Speakers based on Visual-Landmark Transformers

no code implementations 4 Feb 2023 Feng Xue, Yu Li, Deyin Liu, Yincen Xie, Lin Wu, Richang Hong

However, generalizing these methods to unseen speakers incurs catastrophic performance degradation due to the limited number of speakers in training bank and the evident visual variations caused by the shape/color of lips for different speakers.

Lipreading Sentence

Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

no code implementations 14 Jan 2023 Zhihang Hu, Qinze Yu, Yucheng Guo, Taifeng Wang, Irwin King, Xin Gao, Le Song, Yu Li

While previous methods reported fair performance, their models usually do not take advantage of multi-modal data and they are unable to handle new drugs or cell lines.

Graph structure learning

Learning Rain Location Prior for Nighttime Deraining

1 code implementation ICCV 2023 Fan Zhang, ShaoDi You, Yu Li, Ying Fu

This learned prior contains location information of rain streaks and, when injected into deraining models, can significantly improve their performance.

Rain Removal

Accurate 3D Face Reconstruction with Facial Component Tokens

no code implementations ICCV 2023 Tianke Zhang, Xuangeng Chu, Yunfei Liu, Lijian Lin, Zhendong Yang, Zhengzhuo Xu, Chengkun Cao, Fei Yu, Changyin Zhou, Chun Yuan, Yu Li

However, the current deep learning-based methods face significant challenges in achieving accurate reconstruction with disentangled facial parameters and ensuring temporal stability in single-frame methods for 3D face tracking on video data.

3D Face Reconstruction

L-CoIns: Language-Based Colorization With Instance Awareness

no code implementations CVPR 2023 Zheng Chang, Shuchen Weng, Peixuan Zhang, Yu Li, Si Li, Boxin Shi

Language-based colorization produces plausible colors consistent with the language description provided by the user.

Colorization Descriptive +1

DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization

no code implementations 20 Dec 2022 Yu Li, Baolin Peng, Pengcheng He, Michel Galley, Zhou Yu, Jianfeng Gao

In this work, we propose DIONYSUS (dynamic input optimization in pre-training for dialogue summarization), a pre-trained encoder-decoder model for summarizing dialogues in any new domain.

Decoder

DQnet: Cross-Model Detail Querying for Camouflaged Object Detection

no code implementations 16 Dec 2022 Wei Sun, Chengao Liu, Linyan Zhang, Yu Li, Pengxu Wei, Chang Liu, Jialing Zou, Jianbin Jiao, Qixiang Ye

Optimizing a convolutional neural network (CNN) for camouflaged object detection (COD) tends to activate local discriminative regions while ignoring complete object extent, causing the partial activation issue which inevitably leads to missing or redundant regions of objects.

Object object-detection +2

An Empirical Study on the Efficacy of Deep Active Learning for Image Classification

no code implementations 30 Nov 2022 Yu Li, Muxi Chen, Yannan Liu, Daojing He, Qiang Xu

Third, performing data selection in the SSAL setting can achieve a significant and consistent performance improvement, especially with abundant unlabeled data.

Active Learning Image Classification

Learning Single Image Defocus Deblurring with Misaligned Training Pairs

2 code implementations 26 Nov 2022 Yu Li, Dongwei Ren, Xinya Shu, WangMeng Zuo

First, in the deblurring module, a bi-directional optical flow-based deformation is introduced to tolerate spatial misalignment between deblurred and ground-truth images.

Deblurring Image Defocus Deblurring +1

CT2: Colorization Transformer via Color Tokens

1 code implementation ECCV 2022 Shuchen Weng, Jimeng Sun, Yu Li, Si Li, Boxin Shi

Automatic image colorization is an ill-posed problem with multi-modal uncertainty, and there remain two main challenges with previous methods: incorrect semantic colors and under-saturation.

Colorization Image Colorization

Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems

1 code implementation 22 Oct 2022 David Gros, Yu Li, Zhou Yu

Dialog systems are often designed or trained to output human-like responses.

Robust Human Matting via Semantic Guidance

1 code implementation 11 Oct 2022 Xiangguang Chen, Ye Zhu, Yu Li, Bingtao Fu, Lei Sun, Ying Shan, Shan Liu

Unlike previous works, our framework is data efficient, which requires a small amount of matting ground-truth to learn to estimate high quality object mattes.

Image Matting Segmentation

GTAV-NightRain: Photometric Realistic Large-scale Dataset for Night-time Rain Streak Removal

1 code implementation 10 Oct 2022 Fan Zhang, ShaoDi You, Yu Li, Ying Fu

In this paper, we propose the GTAV-NightRain dataset, a large-scale synthetic dataset for night-time rain streak removal.

Rethinking Knowledge Distillation via Cross-Entropy

1 code implementation 22 Aug 2022 Zhendong Yang, Zhe Li, Yuan Gong, Tianke Zhang, Shanshan Lao, Chun Yuan, Yu Li

Furthermore, we smooth students' target output to treat it as the soft target for training without teachers and propose a teacher-free new KD loss (tf-NKD).

Knowledge Distillation

Genome-wide nucleotide-resolution model of single-strand break site reveals species evolutionary hierarchy

1 code implementation 21 Aug 2022 Sheng Xu, Junkang Wei, Yu Li

Besides, SSBlazer is a lightweight model with robust cross-species generalization ability in the cross-species evaluation, which enables the large-scale genome-wide application in diverse species.

Using Chatbots to Teach Languages

no code implementations 31 Jul 2022 Yu Li, Chun-Yen Chen, Dian Yu, Sam Davidson, Ryan Hou, Xun Yuan, Yinghua Tan, Derek Pham, Zhou Yu

This paper reports on progress towards building an online language learning tool to provide learners with conversational experience by using dialog systems as conversation practice partners.

reinforcement-learning Reinforcement Learning (RL)

A Multi-tasking Model of Speaker-Keyword Classification for Keeping Human in the Loop of Drone-assisted Inspection

1 code implementation 8 Jul 2022 Yu Li, Anisha Parsan, Bill Wang, Penghao Dong, Shanshan Yao, Ruwen Qin

A base model for a group of five authorized subjects is trained and tested on the inspection keyword dataset collected by this study.

Do we really need temporal convolutions in action segmentation?

1 code implementation 26 May 2022 Dazhao Du, Bing Su, Yu Li, Zhongang Qi, Lingyu Si, Ying Shan

Most state-of-the-art methods focus on designing temporal convolution-based models, but the inflexibility of temporal convolutions and the difficulties in modeling long-term temporal dependencies restrict the potential of these models.

Action Classification Action Segmentation +1

ProNet DB: A proteome-wise database for protein surface property representations and RNA-binding profiles

no code implementations 16 May 2022 Junkang Wei, Jin Xiao, Siyuan Chen, Licheng Zong, Xin Gao, Yu Li

The rapid growth in the number of experimental and predicted protein structures and more complicated protein structures challenge users in computational biology for utilizing the structural information and protein surface property representation.

Drug Discovery

Learning Dual-Pixel Alignment for Defocus Deblurring

1 code implementation 26 Apr 2022 Yu Li, Yaling Yi, Dongwei Ren, Qince Li, WangMeng Zuo

Generally, DPANet is an encoder-decoder with skip-connections, where two branches with shared parameters in the encoder are employed to extract and align deep features from left and right views, and one decoder is adopted to fuse aligned features for predicting the sharp image.

Deblurring Decoder

"Think Before You Speak": Improving Multi-Action Dialog Policy by Planning Single-Action Dialogs

1 code implementation 25 Apr 2022 Shuo Zhang, Junzhou Zhao, Pinghui Wang, Yu Li, Yi Huang, Junlan Feng

Multi-action dialog policy (MADP), which generates multiple atomic dialog actions per turn, has been widely applied in task-oriented dialog systems to provide expressive and efficient system responses.

Multi-Task Learning

Temporally Efficient Vision Transformer for Video Instance Segmentation

3 code implementations CVPR 2022 Shusheng Yang, Xinggang Wang, Yu Li, Yuxin Fang, Jiemin Fang, Wenyu Liu, Xun Zhao, Ying Shan

To effectively and efficiently model the crucial temporal information within a video clip, we propose a Temporally Efficient Vision Transformer (TeViT) for video instance segmentation (VIS).

Instance Segmentation Semantic Segmentation +1

Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions

2 code implementations 1 Apr 2022 Jiayang Chen, Zhihang Hu, Siqi Sun, Qingxiong Tan, YiXuan Wang, Qinze Yu, Licheng Zong, Liang Hong, Jin Xiao, Tao Shen, Irwin King, Yu Li

Non-coding RNA structure and function are essential to understanding various biological processes, such as cell signaling, gene expression, and post-transcriptional regulations.

Self-Supervised Learning

A physics and data co-driven surrogate modeling approach for temperature field prediction on irregular geometric domain

no code implementations 15 Mar 2022 Kairui Bao, Wen Yao, Xiaoya Zhang, Wei Peng, Yu Li

Second, a physics-driven CNN surrogate with partial differential equation (PDE) residuals as a loss function is utilized for fast meshing (meshing surrogate); then, we present a data-driven surrogate model based on the multi-level reduced-order method, aiming to learn solutions of temperature field in the above regular computational plane (thermal surrogate).

SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations

1 code implementation 15 Feb 2022 Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Unlike the original per grid cell object masks, SODAR is implicitly supervised to learn mask representations that encode geometric structure of nearby objects and complement adjacent representations with context.

Instance Segmentation Object +1