Search Results for author: Xiaotong Li

Found 17 papers, 10 papers with code

Composite Indicator-Guided Infilling Sampling for Expensive Multi-Objective Optimization

no code implementations 28 Mar 2025 Huixiang Zhen, Xiaotong Li, Wenyin Gong, Ling Wang, Xiangyun Hu

This indicator simultaneously considers convergence, diversity, and distribution to improve the efficiency of identifying promising candidate solutions, which significantly improves algorithm performance.

Diversity

Tempo: Helping Data Scientists and Domain Experts Collaboratively Specify Predictive Modeling Tasks

no code implementations 14 Feb 2025 Venkatesh Sivaraman, Anika Vaishampayan, Xiaotong Li, Brian R Buck, Ziyong Ma, Richard D Boyce, Adam Perer

Temporal predictive models have the potential to improve decisions in health care, public services, and other domains, yet they often fail to effectively support decision-makers.

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

1 code implementation 10 Feb 2025 Haiwen Diao, Xiaotong Li, Yufeng Cui, Yueze Wang, Haoge Deng, Ting Pan, Wenxuan Wang, Huchuan Lu, Xinlong Wang

Existing encoder-free vision-language models (VLMs) are rapidly narrowing the performance gap with their encoder-based counterparts, highlighting the promising potential for unified multimodal systems with structural simplicity and efficient deployment.

Decoder

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

4 code implementations 24 Oct 2024 Shuhao Gu, Jialing Zhang, Siyuan Zhou, Kevin Yu, Zhaohu Xing, Liangdong Wang, Zhou Cao, Jintao Jia, Zhuoyi Zhang, YiXuan Wang, Zhenchong Hu, Bo-Wen Zhang, Jijie Li, Dong Liang, Yingli Zhao, Songjing Wang, Yulong Ao, Yiming Ju, Huanhuan Ma, Xiaotong Li, Haiwen Diao, Yufeng Cui, Xinlong Wang, Yaoqi Liu, Fangxiang Feng, Guang Liu

Despite the availability of several open-source multimodal datasets, limitations in the scale and quality of open-source instruction data hinder the performance of VLMs trained on these datasets, leading to a significant gap compared to models trained on closed-source data.

Image Generation, Question Generation +2

InstructBioMol: Advancing Biomolecule Understanding and Design Following Human Instructions

no code implementations 10 Oct 2024 Xiang Zhuang, Keyan Ding, Tianwen Lyu, Yinuo Jiang, Xiaotong Li, Zhuoyi Xiang, Zeyuan Wang, Ming Qin, Kehua Feng, Jike Wang, Qiang Zhang, Huajun Chen

Understanding and designing biomolecules, such as proteins and small molecules, is central to advancing drug discovery, synthetic biology, and enzyme engineering.

Data Integration, Drug Discovery

Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

no code implementations 8 Oct 2024 Sha Guo, Zhuo Chen, Yang Zhao, Ning Zhang, Xiaotong Li, Lingyu Duan

Extensive experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks such as object detection, segmentation, and facial landmark detection, achieving superior perceptual quality compared to state-of-the-art methods.

Data Compression, Facial Landmark Detection +5

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

1 code implementation 11 Jul 2024 Xiaotong Li, Fan Zhang, Haiwen Diao, Yueze Wang, Xinlong Wang, Ling-Yu Duan

To facilitate the cutting-edge research of MLLMs on comprehensive vision perception, we thereby propose Perceptual Fusion, using a low-budget but highly effective caption engine for complete and accurate image descriptions.

Visual Question Answering

Unveiling Encoder-Free Vision-Language Models

1 code implementation 17 Jun 2024 Haiwen Diao, Yufeng Cui, Xiaotong Li, Yueze Wang, Huchuan Lu, Xinlong Wang

Training pure VLMs that accept seamless vision and language inputs, i.e., without vision encoders, remains challenging and rarely explored.

Decoder, Inductive Bias +1

LEAD: Exploring Logit Space Evolution for Model Selection

no code implementations CVPR 2024 Zixuan Hu, Xiaotong Li, Shixiang Tang, Jun Liu, Yichun Hu, Ling-Yu Duan

The remarkable success of the "pretrain-then-finetune" paradigm has led to a proliferation of available pre-trained models for vision tasks.

Model Selection

InstructProtein: Aligning Human and Protein Language via Knowledge Instruction

no code implementations 5 Oct 2023 Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, Huajun Chen

To address this challenge, we propose InstructProtein, an innovative LLM that possesses bidirectional generation capabilities in both human and protein languages: (i) taking a protein sequence as input to predict its textual function description and (ii) using natural language to prompt protein sequence generation.

Knowledge Graphs, Protein Function Prediction +1

Exploring Model Transferability through the Lens of Potential Energy

1 code implementation ICCV 2023 Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Ling-Yu Duan

The experimental results on 10 downstream tasks and 12 self-supervised models demonstrate that our approach can seamlessly integrate into existing ranking techniques and enhance their performances, revealing its effectiveness for the model selection task and its potential for understanding the mechanism in transfer learning.

Model Selection, Transfer Learning

Modeling Uncertain Feature Representation for Domain Generalization

1 code implementation 16 Jan 2023 Xiaotong Li, Zixuan Hu, Jun Liu, Yixiao Ge, Yongxing Dai, Ling-Yu Duan

In this paper, we improve the network generalization ability by modeling domain shifts with uncertainty (DSU), i.e., characterizing the feature statistics as uncertain distributions during training.

Domain Generalization, Image Classification +4

Masked Image Modeling with Denoising Contrast

1 code implementation 19 May 2022 Kun Yi, Yixiao Ge, Xiaotong Li, Shusheng Yang, Dian Li, Jianping Wu, Ying Shan, XiaoHu Qie

Since the development of self-supervised visual representation learning from contrastive learning to masked image modeling (MIM), there is no significant difference in essence, that is, how to design proper pretext tasks for vision dictionary look-up.

Contrastive Learning, Denoising +7

mc-BEiT: Multi-choice Discretization for Image BERT Pre-training

1 code implementation 29 Mar 2022 Xiaotong Li, Yixiao Ge, Kun Yi, Zixuan Hu, Ying Shan, Ling-Yu Duan

Image BERT pre-training with masked image modeling (MIM) has become a popular practice for self-supervised representation learning.

Instance Segmentation, Object Detection +5

Uncertainty Modeling for Out-of-Distribution Generalization

1 code implementation ICLR 2022 Xiaotong Li, Yongxing Dai, Yixiao Ge, Jun Liu, Ying Shan, Ling-Yu Duan

In this paper, we improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.

Image Classification +3

Generalizable Person Re-identification with Relevance-aware Mixture of Experts

no code implementations CVPR 2021 Yongxing Dai, Xiaotong Li, Jun Liu, Zekun Tong, Ling-Yu Duan

Specifically, we propose a decorrelation loss to make the source domain networks (experts) keep the diversity and discriminability of individual domains' characteristics.

Generalizable Person Re-identification, Mixture-of-Experts
