Search Results for author: Sheng Liu

Found 54 papers, 27 papers with code

Diffusion Models for Robotic Manipulation: A Survey

no code implementations 11 Apr 2025 Rosa Wolf, Yitian Shi, Sheng Liu, Rania Rayyes

This paper also presents the two main frameworks of diffusion models and their integration with imitation learning and reinforcement learning.

Image Augmentation, Imitation Learning, +3

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

1 code implementation 1 Apr 2025 Juncheng Wu, Wenlong Deng, Xingxuan Li, Sheng Liu, Taomian Mi, Yifan Peng, Ziyang Xu, Yi Liu, Hyunjin Cho, Chang-In Choi, Yihan Cao, Hui Ren, Xiang Li, Xiaoxiao Li, Yuyin Zhou

Our pipeline generates detailed reasoning for various medical questions from 7 medical datasets, resulting in a dataset of 32,682 question-answer pairs, each with detailed, step-by-step explanations.

Knowledge Graphs, Mathematical Reasoning

VISO-Grasp: Vision-Language Informed Spatial Object-centric 6-DoF Active View Planning and Grasping in Clutter and Invisibility

no code implementations 16 Mar 2025 Yitian Shi, Di Wen, Guanqi Chen, Edgar Welte, Sheng Liu, Kunyu Peng, Rainer Stiefelhagen, Rania Rayyes

To the best of our knowledge, VISO-Grasp is the first unified framework integrating FMs into target-aware active view planning and 6-DoF grasping in environments with severe occlusions and entire invisibility constraints.

Spatial Reasoning

OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition

no code implementations 11 Mar 2025 Yiheng Yu, Sheng Liu, Yuan Feng, Min Xu, Zhelun Jin, Xuhua Yang

The primary challenge in continuous sign language recognition (CSLR) stems from the presence of multi-orientational and long-term motions.

Sign Language Recognition

OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning

no code implementations 16 Feb 2025 Pan Lu, Bowen Chen, Sheng Liu, Rahul Thapa, Joseph Boen, James Zou

In this paper, we introduce OctoTools, a training-free, user-friendly, and easily extensible open-source agentic framework designed to tackle complex reasoning across diverse domains.

MedQA, MMLU, +1

OTLRM: Orthogonal Learning-based Low-Rank Metric for Multi-Dimensional Inverse Problems

1 code implementation 15 Dec 2024 Xiangming Wang, Haijin Zeng, Jiaoyang Chen, Sheng Liu, Yongyong Chen, Guoqing Chao

The TNN-regularized optimization problem is solved by the singular value thresholding (SVT) operator, which leverages the t-SVD framework to obtain the low-rank tensor.

Image Denoising

Multimodal Instruction Tuning with Hybrid State Space Models

no code implementations 13 Nov 2024 Jianing Zhou, Han Li, Shuai Zhang, Ning Xie, Ruijie Wang, Xiaohan Nie, Sheng Liu, Lingyun Wang

Remarkably, our model enhances inference efficiency for high-resolution images and high-frame-rate videos by about 4 times compared to current models, with efficiency gains increasing as image resolution or video frames rise.

Mamba, State Space Models

Reducing Hallucinations in Vision-Language Models via Latent Space Steering

1 code implementation 21 Oct 2024 Sheng Liu, Haotian Ye, Lei Xing, James Zou

Unlike in large language models (LLMs), hallucination in LVLMs often arises from misalignments between visual inputs and textual outputs.

Hallucination

TFG: Unified Training-Free Guidance for Diffusion Models

1 code implementation 24 Sep 2024 Haotian Ye, Haowei Lin, Jiaqi Han, Minkai Xu, Sheng Liu, Yitao Liang, Jianzhu Ma, James Zou, Stefano Ermon

Given an unconditional diffusion model and a predictor for a target property of interest (e.g., a classifier), the goal of training-free guidance is to generate samples with desirable target properties without additional training.

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

1 code implementation 6 Aug 2024 Yunfei Xie, Ce Zhou, Lang Gao, Juncheng Wu, Xianhang Li, Hong-Yu Zhou, Sheng Liu, Lei Xing, James Zou, Cihang Xie, Yuyin Zhou

Unlike the existing multimodal datasets, which are limited by the availability of image-text pairs, we have developed the first automated pipeline that scales up multimodal data by generating multigranular visual and textual annotations in the form of image-ROI-description triplets without the need for any paired text descriptions.

Medical Visual Question Answering, Organ Detection, +1

Automated radiotherapy treatment planning guided by GPT-4Vision

no code implementations 21 Jun 2024 Sheng Liu, Oscar Pastor-Serrano, Yizheng Chen, Matthew Gopaulchan, Weixing Liang, Mark Buyyounouski, Erqi Pollom, Quynh-Thu Le, Michael Gensheimer, Peng Dong, Yong Yang, James Zou, Lei Xing

Objective: Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives.

In-Context Learning, Language Modelling, +2

TextGrad: Automatic "Differentiation" via Text

2 code implementations 11 Jun 2024 Mert Yuksekgonul, Federico Bianchi, Joseph Boen, Sheng Liu, Zhi Huang, Carlos Guestrin, James Zou

Without modifying the framework, TextGrad improves the zero-shot accuracy of GPT-4o in Google-Proof Question Answering from $51\%$ to $55\%$, yields $20\%$ relative performance gain in optimizing LeetCode-Hard coding problem solutions, improves prompts for reasoning, designs new druglike small molecules with desirable in silico binding, and designs radiation oncology treatment plans with high specificity.

Ranked #5 on GPQA

Question Answering, Specificity

GNSS Measurement-Based Context Recognition for Vehicle Navigation using Gated Recurrent Unit

no code implementations 22 Apr 2024 Sheng Liu, Zhiqiang Yao, Xuemeng Cao, Xiaowen Cai

In recent years, people have placed increasingly high demands on context-adaptive navigation (CAN).

Mapping the Increasing Use of LLMs in Scientific Papers

1 code implementation 1 Apr 2024 Weixin Liang, Yaohui Zhang, Zhengxuan Wu, Haley Lepp, Wenlong Ji, Xuandong Zhao, Hancheng Cao, Sheng Liu, Siyu He, Zhi Huang, Diyi Yang, Christopher Potts, Christopher D Manning, James Y. Zou

To address this gap, we conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals, using a population-level statistical framework to measure the prevalence of LLM-modified content over time.

Data Reconstruction Attacks and Defenses: A Systematic Evaluation

no code implementations 13 Feb 2024 Sheng Liu, Zihan Wang, Yuxiao Chen, Qi Lei

Reconstruction attacks and defenses are essential in understanding the data leakage problem in machine learning.

Federated Learning, Reconstruction Attack

Beyond Gradient and Priors in Privacy Attacks: Leveraging Pooler Layer Inputs of Language Models in Federated Learning

no code implementations 10 Dec 2023 Jianwei Li, Sheng Liu, Qi Lei

Language models trained via federated learning (FL) demonstrate impressive capabilities in handling complex tasks while protecting user privacy.

CoLA, Federated Learning, +3

Making Self-supervised Learning Robust to Spurious Correlation via Learning-speed Aware Sampling

no code implementations 27 Nov 2023 Weicheng Zhu, Sheng Liu, Carlos Fernandez-Granda, Narges Razavian

Self-supervised learning (SSL) has emerged as a powerful technique for learning rich representations from unlabeled data.

Self-Supervised Learning

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

1 code implementation 11 Nov 2023 Sheng Liu, Haotian Ye, Lei Xing, James Zou

On a new query, instead of adding demonstrations to the prompt, we shift the latent states of the LLM using the ICV.

In-Context Learning, Style Transfer

Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels

no code implementations 11 Jul 2023 Hui Kang, Sheng Liu, Huaxi Huang, Jun Yu, Bo Han, Dadong Wang, Tongliang Liu

In recent years, research on learning with noisy labels has focused on devising novel algorithms that can achieve robustness to noisy training labels while generalizing to clean data.

Learning with noisy labels

LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization

no code implementations CVPR 2023 Sheng Liu, Cong Phuoc Huynh, Cong Chen, Maxim Arap, Raffay Hamid

We present a simple yet effective self-supervised pre-training method for image harmonization which can leverage large-scale unannotated image datasets.

Image Harmonization

Swin MAE: Masked Autoencoders for Small Datasets

1 code implementation 28 Dec 2022 Zi'an Xu, Yin Dai, Fayu Liu, Weibing Chen, Yue Liu, Lifu Shi, Sheng Liu, YuHang Zhou

The development of deep learning models in medical image analysis is largely limited by the lack of large, well-annotated datasets.

Medical Image Analysis, Transfer Learning

Understanding and Improving Transfer Learning of Deep Models via Neural Collapse

no code implementations 23 Dec 2022 Xiao Li, Sheng Liu, Jinxin Zhou, Xinyu Lu, Carlos Fernandez-Granda, Zhihui Zhu, Qing Qu

In particular, we discovered a systematic pattern that emerges when linear probing pre-trained models on downstream training data: the more feature collapse of pre-trained models on downstream training data, the higher the transfer accuracy.

Data Augmentation, parameter-efficient fine-tuning, +2

Avoiding spurious correlations via logit correction

1 code implementation 2 Dec 2022 Sheng Liu, Xu Zhang, Nitesh Sekhar, Yue Wu, Prateek Singhal, Carlos Fernandez-Granda

Empirical studies suggest that machine learning models trained with empirical risk minimization (ERM) often rely on attributes that may be spuriously correlated with the class labels.

Attribute

Recovering Sign Bits of DCT Coefficients in Digital Images as an Optimization Problem

1 code implementation 2 Nov 2022 Ruiyuan Lin, Sheng Liu, Jun Jiang, Shujun Li, Chengqing Li, C.-C. Jay Kuo

Recovering unknown, missing, damaged, distorted, or lost information in DCT coefficients is a common task in multiple applications of digital image processing, including image compression, selective image encryption, and image communication.

Image Compression, SSIM

Are All Losses Created Equal: A Neural Collapse Perspective

no code implementations 4 Oct 2022 Jinxin Zhou, Chong You, Xiao Li, Kangning Liu, Sheng Liu, Qing Qu, Zhihui Zhu

We extend such results and show through global solution and landscape analyses that a broad family of loss functions including commonly used label smoothing (LS) and focal loss (FL) exhibits Neural Collapse.

All

Asymmetric Dual-Decoder U-Net for Joint Rain and Haze Removal

1 code implementation 14 Jun 2022 Yuan Feng, Yaojun Hu, Pengfei Fang, Yanhong Yang, Sheng Liu, ShengYong Chen

However, jointly removing rain and haze from scene images is ill-posed and challenging, since both the presence of haze and rain and changes in atmospheric light can degrade the scene information.

Autonomous Driving, Decoder, +1

Parotid Gland MRI Segmentation Based on Swin-Unet and Multimodal Images

no code implementations 7 Jun 2022 Zi'an Xu, Yin Dai, Fayu Liu, Siqi Li, Sheng Liu, Lifu Shi, Jun Fu

Preoperative tumor localization, differential diagnosis, and subsequent selection of appropriate treatment for parotid gland tumors are critical.

MRI segmentation, Segmentation, +1

Depth-Guided Sparse Structure-from-Motion for Movies and TV Shows

1 code implementation CVPR 2022 Sheng Liu, Xiaohan Nie, Raffay Hamid

We demonstrate that our approach: (a) significantly improves the quality of 3-D reconstruction for our small-parallax setting, (b) does not cause any degradation for data with large-parallax, and (c) maintains the generalizability and scalability of geometry-based sparse SfM.

On Learning Contrastive Representations for Learning with Noisy Labels

1 code implementation CVPR 2022 Li Yi, Sheng Liu, Qi She, A. Ian McLeod, Boyu Wang

To address this issue, we focus on learning robust contrastive representations of data on which it is hard for the classifier to memorize the label noise under the CE loss.

Learning with noisy labels, Memorization, +1

Robust Training under Label Noise by Over-parameterization

1 code implementation 28 Feb 2022 Sheng Liu, Zhihui Zhu, Qing Qu, Chong You

In this work, we propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted.

Learning with noisy labels

Uncertainty Detection and Reduction in Neural Decoding of EEG Signals

1 code implementation 28 Dec 2021 Tiehang Duan, Zhenyi Wang, Sheng Liu, Sargur N. Srihari, Hui Yang

In this work, we proposed an uncertainty estimation and reduction model (UNCER) to quantify and mitigate the uncertainty during the EEG decoding process.

Data Augmentation, Decision Making, +3

Deep Probability Estimation

no code implementations 21 Nov 2021 Sheng Liu, Aakash Kaku, Weicheng Zhu, Matan Leibovich, Sreyas Mohan, Boyang Yu, Haoxiang Huang, Laure Zanna, Narges Razavian, Jonathan Niles-Weed, Carlos Fernandez-Granda

Reliable probability estimation is of crucial importance in many real-world applications where there is inherent (aleatoric) uncertainty.

Autonomous Vehicles, Binary Classification, +2

Adaptive Early-Learning Correction for Segmentation from Noisy Annotations

2 code implementations CVPR 2022 Sheng Liu, Kangning Liu, Weicheng Zhu, Yiqiu Shen, Carlos Fernandez-Granda

We discover a phenomenon that has been previously reported in the context of classification: the networks tend to first fit the clean pixel-level labels during an "early-learning" phase, before eventually memorizing the false annotations.

Classification, Medical Image Segmentation, +5

Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning

1 code implementation 15 Aug 2021 Jiahao Wang, Yunhong Wang, Sheng Liu, Annan Li

Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications, whereas the data of rare fine-grained categories is very limited.

Action Understanding, Fine-grained Action Recognition, +1

Multi-modal and frequency-weighted tensor nuclear norm for hyperspectral image denoising

no code implementations 23 Jun 2021 Xiaozhen Xie, Sheng Liu

In this paper, we propose the multi-modal and frequency-weighted tensor nuclear norm (MFWTNN) and the non-convex MFWTNN for HSI denoising tasks.

Hyperspectral Image Denoising, Image Denoising

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

1 code implementation NeurIPS 2021 Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Furthermore, we show that our ConvNorm can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets.

Generative Adversarial Network

Hyperspectral Image Denoising via Multi-modal and Double-weighted Tensor Nuclear Norm

no code implementations 19 Jan 2021 Sheng Liu, Xiaozhen Xie, Wenfeng Kong

In the Fourier transform domain of HSIs, different frequency slices (FS) contain different information; different singular values (SVs) of each FS also represent different information.

Hyperspectral Image Denoising, Image Denoising, +1

Adversarial Multiscale Feature Learning for Overlapping Chromosome Segmentation

1 code implementation 22 Dec 2020 Liye Mei, Yalan Yu, Yueyun Weng, Xiaopeng Guo, Yan Liu, Du Wang, Sheng Liu, Fuling Zhou, Cheng Lei

Since manual analysis is highly time and effort consuming, computer-assisted automatic chromosome karyotype analysis based on images is routinely used to improve the efficiency and accuracy of the analysis.

Generative Adversarial Network, Segmentation

Urban Bike Lane Planning with Bike Trajectories: Models, Algorithms, and a Real-World Case Study

no code implementations 21 Aug 2020 Sheng Liu, Zuo-Jun Max Shen, Xiang Ji

We formalize the bike lane planning problem in view of the cyclists' utility functions and derive an integer optimization model to maximize the utility.

Management

Early-Learning Regularization Prevents Memorization of Noisy Labels

2 code implementations NeurIPS 2020 Sheng Liu, Jonathan Niles-Weed, Narges Razavian, Carlos Fernandez-Granda

In contrast with existing approaches, which use the model output during early learning to detect the examples with clean labels, and either ignore or attempt to correct the false labels, we take a different route and instead capitalize on early learning via regularization.

General Classification, Learning with noisy labels, +1

Towards Understanding the Adversarial Vulnerability of Skeleton-based Action Recognition

no code implementations 14 May 2020 Tianhang Zheng, Sheng Liu, Changyou Chen, Junsong Yuan, Baochun Li, Kui Ren

We first formulate generation of adversarial skeleton actions as a constrained optimization problem by representing or approximating the physiological and physical constraints with mathematical formulations.

Action Recognition, Skeleton Based Action Recognition

Machine Discovery of Partial Differential Equations from Spatiotemporal Data

1 code implementation 15 Sep 2019 Ye Yuan, Junlin Li, Liang Li, Frank Jiang, Xiuchuan Tang, Fumin Zhang, Sheng Liu, Jorge Goncalves, Henning U. Voss, Xiuting Li, Jürgen Kurths, Han Ding

The study presents a general framework for discovering underlying Partial Differential Equations (PDEs) using measured spatiotemporal data.

Sparse Recovery Beyond Compressed Sensing: Separable Nonlinear Inverse Problems

no code implementations 12 May 2019 Brett Bernstein, Sheng Liu, Chrysa Papadaniil, Carlos Fernandez-Granda

In this work, we consider separable inverse problems, where the data are modeled as a linear combination of functions that depend nonlinearly on certain parameters of interest.

compressed sensing, Geophysics

Time-Series Analysis via Low-Rank Matrix Factorization Applied to Infant-Sleep Data

no code implementations 9 Apr 2019 Sheng Liu, Mark Cheng, Hayley Brooks, Wayne Mackey, David J. Heeger, Esteban G. Tabak, Carlos Fernandez-Granda

We apply our methodology to detect anomalous individuals, to cluster the cohort into groups with different sleeping tendencies, and to obtain improved predictions of future sleep behavior.

Time Series, Time Series Analysis

Discovering Influential Factors in Variational Autoencoder

1 code implementation 6 Sep 2018 Shiqi Liu, Jingxin Liu, Qian Zhao, Xiangyong Cao, Huibin Li, Hongy-ing Meng, Sheng Liu, Deyu Meng

In the field of machine learning, it is still a critical issue to identify and supervise the learned representation without manual intervention or intuition-based assistance, in order to extract useful knowledge or serve downstream tasks.

General Classification

Defect detection for patterned fabric images based on GHOG and low-rank decomposition

no code implementations 18 Feb 2017 Chunlei Li, Guangshuai Gao, Zhoufeng Liu, Di Huang, Sheng Liu, Miao Yu

In order to accurately detect defects in patterned fabric images, a novel detection algorithm based on Gabor-HOG (GHOG) and low-rank decomposition is proposed in this paper.

Computational Efficiency, Defect Detection
