Search Results for author: Ming Liu

Found 271 papers, 104 papers with code

Less Is More: Domain Adaptation with Lottery Ticket for Reading Comprehension

1 code implementation Findings (EMNLP) 2021 Haichao Zhu, Zekun Wang, Heng Zhang, Ming Liu, Sendong Zhao, Bing Qin

Then, we only fine-tune the lottery subnetwork, a small fraction of the whole parameters, on the annotated target domain data for adaptation.

Domain Adaptation Reading Comprehension

AMLN: Adversarial-based Mutual Learning Network for Online Knowledge Distillation

no code implementations ECCV 2020 Xiaobing Zhang, Shijian Lu, Haigang Gong, Zhipeng Luo, Ming Liu

Online knowledge distillation, which jointly learns teacher and student models or an ensemble of student models simultaneously and collaboratively, has attracted increasing interest recently.

Knowledge Distillation Transfer Learning

On Fairness of Unified Multimodal Large Language Model for Image Generation

no code implementations 5 Feb 2025 Ming Liu, Hao Chen, Jindong Wang, LiWen Wang, Bhiksha Raj Ramakrishnan, Wensheng Zhang

Unified multimodal large language models (U-MLLMs) have demonstrated impressive performance in visual understanding and generation in an end-to-end pipeline.

Fairness Image Generation +4

From In Silico to In Vitro: A Comprehensive Guide to Validating Bioinformatics Findings

no code implementations 24 Jan 2025 Tianyang Wang, Silin Chen, Yunze Wang, Yichao Zhang, Xinyuan Song, Ziqian Bi, Ming Liu, Qian Niu, Junyu Liu, Pohsun Feng, Xintian Sun, Benji Peng, Charles Zhang, Keyu Chen, Ming Li, Cheng Fei, Lawrence KQ Yan

The integration of bioinformatics predictions and experimental validation plays a pivotal role in advancing biological research, from understanding molecular mechanisms to developing therapeutic strategies.

TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection

no code implementations 21 Jan 2025 Yang Cao, Sikun Yang, Chen Li, Haolong Xiang, Lianyong Qi, Bo Liu, Rongsheng Li, Ming Liu

Text anomaly detection is crucial for identifying spam, misinformation, and offensive language in natural language processing tasks.

Anomaly Detection Misinformation

From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

no code implementations 5 Jan 2025 Tianyang Wang, Yunze Wang, Jun Zhou, Benji Peng, Xinyuan Song, Charles Zhang, Xintian Sun, Qian Niu, Junyu Liu, Silin Chen, Keyu Chen, Ming Li, Pohsun Feng, Ziqian Bi, Ming Liu, Yichao Zhang, Cheng Fei, Caitlyn Heqi Yin, Lawrence KQ Yan

Uncertainty quantification (UQ) is a critical aspect of artificial intelligence (AI) systems, particularly in high-risk domains such as healthcare, autonomous systems, and financial technology, where decision-making processes must account for uncertainty.

Decision Making Ensemble Learning +1

Retrieval Augmented Image Harmonization

no code implementations 18 Dec 2024 Haolin Wang, Ming Liu, Zifei Yan, Chao Zhou, Longan Xiao, WangMeng Zuo

When embedding objects (foreground) into images (background), considering the influence of photography conditions like illumination, it is usually necessary to perform image harmonization to make the foreground object coordinate with the background image in terms of brightness, color, etc.

Data Augmentation Image Harmonization +1

From Specific-MLLM to Omni-MLLM: A Survey about the MLLMs alligned with Multi-Modality

no code implementations 16 Dec 2024 Shixin Jiang, Jiafeng Liang, Ming Liu, Bing Qin

From the Specific-MLLM, which excels in single-modal tasks, to the Omni-MLLM, which extends the range of general modalities, this evolution aims to achieve understanding and generation of multimodal information.

From Noise to Nuance: Advances in Deep Generative Image Models

no code implementations 12 Dec 2024 Benji Peng, Chia Xin Liang, Ziqian Bi, Ming Liu, Yichao Zhang, Tianyang Wang, Keyu Chen, Xinyuan Song, Pohsun Feng

We examine how recent developments in Stable Diffusion, DALL-E, and consistency models have redefined the capabilities and performance boundaries of image synthesis, while addressing persistent challenges in efficiency and quality.

Computational Efficiency Image Generation

Learning Spatially Decoupled Color Representations for Facial Image Colorization

no code implementations 10 Dec 2024 Hangyan Zhu, Ming Liu, Chao Zhou, Zifei Yan, Kuanquan Wang, WangMeng Zuo

To expand the application paradigms to scenarios with no reference images, we further train two alternative modules, which predict the color representations from the grayscale input or a random seed, respectively.

Colorization Face Parsing +1

Deep Learning, Machine Learning, Advancing Big Data Analytics and Management

no code implementations 3 Dec 2024 Weiche Hsieh, Ziqian Bi, Keyu Chen, Benji Peng, Sen Zhang, Jiawei Xu, Jinlang Wang, Caitlyn Heqi Yin, Yichao Zhang, Pohsun Feng, Yizhu Wen, Tianyang Wang, Ming Li, Chia Xin Liang, Jintao Ren, Qian Niu, Silin Chen, Lawrence K. Q. Yan, Han Xu, Hong-Ming Tseng, Xinyuan Song, Bowen Jing, Junjie Yang, Junhao Song, Junyu Liu, Ming Liu

This work explores the theoretical foundations, methodological advancements, and practical implementations of these technologies, emphasizing their role in uncovering actionable insights from massive, high-dimensional datasets.

Anomaly Detection Deep Learning +6

Real-Time Metric-Semantic Mapping for Autonomous Navigation in Outdoor Environments

1 code implementation 30 Nov 2024 Jianhao Jiao, Ruoyu Geng, Yuanhang Li, Ren Xin, Bowen Yang, Jin Wu, Lujia Wang, Ming Liu, Rui Fan, Dimitrios Kanoulas

The creation of a metric-semantic map, which encodes human-prior knowledge, represents a high-level abstraction of environments.

Autonomous Navigation

A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks

no code implementations 9 Nov 2024 Chia Xin Liang, Pu Tian, Caitlyn Heqi Yin, Yao Yua, Wei An-Hou, Li Ming, Tianyang Wang, Ziqian Bi, Ming Liu

This survey and application guide to multimodal large language models (MLLMs) explores the rapidly developing field of MLLMs, examining their architectures, applications, and impact on AI and Generative Models.

Visual Storytelling

Large Language Model Benchmarks in Medical Tasks

no code implementations 28 Oct 2024 Lawrence K. Q. Yan, Qian Niu, Ming Li, Yichao Zhang, Caitlyn Heqi Yin, Cheng Fei, Benji Peng, Ziqian Bi, Pohsun Feng, Keyu Chen, Tianyang Wang, Yunze Wang, Silin Chen, Ming Liu, Junyu Liu

With the increasing application of large language models (LLMs) in the medical domain, evaluating these models' performance using benchmark datasets has become crucial.

Image Captioning Language Modeling +7

Deep Learning, Machine Learning -- Digital Signal and Image Processing: From Theory to Application

no code implementations 27 Oct 2024 Weiche Hsieh, Ziqian Bi, Junyu Liu, Benji Peng, Sen Zhang, Xuanhe Pan, Jiawei Xu, Jinlang Wang, Keyu Chen, Caitlyn Heqi Yin, Pohsun Feng, Yizhu Wen, Tianyang Wang, Ming Li, Jintao Ren, Qian Niu, Silin Chen, Ming Liu

Digital Signal Processing (DSP) and Digital Image Processing (DIP) with Machine Learning (ML) and Deep Learning (DL) are popular research areas in Computer Vision and related fields.

Image Enhancement

FMCW Radar Principles and Human Activity Recognition Systems: Foundations, Techniques, and Applications

no code implementations 11 Oct 2024 Ziqian Bi, Jiawei Xu, Ming Liu

This book introduces the theoretical foundations of FMCW radar systems, including range and velocity estimation, signal processing techniques, and the generation of radar point clouds.

Human Activity Recognition

Deep Learning and Machine Learning: Advancing Big Data Analytics and Management with Design Patterns

no code implementations 4 Oct 2024 Keyu Chen, Ziqian Bi, Tianyang Wang, Yizhu Wen, Pohsun Feng, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Li, Xuanhe Pan, Jiawei Xu, Jinlang Wang, Ming Liu

This book, Design Patterns in Machine Learning and Deep Learning: Advancing Big Data Analytics Management, presents a comprehensive study of essential design patterns tailored for large-scale machine learning and deep learning applications.

Management

Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Unveiling AI's Potential Through Tools, Techniques, and Applications

no code implementations 2 Oct 2024 Pohsun Feng, Ziqian Bi, Yizhu Wen, Xuanhe Pan, Benji Peng, Ming Liu, Jiawei Xu, Keyu Chen, Junyu Liu, Caitlyn Heqi Yin, Sen Zhang, Jinlang Wang, Qian Niu, Ming Li, Tianyang Wang

Artificial intelligence (AI), machine learning, and deep learning have become transformative forces in big data analytics and management, enabling groundbreaking advancements across diverse industries.

AutoML Edge-computing +3

Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Object-Oriented Programming

no code implementations 30 Sep 2024 Tianyang Wang, Ziqian Bi, Keyu Chen, Jiawei Xu, Qian Niu, Junyu Liu, Benji Peng, Ming Li, Sen Zhang, Xuanhe Pan, Jinlang Wang, Pohsun Feng, Yizhu Wen, Ming Liu

Object-Oriented Programming (OOP) has become a crucial paradigm for managing the growing complexity of modern software systems, particularly in fields like machine learning, deep learning, large language models (LLM), and data analytics.

Management

Annotation-Free Curb Detection Leveraging Altitude Difference Image

no code implementations 30 Sep 2024 Fulong Ma, Peng Hou, Yuxuan Liu, Ming Liu, Jun Ma

This module utilizes a deterministic curb detection algorithm to automatically generate a vast quantity of training data.

Autonomous Vehicles

Task-Oriented Pre-Training for Drivable Area Detection

no code implementations 30 Sep 2024 Fulong Ma, Guoyang Zhao, Weiqing Qi, Ming Liu, Jun Ma

By initially training on large datasets and subsequently fine-tuning on task-specific data, pre-training provides a solid foundation for models, improving generalization abilities and accelerating convergence rates.

Drivable Area Detection

CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought

1 code implementation 29 Sep 2024 Yexing Du, Ziyang Ma, Yifan Yang, Keqi Deng, Xie Chen, Bo Yang, Yang Xiang, Ming Liu, Bing Qin

We propose CoT-ST, a speech translation model that utilizes multimodal CoT to decompose speech translation into sequential steps of speech recognition and translation.

Speech Recognition +1

Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Handy Appetizer

no code implementations 25 Sep 2024 Benji Peng, Xuanhe Pan, Yizhu Wen, Ziqian Bi, Keyu Chen, Ming Li, Ming Liu, Qian Niu, Junyu Liu, Jinlang Wang, Sen Zhang, Jiawei Xu, Pohsun Feng

This book explores the role of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) in driving the progress of big data analytics and management.

Autonomous Driving Deep Learning +2

FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera

1 code implementation 23 Sep 2024 Guoyang Zhao, Yuxuan Liu, Weiqing Qi, Fulong Ma, Ming Liu, Jun Ma

We incorporate a fisheye camera model into the projection and reprojection stages during training to handle image distortions, thereby improving depth estimation accuracy and training stability.

Autonomous Vehicles Depth Estimation

TSCLIP: Robust CLIP Fine-Tuning for Worldwide Cross-Regional Traffic Sign Recognition

1 code implementation 23 Sep 2024 Guoyang Zhao, Fulong Ma, Weiqing Qi, Chenguang Zhang, Yuxuan Liu, Ming Liu, Jun Ma

In this paper, we propose TSCLIP, a robust fine-tuning approach with the contrastive language-image pre-training (CLIP) model for worldwide cross-regional traffic sign recognition.

Prompt Engineering Traffic Sign Recognition

Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Tensorflow Pretrained Models

no code implementations 20 Sep 2024 Keyu Chen, Ziqian Bi, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Liu, Ming Li, Xuanhe Pan, Jiawei Xu, Jinlang Wang, Pohsun Feng

The application of TensorFlow pre-trained models in deep learning is explored, with an emphasis on practical guidance for tasks such as image classification and object detection.

Deep Learning Image Classification +4

CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information

1 code implementation 20 Sep 2024 Yuxin Wang, Minghua Ma, Zekun Wang, Jingchang Chen, Huiming Fan, Liping Shan, Qing Yang, Dongliang Xu, Ming Liu, Bing Qin

To this end, we introduce an efficient structured pruning framework named CFSP, which leverages both Coarse (interblock) and Fine-grained (intrablock) activation information as an importance criterion to guide pruning.

Network Pruning

From Text to Multimodality: Exploring the Evolution and Impact of Large Language Models in Medical Practice

no code implementations 14 Sep 2024 Qian Niu, Keyu Chen, Ming Li, Pohsun Feng, Ziqian Bi, Lawrence KQ Yan, Yichao Zhang, Caitlyn Heqi Yin, Cheng Fei, Junyu Liu, Benji Peng, Tianyang Wang, Yunze Wang, Silin Chen, Ming Liu

This comprehensive review explores the progression of LLMs to Multimodal Large Language Models (MLLMs) and their growing influence in medical practice.

Large Language Models and Cognitive Science: A Comprehensive Review of Similarities, Differences, and Challenges

no code implementations 4 Sep 2024 Qian Niu, Junyu Liu, Ziqian Bi, Pohsun Feng, Benji Peng, Keyu Chen, Ming Li, Lawrence KQ Yan, Yichao Zhang, Caitlyn Heqi Yin, Cheng Fei, Tianyang Wang, Yunze Wang, Silin Chen, Ming Liu

This comprehensive review explores the intersection of Large Language Models (LLMs) and cognitive science, examining similarities and differences between LLMs and human cognitive processes.

Towards Efficient Large Language Models for Scientific Text: A Review

no code implementations 20 Aug 2024 Huy Quoc To, Ming Liu, Guangyan Huang

Large language models (LLMs) have ushered in a new era for processing complex information in various fields, including science.

CEGRL-TKGR: A Causal Enhanced Graph Representation Learning Framework for Temporal Knowledge Graph Reasoning

no code implementations 15 Aug 2024 Jinze Sun, Yongpan Sheng, Lirong He, Yongbin Qin, Ming Liu, Tao Jia

This framework introduces causal structures in graph-based representation learning to unveil the essential causal relationships between events, ultimately enhancing the performance of the TKGR task.

Graph Representation Learning Knowledge Graphs +1

Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

no code implementations 12 Jul 2024 Yue Zhang, Woyu Zhang, Shaocong Wang, Ning Lin, Yifei Yu, Yangu He, Bo wang, Hao Jiang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing.

GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models

1 code implementation 3 Jul 2024 Zike Yuan, Ming Liu, Hui Wang, Bing Qin

Evaluating the graph comprehension and reasoning abilities of Large Language Models (LLMs) is challenging and often incomplete.

Benchmarking

BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering

no code implementations 28 Jun 2024 Zheng Chu, Jingchang Chen, Qianglong Chen, Haotian Wang, Kun Zhu, Xiyuan Du, Weijiang Yu, Ming Liu, Bing Qin

For composite questions, the LLM combines beam candidates, explores multiple reasoning paths through probabilistic aggregation, and prioritizes the most promising trajectory.

Multi-hop Question Answering Question Answering +1

GUIDE: A Guideline-Guided Dataset for Instructional Video Comprehension

no code implementations 26 Jun 2024 Jiafeng Liang, Shixin Jiang, Zekun Wang, Haojie Pan, Zerui Chen, Zheng Chu, Ming Liu, Ruiji Fu, Zhongyuan Wang, Bing Qin

Our proposed benchmark consists of three sub-tasks to evaluate the comprehension ability of models: (1) Step Captioning: models have to generate captions for specific steps from videos.

Continuous-Time Digital Twin with Analogue Memristive Neural Ordinary Differential Equation Solver

1 code implementation 12 Jun 2024 Hegan Chen, Jichang Yang, Jia Chen, Songqi Wang, Shaocong Wang, Dingchen Wang, Xinyu Tian, Yifei Yu, Xi Chen, Yinan Lin, Yangu He, Xiaoshan Wu, Xinyuan Zhang, Ning Lin, Meng Xu, Yi Li, Xumeng Zhang, Zhongrui Wang, Han Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

We experimentally validate our approach by developing a digital twin of the HP memristor, which accurately extrapolates its nonlinear dynamics, achieving a 4.2-fold projected speedup and a 41.4-fold projected decrease in energy consumption compared to state-of-the-art digital hardware, while maintaining an acceptable error margin.

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

1 code implementation 11 Jun 2024 Hang Yao, Ming Liu, Haolin Wang, Zhicun Yin, Zifei Yan, Xiaopeng Hong, WangMeng Zuo

Therefore, instead of utilizing the same setting for all samples, we propose to predict a particular denoising step for each sample by evaluating the difference between image contents and the priors extracted from diffusion models.

Denoising Unsupervised Anomaly Detection

Planning Like Human: A Dual-process Framework for Dialogue Planning

1 code implementation 8 Jun 2024 Tao He, Lizi Liao, Yixin Cao, Yuanxing Liu, Ming Liu, Zerui Chen, Bing Qin

In proactive dialogue, the challenge lies not just in generating responses but in steering conversations toward predetermined goals, a task where Large Language Models (LLMs) typically struggle due to their reactive nature.

Prompt Engineering

Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation

no code implementations 30 May 2024 Jingchang Chen, Hongxuan Tang, Zheng Chu, Qianglong Chen, Zekun Wang, Ming Liu, Bing Qin

To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus.

Code Generation HumanEval +1

LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning

1 code implementation 30 May 2024 Zixian Guo, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo

Inferred results of LLMs are used as restarting points for the next stage of gradient optimization.

CLRKDNet: Speeding up Lane Detection with Knowledge Distillation

1 code implementation 21 May 2024 Weiqing Qi, Guoyang Zhao, Fulong Ma, Linwei Zheng, Ming Liu

Road lanes are integral components of the visual perception systems in intelligent vehicles, playing a pivotal role in safe navigation.

Autonomous Driving Knowledge Distillation +1

Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting

no code implementations 17 May 2024 Xinzhe Li, Ming Liu

Over the last decade, a wide range of training and deployment strategies for Large Language Models (LLMs) have emerged.

BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

2 code implementations 12 May 2024 Mingkai Jia, Qingwen Zhang, Bowen Yang, Jin Wu, Ming Liu, Patric Jensfelt

In response, we present BeautyMap to efficiently remove the dynamic points while retaining static features for high-fidelity global maps.

Computational Efficiency

GRAMMAR: Grounded and Modular Methodology for Assessment of Closed-Domain Retrieval-Augmented Language Model

1 code implementation 30 Apr 2024 Xinzhe Li, Ming Liu, Shang Gao

Retrieval-Augmented Generation (RAG) systems are widely used across various industries for querying closed-domain and in-house knowledge bases.

Language Modeling +2

A Survey on the Real Power of ChatGPT

1 code implementation 22 Apr 2024 Ming Liu, Ran Liu, Ye Zhu, Hua Wang, Youyang Qu, Rongsheng Li, Yongpan Sheng, Wray Buntine

ChatGPT has changed the AI community and an active research line is the performance evaluation of ChatGPT.

Survey

Efficient and accurate neural field reconstruction using resistive memory

no code implementations 15 Apr 2024 Yifei Yu, Shaocong Wang, Woyu Zhang, Xinyuan Zhang, Xiuzhe Wu, Yangu He, Jichang Yang, Yue Zhang, Ning Lin, Bo wang, Xi Chen, Songqi Wang, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.

Novel View Synthesis Quantization

SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions

1 code implementation 9 Apr 2024 Xiaoyu Liu, Yuxiang Wei, Ming Liu, Xianhui Lin, Peiran Ren, Xuansong Xie, WangMeng Zuo

The key idea of our SmartControl is to relax the visual condition on the areas that are conflicted with text prompts.

Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

1 code implementation 8 Apr 2024 Jichang Yang, Hegan Chen, Jia Chen, Songqi Wang, Shaocong Wang, Yifei Yu, Xi Chen, Bo wang, Xinyuan Zhang, Binbin Cui, Ning Lin, Meng Xu, Yi Li, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Han Wang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively.

Edge-computing

OmniColor: A Global Camera Pose Optimization Approach of LiDAR-360Camera Fusion for Colorizing Point Clouds

1 code implementation 6 Apr 2024 Bonan Liu, Guoyang Zhao, Jianhao Jiao, Guang Cai, Chengyang Li, Handi Yin, Yuyang Wang, Ming Liu, Pan Hui

A colored point cloud, as a simple and efficient 3D representation, has many advantages in various fields, including robotic navigation and scene reconstruction.

3D Reconstruction

CurbNet: Curb Detection Framework Based on LiDAR Point Cloud Segmentation

1 code implementation 25 Mar 2024 Guoyang Zhao, Fulong Ma, Weiqing Qi, Yuxuan Liu, Ming Liu, Jun Ma

Curb detection is a crucial function in intelligent driving, essential for determining drivable areas on the road.

Point Cloud Segmentation

Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping

no code implementations 6 Jan 2024 Tongyan Hua, Haotian Bai, Zidong Cao, Ming Liu, DaCheng Tao, Lin Wang

In this paper, we introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Field (NeRF).

Depth Estimation NeRF

Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation

1 code implementation 26 Dec 2023 Zixian Guo, Yuxiang Wei, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo

Parameter-efficient fine-tuning (PEFT) methods have provided an effective way for adapting large vision-language models to specific tasks or scenarios.

parameter-efficient fine-tuning

U2-KWS: Unified Two-pass Open-vocabulary Keyword Spotting with Keyword Bias

no code implementations 15 Dec 2023 Ao Zhang, Pan Zhou, Kaixun Huang, Yong Zou, Ming Liu, Lei Xie

Open-vocabulary keyword spotting (KWS), which allows users to customize keywords, has attracted increasing interest.

Decoder Keyword Spotting

Random resistive memory-based deep extreme point learning machine for unified visual processing

no code implementations 14 Dec 2023 Shaocong Wang, Yizhao Gao, Yi Li, Woyu Zhang, Yifei Yu, Bo wang, Ning Lin, Hegan Chen, Yue Zhang, Yang Jiang, Dingchen Wang, Jia Chen, Peng Dai, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Xiaoxin Xu, Hayden So, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.

KwaiAgents: Generalized Information-seeking Agent System with Large Language Models

1 code implementation 8 Dec 2023 Haojie Pan, Zepeng Zhai, Hao Yuan, Yaojia LV, Ruiji Fu, Ming Liu, Zhongyuan Wang, Bing Qin

Driven by curiosity, humans have continually sought to explore and understand the world around them, leading to the invention of various tools to satiate this inquisitiveness.

Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature Aligned Pre-Training and Region-Aware Fine-tuning

1 code implementation 1 Dec 2023 Kangcheng Liu, Yong-Jin Liu, Kai Tang, Ming Liu, Baoquan Chen

Deep neural network models have achieved remarkable progress in 3D scene understanding while trained in the closed-set setting and with full labels.

Contrastive Learning Few-Shot Learning +2

TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models

1 code implementation 29 Nov 2023 Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Haotian Wang, Ming Liu, Bing Qin

Grasping the concept of time is a fundamental facet of human cognition, indispensable for truly comprehending the intricacies of the world.

GDTS: Goal-Guided Diffusion Model with Tree Sampling for Multi-Modal Pedestrian Trajectory Prediction

no code implementations 25 Nov 2023 Ge Sun, Sheng Wang, Lei Zhu, Ming Liu, Jun Ma

To address these challenges and facilitate the use of diffusion models in multi-modal trajectory prediction, we propose GDTS, a novel Goal-Guided Diffusion Model with Tree Sampling for multi-modal trajectory prediction.

Autonomous Driving Denoising +3

Pruning random resistive memory for optimizing analogue AI

no code implementations 13 Nov 2023 Yi Li, Songqi Wang, Yaping Zhao, Shaocong Wang, Woyu Zhang, Yangu He, Ning Lin, Binbin Cui, Xi Chen, Shiming Zhang, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Xiaoxin Xu, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network.

Audio Classification Image Segmentation +1

MTGER: Multi-view Temporal Graph Enhanced Temporal Reasoning over Time-Involved Document

no code implementations 8 Nov 2023 Zheng Chu, Zekun Wang, Jiafeng Liang, Ming Liu, Bing Qin

To address this issue, we propose MTGER, a novel Multi-view Temporal Graph Enhanced Temporal Reasoning framework for temporal reasoning over time-involved documents.

Learning to Learn for Few-shot Continual Active Learning

no code implementations 7 Nov 2023 Stella Ho, Ming Liu, Shang Gao, Longxiang Gao

Recent advances in continual learning are mostly confined to a supervised learning setting, especially in the NLP domain.

Active Learning Continual Learning +3

Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection

1 code implementation 29 Oct 2023 Yuanze Li, Haolin Wang, Shihao Yuan, Ming Liu, Debin Zhao, Yiwen Guo, Chen Xu, Guangming Shi, WangMeng Zuo

To stimulate the relevant knowledge in LMMs and adapt the LMMs towards anomaly detection tasks, we introduce existing IAD methods as vision experts and present a novel large multimodal model applying vision experts for industrial anomaly detection (abbreviated to Myriad).

Anomaly Detection Image Captioning +2

Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image Composition

1 code implementation ICCV 2023 Xiaoyu Liu, Ming Liu, Junyi Li, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo

In this paper, we circumvent this issue by presenting a joint framework for both unbounded recommendation of camera view and image composition (i.e., UNIC).

Image Cropping

MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces

1 code implementation ICCV 2023 Zhicun Yin, Ming Liu, Xiaoming Li, Hui Yang, Longan Xiao, WangMeng Zuo

To evaluate our proposed MetaF2N, we have collected a real-world low-quality dataset with one or multiple faces in each image, and our MetaF2N achieves superior performance on both synthetic and real-world datasets.

Image Generation Image Super-Resolution +1

Efficient Ray Sampling for Radiance Fields Reconstruction

no code implementations 29 Aug 2023 Shilei Sun, Ming Liu, Zhongyi Fan, Yuxue Liu, Chengwei Lv, Liquan Dong, Lingqin Kong

Through this method, not only can the convergence of the network be accelerated, but the spatial geometry of a scene can also be perceived more accurately.

NeRF

Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders

2 code implementations ICCV 2023 Jie Cheng, Xiaodong Mei, Ming Liu

This study explores the application of self-supervised learning (SSL) to the task of motion forecasting, an area that has not yet been extensively investigated despite the widespread success of SSL in computer vision and natural language processing.

Inductive Bias Motion Forecasting +1

Recent Advances in Hierarchical Multi-label Text Classification: A Survey

no code implementations 30 Jul 2023 Rundong Liu, Wenhan Liang, Weijun Luo, Yuxiang Song, He Zhang, Ruohua Xu, Yunfeng Li, Ming Liu

Hierarchical multi-label text classification aims to classify the input text into multiple labels, among which the labels are structured and hierarchical.

Multi-Label Text Classification +2

Car-Studio: Learning Car Radiance Fields from Single-View and Endless In-the-wild Images

1 code implementation 26 Jul 2023 Tianyu Liu, Hao Zhao, Yang Yu, Guyue Zhou, Ming Liu

However, previous studies learned within a sequence of autonomous driving datasets, resulting in unsatisfactory blurring when rotating the car in the simulator.

Autonomous Driving

Adaptive Control of Resource Flow to Optimize Construction Work and Cash Flow via Online Deep Reinforcement Learning

no code implementations 20 Jul 2023 Can Jiang, Xin Li, Jia-Rui Lin, Ming Liu, Zhiliang Ma

Therefore, this paper introduces a model and method to adaptively control the resource flows to optimize the work and cash flows of construction projects.

Deep Reinforcement Learning Management

Make Text Unlearnable: Exploiting Effective Patterns to Protect Personal Data

1 code implementation 2 Jul 2023 Xinzhe Li, Ming Liu, Shang Gao

This paper addresses the ethical concerns arising from the use of unauthorized public data in deep learning models and proposes a novel solution.

Question Answering text-classification +1

A Survey on Out-of-Distribution Evaluation of Neural NLP Models

no code implementations 27 Jun 2023 Xinzhe Li, Ming Liu, Shang Gao, Wray Buntine

Adversarial robustness, domain generalization and dataset biases are three active lines of research contributing to out-of-distribution (OOD) evaluation on neural NLP models.

Adversarial Robustness Domain Generalization +1

Can Pretrained Language Models Derive Correct Semantics from Corrupt Subwords under Noise?

1 code implementation 27 Jun 2023 Xinzhe Li, Ming Liu, Shang Gao

For Pretrained Language Models (PLMs), their susceptibility to noise has recently been linked to subword segmentation.

Segmentation

SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models

no code implementations 24 May 2023 Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Haichao Zhu, Jiafeng Liang, Liping Shan, Ming Liu, Dongliang Xu, Qing Yang, Bing Qin

Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications.

Data Augmentation

Automatic Localization and Detection Applicable to Robust Image Watermarking Resisting against Camera Shooting

no code implementations 27 Apr 2023 Ming Liu

Furthermore, the proposed scheme is not limited to any specific watermark embedding strategy, allowing for improvements in the watermark embedding and extraction procedure.

Binary stochasticity enabled highly efficient neuromorphic deep learning achieves better-than-software accuracy

no code implementations 25 Apr 2023 Yang Li, Wei Wang, Ming Wang, Chunmeng Dou, Zhengyu Ma, Huihui Zhou, Peng Zhang, Nicola Lepri, Xumeng Zhang, Qing Luo, Xiaoxin Xu, Guanhua Yang, Feng Zhang, Ling Li, Daniele Ielmini, Ming Liu

We propose a binary stochastic learning algorithm that modifies all elementary neural network operations by introducing (i) stochastic binarization of both the forwarding signals and the activation function derivatives, (ii) signed binarization of the backpropagating errors, and (iii) step-wise weight updates.

Binarization Deep Learning

D2NT: A High-Performing Depth-to-Normal Translator

1 code implementation 24 Apr 2023 Yi Feng, Bohuan Xue, Ming Liu, Qijun Chen, Rui Fan

Surface normal holds significant importance in visual environmental perception, serving as a source of rich geometric information.

Surface Normal Estimation Vocal Bursts Intensity Prediction

Human Guided Ground-truth Generation for Realistic Image Super-resolution

1 code implementation CVPR 2023 Du Chen, Jie Liang, Xindong Zhang, Ming Liu, Hui Zeng, Lei Zhang

A human guided GT image dataset with both positive and negative samples is then constructed, and a loss function is proposed to train the Real-ISR models.

Image Enhancement Image Super-Resolution

LCE-Calib: Automatic LiDAR-Frame/Event Camera Extrinsic Calibration With A Globally Optimal Solution

1 code implementation 17 Mar 2023 Jianhao Jiao, Feiyi Chen, Hexiang Wei, Jin Wu, Ming Liu

This paper proposes an automatic checkerboard-based approach to calibrate extrinsics between a LiDAR and a frame/event camera, where four contributions are presented.

Generalized 3D Self-supervised Learning Framework via Prompted Foreground-Aware Feature Contrast

1 code implementation CVPR 2023 Kangcheng Liu, Xinhu Zheng, Chaoqun Wang, Kai Tang, Ming Liu, Baoquan Chen

The second is that we prevent over-discrimination between 3D segments/objects and encourage grouped foreground-to-background distinctions at the segment level with adaptive feature learning in a Siamese correspondence network, which adaptively learns feature correlations within and across point cloud views effectively.

3D Semantic Segmentation Contrastive Learning +8

An Efficient Approach to the Online Multi-Agent Path Finding Problem by Using Sustainable Information

no code implementations 11 Jan 2023 Mingkai Tang, Boyi Liu, Yuanhang Li, Hongji Liu, Ming Liu, Lujia Wang

The low-level solver, the Sustainable Reverse Safe Interval Path Planning algorithm (SRSIPP), is an efficient single-agent solver that uses previous planning context to reduce duplicate calculations.

Computational Efficiency Multi-Agent Path Finding

Physics-Guided ISO-Dependent Sensor Noise Modeling for Extreme Low-Light Photography

1 code implementation CVPR 2023 Yue Cao, Ming Liu, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo

Although deep neural networks have achieved astonishing performance in many vision tasks, existing learning-based methods are far inferior to the physical model-based solutions in extreme low-light sensor noise modeling.

Image Denoising

GreenEyes: An Air Quality Evaluating Model based on WaveNet

1 code implementation 8 Dec 2022 Kan Huang, Kai Zhang, Ming Liu

Accompanying rapid industrialization, humans are suffering from serious air pollution problems.

Self-Supervised Image Restoration with Blurry and Noisy Pairs

1 code implementation 14 Nov 2022 Zhilu Zhang, Rongjian Xu, Ming Liu, Zifei Yan, WangMeng Zuo

By learning in a collaborative manner, the deblurring and denoising tasks in our method can benefit each other.

Deblurring Denoising +1

BigCilin: An Automatic Chinese Open-domain Knowledge Graph with Fine-grained Hypernym-Hyponym Relations

no code implementations 7 Nov 2022 Ming Liu, Yaojia LV, Jingrun Zhang, Ruiji Fu, Bing Qin

One is that it supports querying any Chinese named entity and browsing the extracted hypernym-hyponym paths surrounding the query entity.

How Far are We from Robust Long Abstractive Summarization?

1 code implementation 30 Oct 2022 Huan Yee Koh, Jiaxin Ju, He Zhang, Ming Liu, Shirui Pan

For long document abstractive models, we show that constantly striving for state-of-the-art ROUGE results can lead us to generate more relevant summaries but not factual ones.

Abstractive Text Summarization

Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia

1 code implementation 28 Oct 2022 Haojie Pan, Zepeng Zhai, Yuzhou Zhang, Ruiji Fu, Ming Liu, Yangqiu Song, Zhongyuan Wang, Bing Qin

In this paper, we propose Kuaipedia, a large-scale multi-modal encyclopedia consisting of items, aspects, and short videos linked to them, which was extracted from billions of videos of Kuaishou (Kwai), a well-known short-video platform in China.

Entity Linking Entity Typing

RNGDet++: Road Network Graph Detection by Transformer with Instance Segmentation and Multi-scale Features Enhancement

no code implementations 21 Sep 2022 Zhenhua Xu, Yuxuan Liu, Yuxiang Sun, Ming Liu, Lujia Wang

To annotate road network graphs effectively and efficiently, automatic algorithms for road network graph detection are demanded.

Autonomous Driving