no code implementations • 13 Feb 2025 • Xiaohong Liu, Xulong Zhao, Gang Liu, Zili Wu, Tao Wang, Lei Meng, YuHan Wang
3D Multi-Object Tracking (MOT) provides the trajectories of surrounding objects, assisting robots or vehicles in smarter path planning and obstacle avoidance.
no code implementations • 31 Oct 2024 • Jinlong He, Pengfei Li, Gang Liu, Shenjun Zhong
Multimodal Large Language Models (MLLMs) inherit the superior text understanding capabilities of LLMs and extend these capabilities to multimodal scenarios.
no code implementations • 31 Oct 2024 • Chenxin Tu, Xiaowei Cui, Gang Liu, Sihao Zhao, Mingquan Lu
Extensive numerical simulations validate our theoretical analysis and demonstrate the effectiveness of the proposed method, highlighting its superiority over state-of-the-art approaches across various scenarios.
no code implementations • 14 Oct 2024 • Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Hui Wang
Humor is a culturally nuanced aspect of human language that presents challenges for understanding and generation, requiring participants to possess good creativity and strong associative thinking.
1 code implementation • 9 Oct 2024 • Fangwei Zhu, Dian Li, Jiajun Huang, Gang Liu, Hui Wang, Zhifang Sui
After selecting a chip for classification, all layers subsequent to the attached layer could be removed with marginal performance loss.
1 code implementation • 5 Oct 2024 • Gang Liu, Michael Sun, Wojciech Matusik, Meng Jiang, Jie Chen
While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design.
1 code implementation • 26 Aug 2024 • Jiajun Fei, Dian Li, Zhidong Deng, Zekun Wang, Gang Liu, Hui Wang
To address these limitations, we apply cross-attention layers in the intermediate projector between the visual encoder and the large language model (LLM).
1 code implementation • 17 Jun 2024 • Gang Liu, Srijit Seal, John Arevalo, Zhenwen Liang, Anne E. Carpenter, Meng Jiang, Shantanu Singh
A sufficiency objective decodes the representation to align with different feature spaces from the molecule's neighborhood in the context graph.
no code implementations • 5 Mar 2024 • Gang Liu, Hongyang Li, Zerui He, Shenjun Zhong
In this paper, we introduce a method that incorporates gradient-guided parameter perturbations to the visual encoder of the multimodality model during both pre-training and fine-tuning phases, to improve model generalization for downstream medical VQA tasks.
no code implementations • 5 Mar 2024 • Zhiding Liang, Gang Liu, Zheyuan Liu, Jinglei Cheng, Tianyi Hao, Kecheng Liu, Hang Ren, Zhixin Song, Ji Liu, Fanny Ye, Yiyu Shi
In recent years, quantum computing has emerged as a transformative force in the field of combinatorial optimization, offering novel approaches to tackling complex problems that have long challenged classical computational methods.
no code implementations • 6 Feb 2024 • Zhenwen Liang, Kehan Guo, Gang Liu, Taicheng Guo, Yujun Zhou, Tianyu Yang, Jiajun Jiao, Renjie Pi, Jipeng Zhang, Xiangliang Zhang
The paper introduces SceMQA, a novel benchmark for scientific multimodal question answering at the college entrance level.
1 code implementation • 24 Jan 2024 • Gang Liu, Jiaxin Xu, Tengfei Luo, Meng Jiang
Inverse molecular design with diffusion models holds great potential for advancements in material and drug discovery.
1 code implementation • 5 Jan 2024 • Jinlong He, Pengfei Li, Gang Liu, Genrong He, Zhaolin Chen, Shenjun Zhong
In this paper, we propose a parameter-efficient framework for fine-tuning MLLMs, specifically validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks, using public benchmark datasets.
Ranked #1 on Medical Visual Question Answering on VQA-RAD (using extra training data)
Medical Report Generation • Medical Visual Question Answering
1 code implementation • 5 Dec 2023 • Bowen Jin, Gang Liu, Chi Han, Meng Jiang, Heng Ji, Jiawei Han
Besides, although LLMs have shown their pure text-based reasoning ability, it is underexplored whether such ability can be generalized to graphs (i.e., graph-based reasoning).
no code implementations • 30 Oct 2023 • Noah Ziems, Gang Liu, John Flanagan, Meng Jiang
Finally, we show that LLM-generated decision tree explanations correlate highly with human ratings of readability, quality, and use of background knowledge while simultaneously providing better understanding of decision boundaries.
1 code implementation • 8 Sep 2023 • Eric Inae, Gang Liu, Meng Jiang
Attribute reconstruction is used to predict node or edge features in the pre-training of graph neural networks.
1 code implementation • 11 Jul 2023 • Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
Medical visual question answering (VQA) is a challenging task that requires answering clinical questions about a given medical image by taking both visual and language information into consideration.
Ranked #1 on Medical Visual Question Answering on PathVQA
1 code implementation • 25 May 2023 • Jiahao Tan, Yipeng Zhou, Gang Liu, Jessie Hui Wang, Shui Yu
More specifically, we decouple a NN model into a personalized feature extractor, obtained by aggregating models from similar clients, and a classifier, which is obtained by local training and used to estimate client similarity.
1 code implementation • 20 May 2023 • Gang Liu, Tong Zhao, Eric Inae, Tengfei Luo, Meng Jiang
The training data balance is achieved by (1) pseudo-labeling more graphs for under-represented labels with a novel regression confidence measurement and (2) augmenting graph examples in latent space for remaining rare labels after data balancing with pseudo-labels.
1 code implementation • 17 Mar 2023 • Gang Liu, Eric Inae, Tong Zhao, Jiaxin Xu, Tengfei Luo, Meng Jiang
A conventional approach is training a model with the unlabeled graphs on self-supervised tasks and then fine-tuning the model on the prediction tasks.
1 code implementation • 1 Mar 2023 • Sen Yang, Wen Heng, Gang Liu, Guozhong Luo, Wankou Yang, Gang Yu
In this paper we present a novel method to estimate 3D human pose and shape from monocular videos.
Ranked #45 on 3D Human Pose Estimation on 3DPW
no code implementations • 17 Feb 2023 • Mufan Sang, Yong Zhao, Gang Liu, John H. L. Hansen, Jian Wu
The proposed models achieve 0.75% EER on the VoxCeleb1 test set, outperforming the previously proposed Transformer-based models and CNN-based models, such as ResNet34 and ECAPA-TDNN.
no code implementations • 30 Dec 2022 • Wan Jiang, Gang Liu, Xiaofeng Chen, Yipeng Zhou
Unlike traditional distributed machine learning, federated learning stores data locally for training and then aggregates the models on the server, which solves the data security problem that may arise in traditional distributed machine learning.
1 code implementation • 16 Dec 2022 • Xiaoxiang Han, Yiman Liu, Gang Liu, Yuanjie Lin, Qiaohong Liu
In order to make the model lightweight and improve the model accuracy, a Lightweight Network Using Object Attention (LOANet) for Buildings and Roads from UAV Aerial Remote Sensing Images is proposed.
2 code implementations • 24 Nov 2022 • Pengfei Li, Gang Liu, Lin Tan, Jinying Liao, Shenjun Zhong
Medical image visual question answering (VQA) is the task of answering clinical questions about a given radiographic image, a challenging problem that requires a model to integrate both vision and language information.
Ranked #3 on Medical Visual Question Answering on PathVQA
no code implementations • 23 Nov 2022 • Xinyuan An, Xiaowei Cui, Sihao Zhao, Gang Liu, Mingquan Lu
Mounting multiple tags at different positions of the AGV to collect more TOFs is a feasible solution to tackle this difficulty.
1 code implementation • 19 Sep 2022 • Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
In this work, we present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from in-the-wild videos with a moving camera.
no code implementations • 17 Jul 2022 • Xinyuan An, Sihao Zhao, Xiaowei Cui, Gang Liu, Mingquan Lu
For radio-based time-of-arrival (TOA) positioning systems applied in harsh environments, obstacles in the surroundings and on the vehicle itself will block the signals from the anchors, reduce the number of available TOA measurements and thus degrade the localization performance.
no code implementations • 9 Jul 2022 • Gang Liu, Zhihan Zhang, Zheng Ning, Meng Jiang
To enable explainability, recent techniques such as ACCENT and FIA are looking for counterfactual explanations that are specific historical actions of a user, the removal of which leads to a change to the recommendation result.
1 code implementation • 6 Jun 2022 • Gang Liu, Tong Zhao, Jiaxin Xu, Tengfei Luo, Meng Jiang
Rationale is defined as a subset of input features that best explains or supports the prediction by machine learning models.
Ranked #1 on Graph Regression on GlassTemp
no code implementations • 27 Apr 2022 • Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Peidong Wang, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei
Recently, self-supervised learning (SSL) has demonstrated strong performance in speaker recognition, even if the pre-training objective is designed for speech recognition.
1 code implementation • 17 Feb 2022 • Tong Zhao, Wei Jin, Yozen Liu, Yingheng Wang, Gang Liu, Stephan Günnemann, Neil Shah, Meng Jiang
Overall, our work aims to clarify the landscape of existing literature in graph data augmentation and motivates additional work in this area, providing a helpful resource for researchers and practitioners in the broader graph machine learning domain.
no code implementations • 26 Oct 2021 • Lei You, Hui Ma, Tapan Kumar Saha, Gang Liu
This paper proposes a distributionally robust optimal power flow (OPF) model for transmission grids with wind power generation.
1 code implementation • NeurIPS 2021 • Tong Zhao, Gang Liu, Daheng Wang, Wenhao Yu, Meng Jiang
However, the causal relationship between the two variables was largely ignored for learning to predict links on a graph.
Ranked #1 on Link Property Prediction on ogbl-ddi
no code implementations • 23 Dec 2020 • Jérôme Weiss, Peng Zhang, Oguz Umut Salman, Gang Liu, Lev Truskinovsky
We link this new size effect with other related phenomena like size dependence of strength ("smaller is stronger") and the size induced switch between different hardening mechanisms.
Mesoscale and Nanoscale Physics • Materials Science • Statistical Mechanics • Computational Physics
1 code implementation • 4 Jun 2020 • Gang Liu, Yajing Pang, Shuai Yin, Xiaoke Niu, Jing Wang, Hong Wan
Significance: DD with AC can be used for most engineering systems, such as sensor systems, and will speed up computation in these online systems.
3 code implementations • International Symposium on Intelligence Computation and Applications 2020 • Jie Chen, Gang Liu, Xin Chen
Existing methods usually have several significant problems: 1) the generated images have no obvious animated style textures; 2) the generated images lose the content of the original images; 3) the network parameters require a large memory capacity.
2 code implementations • 25 Apr 2020 • Gang Liu, Jing Wang
However, this link has yet to be understood due to the complexity of the human hand.
1 code implementation • 8 Apr 2020 • Gang Liu, Jing Wang
The main contribution of this paper is the basic machine learning algorithm (DD) with a white-box attribute, controllable precision for better generalization capability, and lower computational complexity.
no code implementations • CVPR 2019 • Yu Yu, Gang Liu, Jean-Marc Odobez
In this work, we address the problem of person-specific gaze model adaptation from only a few reference training samples.
no code implementations • 20 Apr 2019 • Gang Liu, Yu Yu, Kenneth A. Funes Mora, Jean-Marc Odobez
Non-invasive gaze estimation methods usually regress gaze directions directly from a single face or eye image.
no code implementations • 13 Nov 2018 • Ruchao Fan, Pan Zhou, Wei Chen, Jia Jia, Gang Liu
In previous work, researchers have shown that such architectures can acquire comparable results to state-of-the-art ASR systems, especially when using a bidirectional encoder and global soft attention (GSA) mechanism.
Automatic Speech Recognition • Automatic Speech Recognition (ASR)
no code implementations • 27 Oct 2017 • Fei Tao, Gang Liu
In this study, we propose a new variation of LSTM, advanced LSTM (A-LSTM), for better temporal context modeling.
1 code implementation • 19 Sep 2017 • He Zhao, Lan Du, Wray Buntine, Gang Liu
Besides the text content, documents and their associated words usually come with rich sets of meta information, such as categories of documents and semantic/syntactic features of words, like those encoded in word embeddings.
no code implementations • 10 Feb 2017 • Gui-Song Xia, Gang Liu, Xiang Bai, Liangpei Zhang
In contrast with existing works, the proposed method not only inherits the strong ability to depict geometrical aspects of textures and the high robustness to variations of imaging conditions from the shape-based method, but also provides a flexible way to consider shape relationships and to compute high-order statistics on the tree.
2 code implementations • 4 May 2016 • Gang Liu, Yann Gousseau, Gui-Song Xia
This paper presents a significant improvement for the synthesis of texture images using convolutional neural networks (CNNs), making use of constraints on the Fourier spectrum of the results.
no code implementations • 17 Jan 2015 • Gui-Song Xia, Gang Liu, Wen Yang
The segmentation of synthetic aperture radar (SAR) images is a longstanding yet challenging task, not only because of the presence of speckle, but also due to the variations of surface backscattering properties in the images.
no code implementations • 24 Dec 2013 • Gang Liu, Ting-Zhu Huang, Xiao-Guang Lv, Jun Liu
To solve this kind of ill-posed problem, a regularization term (i.e., regularizer) should be introduced, under the assumption that the solutions have some specific properties, such as sparsity and group sparsity.
no code implementations • 21 Dec 2013 • Gang Liu, Ting-Zhu Huang, Jun Liu, Xiao-Guang Lv
The total variation (TV) regularization method is an effective method for image deblurring that preserves edges.