1 code implementation • ACL 2022 • Xu Han, Guoyang Zeng, Weilin Zhao, Zhiyuan Liu, Zhengyan Zhang, Jie zhou, Jun Zhang, Jia Chao, Maosong Sun
In recent years, large-scale pre-trained language models (PLMs) containing billions of parameters have achieved promising results on various NLP tasks.
1 code implementation • ACL 2022 • Xu Han, Yuqi Luo, Weize Chen, Zhiyuan Liu, Maosong Sun, Zhou Botong, Hao Fei, Suncong Zheng
In this paper, we propose a cross-lingual contrastive learning framework to learn FGET models for low-resource languages.
1 code implementation • EMNLP (NLP-COVID19) 2020 • Adam Poliak, Max Fleming, Cash Costello, Kenton Murray, Mahsa Yarmohammadi, Shivani Pandya, Darius Irani, Milind Agarwal, Udit Sharma, Shuo Sun, Nicola Ivanov, Lingxi Shang, Kaushik Srinivasan, Seolhwa Lee, Xu Han, Smisha Agarwal, João Sedoc
We release a dataset of over 2,100 COVID-19-related frequently asked question-answer pairs scraped from over 40 trusted websites.
no code implementations • 10 Apr 2025 • Yi Huang, Ke Zhang, Wei Liu, Yuanyuan Wang, Vishal M. Patel, Le Lu, Xu Han, Dakai Jin, Ke Yan
Finally, we introduce a topology-preserving loss function that leverages contextual and shape priors to balance the growth and suppression of tubular structures, which also allows the model to handle low-quality and incomplete annotations.
no code implementations • 3 Apr 2025 • Zidong Yu, Shuo Wang, Nan Jiang, Weiqiang Huang, Xu Han, Junliang Du
Harmful text detection has become a crucial task in the development and deployment of large language models, especially as AI-generated content continues to expand across digital platforms.
1 code implementation • 24 Mar 2025 • Xu Han, Yuan Tang, Jinfeng Xu, Xianzhi Li
We introduce Monarch Sparse Tuning (MoST), the first reparameterization-based parameter-efficient fine-tuning (PEFT) method tailored for 3D representation learning.
1 code implementation • 12 Mar 2025 • Yingfa Chen, Yutong Wu, Xu Han, Zhiyuan Liu, Maosong Sun
However, they overlook the impacts of context length and attention head configuration (the number of query and key-value heads in grouped-query attention) on training and inference.
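For readers unfamiliar with the head configuration being varied here, the sketch below shows grouped-query attention in miniature: several query heads share each key-value head, which shrinks the KV cache. This is a minimal NumPy illustration under assumed shapes, not the paper's implementation.

```python
# Minimal grouped-query attention (GQA) sketch; shapes and names are illustrative.
import numpy as np

def gqa_attention(q, k, v, n_q_heads, n_kv_heads):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads shares one KV head."""
    group = n_q_heads // n_kv_heads
    d = q.shape[-1]
    outs = []
    for h in range(n_q_heads):
        kv = h // group                        # the shared KV head for this query head
        scores = q[h] @ k[kv].T / np.sqrt(d)
        w = np.exp(scores - scores.max(-1, keepdims=True))
        w /= w.sum(-1, keepdims=True)          # softmax over key positions
        outs.append(w @ v[kv])
    return np.stack(outs)

# Example: 8 query heads sharing 2 KV heads (a 4x smaller KV cache).
q = np.random.randn(8, 16, 32)
k = np.random.randn(2, 16, 32)
v = np.random.randn(2, 16, 32)
print(gqa_attention(q, k, v, 8, 2).shape)  # (8, 16, 32)
```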
no code implementations • 9 Mar 2025 • Jiangtao Duan, Jushan Bai, Xu Han
Under a rotational change, unlike in the single-breakpoint case, the pseudo factor covariance matrix within each regime can be either full rank or singular, yet the QML estimation error for the breakpoints remains stably bounded.
1 code implementation • 20 Feb 2025 • Jianling Li, Shangzhan Li, Zhenye Gao, Qi Shi, YuXuan Li, Zefan Wang, Jiacheng Huang, Haojie Wang, Jianrong Wang, Xu Han, Zhiyuan Liu, Maosong Sun
Despite advances in large language models (LLMs) for conventional code generation, these models struggle to generate accurate, performance-optimized Triton code, as they lack awareness of its specifications and the complexities of GPU programming.
no code implementations • 20 Feb 2025 • Weilin Zhao, Tengyu Pan, Xu Han, Yudi Zhang, Ao Sun, Yuxiang Huang, Kaihuo Zhang, Weilun Zhao, YuXuan Li, Jianyong Wang, Zhiyuan Liu, Maosong Sun
Speculative sampling has emerged as an important technique for accelerating the auto-regressive generation process of large language models (LLMs) by utilizing a draft-then-verify mechanism to produce multiple tokens per forward pass.
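As a quick illustration of the draft-then-verify mechanism described here, the sketch below uses toy stand-in models and greedy verification; real systems score all drafted positions with a single target-model forward pass and apply a rejection-sampling acceptance rule, which this sketch only emulates.

```python
# Toy draft-then-verify loop behind speculative sampling. Both "models" are
# stand-in callables mapping a token prefix to the next token.
def speculative_decode(target_model, draft_model, prompt, n_draft=4, max_len=32):
    tokens = list(prompt)
    while len(tokens) < max_len:
        # 1) Draft: the cheap model proposes n_draft tokens autoregressively.
        draft = []
        for _ in range(n_draft):
            draft.append(draft_model(tokens + draft))
        # 2) Verify: keep the longest prefix the target model agrees with.
        accepted = 0
        for i, t in enumerate(draft):
            if target_model(tokens + draft[:i]) != t:
                break
            accepted += 1
        tokens += draft[:accepted]
        # 3) The target model always contributes one token past the mismatch,
        #    so each iteration emits between 1 and n_draft + 1 tokens.
        tokens.append(target_model(tokens))
    return tokens

# Toy models over integer tokens: the draft mostly agrees with the target
# but deviates at some prefix lengths.
target = lambda ts: len(ts) % 5
draft = lambda ts: len(ts) % 5 if len(ts) % 7 else 0
print(speculative_decode(target, draft, prompt=[0, 1]))
```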
1 code implementation • 17 Feb 2025 • Yuxiang Huang, Mingye Li, Xu Han, Chaojun Xiao, Weilin Zhao, Sun Ao, Hao Zhou, Jie zhou, Zhiyuan Liu, Maosong Sun
While long-context inference is crucial for advancing large language model (LLM) applications, its prefill speed remains a significant bottleneck.
1 code implementation • 15 Feb 2025 • Zeli Su, Ziyin Zhang, Guixian Xu, Jianing Liu, Xu Han, Ting Zhang, Yushuang Dong
While multilingual language models like XLM-R have advanced multilingualism in NLP, they still perform poorly in extremely low-resource languages.
1 code implementation • 3 Feb 2025 • Ganqu Cui, Lifan Yuan, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan YAO, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, BoWen Zhou, Ning Ding
While dense rewards also offer an appealing choice for the reinforcement learning (RL) of LLMs since their fine-grained rewards have the potential to address some inherent issues of outcome rewards, such as training efficiency and credit assignment, this potential remains largely unrealized.
no code implementations • 9 Jan 2025 • Zengqi Peng, Yubin Wang, Xu Han, Lei Zheng, Jun Ma
LearningFlow includes a curriculum sequence generation process and a reward generation process, which work in tandem to guide the RL policy by generating tailored training curricula and reward functions.
no code implementations • 5 Dec 2024 • Chaojun Xiao, Jie Cai, Weilin Zhao, Guoyang Zeng, Biyuan Lin, Jie zhou, Zhi Zheng, Xu Han, Zhiyuan Liu, Maosong Sun
This paper introduces the concept of "capacity density" as a new metric to evaluate the quality of LLMs across different scales and describes the trend of LLMs in terms of both effectiveness and efficiency.
1 code implementation • 25 Nov 2024 • Qiao Yu, Xianzhi Li, Yuan Tang, Xu Han, Long Hu, Yixue Hao, Min Chen
The fidelity enhancement module deforms the 3D mesh to match the input image.
1 code implementation • 22 Nov 2024 • Zheni Zeng, Yuxuan Chen, Shi Yu, Ruobing Wang, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
Humans can utilize techniques to quickly acquire knowledge from specific materials in advance, such as creating self-assessment questions, enabling us to achieve related tasks more efficiently.
no code implementations • 18 Nov 2024 • Chuang Yang, Xu Han, Tao Han, Yuejiao Su, Junyu Gao, Hongyuan Zhang, Yi Wang, Lap-Pui Chau
Meanwhile, we develop a traffic guidance assistant (TGA) scenario application to re-explore the role of traffic signs in ADS as a complement to popular autonomous technologies (such as obstacle perception).
1 code implementation • 5 Nov 2024 • Xu Han, Junyu Gao, Chuang Yang, Yuan Yuan, Qi Wang
In addition, to validate the scene robustness of the SM-Net, we conduct experiments on traffic, industrial, and natural scene datasets.
1 code implementation • 4 Nov 2024 • Yuqi Luo, Chenyang Song, Xu Han, Yingfa Chen, Chaojun Xiao, Zhiyuan Liu, Maosong Sun
These demonstrate that ReLU is more efficient as the activation function than SiLU and can leverage more training data to improve activation sparsity.
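The efficiency claim rests on ReLU producing exact zeros while SiLU does not; the toy measurement below (not the paper's protocol) makes that difference concrete.

```python
# Fraction of exactly-zero hidden activations under ReLU vs. SiLU,
# for a random MLP layer; purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1024, 256))
W = rng.standard_normal((256, 1024)) / 16

h = x @ W
relu = np.maximum(h, 0.0)
silu = h / (1.0 + np.exp(-h))  # SiLU(x) = x * sigmoid(x)

sparsity = lambda a: float((a == 0.0).mean())
print(f"ReLU sparsity: {sparsity(relu):.2%}")  # ~50% exact zeros
print(f"SiLU sparsity: {sparsity(silu):.2%}")  # ~0% exact zeros
```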
1 code implementation • 22 Oct 2024 • Xu Han, Linghao Jin, Xiaofeng Liu, Paul Pu Liang
Despite the impressive text-to-image (T2I) synthesis capabilities of diffusion models, they often struggle to understand compositional relationships between objects and attributes, especially in complex settings.
no code implementations • 17 Oct 2024 • Xu Han, Yuancheng Sun, Kai Chen, Kang Liu, Qiwei Ye
Coarse-grained (CG) molecular dynamics simulations offer computational efficiency for exploring protein conformational ensembles and thermodynamic properties.
1 code implementation • 14 Oct 2024 • Shi Yu, Chaoyue Tang, Bokai Xu, Junbo Cui, Junhao Ran, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
In this pipeline, instead of first parsing the document to obtain text, the document is directly embedded as an image using a VLM and then retrieved to enhance the generation of a VLM.
1 code implementation • 12 Oct 2024 • Zihan Zhou, Chong Li, Xinyi Chen, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Rongqiao An, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun
The proposed LLM×MapReduce framework splits the entire document into several chunks for LLMs to read and then aggregates the intermediate answers to produce the final output.
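A minimal sketch of this split-then-aggregate pattern is shown below; `ask_llm` is a placeholder for any LLM call, and the map/reduce prompts are illustrative rather than the framework's actual prompts.

```python
# Split a long document into chunks (map), answer per chunk, then
# aggregate the intermediate answers into a final output (reduce).
def map_reduce_qa(document, question, ask_llm, chunk_size=2000):
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    # Map: each chunk is read independently to produce an intermediate answer.
    partials = [ask_llm(f"Context:\n{c}\n\nQuestion: {question}\nAnswer briefly:")
                for c in chunks]
    # Reduce: intermediate answers are aggregated into the final output.
    joined = "\n".join(f"- {p}" for p in partials)
    return ask_llm(f"Combine these partial answers into one answer.\n{joined}\n"
                   f"Question: {question}")

# Usage with a stub LLM that just echoes part of its prompt:
stub = lambda prompt: prompt.splitlines()[-1][:40]
print(map_reduce_qa("lorem ipsum " * 1000, "What is discussed?", stub))
```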
1 code implementation • 11 Oct 2024 • Ruobing Wang, Daren Zha, Shi Yu, Qingfei Zhao, Yuxuan Chen, YiXuan Wang, Shuo Wang, Yukun Yan, Zhenghao Liu, Xu Han, Zhiyuan Liu, Maosong Sun
Retrieval-Augmented Generation (RAG) mitigates the factual errors and hallucinated outputs produced by Large Language Models (LLMs) in open-domain question answering (OpenQA) by introducing external knowledge.
no code implementations • 9 Oct 2024 • Yingfa Chen, Xinrong Zhang, Shengding Hu, Xu Han, Zhiyuan Liu, Maosong Sun
For the second concern, we train a series of Mamba-2 models on long documents to empirically estimate the recurrent state capacity in language modeling and passkey retrieval.
1 code implementation • 4 Oct 2024 • Jiecheng Lu, Xu Han, Yan Sun, Shihao Yang
We propose an Autoregressive (AR) Moving-average (MA) attention structure that can adapt to various linear attention mechanisms, enhancing their ability to capture long-range and local temporal patterns in time series.
1 code implementation • 4 Oct 2024 • Zhengyan Zhang, Chaojun Xiao, Qiujieli Qin, Yankai Lin, Zhiyuan Zeng, Xu Han, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Jie zhou
SSD adaptively switches between Mixtures-of-Experts (MoE) based sparse training and conventional dense training during the pre-training process, leveraging the efficiency of sparse training while avoiding its static activation correlations.
1 code implementation • 2 Oct 2024 • Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu
Scaling the input context length of a large language model (LLM) incurs a significant increase in computation cost and memory footprint to maintain the attention key-value (KV) cache.
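The cost being described can be made concrete with back-of-the-envelope arithmetic; the standard KV-cache formula below uses an assumed 7B-class model shape, not figures from the paper.

```python
# KV-cache footprint: keys and values are cached per layer, per KV head,
# per position (hence the factor of 2).
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# E.g., an assumed 7B-class shape (32 layers, 32 KV heads, head_dim 128) in fp16:
for seq_len in (4_096, 32_768, 128_000):
    gb = kv_cache_bytes(32, 32, 128, seq_len) / 1e9
    print(f"{seq_len:>7} tokens -> {gb:5.1f} GB of KV cache")
```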
no code implementations • 25 Sep 2024 • Xu Han, Junyu Gao, Chuang Yang, Yuan Yuan, Qi Wang
The irregular contour representation is one of the tough challenges in scene text detection.
no code implementations • 25 Sep 2024 • Xu Han, Junyu Gao, Chuang Yang, Yuan Yuan, Qi Wang
The latter extracts region-level information and encourages the model to focus on the distribution of positive samples in the vicinity of a pixel, thereby perceiving environmental information.
no code implementations • 18 Sep 2024 • Wang Xu, Shuo Wang, Weilin Zhao, Xu Han, Yukun Yan, Yudi Zhang, Zhe Tao, Zhiyuan Liu, Wanxiang Che
To address this limitation, researchers have proposed duplex models.
no code implementations • 5 Sep 2024 • Jifan Yu, Zheyuan Zhang, Daniel Zhang-li, Shangqing Tu, Zhanxin Hao, Rui Miao Li, Haoxuan Li, Yuanchun Wang, Hanming Li, Linlu Gong, Jie Cao, Jiayin Lin, Jinchang Zhou, Fei Qin, Haohua Wang, Jianxiao Jiang, Lijun Deng, Yisi Zhan, Chaojun Xiao, Xusheng Dai, Xuan Yan, Nianyi Lin, Nan Zhang, Ruixin Ni, Yang Dang, Lei Hou, Yu Zhang, Xu Han, Manli Li, Juanzi Li, Zhiyuan Liu, Huiqin Liu, Maosong Sun
Since the first instances of online education, where courses were uploaded to accessible and shared online platforms, this form of scaling the dissemination of human knowledge to reach a broader audience has sparked extensive discussion and widespread adoption.
no code implementations • 4 Sep 2024 • Chaojun Xiao, Zhengyan Zhang, Chenyang Song, Dazhi Jiang, Feng Yao, Xu Han, Xiaozhi Wang, Shuo Wang, Yufei Huang, GuanYu Lin, Yingfa Chen, Weilin Zhao, Yuge Tu, Zexuan Zhong, Ao Zhang, Chenglei Si, Khai Hao Moo, Chenyang Zhao, Huimin Chen, Yankai Lin, Zhiyuan Liu, Jingbo Shang, Maosong Sun
We first formalize modules into emergent bricks (functional neuron partitions that emerge during the pre-training phase) and customized bricks (bricks constructed via additional post-training to improve the capabilities and knowledge of LLMs).
1 code implementation • 2 Sep 2024 • Yingfa Chen, Chenlong Hu, Cong Feng, Chenyang Song, Shi Yu, Xu Han, Zhiyuan Liu, Maosong Sun
This study presents a multi-modal multi-granularity tokenizer specifically designed for analyzing ancient Chinese scripts, focusing on the Chu bamboo slip (CBS) script used during the Spring and Autumn and Warring States period (771-256 BCE) in Ancient China.
no code implementations • 28 Aug 2024 • Zhaoxuan Wang, Xu Han, Hongxin Liu, Xianzhi Li
Particularly, our RIDE is compatible with and easy to plug into existing one-stage and two-stage 3D detectors, and it boosts both detection performance and rotation robustness.
1 code implementation • 28 Aug 2024 • Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Jinfeng Xu, Yixue Hao, Long Hu, Min Chen
To address this task, we introduce GreenPLM, which leverages more text data to compensate for the lack of 3D data.
1 code implementation • 23 Aug 2024 • Tianze Zheng, Ailun Wang, Xu Han, Yu Xia, Xingyuan Xu, Jiawei Zhan, Yu Liu, Yang Chen, Zhi Wang, Xiaojie Wu, Sheng Gong, Wen Yan
A force field is a critical component in molecular dynamics simulations for computational drug discovery.
no code implementations • 19 Aug 2024 • Xu Han, Felix Yu, Joao Sedoc, Benjamin Van Durme
Using the results of IBWS as the best-desired outcome, we evaluate various direct assessment methods to determine which is both cost-efficient and best correlated with a large-scale BWS annotation strategy.
1 code implementation • 12 Aug 2024 • Yufei Huang, Xu Han, Maosong Sun
Open Domain Question Answering (ODQA) has been advancing rapidly in recent times, driven by significant developments in dense passage retrieval and pretrained language models.
2 code implementations • 3 Aug 2024 • Yuan YAO, Tianyu Yu, Ao Zhang, Chongyi Wang, Junbo Cui, Hongji Zhu, Tianchi Cai, Haoyu Li, Weilin Zhao, Zhihui He, Qianyu Chen, Huarong Zhou, Zhensheng Zou, Haoye Zhang, Shengding Hu, Zhi Zheng, Jie zhou, Jie Cai, Xu Han, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun
The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally reshaped the landscape of AI research and industry, shedding light on a promising path toward the next AI milestone.
Ranked #7 on Multiple-choice on Neptune-Full
1 code implementation • 2 Aug 2024 • Kunlun Zhu, Yifan Luo, Dingling Xu, Yukun Yan, Zhenghao Liu, Shi Yu, Ruobing Wang, Shuo Wang, Yishan Li, Nan Zhang, Xu Han, Zhiyuan Liu, Maosong Sun
However, evaluating the effectiveness of RAG systems in specialized scenarios remains challenging due to the high costs of data construction and the lack of suitable evaluation metrics.
no code implementations • 18 Jul 2024 • Wei Huang, Wei Liu, XiaoMing Zhang, Xiaoli Yin, Xu Han, Chunli Li, Yuan Gao, Yu Shi, Le Lu, Ling Zhang, Lei Zhang, Ke Yan
The early detection and precise diagnosis of liver tumors are tasks of critical clinical value, yet they pose significant challenges due to the high heterogeneity and variability of liver tumors.
no code implementations • 17 Jul 2024 • Xianda Chen, PakHin Tiu, Xu Han, Junjie Chen, Yuanfei Wu, Xinhu Zheng, Meixin Zhu
The continual evolution of autonomous driving technology requires car-following models that can adapt to diverse and dynamic traffic environments.
no code implementations • 2 Jul 2024 • Xu Han, Linghao Jin, Xuezhe Ma, Xiaofeng Liu
Moreover, the growing reliance on multi-modal generation exacerbates this issue because of its susceptibility to adversarial attacks.
no code implementations • 23 Jun 2024 • Xianda Chen, Xu Han, Meixin Zhu, Xiaowen Chu, PakHin Tiu, Xinhu Zheng, Yinhai Wang
To overcome these limitations, we propose the Editable Behavior Generation (EBG) model, a data-driven car-following model that allows for adjusting driving discourtesy levels.
1 code implementation • 22 Jun 2024 • Qiao Yu, Xianzhi Li, Yuan Tang, Xu Han, Jinfeng Xu, Long Hu, Min Chen
Regarding this, we propose PointDreamer, a novel framework for textured mesh reconstruction from colored point cloud via diffusion-based 2D inpainting.
1 code implementation • 22 Jun 2024 • Xinrong Zhang, Yingfa Chen, Shengding Hu, Xu Han, Zihang Xu, Yuanwei Xu, Weilin Zhao, Maosong Sun, Zhiyuan Liu
Specifically, we divide the queries and responses of conversations into several time slices and then adopt a time-division-multiplexing (TDM) encoding-decoding strategy to pseudo-simultaneously process these slices.
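A minimal sketch of the slicing-and-interleaving idea follows; the slice length and interleaving order here are assumptions for illustration, not the paper's exact scheme.

```python
# Time-division-multiplexed slicing for a duplex conversation: queries and
# responses are cut into slices and interleaved so one autoregressive model
# can process them pseudo-simultaneously.
def tdm_interleave(query_stream, response_stream, slice_len=4):
    def slices(s):
        return [s[i:i + slice_len] for i in range(0, len(s), slice_len)]
    q, r = slices(query_stream), slices(response_stream)
    timeline = []
    for i in range(max(len(q), len(r))):
        if i < len(q):
            timeline.append(("IN", q[i]))    # incoming query slice
        if i < len(r):
            timeline.append(("OUT", r[i]))   # generated response slice
    return timeline

for tag, chunk in tdm_interleave("how are you today", "fine thanks"):
    print(tag, repr(chunk))
```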
no code implementations • 21 Jun 2024 • Xu Han, Fangfang Fan, Jingzhao Rong, Zhen Li, Georges El Fakhri, Qingyu Chen, Xiaofeng Liu
For evaluation, we set the BraTS18 dataset as the target dataset to be enhanced, and trained a brain magnetic resonance (MR) slice-based gender classifier on it.
no code implementations • 15 Jun 2024 • Xu Han, Qiannan Yang, Xianda Chen, Xiaowen Chu, Meixin Zhu
Reinforcement Learning (RL) plays a crucial role in advancing autonomous driving technologies by maximizing reward functions to achieve the optimal policy.
1 code implementation • 13 Jun 2024 • Bowen Ping, Shuo Wang, Hanqing Wang, Xu Han, Yuzhuang Xu, Yukun Yan, Yun Chen, Baobao Chang, Zhiyuan Liu, Maosong Sun
Motivated by the long-tail distribution of singular values in the delta weights, we propose a delta quantization approach using mixed-precision.
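In spirit, the approach keeps the few dominant singular directions of the delta weights in high precision and quantizes the long tail aggressively; the sketch below illustrates that idea with a simplified stand-in bit allocation and rounding scheme, not the paper's quantizer.

```python
# Mixed-precision treatment of delta weights guided by their singular spectrum.
import numpy as np

rng = np.random.default_rng(0)

def quantize(mat, bits):
    if bits >= 16:
        return mat  # treated as (near) full precision in this sketch
    scale = np.abs(mat).max() / (2 ** (bits - 1) - 1)
    return np.round(mat / scale) * scale

def mixed_precision_delta(delta, k_hi=8):
    # Split along the singular spectrum: a few dominant directions kept
    # in high precision, the long tail quantized to 3 bits.
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    head = (U[:, :k_hi] * S[:k_hi]) @ Vt[:k_hi]
    tail = delta - head
    return quantize(head, 16) + quantize(tail, 3)

# A synthetic delta with a long-tail singular value spectrum.
delta = (rng.standard_normal((64, 64))
         @ np.diag(np.geomspace(1.0, 1e-3, 64))
         @ rng.standard_normal((64, 64)))
approx = mixed_precision_delta(delta)
print("relative error:", np.linalg.norm(delta - approx) / np.linalg.norm(delta))
```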
1 code implementation • 2 May 2024 • Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen
Notably, MiniGPT-3D gains an 8.12-point increase in GPT-4 evaluation score on the challenging object captioning task compared to ShapeLLM-13B, while the latter costs 160 total GPU-hours on 8 A800 GPUs.
Ranked #1 on 3D Object Captioning on Objaverse
1 code implementation • 23 Apr 2024 • Xu Han, Yuan Tang, Zhaoxuan Wang, Xianzhi Li
In contrast, the newly proposed Mamba model, based on state space models (SSM), outperforms Transformer in multiple areas with only linear complexity.
1 code implementation • 11 Apr 2024 • Chaoqun He, Renjie Luo, Shengding Hu, Yuanqian Zhao, Jie zhou, Hanghao Wu, Jiajie Zhang, Xu Han, Zhiyuan Liu, Maosong Sun
The rapid development of LLMs calls for a lightweight and easy-to-use framework for swift evaluation deployment.
3 code implementations • 9 Apr 2024 • Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zheng Leng Thai, Kaihuo Zhang, Chongyi Wang, Yuan YAO, Chenyang Zhao, Jie zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun
For data scaling, we introduce a Warmup-Stable-Decay (WSD) learning rate scheduler (LRS), conducive to continuous training and domain adaptation.
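A minimal sketch of such a Warmup-Stable-Decay schedule follows; the linear decay shape and stage lengths are assumptions, not the paper's exact settings.

```python
# Warmup-Stable-Decay (WSD) learning-rate schedule: linear warmup, a long
# constant stage (convenient for continuous training), then a short decay.
def wsd_lr(step, peak_lr, warmup_steps, stable_steps, decay_steps, min_lr=0.0):
    if step < warmup_steps:                    # warmup: 0 -> peak
        return peak_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:     # stable: constant at peak
        return peak_lr
    t = (step - warmup_steps - stable_steps) / decay_steps
    return max(min_lr, peak_lr * (1.0 - min(t, 1.0)))  # decay: peak -> min

for s in (0, 500, 1000, 9000, 10500, 11000):
    print(s, round(wsd_lr(s, 1e-3, 1000, 8000, 2000), 6))
```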
1 code implementation • 26 Mar 2024 • Yingfa Chen, Zhengyan Zhang, Xu Han, Chaojun Xiao, Zhiyuan Liu, Chen Chen, Kuai Li, Tao Yang, Maosong Sun
Large language models (LLMs) can make predictions using parametric knowledge (knowledge encoded in the model weights) or contextual knowledge (knowledge presented in the context).
no code implementations • 24 Mar 2024 • Hao Xiang, Zhaoliang Zheng, Xin Xia, Runsheng Xu, Letian Gao, Zewei Zhou, Xu Han, Xinkai Ji, Mingxi Li, Zonglin Meng, Li Jin, Mingyue Lei, Zhaoyang Ma, Zihang He, Haoxuan Ma, Yunshuang Yuan, Yingqian Zhao, Jiaqi Ma
Recent advancements in Vehicle-to-Everything (V2X) technologies have enabled autonomous vehicles to share sensing information to see through occlusions, greatly boosting the perception capability.
1 code implementation • 14 Mar 2024 • Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun
Effective attention modules have played a crucial role in the success of Transformer-based large language models (LLMs), but the quadratic time and memory complexities of these attention modules also pose a challenge when processing long sequences.
1 code implementation • 4 Mar 2024 • Jiecheng Lu, Xu Han, Yan Sun, Shihao Yang
For Multivariate Time Series Forecasting (MTSF), recent deep learning applications show that univariate models frequently outperform multivariate ones.
Ranked #20 on Time Series Forecasting on ETTh1 (336) Multivariate
no code implementations • 23 Feb 2024 • Yufei Huang, Shengding Hu, Xu Han, Zhiyuan Liu, Maosong Sun
Recent studies have uncovered intriguing phenomena in deep learning, such as grokking, double descent, and emergent abilities in large language models, which challenge human intuition and are crucial for a deeper understanding of neural models.
1 code implementation • 21 Feb 2024 • Chenyang Song, Xu Han, Zhengyan Zhang, Shengding Hu, Xiyu Shi, Kuai Li, Chen Chen, Zhiyuan Liu, Guangli Li, Tao Yang, Maosong Sun
Some recent efforts have explored introducing ReLU or its variants as the substitutive activation function to help LLMs achieve activation sparsity and inference acceleration, but few can simultaneously obtain high sparsity and comparable model performance.
4 code implementations • 21 Feb 2024 • Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, JunHao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu, Maosong Sun
Processing and reasoning over long contexts is crucial for many practical applications of Large Language Models (LLMs), such as document comprehension and agent construction.
1 code implementation • 21 Feb 2024 • Weilin Zhao, Yuxiang Huang, Xu Han, Wang Xu, Chaojun Xiao, Xinrong Zhang, Yewei Fang, Kaihuo Zhang, Zhiyuan Liu, Maosong Sun
To achieve this, we introduce Ouroboros, which can generate draft phrases to parallelize the drafting process and meanwhile lengthen drafts in a training-free manner.
1 code implementation • 21 Feb 2024 • Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun
Notably, the best-performing model, GPT-4V, attains an average score of 17.97% on OlympiadBench, with a mere 10.74% in physics, highlighting the benchmark's rigor and the intricacy of physical reasoning.
2 code implementations • 19 Feb 2024 • Zengqing Wu, Run Peng, Shuyuan Zheng, Qianying Liu, Xu Han, Brian Inhyuk Kwon, Makoto Onizuka, Shaojie Tang, Chuan Xiao
Large Language Models (LLMs) have increasingly been utilized in social simulations, where they are often guided by carefully crafted instructions to stably exhibit human-like behaviors during simulations.
1 code implementation • 18 Feb 2024 • Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun
Scientific data visualization plays a crucial role in research by enabling the direct display of complex information and assisting researchers in identifying implicit patterns.
no code implementations • 18 Feb 2024 • Hanqing Wang, Bowen Ping, Shuo Wang, Xu Han, Yun Chen, Zhiyuan Liu, Maosong Sun
Most prior works on LoRA combination primarily rely on task-level weights for each involved LoRA, making different examples and tokens share the same LoRA weights.
1 code implementation • 17 Feb 2024 • Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che
Model quantization uses low-bit-width values to represent the weight matrices of existing models, a promising approach to reducing both the storage and computational overhead of deploying highly anticipated LLMs.
1 code implementation • 7 Feb 2024 • Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun
Different from previous works that simply translate English instructions, we consider both the language-specific and language-agnostic abilities of LLMs.
1 code implementation • 7 Feb 2024 • Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun
In this paper, we unveil the intrinsic capacity of LLMs for understanding extremely long sequences without any fine-tuning.
no code implementations • 6 Feb 2024 • Zhengyan Zhang, Yixin Song, Guanghui Yu, Xu Han, Yankai Lin, Chaojun Xiao, Chenyang Song, Zhiyuan Liu, Zeyu Mi, Maosong Sun
To find the most efficient activation function for sparse computation, we propose a systematic framework to examine the sparsity of LLMs from three aspects: the trade-off between sparsity and performance, the predictivity of sparsity, and the hardware affinity.
1 code implementation • 11 Jan 2024 • Changtai Li, Xu Han, Chao Yao, Xiaojuan Ban
Efficient and accurate extraction of microstructures in micrographs of materials is essential in process optimization and the exploration of structure-property relationships.
no code implementations • 31 Dec 2023 • Sihao Yuan, Xu Han, Jun Zhang, Zhaoxin Xie, Cheng Fan, Yunlong Xiao, Yi Qin Gao, Yi Isaac Yang
We applied this approach to study a Claisen rearrangement reaction and a manganese-catalyzed carbonyl insertion reaction.
no code implementations • 17 Nov 2023 • Chuang Yang, Kai Zhuang, Mulin Chen, Haozhao Ma, Xu Han, Tao Han, Changxing Guo, Han Han, Bingxuan Zhao, Qi Wang
Following the above issues, we propose a traffic sign interpretation (TSI) task, which aims to interpret globally semantically interrelated traffic signs (e.g., driving instruction-related texts, symbols, and guide panels) into natural language to provide accurate instruction support for autonomous or assisted driving.
1 code implementation • 15 Nov 2023 • Xiaozhi Wang, Hao Peng, Yong Guan, Kaisheng Zeng, Jianhui Chen, Lei Hou, Xu Han, Yankai Lin, Zhiyuan Liu, Ruobing Xie, Jie zhou, Juanzi Li
Understanding events in texts is a core objective of natural language understanding, which requires detecting event occurrences, extracting event arguments, and analyzing inter-event relationships.
no code implementations • 14 Nov 2023 • Han Gao, Xu Han, Xiantao Fan, Luning Sun, Li-Ping Liu, Lian Duan, Jian-Xun Wang
A notable feature of our approach is the method proposed for long-span flow sequence generation, which is based on autoregressive gradient-based conditional sampling, eliminating the need for cumbersome retraining processes.
4 code implementations • 10 Nov 2023 • Zengqing Wu, Run Peng, Xu Han, Shuyuan Zheng, Yixin Zhang, Chuan Xiao
ABM's strength lies in its bottom-up methodology, illuminating emergent phenomena by modeling the behaviors of individual components of a system.
1 code implementation • 24 Oct 2023 • Chaojun Xiao, Yuqi Luo, Wenbin Zhang, Pengle Zhang, Xu Han, Yankai Lin, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie zhou
Pre-trained language models (PLMs) have achieved remarkable results on NLP tasks but at the expense of huge parameter sizes and the consequent computational costs.
no code implementations • 19 Oct 2023 • Weize Chen, Xiaoyue Xu, Xu Han, Yankai Lin, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie zhou
Parameter-shared pre-trained language models (PLMs) have emerged as a successful approach in resource-constrained environments, enabling substantial reductions in model storage and memory costs without significant performance compromise.
no code implementations • 14 Oct 2023 • Jiecheng Lu, Xu Han, Shihao Yang
Long-term time series forecasting (LTSF) is important for various domains but is confronted by challenges in handling the complex temporal-contextual relationships.
no code implementations • 5 Oct 2023 • Shengding Hu, Xin Liu, Xu Han, Xinrong Zhang, Chaoqun He, Weilin Zhao, Yankai Lin, Ning Ding, Zebin Ou, Guoyang Zeng, Zhiyuan Liu, Maosong Sun
With PassUntil, we conduct a quantitative investigation into the scaling law of task performance.
1 code implementation • 26 Sep 2023 • Chenyang Song, Xu Han, Zheni Zeng, Kuai Li, Chen Chen, Zhiyuan Liu, Maosong Sun, Tao Yang
First, Static ConPET can adapt former continual learning methods originally designed for relatively smaller models to LLMs through PET and a dynamic replay strategy, which largely reduces the tuning costs and alleviates the over-fitting and forgetting issue.
no code implementations • 19 Sep 2023 • Kunlun Zhu, Shihao Liang, Xu Han, Zhi Zheng, Guoyang Zeng, Zhiyuan Liu, Maosong Sun
Recent years have witnessed the success of question answering (QA), especially its potential to be a foundation paradigm for tackling diverse NLP tasks.
no code implementations • 30 Aug 2023 • Xu Han, Xianda Chen, Meixin Zhu, Pinlong Cai, Jianshan Zhou, Xiaowen Chu
The experimental results illustrate that EnsembleFollower yields improved accuracy of human-like behavior and achieves effectiveness in combining hybrid models, demonstrating that our proposed framework can handle diverse car-following conditions by leveraging the strengths of various low-level models.
2 code implementations • 23 Aug 2023 • Jinyi Hu, Yuan YAO, Chongyi Wang, Shan Wang, Yinxu Pan, Qianyu Chen, Tianyu Yu, Hanghao Wu, Yue Zhao, Haoye Zhang, Xu Han, Yankai Lin, Jiao Xue, Dahai Li, Zhiyuan Liu, Maosong Sun
Building a competitive counterpart in other languages is highly challenging due to the low-resource nature of non-English multimodal data (i.e., lack of large-scale, high-quality image-text data).
2 code implementations • 21 Aug 2023 • Xu Han, Zengqing Wu, Chuan Xiao
Our results demonstrate that, in the absence of communication, smart agents consistently reach tacit collusion, leading to prices converging at levels higher than the Bertrand equilibrium price but lower than monopoly or cartel prices.
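For reference, the two benchmark prices this result is framed against can be computed in a textbook linear-demand duopoly with identical marginal costs; the parameters below are illustrative, not the paper's simulation setup.

```python
# Reference prices in a linear-demand duopoly: demand Q(p) = a - b*p served
# entirely by the lowest-priced firm, identical marginal cost c.
def bertrand_price(c):
    return c                        # undercutting drives price to marginal cost

def monopoly_price(a, b, c):
    # max over p of (p - c) * (a - b*p)  =>  p* = (a + b*c) / (2*b)
    return (a + b * c) / (2 * b)

a, b, c = 100.0, 1.0, 20.0
print("Bertrand equilibrium price:", bertrand_price(c))        # 20.0
print("Monopoly / cartel price:   ", monopoly_price(a, b, c))  # 60.0
# Tacit collusion in the simulations means converged prices sit strictly
# between these two benchmarks.
```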
1 code implementation • 15 Jul 2023 • Weilin Zhao, Yuxiang Huang, Xu Han, Zhiyuan Liu, Zhengyan Zhang, Kuai Li, Chen Chen, Tao Yang, Maosong Sun
To address this, there are two key methods available: the first is model compression, which compresses LLMs into smaller sizes; the second is LoRA, which can transfer an LLM to other tasks with very few parameters, avoiding the storage of multiple model variants in multi-task scenarios by only preserving LoRAs.
1 code implementation • 6 Jul 2023 • Xu Han, Anmin Liu, Chenxuan Yao, Yanbo Fan, Kun He
In either case, common gradient-based methods generally use the sign function to generate perturbations for the gradient update, which offers a roughly correct direction and has achieved great success.
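The sign-based update the sentence refers to is the FGSM-style step below, shown as a minimal NumPy sketch with a toy gradient.

```python
# Sign-based gradient perturbation (FGSM-style).
import numpy as np

def fgsm_perturb(x, grad, eps):
    # sign(grad) keeps only the direction of each component, which is the
    # "roughly correct direction" that makes these attacks effective.
    return x + eps * np.sign(grad)

x = np.array([0.2, -0.5, 0.1])
grad = np.array([0.03, -1.2, 0.0004])   # raw gradients vary wildly in scale
print(fgsm_perturb(x, grad, eps=0.05))  # each coordinate moves by exactly +/- eps
```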
no code implementations • 13 Jun 2023 • Xu Han, Bin Guo, Yoon Jung, Benjamin Yao, Yu Zhang, Xiaohu Liu, Chenlei Guo
Personalized dialogue agents (DAs) powered by large pre-trained language models (PLMs) often rely on explicit persona descriptions to maintain personality consistency.
1 code implementation • 28 May 2023 • Zhengyan Zhang, Zhiyuan Zeng, Yankai Lin, Huadong Wang, Deming Ye, Chaojun Xiao, Xu Han, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Experimental results on three knowledge-driven NLP tasks show that existing injection methods are not suitable for the new paradigm, while map-tuning effectively improves the performance of downstream models.
1 code implementation • 28 May 2023 • Chaojun Xiao, Zhengyan Zhang, Xu Han, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Xiangyang Li, Zhonghua Li, Zhao Cao, Maosong Sun
By inserting document plugins into the backbone PTM for downstream tasks, we can encode a document one time to handle multiple tasks, which is more efficient than conventional encoding-task coupling methods that simultaneously encode documents and input queries using task-specific encoders.
1 code implementation • 28 May 2023 • Weize Chen, Xu Han, Yankai Lin, Zhiyuan Liu, Maosong Sun, Jie zhou
Since it is non-trivial to directly model the intermediate states and design a running cost function, we propose to use latent stochastic bridges to regularize the intermediate states and use the regularization as the running cost of PETs.
1 code implementation • 28 May 2023 • Zhengyan Zhang, Zhiyuan Zeng, Yankai Lin, Chaojun Xiao, Xiaozhi Wang, Xu Han, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Jie zhou
In analogy to human brains, we consider two main characteristics of modularity: (1) functional specialization of neurons: we evaluate whether each neuron is mainly specialized in a certain function, and find that the answer is yes.
1 code implementation • 25 May 2023 • Xianda Chen, Meixin Zhu, Kehua Chen, Pengqin Wang, Hongliang Lu, Hui Zhong, Xu Han, Yinhai Wang
To address this gap and promote the development of microscopic traffic flow modeling, we establish a public benchmark dataset for car-following behavior modeling.
no code implementations • 19 May 2023 • Jinyi Hu, Xu Han, Xiaoyuan Yi, Yutong Chen, Wenhao Li, Zhiyuan Liu, Maosong Sun
IAP optimizes only a separate Chinese text encoder with all other parameters fixed to align Chinese semantics space to the English one in CLIP.
no code implementations • 16 May 2023 • Pengcheng Shi, Haozhe Cheng, Xu Han, Yiyang Zhou, Jihua Zhu
To tackle these challenges, we propose an information interaction-based generative network for point cloud completion (DualGenerator).
1 code implementation • 15 May 2023 • Yujia Qin, Cheng Qian, Xu Han, Yankai Lin, Huadong Wang, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie zhou
In pilot studies, we find that after continual pre-training, the upgraded PLM remains compatible with the outdated adapted weights to some extent.
1 code implementation • 11 May 2023 • Yujia Qin, Zihan Cai, Dian Jin, Lan Yan, Shihao Liang, Kunlun Zhu, Yankai Lin, Xu Han, Ning Ding, Huadong Wang, Ruobing Xie, Fanchao Qi, Zhiyuan Liu, Maosong Sun, Jie zhou
We recruit annotators to search for relevant information using our interface and then answer questions.
1 code implementation • 6 May 2023 • Xiaohui Chen, Jiaxing He, Xu Han, Li-Ping Liu
The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes.
no code implementations • 27 Apr 2023 • Xiaoqian Liu, Xu Han, Eric C. Chi, Boaz Nadler
In 1-bit matrix completion, the aim is to estimate an underlying low-rank matrix from a partial set of binary observations.
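A minimal worked sketch of the setting: binary entries are observed with probability sigmoid(M_ij) of being 1, and a low-rank M is recovered by likelihood gradient ascent with rank projection. This plain iterative method is illustrative, not the paper's estimator.

```python
# 1-bit matrix completion via projected likelihood gradient ascent.
import numpy as np

rng = np.random.default_rng(1)
n, r = 30, 2
M_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.5                        # which entries are observed
Y = rng.random((n, n)) < 1 / (1 + np.exp(-M_true))     # binary observations

M = np.zeros((n, n))
for _ in range(300):
    # Gradient of the Bernoulli log-likelihood on observed entries: Y - sigmoid(M).
    grad = mask * (Y - 1 / (1 + np.exp(-M)))
    M += 0.5 * grad
    # Project back to rank r via a hard-thresholded SVD.
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    M = (U[:, :r] * S[:r]) @ Vt[:r]

print("relative error:", np.linalg.norm(M_true - M) / np.linalg.norm(M_true))
```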
3 code implementations • 17 Apr 2023 • Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun
Considering the lack of a systematic tool learning evaluation in prior works, we experiment with 18 representative tools and show the potential of current foundation models in skillfully utilizing tools.
no code implementations • 15 Mar 2023 • Yimin Yin, Siliang He, Renye Zhang, Hongli Chang, Xu Han, Jinghua Zhang
This paper collects 120 relevant papers to summarize the development of iris recognition based on deep learning.
1 code implementation • 26 Nov 2022 • Han Gao, Xu Han, Jiaoyang Huang, Jian-Xun Wang, Li-Ping Liu
Recently, the Transformer structure has shown good performance in graph learning tasks.
1 code implementation • 14 Nov 2022 • Xiaozhi Wang, Yulin Chen, Ning Ding, Hao Peng, Zimu Wang, Yankai Lin, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie zhou
It contains 103,193 event coreference chains, 1,216,217 temporal relations, 57,992 causal relations, and 15,841 subevent relations, which is larger than existing datasets for all the ERE tasks by at least an order of magnitude.
1 code implementation • 25 Oct 2022 • Yujia Qin, Cheng Qian, Jing Yi, Weize Chen, Yankai Lin, Xu Han, Zhiyuan Liu, Maosong Sun, Jie zhou
(3) How does the PLM's task knowledge change along the path connecting two minima?
1 code implementation • 24 Oct 2022 • Jing Yi, Weize Chen, Yujia Qin, Yankai Lin, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun, Jie zhou
To fathom the mystery, we hypothesize that the adaptations of different DETs could all be reparameterized as low-dimensional optimizations in a unified optimization subspace, which could be found by jointly decomposing independent solutions of different DETs.
1 code implementation • 19 Oct 2022 • Linfeng Liu, Xu Han, Dawei Zhou, Li-Ping Liu
In this work, we convert graph pruning to a problem of node relabeling and then relax it to a differentiable problem.
1 code implementation • 8 Oct 2022 • Cong Ma, Yaping Zhang, Mei Tu, Xu Han, Linghui Wu, Yang Zhao, Yu Zhou
End-to-end text image translation (TIT), which aims at translating the source language embedded in images to the target language, has attracted intensive attention in recent research.
no code implementations • 29 Jul 2022 • Xu Han, Feng Wu
Most reinforcement learning (RL) methods only focus on learning a single task from scratch and are not able to use prior knowledge to learn other tasks more effectively.
1 code implementation • 22 Jun 2022 • Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung
Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint.
no code implementations • 16 Jun 2022 • Jushan Bai, Jiangtao Duan, Xu Han
This paper considers the likelihood ratio (LR) test for a variance change in the estimated factors.
no code implementations • 20 May 2022 • Junyu Liu, Changchun Zhong, Matthew Otten, Anirban Chandra, Cristian L. Cortes, Chaoyang Ti, Stephen K Gray, Xu Han
Quantum machine learning is a rapidly evolving field of research that could facilitate important applications for quantum computing and also significantly impact data-driven sciences.
1 code implementation • 6 Apr 2022 • Xu Han, Anmin Liu, Yifeng Xiong, Yanbo Fan, Kun He
Deviation between the original gradient and the generated noises may lead to inaccurate gradient update estimation and suboptimal solutions for adversarial transferability, which is crucial for black-box attacks.
no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, HuaWei Shen, HUI ZHANG, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan YAO, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, LiWei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.
no code implementations • ICLR 2022 • Xu Han, Han Gao, Tobias Pfaff, Jian-Xun Wang, Li-Ping Liu
Graph-based next-step prediction models have recently been very successful in modeling complex high-dimensional physical systems on irregular meshes.
1 code implementation • 29 Nov 2021 • Shijie Hao, Xu Han, Yanrong Guo, Meng Wang
On the other hand, since the parameter matrix learned from the first stage is aware of the lightness distribution and the scene structure, it can be incorporated into the second stage as the complementary information.
2 code implementations • 16 Sep 2021 • Runsheng Xu, Hao Xiang, Xin Xia, Xu Han, Jinlong Li, Jiaqi Ma
We then construct a comprehensive benchmark with a total of 16 implemented models to evaluate several information fusion strategies (i.e., early, late, and intermediate fusion) with state-of-the-art LiDAR detection algorithms.
Ranked #2 on 3D Object Detection on OPV2V
1 code implementation • ACL 2022 • Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
To ensure the generalization of PPT, we formulate similar classification tasks into a unified task form and pre-train soft prompts for this unified task.
no code implementations • 24 Aug 2021 • Ning Ding, Yulin Chen, Xu Han, Guangwei Xu, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu, Juanzi Li, Hong-Gee Kim
In this work, we investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
no code implementations • 22 Jul 2021 • Xiaofeng Liu, Bo Hu, Linghao Jin, Xu Han, Fangxu Xing, Jinsong Ouyang, Jun Lu, Georges El Fakhri, Jonghye Woo
In this work, we propose a domain generalization (DG) approach to learn on several labeled source domains and transfer knowledge to a target domain that is inaccessible in training.
2 code implementations • 20 Jun 2021 • Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan YAO, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
no code implementations • 16 Jun 2021 • Junhui Cai, Xu Han, Ya'acov Ritov, Linda Zhao
In contrast to the state-of-the-art methods, the proposed methods solve the estimation and testing problem at once with several merits: 1) an accurate sparsity estimation; 2) point estimates with shrinkage/soft-thresholding property; 3) credible intervals for uncertainty quantification; 4) an optimal multiple testing procedure that controls false discovery rate.
no code implementations • 16 Jun 2021 • Xu Han, Ethan X Fang, Cheng Yong Tang
Strong correlations between explanatory variables are problematic for high-dimensional regularized regression methods.
no code implementations • 14 Jun 2021 • Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan YAO, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI).
1 code implementation • 11 Jun 2021 • Xiaohui Chen, Xu Han, Jiajing Hu, Francisco J. R. Ruiz, LiPing Liu
A graph generative model defines a distribution over graphs.
1 code implementation • NAACL 2021 • Kai Zhang, Yuan YAO, Ruobing Xie, Xu Han, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
To establish the bidirectional connections between OpenRE and relation hierarchy, we propose the task of open hierarchical relation extraction and present a novel OHRE framework for the task.
1 code implementation • ACL 2022 • Weize Chen, Xu Han, Yankai Lin, Hexu Zhao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Hyperbolic neural networks have shown great potential for modeling complex data.
1 code implementation • ACL 2021 • Ziqi Wang, Xiaozhi Wang, Xu Han, Yankai Lin, Lei Hou, Zhiyuan Liu, Peng Li, Juanzi Li, Jie zhou
Event extraction (EE) has considerably benefited from pre-trained language models (PLMs) by fine-tuning.
2 code implementations • NAACL 2022 • Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Specifically, we introduce a pre-training framework named "knowledge inheritance" (KI) and explore how could knowledge distillation serve as auxiliary supervision during pre-training to efficiently learn larger PLMs.
1 code implementation • 24 May 2021 • Xu Han, Weilin Zhao, Ning Ding, Zhiyuan Liu, Maosong Sun
This indicates that PTR is a promising approach to take advantage of both human prior knowledge and PLMs for those complicated classification tasks.
1 code implementation • Findings (ACL) 2021 • Tianyu Gao, Xu Han, Keyue Qiu, Yuzhuo Bai, Zhiyu Xie, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Distantly supervised (DS) relation extraction (RE) has attracted much attention in the past few years as it can utilize large-scale auto-labeled data.
7 code implementations • ACL 2021 • Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu
In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.
Ranked #6 on Named Entity Recognition (NER) on Few-NERD (SUP)
1 code implementation • ICCV 2021 • Yuan YAO, Ao Zhang, Xu Han, Mengdi Li, Cornelius Weber, Zhiyuan Liu, Stefan Wermter, Maosong Sun
In this work, we propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
no code implementations • 25 Feb 2021 • Jiangtao Duan, Jushan Bai, Xu Han
This paper estimates the break point for large-dimensional factor models with a single structural break in factor loadings at a common unknown date.
1 code implementation • 7 Feb 2021 • Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Jie zhou, Maosong Sun
We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
no code implementations • 1 Jan 2021 • Xiaofeng Liu, Linghao Jin, Xu Han, Jun Lu, Jane You, Lingsheng Kong
In the up to two orders of magnitude compressed domain, we can explicitly infer the expression from the residual frames and possibly extract identity factors from the I frame with a pre-trained face recognition network.
1 code implementation • 14 Dec 2020 • Xu Han, Xiaohui Chen, Li-Ping Liu
Motivated by the observation that GAN ensembles often outperform single GANs in generation tasks, we propose to construct GAN ensembles for anomaly detection.
1 code implementation • COLING 2020 • Bowen Dong, Yuan YAO, Ruobing Xie, Tianyu Gao, Xu Han, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
Few-shot classification requires classifiers to adapt to new classes with only a few training instances.
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Xiaozhi Wang, Shengyu Jia, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Jie zhou
Existing EAE methods either extract each event argument roles independently or sequentially, which cannot adequately model the joint probability distribution among event arguments and their roles.
10 code implementations • 1 Dec 2020 • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun
However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available.
no code implementations • 13 Nov 2020 • Jun Zhang, Yao-Kun Lei, Zhen Zhang, Xu Han, Maodong Li, Lijiang Yang, Yi Isaac Yang, Yi Qin Gao
Combining reinforcement learning (RL) and molecular dynamics (MD) simulations, we propose a machine-learning approach (RL‡) to automatically unravel chemical reaction mechanisms.
1 code implementation • EMNLP 2020 • Chaojun Xiao, Yuan YAO, Ruobing Xie, Xu Han, Zhiyuan Liu, Maosong Sun, Fen Lin, Leyu Lin
Distant supervision (DS) has been widely used to generate auto-labeled data for sentence-level relation extraction (RE), which improves RE performance.
no code implementations • 21 Oct 2020 • Xiaofeng Liu, Yuzhuo Han, Song Bai, Yi Ge, Tianxing Wang, Xu Han, Site Li, Jane You, Ju Lu
However, the cross-entropy loss cannot take the different importance of each class in a self-driving system into account.
no code implementations • 20 Oct 2020 • Xiaofeng Liu, Linghao Jin, Xu Han, Jane You
In the up to two orders of magnitude compressed domain, we can explicitly infer the expression from the residual frames and possibly extract identity factors from the I frame with a pre-trained face recognition network.
1 code implementation • EMNLP 2020 • Hao Peng, Tianyu Gao, Xu Han, Yankai Lin, Peng Li, Zhiyuan Liu, Maosong Sun, Jie zhou
We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow heuristics via entity mentions and thus contribute to the high performance on RE benchmarks.
Ranked #24 on Relation Extraction on TACRED
1 code implementation • EMNLP 2020 • Xin Lv, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu, Wei zhang, Yichi Zhang, Hao Kong, Suhui Wu
On the one hand, sparse KGs contain less information, which makes it difficult for the model to choose correct paths.
no code implementations • EMNLP 2020 • Xu Han, Yuzhuo Bai, Keyue Qiu, Zhiyuan Liu, Maosong Sun
Oracle bone script (OBS) is the earliest known ancient Chinese writing system and the ancestor of modern Chinese.
1 code implementation • 29 Sep 2020 • Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie zhou, Maosong Sun
In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text.
no code implementations • 17 Aug 2020 • Xu Han, Zhengyang Shen, Zhenlin Xu, Spyridon Bakas, Hamed Akbari, Michel Bilello, Christos Davatzikos, Marc Niethammer
They are therefore not designed for the registration of images with strong pathologies, for example in the context of brain tumors and traumatic brain injuries.
1 code implementation • ICCV 2021 • Zhipeng Ding, Xu Han, Peirong Liu, Marc Niethammer
Thus, we propose a learning-based calibration method that focuses on multi-label semantic segmentation.
no code implementations • ACL 2020 • Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Continual relation learning aims to continually train a model on new data to learn incessantly emerging novel relations while avoiding catastrophically forgetting old relations.
1 code implementation • EMNLP 2020 • Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie zhou
Most existing datasets exhibit the following issues that limit further development of ED: (1) Data scarcity.
no code implementations • 25 Apr 2020 • Jun Zhang, Yao-Kun Lei, Zhen Zhang, Junhan Chang, Maodong Li, Xu Han, Lijiang Yang, Yi Isaac Yang, Yi Qin Gao
Deep learning is transforming many areas in science, and it has great potential in modeling molecular systems.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Xu Han, Tianyu Gao, Yankai Lin, Hao Peng, Yaoliang Yang, Chaojun Xiao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Relational facts are an important component of human knowledge, which are hidden in vast amounts of text.
1 code implementation • 5 Nov 2019 • Yuan Yao, Haoxi Zhong, Zhengyan Zhang, Xu Han, Xiaozhi Wang, Chaojun Xiao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun
In this work, we propose a challenging adversarial language game called Adversarial Taboo as an example, in which an attacker and a defender compete around a target word.
no code implementations • 3 Nov 2019 • Xiaofeng Liu, Xu Han, Yukai Qiao, Yi Ge, Lu Jun
In this paper, we approach this task from the perspective of the loss function.
1 code implementation • IJCNLP 2019 • Xiaozhi Wang, Ziqi Wang, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Maosong Sun, Jie zhou, Xiang Ren
Existing event extraction methods classify each argument role independently, ignoring the conceptual correlations between different argument roles.
no code implementations • 1 Nov 2019 • Zhipeng Ding, Xu Han, Marc Niethammer
Specifically, we first illustrate that using a deep convolutional neural network to predict atlas probabilities can better distinguish correct atlas labels from incorrect ones than relying on image intensity difference as is typical in JLF.
1 code implementation • IJCNLP 2019 • Ruidong Wu, Yuan YAO, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
Open relation extraction (OpenRE) aims to extract relational facts from the open-domain corpus.
1 code implementation • IJCNLP 2019 • Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
We present FewRel 2.0, a more challenging task to investigate two aspects of few-shot relation classification models: (1) Can they adapt to a new domain with only a handful of instances?
1 code implementation • IJCNLP 2019 • Xu Han, Tianyu Gao, Yuan YAO, Demin Ye, Zhiyuan Liu, Maosong Sun
OpenNRE is an open-source and extensible toolkit that provides a unified framework to implement neural models for relation extraction (RE).
1 code implementation • IJCNLP 2019 • Xin Lv, Yuxian Gu, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu
Multi-hop knowledge graph (KG) reasoning is an effective and explainable method for predicting the target entity via reasoning paths in query answering (QA) task.
Ranked #3 on Link Prediction on NELL-995
1 code implementation • 29 Aug 2019 • Tianyu Gao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
To address new relations with few-shot instances, we propose a novel bootstrapping approach, Neural Snowball, to learn new relations by transferring semantic knowledge about existing relations.
no code implementations • 4 Aug 2019 • Tian Zhang, Jia Wang, Yihang Dan, Yuxiang Lanqiu, Jian Dai, Xu Han, Xiaojuan Sun, Kun Xu
Recently, optical neural networks (ONNs) integrated in photonic chips have received extensive attention because they are expected to implement the same pattern recognition tasks as electronic platforms with high efficiency and low power consumption.
Ranked #1 on Reinforcement Learning on iris (using extra training data)
2 code implementations • ACL 2019 • Jie Zhou, Xu Han, Cheng Yang, Zhiyuan Liu, LiFeng Wang, Changcheng Li, Maosong Sun
Fact verification (FV) is a challenging task which requires to retrieve relevant evidence from plain text and use the evidence to verify given claims.
Ranked #7 on Fact Verification on FEVER
1 code implementation • ACL 2019 • Weize Chen, Hao Zhu, Xu Han, Zhiyuan Liu, Maosong Sun
We introduce a conceptually simple and effective method to quantify the similarity between relations in knowledge bases.
4 code implementations • ACL 2019 • Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Lixin Huang, Jie zhou, Maosong Sun
Multiple entities in a document generally exhibit complex inter-sentence relations, and cannot be well handled by existing relation extraction (RE) methods that typically focus on extracting intra-sentence relations for single entity pairs.
Ranked #59 on Relation Extraction on DocRED
1 code implementation • NAACL 2019 • Xiaozhi Wang, Xu Han, Zhiyuan Liu, Maosong Sun, Peng Li
Modern weakly supervised methods for event detection (ED) avoid time-consuming human annotation and achieve promising results by learning from auto-labeled data.
2 code implementations • ACL 2019 • Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu
Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks.
Ranked #1 on Entity Linking on FIGER
no code implementations • 18 Apr 2019 • Zhipeng Ding, Xu Han, Marc Niethammer
Experiments on 3D brain MRI data show that by selecting a good initial atlas set MAS with VoteNet significantly outperforms a number of other label fusion strategies as well as a direct DL segmentation approach.
3 code implementations • CVPR 2019 • Zhengyang Shen, Xu Han, Zhenlin Xu, Marc Niethammer
In contrast to existing approaches, our framework combines two registration methods: an affine registration and a vector momentum-parameterized stationary velocity field (vSVF) model.
Ranked #2 on Image Registration on Osteoarthritis Initiative
6 code implementations • 13 Jan 2019 • Patrick Bilic, Patrick Christ, Hongwei Bran Li, Eugene Vorontsov, Avi Ben-Cohen, Georgios Kaissis, Adi Szeskin, Colin Jacobs, Gabriel Efrain Humpire Mamani, Gabriel Chartrand, Fabian Lohöfer, Julian Walter Holch, Wieland Sommer, Felix Hofmann, Alexandre Hostettler, Naama Lev-Cohain, Michal Drozdzal, Michal Marianne Amitai, Refael Vivantik, Jacob Sosna, Ivan Ezhov, Anjany Sekuboyina, Fernando Navarro, Florian Kofler, Johannes C. Paetzold, Suprosanna Shit, Xiaobin Hu, Jana Lipková, Markus Rempfler, Marie Piraud, Jan Kirschke, Benedikt Wiestler, Zhiheng Zhang, Christian Hülsemeyer, Marcel Beetz, Florian Ettlinger, Michela Antonelli, Woong Bae, Míriam Bellver, Lei Bi, Hao Chen, Grzegorz Chlebus, Erik B. Dam, Qi Dou, Chi-Wing Fu, Bogdan Georgescu, Xavier Giró-i-Nieto, Felix Gruen, Xu Han, Pheng-Ann Heng, Jürgen Hesser, Jan Hendrik Moltz, Christian Igel, Fabian Isensee, Paul Jäger, Fucang Jia, Krishna Chaitanya Kaluva, Mahendra Khened, Ildoo Kim, Jae-Hun Kim, Sungwoong Kim, Simon Kohl, Tomasz Konopczynski, Avinash Kori, Ganapathy Krishnamurthi, Fan Li, Hongchao Li, Junbo Li, Xiaomeng Li, John Lowengrub, Jun Ma, Klaus Maier-Hein, Kevis-Kokitsi Maninis, Hans Meine, Dorit Merhof, Akshay Pai, Mathias Perslev, Jens Petersen, Jordi Pont-Tuset, Jin Qi, Xiaojuan Qi, Oliver Rippel, Karsten Roth, Ignacio Sarasua, Andrea Schenk, Zengming Shen, Jordi Torres, Christian Wachinger, Chunliang Wang, Leon Weninger, Jianrong Wu, Daguang Xu, Xiaoping Yang, Simon Chun-Ho Yu, Yading Yuan, Miao Yu, Liping Zhang, Jorge Cardoso, Spyridon Bakas, Rickmer Braren, Volker Heinemann, Christopher Pal, An Tang, Samuel Kadoury, Luc Soler, Bram van Ginneken, Hayit Greenspan, Leo Joskowicz, Bjoern Menze
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018.
2 code implementations • 28 Dec 2018 • Yankai Lin, Xu Han, Ruobing Xie, Zhiyuan Liu, Maosong Sun
Knowledge representation learning (KRL) aims to represent entities and relations of a knowledge graph in a low-dimensional semantic space, and has been widely used in massive knowledge-driven tasks.
no code implementations • 14 Nov 2018 • Xu Han, Laurent Albera, Amar Kachenoura, Huazhong Shu, Lotfi Senhadji
Based on the low-rank property and an over-estimation of the core tensor, this joint estimation problem is solved by promoting (group) sparsity of the over-estimated core tensor.
1 code implementation • ACL 2019 • Shun Zheng, Xu Han, Yankai Lin, Peilin Yu, Lu Chen, Ling Huang, Zhiyuan Liu, Wei Xu
To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.
1 code implementation • EMNLP 2018 • Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, Juanzi Li
We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.
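As a taste of the fundamental models such a toolkit unifies, the sketch below implements the classic TransE scoring function f(h, r, t) = -||h + r - t||, one of the standard models OpenKE covers; the embeddings here are random stand-ins, not trained parameters.

```python
# TransE scoring sketch: a true triple (h, r, t) is modeled as a
# translation in embedding space, h + r ~ t.
import numpy as np

def transe_score(h, r, t, norm=2):
    # Higher (less negative) scores indicate more plausible triples.
    return -np.linalg.norm(h + r - t, ord=norm)

dim = 50
rng = np.random.default_rng(0)
ent = {name: rng.standard_normal(dim) for name in ("beijing", "china", "paris")}
rel = {"capital_of": rng.standard_normal(dim)}

for head in ("beijing", "paris"):
    s = transe_score(ent[head], rel["capital_of"], ent["china"])
    print(f"score({head}, capital_of, china) = {s:.2f}")
```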
1 code implementation • EMNLP 2018 • Xu Han, Hao Zhu, Pengfei Yu, ZiYun Wang, Yuan YAO, Zhiyuan Liu, Maosong Sun
The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers.
no code implementations • 5 Oct 2018 • Xu Han, Hongsu Wang, Sanqian Zhang, Qunchao Fu, Jun S. Liu
In this paper, we develop a lower-than-character feature embedding called radical embedding, and apply it to an LSTM model for sentence segmentation of pre-modern Chinese texts.
1 code implementation • EMNLP 2018 • Ji Xin, Hao Zhu, Xu Han, Zhiyuan Liu, Maosong Sun
Entity typing aims to classify semantic types of an entity mention in a specific context.
1 code implementation • EMNLP 2018 • Xu Han, Pengfei Yu, Zhiyuan Liu, Maosong Sun, Peng Li
In this paper, we aim to incorporate the hierarchical information of relations for distantly supervised relation extraction and propose a novel hierarchical attention scheme.
1 code implementation • COLING 2018 • Xiaozhi Wang, Xu Han, Yankai Lin, Zhiyuan Liu, Maosong Sun
To address these issues, we propose an adversarial multi-lingual neural relation extraction (AMNRE) model, which builds both consistent and individual representations for each sentence to consider the consistency and diversity among languages.
no code implementations • 28 May 2018 • Xu Han, Zhiyuan Liu, Maosong Sun
As shown in the experiments on a large-scale benchmark dataset in relation extraction, our denoising method can effectively filter out noisy instances and achieve significant improvements as compared with the state-of-the-art models.
1 code implementation • 11 Apr 2018 • James Kapaldo, Xu Han, Domingo Mery
Locating the center of convex objects is important in both image processing and unsupervised machine learning/data clustering fields.
1 code implementation • 15 Nov 2017 • Xu Han, Roland Kwitt, Stephen Aylward, Spyridon Bakas, Bjoern Menze, Alexander Asturias, Paul Vespa, John Van Horn, Marc Niethammer
Extracting the brain from images with strong pathologies, for example, the presence of a tumor or of a traumatic brain injury, is challenging.
no code implementations • 31 Mar 2017 • Xu Han, Xiao Yang, Stephen Aylward, Roland Kwitt, Marc Niethammer
Registration involving one or more images containing pathologies is challenging, as standard image similarity measures and spatial transforms cannot account for common changes due to pathologies.
no code implementations • 13 Nov 2016 • Xu Han, Zhiyuan Liu, Maosong Sun
Joint representation learning of text and knowledge within a unified semantic space enables us to perform knowledge graph completion more accurately.
no code implementations • 24 Jul 2014 • Jiasong Wu, Longyu Jiang, Xu Han, Lotfi Senhadji, Huazhong Shu
Texture plays an important role in many image analysis applications.