no code implementations • 21 Dec 2024 • Yunshan Zhong, Yuyao Zhou, Yuxin Zhang, Shen Li, Yong Li, Fei Chao, Zhanpeng Zeng, Rongrong Ji
Data-free quantization (DFQ), which facilitates model quantization without real data to address increasing concerns about data security, has garnered significant attention within the model compression community.
no code implementations • 19 Dec 2024 • Bin Chen, Zhiwei Liang, Yi Lei, Jingxin Deng, Shen Li, Gabriele Liga
In this paper, we introduce an analytical nonlinear interference (NLI) power model-based shaping gain estimation method to enable fast performance evaluation of various MD modulation formats in coherent dual-polarization (DP) optical transmission systems.
no code implementations • 20 Nov 2024 • Shen Li, Lei Jiang, Wei Wang, Hongwei Hu, Liang Li
This paper presents a proof of concept that, given typical 3-channel images in a randomly permuted channel order, a model (termed Chanel-Orderer) with ad-hoc inductive biases in both its architecture and loss functions can accurately predict the channel ordering and restore it.
no code implementations • 20 Nov 2024 • Mengzhu Wang, Jiao Li, Houcheng Su, Nan Yin, Liang Yang, Shen Li
Semi-supervised learning (SSL) has made notable advances in medical image segmentation (MIS), particularly in scenarios with limited labeled data, where it significantly enhances data utilization efficiency.
no code implementations • 12 Nov 2024 • Weibo Zhao, Yubin Shi, Xinyu Lyu, Wanchen Sui, Shen Li, Yong Li
Quantization stands as a pivotal technique for large language model (LLM) serving, yet it poses significant challenges particularly in achieving effective low-bit quantization.
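The excerpt above does not describe the paper's method, but as shared background, here is a minimal sketch of uniform symmetric weight quantization, the baseline that low-bit LLM quantization schemes refine (the per-tensor scale and function names below are illustrative assumptions, not the paper's technique):

```python
import torch

def quantize_symmetric(w: torch.Tensor, bits: int = 4):
    """Uniform symmetric quantization with a single per-tensor scale.

    Illustrative only: practical low-bit LLM quantizers add per-group
    scales, outlier handling, and calibration on activation statistics.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for signed 4-bit
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q.to(torch.int8), scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4096, 4096)
q, s = quantize_symmetric(w, bits=4)
err = (w - dequantize(q, s)).abs().mean()
print(f"mean absolute quantization error: {err.item():.4f}")
```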
no code implementations • 11 Oct 2024 • Yanfeng Jiang, Zelan Yang, Bohua Chen, Shen Li, Yong Li, Tao Li
To address the above issue, we propose DeltaDQ, a novel distribution-driven delta compression framework that utilizes Group-wise Dropout and Separate Quantization to achieve ultra-high compression of the delta weights.
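As a purely hypothetical illustration of how the two named components could compose (the group size, drop probability, bit width, and overall procedure below are assumptions, not the paper's actual design):

```python
import torch

def compress_delta(base, finetuned, group_size=128, drop_prob=0.5, bits=2):
    """Hypothetical sketch in the spirit of DeltaDQ: (1) drop whole groups
    of the delta weight at random ("group-wise dropout"), then (2) quantize
    each surviving group with its own scale ("separate quantization")."""
    delta = (finetuned - base).flatten()
    groups = delta.reshape(-1, group_size)          # assumes divisibility
    keep = torch.rand(groups.shape[0]) > drop_prob  # one decision per group
    groups = groups * keep[:, None]
    qmax = 2 ** (bits - 1) - 1
    scales = groups.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.round(groups / scales).clamp(-qmax - 1, qmax)
    return q.to(torch.int8), scales, keep

base = torch.randn(256, 128)
finetuned = base + 0.01 * torch.randn_like(base)    # small delta weight
q, scales, keep = compress_delta(base, finetuned)
```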
no code implementations • 30 Sep 2024 • Yajie Sheng, Bin Chen, Yi Lei, Jingxin Deng, Jiwei Xu, Mengfan Fu, Qunbi Zhuge, Shen Li
Performance of concatenated multilevel coding with probabilistic shaping (PS) and Voronoi constellations (VCs) is analysed over the AWGN channel.
no code implementations • 26 Sep 2024 • Shen Li, Jianqing Xu, Jiaying Wu, Miao Xiong, Ailin Deng, Jiazhen Ji, Yuge Huang, Wenjie Feng, Shouhong Ding, Bryan Hooi
This equivalence motivates an ID-preserving sampling algorithm, which operates over an adjusted gradient vector field, enabling the generation of fake face recognition datasets that approximate the distribution of real-world faces.
no code implementations • 9 Sep 2024 • Shen Li, Yuyang Zhang, Zhaolin Ren, Claire Liang, Na Li, Julie A. Shah
Theoretical and empirical analyses show that for queries with strong preferences, response times complement choices by providing extra information about preference strength, leading to significantly improved utility estimation.
1 code implementation • 30 Aug 2024 • Shen Li, Liuyi Yao, Lan Zhang, Yaliang Li
Aligned LLMs are secure: they can recognize malicious questions and refuse to answer them.
no code implementations • 13 Jun 2024 • Xuemin Hu, Shen Li, Yingfen Xu, Bo Tang, Long Chen
Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the sampled actions of the agent often cannot cover the action distribution under a given state, resulting in the extrapolation error issue.
no code implementations • 11 Jun 2024 • Hao Yu, Zelan Yang, Shen Li, Yong Li, Jianxin Wu
The advent of pre-trained large language models (LLMs) has revolutionized various natural language processing tasks.
2 code implementations • 4 Mar 2024 • Buyun Zhang, Liang Luo, Yuxin Chen, Jade Nie, Xi Liu, Daifeng Guo, Yanli Zhao, Shen Li, Yuchen Hao, Yantao Yao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Maxim Naumov, Wenlin Chen
Scaling laws play an instrumental role in the sustainable improvement in model quality.
2 code implementations • 1 Mar 2024 • Liang Luo, Buyun Zhang, Michael Tsang, Yinbin Ma, Ching-Hsiang Chu, Yuxin Chen, Shen Li, Yuchen Hao, Yanli Zhao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Dheevatsa Mudigere, Maxim Naumov
We study a mismatch between the flat architecture of deep learning recommendation models, the common distributed training paradigm, and the hierarchical data center topology.
no code implementations • 22 Feb 2024 • Shen Li, Liuyi Yao, Jinyang Gao, Lan Zhang, Yaliang Li
To support various applications, a prevalent and efficient approach for business owners is leveraging their valuable datasets to fine-tune a pre-trained LLM through the API provided by LLM owners or cloud servers.
no code implementations • 11 Jan 2024 • Bo Chen, Xingyi Cheng, Pan Li, Yangli-ao Geng, Jing Gong, Shen Li, Zhilei Bei, Xu Tan, Boyan Wang, Xin Zeng, Chiming Liu, Aohan Zeng, Yuxiao Dong, Jie Tang, Le Song
We propose a unified protein language model, xTrimoPGLM, to address these two types of tasks simultaneously through an innovative pre-training framework.
1 code implementation • 28 Sep 2023 • Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
Despite considerable advances in automated fake news detection, the timely nature of news means that effectively predicting the veracity of news articles from limited fact-checks remains a critical open question.
no code implementations • 17 Aug 2023 • Bin Chen, Zhiwei Liang, Shen Li, Yi Lei, Gabriele Liga, Alex Alvarado
Multidimensional constellation shaping with up to 32 dimensions and different spectral efficiencies is compared through AWGN and fiber-optic simulations.
no code implementations • 16 Aug 2023 • Qinghui Nie, Jishun Ou, Haiyang Zhang, Jiawei Lu, Shen Li, Haotian Shi
An efficient urban bus control system has the potential to significantly reduce travel delays and streamline the allocation of transportation resources, thereby offering enhanced and user-friendly transit services to passengers.
1 code implementation • NeurIPS 2023 • Miao Xiong, Ailin Deng, Pang Wei Koh, Jiaying Wu, Shen Li, Jianqing Xu, Bryan Hooi
We examine the problem over 504 pretrained ImageNet models and observe that: 1) Proximity bias exists across a wide variety of model architectures and sizes; 2) Transformer-based models are relatively more susceptible to proximity bias than CNN-based models; 3) Proximity bias persists even after performing popular calibration algorithms like temperature scaling; 4) Models tend to overfit more heavily on low proximity samples than on high proximity samples.
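For reference, observation 3 measures against standard temperature scaling (Guo et al., 2017), which fits a single global temperature on held-out data; a minimal sketch of that baseline (toy data for illustration):

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Learn one scalar T so that softmax(logits / T) is better calibrated.
    Because T is global, it cannot correct a bias that varies with a
    sample's proximity to the rest of the data, which is the paper's point."""
    log_t = torch.zeros(1, requires_grad=True)
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

logits = torch.randn(1000, 10) * 3      # toy, deliberately overconfident
labels = torch.randint(0, 10, (1000,))
T = fit_temperature(logits, labels)
calibrated = F.softmax(logits / T, dim=1)
```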
no code implementations • 2 May 2023 • Xuemin Hu, Shen Li, Tingyu Huang, Bo Tang, Rouxing Huai, Long Chen
In general, large-scale testing is conducted in a simulation environment and the learned driving knowledge is then transferred to the real world, so adapting driving knowledge learned in simulation to reality becomes a critical issue.
2 code implementations • 21 Apr 2023 • Yanli Zhao, Andrew Gu, Rohan Varma, Liang Luo, Chien-chin Huang, Min Xu, Less Wright, Hamid Shojanazeri, Myle Ott, Sam Shleifer, Alban Desmaison, Can Balioglu, Pritam Damania, Bernard Nguyen, Geeta Chauhan, Yuchen Hao, Ajit Mathews, Shen Li
It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains.
1 code implementation • 6 Feb 2023 • Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen, Bryan Hooi
Trustworthy machine learning is of primary importance to the practical deployment of deep learning models.
no code implementations • CVPR 2023 • Jianqing Xu, Shen Li, Ailin Deng, Miao Xiong, Jiaying Wu, Jiaxiang Wu, Shouhong Ding, Bryan Hooi
Mean ensemble (i.e., averaging predictions from multiple models) is a commonly used technique in machine learning that improves the performance of each individual model.
1 code implementation • 29 Nov 2022 • Miao Xiong, Shen Li, Wenjie Feng, Ailin Deng, Jihai Zhang, Bryan Hooi
How do we know when the predictions made by a classifier can be trusted?
no code implementations • 10 Nov 2022 • Jiawei Zhang, Shen Li, Li Li
Connected and automated vehicles (CAVs) are viewed as a special kind of robot with the potential to significantly improve the safety and efficiency of traffic.
no code implementations • 19 Oct 2022 • Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt, Michael Rabbat, Ari S. Morcos
When fine-tuning DeiT-base and DeiT-large on ImageNet, this procedure matches accuracy in-distribution and improves accuracy under distribution shift compared to the baseline, which observes the same amount of data but communicates gradients at each step.
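A minimal sketch of the contrast being drawn: each worker fine-tunes its own replica with no per-step gradient communication, and the replicas are merged once at the end by parameter averaging (the helper name and the simple mean are assumptions about the general recipe, not the paper's exact procedure):

```python
import copy
import torch

def average_models(models):
    """Merge independently fine-tuned replicas by averaging parameters,
    instead of all-reducing gradients at every step as DDP does."""
    merged = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, p in merged.named_parameters():
            stacked = torch.stack(
                [dict(m.named_parameters())[name] for m in models])
            p.copy_(stacked.mean(dim=0))
    return merged
```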
no code implementations • 23 Aug 2022 • Shen Li, Bryan Hooi
Without exploiting any label information, the recovered principal components store the most informative elements in their leading dimensions and leave the negligible ones in the trailing dimensions, allowing for clear performance improvements of 5%-10% in downstream tasks.
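A small, self-contained illustration of the underlying principle, that a PCA-style rotation concentrates variance (information) in the leading dimensions so trailing ones can be truncated (toy data; this is not the paper's training procedure):

```python
import torch

x = torch.randn(10000, 64) @ torch.randn(64, 64)  # correlated features
U, S, V = torch.pca_lowrank(x, q=64)              # full set of components
explained = (S ** 2) / (S ** 2).sum()
print(explained[:8].sum().item())  # leading 8 dims carry most of the variance
```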
no code implementations • 9 Jun 2022 • Yanwei Wang, Nadia Figueroa, Shen Li, Ankit Shah, Julie Shah
In this work, we identify the roots of this challenge as the failure of a learned continuous policy to satisfy the discrete plan implicit in the demonstration.
2 code implementations • 23 May 2022 • Yuchao Li, Fuli Luo, Chuanqi Tan, Mengdi Wang, Songfang Huang, Shen Li, Junjie Bai
With the dramatically increased number of parameters in language models, sparsity methods have received ever-increasing research focus to compress and accelerate the models.
5 code implementations • CVPR 2023 • Ali Hassani, Steven Walton, Jiachen Li, Shen Li, Humphrey Shi
We present Neighborhood Attention (NA), the first efficient and scalable sliding-window attention mechanism for vision.
Ranked #123 on Semantic Segmentation on ADE20K
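A naive, readable sketch of the 1D version of the idea: each query attends only to a small window of keys centered on it (zero-padding at the borders is a simplification; the paper's NA shifts border windows inward, and its contribution is an efficient, scalable 2D implementation, which this is not):

```python
import torch
import torch.nn.functional as F

def neighborhood_attention_1d(q, k, v, window: int = 7):
    """Sliding-window attention over a 1D sequence of shape (B, L, D)."""
    B, L, D = q.shape
    pad = window // 2
    k_pad = F.pad(k, (0, 0, pad, pad))                      # (B, L+2p, D)
    v_pad = F.pad(v, (0, 0, pad, pad))
    k_win = k_pad.unfold(1, window, 1).permute(0, 1, 3, 2)  # (B, L, w, D)
    v_win = v_pad.unfold(1, window, 1).permute(0, 1, 3, 2)
    attn = torch.einsum("bld,blwd->blw", q, k_win) / D ** 0.5
    attn = attn.softmax(dim=-1)
    return torch.einsum("blw,blwd->bld", attn, v_win)

q = k = v = torch.randn(2, 16, 32)
out = neighborhood_attention_1d(q, k, v)   # same shape as the input
```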
no code implementations • 6 Dec 2021 • Wenjie Chu, Shen Li, Chao Chen, Longfei Xu, Hengbin Cui, Kaikui Liu
Most of the existing methods for debiasing in click-through rate (CTR) prediction depend on an oversimplified assumption, i.e., that the click probability is the product of the observation probability and the relevance probability.
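The assumption in question, stated concretely with toy numbers (this is the factorization the paper argues is oversimplified, not its proposed model):

```python
# Examination-hypothesis factorization of a click:
#   P(click) = P(observed | position) * P(relevant | query, item)
p_observe = 0.6                   # user examines the item at this position
p_relevant = 0.5                  # item is relevant to the user
p_click = p_observe * p_relevant  # = 0.3 under the independence assumption
```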
no code implementations • 2 Dec 2021 • Shen Li, Jianqing Xu, Bryan Hooi
This paper proposes a probabilistic contrastive loss function for self-supervised learning.
no code implementations • 31 Oct 2021 • Yang Sun, Fajie Yuan, Min Yang, Alexandros Karatzoglou, Shen Li, Xiaoyan Zhao
In this paper, we plan to exploit such redundancy phenomena to improve the performance of RS.
no code implementations • 20 Oct 2021 • Ran Cheng, Chao Chen, Longfei Xu, Shen Li, Lei Wang, Hengbin Cui, Kaikui Liu, Xiaolong Li
For user representation, we utilize a series of historical navigation records to extract user preferences.
no code implementations • 18 Oct 2021 • Shen Li, Theodoros Stouraitis, Michael Gienger, Sethu Vijayakumar, Julie A. Shah
Consistent state estimation is challenging, especially under the epistemic uncertainties arising from learned (nonlinear) dynamic and observation models.
no code implementations • 6 Jul 2021 • Ankit Shah, Pritish Kamath, Shen Li, Patrick Craven, Kevin Landers, Kevin Oden, Julie Shah
When observing task demonstrations, human apprentices are able to identify whether a given task is executed correctly long before they gain expertise in actually performing that task.
1 code implementation • CVPR 2021 • Shen Li, Jianqing Xu, Xiaqing Xu, Pengcheng Shen, Shaoxin Li, Bryan Hooi
Probabilistic Face Embeddings (PFE) is the first attempt to address this dilemma.
1 code implementation • 4 Jun 2021 • Shaokun Zhang, Xiawu Zheng, Chenyi Yang, Yuchao Li, Yan Wang, Fei Chao, Mengdi Wang, Shen Li, Jun Yang, Rongrong Ji
Motivated by the necessity of efficient inference across various constraints on BERT, we propose a novel approach, YOCO-BERT, to achieve compress once and deploy everywhere.
1 code implementation • 31 May 2021 • Mingbao Lin, Yuxin Zhang, Yuchao Li, Bohong Chen, Fei Chao, Mengdi Wang, Shen Li, Yonghong Tian, Rongrong Ji
We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations.
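A minimal sketch of the invariance this workflow relies on: permuting layer-1 output channels and layer-2 input channels with the same permutation leaves the composed computation unchanged (the L1-norm importance score here is an assumed placeholder for whatever criterion derives the blocks):

```python
import torch

def rearrange_filters(w1: torch.Tensor, w2: torch.Tensor):
    """Reorder conv1's output filters and keep conv2 consistent."""
    importance = w1.abs().sum(dim=(1, 2, 3))        # one score per filter
    perm = torch.argsort(importance, descending=True)
    return w1[perm], w2[:, perm]            # out-channels / in-channels

w1 = torch.randn(64, 32, 3, 3)    # conv1 weight: (out, in, kH, kW)
w2 = torch.randn(128, 64, 3, 3)   # conv2 consumes conv1's 64 channels
w1p, w2p = rearrange_filters(w1, w2)
```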
no code implementations • 21 Apr 2021 • Yuqiong Qi, Yang Hu, Haibin Wu, Shen Li, Haiyu Mao, Xiaochun Ye, Dongrui Fan, Ninghui Sun
In this work, we extensively explore the above system design challenges, which motivate us to propose a comprehensive framework that synergistically handles the heterogeneous hardware accelerator design principles, system design criteria, and task scheduling mechanism.
no code implementations • 18 Apr 2021 • Shen Li, Bingpeng Ma, Hong Chang, Shiguang Shan, Xilin Chen
This paper proposes a novel model, named Continuity-Discrimination Convolutional Neural Network (CD-CNN), for visual object tracking.
1 code implementation • 5 Feb 2021 • Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr
PipeTransformer automatically adjusts the pipelining and data parallelism by identifying and freezing some layers during the training, and instead allocates resources for training of the remaining active layers.
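A minimal sketch of the freezing step in isolation (the convergence test that decides when to freeze and the pipeline/data-parallel reallocation, which are the paper's actual contribution, are omitted):

```python
import torch

def freeze_leading_layers(layers, n_frozen: int):
    """Exclude the first n layers from gradient computation so their
    gradients and optimizer state no longer consume training resources."""
    for layer in layers[:n_frozen]:
        for p in layer.parameters():
            p.requires_grad_(False)

blocks = torch.nn.ModuleList(torch.nn.Linear(8, 8) for _ in range(6))
freeze_leading_layers(blocks, n_frozen=2)
```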
no code implementations • 1 Jan 2021 • Shen Li, Jianqing Xu, Xiaqing Xu, Pengcheng Shen, Shaoxin Li, Bryan Hooi
To address these issues, in this paper, we propose a novel framework for face uncertainty learning in hyperspherical space.
3 code implementations • 28 Jun 2020 • Shen Li, Yanli Zhao, Rohan Varma, Omkar Salpekar, Pieter Noordhuis, Teng Li, Adam Paszke, Jeff Smith, Brian Vaughan, Pritam Damania, Soumith Chintala
This paper presents the design, implementation, and evaluation of the PyTorch distributed data parallel module.
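Canonical usage of the module the paper describes, assuming the script is launched with torchrun so rank and world size come from the environment:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")
rank = dist.get_rank()

model = torch.nn.Linear(10, 10).to(rank)
ddp_model = DDP(model, device_ids=[rank])  # gradients all-reduced in backward

optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
loss = ddp_model(torch.randn(20, 10).to(rank)).sum()
loss.backward()          # communication overlaps with backward computation
optimizer.step()
dist.destroy_process_group()
```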
no code implementations • 6 Apr 2020 • Shen Li, Renfen Hu, Jinshan Wu
Word meaning has multiple aspects, yet existing word representations "compress" these aspects into a single vector, so further analysis is needed to recover the information along different dimensions.
2 code implementations • ICLR 2020 • Shen Li, Bryan Hooi, Gim Hee Lee
Yet, most deep generative models do not address the question of identifiability, and thus fail to deliver on the promise of the recovery of the true latent sources that generate the observations.
1 code implementation • 6 Aug 2019 • Shen Li, Chenhao Su, Renfen Hu, Zhengdong Lu
Dropout is known as an effective way to reduce overfitting via preventing co-adaptations of units.
1 code implementation • ACL 2019 • Kun Liu, Shen Li, Daqi Zheng, Zhengdong Lu, Sheng Gao, Si Li
To solve this problem, we propose a prism module to disentangle the semantic aspects of words and reduce noise at the input layer of a model.
Ranked #53 on Named Entity Recognition (NER) on CoNLL 2003 (English)
no code implementations • ACL 2019 • Renfen Hu, Shen Li, Shichen Liang
Diachronic word embeddings have been widely used in detecting temporal changes.
no code implementations • NeurIPS 2018 • Ankit Shah, Pritish Kamath, Julie A. Shah, Shen Li
When observing task demonstrations, human apprentices are able to identify whether a given task is executed correctly long before they gain expertise in actually performing that task.
no code implementations • 30 Aug 2018 • Shen Li, Hengru Xu, Zhengdong Lu
As neural networks have come to dominate state-of-the-art results across a wide range of NLP tasks, improving neural models by integrating symbolic knowledge has attracted considerable attention.
1 code implementation • CONLL 2018 • Hengru Xu, Shen Li, Renfen Hu, Si Li, Sheng Gao
Dropout is used to avoid overfitting by randomly dropping units from the neural networks during training.
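For reference, the basic mechanism described above, standard inverted dropout, fits in a few lines (the variants the paper studies build on this):

```python
import torch

def inverted_dropout(x: torch.Tensor, p: float = 0.5, training: bool = True):
    """Zero each unit with probability p during training and rescale by
    1/(1-p) so expected activations match inference, where dropout is a no-op."""
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()
    return x * mask / (1.0 - p)
```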
2 code implementations • ACL 2018 • Shen Li, Zhe Zhao, Renfen Hu, Wensi Li, Tao Liu, Xiaoyong Du
Analogical reasoning is effective in capturing linguistic regularities.
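The standard analogy evaluation in a few lines using gensim (the vector file path is a placeholder for any pretrained embeddings in word2vec text format):

```python
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("vectors.txt")  # placeholder path
# "king" - "man" + "woman" should land near "queen" for good embeddings:
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```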
1 code implementation • 20 Apr 2018 • Shen Li, Christian Häger, Nil Garcia, Henk Wymeersch
Machine learning is used to compute achievable information rates (AIRs) for a simplified fiber channel.
no code implementations • 9 Sep 2017 • Shuochao Yao, Yiran Zhao, Huajie Shao, Aston Zhang, Chao Zhang, Shen Li, Tarek Abdelzaher
Recent advances in deep learning have brought unprecedented achievements to a variety of applications, with the potential to bring higher intelligence to a broad spectrum of mobile and ubiquitous applications.
no code implementations • EMNLP 2017 • Shen Li, Zhe Zhao, Tao Liu, Renfen Hu, Xiaoyong Du
Convolutional Neural Networks (CNNs) are widely used in NLP tasks.
no code implementations • EMNLP 2017 • Zhe Zhao, Tao Liu, Shen Li, Bofang Li, Xiaoyong Du
The existing word representation methods mostly limit their information source to word co-occurrence statistics.