no code implementations • EMNLP 2020 • Kaiyu Huang, Degen Huang, Zhuang Liu, Fengran Mo
Chinese word segmentation (CWS) is an essential step for downstream Chinese NLP tasks.
no code implementations • ICML 2020 • Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt
We introduce a general approach, called test-time training, for improving the performance of predictive models when training and test data come from different distributions.
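To make the idea concrete, here is a minimal PyTorch sketch of test-time training (not the authors' released code): before predicting on a test input, the shared feature extractor is updated with a self-supervised rotation-prediction loss, the auxiliary task used in the paper. The `model.features` attribute, `ssl_head`, and the square-input assumption are illustrative.

```python
import torch
import torch.nn.functional as F

def test_time_train(model, ssl_head, x, optimizer, steps=1):
    """Adapt shared features on one (square) test image, then predict.

    Illustrative sketch: `model.features` (shared extractor) and
    `ssl_head` (4-way rotation classifier) are assumed names, not the
    authors' API.
    """
    model.train()
    for _ in range(steps):
        # Auxiliary self-supervised task: predict which rotation was applied.
        views = torch.stack([torch.rot90(x, k, dims=(-2, -1)) for k in range(4)])
        labels = torch.arange(4)
        loss = F.cross_entropy(ssl_head(model.features(views)), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        return model(x.unsqueeze(0))  # main-task prediction after adaptation
```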
no code implementations • 19 Jun 2024 • Zeping Li, Xinlong Yang, Ziheng Gao, Ji Liu, Zhuang Liu, Dong Li, Jinzhang Peng, Lu Tian, Emad Barsoum
Large Language Models (LLMs) inherently use autoregressive decoding, which limits parallelism during inference and results in significantly slower inference, especially when hardware parallel accelerators and memory bandwidth are not fully utilized.
1 code implementation • 23 May 2024 • Haoxuan Li, Jifan Yu, Yuanxin Ouyang, Zhuang Liu, Wenge Rong, Juanzi Li, Zhang Xiong
Knowledge tracing (KT), which aims to infer students' mastery of knowledge from their exercise records and predict their performance on future test questions, is a critical task in educational assessment.
no code implementations • 9 Apr 2024 • Haoxuan Li, Yuanxin Ouyang, Zhuang Liu, Wenge Rong, Zhang Xiong
Collaborative filtering (CF) is an essential technique in recommender systems that provides personalized recommendations by leveraging only user-item interactions.
1 code implementation • 13 Mar 2024 • Zhuang Liu, Kaiming He
We revisit the "dataset classification" experiment suggested by Torralba and Efros a decade ago, in the new era with large-scale, diverse, and hopefully less biased datasets as well as more capable neural network architectures.
3 code implementations • 27 Feb 2024 • MingJie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu
We observe an empirical phenomenon in Large Language Models (LLMs) -- very few activations exhibit significantly larger values than others (e.g., 100,000 times larger).
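A simple way to surface such outliers, sketched below under assumed names (this is not the paper's code), is to compare each hidden-state magnitude against the median magnitude; the 1,000x cutoff is an illustrative threshold, not the paper's exact criterion.

```python
import torch

def find_massive_activations(hidden, ratio=1000.0):
    """Return indices and values of activations that dwarf the median.

    `hidden` is any hidden-state tensor; `ratio` is an illustrative
    threshold for "orders of magnitude larger".
    """
    mags = hidden.abs()
    mask = mags > ratio * mags.median()
    return mask.nonzero(as_tuple=False), mags[mask]
```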
no code implementations • 27 Feb 2024 • Zhang Xiong, Haoxuan Li, Zhuang Liu, Zhuofan Chen, Hao Zhou, Wenge Rong, Yuanxin Ouyang
Personalized education, tailored to individual student needs, leverages educational technology and artificial intelligence (AI) in the digital age to enhance learning effectiveness.
2 code implementations • 20 Feb 2024 • Kai Wang, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, Yang You
The autoencoder extracts latent representations of a subset of the trained network parameters.
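A minimal sketch of that setup, with assumed architecture and sizes (the paper's actual autoencoder may differ): an MLP autoencoder over a flattened subset of trained parameters.

```python
import torch
import torch.nn as nn

class ParamAutoencoder(nn.Module):
    """Encode flattened network parameters into a low-dim latent and back.

    The two-layer MLP design and all layer sizes are illustrative assumptions.
    """
    def __init__(self, param_dim, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(param_dim, 2048), nn.ReLU(), nn.Linear(2048, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 2048), nn.ReLU(), nn.Linear(2048, param_dim))

    def forward(self, flat_params):
        z = self.encoder(flat_params)          # latent representation
        return self.decoder(z), z

def flatten_subset(model, names):
    """Flatten the chosen subset of a trained model's parameters."""
    return torch.cat([p.detach().flatten()
                      for n, p in model.named_parameters() if n in names])
```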
1 code implementation • 3 Feb 2024 • Bo Yang, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu, Fangde Sun
To address this gap, our study introduces a novel zero-shot, sketch-based retrieval method for remote sensing images, leveraging multi-level feature extraction, self-attention-guided tokenization and filtering, and cross-modality attention update.
1 code implementation • 25 Jan 2024 • Xinlei Chen, Zhuang Liu, Saining Xie, Kaiming He
In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally designed for image generation.
1 code implementation • CVPR 2024 • Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann Lecun, Saining Xie
To understand the roots of these errors, we explore the gap between the visual embedding space of CLIP and vision-only self-supervised learning.
1 code implementation • 30 Nov 2023 • Zhiqiu Xu, Yanjie Chen, Kirill Vishniakov, Yida Yin, Zhiqiang Shen, Trevor Darrell, Lingjie Liu, Zhuang Liu
Weight selection offers a new approach to leverage the power of pretrained models in resource-constrained settings, and we hope it can be a useful tool for training small models in the large-model era.
1 code implementation • 15 Nov 2023 • Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu
Modern computer vision offers a great variety of models to practitioners, and selecting a model from multiple options for specific applications can be challenging.
1 code implementation • 9 Nov 2023 • Yida Yin, Zhiqiu Xu, Zhiyuan Li, Trevor Darrell, Zhuang Liu
Stochastic Variance Reduced Gradient (SVRG), introduced by Johnson & Zhang (2013), is a theoretically compelling optimization method.
no code implementations • 21 Aug 2023 • Zhuang Liu, Ye Yuan, Zhilong Ji, Jingfeng Bai, Xiang Bai
Then we design a semantic-aware module (SAM), which projects the visual and classification features into a semantic space.
5 code implementations • 20 Jun 2023 • MingJie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis.
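The metric is simple enough to sketch in a few lines of PyTorch (a rough illustration, not the released implementation): score each weight by its magnitude times the norm of its input activation, and drop the lowest-scoring weights within each output row. `act_norm` is assumed to hold per-input-feature l2 norms collected on a calibration set.

```python
import torch

def prune_by_weight_times_activation(weight, act_norm, sparsity=0.5):
    """Zero the lowest-scoring weights within each output row.

    weight:   (out_features, in_features) matrix of a linear layer
    act_norm: (in_features,) l2 norms of input activations
              (the calibration procedure itself is assumed/omitted here)
    """
    score = weight.abs() * act_norm.unsqueeze(0)        # (out, in)
    k = int(weight.shape[1] * sparsity)
    lowest = torch.argsort(score, dim=1)[:, :k]         # per-row lowest scores
    mask = torch.ones_like(weight, dtype=torch.bool)
    mask.scatter_(1, lowest, False)
    return weight * mask
```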
1 code implementation • 13 Jun 2023 • Arnav Chavan, Zhuang Liu, Deepak Gupta, Eric Xing, Zhiqiang Shen
We present Generalized LoRA (GLoRA), an advanced approach for universal parameter-efficient fine-tuning tasks.
1 code implementation • CVPR 2023 • Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra
We show that not all combinations of paired data are necessary to train such a joint embedding, and that image-paired data alone is sufficient to bind the modalities together.
Ranked #2 on Zero-shot Classification (unified classes) on LLVIP
no code implementations • 25 Apr 2023 • Kuo Yang, Zecong Yu, Xin Su, Xiong He, Ning Wang, Qiguang Zheng, Feidie Yu, Zhuang Liu, Tiancai Wen, Xuezhong Zhou
We constructed a high-quality benchmark dataset for sequential diagnosis and treatment of diabetes and evaluated PrescDRL against this benchmark.
1 code implementation • 2 Mar 2023 • Zhuang Liu, Zhiqiu Xu, Joseph Jin, Zhiqiang Shen, Trevor Darrell
Additionally, we explore a symmetric technique for regularizing overfitting models: late dropout, where dropout is not used in the early iterations and is only activated later in training.
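The schedule is easy to sketch (illustrative only; the paper's exact cutoff and rate may differ): keep dropout disabled for the first `start_iter` updates, then apply it as usual.

```python
import torch.nn as nn
import torch.nn.functional as F

class LateDropout(nn.Module):
    """Dropout that is inactive early in training and switches on later."""
    def __init__(self, p=0.1, start_iter=50_000):
        super().__init__()
        self.p, self.start_iter, self.step = p, start_iter, 0

    def forward(self, x):
        if self.training:
            self.step += 1                 # counts training forward passes
            if self.step > self.start_iter:
                return F.dropout(x, self.p, training=True)
        return x                           # early training / eval: no dropout
```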
13 code implementations • CVPR 2023 • Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie
This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks, including ImageNet classification, COCO detection, and ADE20K segmentation.
Ranked #45 on Semantic Segmentation on ADE20K
no code implementations • 30 Oct 2022 • Zhuang Liu, Zhichao Zhao, Ye Yuan, Zhi Qiao, Jinfeng Bai, Zhilong Ji
In this technical report, we briefly introduce the solution of our team "summer" for Atmospheric Turbulence Mitigation in the UG$^2$+ Challenge at CVPR 2022.
48 code implementations • CVPR 2022 • Zhuang Liu, Hanzi Mao, Chao-yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.
Ranked #1 on Classification on InDL
1 code implementation • CVPR 2022 • Arnav Chavan, Zhiqiang Shen, Zhuang Liu, Zechun Liu, Kwang-Ting Cheng, Eric Xing
This paper explores the feasibility of finding an optimal sub-model from a vision transformer and introduces a pure vision transformer slimming (ViT-Slim) framework.
2 code implementations • CVPR 2020 • John Lambert, Zhuang Liu, Ozan Sener, James Hays, Vladlen Koltun
We adopt zero-shot cross-dataset transfer as a benchmark to systematically evaluate a model's robustness and show that MSeg training yields substantially more robust models in comparison to training on individual datasets or naive mixing of datasets without the presented contributions.
Ranked #8 on Semantic Segmentation on ScanNetV2
1 code implementation • ICLR 2022 • Zhuang Liu, Zhiqiu Xu, Hung-Ju Wang, Trevor Darrell, Evan Shelhamer
A cascade of "exits" is attached to the model to make multiple predictions.
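As a rough sketch of the attach-exits idea (stage and head shapes are assumed, not the paper's architecture), each backbone stage gets its own classifier head, so a forward pass yields one prediction per exit, shallow to deep:

```python
import torch.nn as nn

class CascadeOfExits(nn.Module):
    """Backbone stages, each followed by its own "exit" classifier."""
    def __init__(self, stages, stage_widths, num_classes):
        super().__init__()
        self.stages = nn.ModuleList(stages)
        self.exits = nn.ModuleList(
            nn.Linear(w, num_classes) for w in stage_widths)

    def forward(self, x):
        preds = []
        for stage, exit_head in zip(self.stages, self.exits):
            x = stage(x)                                   # (N, C, H, W) features
            preds.append(exit_head(x.mean(dim=(-2, -1))))  # pooled prediction
        return preds  # one prediction per exit
```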
no code implementations • 5 Jan 2021 • Zhuang Liu, Yunpu Ma, Yuanxin Ouyang, Zhang Xiong
To solve this problem, we propose a graph contrastive learning module for a general recommender system that learns the embeddings in a self-supervised manner and reduces the randomness of message dropout.
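The paper's exact objective is not quoted here, but a generic stand-in consistent with the description is an InfoNCE loss between two dropout-perturbed views of the same user/item embeddings (treat this as an illustrative sketch only):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.2):
    """InfoNCE between two augmented views of the same embedding matrix.

    z1, z2: (N, D) embeddings from two message-dropout perturbations;
    row i of z1 and row i of z2 form the positive pair.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```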
1 code implementation • ICLR 2021 • Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell
In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks.
3 code implementations • 11 Mar 2020 • Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell, Eric Xing
This drawback hinders the model from learning subtle variance and fine-grained information.
10 code implementations • ICCV 2021 • Yinbo Chen, Zhuang Liu, Huijuan Xu, Trevor Darrell, Xiaolong Wang
The boundary between these two lines of work remains underexplored, and the effectiveness of meta-learning in few-shot learning is still unclear.
no code implementations • 8 Jan 2020 • Gao Huang, Zhuang Liu, Geoff Pleiss, Laurens van der Maaten, Kilian Q. Weinberger
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output.
no code implementations • 7 Jan 2020 • Huiwei Zhou, Zhuang Liu, Shixian Ning, Yunlong Yang, Chengkun Lang, Yingyu Lin, Kun Ma
Automatically extracting Protein-Protein Interactions (PPI) from biomedical literature provides additional support for precision medicine efforts.
no code implementations • 2 Jan 2020 • Huiwei Zhou, Shixian Ning, Yunlong Yang, Zhuang Liu, Chengkun Lang, Yingyu Lin
KBs are important resources for biomedical relation extraction.
no code implementations • 23 Dec 2019 • Huiwei Zhou, Chengkun Lang, Zhuang Liu, Shixian Ning, Yingyu Lin, Lei Du
Results: This paper proposes a novel model called "Knowledge-guided Convolutional Networks (KCN)" to leverage prior knowledge for CDR extraction.
no code implementations • 23 Dec 2019 • Huiwei Zhou, Yunlong Yang, Shixian Ning, Zhuang Liu, Chengkun Lang, Yingyu Lin, Degen Huang
KBs contain huge amounts of structured information about entities and relationships, and therefore play a pivotal role in chemical-disease relation (CDR) extraction.
no code implementations • 11 Dec 2019 • Huiwei Zhou, Xuefei Li, Weihong Yao, Zhuang Liu, Shixian Ning, Chengkun Lang, Lei Du
Finally, the selected relation embedding and the context features are concatenated for PPI extraction.
2 code implementations • 21 Oct 2019 • Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell
In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks.
1 code implementation • 21 Oct 2019 • Zhuang Liu, Hung-Ju Wang, Tinghui Zhou, Zhiqiang Shen, Bingyi Kang, Evan Shelhamer, Trevor Darrell
Interestingly, the processing model's ability to enhance recognition quality transfers across models with different architectures, recognized categories, tasks, and training datasets.
3 code implementations • 29 Sep 2019 • Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt
In this paper, we propose Test-Time Training, a general approach for improving the performance of predictive models when training and test data come from different distributions.
Ranked #34 on Language Modelling on LAMBADA
no code implementations • 25 Sep 2019 • Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt
We introduce a general approach, called test-time training, for improving the performance of predictive models when test and training data come from different distributions.
no code implementations • WS 2019 • Huiwei Zhou, Bizun Lei, Zhe Liu, Zhuang Liu
BioNLP 2019 proposes a Question Answering (QA) task, which encourages the use of text mining technology to automatically judge whether a search result answers a medical question.
1 code implementation • CVPR 2020 • Tianhong Li, Jianguo Li, Zhuang Liu, Chang-Shui Zhang
Deep neural network compression techniques such as pruning and weight tensor decomposition usually require fine-tuning to recover the prediction accuracy when the compression ratio is high.
4 code implementations • ICCV 2019 • Bingyi Kang, Zhuang Liu, Xin Wang, Fisher Yu, Jiashi Feng, Trevor Darrell
The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples.
Ranked #23 on Few-Shot Object Detection on MS-COCO (30-shot)
2 code implementations • ICLR 2019 • Zhuang Liu, Ming-Jie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell
Our observations are consistent for multiple network architectures, datasets, and tasks, which imply that: 1) training a large, over-parameterized model is often not necessary to obtain an efficient final model, 2) learned "important" weights of the large model are typically not useful for the small pruned model, 3) the pruned architecture itself, rather than a set of inherited "important" weights, is more crucial to the efficiency in the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.
no code implementations • 27 Sep 2018 • Tianhong Li, Jianguo Li, Zhuang Liu, ChangShui Zhang
Taking the assumption that both "teacher" and "student" have the same feature map sizes at each corresponding block, we add a $1\times 1$ conv-layer at the end of each block in the student-net, and align the block-level outputs between "teacher" and "student" by estimating the parameters of the added layer with limited samples.
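That alignment step is compact enough to sketch (an illustration under the snippet's same-shape assumption; the MSE objective is an assumed choice):

```python
import torch.nn as nn
import torch.nn.functional as F

class BlockAligner(nn.Module):
    """1x1 conv appended to a student block, fit to match the teacher."""
    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def loss(self, student_feat, teacher_feat):
        # Feature maps share sizes per the stated assumption, so a direct
        # regression of projected student features onto teacher features works.
        return F.mse_loss(self.proj(student_feat), teacher_feat)
```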
1 code implementation • 25 Sep 2018 • Zhiqiang Shen, Zhuang Liu, Jianguo Li, Yu-Gang Jiang, Yurong Chen, Xiangyang Xue
Thus, a better solution to handle these critical problems is to train object detectors from scratch, which motivates our proposed method.
12 code implementations • ICCV 2017 • Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, Chang-Shui Zhang
For VGGNet, a multi-pass version of network slimming gives a 20x reduction in model size and a 5x reduction in computing operations.
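The snippet above quotes only the results; per the paper, network slimming trains with an L1 penalty on batch-norm scale factors and prunes channels whose factors fall below a global threshold. A minimal sketch of that channel selection (the 70% ratio is an illustrative setting):

```python
import torch

def slimming_channel_masks(bn_layers, prune_ratio=0.7):
    """Keep-masks for channels, thresholded on |BN scale| across all layers."""
    gammas = torch.cat([bn.weight.detach().abs() for bn in bn_layers])
    thresh = torch.quantile(gammas, prune_ratio)   # global pruning threshold
    return [bn.weight.detach().abs() > thresh for bn in bn_layers]
```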
4 code implementations • ICCV 2017 • Zhiqiang Shen, Zhuang Liu, Jianguo Li, Yu-Gang Jiang, Yurong Chen, Xiangyang Xue
State-of-the-art object detectors rely heavily on off-the-shelf networks pre-trained on large-scale classification datasets like ImageNet, which incurs learning bias due to differences in both the loss functions and the category distributions between classification and detection tasks.
12 code implementations • 1 Apr 2017 • Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, Kilian Q. Weinberger
In this paper, we propose a method to obtain the seemingly contradictory goal of ensembling multiple neural networks at no additional training cost.
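The snippet does not spell out the mechanism; per the paper, the recipe is a cyclic cosine learning-rate schedule with a model snapshot saved at the end of each cycle, and the snapshots' predictions averaged at test time. A sketch of the schedule (names and granularity are illustrative):

```python
import math

def snapshot_lr_schedule(lr_max, total_iters, cycles):
    """Yield (lr, take_snapshot) per iteration for snapshot ensembling."""
    per_cycle = total_iters // cycles
    for t in range(total_iters):
        pos = t % per_cycle
        lr = 0.5 * lr_max * (1 + math.cos(math.pi * pos / per_cycle))
        yield lr, pos == per_cycle - 1   # snapshot at each cycle's LR minimum
```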
144 code implementations • CVPR 2017 • Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output.
Ranked #1 on Classification on XImageNet-12
17 code implementations • 30 Mar 2016 • Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger
With stochastic depth we can increase the depth of residual networks even beyond 1200 layers and still yield meaningful improvements in test error (4.91% on CIFAR-10).
Ranked #20 on Image Classification on SVHN
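For the stochastic-depth entry above, the core mechanism (per the paper) fits in a short sketch: during training each residual block survives with probability p and is otherwise bypassed by the identity; at test time the residual branch is scaled by p.

```python
import torch
import torch.nn as nn

class StochasticDepthBlock(nn.Module):
    """Residual block that is randomly dropped during training."""
    def __init__(self, branch, survival_prob=0.8):
        super().__init__()
        self.branch, self.p = branch, survival_prob

    def forward(self, x):
        if self.training:
            if torch.rand(()) < self.p:
                return x + self.branch(x)   # block survives this pass
            return x                        # block dropped: identity shortcut
        return x + self.p * self.branch(x)  # expected-value scaling at test
```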