no code implementations • 2 Sep 2013 • Tianyi Zhou, DaCheng Tao
Learning big data by matrix decomposition always suffers from expensive computation and the mixing of complicated structures with noise.
no code implementations • NeurIPS 2014 • Tianyi Zhou, Jeff Bilmes, Carlos Guestrin
We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set.
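For reference, a generic form of the conical-hull / extremal-ray problem mentioned above (a sketch in standard notation; the paper's exact formulation and its extensions may differ):

```latex
% Conical hull of a point set X = {x_1, ..., x_n} \subset R^d:
%   cone(X) = { \sum_i \alpha_i x_i : \alpha_i \ge 0 }.
% Extremal-ray (anchor) selection: choose k points R \subseteq X whose cone
% (approximately) contains every data point:
\[
  \min_{R \subseteq X,\ |R| = k} \;\sum_{x_j \in X}\;
  \min_{\alpha \ge 0} \Bigl\| x_j - \sum_{r \in R} \alpha_r\, r \Bigr\|_2^2 .
\]
```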
no code implementations • 1 Jun 2016 • Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin
We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.
no code implementations • 1 Jun 2016 • Tianyi Zhou, Jeff Bilmes
We propose a streaming submodular maximization algorithm "stream clipper" that performs as well as the offline greedy algorithm on document/video summarization in practice.
3 code implementations • 14 Sep 2017 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used in NLP tasks to capture long-term and local dependencies, respectively.
Ranked #68 on Natural Language Inference on SNLI
no code implementations • ICLR 2018 • Tianyi Zhou, Jeff Bilmes
We introduce and study minimax curriculum learning (MCL), a new method for adaptively selecting a sequence of training subsets for a succession of stages in machine learning.
1 code implementation • 31 Jan 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang
In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for their mutual benefit.
Ranked #56 on Natural Language Inference on SNLI
1 code implementation • ICLR 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding.
2 code implementations • NAACL 2019 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
Neural networks equipped with self-attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies.
no code implementations • NeurIPS 2018 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
We study a new method ("Diverse Ensemble Evolution (DivE$^2$)") to train an ensemble of machine learning models that assigns data to models at each training epoch based on each model's current expertise and an intra- and inter-model diversity reward.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In particular, we study how to attribute a DNN's bias to its input features.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In this paper, we discuss three novel observations about dropout to better understand the generalization of DNNs with rectified linear unit (ReLU) activations: 1) dropout is a smoothing technique that encourages each local linear model of a DNN to be trained on data points from nearby regions; 2) a constant dropout rate can result in effective neural-deactivation rates that are significantly different for layers with different fractions of activated neurons; and 3) the rescaling factor of dropout causes an inconsistency to occur between the normalization during training and testing conditions when batch normalization is also used.
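A toy numpy sketch of observation 3 (hypothetical code, not from the paper): inverted dropout's $1/(1-p)$ rescaling preserves activation means but not variances, so the statistics seen by batch normalization differ between training and testing.

```python
import numpy as np

def inverted_dropout(x, p, training, rng=np.random.default_rng(0)):
    """Standard inverted dropout: scale kept units by 1/(1-p) at training time."""
    if not training or p == 0.0:
        return x                      # test time: identity
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)       # rescaling preserves the mean, not the variance

# Toy illustration of the train/test variance mismatch that batch-norm statistics see.
x = np.random.default_rng(1).standard_normal((10000, 64))
train_out = inverted_dropout(x, p=0.5, training=True)
test_out  = inverted_dropout(x, p=0.5, training=False)
print(train_out.var(), test_out.var())  # variances differ (~2x here) although means match
```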
no code implementations • ICLR 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
It addresses the "many-class" problem by exploring the class hierarchy, e.g., the coarse-class label that covers a subset of fine classes, which helps to narrow down the candidates for the fine class and is cheaper to obtain.
2 code implementations • 10 May 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang
The resulting graph of prototypes can be continually re-used and updated for new tasks and classes.
1 code implementation • NeurIPS 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
It can significantly improve tasks that suffer from insufficient training data, e.g., few-shot learning.
no code implementations • 25 Sep 2019 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
The advantages of DIHCL, compared to other curriculum learning approaches, are: (1) DIHCL does not require additional inference steps over the data not selected by DIHCL in each epoch, (2) the dynamic instance hardness, compared to static instance hardness (e.g., instantaneous loss), is more stable as it integrates information over the entire training history up to the present time.
no code implementations • 27 Nov 2019 • Yang Li, Guodong Long, Tao Shen, Tianyi Zhou, Lina Yao, Huan Huo, Jing Jiang
Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision.
1 code implementation • NeurIPS 2019 • Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang
This "Goal-and-Curiosity-driven Curriculum Learning" leads to "Curriculum-guided HER (CHER)", which adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection.
no code implementations • 12 Feb 2020 • Chi Zhang, Yong Sheng Soh, Ling Feng, Tianyi Zhou, Qianxiao Li
While current machine learning models have impressive performance over a wide range of applications, their large size and complexity render them unsuitable for tasks such as remote monitoring on edge devices with limited storage and computational power.
no code implementations • 18 Feb 2020 • Yujia Xie, Tianyi Zhou, Yi Mao, Weizhu Chen
Thereby, the contextual dependencies modeled by CSA will be highly relevant to the query.
3 code implementations • ICLR 2022 • Wensi Tang, Guodong Long, Lu Liu, Tianyi Zhou, Michael Blumenstein, Jing Jiang
Particularly, it is a set of kernel sizes that can efficiently cover the best RF size across different datasets, since it consists of multiple prime numbers chosen according to the length of the time series.
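A minimal sketch of one way such a prime-based kernel set could be generated (the helper name and the exact rule, e.g. the fraction of the series length used as the upper limit, are assumptions rather than the paper's specification):

```python
def prime_kernel_sizes(series_length, max_fraction=0.5):
    """Hypothetical helper: kernel sizes 1, 2 and all primes up to a fraction
    of the time-series length, illustrating a prime-number kernel set."""
    limit = max(2, int(series_length * max_fraction))
    sizes = [1, 2]
    for n in range(3, limit + 1, 2):
        if all(n % p for p in range(3, int(n ** 0.5) + 1, 2)):
            sizes.append(n)
    return sizes

print(prime_kernel_sizes(64))  # [1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```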
1 code implementation • 30 Apr 2020 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, while reducing inference costs by 1-2 orders of magnitude compared to a textual encoding method.
Ranked #4 on Link Prediction on UMLS
3 code implementations • 3 May 2020 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users.
1 code implementation • 28 Jun 2020 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.
no code implementations • 24 Sep 2020 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a pre-defined set of attributes that can describe all classes in the same semantic space, so the knowledge learned on the training classes can be adapted to unseen classes.
2 code implementations • COLING 2020 • Yang Li, Tao Shen, Guodong Long, Jing Jiang, Tianyi Zhou, Chengqi Zhang
Then, facilitated by the proposed base model, we introduce collaborating relation features shared among relations in the hierarchies to promote the relation-augmenting process and balance the training data for long-tail relations.
no code implementations • NeurIPS 2020 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
Compared to existing CL methods: (1) DIH is more stable over time than using only instantaneous hardness, which is noisy due to stochastic training and DNN's non-smoothness; (2) DIHCL is computationally inexpensive since it uses only a byproduct of back-propagation and thus does not require extra inference.
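A toy numpy sketch of the dynamic instance hardness (DIH) idea described above, i.e., an exponential moving average of per-sample loss (a training byproduct) reused for curriculum selection; the helper names and the specific sampling rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def update_dih(dih, sample_ids, losses, gamma=0.9):
    """Exponential moving average of per-sample hardness: smooths the noisy
    instantaneous loss over the training history."""
    dih[sample_ids] = gamma * dih[sample_ids] + (1.0 - gamma) * losses
    return dih

def select_subset(dih, k, rng=np.random.default_rng(0)):
    """Pick k samples with probability proportional to their DIH
    (one simple selection rule among several possibilities)."""
    probs = dih / dih.sum()
    return rng.choice(len(dih), size=k, replace=False, p=probs)

n = 1000
dih = np.full(n, 1.0)                          # optimistic initialization
batch = np.arange(64)
batch_losses = np.random.default_rng(1).random(64)  # losses from back-propagation
dih = update_dih(dih, batch, batch_losses)
print(select_subset(dih, k=128)[:10])
```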
no code implementations • 1 Jan 2021 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang
Few-shot learning aims to train a classifier given only a few samples per class that are highly insufficient to describe the whole data distribution.
no code implementations • ICLR 2021 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
Neural net training can easily overfit to noisy labels and end up with poor generalization performance.
no code implementations • 1 Jan 2021 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we introduce an efficient method to extract local inference chains by optimizing a differentiable sparse scoring of the filters and layers so as to preserve the outputs on given data from a local region.
no code implementations • ICLR 2021 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang
To resolve this problem, we propose Isometric Propagation Network (IPN), which learns to strengthen the relation between classes within each space and align the class dependency in the two spaces.
4 code implementations • 1 May 2021 • Yue Tan, Guodong Long, Lu Liu, Tianyi Zhou, Qinghua Lu, Jing Jiang, Chengqi Zhang
Heterogeneity across clients in federated learning (FL) usually hinders the optimization convergence and generalization performance when the aggregation of clients' knowledge occurs in the gradient space.
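A minimal numpy sketch of prototype-based knowledge aggregation, the alternative to gradient-space aggregation that this line alludes to (the function names and the plain averaging rule are illustrative assumptions):

```python
import numpy as np

def local_prototypes(features, labels, num_classes):
    """Per-client class prototypes: mean feature embedding of each class present locally."""
    return {c: features[labels == c].mean(axis=0)
            for c in range(num_classes) if (labels == c).any()}

def aggregate_prototypes(client_protos):
    """Server side: average each class prototype over the clients that own that class."""
    global_protos = {}
    for protos in client_protos:
        for c, p in protos.items():
            global_protos.setdefault(c, []).append(p)
    return {c: np.mean(ps, axis=0) for c, ps in global_protos.items()}

rng = np.random.default_rng(0)
clients = [local_prototypes(rng.standard_normal((50, 16)),
                            rng.integers(0, 10, 50), num_classes=10)
           for _ in range(3)]
print(sorted(aggregate_prototypes(clients).keys()))
```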
1 code implementation • ICLR 2021 • Yuchen Jin, Tianyi Zhou, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy
This mutual-training process between BO and the loss-prediction model allows us to limit the training steps invested in the BO search.
1 code implementation • 19 Aug 2021 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
By comparison, a mixture of multiple global models could capture the heterogeneity across various clients if each client is assigned to a different global model (i.e., center) in FL.
1 code implementation • Findings (EMNLP) 2021 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
Aspect-level sentiment classification (ALSC) aims at identifying the sentiment polarity of a specified aspect in a sentence.
Aspect-Based Sentiment Analysis (ABSA) +3
no code implementations • 29 Sep 2021 • Shuang Ao, Tianyi Zhou, Jing Jiang, Guodong Long, Xuan Song, Chengqi Zhang
They are complementary in acquiring more informative feedback for RL: the planning policy provides dense rewards for finishing easier sub-tasks, while the environment policy modifies these sub-tasks to be adequately challenging and diverse so the RL agent can quickly adapt to different tasks/environments.
no code implementations • ICLR 2022 • Yijun Yang, Jing Jiang, Tianyi Zhou, Jie Ma, Yuhui Shi
Model-based offline RL instead trains an environment model using a dataset of pre-collected experiences so online RL methods can learn in an offline manner by solely interacting with the model.
no code implementations • 29 Sep 2021 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Liming Zhu, Chengqi Zhang
Can we find a better initialization for a new task, e.g., a much smaller network closer to the final pruned model, by exploiting its similar tasks?
no code implementations • 29 Sep 2021 • Kaiwen Yang, Tianyi Zhou, Xinmei Tian, DaCheng Tao
We then adversarially perturb $G(x)$ in the VAE's bottleneck space and add it back to the original $R(x)$ as an augmentation, which is therefore sufficiently challenging for contrastive learning while keeping the sample identity intact.
no code implementations • ICLR 2022 • Ravikumar Balakrishnan, Tian Li, Tianyi Zhou, Nageen Himayat, Virginia Smith, Jeff Bilmes
In every communication round of federated learning, a random subset of clients communicate their model updates back to the server which then aggregates them all.
1 code implementation • 24 Oct 2021 • Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Zhaoran Wang, Jing Jiang
Offline reinforcement learning (RL) harnesses the power of massive datasets for resolving sequential decision problems.
no code implementations • NeurIPS 2021 • Shengjie Wang, Tianyi Zhou, Chandrashekhar Lavania, Jeff A. Bilmes
Robust submodular partitioning promotes the diversity of every block in the partition.
1 code implementation • NeurIPS 2021 • Shuang Ao, Tianyi Zhou, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang
Next, a bottom-up traversal of the tree trains the RL agent from easier sub-tasks with denser rewards on bottom layers to harder ones on top layers, and collects its cost on each sub-task to train the planner in the next episode.
no code implementations • NeurIPS 2021 • Kaiwen Yang, Tianyi Zhou, Yonggang Zhang, Xinmei Tian, DaCheng Tao
In this paper, we propose "class-disentanglement" that trains a variational autoencoder $G(\cdot)$ to extract this class-dependent information as $x - G(x)$ via a trade-off between reconstructing $x$ by $G(x)$ and classifying $x$ by $D(x-G(x))$, where the former competes with the latter in decomposing $x$ so the latter retains only necessary information for classification in $x-G(x)$.
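A short PyTorch-style sketch of the trade-off described above (the `lam` weight and the plain L2 reconstruction term are assumptions; the paper's VAE objective also includes terms omitted here):

```python
import torch
import torch.nn.functional as F

def class_disentanglement_loss(x, y, G, D, lam=1.0):
    """G reconstructs x while the classifier D must predict y from the residual
    x - G(x), so the residual ends up carrying the class-dependent information."""
    recon = G(x)                         # class-independent reconstruction
    residual = x - recon                 # class-dependent part
    loss_recon = F.mse_loss(recon, x)
    loss_cls = F.cross_entropy(D(residual), y)
    return loss_recon + lam * loss_cls

# Usage with any image autoencoder/VAE G and classifier D:
# loss = class_disentanglement_loss(images, labels, G, D)
# loss.backward()
```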
no code implementations • CVPR 2022 • Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao
Decentralized learning (DL) can exploit the images distributed over devices on a network topology to train a global model but is not designed to train personalized models for different tasks or optimize the topology.
1 code implementation • 13 Feb 2022 • Jie Ma, Guodong Long, Tianyi Zhou, Jing Jiang, Chengqi Zhang
Knowledge sharing and model personalization are essential components to tackle the non-IID challenge in federated learning (FL).
1 code implementation • 2 Mar 2022 • Fengwen Chen, Guodong Long, Zonghan Wu, Tianyi Zhou, Jing Jiang
We propose a novel structured federated learning (SFL) framework to learn both the global and personalized models simultaneously using client-wise relation graphs and clients' private data.
no code implementations • ACL 2022 • Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou
Transformer-based models generally allocate the same amount of computation for each token in a given sequence.
no code implementations • 20 May 2022 • Zhuowei Wang, Tianyi Zhou, Guodong Long, Bo Han, Jing Jiang
Federated learning (FL) aims at training a global model on the server side while the training data are collected and located at the local devices.
1 code implementation • Findings (NAACL) 2022 • Yibin Lei, Yu Cao, Dianqi Li, Tianyi Zhou, Meng Fang, Mykola Pechenizkiy
Generating high-quality textual adversarial examples is critical for investigating the pitfalls of natural language processing (NLP) models and further promoting their robustness.
2 code implementations • 21 Sep 2022 • Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, Jing Jiang
To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework where clients jointly learn to fuse the representations generated by multiple fixed pre-trained models rather than training a large-scale model from scratch.
1 code implementation • 27 Oct 2022 • Yu Cao, Dianqi Li, Meng Fang, Tianyi Zhou, Jun Gao, Yibing Zhan, DaCheng Tao
We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers.
1 code implementation • 2 Nov 2022 • Kaiwen Yang, Yanchao Sun, Jiahao Su, Fengxiang He, Xinmei Tian, Furong Huang, Tianyi Zhou, DaCheng Tao
In experiments, we show that our method consistently brings non-trivial improvements to the three aforementioned learning tasks in terms of both efficiency and final performance, whether or not combined with strong pre-defined augmentations, e.g., on medical images where domain knowledge is unavailable and existing augmentation techniques perform poorly.
1 code implementation • 16 Jan 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Chengqi Zhang, Bo Yang
Moreover, we provide visualizations and in-depth analysis of the personalization techniques in item embedding, which offer novel insights into the design of recommender systems in federated settings.
1 code implementation • 22 Jan 2023 • Zhiwei Li, Guodong Long, Tianyi Zhou
To address these challenges, we propose Federated Recommendation with Additive Personalization (FedRAP), which learns a global view of items via FL and a personalized view locally on each user.
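A toy sketch of the additive-personalization idea (a shared global item embedding plus a user-local additive term); the class name and the omission of the paper's sparsity/regularization terms are simplifications:

```python
import numpy as np

class AdditiveItemEmbedding:
    """Each user's effective item embedding is a shared global table (learned
    via FL) plus a user-specific local table kept on-device."""
    def __init__(self, num_items, dim, rng=np.random.default_rng(0)):
        self.global_emb = rng.standard_normal((num_items, dim)) * 0.01  # shared via FL
        self.local_emb = np.zeros((num_items, dim))                     # personalized, local

    def item_vectors(self):
        return self.global_emb + self.local_emb  # global view + personalized view

user = AdditiveItemEmbedding(num_items=100, dim=16)
print(user.item_vectors().shape)  # (100, 16)
```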
no code implementations • 27 Jan 2023 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
To address these challenges, we create a small model for a new task from the pruned models of similar tasks.
no code implementations • 14 Mar 2023 • Jiuhai Chen, Lichang Chen, Chen Zhu, Tianyi Zhou
Moreover, ICL (with and w/o CoT) using only one correct demo significantly outperforms all-demo ICL adopted by most previous works, indicating the weakness of LLMs in finding correct demo(s) for input queries, which is difficult to evaluate on the biased datasets.
3 code implementations • 15 Mar 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
Aerial Diffusion leverages a pretrained text-image diffusion model for prior knowledge.
no code implementations • 28 Mar 2023 • Zhihang Li, Zhao Song, Tianyi Zhou
In this paper, we make use of the input sparsity and propose an algorithm that uses $\log ( \|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega} )$ time per iteration to solve the problem.
no code implementations • 6 Apr 2023 • Jiuhai Chen, Lichang Chen, Heng Huang, Tianyi Zhou
However, it is not clear whether CoT is still effective on more recent instruction finetuned (IFT) LLMs such as ChatGPT.
no code implementations • 9 Apr 2023 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we study which modules in neural networks are more prone to forgetting by investigating their training dynamics during CL.
no code implementations • 26 Apr 2023 • Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou
Large language models (LLMs) are known for their exceptional performance in natural language processing, making them highly effective in many human life-related or even job-related tasks.
no code implementations • 27 Apr 2023 • Tao Shen, Guodong Long, Xiubo Geng, Chongyang Tao, Tianyi Zhou, Daxin Jiang
In this work, we propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios.
no code implementations • 13 May 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Bo Yang
Federated Recommendation is a new service architecture providing recommendations without sharing user data with the server.
no code implementations • 22 May 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang
However, this separation of the recommendation model and users' private data poses a challenge in providing quality service, particularly when it comes to new items, namely cold-start recommendations in federated settings.
no code implementations • 24 May 2023 • Davit Soselia, Khalid Saifullah, Tianyi Zhou
We evaluate the UI-to-Code performance using a combination of automated metrics such as MSE, BLEU, IoU, and a novel htmlBLEU score.
no code implementations • 25 May 2023 • Jiangtao Kong, Zhenyu Zong, Tianyi Zhou, Huajie Shao
In this paper, we propose YONO, in which You Only Need to replay One condensed prototype per class, and which for the first time can even outperform memory-costly exemplar-replay methods.
1 code implementation • 29 May 2023 • Yijun Yang, Tianyi Zhou, Jing Jiang, Guodong Long, Yuhui Shi
We address it by "Continual Task Allocation via Sparse Prompting (CoTASP)", which learns over-complete dictionaries to produce sparse masks as prompts extracting a sub-network for each task from a meta-policy network.
no code implementations • 4 Jun 2023 • Ritwik Sinha, Zhao Song, Tianyi Zhou
A model trained on these losses balances the trade-off between the creativity and reality of the model.
1 code implementation • 5 Jun 2023 • Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou
Large language models~(LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden.
no code implementations • 9 Jun 2023 • Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha
Trajectory length stands as a crucial hyperparameter within reinforcement learning (RL) algorithms, significantly contributing to the sample inefficiency in robotics applications.
1 code implementation • 16 Jun 2023 • Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao
We propose "Structured Cooperative Learning (SCooL)", in which a cooperation graph across devices is generated by a graphical model prior to automatically coordinate mutual learning between devices.
no code implementations • 19 Jun 2023 • Linxin Song, Jieyu Zhang, Xiaotian Lu, Tianyi Zhou
Instead of tuning the coefficient for each query round, which is sensitive and time-consuming, we propose the curriculum Firth bias reduction (CHAIN) that can automatically adjust the coefficient to be adaptive to the training process.
1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
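A minimal sketch of an H$_2$O-style eviction rule, keeping the most recent tokens plus the tokens with the largest accumulated attention ("heavy hitters"); the function and budget names are illustrative assumptions, not the released implementation:

```python
import numpy as np

def h2o_keep_indices(acc_attention, num_recent, num_heavy):
    """Return indices of KV entries to retain: a recent window plus the
    heavy-hitter tokens with the largest accumulated attention mass."""
    n = len(acc_attention)
    recent = set(range(max(0, n - num_recent), n))
    older = [i for i in np.argsort(acc_attention)[::-1] if i not in recent]
    heavy = set(older[:num_heavy])
    return sorted(recent | heavy)

scores = np.random.default_rng(0).random(32)   # accumulated attention per cached token
print(h2o_keep_indices(scores, num_recent=8, num_heavy=8))  # 16 KV entries kept
```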
1 code implementation • ICCV 2023 • Chengkai Hou, Jieyu Zhang, Haonan Wang, Tianyi Zhou
We overcome these drawbacks by a novel "subclass-balancing contrastive learning (SBCL)" approach that clusters each head class into multiple subclasses of sizes similar to the tail classes and enforces representations to capture the two-layer class hierarchy between the original classes and their subclasses.
no code implementations • 29 Jun 2023 • Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi
In ERC, we propose a regularizer that guides the approximation error towards the 1-eigensubspace, resulting in a more efficient and stable path of value approximation.
1 code implementation • 17 Jul 2023 • Soumik Mukhopadhyay, Matthew Gwilliam, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Srinidhi Hegde, Tianyi Zhou, Abhinav Shrivastava
We explore optimal methods for extracting and using these embeddings for classification tasks, demonstrating promising results on the ImageNet classification task.
3 code implementations • 17 Jul 2023 • Lichang Chen, Shiyang Li, Jun Yan, Hai Wang, Kalpa Gunaratna, Vikas Yadav, Zheng Tang, Vijay Srinivasan, Tianyi Zhou, Heng Huang, Hongxia Jin
Large language models (LLMs) strengthen instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data.
2 code implementations • 23 Aug 2023 • Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, Jing Xiao
In the realm of Large Language Models (LLMs), the balance between instruction data quality and quantity is a focal point.
no code implementations • 30 Aug 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao
Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.
1 code implementation • ICCV 2023 • Chengkai Hou, Jieyu Zhang, Tianyi Zhou
Unlike previous work, MADAug selects augmentation operators for each input image by a model-adaptive policy varying between training stages, producing a data augmentation curriculum optimized for better generalization.
no code implementations • 21 Sep 2023 • Shuang Ao, Tianyi Zhou, Guodong Long, Xuan Song, Jing Jiang
Throughout their long history, natural species have learned to survive by evolving physical structures adapted to environmental changes.
1 code implementation • 27 Sep 2023 • Linxin Song, Jieyu Zhang, Lechao Cheng, Pengyuan Zhou, Tianyi Zhou, Irene Li
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities of natural language processing (NLP).
1 code implementation • 15 Oct 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao
Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly as the number of activated experts increases, limiting its practical utility.
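For background, a toy sketch of standard top-$k$ MoE routing (not this paper's method), illustrating why compute grows with the number of activated experts:

```python
import numpy as np

def topk_moe_forward(x, gate_w, experts, k=1):
    """Route the input through only the k highest-gated experts; each extra
    activated expert adds one more expert forward pass."""
    logits = x @ gate_w                              # one gate logit per expert
    topk = np.argsort(logits)[-k:]                   # indices of the k largest gates
    weights = np.exp(logits[topk]) / np.exp(logits[topk]).sum()  # renormalized softmax
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
experts = [(lambda W: (lambda x: x @ W))(rng.standard_normal((8, 8))) for _ in range(4)]
gate_w = rng.standard_normal((8, 4))
print(topk_moe_forward(rng.standard_normal(8), gate_w, experts, k=2).shape)  # (8,)
```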
no code implementations • 18 Oct 2023 • Yichuan Deng, Zhao Song, Tianyi Zhou
Large transformer models have achieved state-of-the-art results in numerous natural language processing tasks.
2 code implementations • 18 Oct 2023 • Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Heng Huang, Jiuxiang Gu, Tianyi Zhou
Recent advancements in Large Language Models (LLMs) have expanded the horizons of natural language understanding and generation.
5 code implementations • 23 Oct 2023 • Tianrui Guan, Fuxiao Liu, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou
Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs.
Ranked #1 on Visual Question Answering (VQA) on HallusionBench
1 code implementation • 26 Oct 2023 • Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen
We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability.
no code implementations • 19 Nov 2023 • Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou
In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem [prr89]: given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exactly $k$ pairs whose inner product passes a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that $\forall i \in [k], \langle a_i, b_i \rangle \geq \rho \cdot d$ for a threshold $\rho \in (0, 1)$, the goal is to identify those $k$ heavy inner products.
2 code implementations • 27 Nov 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
It seamlessly blends the visual features from the input image within a pretrained text-to-2D-image stable diffusion model with a test-time optimization process for a careful bias-variance trade-off, which uses an Inverse Perspective Mapping (IPM) homography transformation to provide subtle cues for aerial-view synthesis.
1 code implementation • 28 Nov 2023 • Yijun Yang, Tianyi Zhou, Kanxue Li, Dapeng Tao, Lusong Li, Li Shen, Xiaodong He, Jing Jiang, Yuhui Shi
While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals.
1 code implementation • 29 Nov 2023 • Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Abhinav Shrivastava
We find that the intermediate feature maps of the U-Net are diverse, discriminative feature representations.
1 code implementation • 4 Dec 2023 • Kaiwen Yang, Tao Shen, Xinmei Tian, Xiubo Geng, Chongyang Tao, DaCheng Tao, Tianyi Zhou
QVix enables a wider exploration of visual scenes, improving the LVLMs' reasoning accuracy and depth in tasks such as visual question answering and visual entailment.
1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions.
no code implementations • 17 Jan 2024 • Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo
In this paper, we investigate the intersection of large generative AI models and cloud-native computing architectures.
1 code implementation • 1 Feb 2024 • Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou
Data filtering for instruction tuning has proved important in improving both the efficiency and performance of the tuning process.
no code implementations • 11 Feb 2024 • Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro
In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs.
no code implementations • 12 Feb 2024 • Jiuxiang Gu, Chenyang Li, YIngyu Liang, Zhenmei Shi, Zhao Song, Tianyi Zhou
Our research presents a thorough analytical characterization of the features learned by stylized one-hidden layer neural networks and one-layer Transformers in addressing this task.
2 code implementations • 15 Feb 2024 • Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Jiuxiang Gu, Tianyi Zhou
Instruction tuning is critical to large language models (LLMs) for achieving better instruction following and task adaptation capabilities but its success heavily relies on the training data quality.
1 code implementation • 16 Feb 2024 • Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou
To examine DEBATunE, we curate the largest dataset of debate topics so far, which covers 710 controversial topics and corresponding arguments for each topic.
1 code implementation • 20 Feb 2024 • Sen Li, Ruochen Wang, Cho-Jui Hsieh, Minhao Cheng, Tianyi Zhou
Moreover, MuLan adopts a vision-language model (VLM) to provide feedback on the image generated in each sub-task and to control the diffusion model to re-generate the image if it violates the original prompt.
1 code implementation • 20 Feb 2024 • Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, DaCheng Tao, Tianyi Zhou
In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.
1 code implementation • 25 Feb 2024 • Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh
DrAttack includes three key components: (a) `Decomposition' of the original prompt into sub-prompts, (b) `Reconstruction' of these sub-prompts implicitly by in-context learning with semantically similar but harmless reassembling demo, and (c) a `Synonym Search' of sub-prompts, aiming to find sub-prompts' synonyms that maintain the original intent while jailbreaking LLMs.
no code implementations • 28 Feb 2024 • Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates
In this work, we introduce a new unsupervised embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from Large Language Models (LLMs) without the need for model fine-tuning or task-specific engineering.
1 code implementation • 28 Feb 2024 • Yibin Lei, Yu Cao, Tianyi Zhou, Tao Shen, Andrew Yates
Recent studies demonstrate that query expansions generated by large language models (LLMs) can considerably enhance information retrieval systems by generating hypothetical documents that answer the queries as expansions.
no code implementations • 6 Mar 2024 • Ziyue Li, Tian Li, Virginia Smith, Jeff Bilmes, Tianyi Zhou
Optimizing the performance of many objectives (instantiated by tasks or clients) jointly with a few Pareto stationary solutions (models) is critical in machine learning.
no code implementations • 11 Mar 2024 • Bhavya Vasudeva, Deqing Fu, Tianyi Zhou, Elliott Kau, Youqi Huang, Vatsal Sharan
Transformers achieve state-of-the-art accuracy and robustness across many tasks, but an understanding of the inductive biases that they have and how those biases are different from other neural network architectures remains elusive.
1 code implementation • 19 Apr 2024 • Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi
We then leverage this upper bound to propose a novel regularizer, namely BEllman Equation-based automatic rank Regularizer (BEER).
no code implementations • ICML 2020 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
In this paper, we study the dynamics of neural net outputs in SSL and show that selecting and using first the unlabeled samples with more consistent outputs over the course of training (i.e., "time-consistency") can improve the final test accuracy and save computation.