Search Results for author: Shangling Jui

Found 36 papers, 21 papers with code

Rethinking Optimization and Architecture for Tiny Language Models

1 code implementation • 5 Feb 2024 • Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang

Several design formulas are empirically shown to be especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance and multiple-round training.
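Of these formulas, tokenizer compression is the easiest to make concrete: shrink the vocabulary to the most frequent tokens and slice the embedding table to match. A minimal sketch under that reading (function and variable names are illustrative assumptions, not from the paper):

```python
import numpy as np

def compress_tokenizer(embeddings: np.ndarray, token_freqs: np.ndarray, keep: int):
    """Keep only the `keep` most frequent tokens and slice the embedding
    table to match. Illustrative sketch, not the paper's code."""
    order = np.argsort(-token_freqs)       # most frequent first
    kept_ids = order[:keep]
    new_embeddings = embeddings[kept_ids]  # (keep, d) compressed table
    # map old token ids -> new ids; dropped tokens get -1 (e.g. remap to <unk>)
    remap = -np.ones(len(token_freqs), dtype=np.int64)
    remap[kept_ids] = np.arange(keep)
    return new_embeddings, remap

# usage: shrink a 32k-token table to 16k
emb, remap = compress_tokenizer(np.random.randn(32000, 512),
                                np.random.rand(32000), keep=16000)
```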

Language Modelling

A Theory of Non-Acyclic Generative Flow Networks

no code implementations • 23 Dec 2023 • Leo Maxime Brunswic, Yinchuan Li, Yushun Xu, Shangling Jui, Lizhuang Ma

GFlowNets are a novel family of flow-based methods for learning a stochastic policy that generates objects via a sequence of actions, with probability proportional to a given positive reward.
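In the standard acyclic setting that this paper generalizes, the training goal can be stated as a terminating distribution proportional to the reward, enforced through a flow-matching constraint (a textbook statement of the base framework, not the paper's non-acyclic extension):

```latex
% GFlowNet goal: sample terminal objects with probability proportional to reward
P_\top(x) \;\propto\; R(x), \qquad R(x) > 0.
% Flow-matching: inflow equals outflow for every non-terminal state s
\sum_{s' \in \mathrm{Parent}(s)} F(s' \to s)
\;=\;
\sum_{s'' \in \mathrm{Child}(s)} F(s \to s'')
```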

Exploring the Naturalness of AI-Generated Images

1 code implementation • 9 Dec 2023 • Zijian Chen, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

In this paper, we take the first step to benchmark and assess the visual naturalness of AI-generated images.

Trust your Good Friends: Source-free Domain Adaptation by Reciprocal Neighborhood Clustering

no code implementations • 1 Sep 2023 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui, Jian Yang

We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity.
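A minimal PyTorch-style sketch of that idea, as we read it (our paraphrase, not the authors' released code): define affinity as cosine similarity, take each target sample's nearest neighbors, and reward agreement between their predicted class distributions.

```python
import torch
import torch.nn.functional as F

def neighborhood_consistency_loss(features, logits, k=5):
    """Encourage label consistency among high-affinity (nearest) neighbors.
    Illustrative sketch of neighborhood clustering, not the paper's code."""
    feats = F.normalize(features, dim=1)   # (N, d) unit-norm target features
    affinity = feats @ feats.t()           # cosine local affinity
    affinity.fill_diagonal_(-1.0)          # exclude self from neighbor search
    _, nn_idx = affinity.topk(k, dim=1)    # k nearest neighbors per sample
    probs = F.softmax(logits, dim=1)       # (N, C) predicted distributions
    neighbor_probs = probs[nn_idx]         # (N, k, C)
    # dot-product agreement between each sample and its neighbors
    agreement = (probs.unsqueeze(1) * neighbor_probs).sum(-1)
    return -agreement.log().mean()         # higher agreement -> lower loss
```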

Clustering · Source-Free Domain Adaptation

Ternary Singular Value Decomposition as a Better Parameterized Form in Linear Mapping

1 code implementation • 15 Aug 2023 • BoYu Chen, Hanxuan Chen, Jiao He, Fengyu Sun, Shangling Jui

We present a simple yet novel parameterized form of linear mapping that achieves remarkable network compression performance: a pseudo SVD called Ternary SVD (TSVD).
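The core idea, as we read it: factor W ≈ U diag(s) Vᵀ but constrain U and V to ternary values {-1, 0, +1} plus per-column scales, so applying the two factor matrices needs no multiplications. A rough sketch, with a naive thresholding ternarizer standing in for the paper's actual construction:

```python
import numpy as np

def ternarize(M, sparsity=0.5):
    """Naive ternarizer: zero out small entries, keep signs of the rest,
    and fit one scale per column by least squares. A stand-in for the
    paper's construction, not a faithful reproduction."""
    thresh = np.quantile(np.abs(M), sparsity, axis=0, keepdims=True)
    T = np.sign(M) * (np.abs(M) >= thresh)   # entries in {-1, 0, +1}
    # per-column scale alpha minimizing ||M - T * alpha||^2
    alpha = (M * T).sum(0) / np.maximum((T * T).sum(0), 1e-12)
    return T, alpha

def ternary_svd(W, rank):
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    Tu, au = ternarize(U[:, :rank])
    Tv, av = ternarize(Vt[:rank].T)
    core = au * s[:rank] * av                # fold scales into the core
    return Tu, core, Tv                      # W ~= Tu @ diag(core) @ Tv.T

W = np.random.randn(256, 128)
Tu, core, Tv = ternary_svd(W, rank=64)
W_hat = (Tu * core) @ Tv.T                   # low-rank ternary reconstruction
```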

Language Modelling · Large Language Model +1

Reparameterization through Spatial Gradient Scaling

1 code implementation • 5 Mar 2023 • Alexander Detkov, Mohammad Salameh, Muhammad Fetrat Qharabagh, Jialin Zhang, Wei Lui, Shangling Jui, Di Niu

Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
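For context, the equivalence being referred to relies on the linearity of convolution: parallel branches used during training can be folded back into a single kernel at inference. A generic illustration of that folding (the paper's own contribution is to replace the multi-branch structure with spatial gradient scaling; this sketch shows only the standard setup):

```python
import numpy as np

def merge_parallel_convs(kernels):
    """Collapse parallel conv branches into one equivalent kernel.
    Since convolution is linear, conv(x, K1) + conv(x, K2) == conv(x, K1 + K2);
    smaller kernels are zero-padded to the largest spatial size first.
    Generic illustration of structural reparameterization."""
    kmax = max(k.shape[-1] for k in kernels)        # target spatial size
    merged = np.zeros(kernels[0].shape[:2] + (kmax, kmax))
    for k in kernels:
        pad = (kmax - k.shape[-1]) // 2             # center smaller kernels
        merged[..., pad:kmax - pad, pad:kmax - pad] += k
    return merged

# e.g. merge a 3x3 branch and a 1x1 branch into a single 3x3 kernel
k3 = np.random.randn(16, 16, 3, 3)
k1 = np.random.randn(16, 16, 1, 1)
k_merged = merge_parallel_convs([k3, k1])
```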

A General-Purpose Transferable Predictor for Neural Architecture Search

no code implementations • 21 Feb 2023 • Fred X. Han, Keith G. Mills, Fabian Chudak, Parsa Riahi, Mohammad Salameh, Jialin Zhang, Wei Lu, Shangling Jui, Di Niu

In this paper, we propose a general-purpose neural predictor for NAS that can transfer across search spaces, by representing any given candidate Convolutional Neural Network (CNN) with a Computation Graph (CG) that consists of primitive operators.

Contrastive Learning · Graph Representation Learning +1

GENNAPE: Towards Generalized Neural Architecture Performance Estimators

1 code implementation • 30 Nov 2022 • Keith G. Mills, Fred X. Han, Jialin Zhang, Fabian Chudak, Ali Safari Mamaghani, Mohammad Salameh, Wei Lu, Shangling Jui, Di Niu

In this paper, we propose GENNAPE, a Generalized Neural Architecture Performance Estimator, which is pretrained on open neural architecture benchmarks, and aims to generalize to completely unseen architectures through combined innovations in network representation, contrastive pretraining, and fuzzy clustering-based predictor ensemble.

Contrastive Learning · Image Classification +1

OneRing: A Simple Method for Source-free Open-partial Domain Adaptation

1 code implementation • 7 Jun 2022 • Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost Van de Weijer

In this paper, we investigate Source-free Open-partial Domain Adaptation (SF-OPDA), which addresses the situation where there exist both domain and category shifts between source and target domains.

Domain Generalization · Open Set Learning +2

R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning

1 code implementation • ICLR 2022 • Shengyao Lu, Bang Liu, Keith G. Mills, Shangling Jui, Di Niu

Systematicity, i.e., the ability to recombine known parts and rules to form new sequences while reasoning over relational data, is critical to machine intelligence.

Relation · Relational Reasoning

Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation

1 code implementation • 9 May 2022 • Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost Van de Weijer

Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency.
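One way to write such an objective down (our hedged reading of the attract-and-disperse idea, not necessarily the paper's exact loss): attract each prediction toward those of its feature-space neighbors and disperse it from the rest,

```latex
\mathcal{L}
\;=\;
-\sum_{i}\Big(
  \sum_{j \in \mathcal{N}_i} p_i^{\top} p_j
  \;-\;
  \lambda \sum_{k \notin \mathcal{N}_i \cup \{i\}} p_i^{\top} p_k
\Big)
```

where p_i is the softmax prediction of sample i, N_i its nearest neighbors in feature space, and λ trades off the dispersion term.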

Clustering · Source-Free Domain Adaptation

Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition

1 code implementation • 9 Nov 2021 • Kai Wang, Xialei Liu, Andy Bagdanov, Luis Herranz, Shangling Jui, Joost Van de Weijer

We propose an approach to IML, which we call Episodic Replay Distillation (ERD), that mixes classes from the current task with class exemplars from previous tasks when sampling episodes for meta-learning.
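A toy sketch of the episode-mixing step as described (the function name and the mixing ratio are illustrative assumptions, not taken from the paper):

```python
import random

def sample_erd_episode(current_classes, exemplar_classes,
                       n_way=5, current_ratio=0.6):
    """Sample a meta-learning episode that mixes classes from the current
    task with exemplar classes from previous tasks (illustrative sketch
    of ERD-style episode sampling)."""
    n_current = max(1, int(n_way * current_ratio))
    n_old = n_way - n_current
    classes = (random.sample(current_classes, n_current) +
               random.sample(exemplar_classes, n_old))
    random.shuffle(classes)
    return classes

# e.g. task 4 introduces classes 60-79; classes 0-59 live in exemplar memory
episode = sample_erd_episode(list(range(60, 80)), list(range(0, 60)))
```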

Continual Learning · Knowledge Distillation +1

Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

no code implementations • NeurIPS 2021 • Ke Sun, Yafei Wang, Yi Liu, Yingnan Zhao, Bo Pan, Shangling Jui, Bei Jiang, Linglong Kong

Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL.
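For reference, Anderson mixing accelerates a fixed-point iteration x_{k+1} = G(x_k) by combining the last m+1 iterates; the damped variant the title refers to blends the mixed update with the previous iterates through a damping factor β_k:

```latex
x_{k+1}
\;=\;
(1-\beta_k)\sum_{i=0}^{m} \alpha_i^{k}\, x_{k-m+i}
\;+\;
\beta_k \sum_{i=0}^{m} \alpha_i^{k}\, G(x_{k-m+i}),
\qquad
\alpha^{k} \;=\; \arg\min_{\alpha}\Big\|\sum_{i=0}^{m} \alpha_i\, r_{k-m+i}\Big\|_2
\;\;\text{s.t.}\;\; \sum_{i=0}^{m} \alpha_i = 1
```

with residuals r_j = G(x_j) - x_j and β_k ∈ (0, 1].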

reinforcement-learning · Reinforcement Learning (RL)

Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation

2 code implementations • NeurIPS 2021 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui

In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data.

Source-Free Domain Adaptation

Exploring the Robustness of Distributional Reinforcement Learning against Noisy State Observations

no code implementations • 29 Sep 2021 • Ke Sun, Yi Liu, Yingnan Zhao, Hengshuai Yao, Shangling Jui, Linglong Kong

In real scenarios, the state observations that an agent receives may contain measurement errors or adversarial noise, misleading the agent into taking suboptimal actions or even collapsing during training.

Distributional Reinforcement Learning · reinforcement-learning +1

Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data

no code implementations • ICLR 2022 • Yaxing Wang, Joost Van de Weijer, Lu Yu, Shangling Jui

Therefore, we investigate knowledge distillation to transfer knowledge from a high-quality unconditional generative model (e.g., StyleGAN) to conditional synthetic image generation modules in a variety of systems.

Image Generation · Knowledge Distillation +2

Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search

1 code implementation • 25 Sep 2021 • Keith G. Mills, Fred X. Han, Jialin Zhang, Seyed Saeed Changiz Rezaei, Fabian Chudak, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

Neural architecture search automates neural network design and has achieved state-of-the-art results in many deep learning applications.

Neural Architecture Search

L$^{2}$NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning

no code implementations • 25 Sep 2021 • Keith G. Mills, Fred X. Han, Mohammad Salameh, Seyed Saeed Changiz Rezaei, Linglong Kong, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

In this paper, we propose L$^{2}$NAS, which learns to intelligently optimize and update architecture hyperparameters via an actor neural network based on the distribution of high-performing architectures in the search history.

Hyperparameter Optimization · Neural Architecture Search +2

Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations

1 code implementation • 17 Sep 2021 • Ke Sun, Yingnan Zhao, Shangling Jui, Linglong Kong

In real scenarios, the state observations that an agent receives may contain measurement errors or adversarial noise, misleading the agent into taking suboptimal actions or even collapsing during training.

Density Estimation · Distributional Reinforcement Learning +2

Generalized Source-free Domain Adaptation

1 code implementation • ICCV 2021 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui

In this paper, we propose a new domain adaptation paradigm called Generalized Source-free Domain Adaptation (G-SFDA), where the learned model needs to perform well on both the target and source domains, with only access to current unlabeled target data during adaptation.

Source-Free Domain Adaptation

ReNAS: Relativistic Evaluation of Neural Architecture Search

7 code implementations • CVPR 2021 • Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

Generative Adversarial Neural Architecture Search

no code implementations • 19 May 2021 • Seyed Saeed Changiz Rezaei, Fred X. Han, Di Niu, Mohammad Salameh, Keith Mills, Shuo Lian, Wei Lu, Shangling Jui

Despite the empirical success of neural architecture search (NAS) in deep learning applications, the optimality, reproducibility and cost of NAS schemes remain hard to assess.

Neural Architecture Search

MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains

1 code implementation • 28 Apr 2021 • Yaxing Wang, Abel Gonzalez-Garcia, Chenshen Wu, Luis Herranz, Fahad Shahbaz Khan, Shangling Jui, Joost Van de Weijer

Therefore, we propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs.

Transfer Learning

Generative Adversarial Neural Architecture Search with Importance Sampling

no code implementations • 1 Jan 2021 • Seyed Saeed Changiz Rezaei, Fred X. Han, Di Niu, Mohammad Salameh, Keith G Mills, Shangling Jui

Despite the empirical success of neural architecture search (NAS) algorithms in deep learning applications, the optimality, reproducibility and cost of NAS schemes remain hard to assess.

Neural Architecture Search

Casting a BAIT for Offline and Online Source-free Domain Adaptation

2 code implementations • 23 Oct 2020 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui

When adapting to the target domain, the additional classifier, initialized from the source classifier, is expected to find misclassified features.

Source-Free Domain Adaptation · Unsupervised Domain Adaptation

Neural Architecture Search For Keyword Spotting

no code implementations • 1 Sep 2020 • Tong Mo, Yakun Yu, Mohammad Salameh, Di Niu, Shangling Jui

Deep neural networks have recently become a popular solution to keyword spotting systems, which enable the control of smart devices via voice.

Ranked #1 on Keyword Spotting on Google Speech Commands (Google Speech Commands V1 6 metric)

Keyword Spotting · Neural Architecture Search

Semantic Drift Compensation for Class-Incremental Learning

2 code implementations • CVPR 2020 • Lu Yu, Bartłomiej Twardowski, Xialei Liu, Luis Herranz, Kai Wang, Yongmei Cheng, Shangling Jui, Joost Van de Weijer

The vast majority of methods have studied this scenario for classification networks, where for each new task the classification layer of the network must be augmented with additional weights to make room for the newly added classes.
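The "make room" step is mechanical: grow the weight matrix of the final linear layer with rows for the new classes while keeping the old rows intact. A minimal PyTorch sketch of that augmentation (illustrative only; the paper's contribution, semantic drift compensation, concerns what happens to the old class representations afterwards):

```python
import torch
import torch.nn as nn

def expand_classifier(head: nn.Linear, n_new: int) -> nn.Linear:
    """Augment a classification layer with weights for newly added classes,
    preserving the existing class weights (illustrative sketch)."""
    new_head = nn.Linear(head.in_features, head.out_features + n_new)
    with torch.no_grad():
        new_head.weight[:head.out_features] = head.weight
        new_head.bias[:head.out_features] = head.bias
    return new_head

head = nn.Linear(512, 10)          # task 1: 10 classes
head = expand_classifier(head, 5)  # task 2 adds 5 classes -> 15 outputs
```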

Class Incremental Learning · General Classification +1

ReNAS: Relativistic Evaluation of Neural Architecture Search

4 code implementations • 30 Sep 2019 • Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

Three-Head Neural Network Architecture for AlphaZero Learning

no code implementations • 25 Sep 2019 • Chao Gao, Martin Mueller, Ryan Hayward, Hengshuai Yao, Shangling Jui

A three-head network architecture has recently been proposed that learns a third, action-value head on the same fixed dataset used for the two-head network.

Deep Demosaicing for Edge Implementation

no code implementations • 26 Mar 2019 • Ramchalam Kinattinkara Ramakrishnan, Shangling Jui, Vahid Patrovi Nia

We perform an exhaustive search over deep neural network architectures and obtain a Pareto front of Color Peak Signal to Noise Ratio (CPSNR), as the performance criterion, versus the number of parameters, as the model complexity, that beats the state of the art.
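For reference, CPSNR is the usual PSNR computed jointly over the three color channels of an H × W 8-bit image:

```latex
\mathrm{CPSNR}
\;=\;
10 \log_{10}
\frac{255^2}
{\frac{1}{3HW}\sum_{c=1}^{3}\sum_{x=1}^{H}\sum_{y=1}^{W}
\big(I_c(x,y)-\hat{I}_c(x,y)\big)^2}
```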

Demosaicking · Neural Architecture Search
