Search Results for author: Yang Zhao

Found 229 papers, 76 papers with code

Rethinking Sentiment Style Transfer

no code implementations Findings (EMNLP) 2021 Ping Yu, Yang Zhao, Chunyuan Li, Changyou Chen

To overcome this issue, we propose a graph-based method to extract attribute content and attribute-independent content from input sentences in the YELP dataset and IMDB dataset.

Attribute Style Transfer +1

A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning

no code implementations Findings (NAACL) 2022 Yang Zhao, Hua Qin, Wang Zhenyu, Changxi Zhu, Shihan Wang

It supports evaluating the difficulty of dialogue tasks only using the learning experiences of dialogue policy and skip-level selection according to their learning needs to maximize the learning efficiency.

Deep Reinforcement Learning

A Flexible Recurrent Residual Pyramid Network for Video Frame Interpolation

no code implementations ECCV 2020 Haoxian Zhang, Yang Zhao, Ronggang Wang

Inspired by classical pyramid energy minimization optical flow algorithms, this paper proposes a recurrent residual pyramid network (RRPN) for video frame interpolation.

Optical Flow Estimation Video Frame Interpolation

A Simple Yet Effective Corpus Construction Method for Chinese Sentence Compression

no code implementations LREC 2022 Yang Zhao, Hiroshi Kanayama, Issei Yoshida, Masayasu Muraoka, Akiko Aizawa

To remedy this shortcoming, we present a dependency-tree-based method to construct a Chinese corpus with 151k pairs of sentences and compression based on Chinese language-specific characteristics.

Sentence Sentence Compression

SEGA: Drivable 3D Gaussian Head Avatar from a Single Image

no code implementations 19 Apr 2025 Chen Guo, Zhuo Su, Jian Wang, Shuang Li, Xu Chang, Zhaohu Li, Yang Zhao, Guidong Wang, Ruqi Huang

Creating photorealistic 3D head avatars from limited input has become increasingly important for applications in virtual reality, telepresence, and digital entertainment.

Neural Rendering

Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models

no code implementations 17 Apr 2025 Zhouhao Sun, Xiao Ding, Li Du, Yunpeng Xu, Yixuan Ma, Yang Zhao, Bing Qin, Ting Liu

Despite significant progress, recent studies indicate that current large language models (LLMs) may still capture dataset biases and utilize them during inference, leading to the poor generalizability of LLMs.

Diversity In-Context Learning

MLKV: Efficiently Scaling up Large Embedding Model Training with Disk-based Key-Value Storage

1 code implementation 2 Apr 2025 Yongjun He, Roger Waleffe, Zhichao Han, Johnu George, Binhang Yuan, Zitao Zhang, Yinan Shan, Yang Zhao, Debojyoti Dutta, Theodoros Rekatsinas, Ce Zhang

As increasingly diverse ML applications utilize embedding models and embedding tables continue to grow in size and number, there has been a surge in the ad-hoc development of specialized frameworks targeted to train large embedding models for specific tasks.

TransDiffSBDD: Causality-Aware Multi-Modal Structure-Based Drug Design

no code implementations 26 Mar 2025 Xiuyuan Hu, Guoqing Liu, Can Chen, Yang Zhao, Hao Zhang, Xue Liu

To address both challenges, we propose TransDiffSBDD, an integrated framework combining autoregressive transformers and diffusion models for SBDD.

Drug Design Drug Discovery

Diff-Palm: Realistic Palmprint Generation with Polynomial Creases and Intra-Class Variation Controllable Diffusion Models

1 code implementation 24 Mar 2025 Jianlong Jin, Chenglong Zhao, Ruixin Zhang, Sheng Shang, Jianqing Xu, Jingyun Zhang, Shaoming Wang, Yang Zhao, Shouhong Ding, Wei Jia, Yunsheng Wu

However, without employing real data fine-tuning, the performance of the recognition model trained on these synthetic datasets would drastically decline, indicating a large gap between generated and real palmprints.

MVD-HuGaS: Human Gaussians from a Single Image via 3D Human Multi-view Diffusion Prior

no code implementations 11 Mar 2025 Kaiqiang Xiong, Ying Feng, Qi Zhang, Jianbo Jiao, Yang Zhao, Zhihao Liang, Huachen Gao, Ronggang Wang

We first generate multi-view images from the single reference image with an enhanced multi-view diffusion model, which is well fine-tuned on high-quality 3D human datasets to incorporate 3D geometry priors and human structure priors.

3D geometry 3D Human Reconstruction

Efficient Jailbreaking of Large Models by Freeze Training: Lower Layers Exhibit Greater Sensitivity to Harmful Content

no code implementations 28 Feb 2025 Hongyuan Shen, Min Zheng, Jincheng Wang, Yang Zhao

With the widespread application of Large Language Models across various domains, their security issues have increasingly garnered significant attention from both academic and industrial communities.

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

no code implementations 22 Feb 2025 Qianqi Yan, Yue Fan, Hongquan Li, Shan Jiang, Yang Zhao, Xinze Guan, Ching-Chen Kuo, Xin Eric Wang

Existing Multimodal Large Language Models (MLLMs) are predominantly trained and tested on consistent visual-textual inputs, leaving open the question of whether they can handle inconsistencies in real-world, layout-rich content.

Multimodal Reasoning

DOEI: Dual Optimization of Embedding Information for Attention-Enhanced Class Activation Maps

1 code implementation 21 Feb 2025 Hongjie Zhu, Zeyu Zhang, Guansong Pang, Xu Wang, Shimin Wen, Yu Bai, Daji Ergu, Ying Cai, Yang Zhao

This alignment of activation responses with semantic information strengthens the propagation and decoupling of target features, enabling the generated embeddings to more accurately represent target features in high-level semantic space.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Poster: SpiderSim: Multi-Agent Driven Theoretical Cybersecurity Simulation for Industrial Digitalization

1 code implementation 19 Feb 2025 Jiaqi Li, Xizhong Guo, Yang Zhao, Lvyang Zhang, Lidong Zhai

Rapid industrial digitalization has created intricate cybersecurity demands that necessitate effective validation methods.

Diversity

PedDet: Adaptive Spectral Optimization for Multimodal Pedestrian Detection

1 code implementation 19 Feb 2025 Rui Zhao, Zeyu Zhang, Yi Xu, Yi Yao, Yan Huang, Wenxin Zhang, Zirui Song, Xiuying Chen, Yang Zhao

Pedestrian detection in intelligent transportation systems has made significant progress but faces two critical challenges: (1) insufficient fusion of complementary information between visible and infrared spectra, particularly in complex scenarios, and (2) sensitivity to illumination changes, such as low-light or overexposed conditions, leading to degraded performance.

Pedestrian Detection

Fluid Antenna Enabled Over-the-Air Federated Learning: Joint Optimization of Positioning, Beamforming, and User Selection

no code implementations 17 Feb 2025 Yang Zhao, Minrui Xu, Ping Wang, Dusit Niyato

Over-the-air (OTA) federated learning (FL) effectively utilizes communication bandwidth, yet it is vulnerable to errors during analog aggregation.

Federated Learning Stochastic Optimization

Beyond Similarity: A Gradient-based Graph Method for Instruction Tuning Data Selection

no code implementations 16 Feb 2025 Yang Zhao, Li Du, Xiao Ding, Yangou Ouyang, Hepeng Wang, Kai Xiong, Jinglong Gao, Zhouhao Sun, Dongliang Xu, Yang Qing, Dongchen Li, Bing Qin, Ting Liu

Large language models (LLMs) have shown great potential across various industries due to their remarkable ability to generalize through instruction tuning.

Domain Adaptation Transfer Learning

3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery

1 code implementation 7 Feb 2025 Xiuyuan Hu, Guoqing Liu, Can Chen, Yang Zhao, Hao Zhang, Xue Liu

Structure-based drug discovery, encompassing the tasks of protein-ligand docking and pocket-aware 3D drug design, represents a core challenge in drug discovery.

Drug Design Drug Discovery

MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction

1 code implementation 2 Feb 2025 Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To

Bone density prediction via CT scans to estimate T-scores is crucial, providing a more precise assessment of bone health compared to traditional methods like X-ray bone density tests, which lack spatial resolution and the ability to detect localized changes.

Prediction

GENIE: Generative Note Information Extraction model for structuring EHR data

no code implementations 30 Jan 2025 Huaiyuan Ying, Hongyi Yuan, Jinsen Lu, Zitian Qu, Yang Zhao, Zhengyun Zhao, Isaac Kohane, Tianxi Cai, Sheng Yu

Traditional methods for structuring EHR free-text data, such as rule-based systems and multi-stage pipelines, are often limited by their time-consuming configurations and inability to adapt across clinical notes from diverse healthcare settings.

Attribute Attribute Extraction

Movable Antenna-Aided Cooperative ISAC Network with Time Synchronization error and Imperfect CSI

no code implementations 26 Jan 2025 Yue Xiu, Yang Zhao, Ran Yang, Dusit Niyato, Jing Jin, Qixing Wang, Guangyi Liu, Ning Wei

We analyze the impact of CSI errors on achievable rates and introduce a hybrid Cramer-Rao lower bound (HCRLB) to evaluate the effect of TS errors on target localization accuracy.

Deep Reinforcement Learning Integrated sensing and communication +1

SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration

no code implementations 2 Jan 2025 Jianyi Wang, Zhijie Lin, Meng Wei, Yang Zhao, Ceyuan Yang, Chen Change Loy, Lu Jiang

Video restoration poses non-trivial challenges in maintaining fidelity while recovering temporally consistent details from unknown degradations in the wild.

Video Restoration

ProjectedEx: Enhancing Generation in Explainable AI for Prostate Cancer

1 code implementation 2 Jan 2025 Xuyin Qi, Zeyu Zhang, Aaron Berliano Handoko, Huazhan Zheng, Mingxi Chen, Ta Duc Huy, Vu Minh Hieu Phan, Lei Zhang, Linqi Cheng, Shiyu Jiang, Zhiwei Zhang, Zhibin Liao, Yang Zhao, Minh-Son To

Additionally, we conduct comprehensive experiments on both the generator and classifier, demonstrating the clinical relevance and effectiveness of ProjectedEx in enhancing interpretability and supporting the adoption of AI in medical settings.

Attribute Diagnostic +3

Optimizing SSD Caches for Cloud Block Storage Systems Using Machine Learning Approaches

no code implementations 29 Dec 2024 Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

The management of data writes to SSD caches plays a crucial role in improving overall system performance, reducing latency, and extending the lifespan of storage devices.

Management

Dynamic Optimization of Storage Systems Using Reinforcement Learning Techniques

no code implementations 29 Dec 2024 Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

Traditional heuristics employed for storage performance optimization often fail to adapt to the variability and complexity of contemporary workloads, leading to significant performance bottlenecks and resource inefficiencies.

Q-Learning reinforcement-learning +2

Dynamic Adaptation in Data Storage: Real-Time Machine Learning for Enhanced Prefetching

no code implementations 29 Dec 2024 Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies [1].

Computational Efficiency Feature Engineering +1

SegKAN: High-Resolution Medical Image Segmentation with Long-Distance Dependencies

1 code implementation 28 Dec 2024 Shengbo Tan, Rundong Xue, Shipeng Luo, Zeyu Zhang, Xinran Wang, Lei Zhang, Daji Ergu, Zhang Yi, Yang Zhao, Ying Cai

Hepatic vessels in computed tomography scans often suffer from image fragmentation and noise interference, making it difficult to maintain vessel integrity and posing significant challenges for vessel segmentation.

Image Segmentation Medical Image Segmentation +1

Fast inverse lithography based on a model-driven block stacking convolutional neural network

no code implementations 19 Dec 2024 Ruixiang Chen, Yang Zhao, Haoqin Li, Rui Chen

In the realm of lithography, Optical Proximity Correction (OPC) is a crucial resolution enhancement technique that optimizes the transmission function of photomasks on a pixel basis to effectively counter Optical Proximity Effects (OPE).

Diversity

Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing

no code implementations 15 Dec 2024 Pengcheng Zhao, Jinxing Zhou, Yang Zhao, Dan Guo, Yanxiang Chen

However, each segment may contain multiple events, resulting in semantically mixed holistic features that can lead to semantic interference during intra- or cross-modal interactions: the event semantics of one segment may incorporate semantics of unrelated events from other segments.

Lightweight Multiplane Images Network for Real-Time Stereoscopic Conversion from Planar Video

no code implementations 4 Dec 2024 Shanding Diao, Yang Zhao, Yuan Chen, Zhao Zhang, Wei Jia, Ronggang Wang

This paper proposes a planar video real-time stereoscopic conversion network based on multi-plane images (MPI), which consists of a detail branch for generating MPI and a depth-semantic branch for perceiving depth information.

2k

Controlling the Latent Diffusion Model for Generative Image Shadow Removal via Residual Generation

no code implementations 3 Dec 2024 Xinjie Li, Yang Zhao, Dong Wang, Yuan Chen, Li Cao, Xiaoping Liu

Large-scale generative models have achieved remarkable advancements in various visual tasks, yet their application to shadow removal in images remains challenging.

Image Reconstruction Image Shadow Removal +1

Movable Antenna-Aided Federated Learning with Over-the-Air Aggregation: Joint Optimization of Positioning, Beamforming, and User Selection

no code implementations 11 Nov 2024 Yang Zhao, Yue Xiu, Minrui Xu, Ping Wang, Ning Wei

Federated learning (FL) in wireless computing effectively utilizes communication bandwidth, yet it is vulnerable to errors during the analog aggregation process.

Federated Learning Stochastic Optimization

Image Understanding Makes for A Good Tokenizer for Image Generation

1 code implementation 7 Nov 2024 Luting Wang, Yang Zhao, Zijian Zhang, Jiashi Feng, Si Liu, Bingyi Kang

Currently, pixel reconstruction (e.g., VQGAN) dominates the training objective for image tokenizers.

Image Generation

Corporate Fundamentals and Stock Price Co-Movement

no code implementations 6 Nov 2024 Lyuhong Wang, Jiawei Jiang, Yang Zhao

We introduce an innovative framework that leverages advanced big data techniques to analyze dynamic co-movement between stocks and their underlying fundamentals using high-frequency stock market data.

CIT: Rethinking Class-incremental Semantic Segmentation with a Class Independent Transformation

1 code implementation 5 Nov 2024 Jinchao Ge, BoWen Zhang, Akide Liu, Minh Hieu Phan, Qi Chen, Yangyang Shu, Yang Zhao

Class-incremental semantic segmentation (CSS) requires that a model learn to segment new classes without forgetting how to segment previous ones: this is typically achieved by distilling the current knowledge and incorporating the latest data.

Class-Incremental Semantic Segmentation Segmentation

Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning

no code implementations 5 Nov 2024 Yang Zhao, Zidong Nie, Kangsheng Dong, Qinghua Huang, Xuelong Li

This paper proposes a deep reinforcement learning-based model for decision-making in multi-role UAV cooperative pursuit-evasion game, to address the challenge of enabling UAV to autonomously make decisions in complex game environments.

Decision Making Deep Reinforcement Learning +1

Distribution alignment based transfer fusion frameworks on quantum devices for seeking quantum advantages

no code implementations 4 Nov 2024 Xi He, Feiyu Du, Xiaohan Yu, Yang Zhao, Tao Lei

Two transfer fusion frameworks are proposed in this paper to predict the labels of a target domain data by aligning its distribution to a different but related labelled source domain on quantum devices.

Quantum Machine Learning

How Far is Video Generation from World Model: A Physical Law Perspective

no code implementations 4 Nov 2024 Bingyi Kang, Yang Yue, Rui Lu, Zhijie Lin, Yang Zhao, Kaixin Wang, Gao Huang, Jiashi Feng

Our scaling experiments show perfect generalization within the distribution, measurable scaling behavior for combinatorial generalization, but failure in out-of-distribution scenarios.

Video Generation

Conditional Uncertainty Quantification for Tensorized Topological Neural Networks

no code implementations 20 Oct 2024 Yujia Wu, Bo Yang, Yang Zhao, Elynn Chen, Yuzhou Chen, Zheshi Zheng

Graph Neural Networks (GNNs) have become the de facto standard for analyzing graph-structured data, leveraging message-passing techniques to capture both structural and node feature information.

Decision Making Graph Classification +4

Medical AI for Early Detection of Lung Cancer: A Survey

1 code implementation 18 Oct 2024 Guohui Cai, Ying Cai, Zeyu Zhang, Yuanzhouhan Cao, Lin Wu, Daji Ergu, Zhinbin Liao, Yang Zhao

The recent emergence of deep learning has revolutionized medical image analysis, driving substantial advancements in this field.

Deep Learning Lung Cancer Diagnosis +3

Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation

no code implementations 17 Oct 2024 Junhong Wu, Yang Zhao, Yangyifan Xu, Bing Liu, Chengqing Zong

These abilities, which are developed using proprietary and unavailable training data, make existing continual instruction tuning methods ineffective.

General Knowledge Instruction Following +2

MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation

no code implementations 11 Oct 2024 Qihang Yang, Yang Zhao, Hong Cheng

Autonomous driving necessitates advanced object detection techniques that integrate information from multiple modalities to overcome the limitations associated with single-modal approaches.

Autonomous Driving object-detection +1

Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

no code implementations 8 Oct 2024 Sha Guo, Zhuo Chen, Yang Zhao, Ning Zhang, Xiaotong Li, Lingyu Duan

Extensive experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks such as object detection, segmentation, and facial landmark detection, achieving superior perceptual quality compared to state-of-the-art methods.

Data Compression Facial Landmark Detection +5

Ordinal Preference Optimization: Aligning Human Preferences via NDCG

1 code implementation 6 Oct 2024 Yang Zhao, Yixin Wang, Mingzhang Yin

In this work, we propose a novel listwise approach named Ordinal Preference Optimization (OPO), which employs the Normalized Discounted Cumulative Gain (NDCG), a widely-used ranking metric, to better utilize relative proximity within ordinal multiple responses.

Information Retrieval

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

no code implementations 3 Oct 2024 Yuqing Wang, Tianwei Xiong, Daquan Zhou, Zhijie Lin, Yang Zhao, Bingyi Kang, Jiashi Feng, Xihui Liu

Autoregressive large language models (LLMs) have achieved great success in generating coherent and long sequences of tokens in the domain of natural language processing, while the exploration of autoregressive LLMs for video generation is limited to generating short videos of several seconds.

Video Generation

Robust Beamforming Design for Near-Field DMA-NOMA mmWave Communications With Imperfect Position Information

no code implementations 24 Sep 2024 Yue Xiu, Yang Zhao, Songjie Yang, Yufeng Zhang, Dusit Niyato, Hongyang Du, Ning Wei

For millimeter-wave (mmWave) non-orthogonal multiple access (NOMA) communication systems, we propose an innovative near-field (NF) transmission framework based on dynamic metasurface antenna (DMA) technology.

Position

Supervised Fine-Tuning Achieve Rapid Task Adaption Via Alternating Attention Head Activation Patterns

no code implementations 24 Sep 2024 Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin

We find that: (1) LLMs selectively activate task-specific attention heads during SFT; (2) activation patterns for complex tasks are combinations of basic task patterns; and (3) changes in a few parameters can significantly impact activation patterns after SFT on a small number of samples. Based on these insights, experiments are conducted to actually enhance the efficiency and effectiveness of SFT.

Hybrid Cost Volume for Memory-Efficient Optical Flow

1 code implementation 6 Sep 2024 Yang Zhao, Gangwei Xu, Gang Wu

Compared to recurrent flow methods based on all-pairs cost volumes, our HCVFlow significantly reduces memory consumption while ensuring high accuracy.

4k Optical Flow Estimation

StimuVAR: Spatiotemporal Stimuli-aware Video Affective Reasoning with Multimodal Large Language Models

no code implementations 31 Aug 2024 Yuxiang Guo, Faizan Siddiqui, Yang Zhao, Rama Chellappa, Shao-Yuan Lo

To address this issue, we propose StimuVAR, a spatiotemporal Stimuli-aware framework for Video Affective Reasoning (VAR) with MLLMs.

Video Understanding

ESA: Annotation-Efficient Active Learning for Semantic Segmentation

1 code implementation 24 Aug 2024 Jinchao Ge, Zeyu Zhang, Minh Hieu Phan, BoWen Zhang, Akide Liu, Yang Zhao

Active learning enhances annotation efficiency by selecting the most revealing samples for labeling, thereby reducing reliance on extensive human input.

Active Learning Semantic Segmentation +1

Causal-Guided Active Learning for Debiasing Large Language Models

1 code implementation 23 Aug 2024 Li Du, Zhouhao Sun, Xiao Ding, Yixuan Ma, Yang Zhao, Kaitao Qiu, Ting Liu, Bing Qin

Although achieving promising performance, recent analyses show that current generative large language models (LLMs) may still capture dataset biases and utilize them for generation, leading to poor generalizability and harmfulness of LLMs.

Active Learning Diversity +1

Generative Diffusion Model-based Downscaling of Observed Sea Surface Height over Kuroshio Extension since 2000

no code implementations 22 Aug 2024 Qiuchang Han, Xingliang Jiang, Yang Zhao, Xudong Wang, Zhijin Li, Renhe Zhang

Satellite altimetry has been widely utilized to monitor global sea surface dynamics, enabling investigation of upper ocean variability from basin-scale to localized eddy ranges.

HeadGAP: Few-Shot 3D Head Avatar via Generalizable Gaussian Priors

no code implementations 12 Aug 2024 Xiaozheng Zheng, Chao Wen, Zhaohu Li, Weiyi Zhang, Zhuo Su, Xu Chang, Yang Zhao, Zheng Lv, Xiaoyuan Zhang, YongJie Zhang, Guidong Wang, Lan Xu

The prior learning phase leverages 3D head priors derived from a large-scale multi-view dynamic dataset, and the avatar creation phase applies these priors for few-shot personalization.

Decoder

Apple Intelligence Foundation Language Models

no code implementations 29 Jul 2024 Tom Gunter, ZiRui Wang, Chong Wang, Ruoming Pang, Aonan Zhang, BoWen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng, Yael Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhao Tang Luo, Zhi Ouyang, Alp Aygar, Alvin Wan, Andrew Walkingshaw, Andy Narayanan, Antonie Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, Joris Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, Regan Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren

We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute.

Language Modeling Language Modelling

MIMO Channel Shaping and Rate Maximization Using Beyond Diagonal RIS

1 code implementation 21 Jul 2024 Yang Zhao, Hongyu Li, Bruno Clerckx, Massimo Franceschetti

This paper investigates the limits to which a passive Reconfigurable Intelligent Surface (RIS) can reshape a point-to-point Multiple-Input Multiple-Output (MIMO) channel in terms of singular values for improved wireless (e.g., rate and power) performance.

CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization

no code implementations 15 Jul 2024 Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, TianYun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen

Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python.

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

1 code implementation 27 Jun 2024 Yue Fan, Lei Ding, Ching-Chen Kuo, Shan Jiang, Yang Zhao, Xinze Guan, Jie Yang, Yi Zhang, Xin Eric Wang

Based on the tree, our ToL agent not only comprehends the content of the indicated area but also articulates the layout and spatial relationships between elements.

Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses

no code implementations 16 Jun 2024 Zhiwen Fan, Pu Wang, Yang Zhao, Yibo Zhao, Boris Ivanovic, Zhangyang Wang, Marco Pavone, Hao Frank Yang

Leveraging this rich dataset, we further formulate the crash event feature learning as a novel text reasoning problem and further fine-tune various large language models (LLMs) to predict detailed accident outcomes, such as crash types, severity and number of injuries, based on contextual and environmental factors.

Ensemble Learning

When Will Gradient Regularization Be Harmful?

1 code implementation 14 Jun 2024 Yang Zhao, Hao Zhang, Xiuyuan Hu

Meanwhile, we note that scalable models tend to rely more on the GR warmup, where the performance can be improved by up to 3% on CIFAR-10 compared to baseline GR.

Research on Driver Facial Fatigue Detection Based on Yolov8 Model

no code implementations 4 Jun 2024 Chang Zhou, Yang Zhao, Shaobo Liu, Yi Zhao, Xingchen Li, Chiyu Cheng

In a society where traffic accidents frequently occur, fatigue driving has emerged as a grave issue.

Predict Click-Through Rates with Deep Interest Network Model in E-commerce Advertising

no code implementations 4 Jun 2024 Chang Zhou, Yang Zhao, Yuelin Zou, Jin Cao, Wenhan Fan, Yi Zhao, Chiyu Cheng

This paper proposes new methods to enhance click-through rate (CTR) prediction models using the Deep Interest Network (DIN) model, specifically applied to the advertising system of Alibaba's Taobao platform.

Click-Through Rate Prediction

Optimizing Search Advertising Strategies: Integrating Reinforcement Learning with Generalized Second-Price Auctions for Enhanced Ad Ranking and Bidding

no code implementations 22 May 2024 Chang Zhou, Yang Zhao, Jin Cao, Yi Shen, Xiaoling Cui, Chiyu Cheng

This paper explores the integration of strategic optimization methods in search advertising, focusing on ad ranking and bidding mechanisms within E-commerce platforms.

reinforcement-learning

Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion

1 code implementation 18 May 2024 Zeyu Zhang, Yiran Wang, Biao Wu, Shuo Chen, Zhiyuan Zhang, Shiya Huang, Wenbo Zhang, Meng Fang, Ling Chen, Yang Zhao

Firstly, we proposed a novel agent-based approach named Motion Avatar, which allows for the automatic generation of high-quality customizable human and animal avatars with motions through text queries.

Motion Generation

FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion

1 code implementation 8 May 2024 Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao

In this work, we propose FreeBind, an idea that treats multimodal representation spaces as basic units, and freely augments pre-trained unified space by integrating knowledge from extra expert spaces via "space bonds".

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

no code implementations CVPR 2024 Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang

In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt.

Image Generation

Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study

1 code implementation 10 Apr 2024 Hongru Du, Jianan Zhao, Yang Zhao, Shaochong Xu, Xihong Lin, Yiran Chen, Lauren M. Gardner, Hao, Yang

Forecasting the short-term spread of an ongoing disease outbreak is a formidable challenge due to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables such as epidemiological time series data, viral biology, population demographics, and the intersection of public policy and human behavior.

Representation Learning Time Series

MEBS: Multi-task End-to-end Bid Shading for Multi-slot Display Advertising

no code implementations 5 Mar 2024 Zhen Gong, Lvyin Niu, Yang Zhao, Miao Xu, Zhenzhe Zheng, Haoqi Zhang, Zhilin Zhang, Fan Wu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng

Through extensive offline and online experiments, we demonstrate the effectiveness and efficiency of our method, and we obtain a 7.01% lift in Gross Merchandise Volume, a 7.42% lift in Return on Investment, and a 3.26% lift in ad buy count.

OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

no code implementations CVPR 2024 Xiaozheng Zheng, Chao Wen, Zhuo Su, Zeran Xu, Zhaohu Li, Yang Zhao, Zhou Xue

In this paper, we delve into the creation of one-shot hand avatars, attaining high-fidelity and drivable hand representations swiftly from a single image.

Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning

no code implementations 18 Feb 2024 Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Zhouhao Sun, Jun Shi, Ting Liu, Bing Qin

Through pretraining on a corpus with various sources, Large Language Models (LLMs) have gained impressive performance.

Machine Unlearning

A novel spatial-frequency domain network for zero-shot incremental learning

no code implementations11 Feb 2024 Jie Ren, Yang Zhao, Weichuan Zhang, Changming Sun

The proposed SFDNet has the ability to effectively extract spatial-frequency feature representation from input images, improve the accuracy of image classification, and fundamentally alleviate catastrophic forgetting.

Image Classification Incremental Learning +1

Federated Learning with New Knowledge: Fundamentals, Advances, and Futures

1 code implementation3 Feb 2024 Lixu Wang, Yang Zhao, Jiahua Dong, Ating Yin, Qinbin Li, Xiao Wang, Dusit Niyato, Qi Zhu

Federated Learning (FL) is a privacy-preserving distributed learning approach that is rapidly developing in an era where privacy protection is increasingly valued.

Federated Learning Privacy Preserving

Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics

no code implementations24 Jan 2024 Pengcheng Zhao, Yanxiang Chen, Yang Zhao, Zhao Zhang

Automatic image colorization is inherently an ill-posed problem with uncertainty, which requires an accurate semantic understanding of scenes to estimate reasonable colors for grayscale images.

Colorization Image Colorization

Empirical Evidence for the Fragment level Understanding on Drug Molecular Structure of LLMs

1 code implementation15 Jan 2024 Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang

AI for drug discovery has been a research hotspot in recent years, and SMILES-based language models have been increasingly applied in drug molecular design.

Drug Design Drug Discovery

Deep Video Inverse Tone Mapping Based on Temporal Clues

1 code implementation CVPR 2024 Yuyao Ye, Ning Zhang, Yang Zhao, Hongbin Cao, Ronggang Wang

Although many deep image ITM methods can generate impressive results, the field of video ITM is still to be explored.

Tone Mapping

Comparing roughness maps generated by five roughness descriptors for LiDAR-derived digital elevation models

no code implementations29 Dec 2023 Lei Fan, Yang Zhao

Terrain surface roughness, often described abstractly, poses challenges in quantitative characterisation with various descriptors found in the literature.

Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding

no code implementations21 Dec 2023 Haifeng Huang, Yang Zhao, Zehan Wang, Yan Xia, Zhou Zhao

Thus, to address this issue and enhance model performance on new scenes, we explore the TVG task in an unsupervised domain adaptation (UDA) setting across scenes for the first time, where the video-query pairs in the source scene (domain) are labeled with temporal boundaries, while those in the target scene are not.

Unsupervised Domain Adaptation Video Grounding

De novo Drug Design using Reinforcement Learning with Multiple GPT Agents

2 code implementations NeurIPS 2023 Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang

A central challenge in this field is to generate molecules with specific properties while also producing a wide range of diverse candidates.

Diversity Drug Design +2

CoRTEx: Contrastive Learning for Representing Terms via Explanations with Applications on Constructing Biomedical Knowledge Graphs

1 code implementation13 Dec 2023 Huaiyuan Ying, Zhengyun Zhao, Yang Zhao, Sihang Zeng, Sheng Yu

Due to a lack of knowledge, previous contrastive learning models trained with Unified Medical Language System (UMLS) synonyms struggle at clustering difficult terms and do not generalize well beyond UMLS terms.

Clustering Contrastive Learning +2

Adapting Vision Transformer for Efficient Change Detection

no code implementations8 Dec 2023 Yang Zhao, Yuxiang Zhang, Yanni Dong, Bo Du

Most change detection models based on vision transformers currently follow a "pretraining then fine-tuning" strategy.

Change Detection

DreamInpainter: Text-Guided Subject-Driven Image Inpainting with Diffusion Models

no code implementations5 Dec 2023 Shaoan Xie, Yang Zhao, Zhisheng Xiao, Kelvin C. K. Chan, Yandong Li, Yanwu Xu, Kun Zhang, Tingbo Hou

Our extensive experiments demonstrate the superior performance of our method in terms of visual quality, identity preservation, and text control, showcasing its effectiveness in the context of text-guided subject-driven image inpainting.

Image Inpainting

HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models

no code implementations30 Nov 2023 Zhonghao Wang, Wei Wei, Yang Zhao, Zhisheng Xiao, Mark Hasegawa-Johnson, Humphrey Shi, Tingbo Hou

We further extend our method to a novel image editing task: substituting the subject in an image through textual manipulations.

Denoising Image Generation +2

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

no code implementations28 Nov 2023 Yang Zhao, Yanwu Xu, Zhisheng Xiao, HaoLin Jia, Tingbo Hou

The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed.

Computational Efficiency Text-to-Image Generation

Cut-and-Paste: Subject-Driven Video Editing with Attention Control

no code implementations20 Nov 2023 Zhichao Zuo, Zhao Zhang, Yan Luo, Yang Zhao, Haijun Zhang, Yi Yang, Meng Wang

This paper presents a novel framework termed Cut-and-Paste for real-world semantic video editing under the guidance of a text prompt and an additional reference image.

Object Video Editing

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

1 code implementation CVPR 2024 Yanwu Xu, Yang Zhao, Zhisheng Xiao, Tingbo Hou

Text-to-image diffusion models have demonstrated remarkable capabilities in transforming textual prompts into coherent images, yet the computational cost of their inference remains a persistent challenge.

Text-to-Image Generation

Exploring Federated Unlearning: Analysis, Comparison, and Insights

1 code implementation30 Oct 2023 Yang Zhao, Jiaxi Yang, Yiling Tao, Lixu Wang, Xiaoxiao Li, Dusit Niyato, H. Vincent Poor

The increasing demand for privacy-preserving machine learning has spurred interest in federated unlearning, which enables the selective removal of data from models trained in federated systems.

Federated Learning Privacy Preserving +1

Extending Multi-modal Contrastive Representations

1 code implementation13 Oct 2023 Zehan Wang, Ziang Zhang, Luping Liu, Yang Zhao, Haifeng Huang, Tao Jin, Zhou Zhao

Inspired by recent C-MCR, this paper proposes Extending Multimodal Contrastive Representation (Ex-MCR), a training-efficient and paired-data-free method to flexibly learn unified contrastive representation space for more than three modalities by integrating the knowledge of existing MCR spaces.

3D Object Classification Representation Learning +1

Cross-Dataset-Robust Method for Blind Real-World Image Quality Assessment

no code implementations26 Sep 2023 Yuan Chen, Zhiliang Ma, Yang Zhao

First, many individual models based on popular and state-of-the-art (SOTA) Swin-Transformer (SwinT) are trained on different real-world BIQA datasets respectively.

Blind Image Quality Assessment

BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks

1 code implementation31 Aug 2023 Qiang Huang, Jiawei Jiang, Xi Susie Rao, Ce Zhang, Zhichao Han, Zitao Zhang, Xin Wang, Yongjun He, Quanqing Xu, Yang Zhao, Chuang Hu, Shuo Shang, Bo Du

To handle graphs in which features or connectivities are evolving over time, a series of temporal graph neural networks (TGNNs) have been proposed.

Diversity Link Prediction +1

Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes

2 code implementations17 Aug 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao

This paper presents Chat-3D, which combines the 3D visual perceptual ability of pre-trained 3D representations and the impressive reasoning and conversation capabilities of advanced LLMs to achieve the first universal dialogue systems for 3D scenes.

Language Modeling Language Modelling +3

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

no code implementations ICCV 2023 Yang Zhao, Tingbo Hou, Yu-Chuan Su, Xuhui Jia, Yandong Li, Matthias Grundmann

An authentic face restoration system is increasingly in demand in many computer vision applications, e.g., image enhancement, video communication, and taking portraits.

Blind Face Restoration Denoising +2

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

1 code implementation ICCV 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.

3D visual grounding Object +3

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

1 code implementation17 Jul 2023 Yang Zhao, Zhijie Lin, Daquan Zhou, Zilong Huang, Jiashi Feng, Bingyi Kang

Our experiments show that BuboGPT achieves impressive multi-modality understanding and visual grounding abilities during interaction with humans.

Instruction Following Sentence +1

DVFO: Learning-Based DVFS for Energy-Efficient Edge-Cloud Collaborative Inference

no code implementations2 Jun 2023 Ziyang Zhang, Yang Zhao, Huan Li, Changyao Lin, Jie Liu

Due to limited resources on edge and different characteristics of deep neural network (DNN) models, it is a big challenge to optimize DNN inference performance in terms of energy consumption and end-to-end latency on edge devices.

Collaborative Inference Deep Reinforcement Learning

CLIP3Dstyler: Language Guided 3D Arbitrary Neural Style Transfer

no code implementations25 May 2023 Ming Gao, Yanwu Xu, Yang Zhao, Tingbo Hou, Chenkai Zhao, Mingming Gong

In this paper, we propose a novel language-guided 3D arbitrary neural style transfer method (CLIP3Dstyler).

Style Transfer

Connecting Multi-modal Contrastive Representations

no code implementations NeurIPS 2023 Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao

This paper proposes a novel training-efficient method for learning MCR without paired data called Connecting Multi-modal Contrastive Representations (C-MCR).

3D Point Cloud Classification counterfactual +4

Pink-Eggs Dataset V1: A Step Toward Invasive Species Management Using Deep Learning Embedded Solutions

no code implementations16 May 2023 Di Xu, Yang Zhao, Xiang Hao, Xin Meng

We introduce a novel dataset consisting of images depicting pink eggs that have been identified as Pomacea canaliculata eggs, accompanied by corresponding bounding box annotations.

Management

E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation

1 code implementation9 May 2023 Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong

Furthermore, the ablation studies verify the generalization of our method, where the proposed modal adapter is effective at bridging various OCR and MT models.

Decoder Machine Translation +3

Multi-Teacher Knowledge Distillation For Text Image Machine Translation

1 code implementation9 May 2023 Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong

Text image machine translation (TIMT), which translates source-language text in images into target-language sentences, has been widely used in various real-world applications.

Decoder Knowledge Distillation +3

Instant-NeRF: Instant On-Device Neural Radiance Field Training via Algorithm-Accelerator Co-Designed Near-Memory Processing

no code implementations9 May 2023 Yang Zhao, Shang Wu, Jingqun Zhang, Sixu Li, Chaojian Li, Yingyan Lin

Instant on-device Neural Radiance Fields (NeRFs) are in growing demand for unleashing the promise of immersive AR/VR experiences, but are still limited by their prohibitive training time.

NeRF

BCEdge: SLO-Aware DNN Inference Services with Adaptive Batching on Edge Platforms

no code implementations1 May 2023 Ziyang Zhang, Huan Li, Yang Zhao, Changyao Lin, Jie Liu

As deep neural networks (DNNs) are being applied to a wide range of edge intelligent applications, it is critical for edge inference platforms to have both high throughput and low latency at the same time.

Deep Reinforcement Learning Scheduling

Identity Encoder for Personalized Diffusion

no code implementations14 Apr 2023 Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.

Image Enhancement Image Generation +1

CoopInit: Initializing Generative Adversarial Networks via Cooperative Learning

no code implementations21 Mar 2023 Yang Zhao, Jianwen Xie, Ping Li

The proposed algorithm consists of two learning stages: (i) Cooperative initialization stage: The discriminator of GAN is treated as an energy-based model (EBM) and is optimized via maximum likelihood estimation (MLE), with the help of the GAN's generator to provide synthetic data to approximate the learning gradients.

Image-to-Image Translation

Deformation measurement of a soil mixing retaining wall using terrestrial laser scanning

no code implementations12 Jan 2023 Yang Zhao, Lei Fan, Hyungjoon Seo

Retaining walls are often built to prevent excessive lateral movements of the ground surrounding an excavation site.

Revisiting the Stack-Based Inverse Tone Mapping

no code implementations CVPR 2023 Ning Zhang, Yuyao Ye, Yang Zhao, Ronggang Wang

In this paper, we revisit the stack-based ITM approaches and propose a novel method to reconstruct HDR radiance from a single image, which only needs to estimate two exposure images.

Tone Mapping

Dilation-Erosion for Single-Frame Supervised Temporal Action Localization

1 code implementation13 Dec 2022 Bin Wang, Yan Song, Fanming Wang, Yang Zhao, Xiangbo Shu, Yan Rui

To balance the annotation labor and the granularity of supervision, single-frame annotation has been introduced in temporal action localization.

Temporal Action Localization

Stereo Image Rain Removal via Dual-View Mutual Attention

no code implementations18 Nov 2022 Yanyan Wei, Zhao Zhang, ZhongQiu Zhao, Yang Zhao, Richang Hong, Yi Yang

Stereo images, containing left and right view images with disparity, have recently been used to solve low-vision tasks, e.g., rain removal and super-resolution.

Disparity Estimation Image Restoration +2

Boosting Semi-Supervised 3D Object Detection with Semi-Sampling

no code implementations14 Nov 2022 Xiaopei Wu, Yang Zhao, Liang Peng, Hua Chen, Xiaoshui Huang, Binbin Lin, Haifeng Liu, Deng Cai, Wanli Ouyang

When training a teacher-student semi-supervised framework, we randomly select gt samples and pseudo samples to both labeled frames and unlabeled frames, making a strong data augmentation for them.

3D Object Detection Data Augmentation +2

NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks

2 code implementations24 Oct 2022 Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin

Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus more extensive deployment in resource-constrained applications.

Neural Architecture Search

ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design

1 code implementation18 Oct 2022 Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Celine Lin

Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps to have either denser or sparser fixed patterns for regularizing two levels of workloads without hurting the accuracy, largely reducing the attention computations while leaving room for alleviating the remaining dominant data movements; on top of that, we further integrate a lightweight and learnable auto-encoder module to enable trading the dominant high-cost data movements for lower-cost computations.

Behavioral graph fraud detection in E-commerce

no code implementations13 Oct 2022 Hang Yin, Zitao Zhang, Zhurong Wang, Yilmazcan Ozyurt, Weiming Liang, Wenyu Dong, Yang Zhao, Yinan Shan

Our experiments show that embedding features learned from a similarity-based behavioral graph achieve a significant performance increase over the baseline fraud detection model in various business scenarios.

Fraud Detection graph construction +2

CoopHash: Cooperative Learning of Multipurpose Descriptor and Contrastive Pair Generator via Variational MCMC Teaching for Supervised Image Hashing

no code implementations9 Oct 2022 Khoa D. Doan, Jianwen Xie, Yaxuan Zhu, Yang Zhao, Ping Li

Leveraging supervised information can lead to superior retrieval performance in the image hashing domain, but the performance degrades significantly without enough labeled data.

Retrieval

Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task

1 code implementation8 Oct 2022 Cong Ma, Yaping Zhang, Mei Tu, Xu Han, Linghui Wu, Yang Zhao, Yu Zhou

End-to-end text image translation (TIT), which aims at translating the source language embedded in images to the target language, has attracted intensive attention in recent research.

Multi-Task Learning Translation

Video-Guided Curriculum Learning for Spoken Video Grounding

1 code implementation1 Sep 2022 Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren

To rectify the discriminative phonemes and extract video-related information from noisy audio, we develop a novel video-guided curriculum learning (VGCL) during the audio pre-training process, which can make use of the vital visual perceptions to help understand the spoken language and suppress the external noise.

Video Grounding

Learning an Efficient Multimodal Depth Completion Model

1 code implementation23 Aug 2022 Dewang Hou, Yuanyuan Du, Kai Zhao, Yang Zhao

With the wide application of sparse ToF sensors in mobile devices, RGB image-guided sparse depth completion has attracted extensive attention recently, but still faces some problems.

Depth Completion Depth Estimation +2

Depth-Assisted ResiDualGAN for Cross-Domain Aerial Images Semantic Segmentation

1 code implementation21 Aug 2022 Yang Zhao, Peng Guo, Han Gao, Xiuwan Chen

Generative methods are common approaches to minimizing the domain gap of aerial images which improves the performance of the downstream tasks, e.g., cross-domain semantic segmentation.

Segmentation Semantic Segmentation +1

e-G2C: A 0.14-to-8.31 $μ$J/Inference NN-based Processor with Continuous On-chip Adaptation for Anomaly Detection and ECG Conversion from EGM

no code implementations24 Jul 2022 Yang Zhao, Yongan Zhang, Yonggan Fu, Xu Ouyang, Cheng Wan, Shang Wu, Anton Banta, Mathews M. John, Allison Post, Mehdi Razavi, Joseph Cavallaro, Behnaam Aazhang, Yingyan Lin

This work presents the first silicon-validated dedicated EGM-to-ECG (G2C) processor, dubbed e-G2C, featuring continuous lightweight anomaly detection, event-driven coarse/precise conversion, and on-chip adaptation.

Anomaly Detection

Turning to a Teacher for Timestamp Supervised Temporal Action Segmentation

no code implementations2 Jul 2022 Yang Zhao, Yan Song

To obtain more information to optimize the model, the existing method generated pseudo frame-wise labels iteratively based on the output of a segmentation model and the timestamp annotations.

Action Segmentation Model Optimization +2

AntPivot: Livestream Highlight Detection via Hierarchical Attention Mechanism

no code implementations10 Jun 2022 Yang Zhao, Xuan Lin, Wenqiang Xu, Maozong Zheng, Zhengyong Liu, Zhou Zhao

Recently, streaming technology has greatly promoted the development of the livestream field.

Highlight Detection

BRIGHT -- Graph Neural Networks in Real-Time Fraud Detection

no code implementations25 May 2022 Mingxuan Lu, Zhichao Han, Susie Xi Rao, Zitao Zhang, Yang Zhao, Yinan Shan, Ramesh Raghunathan, Ce Zhang, Jiawei Jiang

Apart from rule-based and machine learning filters that are already deployed in production, we want to enable efficient real-time inference with graph neural networks (GNNs), which is useful to catch multihop risk propagation in a transaction graph.

Entity Embeddings Fraud Detection

Age Minimization in Outdoor and Indoor Communications with Relay-aided Dual RIS

no code implementations6 May 2022 Wanting Lyu, Yue Xiu, Yang Zhao, Chadi Assi, Zhongpei Zhang

In this paper, we investigate an outdoor and indoor wireless communication network with the assistance of a novel relay-aided double-sided reconfigurable intelligent surface (RIS).

Scheduling

Indoor simultaneous localization and mapping based on fringe projection profilometry

no code implementations23 Apr 2022 Yang Zhao, Kai Zhang, Haotian Yu, Yi Zhang, Dongliang Zheng, Jing Han

Simultaneous Localization and Mapping (SLAM) plays an important role in outdoor and indoor applications ranging from autonomous driving to indoor robotics.

Autonomous Driving Simultaneous Localization and Mapping

Modelling graph dynamics in fraud detection with "Attention"

1 code implementation22 Apr 2022 Susie Xi Rao, Clémence Lanfranchi, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Mo Cheng, Yinan Shan, Yang Zhao, Ce Zhang

At online retail platforms, detecting fraudulent accounts and transactions is crucial to improve customer experience, minimize loss, and avoid unauthorized transactions.

Fraud Detection Graph Neural Network

Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning

no code implementations18 Mar 2022 Yang Zhao, Hao Zhang, Xiuyuan Hu

Optimizers in RST would perform a Bernoulli trial at each iteration to choose randomly from base algorithms (SGD) and sharpness-aware algorithms (SAM) with a probability arranged by a predefined scheduling function.

Computational Efficiency Scheduling

Spatio-temporal Gait Feature with Global Distance Alignment

no code implementations7 Mar 2022 Yifan Chen, Yang Zhao, Xuelong Li

In this paper, we try to enhance the discrimination of spatio-temporal gait features from two aspects: effective extraction of spatio-temporal gait features and reasonable refinement of extracted features.

Gait Recognition

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

1 code implementation8 Feb 2022 Yang Zhao, Hao Zhang, Xiuyuan Hu

In this paper, we propose an effective method to improve the model generalization by additionally penalizing the gradient norm of loss function during optimization.

Deep Learning

ResiDualGAN: Resize-Residual DualGAN for Cross-Domain Remote Sensing Images Semantic Segmentation

1 code implementation27 Jan 2022 Yang Zhao, Peng Guo, Zihao Sun, Xiuwan Chen, Han Gao

The performance of a semantic segmentation model for remote sensing (RS) images pretrained on an annotated dataset would greatly decrease when testing on another unannotated dataset because of the domain gap.

Image-to-Image Translation Semantic Segmentation +2

Neighborhood Region Smoothing Regularization for Finding Flat Minima In Deep Neural Networks

no code implementations16 Jan 2022 Yang Zhao, Hao Zhang

NRS leverages the finding that models would benefit from converging to flat minima, and tries to regularize the neighborhood region in weight space to yield approximate outputs.

Image Classification

Equivalence between algorithmic instability and transition to replica symmetry breaking in perceptron learning systems

no code implementations26 Nov 2021 Yang Zhao, Junbin Qiu, Mingshan Xie, Haiping Huang

The binary perceptron is a fundamental model of supervised learning with non-convex optimization, and is a root of the popular deep learning.

Rethinking Deep Face Restoration

no code implementations CVPR 2022 Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Changyou Chen, Xuhui Jia

While existing approaches for face restoration make significant progress in generating high-quality faces, they often fail to preserve facial features and cannot authentically reconstruct the faces.

Face Generation Face Reconstruction

TransTCN: An Attention-based TCN Framework for Sequential Modeling

no code implementations29 Sep 2021 Yuan Chai, Liang He, Yang Zhao, Xueyan Li, Zhenxin Wang

The model was evaluated across a wide range of time series tasks that are commonly used to benchmark TCN and recurrent networks.

Language Modelling Time Series Analysis

Multi-frame Joint Enhancement for Early Interlaced Videos

no code implementations29 Sep 2021 Yang Zhao, Yanbo Ma, Yuan Chen, Wei Jia, Ronggang Wang, Xiaoping Liu

Early interlaced videos usually contain multiple interlacing and complex compression artifacts, which significantly reduce the visual quality.

Video Deinterlacing Video Reconstruction

D$^2$-GCN: Data-Dependent GCNs for Boosting Both Efficiency and Scalability

no code implementations29 Sep 2021 Chaojian Li, Xu Ouyang, Yang Zhao, Haoran You, Yonggan Fu, Yuchen Gu, Haonan Liu, Siyuan Miao, Yingyan Lin

Graph Convolutional Networks (GCNs) have gained increasing attention thanks to their state-of-the-art (SOTA) performance in graph-based learning tasks.

2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency

no code implementations11 Sep 2021 Yonggan Fu, Yang Zhao, Qixuan Yu, Chaojian Li, Yingyan Celine Lin

The recent breakthroughs of deep neural networks (DNNs) and the advent of billions of Internet of Things (IoT) devices have excited an explosive demand for intelligent IoT devices equipped with domain-specific DNN accelerators.

Adversarial Robustness Quantization

EAR-NET: Error Attention Refining Network For Retinal Vessel Segmentation

1 code implementation3 Jul 2021 Jun Wang, Yang Zhao, Linglong Qian, Xiaohan Yu, Yongsheng Gao

The precise detection of blood vessels in retinal images is crucial to the early diagnosis of the retinal vascular diseases, e.g., diabetic, hypertensive and solar retinopathies.

Retinal Vessel Segmentation Segmentation +1

Cascaded Prediction Network via Segment Tree for Temporal Video Grounding

no code implementations CVPR 2021 Yang Zhao, Zhou Zhao, Zhu Zhang, Zhijie Lin

Temporal video grounding aims to localize the target segment which is semantically aligned with the given sentence in an untrimmed video.

Sentence Video Grounding

Multi-Scale Context Aggregation Network with Attention-Guided for Crowd Counting

1 code implementation6 Apr 2021 Xin Wang, Yang Zhao, Tangwen Yang, Qiuqi Ruan

In this paper, we propose a multi-scale context aggregation network (MSCANet) based on single-column encoder-decoder architecture for crowd counting, which consists of an encoder based on a dense context-aware module (DCAM) and a hierarchical attention-guided decoder.

Crowd Counting Decoder

Estimating the Generalization in Deep Neural Networks via Sparsity

no code implementations2 Apr 2021 Yang Zhao, Hao Zhang

By training DNNs with a wide range of generalization gap on popular datasets, we show that our key quantities and linear model could be efficient tools for estimating the generalization gap of DNNs.

Image Classification

Super-Resolving Compressed Video in Coding Chain

no code implementations26 Mar 2021 Dewang Hou, Yang Zhao, Yuyao Ye, Jiayu Yang, Jian Zhang, Ronggang Wang

Scaling and lossy coding are widely used in video transmission and storage.

Decoder

HW-NAS-Bench:Hardware-Aware Neural Architecture Search Benchmark

1 code implementation19 Mar 2021 Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Yingyan Lin

To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).

Hardware Aware Neural Architecture Search Neural Architecture Search

Quantitative Performance Assessment of CNN Units via Topological Entropy Calculation

no code implementations ICLR 2022 Yang Zhao, Hao Zhang

We show that by investigating the feature entropy of units on only training data, it could give discrimination between networks with different generalization ability from the view of the effectiveness of feature representations.

General Classification Image Classification

Interaction between optical pulse and tumor using finite element analysis

no code implementations19 Jan 2021 Xianlin Song, Ao Teng, Jianshuang Wei, Hao Chen, Yang Zhao, Jianheng Chen, Fangwei Liu, Qianxiang Wan, Guoning Huang, Lingfang Song, Aojie Zhao, Bo Li, Zihao Li, Qiming He, Jinhong Zhang

As a non-destructive biological tissue imaging technology, photoacoustic imaging has important application value in the field of biomedicine.

Biological Physics

Waveform and Beamforming Design for Intelligent Reflecting Surface Aided Wireless Power Transfer: Single-User and Multi-User Solutions

no code implementations7 Jan 2021 Zhenyuan Feng, Bruno Clerckx, Yang Zhao

This paper highlights the fact that IRS can provide an extra passive beamforming gain on output DC power over conventional WPT designs and significantly influence the waveform design by leveraging the benefit of passive beamforming, frequency diversity and energy harvester nonlinearity.

Information Theory Signal Processing Information Theory

SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training

1 code implementation4 Jan 2021 Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Yonggan Fu, Yingyan Lin, Zhangyang Wang

Results show that: 1) applied to inference, SD achieves up to 2.44x energy efficiency as evaluated via real hardware implementations; 2) applied to training, SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.

SDA: Improving Text Generation with Self Data Augmentation

no code implementations2 Jan 2021 Ping Yu, Ruiyi Zhang, Yang Zhao, Yizhe Zhang, Chunyuan Li, Changyou Chen

Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision.

Data Augmentation Imitation Learning +2

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

no code implementations ICLR 2021 Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin

To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance (e.g., energy cost and latency) of all the networks in the search space of both NAS-Bench-201 and FBNet, considering six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).

Hardware Aware Neural Architecture Search Neural Architecture Search

Benchmark Platform for Ultra-Fine-Grained Visual Categorization Beyond Human Performance

1 code implementation ICCV 2021 Xiaohan Yu, Yang Zhao, Yongsheng Gao, Xiaohui Yuan, Shengwu Xiong

The proposed UFG image dataset and evaluation protocols are intended to serve as a benchmark platform that can advance research of visual classification from approaching human performance to beyond human ability, via facilitating benchmark data of artificial intelligence (AI) not to be limited by the labels of human intelligence (HI).

Fine-Grained Visual Categorization

Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling

no code implementations ICLR 2021 Yang Zhao, Jianwen Xie, Ping Li

Energy-based models (EBMs) for generative modeling parametrize a single net and can be directly trained by maximum likelihood estimation.

Translation Unsupervised Image-To-Image Translation

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

1 code implementation NeurIPS 2020 Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Celine Lin

Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs.

Quantization

A Comprehensive Survey of 6G Wireless Communications

no code implementations 21 Dec 2020 Yang Zhao, Wenchao Zhai, Jun Zhao, Tinghao Zhang, Sumei Sun, Dusit Niyato, Kwok-Yan Lam

First, we give an overview of 6G from perspectives of technologies, security and privacy, and applications.

Survey

Suspicious Massive Registration Detection via Dynamic Heterogeneous Graph Neural Networks

no code implementations 20 Dec 2020 Susie Xi Rao, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Mo Cheng, Yinan Shan, Yang Zhao, Ce Zhang

Massive account registration has raised concerns on risk management in e-commerce companies, especially when registration increases rapidly within a short time frame.

Graph Neural Network Management

IRS-Aided SWIPT: Joint Waveform, Active and Passive Beamforming Design Under Nonlinear Harvester Model

1 code implementation 10 Dec 2020 Yang Zhao, Bruno Clerckx, Zhenyuan Feng

To facilitate practical implementation, we also propose a low-complexity design based on closed-form adaptive waveform schemes.

Information Theory Signal Processing Information Theory

ReMP: Rectified Metric Propagation for Few-Shot Learning

no code implementations 2 Dec 2020 Yang Zhao, Chunyuan Li, Ping Yu, Changyou Chen

Few-shot learning features the capability of generalizing from a few examples.

Few-Shot Learning

Knowledge Graph Enhanced Neural Machine Translation via Multi-task Learning on Sub-entity Granularity

no code implementations COLING 2020 Yang Zhao, Lu Xiang, Junnan Zhu, Jiajun Zhang, Yu Zhou, Chengqing Zong

Previous studies combining knowledge graph (KG) with neural machine translation (NMT) have two problems: i) Knowledge under-utilization: they only focus on the entities that appear in both KG and training sentence pairs, making much knowledge in KG unable to be fully utilized.

Machine Translation Multi-Task Learning +3

Unpaired Image-to-Image Translation via Latent Energy Transport

1 code implementation CVPR 2021 Yang Zhao, Changyou Chen

Instead of explicitly extracting the two codes and applying adaptive instance normalization to combine them, our latent EBM can implicitly learn to transport the source style code to the target style code while preserving the content code, an advantage over existing image translation methods.

Image Reconstruction Image-to-Image Translation +1

Rethinking deinterlacing for early interlaced videos

no code implementations 27 Nov 2020 Yang Zhao, Wei Jia, Ronggang Wang

Traditional deinterlacing approaches are mainly focused on early interlacing scanning systems and thus cannot handle the complex and complicated artifacts in real-world early interlaced videos.

Image Restoration

xFraud: Explainable Fraud Transaction Detection

1 code implementation 24 Nov 2020 Susie Xi Rao, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Zhiyao Chen, Yinan Shan, Yang Zhao, Ce Zhang

At online retail platforms, it is crucial to actively detect the risks of transactions to improve customer experience and minimize financial loss.

Explainable Models Fraud Detection +2

Role Taxonomy of Units in Deep Neural Networks

no code implementations 2 Nov 2020 Yang Zhao, Hao Zhang, Xiuyuan Hu

Identifying the role of network units in deep neural networks (DNNs) is critical in many aspects including giving understandings on the mechanisms of DNNs and building basic connections between deep learning and neuroscience.

Retrieval Topological Data Analysis

Deconstruct to Reconstruct a Configurable Evaluation Metric for Open-Domain Dialogue Systems

1 code implementation COLING 2020 Vitou Phy, Yang Zhao, Akiko Aizawa

For instance, specificity is mandatory in a food-ordering dialogue task, whereas fluency is preferred in a language-teaching dialogue system.

Dialogue Evaluation Semantic Similarity +2

FaultNet: A Deep Convolutional Neural Network for bearing fault classification

1 code implementation 5 Oct 2020 Rishikesh Magar, Lalit Ghule, Junhan Li, Yang Zhao, Amir Barati Farimani

In this work, we analyze vibration signal data of mechanical systems with bearings by combining different signal processing methods and coupling them with machine learning techniques to classify different types of bearing faults.

Ranked #3 on Classification on CWRU Bearing Dataset (using extra training data)

BIG-bench Machine Learning Classification +2

Structure-Aware Human-Action Generation

1 code implementation ECCV 2020 Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen

Generating long-range skeleton-based human actions has been a challenging problem since small deviations of one frame can cause a malformed action sequence.

Action Generation graph construction +1

Attacks to Federated Learning: Responsive Web User Interface to Recover Training Data from User Gradients

no code implementations 8 Jun 2020 Hans Albert Lianto, Yang Zhao, Jun Zhao

In a case where the aggregator is untrusted and LDP is not applied to each user gradient, the aggregator can recover sensitive user data from these gradients.

Federated Learning

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation

no code implementations 7 May 2020 Yang Zhao, Xiaohan Chen, Yue Wang, Chaojian Li, Haoran You, Yonggan Fu, Yuan Xie, Zhangyang Wang, Yingyan Lin

We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).

Model Compression Quantization

TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain

no code implementations 3 May 2020 Weitao Li, Pengfei Xu, Yang Zhao, Haitong Li, Yuan Xie, Yingyan Lin

Resistive-random-access-memory (ReRAM) based processing-in-memory (R$^2$PIM) accelerators show promise in bridging the gap between Internet of Things devices' constrained resources and Convolutional/Deep Neural Networks' (CNNs/DNNs') prohibitive energy cost.

Bayesian Meta Sampling for Fast Uncertainty Adaptation

1 code implementation ICLR 2020 Zhenyi Wang, Yang Zhao, Ping Yu, Ruiyi Zhang, Changyou Chen

Specifically, we propose a Bayesian meta sampling framework consisting of two main components: a meta sampler and a sample adapter.

Meta-Learning

Local Differential Privacy based Federated Learning for Internet of Things

no code implementations 19 Apr 2020 Yang Zhao, Jun Zhao, Mengmeng Yang, Teng Wang, Ning Wang, Lingjuan Lyu, Dusit Niyato, Kwok-Yan Lam

To avoid the privacy threat and reduce the communication cost, in this paper, we propose to integrate federated learning and local differential privacy (LDP) to facilitate the crowdsourcing applications to achieve the machine learning model.

BIG-bench Machine Learning Federated Learning +1

Dual-discriminator GAN: A GAN way of profile face recognition

no code implementations 20 Mar 2020 Xin-Yu Zhang, Yang Zhao, Hao Zhang

A wealth of angle problems occur when facial recognition is performed: At present, the feature extraction network presents eigenvectors with large differences between the frontal face and profile face recognition of the same person in many cases.

Face Recognition Generative Adversarial Network

A New MRAM-based Process In-Memory Accelerator for Efficient Neural Network Training with Floating Point Precision

no code implementations 2 Mar 2020 Hongjie Wang, Yang Zhao, Chaojian Li, Yue Wang, Yingyan Lin

The excellent performance of modern deep neural networks (DNNs) comes at an often prohibitive training cost, limiting the rapid development of DNN innovations and raising various environmental concerns.

Efficient Neural Network

DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures

no code implementations 26 Feb 2020 Yang Zhao, Chaojian Li, Yue Wang, Pengfei Xu, Yongan Zhang, Yingyan Lin

The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators.

AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs

1 code implementation 6 Jan 2020 Pengfei Xu, Xiaofan Zhang, Cong Hao, Yang Zhao, Yongan Zhang, Yue Wang, Chaojian Li, Zetong Guan, Deming Chen, Yingyan Lin

Specifically, AutoDNNchip consists of two integrated enablers: (1) a Chip Predictor, built on top of a graph-based accelerator representation, which can accurately and efficiently predict a DNN accelerator's energy, throughput, and area based on the DNN model parameters, hardware configuration, technology-based IPs, and platform constraints; and (2) a Chip Builder, which can automatically explore the design space of DNN chips (including IP selection, block configuration, resource balancing, etc.
