no code implementations • CCL 2022 • Yang Zhao, Zhang Yuanzhe, Jiang Zhongtao, Ju Yiming, Zhao Jun, Liu Kang
Explanations can increase the transparency of neural networks and make them more trustworthy.
no code implementations • Findings (EMNLP) 2021 • Ping Yu, Yang Zhao, Chunyuan Li, Changyou Chen
To overcome this issue, we propose a graph-based method to extract attribute content and attribute-independent content from input sentences in the YELP dataset and IMDB dataset.
no code implementations • Findings (NAACL) 2022 • Yang Zhao, Hua Qin, Wang Zhenyu, Changxi Zhu, Shihan Wang
It evaluates the difficulty of dialogue tasks using only the learning experiences of the dialogue policy, and supports skip-level selection according to the policy's learning needs to maximize learning efficiency.
no code implementations • ECCV 2020 • Haoxian Zhang, Yang Zhao, Ronggang Wang
Inspired by classical pyramid energy minimization optical flow algorithms, this paper proposes a recurrent residual pyramid network (RRPN) for video frame interpolation.
no code implementations • LREC 2022 • Yang Zhao, Hiroshi Kanayama, Issei Yoshida, Masayasu Muraoka, Akiko Aizawa
To remedy this shortcoming, we present a dependency-tree-based method to construct a Chinese corpus of 151k sentence-compression pairs, based on Chinese language-specific characteristics.
no code implementations • 19 Apr 2025 • Takuma Udagawa, Yang Zhao, Hiroshi Kanayama, Bishwaranjan Bhattacharjee
Large language models (LLMs) acquire general linguistic knowledge from massive-scale pretraining.
no code implementations • 19 Apr 2025 • Chen Guo, Zhuo Su, Jian Wang, Shuang Li, Xu Chang, Zhaohu Li, Yang Zhao, Guidong Wang, Ruqi Huang
Creating photorealistic 3D head avatars from limited input has become increasingly important for applications in virtual reality, telepresence, and digital entertainment.
no code implementations • 17 Apr 2025 • Zhouhao Sun, Xiao Ding, Li Du, Yunpeng Xu, Yixuan Ma, Yang Zhao, Bing Qin, Ting Liu
Despite significant progress, recent studies indicate that current large language models (LLMs) may still capture dataset biases and utilize them during inference, leading to poor generalizability.
no code implementations • 11 Apr 2025 • Team Seawead, Ceyuan Yang, Zhijie Lin, Yang Zhao, Shanchuan Lin, Zhibei Ma, Haoyuan Guo, Hao Chen, Lu Qi, Sen Wang, Feng Cheng, Feilong Zuo, Xuejiao Zeng, Ziyan Yang, Fangyuan Kong, Zhiwu Qing, Fei Xiao, Meng Wei, Tuyen Hoang, Siyu Zhang, Peihao Zhu, Qi Zhao, Jiangqiao Yan, Liangke Gui, Sheng Bi, Jiashi Li, Yuxi Ren, Rui Wang, Huixia Li, Xuefeng Xiao, Shu Liu, Feng Ling, Heng Zhang, Houmin Wei, Huafeng Kuang, Jerry Duncan, Junda Zhang, Junru Zheng, Li Sun, Manlin Zhang, Renfei Sun, Xiaobin Zhuang, Xiaojie Li, Xin Xia, Xuyan Chi, Yanghua Peng, Yuping Wang, Yuxuan Wang, Zhongkai Zhao, Zhuo Chen, Zuquan Song, Zhenheng Yang, Jiashi Feng, Jianchao Yang, Lu Jiang
This technical report highlights the key design decisions that enhance the performance of the medium-sized diffusion model.
1 code implementation • 2 Apr 2025 • Yongjun He, Roger Waleffe, Zhichao Han, Johnu George, Binhang Yuan, Zitao Zhang, Yinan Shan, Yang Zhao, Debojyoti Dutta, Theodoros Rekatsinas, Ce Zhang
As increasingly diverse ML applications utilize embedding models and embedding tables continue to grow in size and number, there has been a surge in the ad-hoc development of specialized frameworks targeted to train large embedding models for specific tasks.
no code implementations • 26 Mar 2025 • Xiuyuan Hu, Guoqing Liu, Can Chen, Yang Zhao, Hao Zhang, Xue Liu
To address both challenges, we propose TransDiffSBDD, an integrated framework combining autoregressive transformers and diffusion models for SBDD.
1 code implementation • 24 Mar 2025 • Jianlong Jin, Chenglong Zhao, Ruixin Zhang, Sheng Shang, Jianqing Xu, Jingyun Zhang, Shaoming Wang, Yang Zhao, Shouhong Ding, Wei Jia, Yunsheng Wu
However, without employing real data fine-tuning, the performance of the recognition model trained on these synthetic datasets would drastically decline, indicating a large gap between generated and real palmprints.
1 code implementation • 23 Mar 2025 • Yang Luo, Shiru Wang, Jun Liu, Jiaxuan Xiao, Rundong Xue, Zeyu Zhang, Hao Zhang, Yu Lu, Yang Zhao, Yutong Xie
Breast cancer survival prediction in computational pathology presents a remarkable challenge due to tumor heterogeneity.
no code implementations • 11 Mar 2025 • Kaiqiang Xiong, Ying Feng, Qi Zhang, Jianbo Jiao, Yang Zhao, Zhihao Liang, Huachen Gao, Ronggang Wang
We first generate multi-view images from the single reference image with an enhanced multi-view diffusion model, which is well fine-tuned on high-quality 3D human datasets to incorporate 3D geometry priors and human structure priors.
no code implementations • 28 Feb 2025 • Hongyuan Shen, Min Zheng, Jincheng Wang, Yang Zhao
With the widespread application of Large Language Models across various domains, their security issues have increasingly garnered significant attention from both academic and industrial communities.
no code implementations • 22 Feb 2025 • Qianqi Yan, Yue Fan, Hongquan Li, Shan Jiang, Yang Zhao, Xinze Guan, Ching-Chen Kuo, Xin Eric Wang
Existing Multimodal Large Language Models (MLLMs) are predominantly trained and tested on consistent visual-textual inputs, leaving open the question of whether they can handle inconsistencies in real-world, layout-rich content.
1 code implementation • 21 Feb 2025 • Hongjie Zhu, Zeyu Zhang, Guansong Pang, Xu Wang, Shimin Wen, Yu Bai, Daji Ergu, Ying Cai, Yang Zhao
This alignment of activation responses with semantic information strengthens the propagation and decoupling of target features, enabling the generated embeddings to more accurately represent target features in high-level semantic space.
Weakly supervised Semantic Segmentation
1 code implementation • 19 Feb 2025 • Jiaqi Li, Xizhong Guo, Yang Zhao, Lvyang Zhang, Lidong Zhai
Rapid industrial digitalization has created intricate cybersecurity demands that necessitate effective validation methods.
1 code implementation • 19 Feb 2025 • Rui Zhao, Zeyu Zhang, Yi Xu, Yi Yao, Yan Huang, Wenxin Zhang, Zirui Song, Xiuying Chen, Yang Zhao
Pedestrian detection in intelligent transportation systems has made significant progress but faces two critical challenges: (1) insufficient fusion of complementary information between visible and infrared spectra, particularly in complex scenarios, and (2) sensitivity to illumination changes, such as low-light or overexposed conditions, leading to degraded performance.
no code implementations • 17 Feb 2025 • Yang Zhao, Minrui Xu, Ping Wang, Dusit Niyato
Over-the-air (OTA) federated learning (FL) effectively utilizes communication bandwidth, yet it is vulnerable to errors during analog aggregation.
no code implementations • 16 Feb 2025 • Yang Zhao, Li Du, Xiao Ding, Yangou Ouyang, Hepeng Wang, Kai Xiong, Jinglong Gao, Zhouhao Sun, Dongliang Xu, Yang Qing, Dongchen Li, Bing Qin, Ting Liu
Large language models (LLMs) have shown great potential across various industries due to their remarkable ability to generalize through instruction tuning.
1 code implementation • 7 Feb 2025 • Xiuyuan Hu, Guoqing Liu, Can Chen, Yang Zhao, Hao Zhang, Xue Liu
Structure-based drug discovery, encompassing the tasks of protein-ligand docking and pocket-aware 3D drug design, represents a core challenge in drug discovery.
1 code implementation • 2 Feb 2025 • Xuyin Qi, Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To
Bone density prediction via CT scans to estimate T-scores is crucial, providing a more precise assessment of bone health compared to traditional methods like X-ray bone density tests, which lack spatial resolution and the ability to detect localized changes.
no code implementations • 30 Jan 2025 • Huaiyuan Ying, Hongyi Yuan, Jinsen Lu, Zitian Qu, Yang Zhao, Zhengyun Zhao, Isaac Kohane, Tianxi Cai, Sheng Yu
Traditional methods for structuring EHR free-text data, such as rule-based systems and multi-stage pipelines, are often limited by their time-consuming configurations and inability to adapt across clinical notes from diverse healthcare settings.
no code implementations • 26 Jan 2025 • Yue Xiu, Yang Zhao, Ran Yang, Dusit Niyato, Jing Jin, Qixing Wang, Guangyi Liu, Ning Wei
We analyze the impact of CSI errors on achievable rates and introduce a hybrid Cramer-Rao lower bound (HCRLB) to evaluate the effect of TS errors on target localization accuracy.
Deep Reinforcement Learning
Integrated sensing and communication
no code implementations • 2 Jan 2025 • Jianyi Wang, Zhijie Lin, Meng Wei, Yang Zhao, Ceyuan Yang, Chen Change Loy, Lu Jiang
Video restoration poses non-trivial challenges in maintaining fidelity while recovering temporally consistent details from unknown degradations in the wild.
1 code implementation • 2 Jan 2025 • Xuyin Qi, Zeyu Zhang, Aaron Berliano Handoko, Huazhan Zheng, Mingxi Chen, Ta Duc Huy, Vu Minh Hieu Phan, Lei Zhang, Linqi Cheng, Shiyu Jiang, Zhiwei Zhang, Zhibin Liao, Yang Zhao, Minh-Son To
Additionally, we conduct comprehensive experiments on both the generator and classifier, demonstrating the clinical relevance and effectiveness of ProjectedEx in enhancing interpretability and supporting the adoption of AI in medical settings.
no code implementations • 29 Dec 2024 • Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao
The management of data writes to SSD caches plays a crucial role in improving overall system performance, reducing latency, and extending the lifespan of storage devices.
no code implementations • 29 Dec 2024 • Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao
Traditional heuristics employed for storage performance optimization often fail to adapt to the variability and complexity of contemporary workloads, leading to significant performance bottlenecks and resource inefficiencies.
no code implementations • 29 Dec 2024 • Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao
The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies [1].
1 code implementation • 28 Dec 2024 • Shengbo Tan, Rundong Xue, Shipeng Luo, Zeyu Zhang, Xinran Wang, Lei Zhang, Daji Ergu, Zhang Yi, Yang Zhao, Ying Cai
Hepatic vessels in computed tomography scans often suffer from image fragmentation and noise interference, making it difficult to maintain vessel integrity and posing significant challenges for vessel segmentation.
no code implementations • 19 Dec 2024 • Ruixiang Chen, Yang Zhao, Haoqin Li, Rui Chen
In the realm of lithography, Optical Proximity Correction (OPC) is a crucial resolution enhancement technique that optimizes the transmission function of photomasks on a pixel basis to effectively counter Optical Proximity Effects (OPE).
no code implementations • 16 Dec 2024 • Yue Xiu, Yang Zhao, Ran Yang, Huimin Tang, Long Qu, Maurice Khabbaz, Chadi Assi, Ning Wei
Specifically, the core of MA technology lies in optimizing the antenna positions to increase system capacity.
no code implementations • 15 Dec 2024 • Pengcheng Zhao, Jinxing Zhou, Yang Zhao, Dan Guo, Yanxiang Chen
However, each segment may contain multiple events, resulting in semantically mixed holistic features that can lead to semantic interference during intra- or cross-modal interactions: the event semantics of one segment may incorporate semantics of unrelated events from other segments.
no code implementations • 4 Dec 2024 • Shanding Diao, Yang Zhao, Yuan Chen, Zhao Zhang, Wei Jia, Ronggang Wang
This paper proposes a planar video real-time stereoscopic conversion network based on multi-plane images (MPI), which consists of a detail branch for generating MPI and a depth-semantic branch for perceiving depth information.
no code implementations • 3 Dec 2024 • Xinjie Li, Yang Zhao, Dong Wang, Yuan Chen, Li Cao, Xiaoping Liu
Large-scale generative models have achieved remarkable advancements in various visual tasks, yet their application to shadow removal in images remains challenging.
no code implementations • 29 Nov 2024 • Bo Qu, Zhurong Wang, Minghao Gu, Daisuke Yagi, Yang Zhao, Yinan Shan, Frank Zahradnik
The burgeoning e-Commerce sector requires advanced solutions for the detection of transaction fraud.
no code implementations • 11 Nov 2024 • Yang Zhao, Yue Xiu, Minrui Xu, Ping Wang, Ning Wei
Federated learning (FL) in wireless computing effectively utilizes communication bandwidth, yet it is vulnerable to errors during the analog aggregation process.
1 code implementation • 7 Nov 2024 • Luting Wang, Yang Zhao, Zijian Zhang, Jiashi Feng, Si Liu, Bingyi Kang
Currently, pixel reconstruction (e.g., VQGAN) dominates the training objective for image tokenizers.
no code implementations • 6 Nov 2024 • Lyuhong Wang, Jiawei Jiang, Yang Zhao
We introduce an innovative framework that leverages advanced big data techniques to analyze dynamic co-movement between stocks and their underlying fundamentals using high-frequency stock market data.
1 code implementation • 5 Nov 2024 • Jinchao Ge, BoWen Zhang, Akide Liu, Minh Hieu Phan, Qi Chen, Yangyang Shu, Yang Zhao
Class-incremental semantic segmentation (CSS) requires that a model learn to segment new classes without forgetting how to segment previous ones: this is typically achieved by distilling the current knowledge and incorporating the latest data.
no code implementations • 5 Nov 2024 • Yang Zhao, Zidong Nie, Kangsheng Dong, Qinghua Huang, Xuelong Li
This paper proposes a deep reinforcement learning-based model for decision-making in a multi-role UAV cooperative pursuit-evasion game, to address the challenge of enabling UAVs to autonomously make decisions in complex game environments.
no code implementations • 4 Nov 2024 • Xi He, Feiyu Du, Xiaohan Yu, Yang Zhao, Tao Lei
Two transfer fusion frameworks are proposed in this paper to predict the labels of target-domain data by aligning its distribution to that of a different but related labelled source domain on quantum devices.
no code implementations • 4 Nov 2024 • Bingyi Kang, Yang Yue, Rui Lu, Zhijie Lin, Yang Zhao, Kaixin Wang, Gao Huang, Jiashi Feng
Our scaling experiments show perfect generalization within the distribution, measurable scaling behavior for combinatorial generalization, but failure in out-of-distribution scenarios.
no code implementations • 1 Nov 2024 • Yue Xiu, Yang Zhao, Chenfei Xie, Fatma Benkhelifa, Songjie Yang, Wanting Lyu, Chadi Assi, Ning Wei
For the PS factor optimization problem, the SCA algorithm is proposed.
no code implementations • 20 Oct 2024 • Yujia Wu, Bo Yang, Yang Zhao, Elynn Chen, Yuzhou Chen, Zheshi Zheng
Graph Neural Networks (GNNs) have become the de facto standard for analyzing graph-structured data, leveraging message-passing techniques to capture both structural and node feature information.
1 code implementation • 18 Oct 2024 • Guohui Cai, Ying Cai, Zeyu Zhang, Yuanzhouhan Cao, Lin Wu, Daji Ergu, Zhinbin Liao, Yang Zhao
The recent emergence of deep learning has revolutionized medical image analysis, driving substantial advancements in this field.
no code implementations • 17 Oct 2024 • Junhong Wu, Yang Zhao, Yangyifan Xu, Bing Liu, Chengqing Zong
These abilities, which are developed using proprietary and unavailable training data, make existing continual instruction tuning methods ineffective.
no code implementations • 11 Oct 2024 • Qihang Yang, Yang Zhao, Hong Cheng
Autonomous driving necessitates advanced object detection techniques that integrate information from multiple modalities to overcome the limitations associated with single-modal approaches.
no code implementations • 8 Oct 2024 • Sha Guo, Zhuo Chen, Yang Zhao, Ning Zhang, Xiaotong Li, Lingyu Duan
Extensive experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks such as object detection, segmentation, and facial landmark detection, achieving superior perceptual quality compared to state-of-the-art methods.
1 code implementation • 6 Oct 2024 • Yang Zhao, Yixin Wang, Mingzhang Yin
In this work, we propose a novel listwise approach named Ordinal Preference Optimization (OPO), which employs the Normalized Discounted Cumulative Gain (NDCG), a widely-used ranking metric, to better utilize relative proximity within ordinal multiple responses.
no code implementations • 3 Oct 2024 • Yuqing Wang, Tianwei Xiong, Daquan Zhou, Zhijie Lin, Yang Zhao, Bingyi Kang, Jiashi Feng, Xihui Liu
Autoregressive large language models (LLMs) have achieved great success in generating coherent and long sequences of tokens in the domain of natural language processing, while the exploration of autoregressive LLMs for video generation is limited to generating short videos of several seconds.
no code implementations • 25 Sep 2024 • Longguang Wang, Yulan Guo, Juncheng Li, Hongda Liu, Yang Zhao, Yingqian Wang, Zhi Jin, Shuhang Gu, Radu Timofte
This paper summarizes the 3rd NTIRE challenge on stereo image super-resolution (SR) with a focus on new solutions and results.
no code implementations • 24 Sep 2024 • Yue Xiu, Yang Zhao, Songjie Yang, Yufeng Zhang, Dusit Niyato, Hongyang Du, Ning Wei
For millimeter-wave (mmWave) non-orthogonal multiple access (NOMA) communication systems, we propose an innovative near-field (NF) transmission framework based on dynamic metasurface antenna (DMA) technology.
no code implementations • 24 Sep 2024 • Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin
We find that: (1) LLMs selectively activate task-specific attention heads during SFT; (2) activation patterns for complex tasks are combinations of basic task patterns; and (3) changes in a few parameters can significantly impact activation patterns after SFT on a small number of samples. Based on these insights, experiments are conducted to actually enhance the efficiency and effectiveness of SFT.
no code implementations • 22 Sep 2024 • Yue Xiu, Yang Zhao, Songjie Yang, Minrui Xu, Dusit Niyato, Yueyang Li, Ning Wei
The increase in DoF enhances the system's anti-jamming capabilities and reduces system delay.
1 code implementation • 21 Sep 2024 • Guohui Cai, Ruicheng Zhang, Hongyang He, Zeyu Zhang, Daji Ergu, Yuanzhouhan Cao, Jinman Zhao, Binbin Hu, Zhinbin Liao, Yang Zhao, Ying Cai
Pulmonary nodules are critical indicators for the early diagnosis of lung cancer, making their detection essential for timely treatment.
1 code implementation • 6 Sep 2024 • Yang Zhao, Gangwei Xu, Gang Wu
Compared to recurrent flow methods based on all-pairs cost volumes, our HCVFlow significantly reduces memory consumption while ensuring high accuracy.
no code implementations • 31 Aug 2024 • Yuxiang Guo, Faizan Siddiqui, Yang Zhao, Rama Chellappa, Shao-Yuan Lo
To address this issue, we propose StimuVAR, a spatiotemporal Stimuli-aware framework for Video Affective Reasoning (VAR) with MLLMs.
1 code implementation • 30 Aug 2024 • Zeyu Zhang, Nengmin Yi, Shengbo Tan, Ying Cai, Yi Yang, Lei Xu, Qingtai Li, Zhang Yi, Daji Ergu, Yang Zhao
Additionally, we customize the second-order nmODE to improve the model's resistance to noise in MRI.
no code implementations • 29 Aug 2024 • Ashton Yu Xuan Tan, Yingkai Yang, Xiaofei Zhang, Bowen Li, Xiaorong Gao, Sifa Zheng, Jianqiang Wang, Xinyu Gu, Jun Li, Yang Zhao, Yuxin Zhang, Tania Stathaki
Enhancing the safety of autonomous vehicles is crucial, especially given recent accidents involving automated systems.
1 code implementation • 24 Aug 2024 • Jinchao Ge, Zeyu Zhang, Minh Hieu Phan, BoWen Zhang, Akide Liu, Yang Zhao
Active learning enhances annotation efficiency by selecting the most revealing samples for labeling, thereby reducing reliance on extensive human input.
1 code implementation • 23 Aug 2024 • Li Du, Zhouhao Sun, Xiao Ding, Yixuan Ma, Yang Zhao, Kaitao Qiu, Ting Liu, Bing Qin
Although current generative large language models (LLMs) achieve promising performance, recent analyses show that they may still capture dataset biases and utilize them for generation, leading to poor generalizability and harmfulness.
no code implementations • 22 Aug 2024 • Qiuchang Han, Xingliang Jiang, Yang Zhao, Xudong Wang, Zhijin Li, Renhe Zhang
Satellite altimetry has been widely utilized to monitor global sea surface dynamics, enabling investigation of upper ocean variability from basin-scale to localized eddy ranges.
no code implementations • 12 Aug 2024 • Xiaozheng Zheng, Chao Wen, Zhaohu Li, Weiyi Zhang, Zhuo Su, Xu Chang, Yang Zhao, Zheng Lv, Xiaoyuan Zhang, YongJie Zhang, Guidong Wang, Lan Xu
The prior learning phase leverages 3D head priors derived from a large-scale multi-view dynamic dataset, and the avatar creation phase applies these priors for few-shot personalization.
1 code implementation • 1 Aug 2024 • Shengbo Tan, Zeyu Zhang, Ying Cai, Daji Ergu, Lin Wu, Binbin Hu, Pengzhang Yu, Yang Zhao
Medical imaging segmentation plays a significant role in the automatic recognition and analysis of lesions.
no code implementations • 29 Jul 2024 • Tom Gunter, ZiRui Wang, Chong Wang, Ruoming Pang, Aonan Zhang, BoWen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng, Yael Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhao Tang Luo, Zhi Ouyang, Alp Aygar, Alvin Wan, Andrew Walkingshaw, Andy Narayanan, Antonie Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, Joris Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, Regan Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, 
Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren
We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute.
1 code implementation • 21 Jul 2024 • Yang Zhao, Hongyu Li, Bruno Clerckx, Massimo Franceschetti
This paper investigates the limits to which a passive Reconfigurable Intelligent Surface (RIS) can reshape a point-to-point Multiple-Input Multiple-Output (MIMO) channel in terms of singular values for improved wireless (e.g., rate and power) performance.
no code implementations • 15 Jul 2024 • Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, TianYun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen
Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python.
no code implementations • 3 Jul 2024 • Yang Zhao, Chang Zhou, Jin Cao, Yi Zhao, Shaobo Liu, Chiyu Cheng, Xingchen Li
This paper explores multi-scenario optimization on large platforms using multi-agent reinforcement learning (MARL).
1 code implementation • 27 Jun 2024 • Yue Fan, Lei Ding, Ching-Chen Kuo, Shan Jiang, Yang Zhao, Xinze Guan, Jie Yang, Yi Zhang, Xin Eric Wang
Based on the tree, our ToL agent not only comprehends the content of the indicated area but also articulates the layout and spatial relationships between elements.
no code implementations • 16 Jun 2024 • Zhiwen Fan, Pu Wang, Yang Zhao, Yibo Zhao, Boris Ivanovic, Zhangyang Wang, Marco Pavone, Hao Frank Yang
Leveraging this rich dataset, we formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes, such as crash type, severity, and number of injuries, based on contextual and environmental factors.
1 code implementation • 14 Jun 2024 • Yang Zhao, Hao Zhang, Xiuyuan Hu
Meanwhile, we note that scalable models tend to rely more on the GR warmup, where the performance can be improved by up to 3% on CIFAR-10 compared to baseline GR.
no code implementations • 4 Jun 2024 • Chang Zhou, Yang Zhao, Shaobo Liu, Yi Zhao, Xingchen Li, Chiyu Cheng
In a society where traffic accidents frequently occur, fatigue driving has emerged as a grave issue.
no code implementations • 4 Jun 2024 • Chang Zhou, Yang Zhao, Yuelin Zou, Jin Cao, Wenhan Fan, Yi Zhao, Chiyu Cheng
This paper proposes new methods to enhance click-through rate (CTR) prediction models using the Deep Interest Network (DIN) model, specifically applied to the advertising system of Alibaba's Taobao platform.
no code implementations • 22 May 2024 • Chang Zhou, Yang Zhao, Jin Cao, Yi Shen, Xiaoling Cui, Chiyu Cheng
This paper explores the integration of strategic optimization methods in search advertising, focusing on ad ranking and bidding mechanisms within E-commerce platforms.
1 code implementation • 18 May 2024 • Zeyu Zhang, Yiran Wang, Biao Wu, Shuo Chen, Zhiyuan Zhang, Shiya Huang, Wenbo Zhang, Meng Fang, Ling Chen, Yang Zhao
Firstly, we proposed a novel agent-based approach named Motion Avatar, which allows for the automatic generation of high-quality customizable human and animal avatars with motions through text queries.
1 code implementation • 8 May 2024 • Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao
In this work, we propose FreeBind, an idea that treats multimodal representation spaces as basic units, and freely augments pre-trained unified space by integrating knowledge from extra expert spaces via "space bonds".
no code implementations • CVPR 2024 • Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang
In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt.
1 code implementation • 10 Apr 2024 • Hongru Du, Jianan Zhao, Yang Zhao, Shaochong Xu, Xihong Lin, Yiran Chen, Lauren M. Gardner, Hao Yang
Forecasting the short-term spread of an ongoing disease outbreak is a formidable challenge due to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables such as epidemiological time series data, viral biology, population demographics, and the intersection of public policy and human behavior.
no code implementations • 5 Mar 2024 • Zhen Gong, Lvyin Niu, Yang Zhao, Miao Xu, Zhenzhe Zheng, Haoqi Zhang, Zhilin Zhang, Fan Wu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng
Through extensive offline and online experiments, we demonstrate the effectiveness and efficiency of our method, and we obtain a 7.01% lift in Gross Merchandise Volume, a 7.42% lift in Return on Investment, and a 3.26% lift in ad buy count.
no code implementations • CVPR 2024 • Xiaozheng Zheng, Chao Wen, Zhuo Su, Zeran Xu, Zhaohu Li, Yang Zhao, Zhou Xue
In this paper, we delve into the creation of one-shot hand avatars, attaining high-fidelity and drivable hand representations swiftly from a single image.
no code implementations • 18 Feb 2024 • Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Zhouhao Sun, Jun Shi, Ting Liu, Bing Qin
Through pretraining on a corpus with various sources, Large Language Models (LLMs) have gained impressive performance.
no code implementations • 11 Feb 2024 • Jie Ren, Yang Zhao, Weichuan Zhang, Changming Sun
The proposed SFDNet has the ability to effectively extract spatial-frequency feature representation from input images, improve the accuracy of image classification, and fundamentally alleviate catastrophic forgetting.
1 code implementation • 3 Feb 2024 • Lixu Wang, Yang Zhao, Jiahua Dong, Ating Yin, Qinbin Li, Xiao Wang, Dusit Niyato, Qi Zhu
Federated Learning (FL) is a privacy-preserving distributed learning approach that is rapidly developing in an era where privacy protection is increasingly valued.
no code implementations • 24 Jan 2024 • Pengcheng Zhao, Yanxiang Chen, Yang Zhao, Zhao Zhang
Automatic image colorization is inherently an ill-posed problem with uncertainty, which requires an accurate semantic understanding of scenes to estimate reasonable colors for grayscale images.
1 code implementation • 15 Jan 2024 • Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang
AI for drug discovery has been a research hotspot in recent years, and SMILES-based language models have been increasingly applied in drug molecular design.
no code implementations • CVPR 2024 • Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia
We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.
1 code implementation • CVPR 2024 • Yuyao Ye, Ning Zhang, Yang Zhao, Hongbin Cao, Ronggang Wang
Although many deep image ITM methods can generate impressive results, the field of video ITM remains largely unexplored.
no code implementations • 29 Dec 2023 • Lei Fan, Yang Zhao
Terrain surface roughness, often described abstractly, poses challenges in quantitative characterisation with various descriptors found in the literature.
no code implementations • 21 Dec 2023 • Haifeng Huang, Yang Zhao, Zehan Wang, Yan Xia, Zhou Zhao
Thus, to address this issue and enhance model performance on new scenes, we explore the TVG task in an unsupervised domain adaptation (UDA) setting across scenes for the first time, where the video-query pairs in the source scene (domain) are labeled with temporal boundaries, while those in the target scene are not.
2 code implementations • NeurIPS 2023 • Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang
A central challenge in this field is to generate molecules with specific properties while also producing a wide range of diverse candidates.
1 code implementation • 13 Dec 2023 • Huaiyuan Ying, Zhengyun Zhao, Yang Zhao, Sihang Zeng, Sheng Yu
Due to a lack of knowledge, previous contrastive learning models trained with Unified Medical Language System (UMLS) synonyms struggle to cluster difficult terms and do not generalize well beyond UMLS terms.
2 code implementations • 13 Dec 2023 • Haifeng Huang, Yilun Chen, Zehan Wang, Rongjie Huang, Runsen Xu, Tai Wang, Luping Liu, Xize Cheng, Yang Zhao, Jiangmiao Pang, Zhou Zhao
Recent advancements in 3D Large Language Models (LLMs) have demonstrated promising capabilities for 3D scene understanding.
no code implementations • 8 Dec 2023 • Yang Zhao, Yuxiang Zhang, Yanni Dong, Bo Du
Most change detection models based on vision transformers currently follow a "pretraining then fine-tuning" strategy.
no code implementations • 5 Dec 2023 • Shaoan Xie, Yang Zhao, Zhisheng Xiao, Kelvin C. K. Chan, Yandong Li, Yanwu Xu, Kun Zhang, Tingbo Hou
Our extensive experiments demonstrate the superior performance of our method in terms of visual quality, identity preservation, and text control, showcasing its effectiveness in the context of text-guided subject-driven image inpainting.
no code implementations • 30 Nov 2023 • Zhonghao Wang, Wei Wei, Yang Zhao, Zhisheng Xiao, Mark Hasegawa-Johnson, Humphrey Shi, Tingbo Hou
We further extend our method to a novel image editing task: substituting the subject in an image through textual manipulations.
no code implementations • 28 Nov 2023 • Yang Zhao, Yanwu Xu, Zhisheng Xiao, HaoLin Jia, Tingbo Hou
The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed.
no code implementations • 20 Nov 2023 • Zhichao Zuo, Zhao Zhang, Yan Luo, Yang Zhao, Haijun Zhang, Yi Yang, Meng Wang
This paper presents a novel framework termed Cut-and-Paste for real-world semantic video editing under the guidance of a text prompt and an additional reference image.
1 code implementation • CVPR 2024 • Yanwu Xu, Yang Zhao, Zhisheng Xiao, Tingbo Hou
Text-to-image diffusion models have demonstrated remarkable capabilities in transforming textual prompts into coherent images, yet the computational cost of their inference remains a persistent challenge.
1 code implementation • 30 Oct 2023 • Yang Zhao, Jiaxi Yang, Yiling Tao, Lixu Wang, Xiaoxiao Li, Dusit Niyato, H. Vincent Poor
The increasing demand for privacy-preserving machine learning has spurred interest in federated unlearning, which enables the selective removal of data from models trained in federated systems.
1 code implementation • 13 Oct 2023 • Zehan Wang, Ziang Zhang, Luping Liu, Yang Zhao, Haifeng Huang, Tao Jin, Zhou Zhao
Inspired by recent C-MCR, this paper proposes Extending Multimodal Contrastive Representation (Ex-MCR), a training-efficient and paired-data-free method to flexibly learn unified contrastive representation space for more than three modalities by integrating the knowledge of existing MCR spaces.
no code implementations • 29 Sep 2023 • Yang Zhao, Jiaxi Yang, Wenbo Wang, Helin Yang, Dusit Niyato
Industrial systems demand reliable predictive maintenance strategies to enhance operational efficiency and reduce downtime.
no code implementations • 26 Sep 2023 • Yuan Chen, Zhiliang Ma, Yang Zhao
First, many individual models based on popular and state-of-the-art (SOTA) Swin-Transformer (SwinT) are trained on different real-world BIQA datasets respectively.
1 code implementation • 31 Aug 2023 • Qiang Huang, Jiawei Jiang, Xi Susie Rao, Ce Zhang, Zhichao Han, Zitao Zhang, Xin Wang, Yongjun He, Quanqing Xu, Yang Zhao, Chuang Hu, Shuo Shang, Bo Du
To handle graphs in which features or connectivities are evolving over time, a series of temporal graph neural networks (TGNNs) have been proposed.
2 code implementations • 17 Aug 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao
This paper presents Chat-3D, which combines the 3D visual perceptual ability of pre-trained 3D representations and the impressive reasoning and conversation capabilities of advanced LLMs to achieve the first universal dialogue system for 3D scenes.
no code implementations • ICCV 2023 • Lei Shen, Jianlong Jin, Ruixin Zhang, Huaen Li, Kai Zhao, Yingyi Zhang, Jingyun Zhang, Shouhong Ding, Yang Zhao, Wei Jia
Palmprint recently shows great potential in recognition applications as it is a privacy-friendly and stable biometric.
no code implementations • 25 Jul 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao
3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.
no code implementations • ICCV 2023 • Yang Zhao, Tingbo Hou, Yu-Chuan Su, Xuhui Jia, Yandong Li, Matthias Grundmann
An authentic face restoration system is in increasing demand in many computer vision applications, e.g., image enhancement, video communication, and taking portraits.
1 code implementation • ICCV 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao
To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.
1 code implementation • 17 Jul 2023 • Yang Zhao, Zhijie Lin, Daquan Zhou, Zilong Huang, Jiashi Feng, Bingyi Kang
Our experiments show that BuboGPT achieves impressive multi-modality understanding and visual grounding abilities during interaction with humans.
no code implementations • 2 Jun 2023 • Ziyang Zhang, Yang Zhao, Huan Li, Changyao Lin, Jie Liu
Due to limited resources on edge and different characteristics of deep neural network (DNN) models, it is a big challenge to optimize DNN inference performance in terms of energy consumption and end-to-end latency on edge devices.
no code implementations • 25 May 2023 • Ming Gao, Yanwu Xu, Yang Zhao, Tingbo Hou, Chenkai Zhao, Mingming Gong
In this paper, we propose a novel language-guided 3D arbitrary neural style transfer method (CLIP3Dstyler).
no code implementations • NeurIPS 2023 • Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao
This paper proposes a novel training-efficient method for learning MCR without paired data called Connecting Multi-modal Contrastive Representations (C-MCR).
no code implementations • 18 May 2023 • Liangchen Song, Liangliang Cao, Hongyu Xu, Kai Kang, Feng Tang, Junsong Yuan, Yang Zhao
The proposed framework consists of two significant components: Geometry Guided Diffusion and Mesh Optimization.
no code implementations • 16 May 2023 • Di Xu, Yang Zhao, Xiang Hao, Xin Meng
We introduce a novel dataset consisting of images depicting pink eggs that have been identified as Pomacea canaliculata eggs, accompanied by corresponding bounding box annotations.
1 code implementation • 9 May 2023 • Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong
Furthermore, the ablation studies verify the generalization of our method, where the proposed modal adapter is effective in bridging various OCR and MT models.
1 code implementation • 9 May 2023 • Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong
Text image machine translation (TIMT) has been widely used in various real-world applications, which translates source language texts in images into another target language sentence.
no code implementations • 9 May 2023 • Yang Zhao, Shang Wu, Jingqun Zhang, Sixu Li, Chaojian Li, Yingyan Lin
Instant on-device Neural Radiance Fields (NeRFs) are in growing demand for unleashing the promise of immersive AR/VR experiences, but are still limited by their prohibitive training time.
no code implementations • 1 May 2023 • Ziyang Zhang, Huan Li, Yang Zhao, Changyao Lin, Jie Liu
As deep neural networks (DNNs) are being applied to a wide range of edge intelligent applications, it is critical for edge inference platforms to achieve both high throughput and low latency at the same time.
no code implementations • 14 Apr 2023 • Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia
Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.
no code implementations • CVPR 2023 • Haoyuan Li, Hao Jiang, Tao Jin, Mengyan Li, Yan Chen, Zhijie Lin, Yang Zhao, Zhou Zhao
Then, we present two cooperative seekers to simultaneously search the image for PR and localize the product for PG.
no code implementations • 5 Apr 2023 • Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su
This paper proposes a method for generating images of customized objects specified by users.
1 code implementation • 24 Mar 2023 • Weide Liu, Zhonghua Wu, Yang Zhao, Yuming Fang, Chuan-Sheng Foo, Jun Cheng, Guosheng Lin
Current methods for few-shot segmentation (FSSeg) have mainly focused on improving the performance of novel classes while neglecting the performance of base classes.
no code implementations • 21 Mar 2023 • Yang Zhao, Jianwen Xie, Ping Li
The proposed algorithm consists of two learning stages: (i) Cooperative initialization stage: The discriminator of GAN is treated as an energy-based model (EBM) and is optimized via maximum likelihood estimation (MLE), with the help of the GAN's generator to provide synthetic data to approximate the learning gradients.
no code implementations • 12 Jan 2023 • Yang Zhao, Lei Fan, Hyungjoon Seo
Retaining walls are often built to prevent excessive lateral movements of the ground surrounding an excavation site.
no code implementations • CVPR 2023 • Ning Zhang, Yuyao Ye, Yang Zhao, Ronggang Wang
In this paper, we revisit the stack-based ITM approaches and propose a novel method to reconstruct HDR radiance from a single image, which only needs to estimate two exposure images.
1 code implementation • 13 Dec 2022 • Bin Wang, Yan Song, Fanming Wang, Yang Zhao, Xiangbo Shu, Yan Rui
To balance the annotation labor and the granularity of supervision, single-frame annotation has been introduced in temporal action localization.
no code implementations • 6 Dec 2022 • Yang Zhao, Junnan Zhu, Lu Xiang, Jiajun Zhang, Yu Zhou, FeiFei Zhai, Chengqing Zong
To alleviate the CF, we investigate knowledge distillation based life-long learning methods.
no code implementations • 18 Nov 2022 • Yanyan Wei, Zhao Zhang, ZhongQiu Zhao, Yang Zhao, Richang Hong, Yi Yang
Stereo images, containing left and right view images with disparity, have recently been used to solve low-level vision tasks, e.g., rain removal and super-resolution.
no code implementations • 14 Nov 2022 • Xiaopei Wu, Yang Zhao, Liang Peng, Hua Chen, Xiaoshui Huang, Binbin Lin, Haifeng Liu, Deng Cai, Wanli Ouyang
When training a teacher-student semi-supervised framework, we randomly add gt samples and pseudo samples to both labeled frames and unlabeled frames, which provides strong data augmentation for them.
2 code implementations • 24 Oct 2022 • Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin
Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus more extensive deployment in resource-constrained applications.
1 code implementation • 18 Oct 2022 • Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Celine Lin
Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps to have either denser or sparser fixed patterns for regularizing two levels of workloads without hurting the accuracy, largely reducing the attention computations while leaving room for alleviating the remaining dominant data movements; on top of that, we further integrate a lightweight and learnable auto-encoder module to enable trading the dominant high-cost data movements for lower-cost computations.
no code implementations • 13 Oct 2022 • Hang Yin, Zitao Zhang, Zhurong Wang, Yilmazcan Ozyurt, Weiming Liang, Wenyu Dong, Yang Zhao, Yinan Shan
Our experiments show that embedding features learned from a similarity-based behavioral graph achieve a significant performance increase over the baseline fraud detection model in various business scenarios.
no code implementations • 9 Oct 2022 • Khoa D. Doan, Jianwen Xie, Yaxuan Zhu, Yang Zhao, Ping Li
Leveraging supervised information can lead to superior retrieval performance in the image hashing domain but the performance degrades significantly without enough labeled data.
1 code implementation • 8 Oct 2022 • Cong Ma, Yaping Zhang, Mei Tu, Xu Han, Linghui Wu, Yang Zhao, Yu Zhou
End-to-end text image translation (TIT), which aims at translating the source language embedded in images to the target language, has attracted intensive attention in recent research.
1 code implementation • 1 Sep 2022 • Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren
To rectify the discriminative phonemes and extract video-related information from noisy audio, we develop a novel video-guided curriculum learning (VGCL) during the audio pre-training process, which can make use of the vital visual perceptions to help understand the spoken language and suppress the external noise.
1 code implementation • 23 Aug 2022 • Dewang Hou, Yuanyuan Du, Kai Zhao, Yang Zhao
With the wide application of sparse ToF sensors in mobile devices, RGB image-guided sparse depth completion has attracted extensive attention recently, but still faces some problems.
1 code implementation • 21 Aug 2022 • Yang Zhao, Peng Guo, Han Gao, Xiuwan Chen
Generative methods are common approaches to minimizing the domain gap of aerial images, which improves the performance of downstream tasks, e.g., cross-domain semantic segmentation.
no code implementations • 24 Jul 2022 • Yang Zhao, Yongan Zhang, Yonggan Fu, Xu Ouyang, Cheng Wan, Shang Wu, Anton Banta, Mathews M. John, Allison Post, Mehdi Razavi, Joseph Cavallaro, Behnaam Aazhang, Yingyan Lin
This work presents the first silicon-validated dedicated EGM-to-ECG (G2C) processor, dubbed e-G2C, featuring continuous lightweight anomaly detection, event-driven coarse/precise conversion, and on-chip adaptation.
no code implementations • 2 Jul 2022 • Yang Zhao, Yan Song
To obtain more information to optimize the model, the existing method generated pseudo frame-wise labels iteratively based on the output of a segmentation model and the timestamp annotations.
no code implementations • 10 Jun 2022 • Yang Zhao, Xuan Lin, Wenqiang Xu, Maozong Zheng, Zhengyong Liu, Zhou Zhao
Recently, streaming technology has greatly advanced the field of livestreaming.
no code implementations • 25 May 2022 • Mingxuan Lu, Zhichao Han, Susie Xi Rao, Zitao Zhang, Yang Zhao, Yinan Shan, Ramesh Raghunathan, Ce Zhang, Jiawei Jiang
Apart from rule-based and machine learning filters that are already deployed in production, we want to enable efficient real-time inference with graph neural networks (GNNs), which is useful to catch multihop risk propagation in a transaction graph.
no code implementations • 6 May 2022 • Wanting Lyu, Yue Xiu, Yang Zhao, Chadi Assi, Zhongpei Zhang
In this paper, we investigate an outdoor and indoor wireless communication network with the assistance of a novel relay-aided double-sided reconfigurable intelligent surface (RIS).
no code implementations • 23 Apr 2022 • Yang Zhao, Kai Zhang, Haotian Yu, Yi Zhang, Dongliang Zheng, Jing Han
Simultaneous Localization and Mapping (SLAM) plays an important role in outdoor and indoor applications ranging from autonomous driving to indoor robotics.
1 code implementation • 22 Apr 2022 • Susie Xi Rao, Clémence Lanfranchi, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Mo Cheng, Yinan Shan, Yang Zhao, Ce Zhang
At online retail platforms, detecting fraudulent accounts and transactions is crucial to improve customer experience, minimize loss, and avoid unauthorized transactions.
no code implementations • 18 Mar 2022 • Yang Zhao, Hao Zhang, Xiuyuan Hu
Optimizers in RST would perform a Bernoulli trial at each iteration to choose randomly from base algorithms (SGD) and sharpness-aware algorithms (SAM) with a probability arranged by a predefined scheduling function.
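The selection rule described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the linear scheduling function, the one-dimensional toy loss, the step sizes, and all function names are hypothetical. At each iteration a Bernoulli trial with a scheduled probability decides whether the update is a plain SGD step or a SAM-style step, which evaluates the gradient at a perturbed point.

```python
import random

def linear_schedule(step, total_steps):
    """Hypothetical scheduling function: probability of taking a SAM step,
    growing linearly from 0 at the first step to 1 at the last step."""
    return min(1.0, step / max(1, total_steps - 1))

def grad(w):
    """Gradient of the toy loss f(w) = 0.5 * w**2 (an assumption for this sketch)."""
    return w

def rst_train(w0, total_steps, lr=0.1, rho=0.05, seed=0):
    """Run the randomized SGD/SAM selection on the toy loss.

    Returns the final weight and the number of SAM steps taken.
    """
    rng = random.Random(seed)
    w = w0
    sam_steps = 0
    for t in range(total_steps):
        p = linear_schedule(t, total_steps)
        use_sam = rng.random() < p  # Bernoulli trial with probability p
        g = grad(w)
        if use_sam:
            # SAM-style step: move to a nearby perturbed point along the
            # normalized gradient, then use the gradient found there.
            norm = abs(g) or 1.0
            g = grad(w + rho * g / norm)
            sam_steps += 1
        w = w - lr * g
    return w, sam_steps

if __name__ == "__main__":
    final_w, n_sam = rst_train(w0=1.0, total_steps=50)
    print(final_w, n_sam)
```

With this schedule the first iteration is always an SGD step and the last is always a SAM step; other scheduling functions only change `linear_schedule`.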
no code implementations • 7 Mar 2022 • Yifan Chen, Yang Zhao, Xuelong Li
In this paper, we try to enhance the discrimination of spatio-temporal gait features from two aspects: effective extraction of spatio-temporal gait features and reasonable refinement of extracted features.
1 code implementation • 8 Feb 2022 • Yang Zhao, Hao Zhang, Xiuyuan Hu
In this paper, we propose an effective method to improve the model generalization by additionally penalizing the gradient norm of loss function during optimization.
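As a rough illustration of penalizing the gradient norm, the sketch below forms the gradient of a penalized objective L(w) + lam * ||grad L(w)|| on a toy quadratic loss. The toy loss, the coefficient `lam`, the finite-difference radius `r`, and the function names are all assumptions for this example. The gradient of the norm term involves a Hessian-vector product, which is approximated here with a finite difference of two gradient evaluations; this is one common way to avoid forming the Hessian and is not claimed to be the exact procedure used in the paper.

```python
import math

def grad(w, a):
    """Gradient of the toy loss L(w) = 0.5 * sum(a_i * w_i**2)."""
    return [ai * wi for ai, wi in zip(a, w)]

def penalized_grad(w, a, lam=0.1, r=0.01):
    """Gradient of L(w) + lam * ||grad L(w)||.

    The gradient of the norm term equals H @ (g / ||g||), where H is the
    Hessian. It is approximated as (grad(w + r * g / ||g||) - grad(w)) / r.
    """
    g = grad(w, a)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # At a stationary point the norm term is not differentiable;
        # this sketch simply returns the plain gradient.
        return g
    w_shift = [wi + r * gi / norm for wi, gi in zip(w, g)]
    g_shift = grad(w_shift, a)
    return [gi + lam * (gsi - gi) / r for gi, gsi in zip(g, g_shift)]

if __name__ == "__main__":
    print(penalized_grad([1.0, 0.0], [2.0, 1.0]))
```

A training loop would use `penalized_grad` in place of the plain gradient, at the cost of one extra gradient evaluation per step.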
1 code implementation • 27 Jan 2022 • Yang Zhao, Peng Guo, Zihao Sun, Xiuwan Chen, Han Gao
The performance of a semantic segmentation model for remote sensing (RS) images pretrained on an annotated dataset would greatly decrease when testing on another unannotated dataset because of the domain gap.
no code implementations • 16 Jan 2022 • Yang Zhao, Hao Zhang
NRS leverages the finding that models would benefit from converging to flat minima, and tries to regularize the neighborhood region in weight space to yield approximate outputs.
no code implementations • 26 Nov 2021 • Yang Zhao, Junbin Qiu, Mingshan Xie, Haiping Huang
The binary perceptron is a fundamental model of supervised learning with non-convex optimization, and it is a root of modern deep learning.
no code implementations • 9 Oct 2021 • Mingxuan Lu, Zhichao Han, Zitao Zhang, Yang Zhao, Yinan Shan
Transaction checkout fraud detection is an essential risk control component for e-commerce marketplaces.
no code implementations • CVPR 2022 • Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Changyou Chen, Xuhui Jia
While existing approaches for face restoration make significant progress in generating high-quality faces, they often fail to preserve facial features and cannot authentically reconstruct the faces.
no code implementations • 29 Sep 2021 • Yuan Chai, Liang He, Yang Zhao, Xueyan Li, Zhenxin Wang
The model was evaluated across a wide range of time-series tasks that are commonly used to benchmark TCNs and recurrent networks.
no code implementations • 29 Sep 2021 • Yang Zhao, Yanbo Ma, Yuan Chen, Wei Jia, Ronggang Wang, Xiaoping Liu
Early interlaced videos usually contain both interlacing and complex compression artifacts, which significantly reduce the visual quality.
Ranked #1 on Video Deinterlacing on MSU Deinterlacer Benchmark
no code implementations • 29 Sep 2021 • Chaojian Li, Xu Ouyang, Yang Zhao, Haoran You, Yonggan Fu, Yuchen Gu, Haonan Liu, Siyuan Miao, Yingyan Lin
Graph Convolutional Networks (GCNs) have gained increasing attention thanks to their state-of-the-art (SOTA) performance in graph-based learning tasks.
no code implementations • 11 Sep 2021 • Yonggan Fu, Yang Zhao, Qixuan Yu, Chaojian Li, Yingyan Celine Lin
The recent breakthroughs of deep neural networks (DNNs) and the advent of billions of Internet of Things (IoT) devices have excited an explosive demand for intelligent IoT devices equipped with domain-specific DNN accelerators.
1 code implementation • 3 Jul 2021 • Jun Wang, Yang Zhao, Linglong Qian, Xiaohan Yu, Yongsheng Gao
The precise detection of blood vessels in retinal images is crucial to the early diagnosis of retinal vascular diseases, e.g., diabetic, hypertensive and solar retinopathies.
no code implementations • CVPR 2021 • Yang Zhao, Zhou Zhao, Zhu Zhang, Zhijie Lin
Temporal video grounding aims to localize the target segment which is semantically aligned with the given sentence in an untrimmed video.
1 code implementation • 6 Apr 2021 • Xin Wang, Yang Zhao, Tangwen Yang, Qiuqi Ruan
In this paper, we propose a multi-scale context aggregation network (MSCANet) based on single-column encoder-decoder architecture for crowd counting, which consists of an encoder based on a dense context-aware module (DCAM) and a hierarchical attention-guided decoder.
no code implementations • 2 Apr 2021 • Yang Zhao, Hao Zhang
By training DNNs with a wide range of generalization gaps on popular datasets, we show that our key quantities and linear model could be efficient tools for estimating the generalization gap of DNNs.
no code implementations • 26 Mar 2021 • Dewang Hou, Yang Zhao, Yuyao Ye, Jiayu Yang, Jian Zhang, Ronggang Wang
Scaling and lossy coding are widely used in video transmission and storage.
1 code implementation • 19 Mar 2021 • Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Yingyan Lin
To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).
Hardware Aware Neural Architecture Search
Neural Architecture Search
no code implementations • ICLR 2022 • Yang Zhao, Hao Zhang
We show that by investigating the feature entropy of units on only training data, it could give discrimination between networks with different generalization ability from the view of the effectiveness of feature representations.
no code implementations • 19 Jan 2021 • Xianlin Song, Ao Teng, Jianshuang Wei, Hao Chen, Yang Zhao, Jianheng Chen, Fangwei Liu, Qianxiang Wan, Guoning Huang, Lingfang Song, Aojie Zhao, Bo Li, Zihao Li, Qiming He, Jinhong Zhang
As a non-destructive biological tissue imaging technology, photoacoustic imaging has important application value in the field of biomedicine.
Biological Physics
no code implementations • 7 Jan 2021 • Zhenyuan Feng, Bruno Clerckx, Yang Zhao
This paper highlights the fact that IRS can provide an extra passive beamforming gain on output DC power over conventional WPT designs and significantly influence the waveform design by leveraging the benefit of passive beamforming, frequency diversity and energy harvester nonlinearity.
Information Theory, Signal Processing
1 code implementation • 4 Jan 2021 • Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Yonggan Fu, Yingyan Lin, Zhangyang Wang
Results show that: 1) applied to inference, SD achieves up to 2.44x energy efficiency as evaluated via real hardware implementations; 2) applied to training, SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
no code implementations • 2 Jan 2021 • Ping Yu, Ruiyi Zhang, Yang Zhao, Yizhe Zhang, Chunyuan Li, Changyou Chen
Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision.
no code implementations • ICLR 2021 • Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin
To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance (e.g., energy cost and latency) of all the networks in the search space of both NAS-Bench-201 and FBNet, considering six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).
Hardware Aware Neural Architecture Search
Neural Architecture Search
1 code implementation • ICCV 2021 • Xiaohan Yu, Yang Zhao, Yongsheng Gao, Xiaohui Yuan, Shengwu Xiong
The proposed UFG image dataset and evaluation protocols are intended to serve as a benchmark platform that can advance research of visual classification from approaching human performance to beyond human ability, via facilitating benchmark data of artificial intelligence (AI) not to be limited by the labels of human intelligence (HI).
no code implementations • ICLR 2021 • Yang Zhao, Jianwen Xie, Ping Li
Energy-based models (EBMs) for generative modeling parametrize a single net and can be directly trained by maximum likelihood estimation.
1 code implementation • NeurIPS 2020 • Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Celine Lin
Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs.
no code implementations • 21 Dec 2020 • Yang Zhao, Wenchao Zhai, Jun Zhao, Tinghao Zhang, Sumei Sun, Dusit Niyato, Kwok-Yan Lam
First, we give an overview of 6G from perspectives of technologies, security and privacy, and applications.
no code implementations • 20 Dec 2020 • Susie Xi Rao, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Mo Cheng, Yinan Shan, Yang Zhao, Ce Zhang
Massive account registration has raised concerns on risk management in e-commerce companies, especially when registration increases rapidly within a short time frame.
1 code implementation • 10 Dec 2020 • Yang Zhao, Bruno Clerckx, Zhenyuan Feng
To facilitate practical implementation, we also propose a low-complexity design based on closed-form adaptive waveform schemes.
Information Theory, Signal Processing
no code implementations • 2 Dec 2020 • Yang Zhao, Chunyuan Li, Ping Yu, Changyou Chen
Few-shot learning features the capability of generalizing from a few examples.
no code implementations • COLING 2020 • Yang Zhao, Lu Xiang, Junnan Zhu, Jiajun Zhang, Yu Zhou, Chengqing Zong
Previous studies combining knowledge graph (KG) with neural machine translation (NMT) have two problems: i) Knowledge under-utilization: they only focus on the entities that appear in both KG and training sentence pairs, making much knowledge in KG unable to be fully utilized.
1 code implementation • CVPR 2021 • Yang Zhao, Changyou Chen
Instead of explicitly extracting the two codes and applying adaptive instance normalization to combine them, our latent EBM can implicitly learn to transport the source style code to the target style code while preserving the content code, an advantage over existing image translation methods.
no code implementations • 27 Nov 2020 • Yang Zhao, Wei Jia, Ronggang Wang
Traditional deinterlacing approaches are mainly focused on early interlacing scanning systems and thus cannot handle the complex and complicated artifacts in real-world early interlaced videos.
1 code implementation • 24 Nov 2020 • Susie Xi Rao, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Zhiyao Chen, Yinan Shan, Yang Zhao, Ce Zhang
At online retail platforms, it is crucial to actively detect the risks of transactions to improve customer experience and minimize financial loss.
no code implementations • 2 Nov 2020 • Yang Zhao, Hao Zhang, Xiuyuan Hu
Identifying the role of network units in deep neural networks (DNNs) is critical in many aspects including giving understandings on the mechanisms of DNNs and building basic connections between deep learning and neuroscience.
1 code implementation • COLING 2020 • Vitou Phy, Yang Zhao, Akiko Aizawa
For instance, specificity is mandatory in a food-ordering dialogue task, whereas fluency is preferred in a language-teaching dialogue system.
no code implementations • EMNLP 2020 • Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong
Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence.
1 code implementation • EMNLP 2020 • Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana
Q-learning is leveraged to train the agent to produce proper edit actions.
1 code implementation • 5 Oct 2020 • Rishikesh Magar, Lalit Ghule, Junhan Li, Yang Zhao, Amir Barati Farimani
In this work, we analyze vibration signal data of mechanical systems with bearings by combining different signal processing methods and coupling them with machine learning techniques to classify different types of bearing faults.
Ranked #3 on Classification on CWRU Bearing Dataset (using extra training data)
1 code implementation • ECCV 2020 • Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen
Generating long-range skeleton-based human actions has been a challenging problem since small deviations of one frame can cause a malformed action sequence.
Ranked #2 on Human action generation on NTU RGB+D 2D
no code implementations • WS 2020 • Qian Wang, Yuchen Liu, Cong Ma, Yu Lu, Yining Wang, Long Zhou, Yang Zhao, Jiajun Zhang, Cheng-qing Zong
This paper describes CASIA's system for the IWSLT 2020 open domain translation task.
no code implementations • 8 Jun 2020 • Hans Albert Lianto, Yang Zhao, Jun Zhao
In a case where the aggregator is untrusted and LDP is not applied to each user gradient, the aggregator can recover sensitive user data from these gradients.
no code implementations • 7 May 2020 • Yang Zhao, Xiaohan Chen, Yue Wang, Chaojian Li, Haoran You, Yonggan Fu, Yuan Xie, Zhangyang Wang, Yingyan Lin
We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).
no code implementations • 3 May 2020 • Weitao Li, Pengfei Xu, Yang Zhao, Haitong Li, Yuan Xie, Yingyan Lin
Resistive-random-access-memory (ReRAM) based processing-in-memory (R$^2$PIM) accelerators show promise in bridging the gap between Internet of Things devices' constrained resources and Convolutional/Deep Neural Networks' (CNNs/DNNs') prohibitive energy cost.
1 code implementation • ICLR 2020 • Zhenyi Wang, Yang Zhao, Ping Yu, Ruiyi Zhang, Changyou Chen
Specifically, we propose a Bayesian meta sampling framework consisting of two main components: a meta sampler and a sample adapter.
no code implementations • 22 Apr 2020 • Yang Zhao, Ping Yu, Suchismit Mahapatra, Qinliang Su, Changyou Chen
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
no code implementations • 19 Apr 2020 • Yang Zhao, Jun Zhao, Mengmeng Yang, Teng Wang, Ning Wang, Lingjuan Lyu, Dusit Niyato, Kwok-Yan Lam
To avoid the privacy threat and reduce the communication cost, in this paper, we propose to integrate federated learning and local differential privacy (LDP) to help crowdsourcing applications obtain the machine learning model.
2 code implementations • ICML 2020 • Yang Zhao, Chunyuan Li, Ping Yu, Jianfeng Gao, Changyou Chen
The instability in GAN training has been a long-standing problem despite remarkable research efforts.
Ranked #1 on Image-to-Image Translation on anime-to-selfie
no code implementations • 20 Mar 2020 • Xin-Yu Zhang, Yang Zhao, Hao Zhang
Pose angle causes many problems in facial recognition: at present, the feature extraction network often produces eigenvectors that differ widely between the frontal face and the profile face of the same person.
no code implementations • 2 Mar 2020 • Hongjie Wang, Yang Zhao, Chaojian Li, Yue Wang, Yingyan Lin
The excellent performance of modern deep neural networks (DNNs) comes at an often prohibitive training cost, limiting the rapid development of DNN innovations and raising various environmental concerns.
no code implementations • 26 Feb 2020 • Yang Zhao, Chaojian Li, Yue Wang, Pengfei Xu, Yongan Zhang, Yingyan Lin
The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators.
1 code implementation • CVPR 2020 • Zhu Zhang, Zhou Zhao, Yang Zhao, Qi Wang, Huasheng Liu, Lianli Gao
In this paper, we consider a novel task, Spatio-Temporal Video Grounding for Multi-Form Sentences (STVG).
1 code implementation • 6 Jan 2020 • Pengfei Xu, Xiaofan Zhang, Cong Hao, Yang Zhao, Yongan Zhang, Yue Wang, Chaojian Li, Zetong Guan, Deming Chen, Yingyan Lin
Specifically, AutoDNNchip consists of two integrated enablers: (1) a Chip Predictor, built on top of a graph-based accelerator representation, which can accurately and efficiently predict a DNN accelerator's energy, throughput, and area based on the DNN model parameters, hardware configuration, technology-based IPs, and platform constraints; and (2) a Chip Builder, which can automatically explore the design space of DNN chips (including IP selection, block configuration, resource balancing, etc.