no code implementations • 26 Jan 2025 • Junan Zhang, Jing Yang, Zihao Fang, Yuancheng Wang, Zehua Zhang, Zhuo Wang, Fan Fan, Zhizheng Wu
Based on a masked generative model, AnyEnhance is capable of handling both speech and singing voices, supporting a wide range of enhancement tasks including denoising, dereverberation, declipping, super-resolution, and target speaker extraction, all simultaneously and without fine-tuning.
no code implementations • 19 Sep 2024 • Rengan Xu, Junjie Yang, Yifan Xu, Hong Li, Xing Liu, Devashish Shankar, Haoci Zhang, Meng Liu, Boyang Li, Yuxi Hu, Mingwei Tang, Zehua Zhang, Tunhou Zhang, Dai Li, Sijia Chen, Gian-Paolo Musumeci, Jiaqi Zhai, Bill Zhu, Hong Yan, Srihari Reddy
In production models, we observe 10% QPS improvement and 18% memory savings, enabling us to scale our recommendation systems with longer features and more complex architectures.
1 code implementation • 17 Jul 2024 • Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen, Alois Knoll
The core concept is the design of the coarse-to-fine fusion module, denoted as the cross-modality adaptive feature refinement (CAFR) module.
Ranked #1 on Object Detection on PKU-DDD17-Car
no code implementations • 9 Jun 2024 • Mingwei Tang, Meng Liu, Hong Li, Junjie Yang, Chenglin Wei, Boyang Li, Dai Li, Rengan Xu, Yifan Xu, Zehua Zhang, Xiangyu Wang, Linfeng Liu, Yuelei Xie, Chengye Liu, Labib Fawaz, Li Li, Hongnan Wang, Bill Zhu, Sri Reddy
In recommendation systems, high-quality user embeddings can capture subtle preferences, enable precise similarity calculations, and adapt to changing preferences over time to maintain relevance.
1 code implementation • 15 Mar 2024 • Jiarui Li, Ye Yuan, Zehua Zhang
We propose an end-to-end system design that uses Retrieval Augmented Generation (RAG) to improve the factual accuracy of Large Language Models (LLMs) on domain-specific and time-sensitive queries about private knowledge bases.
1 code implementation • 24 Feb 2024 • Zehua Zhang, Zijie Li, Amir Barati Farimani
We propose a mask pretraining method for Graph Neural Networks (GNNs) to improve their performance on fitting potential energy surfaces, particularly in water systems.
1 code implementation • 16 Feb 2024 • Divij Handa, Zehua Zhang, Amir Saeidi, Chitta Baral
Building upon this, we introduce Layered Attacks using Custom Encryptions (LACE), which employs multiple layers of encryption through our custom ciphers to further increase the attack success rate (ASR).
1 code implementation • 25 Oct 2022 • Zehua Zhang, Shilin Sun, Guixiang Ma, Caiming Zhong
Link prediction tasks focus on predicting possible future connections.
no code implementations • 15 Mar 2022 • Zehua Zhang, Lu Zhang, Xuyi Zhuang, Yukun Qian, Heng Li, Mingjiang Wang
In recent years, deep learning-based approaches have significantly improved the performance of single-channel speech enhancement.
no code implementations • 9 Jul 2021 • Lu Zhang, Mingjiang Wang, Andong Li, Zehua Zhang, Xuyi Zhuang
In this study, we exploit the contribution of multi-target joint learning to model generalization, and propose a lightweight, low-computation dilated convolutional network (DCN) for more robust speech denoising.
no code implementations • 9 Jun 2021 • Lu Zhang, Mingjiang Wang, Zehua Zhang, Xuyi Zhuang
In this paper, we propose a multi-branch dilated convolutional network (DCN) to simultaneously enhance the magnitude and phase of noisy speech.
no code implementations • 23 Nov 2020 • Zehua Zhang, David Crandall
We present a novel technique for self-supervised video representation learning by: (a) decoupling the learning objective into two contrastive subtasks respectively emphasizing spatial and temporal features, and (b) performing it hierarchically to encourage multi-scale understanding.
no code implementations • NeurIPS 2020 • Hu Liu, Jing Lu, Xiwei Zhao, Sulong Xu, Hao Peng, Yutong Liu, Zehua Zhang, Jian Li, Junsheng Jin, Yongjun Bao, Weipeng Yan
First, conventional attention mechanisms mostly limit the attention field to a single user's behaviors, which is unsuitable for e-commerce, where users often look for new demands that are unrelated to any historical behavior.
no code implementations • 18 Jun 2020 • Hu Liu, Jing Lu, Hao Yang, Xiwei Zhao, Sulong Xu, Hao Peng, Zehua Zhang, Wenjie Niu, Xiaokun Zhu, Yongjun Bao, Weipeng Yan
Existing algorithms usually extract visual features using off-the-shelf Convolutional Neural Networks (CNNs) and fuse the visual and non-visual features late to produce the final predicted CTR.
no code implementations • 12 Mar 2020 • Zehua Zhang, Ashish Tawari, Sujitha Martin, David Crandall
A vehicle driving along the road is surrounded by many objects, but only a small subset of them influence the driver's decisions and actions.
1 code implementation • NeurIPS 2019 • Zehua Zhang, Chen Yu, David Crandall
Due to the foveated nature of the human vision system, people can focus their visual attention on a small region of their visual field at a time, which usually contains only a single object.