no code implementations • ECNLP (ACL) 2022 • Kristen Howell, Jian Wang, Akshay Hazare, Joseph Bradley, Chris Brew, Xi Chen, Matthew Dunn, Beth Hockey, Andrew Maurer, Dominic Widdows
We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain.
no code implementations • CCL 2022 • Houli Ma, Ling Dong, Wenjun Wang, Jian Wang, Shengxiang Gao, Zhengtao Yu
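The encoder in speech translation must encode both the acoustic and the semantic information in speech, and a single Fbank or Wav2vec2 speech feature has insufficient representational capacity on its own. By analyzing the differences between hand-crafted Fbank features and self-supervised Wav2vec2 features, this paper proposes an acoustic feature fusion method based on a cross-attention mechanism and explores different self-supervised features and fusion schemes, strengthening the model's learning of the acoustic and semantic information in speech. Taking the characteristics of Vietnamese speech into account, the Fbank representation is encoded with Fbank features as the primary input and Pitch features as an auxiliary input, yielding a multi-feature-fusion Vietnamese-English speech translation model. Experiments show that the multi-feature speech translation model outperforms single-feature models and is more effective than simple feature concatenation; the proposed multi-feature fusion method improves BLEU by 1.97 on the Vietnamese-English speech translation task.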
no code implementations • COLING 2022 • Bo Xu, Hongtong Zhang, Jian Wang, Xiaokun Zhang, Dezhi Hao, Linlin Zong, Hongfei Lin, Fenglong Ma
We collected and annotated a wide range of meta-data with respect to medical dialogue including doctor profiles, hospital departments, diseases and symptoms for fine-grained analysis on language usage pattern and clinical diagnosis.
no code implementations • COLING 2022 • Jinzhong Ning, Zhihao Yang, Zhizheng Wang, Yuanyuan Sun, Hongfei Lin, Jian Wang
Chinese Named Entity Recognition (NER) has continued to attract research attention.
no code implementations • 20 May 2025 • Lanlan Kang, Jian Wang, Jian Qin, Yiqin Liang, Yongjun He
The ThinPrep Cytologic Test (TCT) is the most widely used method for cervical cancer screening, and the sample quality directly impacts the accuracy of the diagnosis.
no code implementations • 13 May 2025 • Jian Wang, Baoyuan Wu, Li Liu, Qingshan Liu
The rapid evolution of generative AI has increased the threat of realistic audio-visual deepfakes, demanding robust detection methods.
no code implementations • 13 May 2025 • Yuhan Zhu, Haojie Liu, Jian Wang, Bing Li, Zikang Yin, Yefei Liao
AaaS-AN unifies the entire agent lifecycle, including construction, integration, interoperability, and networked collaboration, through two core components: (1) a dynamic Agent Network, which models agents and agent groups as vertexes that self-organize within the network based on task and role dependencies; (2) service-oriented agents, incorporating service discovery, registration, and interoperability protocols.
no code implementations • 11 May 2025 • Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang, Jingji Chen, Jingjia Huang, Kang Lei, Liping Yuan, Lishu Luo, PengFei Liu, Qinghao Ye, Rui Qian, Shen Yan, Shixiong Zhao, Shuai Peng, Shuangye Li, Sihang Yuan, Sijin Wu, Tianheng Cheng, Weiwei Liu, Wenqian Wang, Xianhan Zeng, Xiao Liu, Xiaobo Qin, Xiaohan Ding, Xiaojun Xiao, Xiaoying Zhang, Xuanwei Zhang, Xuehan Xiong, Yanghua Peng, Yangrui Chen, Yanwei Li, Yanxu Hu, Yi Lin, Yiyuan Hu, Yiyuan Zhang, Youbin Wu, Yu Li, Yudong Liu, Yue Ling, Yujia Qin, Zanbo Wang, Zhiwu He, Aoxue Zhang, Bairen Yi, Bencheng Liao, Can Huang, Can Zhang, Chaorui Deng, Chaoyi Deng, Cheng Lin, Cheng Yuan, Chenggang Li, Chenhui Gou, Chenwei Lou, Chengzhi Wei, Chundian Liu, Chunyuan Li, Deyao Zhu, Donghong Zhong, Feng Li, Feng Zhang, Gang Wu, Guodong Li, Guohong Xiao, Haibin Lin, Haihua Yang, Haoming Wang, Heng Ji, Hongxiang Hao, Hui Shen, Huixia Li, Jiahao Li, Jialong Wu, Jianhua Zhu, Jianpeng Jiao, Jiashi Feng, Jiaze Chen, Jianhui Duan, Jihao Liu, Jin Zeng, Jingqun Tang, Jingyu Sun, Joya Chen, Jun Long, Junda Feng, Junfeng Zhan, Junjie Fang, Junting Lu, Kai Hua, Kai Liu, Kai Shen, Kaiyuan Zhang, Ke Shen, Ke Wang, Keyu Pan, Kun Zhang, Kunchang Li, Lanxin Li, Lei LI, Lei Shi, Li Han, Liang Xiang, Liangqiang Chen, Lin Chen, Lin Li, Lin Yan, Liying Chi, Longxiang Liu, Mengfei Du, Mingxuan Wang, Ningxin Pan, Peibin Chen, Pengfei Chen, Pengfei Wu, Qingqing Yuan, Qingyao Shuai, Qiuyan Tao, Renjie Zheng, Renrui Zhang, Ru Zhang, Rui Wang, Rui Yang, Rui Zhao, Shaoqiang Xu, Shihao Liang, Shipeng Yan, Shu Zhong, Shuaishuai Cao, Shuangzhi Wu, Shufan Liu, Shuhan Chang, Songhua Cai, Tenglong Ao, Tianhao Yang, Tingting Zhang, Wanjun Zhong, Wei Jia, Wei Weng, Weihao Yu, Wenhao Huang, Wenjia Zhu, Wenli Yang, Wenzhi Wang, Xiang Long, XiangRui Yin, Xiao Li, Xiaolei Zhu, Xiaoying Jia, Xijin Zhang, Xin Liu, Xinchen Zhang, Xinyu Yang, Xiongcai Luo, Xiuli 
Chen, Xuantong Zhong, Xuefeng Xiao, Xujing Li, Yan Wu, Yawei Wen, Yifan Du, Yihao Zhang, Yining Ye, Yonghui Wu, Yu Liu, Yu Yue, Yufeng Zhou, Yufeng Yuan, Yuhang Xu, Yuhong Yang, Yun Zhang, Yunhao Fang, Yuntao Li, Yurui Ren, Yuwen Xiong, Zehua Hong, Zehua Wang, Zewei Sun, Zeyu Wang, Zhao Cai, Zhaoyue Zha, Zhecheng An, Zhehui Zhao, Zhengzhuo Xu, Zhipeng Chen, Zhiyong Wu, Zhuofan Zheng, ZiHao Wang, Zilong Huang, Ziyu Zhu, Zuquan Song
We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.
no code implementations • 6 May 2025 • Wei-Ting Chen, Yu-Jiet Vong, Yi-Tsung Lee, Sy-Yen Kuo, Qiang Gao, Sizhuo Ma, Jian Wang
To address this limitation, we introduce a novel VQA framework, DiffVQA, which harnesses the robust generalization capabilities of diffusion models pre-trained on extensive datasets.
1 code implementation • 2 May 2025 • Yujing Zhou, Marc L. Jacquet, Robel Dawit, Skyler Fabre, Dev Sarawat, Faheem Khan, Madison Newell, Yongxin Liu, Dahai Liu, Hongyun Chen, Jian Wang, Huihui Wang
The increasing automation of traffic management systems has made them prime targets for cyberattacks, disrupting urban mobility and public safety.
1 code implementation • 2 May 2025 • Yujing Zhou, Marc L. Jacquet, Robel Dawit, Skyler Fabre, Dev Sarawat, Faheem Khan, Madison Newell, Yongxin Liu, Dahai Liu, Hongyun Chen, Jian Wang, Huihui Wang
This paper presents our simulation of cyber-attacks and detection strategies on the traffic control system in Daytona Beach, FL.
no code implementations • 1 May 2025 • Aditya Arora, Zhengzhong Tu, YuFei Wang, Ruizheng Bai, Jian Wang, Sizhuo Ma
In this paper, we propose GuideSR, a novel single-step diffusion-based image super-resolution (SR) model specifically designed to enhance image fidelity.
no code implementations • 21 Apr 2025 • Chunjing Gan, Dan Yang, Binbin Hu, Ziqi Liu, Yue Shen, Zhiqiang Zhang, Jian Wang, Jun Zhou
Large language models (LLMs) have become a disruptive force in the industry, introducing unprecedented capabilities in natural language processing, logical reasoning and so on.
no code implementations • 19 Apr 2025 • Chen Guo, Zhuo Su, Jian Wang, Shuang Li, Xu Chang, Zhaohu Li, Yang Zhao, Guidong Wang, Ruqi Huang
Creating photorealistic 3D head avatars from limited input has become increasingly important for applications in virtual reality, telepresence, and digital entertainment.
no code implementations • 11 Apr 2025 • Jian Wang, Rishabh Dabral, Diogo Luvizon, Zhe Cao, Lingjie Liu, Thabo Beeler, Christian Theobalt
First, the IMU sensor inputs, the optional egocentric image, and text description of human motion are encoded into the latent space of a motion VQ-VAE.
1 code implementation • 10 Apr 2025 • Bo Zhang, Hui Ma, Dailin Li, Jian Ding, Jian Wang, Bo Xu, Hongfei Lin
Large language models (LLMs) demonstrate remarkable text comprehension and generation capabilities but often lack the ability to utilize up-to-date or domain-specific knowledge not included in their training data.
no code implementations • 10 Apr 2025 • Xinyang Zhou, Yongyong Ren, Qianqian Zhao, Daoyi Huang, Xinbo Wang, Tingting Zhao, Zhixing Zhu, Wenyuan He, Shuyuan Li, Yan Xu, Yu Sun, Yongguo Yu, Shengnan Wu, Jian Wang, Guangjun Yu, Dake He, Bo Ban, Hui Lu
Accurate diagnosis of Mendelian diseases is crucial for precision therapy and assistance in preimplantation genetic diagnosis.
no code implementations • 4 Apr 2025 • Sheng Yang, Tong Zhan, Shichen Qiao, Jicheng Gong, Qing Yang, YanFeng Lu, Jian Wang
In typical traffic scenarios such as the VoD (View-of-Delft) dataset, experiments show that, at a reasonable inference speed, ZFusion achieves state-of-the-art mAP (mean average precision) in the region of interest and competitive mAP over the entire area compared with the baseline methods, demonstrating performance close to LiDAR and greatly outperforming camera-only methods.
no code implementations • 3 Apr 2025 • Yunhao Lv, Lingyu Chen, Jian Wang, Yangxi Li, Fang Chen
In recent years, deep learning methods such as convolutional neural networks (CNNs) and transformers have made significant progress in CT multi-organ segmentation.
no code implementations • 2 Apr 2025 • Jian Wang, Zhuo Zhao, Zeng Jie Wang, Bo Da Cheng, Lei Nie, Wen Luo, Zhao Yuan Yu, Ling Wang Yuan
Geographic Question Answering (GeoQA) addresses natural language queries in geographic domains to fulfill complex user demands and improve information retrieval efficiency.
1 code implementation • 31 Mar 2025 • YuFei Wang, Lanqing Guo, Zhihao LI, Jiaxing Huang, Pichao Wang, Bihan Wen, Jian Wang
Text-guided image editing is an essential task that enables users to modify images through natural language descriptions.
no code implementations • 31 Mar 2025 • Jian Wang, Xin Lan, Jizhe Zhou, Yuxin Tian, Jiancheng Lv
Instead of direct quantization, we first map the input latent variables into a less entangled "style" space and apply quantization using a learnable codebook.
no code implementations • 29 Mar 2025 • Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado, Rishabh Dabral, Thabo Beeler, Marc Habermann, Christian Theobalt
Egocentric motion capture with a head-mounted body-facing stereo camera is crucial for VR and AR applications but presents significant challenges such as heavy occlusions and limited annotated real-world data.
no code implementations • 20 Mar 2025 • Inwoo Hwang, Bing Zhou, Young Min Kim, Jian Wang, Chuan Guo
Modeling human-scene interactions (HSI) is essential for understanding and simulating everyday human behaviors.
no code implementations • 18 Mar 2025 • Yongqi Li, Lu Yang, Jian Wang, Runyang You, Wenjie Li, Liqiang Nie
Additionally, applying BPO to the MMSafe-PO dataset greatly reduces the base MLLM's unsafe rate on other safety benchmarks (14.5% on MM-SafetyBench and 82.9% on HarmEval), demonstrating the effectiveness and robustness of both the dataset and the approach.
1 code implementation • 17 Mar 2025 • Kewei Sui, Anindita Ghosh, Inwoo Hwang, Jian Wang, Chuan Guo
Humans inhabit a world defined by interactions -- with other humans, objects, and environments.
no code implementations • 15 Mar 2025 • Eric M. Chen, Di Liu, Sizhuo Ma, Michael Vasilkovsky, Bing Zhou, Qiang Gao, Wenzhou Wang, Jiahao Luo, Dimitris N. Metaxas, Vincent Sitzmann, Jian Wang
Our system is capable of producing 3D Gaussian avatars that support dynamic animation, including accurate facial expression transfer.
no code implementations • 14 Mar 2025 • Hiroyasu Akada, Jian Wang, Vladislav Golyanik, Christian Theobalt
Our experiments show that the new camera configurations with back views provide superior support for 3D pose tracking compared to only frontal placements.
1 code implementation • 13 Mar 2025 • Yunpeng Qu, Kun Yuan, Qizhi Xie, Ming Sun, Chao Zhou, Jian Wang
Inspired by the Human Visual System (HVS) that links global quality to the local texture of different regions and their visual saliency, we propose a Kaleidoscope Video Quality Assessment (KVQ) framework, which aims to effectively assess both saliency and local texture, thereby facilitating the assessment of global quality.
no code implementations • 8 Mar 2025 • Yanjun Chen, Yirong Sun, Xinghao Chen, Jian Wang, Xiaoyu Shen, Wenjie Li, Wei zhang
Chain-of-Thought (CoT) reasoning has proven effective in natural language tasks but remains underexplored in multimodal alignment.
1 code implementation • 20 Feb 2025 • Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li
To address this, we highlight the importance of timely calibration and the need to automatically construct calibration trajectories for training agents.
no code implementations • 19 Feb 2025 • Chak Tou Leong, Qingyu Yin, Jian Wang, Wenjie Li
The safety alignment of large language models (LLMs) remains vulnerable, as their initial behavior can be easily jailbroken by even relatively simple attacks.
1 code implementation • 18 Feb 2025 • Jian Wang, Yinpei Dai, Yichi Zhang, Ziqiao Ma, Wenjie Li, Joyce Chai
Intelligent tutoring agents powered by large language models (LLMs) have been increasingly explored to deliver personalized guidance in areas such as language learning and science education.
no code implementations • 11 Feb 2025 • Jusheng Zhang, Zimeng Huang, Yijia Fan, Ningyuan Liu, Mingyan Li, Zhuojie Yang, Jiawei Yao, Jian Wang, Keze Wang
As scaling large language models faces prohibitive costs, multi-agent systems emerge as a promising alternative, though challenged by static knowledge assumptions and coordination inefficiencies.
1 code implementation • 11 Feb 2025 • Christen Millerdurai, Hiroyasu Akada, Jian Wang, Diogo Luvizon, Alain Pagani, Didier Stricker, Christian Theobalt, Vladislav Golyanik
To address these limitations, we introduce EventEgo3D++, the first approach that leverages a monocular event camera with a fisheye lens for 3D human motion capture.
no code implementations • 3 Feb 2025 • HongXin Xie, Jiande Sun, Yi Shao, Shuai Li, Sujuan Hou, YuLong Sun, Jian Wang
Olfactory perception plays a critical role in both human and organismal interactions, yet understanding of its underlying mechanisms and influencing factors remains insufficient.
no code implementations • 23 Jan 2025 • Jian Wang, Xiaokang Zhang, Xianping Ma, Weikang Yu, Pedram Ghamisi
These informative prompts are able to identify the extent of landslide areas (box prompts) and denote the centers of landslide objects (point prompts), guiding SAM in landslide segmentation.
no code implementations • 28 Dec 2024 • Xingcheng Fu, Jian Wang, Yisen Gao, Qingyun Sun, Haonan Yuan, JianXin Li, Xianxian Li
CurvGIB advances the Variational Information Bottleneck (VIB) principle for Ricci curvature optimization to learn the optimal information transport pattern for specific downstream tasks.
no code implementations • 9 Dec 2024 • Howard Zhang, Yuval Alaluf, Sizhuo Ma, Achuta Kadambi, Jian Wang, Kfir Aberman
Face image restoration aims to enhance degraded facial images while addressing challenges such as diverse degradation types, real-time processing demands, and, most crucially, the preservation of identity-specific features.
no code implementations • 29 Nov 2024 • Xianfeng Tan, Yuhan Li, Wenxiang Shang, Yubo Wu, Jian Wang, Xuanhong Chen, Yi Zhang, Ran Lin, Bingbing Ni
Standard clothing asset generation involves creating forward-facing flat-lay garment images displayed on a clear background by extracting clothing information from diverse real-world contexts, which presents significant challenges due to highly standardized sampling distributions and precise structural requirements in the generated images.
no code implementations • 13 Oct 2024 • Aoqiang Wang, Jian Wang, Zhenyu Yan, Wenxiang Shang, Ran Lin, Zhao Zhang
In image editing tasks, high-quality text editing capabilities can significantly reduce human and material resource costs.
1 code implementation • 30 Sep 2024 • Dasong Li, Wenjie Li, Baili Lu, Hongsheng Li, Sizhuo Ma, Gurunandan Krishnan, Jian Wang
Understanding and modeling the popularity of User Generated Content (UGC) short videos on social media platforms presents a critical challenge with broad implications for content creators and recommendation systems.
no code implementations • 27 Sep 2024 • Lei LI, Zhifa Chen, Jian Wang, Bin Zhou, Guizhen Yu, Xiaoxuan Chen
Recently, the application of autonomous driving in open-pit mining has garnered increasing attention for achieving safe and efficient mineral transportation.
no code implementations • 22 Sep 2024 • Jianchun Chen, Jian Wang, yinda zhang, Rohit Pandey, Thabo Beeler, Marc Habermann, Christian Theobalt
Immersive VR telepresence ideally means being able to interact and communicate with digital avatars that are indistinguishable from and precisely reflect the behaviour of their real counterparts.
1 code implementation • 21 Sep 2024 • Jian Wang, Razieh Faghihpirayesh, Danny Joca, Polina Golland, Ali Gholipour
In this paper, we introduce a Universal Motion Correction (UniMo) framework, leveraging deep neural networks to tackle the challenges of motion correction across diverse imaging modalities.
no code implementations • 5 Sep 2024 • Hanlin Wang, Chak Tou Leong, Jian Wang, Wenjie Li
Language models are exhibiting increasing capability in knowledge utilization and reasoning.
no code implementations • 22 Aug 2024 • Xiaohan Wang, Xiaoyan Yang, Yuqi Zhu, Yue Shen, Jian Wang, Peng Wei, Lei Liang, Jinjie Gu, Huajun Chen, Ningyu Zhang
Large Language Models (LLMs) like GPT-4, MedPaLM-2, and Med-Gemini achieve performance competitive with human experts across various medical benchmarks.
no code implementations • 20 Aug 2024 • Jian Wang, Xin Lan, Yuxin Tian, Jiancheng Lv
Generative adversarial networks (GANs) have made impressive advances in image generation, but they often require large-scale training data to avoid degradation caused by discriminator overfitting.
no code implementations • 14 Aug 2024 • Jing Xiao, Wenrui Ding, Zeqi Shao, Duona Zhang, Yanan Ma, Yufeng Wang, Jian Wang
These challenges diminish the capability of RFFI methods in feature representation, complicating the effective identification of device identities.
1 code implementation • 12 Aug 2024 • Bohao Peng, Jian Wang, Yuechen Zhang, Wenbo Li, Ming-Chang Yang, Jiaya Jia
In this paper, we propose ControlNeXt: a powerful and efficient method for controllable image and video generation.
1 code implementation • 6 Aug 2024 • Hui Ma, Bo Zhang, Bo Xu, Jian Wang, Hongfei Lin, Xiao Sun
During reinforcement learning training, the proximal policy optimization algorithm is used to fine-tune the policy, enabling the generation of empathetic responses.
1 code implementation • 2 Aug 2024 • Yingying Zhang, Xin Guo, Jiangwei Lao, Lei Yu, Lixiang Ru, Jian Wang, Guo Ye, Huimei He, Jingdong Chen, Ming Yang
Once pre-trained, POA allows the extraction of pre-trained models of diverse sizes for downstream tasks.
no code implementations • 31 Jul 2024 • Zhe Liu, Xiliang Zhu, Tong Han, Yuhao Huang, Jian Wang, Lian Liu, Fang Wang, Dong Ni, Zhongshan Gou, Xin Yang
Since MR data is limited and has large intra-class variability, we propose an unsupervised out-of-distribution (OOD) detection method to identify MR rather than building a deep classifier.
Out-of-Distribution (OOD) Detection
no code implementations • 30 Jul 2024 • Zhixiang Wang, Baiang Li, Jian Wang, Yu-Lun Liu, Jinwei Gu, Yung-Yu Chuang, Shin'ichi Satoh
This paper introduces an innovative approach for image matting that redefines the traditional regression-based task as a generative modeling challenge.
no code implementations • 29 Jul 2024 • Jian Wang, Razieh Faghihpirayesh, Polina Golland, Ali Gholipour
In this paper, we introduce SpaER, a pioneering method for fetal motion tracking that leverages equivariant filters and self-attention mechanisms to effectively learn spatio-temporal representations.
no code implementations • 25 Jul 2024 • Jian Wang, Jing Wang, Shenghui Rong, Bo He
Underwater monocular depth estimation serves as the foundation for tasks such as 3D reconstruction of underwater scenes.
no code implementations • 22 Jul 2024 • Ziyuan Huang, Kaixiang Ji, Biao Gong, Zhiwu Qing, Qinglong Zhang, Kecheng Zheng, Jian Wang, Jingdong Chen, Ming Yang
This paper introduces Chain-of-Sight, a vision-language bridge module that accelerates the pre-training of Multimodal Large Language Models (MLLMs).
no code implementations • 8 Jul 2024 • Luzhou Xu, Jaime Lien, Haiguang Li, Nicholas Gillian, Rajeev Nongpiur, Jihan Li, Qian Zhang, Jian Cui, David Jorgensen, Adam Bernstein, Lauren Bedal, Eiji Hayashi, Jin Yamanaka, Alex Lee, Jian Wang, D Shin, Ivan Poupyrev, Trausti Thormundsson, Anupam Pathak, Shwetak Patel
This study represents the first application of the noncontact HR detection technology to sleep and meditation tracking, offering a promising alternative to wearable devices for HR monitoring during sleep and meditation.
no code implementations • 1 Jul 2024 • Yanheng Wang, Xiaohan Yu, Yongsheng Gao, Jianjun Sha, Jian Wang, Lianru Gao, Yonggang Zhang, Xianhui Rong
In this paper, we propose a spectral Kolmogorov-Arnold Network for HSIs-CD (SpectralKAN).
no code implementations • 19 Jun 2024 • Yuhan Zhu, Jian Wang, Bing Li, Xuxian Tang, Hao Li, Neng Zhang, Yuqi Zhao
Experiments conducted on the dataset collected from the benchmark show that MicroCERCL can accurately localize the root cause of microservice systems in such environments, significantly outperforming state-of-the-art approaches with an increase of at least 24.1% in top-1 accuracy.
1 code implementation • 14 Jun 2024 • Pap M. Corea, Yongxin Liu, Jian Wang, Shuteng Niu, Houbing Song
We trained all models to an accuracy of 90% on the UNSW-NB15 Dataset.
Explainable Artificial Intelligence (XAI)
1 code implementation • 14 Jun 2024 • Junwei Luo, Zhen Pang, Yongjun Zhang, Tingzhu Wang, LinLin Wang, Bo Dang, Jiangwei Lao, Jian Wang, Jingdong Chen, Yihua Tan, Yansheng Li
Remote Sensing Large Multi-Modal Models (RSLMMs) are developing rapidly and showcase significant capabilities in remote sensing imagery (RSI) comprehension.
1 code implementation • CVPR 2024 • Wei-Ting Chen, Yu-Jiet Vong, Sy-Yen Kuo, Sizhuo Ma, Jian Wang
Segment Anything Model (SAM) has emerged as a transformative approach in image segmentation, acclaimed for its robust zero-shot segmentation capabilities and flexible prompting system.
1 code implementation • CVPR 2024 • Wei-Ting Chen, Gurunandan Krishnan, Qiang Gao, Sy-Yen Kuo, Sizhuo Ma, Jian Wang
Generic Face Image Quality Assessment (GFIQA) evaluates the perceptual quality of facial images, which is crucial in improving image restoration algorithms and selecting high-quality face images for downstream tasks.
Ranked #1 on Face Image Quality Assessment on CGFIQA-40k
1 code implementation • 13 Jun 2024 • Baiang Li, Sizhuo Ma, Yanhong Zeng, Xiaogang Xu, Youqing Fang, Zhao Zhang, Jian Wang, Kai Chen
Capturing High Dynamic Range (HDR) scenery using 8-bit cameras often suffers from over-/underexposure, loss of fine details due to low bit-depth compression, skewed color distributions, and strong noise in dark areas.
no code implementations • 6 Jun 2024 • Lei Liu, Xiaoyan Yang, Junchi Lei, Yue Shen, Jian Wang, Peng Wei, Zhixuan Chu, Zhan Qin, Kui Ren
With the advent of Large Language Models (LLMs), medical artificial intelligence (AI) has experienced substantial technological progress and paradigm shifts, highlighting the potential of LLMs to streamline healthcare delivery and improve patient outcomes.
2 code implementations • 5 Jun 2024 • Qiang Chen, Xiangbo Su, Xinyu Zhang, Jian Wang, Jiahui Chen, Yunpeng Shen, Chuchu Han, Ziliang Chen, Weixiang Xu, Fanrong Li, Shan Zhang, Kun Yao, Errui Ding, Gang Zhang, Jingdong Wang
In this paper, we present a light-weight detection transformer, LW-DETR, which outperforms YOLOs for real-time object detection.
no code implementations • 25 May 2024 • Chak Tou Leong, Yi Cheng, Kaishuai Xu, Jian Wang, Hanlin Wang, Wenjie Li
In particular, we analyze the two most representative types of attack approaches: Explicit Harmful Attack (EHA) and Identity-Shifting Attack (ISA).
1 code implementation • 16 May 2024 • Bo Zhang, Hui Ma, Jian Ding, Jian Wang, Bo Xu, Hongfei Lin
Integrating multimodal knowledge into large language models (LLMs) represents a significant advancement in dialogue generation capabilities.
no code implementations • 6 May 2024 • Yingying Zhang, Chuangji Shi, Xin Guo, Jiangwei Lao, Jian Wang, Jiaotuan Wang, Jingdong Chen
The design of the query is crucial for the performance of DETR and its variants.
no code implementations • 30 Apr 2024 • Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, Florin-Alexandru Vasluianu, Zongwei Wu, George Ciubotariu, Radu Timofte, Zhao Zhang, Suiyi Zhao, Bo wang, Zhichao Zuo, Yanyan Wei, Kuppa Sai Sri Teja, Jayakar Reddy A, Girish Rongali, Kaushik Mitra, Zhihao Ma, Yongxu Liu, Wanying Zhang, Wei Shang, Yuhong He, Long Peng, Zhongxin Yu, Shaofei Luo, Jian Wang, Yuqi Miao, Baiang Li, Gang Wei, Rakshank Verma, Ritik Maheshwari, Rahul Tekchandani, Praful Hambarde, Satya Narayan Tazi, Santosh Kumar Vipparthi, Subrahmanyam Murala, Haopeng Zhang, Yingli Hou, Mingde Yao, Levin M S, Aniruth Sundararajan, Hari Kumar A
The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems.
no code implementations • 24 Apr 2024 • Yanjing Wu, Yinfu Feng, Jian Wang, WenJi Zhou, Yunan Ye, Rong Xiao, Jun Xiao
To overcome these problems, we introduce an efficient Hierarchical encoding-decoding Generative retrieval method (Hi-Gen) for large-scale personalized E-commerce search systems.
1 code implementation • CVPR 2024 • Christen Millerdurai, Hiroyasu Akada, Jian Wang, Diogo Luvizon, Christian Theobalt, Vladislav Golyanik
In response to the existing limitations, this paper 1) introduces a new problem, i.e., 3D human motion capture from an egocentric monocular event camera with a fisheye lens, and 2) proposes the first approach to it called EventEgo3D (EE3D).
no code implementations • 4 Apr 2024 • Yiming Zhang, Zhe Wang, Xinjie Li, Yunchen Yuan, Chengsong Zhang, Xiao Sun, Zhihang Zhong, Jian Wang
Human body restoration plays a vital role in various applications related to the human body.
no code implementations • 11 Mar 2024 • Debarshi Kundu, Archisman Ghosh, Srinivasan Ekambaram, Jian Wang, Nikolay Dokholyan, Swaroop Ghosh
We show that protein sequences can be thought of as sentences in natural language processing and can be parsed using the existing Quantum Natural Language framework into parameterized quantum circuits with a reasonable number of qubits, which can be trained to solve various protein-related machine-learning problems.
1 code implementation • 10 Mar 2024 • Jian Wang, Dongding Lin, Wenjie Li
Inspired by decision-making theories in cognitive science, we propose a novel target-constrained bidirectional planning (TRIP) approach, which plans an appropriate dialogue path by looking ahead and looking back.
no code implementations • 18 Feb 2024 • Jian Wang, Xin Yang, Xiaohong Jia, Wufeng Xue, Rusi Chen, Yanlin Chen, Xiliang Zhu, Lian Liu, Yan Cao, Jianqiao Zhou, Dong Ni, Ning Gu
In this study, we proposed a multi-view contrastive self-supervised method to improve thyroid nodule classification and segmentation performance with limited manual labels.
1 code implementation • 10 Feb 2024 • Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiao-Yong Wei
Tuning language models for dialogue generation has been a prevalent paradigm for building capable dialogue agents.
1 code implementation • 29 Jan 2024 • Qingpei Guo, Furong Xu, Hanxiao Zhang, Wang Ren, Ziping Ma, Lin Ju, Jian Wang, Jingdong Chen, Ming Yang
Vision-language foundation models like CLIP have revolutionized the field of artificial intelligence.
Ranked #1 on Zero-Shot Transfer Image Classification on ImageNet (using extra training data)
Zero-Shot Cross-Modal Retrieval
Zero-shot Image Retrieval
no code implementations • 19 Jan 2024 • Haowen Wang, Tao Sun, Kaixiang Ji, Jian Wang, Cong Fan, Jinjie Gu
We advance the field of Parameter-Efficient Fine-Tuning (PEFT) with our novel multi-adapter method, OrchMoE, which capitalizes on modular skill architecture for enhanced forward transfer in neural networks.
no code implementations • 12 Jan 2024 • Kaishuai Xu, Wenjun Hou, Yi Cheng, Jian Wang, Wenjie Li
Clinicians typically employ both intuitive and analytic reasoning to formulate a differential diagnosis.
no code implementations • 2 Jan 2024 • Yunpeng Qu, Zhilin Lu, Rui Zeng, Jintao Wang, Jian Wang
Modulated signals exhibit long temporal dependencies, and extracting global features is crucial in identifying modulation schemes.
no code implementations • CVPR 2024 • Yun-Hao Cao, Kaixiang Ji, Ziyuan Huang, Chuanyang Zheng, Jiajia Liu, Jian Wang, Jingdong Chen, Ming Yang
In this paper, we present a vision-inspired vision-language connection module, dubbed VIVL, which efficiently exploits the vision cue for VL models.
no code implementations • CVPR 2024 • Hiroyasu Akada, Jian Wang, Vladislav Golyanik, Christian Theobalt
Hence, existing methods often fail to accurately estimate complex 3D poses from egocentric views.
Ranked #3 on Egocentric Pose Estimation on UnrealEgo
no code implementations • 28 Dec 2023 • Pradyumna Chari, Sizhuo Ma, Daniil Ostashev, Achuta Kadambi, Gurunandan Krishnan, Jian Wang, Kfir Aberman
This approach ensures that personalization does not interfere with the restoration process, resulting in a natural appearance with high fidelity to the person's identity and the attributes of the degraded image.
1 code implementation • 19 Dec 2023 • Yi Cheng, Wenge Liu, Jian Wang, Chak Tou Leong, Yi Ouyang, Wenjie Li, Xian Wu, Yefeng Zheng
In recent years, there has been a growing interest in exploring dialogues with more complex goals, such as negotiation, persuasion, and emotional support, which go beyond traditional service-focused dialogue systems.
no code implementations • 16 Dec 2023 • Lebin Yu, Yunbo Qiu, Quanming Yao, Yuan Shen, Xudong Zhang, Jian Wang
We propose an active defense strategy, where agents automatically reduce the impact of potentially harmful messages on the final decision.
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • CVPR 2024 • Xin Guo, Jiangwei Lao, Bo Dang, Yingying Zhang, Lei Yu, Lixiang Ru, Liheng Zhong, Ziyuan Huang, Kang Wu, Dingxiang Hu, Huimei He, Jian Wang, Jingdong Chen, Ming Yang, Yongjun Zhang, Yansheng Li
Prior studies on Remote Sensing Foundation Model (RSFM) reveal immense potential towards a generic model for Earth Observation.
Ranked #1 on Zero-shot Classification (unified classes) on AID
1 code implementation • 7 Dec 2023 • Tiantian Wang, Xinxin Zuo, Fangzhou Mu, Jian Wang, Ming-Hsuan Yang
To overcome these limitations, we leverage Neural Radiance Fields (NeRFs) to represent videos, conducting stylization in the rendered feature space.
no code implementations • CVPR 2024 • Jian Wang, Zhe Cao, Diogo Luvizon, Lingjie Liu, Kripasindhu Sarkar, Danhang Tang, Thabo Beeler, Christian Theobalt
In this work, we explore egocentric whole-body motion capture using a single fisheye camera, which simultaneously estimates human body and hand motion.
Ranked #1 on Egocentric Pose Estimation on GlobalEgoMocap Test Dataset (using extra training data)
1 code implementation • 20 Nov 2023 • Ling Luo, Jinzhong Ning, Yingwen Zhao, Zhijun Wang, Zeyuan Ding, Peng Chen, Weiru Fu, Qinyu Han, Guangtao Xu, Yunzhi Qiu, Dinghao Pan, Jiru Li, Hao Li, Wenduo Feng, Senbo Tu, Yuqi Liu, Zhihao Yang, Jian Wang, Yuanyuan Sun, Hongfei Lin
The case study involving additional biomedical NLP tasks further shows Taiyi's considerable potential for bilingual biomedical multi-tasking.
1 code implementation • 14 Nov 2023 • Zhihang Zhong, Xiao Sun, Yu Qiao, Gurunandan Krishnan, Sizhuo Ma, Jian Wang
Existing video frame interpolation (VFI) methods blindly predict where each object is at a specific timestep t ("time indexing"), which struggles to predict precise object movements.
1 code implementation • NeurIPS 2023 • Junkun Yuan, Xinyu Zhang, Hao Zhou, Jian Wang, Zhongwei Qiu, Zhiyin Shao, Shaofeng Zhang, Sifan Long, Kun Kuang, Kun Yao, Junyu Han, Errui Ding, Lanfen Lin, Fei Wu, Jingdong Wang
To further capture human characteristics, we propose a structure-invariant alignment loss that enforces different masked views, guided by the human part prior, to be closely aligned for the same image.
1 code implementation • 31 Oct 2023 • Hui Ma, Jian Wang, Hongfei Lin, Bo Zhang, Yijia Zhang, Bo Xu
Emotion recognition in conversations (ERC), the task of recognizing the emotion of each utterance in a conversation, is crucial for building empathetic machines.
Ranked #1 on Emotion Recognition in Conversation on IEMOCAP
Emotion Recognition in Conversation
Multimodal Emotion Recognition
2 code implementations • 14 Oct 2023 • Chak Tou Leong, Yi Cheng, Jiashuo Wang, Jian Wang, Wenjie Li
Drawing on this idea, we devise a method to identify the toxification direction from the normal generation process to the one prompted with the negative prefix, and then steer the generation to the reversed direction by manipulating the information movement within the attention layers.
1 code implementation • 11 Oct 2023 • Jian Wang, Yi Cheng, Dongding Lin, Chak Tou Leong, Wenjie Li
Target-oriented dialogue systems, designed to proactively steer conversations toward predefined targets or accomplish specific system-side goals, are an exciting area in conversational AI.
no code implementations • 7 Oct 2023 • Jian Wang, Yue Zhuo
Visual anomaly diagnosis, which automatically analyzes defective products, has been widely applied in industrial quality inspection.
no code implementations • 6 Oct 2023 • Surjya Ray, Pratik Mehta, Hongen Zhang, Ada Chaman, Jian Wang, Chung-Jen Ho, Michael Chiou, Tashfeen Suleman
In this paper, we gauge the extent of the impact by evaluating the performance of LLMs for the task of medical coding on real-life noisy data.
1 code implementation • ICCV 2023 • Zhiyin Shao, Xinyu Zhang, Changxing Ding, Jian Wang, Jingdong Wang
In this way, the pre-training task and the T2I-ReID task are made consistent with each other on both data and training levels.
no code implementations • 26 Aug 2023 • Chaoyu Chen, Xin Yang, Rusi Chen, Junxuan Yu, Liwei Du, Jian Wang, Xindi Hu, Yan Cao, Yingying Liu, Dong Ni
In this paper, we introduce a novel Fourier-anchor-based DTS framework called Fourier Feature Pyramid Network (FFPN) to address the aforementioned issues.
no code implementations • 25 Aug 2023 • Cheng Zhong, Yicheng Ding, Husai Wang, Jikai Chen, Jian Wang, Yang Li
In this paper, a closed-loop model predictive controller is developed that minimizes the wind farm tracking errors, the dynamical fatigue load, and the load equalization.
2 code implementations • ICCV 2023 • Huan Liu, Qiang Chen, Zichang Tan, Jiang-Jiang Liu, Jian Wang, Xiangbo Su, Xiaolong Li, Kun Yao, Junyu Han, Errui Ding, Yao Zhao, Jingdong Wang
State-of-the-art solutions adopt the DETR-like framework and mainly develop complex decoders, e.g., regarding pose estimation as keypoint box detection and combining it with human detection in ED-Pose, or hierarchically predicting with a pose decoder and a joint (keypoint) decoder in PETR.
no code implementations • 3 Aug 2023 • Feng Chen, Jiajia Liu, Kaixiang Ji, Wang Ren, Jian Wang, Jingdong Wang
Our BGA-MNER consists of image2text and text2image generation with respect to entity-salient content in two modalities.
1 code implementation • 1 Aug 2023 • Bo Zhang, Jian Wang, Hui Ma, Bo Xu, Hongfei Lin
To overcome this challenge, we propose an innovative multimodal framework, called ZRIGF, which assimilates image-grounded information for dialogue generation in zero-resource situations.
1 code implementation • 10 Jul 2023 • Meng Li, Yahan Yu, Yi Yang, Guanghao Ren, Jian Wang
In this paper, we propose a deep learning-based character stroke extraction method that takes semantic features and prior information of strokes into consideration.
no code implementations • 29 Jun 2023 • Zhongwei Qiu, Qiansheng Yang, Jian Wang, Xiyu Wang, Chang Xu, Dongmei Fu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
One of the mainstream schemes for 2D human pose estimation (HPE) is learning keypoint heatmaps with a neural network.
no code implementations • 17 Jun 2023 • Jiajie Li, Jian Wang, Chen Wang, JinJun Xiong
In this paper, we present a novel approach for image harmonization by leveraging diffusion models.
no code implementations • 12 Jun 2023 • Jian Wang, Liang Qiao, Shichong Zhou, Jin Zhou, Jun Wang, Juncheng Li, Shihui Ying, Cai Chang, Jun Shi
To address this issue, a novel Two-Stage Detection and Diagnosis Network (TSDDNet) is proposed based on weakly supervised learning to enhance diagnostic accuracy of the ultrasound-based CAD for breast cancers.
no code implementations • 5 Jun 2023 • Yuhao Huang, Xin Yang, Xiaoqiong Huang, Xinrui Zhou, Haozhe Chi, Haoran Dou, Xindi Hu, Jian Wang, Xuedong Deng, Dong Ni
Second, we introduce a regularization technique that utilizes style interpolation consistency in the frequency space to encourage self-consistency in the logit space of the model output.
1 code implementation • 29 May 2023 • Kaishuai Xu, Wenjun Hou, Yi Cheng, Jian Wang, Wenjie Li
It extracts the medical entities and dialogue acts used in the dialogue history and models their transitions with an entity-centric graph flow and a sequential act flow, respectively.
1 code implementation • 9 May 2023 • Jian Wang, Dongding Lin, Wenjie Li
The key to achieving this task lies in planning dialogue paths that smoothly and coherently direct conversations towards the target.
1 code implementation • 10 Apr 2023 • Yanpeng Sun, Qiang Chen, Jian Wang, Jingdong Wang, Zechao Li
By doing this, the model can leverage the diverse knowledge stored in different parts of the model to improve its performance on new tasks.
no code implementations • 6 Apr 2023 • Xinyue Li, Jian Wang, Wei Song, Yanling Du, Zhixiang Liu
The mainstream research in deep metric learning can be divided into two genres: proxy-based and pair-based methods.
1 code implementation • 4 Apr 2023 • Dong Huo, Jian Wang, Yiming Qian, Yee-Hong Yang
Instead of relying on naive end-to-end training, we also propose a novel architecture that integrates the physical relationship between the spectral reflectance and the corresponding RGB images into the network based on our mathematical analysis.
no code implementations • CVPR 2023 • Zhongwei Qiu, Yang Qiansheng, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Chang Xu, Dongmei Fu, Jingdong Wang
To handle the variances of objects as time proceeds, a novel scheme of progressive decoding is used to update pose and shape queries at each frame.
Ranked #32 on 3D Human Pose Estimation on 3DPW
no code implementations • 9 Mar 2023 • Wenkai Tan, Justus Renkhoff, Alvaro Velasquez, Ziyu Wang, Lusi Li, Jian Wang, Shuteng Niu, Fan Yang, Yongxin Liu, Houbing Song
Our work could provide a useful tool to defend against certain adversarial attacks on deep neural networks.
1 code implementation • 8 Mar 2023 • Justus Renkhoff, Wenkai Tan, Alvaro Velasquez, William Yichen Wang, Yongxin Liu, Jian Wang, Shuteng Niu, Lejla Begic Fazlic, Guido Dartmann, Houbing Song
Finally, we demonstrate that the layers $Block4\_conv1$ and $Block5\_conv1$ of the VGG-16 model are more susceptible to adversarial attacks.
no code implementations • 8 Mar 2023 • Jian Wang, Jiarui Xing, Jason Druzgal, William M. Wells III, Miaomiao Zhang
This paper presents a novel predictive model, MetaMorph, for metamorphic registration of images with appearance changes (i.e., caused by brain tumors).
no code implementations • 25 Feb 2023 • Zhichao Liu, Leshan Wang, Desen Zhou, Jian Wang, Songyang Zhang, Yang Bai, Errui Ding, Rui Fan
To deal with these issues, we propose an attention-based approach, which we call temporal segment transformer, for joint segment relation modeling and denoising.
no code implementations • 23 Feb 2023 • Zhixiang Wang, Yu-Lun Liu, Jia-Bin Huang, Shin'ichi Satoh, Sizhuo Ma, Gurunandan Krishnan, Jian Wang
Close-up facial images captured at short distances often suffer from perspective distortion, resulting in exaggerated facial features and unnatural/unattractive appearances.
1 code implementation • 22 Feb 2023 • Congzhou M. Sha, Jian Wang, Nikolay V. Dokholyan
Molecular dynamics is the primary computational method by which modern structural biology explores macromolecule structure and function.
no code implementations • 18 Feb 2023 • Yunbo Qiu, Yue Jin, Lebin Yu, Jian Wang, Xudong Zhang
Multi-agent reinforcement learning (MARL) has achieved great progress in cooperative tasks in recent years.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 10 Feb 2023 • Lebin Yu, Yunbo Qiu, Qiexiang Wang, Xudong Zhang, Jian Wang
Communication in multi-agent reinforcement learning has been drawing attention recently for its significant role in cooperation.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 30 Jan 2023 • Xintao Chu, Jianping Liu, Jian Wang, XiaoFeng Wang, Yingfei Wang, Meng Wang, Xunxun Gu
As the number of open and shared scientific datasets on the Internet increases under the open science movement, efficiently retrieving these datasets is a crucial task in information retrieval (IR) research.
1 code implementation • 26 Jan 2023 • Xiaohu Huang, Hao Zhou, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Jingdong Wang, Xinggang Wang, Wenyu Liu, Bin Feng
In this paper, we propose a graph contrastive learning framework for skeleton-based action recognition (SkeletonGCL) to explore the global context across all sequences.
Ranked #17 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 19 Jan 2023 • Yingfei Wang, Jianping Liu, Jian Wang, XiaoFeng Wang, Meng Wang, Xintao Chu
In this paper, we use a Transformer as the backbone network for feature extraction, add a filter layer, and propose a new Filter-Enhanced Transformer Click Model (FE-TCM) for web search.
1 code implementation • CVPR 2023 • Brevin Tilmon, Zhanghao Sun, Sanjeev J. Koppal, Yicheng Wu, Georgios Evangelidis, Ramzi Zahreddine, Gurunandan Krishnan, Sizhuo Ma, Jian Wang
Active depth sensing achieves robust depth estimation but is usually limited by the sensing range.
no code implementations • ICCV 2023 • Kaixiang Ji, Feng Chen, Xin Guo, Yadong Xu, Jian Wang, Jingdong Chen
Image manipulation detection (IMD) is of vital importance as faking images and spreading misinformation can be malicious and harm our daily life.
no code implementations • CVPR 2023 • Jiangwei Lao, Weixiang Hong, Xin Guo, Yingying Zhang, Jian Wang, Jingdong Chen, Wei Chu
In this work, we propose a novel feature enhancement network to simultaneously model short- and long-term temporal correlation.
no code implementations • ICCV 2023 • Jinhao Du, Shan Zhang, Qiang Chen, Haifeng Le, Yanpeng Sun, Yao Ni, Jian Wang, Bin He, Jingdong Wang
To provide precise information for the query image, the prototype is decoupled into task-specific ones, which provide tailored guidance for 'where to look' and 'what to look for', respectively.
1 code implementation • CVPR 2023 • Jian Wang, Lingjie Liu, Weipeng Xu, Kripasindhu Sarkar, Diogo Luvizon, Christian Theobalt
To this end, we propose an egocentric depth estimation network to predict the scene depth map from a wide-view egocentric fisheye camera while mitigating the occlusion of the human body with a depth-inpainting network.
Ranked #3 on Egocentric Pose Estimation on GlobalEgoMocap Test Dataset (using extra training data)
no code implementations • 15 Dec 2022 • Dongding Lin, Jian Wang, Wenjie Li
Inspired by collaborative filtering, we propose a collaborative augmentation (COLA) method to simultaneously improve both item representation learning and user preference modeling to address these issues.
no code implementations • 5 Dec 2022 • Yourui Huangfu, Jian Wang, Shengchen Dai, Rong Li, Jun Wang, Chongwen Huang, Zhaoyang Zhang
The statistical data hinder the trained AI models from further fine-tuning for a specific scenario, and ray-tracing data with limited environments lower the generalization capability of the trained AI models.
1 code implementation • 4 Dec 2022 • Junho Kim, Young Min Kim, Yicheng Wu, Ramzi Zahreddine, Weston A. Welge, Gurunandan Krishnan, Sizhuo Ma, Jian Wang
We present a robust, privacy-preserving visual localization algorithm using event cameras.
no code implementations • 20 Nov 2022 • Jianqiang Huang, Jian Wang, Qianru Sun, Hanwang Zhang
An intuitive solution is "coupling" the CAM with the long-range attention matrix of visual transformers (ViT). We find that the direct "coupling", e.g., pixel-wise multiplication of attention and activation, achieves a more global coverage (on the foreground), but unfortunately comes with a great increase in false positives, i.e., background pixels are mistakenly included.
Weakly supervised Semantic Segmentation
Weakly-Supervised Semantic Segmentation
no code implementations • 17 Nov 2022 • Xinyu Zhang, Jiahui Chen, Junkun Yuan, Qiang Chen, Jian Wang, Xiaodi Wang, Shumin Han, Xiaokang Chen, Jimin Pi, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
That is to say, the smaller the model, the lower the mask ratio needs to be.
no code implementations • arXiv 2022 • Qiang Chen, Jian Wang, Chuchu Han, Shan Zhang, Zexian Li, Xiaokang Chen, Jiahui Chen, Xiaodi Wang, Shuming Han, Gang Zhang, Haocheng Feng, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
The training process consists of self-supervised pretraining and finetuning a ViT-Huge encoder on ImageNet-1K, pretraining the detector on Objects365, and finally finetuning it on COCO.
Ranked #7 on Object Detection on COCO test-dev
no code implementations • 2 Nov 2022 • Jian Wang, Xi Wang, Chaoqun Ma, Lei Kou
With the advent of the electric power big data era, semantic interoperability and interconnection of power data have received extensive attention.
1 code implementation • 25 Oct 2022 • Jian Wang, Miaomiao Zhang
We introduce a newly designed framework that (i) simultaneously derives features from both image and latent shape spaces with large intra-class variations; and (ii) gains increased model interpretability by allowing direct access to the underlying geometric features of image data.
no code implementations • 14 Oct 2022 • Srikrishna Jaganathan, Maximilian Kukla, Jian Wang, Karthik Shetty, Andreas Maier
Deep Learning-based 2D/3D registration enables fast, robust, and accurate X-ray to CT image fusion when large annotated paired datasets are available for training.
1 code implementation • 13 Oct 2022 • Jian Wang, Chenhui Gou, Qiman Wu, Haocheng Feng, Junyu Han, Errui Ding, Jingdong Wang
Recently, transformer-based networks have shown impressive results in semantic segmentation.
Ranked #2 on Real-Time Semantic Segmentation on CamVid
4 code implementations • 13 Oct 2022 • Jian Wang, Xiang Long, Guowei Chen, Zewu Wu, Zeyu Chen, Errui Ding
Therefore, we designed a U-shaped High-Resolution Network (U-HRNet), which adds more stages after the feature map with the strongest semantic representation and relaxes the constraint in HRNet that all resolutions need to be calculated in parallel for a newly added stage.
no code implementations • 4 Oct 2022 • Qing Xue, Yi-Jing Liu, Yao Sun, Jian Wang, Li Yan, Gang Feng, Shaodan Ma
Deploying ultra-dense networks that operate in the millimeter wave (mmWave) band is a promising way to address the tremendous growth in mobile data traffic.
no code implementations • 3 Oct 2022 • Ruomei Qi, Jiarong Li, Jin Lin, Yonghua Song, Jiepeng Wang, Qiangqiang Cui, Yiwei Qiu, Ming Tang, Jian Wang
This paper focuses on the design of the PID temperature controller for an alkaline electrolysis system to achieve fast and stable temperature control.
no code implementations • 17 Sep 2022 • Yunbo Qiu, Yuzhu Zhan, Yue Jin, Jian Wang, Xudong Zhang
By pretraining with non-expert demonstrations, PwD-MARL improves sample efficiency in the process of online MARL with a warm start.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 17 Sep 2022 • Yunbo Qiu, Yue Jin, Jian Wang, Xudong Zhang
Flocking control is a challenging problem, where multiple agents, such as drones or vehicles, need to reach a target position while maintaining the flock and avoiding collisions with obstacles and collisions among agents in the environment.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 19 Aug 2022 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Qian He, Chuanyang Hu, Errui Ding, Yu Guan, Xuming He
In this paper, we study the problem of one-shot skeleton-based action recognition, which poses unique challenges in learning transferable representation from base classes to novel classes, particularly for fine-grained actions.
no code implementations • Complex & Intelligent Systems 2022 • Jie Lai, Xiaodan Wang, Qian Xiang, Jian Wang, Lei Lei
To address this problem, a novel Fisher extreme learning machine autoencoder (FELM-AE) is proposed and is used as the component for the multilayer Fisher extreme learning machine (ML-FELM).
1 code implementation • 6 Aug 2022 • Jian Wang, Dongding Lin, Wenjie Li
Recommendation dialogue systems aim to build social bonds with users and provide high-quality recommendations.
no code implementations • 6 Aug 2022 • Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu
In particular, we first formulate video frames as a series of instance-guided tokens, where each token is in charge of predicting the 3D pose of a human instance.
Ranked #11 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)
no code implementations • 2 Aug 2022 • Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik
We present UnrealEgo, i.e., a new large-scale naturalistic dataset for egocentric 3D human pose estimation.
Ranked #5 on Egocentric Pose Estimation on UnrealEgo
2 code implementations • ICCV 2023 • Qiang Chen, Xiaokang Chen, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang
Detection transformer (DETR) relies on one-to-one assignment, assigning one ground-truth object to one prediction, for end-to-end detection without NMS post-processing.
1 code implementation • 25 Jul 2022 • Zhanghao Sun, Jian Wang, Yicheng Wu, Shree Nayar
Flash illumination is widely used in imaging under low-light environments.
no code implementations • 22 Jul 2022 • Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu
Finally, the 3D poses are decoded according to dynamic decoding graphs for each detected person.
3D Multi-Person Pose Estimation (absolute)
3D Multi-Person Pose Estimation (root-relative)
1 code implementation • 21 Jul 2022 • Teng Xi, Yifan Sun, Deli Yu, Bi Li, Nan Peng, Gang Zhang, Xinyu Zhang, Zhigang Wang, Jinwen Chen, Jian Wang, Lufei Liu, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
UFO aims to benefit each single task with a large-scale pretraining on all tasks.
1 code implementation • 19 Jul 2022 • Yang Bai, Desen Zhou, Songyang Zhang, Jian Wang, Errui Ding, Yu Guan, Yang Long, Jingdong Wang
Action Quality Assessment (AQA) is important for action understanding, and resolving the task poses unique challenges due to subtle visual differences.
2 code implementations • 16 Jul 2022 • Zhiyin Shao, Xinyu Zhang, Meng Fang, Zhifeng Lin, Jian Wang, Changxing Ding
In PGU, we adopt a set of shared and learnable prototypes as the queries to extract diverse and semantically aligned features for both modalities in the granularity-unified feature space, which further promotes the ReID performance.
no code implementations • 30 Jun 2022 • Xinxin Zhou, Jingru Feng, Jian Wang, Jianhong Pan
In this method, the integrated power is decomposed into individual device power by non-intrusive load monitoring, and the power of individual appliances is predicted separately using a federated deep learning model.
no code implementations • 19 Jun 2022 • Pengfei Zhang, Xiaohui Hu, Kaidong Yu, Jian Wang, Song Han, Cao Liu, Chunyang Yuan
Firstly, we build an evaluation metric composed of 5 groups of parallel sub-metrics called Multi-Metric Evaluation (MME) to evaluate the quality of dialogue comprehensively.
no code implementations • 18 Jun 2022 • Zhanghao Sun, Yu Zhang, Yicheng Wu, Dong Huo, Yiming Qian, Jian Wang
We propose three applications using our redundancy codes: (1) Self error-correction for SL imaging under strong ambient light, (2) Error detection for adaptive reconstruction under global illumination, and (3) Interference filtering with device-specific projection sequence encoding, especially for event camera-based SL and light curtain devices.
1 code implementation • 13 Jun 2022 • Yanpeng Sun, Qiang Chen, Xiangyu He, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Jian Cheng, Zechao Li, Jingdong Wang
In this paper, we rethink the paradigm and explore a new regime: fine-tuning a small part of parameters in the backbone.
Ranked #15 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)
no code implementations • 7 Jun 2022 • Haodong Yuan, Yudong Zhang, Shengyin Fan, Xue Li, Jian Wang
The integration of a SLAM algorithm with place recognition technology empowers it with the ability to mitigate accumulated errors and to relocalize itself.
no code implementations • IEEE Transactions on Dependable and Secure Computing 2022 • Mingfu Xue, Can He, Jian Wang, Weiqiang Liu
In this article, for the first time, we propose two advanced backdoor attacks, the multi-target backdoor attacks and multi-trigger backdoor attacks: 1) One-to-N attack, where the attacker can trigger multiple backdoor targets by controlling the different intensities of the same backdoor; 2) N-to-One attack, where such attack is triggered only when all the N backdoors are satisfied.
no code implementations • CVPR 2022 • Desen Zhou, Zhichao Liu, Jian Wang, Leshan Wang, Tao Hu, Errui Ding, Jingdong Wang
To associate the predictions of disentangled decoders, we first generate a unified representation for HOI triplets with a base decoder, and then utilize it as input feature of each disentangled decoder.
1 code implementation • 20 Apr 2022 • Guowei Chen, Yi Liu, Jian Wang, Juncai Peng, Yuying Hao, Lutao Chu, Shiyu Tang, Zewu Wu, Zeyu Chen, Zhiliang Yu, Yuning Du, Qingqing Dang, Xiaoguang Hu, Dianhai Yu
Also, we propose a semantic context branch (SCB) that adopts a semantic segmentation subtask.
Ranked #4 on Image Matting on Distinctions-646
1 code implementation • CVPR 2022 • Xinyu Zhang, Dongdong Li, Zhigang Wang, Jian Wang, Errui Ding, Javen Qinfeng Shi, Zhaoxiang Zhang, Jingdong Wang
Specifically, we generate support samples from actual samples and their neighbouring clusters in the embedding space through a progressive linear interpolation (PLI) strategy.
1 code implementation • 12 Apr 2022 • Dong Huo, Jian Wang, Yiming Qian, Yee-Hong Yang
Because most glass is transparent to visible light but opaque to thermal energy, the transmission properties of the two differ greatly through glass, and glass regions of a scene are more distinguishable with a pair of RGB and thermal images than with an RGB image alone.
3 code implementations • CVPR 2022 • Qiang Chen, Qiman Wu, Jian Wang, Qinghao Hu, Tao Hu, Errui Ding, Jian Cheng, Jingdong Wang
We propose MixFormer to find a solution.
no code implementations • 24 Mar 2022 • Xiaofei Xie, Tianlin Li, Jian Wang, Lei Ma, Qing Guo, Felix Juefei-Xu, Yang Liu
Inspired by software testing, a number of structural coverage criteria are designed and proposed to measure the test adequacy of DNNs.
1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2022 • Bo Zhang, Jian Wang, Hongfei Lin, Hui Ma, Bo Xu
Correlation integration is designed to fully exploit the pairwise mutual information among dialogue context, knowledge, and responses, while overall integration adopts an integration gate to capture global information.
no code implementations • 14 Mar 2022 • Youming Deng, Yansheng Li, Yongjun Zhang, Xiang Xiang, Jian Wang, Jingdong Chen, Jiayi Ma
After the autonomous partition of coarse and fine predicates, the model is first trained on the coarse predicates and then learns the fine predicates.
no code implementations • 9 Mar 2022 • Donghui Hu, Yu Zhang, Cong Yu, Jian Wang, Yaofei Wang
Image steganography is the art and science of using images as cover for covert communications.
no code implementations • 27 Feb 2022 • Ruomei Qi, Jiarong Li, Jin Lin, Yonghua Song, Jiepeng Wang, Qiangqiang Cui, Yiwei Qiu, Ming Tang, Jian Wang
A control-oriented thermal model is established in the form of a third-order time-delay process, which is used for simulation and controller design.
no code implementations • 31 Jan 2022 • Mingfu Xue, Shifeng Ni, Yinghao Wu, Yushu Zhang, Jian Wang, Weiqiang Liu
Recent research demonstrates that Deep Neural Network (DNN) models are vulnerable to backdoor attacks.
no code implementations • CVPR 2022 • Jian Wang, Lingjie Liu, Weipeng Xu, Kripasindhu Sarkar, Diogo Luvizon, Christian Theobalt
Specifically, we first generate pseudo labels for the EgoPW dataset with a spatio-temporal optimization method by incorporating the external-view supervision.
Ranked #4 on Egocentric Pose Estimation on GlobalEgoMocap Test Dataset (using extra training data)
no code implementations • 10 Jan 2022 • Guangdong Xue, Qin Chang, Jian Wang, Kai Zhang, Nikhil R. Pal
The effectiveness of the FSRE-AdaTSK is demonstrated on 19 datasets, five of which have more than 2000 dimensions, including two with more than 7000 dimensions.
no code implementations • 3 Jan 2022 • Mingfu Xue, Xin Wang, Shichang Sun, Yushu Zhang, Jian Wang, Weiqiang Liu
After training, the backdoor attack against DNN is robust to image compression.
no code implementations • CVPR 2022 • Weixiang Hong, Jiangwei Lao, Wang Ren, Jian Wang, Jingdong Chen, Wei Chu
Instead of proposing a specific vision transformer based detector, in this work, our goal is to reveal the insights of training vision transformer based detectors from scratch.
no code implementations • 13 Dec 2021 • Nian Wu, Jian Wang, Miaomiao Zhang, Guixu Zhang, Yaxin Peng, Chaomin Shen
Registration-based atlas building often poses computational challenges in high-dimensional image spaces.
1 code implementation • ICCV 2021 • Yongri Piao, Jian Wang, Miao Zhang, Huchuan Lu
The multiple accurate cues from multiple DFs are then simultaneously propagated to the saliency network with a multi-guidance loss.
no code implementations • CVPR 2022 • Fangzhou Mu, Jian Wang, Yicheng Wu, Yin Li
Our key intuition is that style transfer and view synthesis have to be jointly modeled for this task.
no code implementations • 14 Nov 2021 • Yuzi Yan, Xiaoxiang Li, Xinyou Qiu, Jiantao Qiu, Jian Wang, Yu Wang, Yuan Shen
In this paper, we propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL).
Model Predictive Control
Multi-agent Reinforcement Learning
no code implementations • 3 Nov 2021 • Keyu Li, Yangxin Xu, Jian Wang, Dong Ni, Li Liu, Max Q.-H. Meng
Ultrasound (US) imaging is commonly used to assist in the diagnosis and interventions of spine diseases, while the standardized US acquisitions performed by manually operating the probe require substantial experience and training of sonographers.
no code implementations • 1 Nov 2021 • Na Zhao, Zhen Long, Zhi-Dan Zhao, Jian Wang
This implies that URIR can effectively use the knowledge graph to obtain better user codes and item codes, thereby obtaining better recommendation results.
no code implementations • 12 Oct 2021 • Yongxin Liu, Yingjie Chen, Jian Wang, Shuteng Niu, Dahai Liu, Houbing Song
In this paper, we provide a deep learning framework for RF signal surveillance.
no code implementations • 2 Oct 2021 • Zhengpin Li, Zheng Wei, Zengfeng Huang, Xiaojun Mao, Jian Wang
In this paper, we propose a unified framework for ensuring a strong privacy guarantee of one-bit matrix completion with DP.
no code implementations • 1 Oct 2021 • Jiabin Liu, Zheng Wei, Zhengpin Li, Xiaojun Mao, Jian Wang, Zhongyu Wei, Qi Zhang
In this work, we propose a novel and general self-adaptive module, the Self-adaptive Attention Module (SAM), which adjusts the selection bias by capturing contextual information based on its representation.
no code implementations • 1 Oct 2021 • Zheng Wei, Zhengpin Li, Xiaojun Mao, Jian Wang
Tensor completion aims at filling the missing or unobserved entries based on partially observed tensors.
no code implementations • 15 Sep 2021 • Muyi Sun, Jian Wang, Yunfan Liu, Qi Li, Zhenan Sun
Biphasic facial age translation aims at predicting the appearance of the input face at any age.
no code implementations • 4 Sep 2021 • Yongri Piao, Jian Wang, Miao Zhang, Zhengxuan Ma, Huchuan Lu
Despite the success of previous works, explorations of an effective training strategy for the saliency network and of accurate matches between image-level annotations and salient objects are still inadequate.
1 code implementation • ICCV 2021 • Zhenchao Jin, Tao Gong, Dongdong Yu, Qi Chu, Jian Wang, Changhu Wang, Jie Shao
To address this, this paper proposes to mine the contextual information beyond individual images to further augment the pixel representations.
no code implementations • 18 Aug 2021 • Zachary Tauscher, Yushan Jiang, Kai Zhang, Jian Wang, Houbing Song
With massive data being generated daily and the ever-increasing interconnectivity of the world's Internet infrastructures, a machine learning based intrusion detection system (IDS) has become a vital component to protect our economic and national security.
no code implementations • 11 Aug 2021 • Shuangchi He, Zehui Lin, Xin Yang, Chaoyu Chen, Jian Wang, Xue Shuang, Ziwei Deng, Qin Liu, Yan Cao, Xiduo Lu, Ruobing Huang, Nishant Ravikumar, Alejandro Frangi, Yuanji Zhang, Yi Xiong, Dong Ni
In this study, we build a novel multi-label learning (MLL) scheme to identify multiple standard planes and the corresponding anatomical structures of the fetus simultaneously.
1 code implementation • 10 Aug 2021 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding
The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.
no code implementations • 2 Aug 2021 • Yuhao Huang, Xin Yang, Yuxin Zou, Chaoyu Chen, Jian Wang, Haoran Dou, Nishant Ravikumar, Alejandro F Frangi, Jianqiao Zhou, Dong Ni
Weakly-supervised segmentation (WSS) can help reduce time-consuming and cumbersome manual annotation.
no code implementations • 31 Jul 2021 • Jian Wang, Yourui Huangfu, Rong Li, Yiqun Ge, Jun Wang
The wireless network is undergoing a trend from "connection of things" to "connection of intelligence".
no code implementations • 27 Jul 2021 • Chunyi Huang, Mingzhi Zhang, Chengmin Wang, Ning Xie, Jian Wang, Shi Peng
To accommodate the advent of microgrids (MG) managing distributed energy resources (DER) in distribution systems, an interactive two-stage joint retail electricity market mechanism is proposed to provide an effective platform for these prosumers to proactively join in retail transactions.
no code implementations • 21 Jul 2021 • Srikrishna Jaganathan, Jian Wang, Anja Borsdorf, Karthik Shetty, Andreas Maier
A refinement step using the classical optimization-based 2D/3D registration method applied in combination with Deep Learning-based techniques can provide the required accuracy.
1 code implementation • 12 Jul 2021 • Jian Wang, Miaomiao Zhang
This paper presents a novel hierarchical Bayesian model for unbiased atlas building with subject-specific regularizations of image registration.
no code implementations • CVPR 2021 • Jinhui Xiong, Jian Wang, Wolfgang Heidrich, Shree Nayar
We propose a new flash technique for low-light imaging, using deep-red light as an illuminating source.