no code implementations • ACL 2022 • Yubo Ma, Zehao Wang, Mukai Li, Yixin Cao, Meiqi Chen, Xinze Li, Wenqi Sun, Kunquan Deng, Kun Wang, Aixin Sun, Jing Shao
Events are fundamental building blocks of real-world happenings.
no code implementations • ECCV 2020 • Lida Li, Kun Wang, Shuai Li, Xiangchu Feng, Lei Zhang
The 2D convolutional (Conv2d) layer is the fundamental element to a deep convolutional neural network (CNN).
no code implementations • 20 Apr 2025 • Chongye Guo, Jinhu Fu, Junfeng Fang, Kun Wang, Guorui Feng
In this work, we establish the principles for backdoor attacks based on model editing, and propose a relationship-driven precise backdoor poisoning method, REDEditing.
no code implementations • 15 Apr 2025 • Yize Zhang, Tianshu Wang, Sirui Chen, Kun Wang, Xingyu Zeng, Hongyu Lin, Xianpei Han, Le Sun, Chaochao Lu
Large language models (LLMs) have demonstrated impressive capabilities and are receiving increasing attention to enhance their reasoning through scaling test-time compute.
no code implementations • 9 Apr 2025 • Junfeng Fang, Yukai Wang, Ruipeng Wang, Zijun Yao, Kun Wang, An Zhang, Xiang Wang, Tat-Seng Chua
The rapid advancement of multi-modal large reasoning models (MLRMs) -- enhanced versions of multimodal language models (MLLMs) equipped with reasoning capabilities -- has revolutionized diverse applications.
no code implementations • 18 Mar 2025 • Tao Yu, Yi-Fan Zhang, Chaoyou Fu, Junkang Wu, Jinda Lu, Kun Wang, Xingyu Lu, Yunhang Shen, Guibin Zhang, Dingjie Song, Yibo Yan, Tianlong Xu, Qingsong Wen, Zhang Zhang, Yan Huang, Liang Wang, Tieniu Tan
In this paper, we aim to provide a comprehensive and systematic review of alignment algorithms for MLLMs.
1 code implementation • 17 Mar 2025 • Qi Zhang, Xiuyuan Chen, Ziyi He, Kun Wang, Lianming Wu, Hongxing Shen, Jianqi Sun
However, existing UAD methods rely on curated normal datasets and their performance frequently deteriorates when applied to clinical datasets due to domain shifts.
1 code implementation • 13 Mar 2025 • Rongyao Fang, Chengqi Duan, Kun Wang, Linjiang Huang, Hao Li, Shilin Yan, Hao Tian, Xingyu Zeng, Rui Zhao, Jifeng Dai, Xihui Liu, Hongsheng Li
We present Generation Chain-of-Thought (GoT), a novel paradigm that enables generation and editing through an explicit language reasoning process before outputting images.
1 code implementation • 8 Mar 2025 • Qi Zhang, Xiuyuan Chen, Ziyi He, Lianming Wu, Kun Wang, Jianqi Sun, Hongxing Shen
The segmentation is followed by an expert-based diagnostic framework that automates the calculation of critical clinical indicators.
1 code implementation • 8 Mar 2025 • Qi Zhang, Shunan Zhang, Ziqi Zhao, Kun Wang, Jun Xu, Jianqi Sun
Osteoporotic vertebral compression fractures (VCFs) are prevalent in the elderly population, typically assessed on computed tomography (CT) scans by evaluating vertebral height loss.
no code implementations • 7 Mar 2025 • Qijiong Liu, Jieming Zhu, Lu Fan, Kun Wang, Hengchang Hu, Wei Guo, Yong liu, Xiao-Ming Wu
However, a comprehensive benchmark is needed to thoroughly evaluate and compare the recommendation capabilities of LLMs with traditional recommender systems.
no code implementations • 6 Mar 2025 • Junyuan Mao, Fanci Meng, Yifan Duan, Miao Yu, Xiaojun Jia, Junfeng Fang, Yuxuan Liang, Kun Wang, Qingsong Wen
Large Language Model based multi-agent systems are revolutionizing autonomous communication and collaboration, yet they remain vulnerable to security threats like unauthorized access and data breaches.
no code implementations • 1 Mar 2025 • Xinliang Zhou, Chenyu Liu, Zhisheng Chen, Kun Wang, Yi Ding, Ziyu Jia, Qingsong Wen
Brain foundation models (BFMs) have emerged as a transformative paradigm in computational neuroscience, offering a revolutionary framework for processing diverse neural signals across different brain-related tasks.
1 code implementation • 20 Feb 2025 • Wujiang Xu, Yunxiao Shi, Zujie Liang, Xuying Ning, Kai Mei, Kun Wang, Xi Zhu, Min Xu, Yongfeng Zhang
Traditional recommender systems usually take the user-platform paradigm, where users are directly exposed under the control of the platform's recommendation algorithms.
1 code implementation • 20 Feb 2025 • Zhenhong Zhou, Zherui Li, Jie Zhang, Yuanhe Zhang, Kun Wang, Yang Liu, Qing Guo
We evaluate Corba on two widely-used LLM-MASs, namely, AutoGen and Camel across various topologies and commercial models.
1 code implementation • 18 Feb 2025 • Pengyu Zhu, Zhenhong Zhou, Yuanhe Zhang, Shilinlu Yan, Kun Wang, Sen Su
As LLM-based agents become increasingly prevalent, backdoors can be implanted into agents through user queries or environment feedback, raising critical concerns regarding safety vulnerabilities.
no code implementations • 16 Feb 2025 • Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang
Large Language Model (LLM)-based Multi-agent Systems (MAS) have demonstrated remarkable capabilities in various complex tasks, ranging from collaborative problem-solving to autonomous decision-making.
1 code implementation • 16 Feb 2025 • Yanwei Yue, Guibin Zhang, Boyang Liu, Guancheng Wan, Kun Wang, Dawei Cheng, Yiyan Qi
Multi-agent systems (MAS) powered by Large Language Models (LLMs) have been demonstrated to push the boundaries of LLM capabilities, yet they often incur significant costs and face challenges in dynamic LLM selection.
no code implementations • 13 Feb 2025 • Sibo Cheng, Marc Bocquet, Weiping Ding, Tobias Sebastian Finn, Rui Fu, Jinlong Fu, Yike Guo, Eleda Johnson, Siyi Li, Che Liu, Eric Newton Moro, Jie Pan, Matthew Piggott, Cesar Quilodran, Prakhar Sharma, Kun Wang, Dunhui Xiao, Xiao Xue, Yong Zeng, Mingrui Zhang, Hao Zhou, Kewei Zhu, Rossella Arcucci
This review is intended as a guidebook for computational scientists seeking to apply ML approaches to unstructured grid data in their domains, as well as for ML researchers looking to address challenges in computational physics.
no code implementations • 11 Feb 2025 • Kun Wang, Zhiqiang Yan, Junkai Fan, Jun Li, Jian Yang
Depth completion endeavors to reconstruct a dense depth map from sparse depth measurements, leveraging the information provided by a corresponding color image.
no code implementations • 11 Feb 2025 • Guibin Zhang, Kaijie Chen, Guancheng Wan, Heng Chang, Hong Cheng, Kun Wang, Shuyue Hu, Lei Bai
The past two years have witnessed the evolution of large language model (LLM)-based multi-agent systems from labor-intensive manual design to partial automation (e.g., prompt engineering, communication topology) and eventually to fully automated design.
1 code implementation • 6 Feb 2025 • Guibin Zhang, Luyang Niu, Junfeng Fang, Kun Wang, Lei Bai, Xiang Wang
Large Language Model (LLM)-empowered multi-agent systems extend the cognitive boundaries of individual agents through disciplined collaboration and interaction, while constructing these systems often requires labor-intensive manual designs.
1 code implementation • 1 Feb 2025 • Yuan Gao, Hao Wu, Ruiqi Shu, Huanshuo Dong, Fan Xu, Rui Chen, Yibo Yan, Qingsong Wen, Xuming Hu, Kun Wang, Jiahao Wu, Qing Li, Hui Xiong, Xiaomeng Huang
Accurate weather forecasts are important for disaster prevention, agricultural planning, and water resource management.
no code implementations • 5 Jan 2025 • Kun Wang, Kaiyan Chang, Mengdi Wang, Xinqi Zou, Haobo Xu, Yinhe Han, Ying Wang
Recent advances of large language models in the field of Verilog generation have raised several ethical and security concerns, such as code copyright protection and dissemination of malicious code.
no code implementations • 2 Jan 2025 • Kun Wang, Roberto Armellin, Adam Evans, Harry Holt, Zheng Chen
This approach ensures that all loss terms related to the control Lyapunov function are either naturally satisfied or replaced by the derived control policy.
1 code implementation • 28 Dec 2024 • Miao Yu, Junfeng Fang, Yingjie Zhou, Xing Fan, Kun Wang, Shirui Pan, Qingsong Wen
While safety-aligned large language models (LLMs) are increasingly used as the cornerstone for powerful systems such as multi-agent frameworks to solve complex real-world problems, they still suffer from potential adversarial queries, such as jailbreak attacks, which attempt to induce harmful content.
no code implementations • 26 Dec 2024 • Zhiqiang Yan, Zhengxue Wang, Kun Wang, Jun Li, Jian Yang
In this paper, we introduce the Selective Image Guided Network (SigNet), a novel degradation-aware framework that transforms depth completion into depth enhancement for the first time.
no code implementations • 16 Dec 2024 • Yibo Yan, Jiamin Su, Jianxiang He, Fangteng Fu, Xu Zheng, Yuanhuiyi Lyu, Kun Wang, Shen Wang, Qingsong Wen, Xuming Hu
We categorize the field into three dimensions: benchmarks, methodologies, and challenges.
no code implementations • 16 Dec 2024 • Junkai Fan, Kun Wang, Zhiqiang Yan, Xiang Chen, Shangbing Gao, Jun Li, Jian Yang
In this paper, we study the challenging problem of simultaneously removing haze and estimating depth from real monocular hazy videos.
no code implementations • 3 Dec 2024 • Yunkai Dang, Kaichen Huang, Jiahao Huo, Yibo Yan, Sirui Huang, Dongrui Liu, Mengxi Gao, Jie Zhang, Chen Qian, Kun Wang, Yong liu, Jing Shao, Hui Xiong, Xuming Hu
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with large language models (LLMs) and computer vision (CV) systems driving advancements in natural language understanding and visual processing, respectively.
1 code implementation • 1 Dec 2024 • Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao
Recent DETR-based methods have advanced the development of Video Instance Segmentation (VIS) through transformers' efficiency and capability in modeling spatial and temporal information.
1 code implementation • 27 Nov 2024 • Chen Zhou, Peng Cheng, Junfeng Fang, Yifan Zhang, Yibo Yan, Xiaojun Jia, Yanyan Xu, Kun Wang, Xiaochun Cao
Multispectral object detection, utilizing RGB and TIR (thermal infrared) modalities, is widely recognized as a challenging task.
no code implementations • 8 Nov 2024 • Kun Wang, Sumanth Varambally, Duncan Watson-Parris, Yi-An Ma, Rose Yu
Many important phenomena in scientific fields such as climate, neuroscience, and epidemiology are naturally represented as spatiotemporal gridded data with complex interactions.
1 code implementation • 30 Oct 2024 • Cong Fu, Kun Wang, Jiahua Wu, Yizhou Chen, Guangda Huzhang, Yabo Ni, AnXiang Zeng, Zhiming Zhou
ResFlow is now fully deployed in the pre-rank module of Shopee Search.
no code implementations • 21 Oct 2024 • Miao Yu, Shilong Wang, Guibin Zhang, Junyuan Mao, Chenlong Yin, Qijiong Liu, Qingsong Wen, Kun Wang, Yang Wang
Large language models (LLMs) have empowered nodes within multi-agent networks with intelligence, showing growing applications in both academia and industry.
1 code implementation • 19 Oct 2024 • Kun Wang, Zhiqiang Yan, Junkai Fan, Wanlu Zhu, Xiang Li, Jun Li, Jian Yang
In this paper, we introduce DCDepth, a novel framework for the long-standing monocular depth estimation task.
1 code implementation • 17 Oct 2024 • Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Kun Wang, Yang Liu, Junfeng Fang, Yongbin Li
In light of this, recent research on safety mechanisms has emerged, revealing that when safety representations or components are suppressed, the safety capability of LLMs is compromised.
1 code implementation • 17 Oct 2024 • Rongyao Fang, Chengqi Duan, Kun Wang, Hao Li, Hao Tian, Xingyu Zeng, Rui Zhao, Jifeng Dai, Hongsheng Li, Xihui Liu
This work represents a significant step towards a truly unified MLLM capable of adapting to the granularity demands of various visual tasks.
1 code implementation • 17 Oct 2024 • Guibin Zhang, Haonan Dong, Yuchen Zhang, ZHIXUN LI, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang
Training high-quality deep models necessitates vast amounts of data, resulting in overwhelming computational and memory demands.
no code implementations • 15 Oct 2024 • Guibin Zhang, Yanwei Yue, Xiangguo Sun, Guancheng Wan, Miao Yu, Junfeng Fang, Kun Wang, Tianlong Chen, Dawei Cheng
Recent advancements in large language model (LLM)-based agents have demonstrated that collective intelligence can significantly surpass the capabilities of individual agents, primarily due to well-crafted inter-agent communication topologies.
no code implementations • 12 Oct 2024 • Steve Hanneke, Kun Wang
With knowledge of $\mathcal{M}$, supposing that the true model $M\in \mathcal{M}$, the objective is to identify an arm $\hat{\pi}$ of near-maximal mean reward $f^M(\hat{\pi})$ with high probability in a bounded number of rounds.
no code implementations • 11 Oct 2024 • Peng Jiang, Kun Wang, Jiaxing Wang, Zeliang Feng, Shengjie Qiao, Runhuai Deng, Fengkai Zhang
GPR full-waveform inversion optimizes the subsurface property model iteratively to match the entire waveform information.
1 code implementation • 7 Oct 2024 • Guanyu Zhou, Yibo Yan, Xin Zou, Kun Wang, Aiwei Liu, Xuming Hu
These biases arise from the visual encoder and the Large Language Model (LLM) backbone, affecting the attention mechanism responsible for aligning multimodal inputs.
1 code implementation • 7 Oct 2024 • Kaichen Huang, Jiahao Huo, Yibo Yan, Kun Wang, Yutao Yue, Xuming Hu
In recent years, multimodal large language models (MLLMs) have significantly advanced, integrating more modalities into diverse applications.
no code implementations • 6 Oct 2024 • Yibo Yan, Shen Wang, Jiahao Huo, Hang Li, Boyan Li, Jiamin Su, Xiong Gao, Yi-Fan Zhang, Tianlong Xu, Zhendong Chu, Aoxiao Zhong, Kun Wang, Hui Xiong, Philip S. Yu, Xuming Hu, Qingsong Wen
As the field of Multimodal Large Language Models (MLLMs) continues to evolve, their potential to revolutionize artificial intelligence is particularly promising, especially in addressing mathematical reasoning tasks.
no code implementations • 3 Oct 2024 • Guibin Zhang, Yanwei Yue, ZHIXUN LI, Sukwon Yun, Guancheng Wan, Kun Wang, Dawei Cheng, Jeffrey Xu Yu, Tianlong Chen
Recent advancements in large language model (LLM)-powered agents have shown that collective intelligence can significantly outperform individual capabilities, largely attributed to the meticulously designed inter-agent communication topologies.
2 code implementations • 3 Oct 2024 • Junfeng Fang, Houcheng Jiang, Kun Wang, Yunshan Ma, Xiang Wang, Xiangnan He, Tat-Seng Chua
To address this, we introduce AlphaEdit, a novel solution that projects perturbation onto the null space of the preserved knowledge before applying it to the parameters.
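The null-space idea in this excerpt can be illustrated with a minimal sketch. It is not the AlphaEdit implementation: the function names, the plain-list matrix representation, and the Gram-Schmidt construction are assumptions made for illustration. The sketch removes from each row of a perturbation every component lying in the span of the preserved key vectors, so the projected perturbation maps those keys to zero and leaves the model's outputs on them unchanged.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(vectors: Matrix, tol: float = 1e-10) -> Matrix:
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis: Matrix = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def project_to_null_space(delta: Matrix, preserved_keys: Matrix) -> Matrix:
    """Project each row of `delta` onto the orthogonal complement of the
    preserved keys, so that every projected row is orthogonal to each key
    (hypothetical helper, not the paper's API)."""
    basis = orthonormal_basis(preserved_keys)
    projected: Matrix = []
    for row in delta:
        r = list(row)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        projected.append(r)
    return projected


if __name__ == "__main__":
    keys = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]   # toy preserved keys
    delta = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # toy weight perturbation
    safe_delta = project_to_null_space(delta, keys)
    # Each row of safe_delta is orthogonal to every preserved key.
    print(safe_delta)
```

Under these assumptions, adding `safe_delta` to a weight matrix changes its response only to inputs with a component outside the span of the preserved keys.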
1 code implementation • 2 Oct 2024 • Miao Yu, Junyuan Mao, Guibin Zhang, Jingheng Ye, Junfeng Fang, Aoxiao Zhong, Yang Liu, Yuxuan Liang, Kun Wang, Qingsong Wen
Research into the external behaviors and internal mechanisms of large language models (LLMs) has shown promise in addressing complex tasks in the physical world.
no code implementations • 29 Sep 2024 • Yifan Duan, Jian Zhao, Pengcheng, Junyuan Mao, Hao Wu, Jingyu Xu, Shilong Wang, Caoyuan Ma, Kai Wang, Kun Wang, Xuelong Li
To this end, we establish a causal framework for ST predictions, termed CaPaint, which aims to identify causal regions in data and endow the model with causal reasoning ability in a two-stage process.
1 code implementation • 25 Sep 2024 • Jonathan E. Lee, Min Zhu, Ziqiao Xi, Kun Wang, Yanhua O. Yuan, Lu Lu
In addition, the generalization and extrapolation ability of nested Fourier-DeepONet beyond the training range has been thoroughly evaluated.
no code implementations • 14 Sep 2024 • Tobit Klug, Kun Wang, Stefan Ruschke, Reinhard Heckel
In this paper, we propose a deep learning-based test-time-training method for accurate motion estimation.
no code implementations • 23 Aug 2024 • Yi-Fan Zhang, Huanyu Zhang, Haochen Tian, Chaoyou Fu, Shuangqing Zhang, Junfei Wu, Feng Li, Kun Wang, Qingsong Wen, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
The challenges of perceiving high-resolution images and understanding complex real-world scenarios remain urgent issues to be addressed.
no code implementations • 12 Aug 2024 • Chenyu Liu, Xinliang Zhou, Yihao Wu, Yi Ding, Liming Zhai, Kun Wang, Ziyu Jia, Yang Liu
In this paper, we present a comprehensive survey of these studies, delivering a systematic review of graph-related methods in this field from a methodological perspective.
1 code implementation • 18 Jul 2024 • Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao
To bridge the gap between image and video, in this work, we propose a new video segmentation task - video reasoning segmentation.
1 code implementation • 11 Jul 2024 • Kaiyan Chang, Zhirong Chen, Yunhao Zhou, Wenlong Zhu, Kun Wang, Haobo Xu, Cangyuan Li, Mengdi Wang, Shengwen Liang, Huawei Li, Yinhe Han, Ying Wang
Natural language interfaces have exhibited considerable potential in the automation of Verilog generation derived from high-level specifications through the utilization of large language models, garnering significant attention.
1 code implementation • 3 Jul 2024 • Jiahao Wu, Ning Lu, Zeiyu Dai, Kun Wang, Wenqi Fan, Shengcai Liu, Qing Li, Ke Tang
However, while existing graph condensation studies mainly focus on the best trade-off between graph size and the GNNs' performance (model utility), they overlook the security issues of graph condensation.
1 code implementation • 24 Jun 2024 • Sirui Chen, Mengying Xu, Kun Wang, Xingyu Zeng, Rui Zhao, Shengjie Zhao, Chaochao Lu
Causal reasoning is a cornerstone of how humans interpret the world.
1 code implementation • 18 Jun 2024 • Kun Wang, Guibin Zhang, Xinnan Zhang, Junfeng Fang, Xun Wu, Guohao Li, Shirui Pan, Wei Huang, Yuxuan Liang
Based on observations, we innovatively introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs and beyond.
Ranked #3 on Node Classification on Texas
no code implementations • 5 Jun 2024 • Kun Wang, Yi-Rui Yang, Wu-Jun Li
Asynchronous federated learning (AFL) is an effective method to address the challenge of device heterogeneity in cross-device federated learning.
1 code implementation • 29 May 2024 • Huaiwu Zhang, Yutong Xia, Siru Zhong, Kun Wang, Zekun Tong, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
In this study, we aim to collectively predict future PA across Singapore with complex factors from various domains.
no code implementations • 28 May 2024 • Wanlin Cai, Kun Wang, Hao Wu, Xiaoxu Chen, Yuankai Wu
The challenge of effectively learning inter-series correlations for multivariate time series forecasting remains a substantial and unresolved problem.
Ranked #51 on Time Series Forecasting on ETTh1 (336) Multivariate
1 code implementation • 23 May 2024 • Guibin Zhang, Xiangguo Sun, Yanwei Yue, Chonghe Jiang, Kun Wang, Tianlong Chen, Shirui Pan
Specifically, MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.
1 code implementation • 23 May 2024 • Feng Gu, Jie Lu, Zhen Fang, Kun Wang, Guangquan Zhang
Uncertain changes in data streams present challenges for machine learning models to dynamically adapt and uphold performance in real-time.
no code implementations • 22 May 2024 • Yuan Sui, Yufei He, Nian Liu, Xiaoxin He, Kun Wang, Bryan Hooi
A distinctive feature of our approach is its blend of natural language planning with beam search to optimize the selection of reasoning paths.
no code implementations • CVPR 2024 • Junkai Fan, Jiangwei Weng, Kun Wang, Yijun Yang, Jianjun Qian, Jun Li, Jian Yang
Firstly, we introduce a non-aligned reference frame matching module, leveraging an adaptive sliding window to match high-quality reference frames from clear videos.
no code implementations • 13 May 2024 • Shilong Wang, Hao Wu, Yifan Duan, Guibin Zhang, Guohao Li, Yuxuan Liang, Shirui Pan, Kun Wang, Yang Wang
This assumption often poses challenges for many GNNs working with heterophilic graphs.
no code implementations • 22 Apr 2024 • Kang Luo, Yuanshao Zhu, Wei Chen, Kun Wang, Zhengyang Zhou, Sijie Ruan, Yuxuan Liang
Trajectory modeling refers to characterizing human movement behavior, serving as a pivotal step in understanding mobility patterns.
2 code implementations • 25 Mar 2024 • Xixuan Hao, Wei Chen, Yibo Yan, Siru Zhong, Kun Wang, Qingsong Wen, Yuxuan Liang
Our UrbanVLP seamlessly integrates multi-granularity information from both macro (satellite) and micro (street-view) levels, overcoming the limitations of prior pretrained models.
no code implementations • CVPR 2024 • Zhiqiang Yan, Yuankai Lin, Kun Wang, Yupeng Zheng, YuFei Wang, Zhenyu Zhang, Jun Li, Jian Yang
Depth completion is a vital task for autonomous driving, as it involves reconstructing the precise 3D geometry of a scene from sparse and noisy depth measurements.
no code implementations • 18 Mar 2024 • Hao Wu, Fan Xu, Yifan Duan, Ziwei Niu, Weiyan Wang, Gaofeng Lu, Kun Wang, Yuxuan Liang, Yang Wang
This paper proposes a two-stage framework named ST-PAD for spatio-temporal fluid dynamics modeling in the field of earth sciences, aiming to achieve high-precision simulation and prediction of fluid dynamics through spatio-temporal physics awareness and parameter diffusion guidance.
1 code implementation • 17 Mar 2024 • Kaiyan Chang, Kun Wang, Nan Yang, Ying Wang, Dantong Jin, Wenlong Zhu, Zhirong Chen, Cangyuan Li, Hao Yan, Yunhao Zhou, Zhuoliang Zhao, Yuan Cheng, Yudong Pan, Yiqi Liu, Mengdi Wang, Shengwen Liang, Yinhe Han, Huawei Li, Xiaowei Li
Our 13B model (ChipGPT-FT) has a pass rate improvement compared with GPT-3.5 in Verilog generation and outperforms in EDA script (i.e., SiliconCompiler) generation with only 200 EDA script data.
no code implementations • 5 Mar 2024 • Hao Wu, Haomin Wen, Guibin Zhang, Yutong Xia, Yuxuan Liang, Yu Zheng, Qingsong Wen, Kun Wang
In this paper, we introduce for the first time the concept of spatio-temporal data dynamic sparse training and are committed to adaptively, dynamically filtering important sensor distributions.
1 code implementation • 28 Feb 2024 • Ziying Pan, Kun Wang, Gang Li, Feihong He, Yongxuan Lai
The class-conditional image generation based on diffusion models is renowned for generating high-quality and diverse images.
Conditional Image Generation
parameter-efficient fine-tuning
no code implementations • 22 Feb 2024 • Yifan Duan, Guibin Zhang, Shilong Wang, Xiaojiang Peng, Wang Ziqi, Junyuan Mao, Hao Wu, Xinke Jiang, Kun Wang
Credit card fraud poses a significant threat to the economy.
no code implementations • 13 Feb 2024 • Simina Brânzei, Mohammadtaghi Hajiaghayi, Reed Phillips, Suho Shin, Kun Wang
Alice cuts the cake at a point of her choice, while Bob chooses the left piece or the right piece, leaving the remainder for Alice.
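One round of the cut-and-choose interaction described here can be sketched as follows. This is a minimal illustration rather than the paper's model: it assumes the cake is the interval [0, 1], that each player's valuation is a piecewise-constant density over equal-width segments, and that Bob picks the left piece when he is indifferent. The names `value_left` and `play_round` are hypothetical.

```python
from typing import List, Tuple


def value_left(density: List[float], cut: float) -> float:
    """Value of the piece [0, cut] under a piecewise-constant density
    defined on equal-width segments of [0, 1]."""
    n = len(density)
    total = 0.0
    for i, d in enumerate(density):
        lo, hi = i / n, (i + 1) / n
        overlap = max(0.0, min(cut, hi) - lo)
        total += d * overlap
    return total


def play_round(
    cut: float, alice_density: List[float], bob_density: List[float]
) -> Tuple[str, float, float]:
    """Alice cuts at `cut`; Bob takes the piece he values more (left on ties,
    an assumed tie-break); Alice receives the remainder.
    Returns (Bob's choice, Alice's payoff, Bob's payoff)."""
    if not 0.0 <= cut <= 1.0:
        raise ValueError("cut must lie in [0, 1]")
    bob_left = value_left(bob_density, cut)
    bob_right = value_left(bob_density, 1.0) - bob_left
    alice_left = value_left(alice_density, cut)
    alice_right = value_left(alice_density, 1.0) - alice_left
    if bob_left >= bob_right:
        return "left", alice_right, bob_left
    return "right", alice_left, bob_right


if __name__ == "__main__":
    alice = [1.0, 1.0]  # Alice values the cake uniformly
    bob = [1.5, 0.5]    # Bob prefers the left half
    print(play_round(0.5, alice, bob))
    print(play_round(0.25, alice, bob))
```

Calling `play_round` repeatedly with different cut points is one way to explore how Alice's choice of cut changes what each player receives.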
1 code implementation • 7 Feb 2024 • Tianle Zhang, Yuchen Zhang, Kun Wang, Kai Wang, Beining Yang, Kaipeng Zhang, Wenqi Shao, Ping Liu, Joey Tianyi Zhou, Yang You
Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have raised growing concerns.
1 code implementation • 6 Feb 2024 • Junfeng Fang, Shuai Zhang, Chang Wu, Zhengyi Yang, Zhiyuan Liu, Sihang Li, Kun Wang, Wenjie Du, Xiang Wang
Molecular Relational Learning (MRL), aiming to understand interactions between molecular pairs, plays a pivotal role in advancing biochemical research.
no code implementations • 6 Feb 2024 • Kun Wang, Hao Wu, Guibin Zhang, Junfeng Fang, Yuxuan Liang, Yuankai Wu, Roger Zimmermann, Yang Wang
In this paper, we address the issue of modeling and estimating changes in the state of the spatio-temporal dynamical systems based on a sequence of observations like video frames.
no code implementations • 5 Feb 2024 • Junfeng Fang, Xinglin Li, Yongduo Sui, Yuan Gao, Guibin Zhang, Kun Wang, Xiang Wang, Xiangnan He
Graph representation learning on vast datasets, like web data, has made significant strides.
no code implementations • 2 Feb 2024 • Guibin Zhang, Yanwei Yue, Kun Wang, Junfeng Fang, Yongduo Sui, Kai Wang, Yuxuan Liang, Dawei Cheng, Shirui Pan, Tianlong Chen
Specifically, GST initially constructs a topology & semantic anchor at a low training cost, followed by performing dynamic sparse training to align the sparse graph with the anchor.
no code implementations • 30 Jan 2024 • Kun Wang, Jiani Cao, Zimu Zhou, Zhenjiang Li
To this end, we develop SwapNet, an efficient DNN block swapping middleware for edge AI devices.
1 code implementation • 13 Dec 2023 • Hao Wu, Yuxuan Liang, Wei Xiong, Zhengyang Zhou, Wei Huang, Shilong Wang, Kun Wang
Efficiently modeling spatio-temporal (ST) physical processes and observations presents a challenging problem for the deep learning community.
1 code implementation • 12 Dec 2023 • Yupeng Hu, Han Jiang, Hao liu, Kun Wang, Haoyu Tang, Liqiang Nie
Recently, temporal action localization (TAL) has garnered significant interest in the information retrieval community.
1 code implementation • NeurIPS 2023 • Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao
What we possess are numerous isolated field-specific datasets; thus, it is appealing to jointly train models across the aggregation of datasets to enhance data volume and diversity.
no code implementations • 27 Nov 2023 • Xinglin Li, Kun Wang, Hanhui Deng, Yuxuan Liang, Di wu
We seminally propose the concept of Shock Absorber (a type of perturbation) that enhances the robustness and stability of the original graphs against changes in an adversarial training fashion.
no code implementations • 16 Nov 2023 • Yuhan Sun, Mukai Li, Yixin Cao, Kun Wang, Wenxiao Wang, Xingyu Zeng, Rui Zhao
In response, we introduce ControlPE (Continuously Controllable Prompt Engineering).
1 code implementation • 11 Oct 2023 • Yuhe Liu, Changhua Pei, Longlong Xu, Bohan Chen, Mingze Sun, Zhirui Zhang, Yongqian Sun, Shenglin Zhang, Kun Wang, Haiming Zhang, Jianhui Li, Gaogang Xie, Xidao Wen, Xiaohui Nie, Minghua Ma, Dan Pei
Information Technology (IT) Operations (Ops), particularly Artificial Intelligence for IT Operations (AIOps), is the guarantee for maintaining the orderly and stable operation of existing information systems.
1 code implementation • 18 Sep 2023 • Tianyi Song, Jiuxin Cao, Kun Wang, Bo Liu, Xiaofeng Zhang
The current state-of-the-art method combines the features of historical captions, historical frames, and the current captions as conditions for generating the current frame.
no code implementations • 31 Aug 2023 • Si Liu, Chen Gao, Yuan Chen, Xingyu Peng, Xianghao Kong, Kun Wang, Runsheng Xu, Wentao Jiang, Hao Xiang, Jiaqi Ma, Miao Wang
Specifically, we analyze the performance changes of different methods under different bandwidths, providing a deep insight into the performance-bandwidth trade-off issue.
no code implementations • 19 Aug 2023 • Kun Wang, Zhiqiang Yan, Huang Tian, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang
Neural Radiance Fields (NeRF) have shown promise in generating realistic novel views from sparse scene images.
no code implementations • 19 Aug 2023 • Kun Wang, Guohao Li, Shilong Wang, Guibin Zhang, Kai Wang, Yang You, Xiaojiang Peng, Yuxuan Liang, Yang Wang
Despite Graph Neural Networks (GNNs) demonstrating considerable promise in graph representation learning tasks, they predominantly face significant issues with over-fitting and over-smoothing as they go deeper, as models in the computer vision realm do.
no code implementations • 13 Jun 2023 • Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, Jiaojiao Xu, Bo Liu, Xuemei Wang, Yao Zhang, Qiong Yan, Muhan Lv, Xiaomei Chen, Shuhua Zhang, Yihua Wang, Yang Liu, Li Yin, Yanni Liu, Yanqing Huang, Yunfang Liu, Kun Wang, Meiqin Su, Li Bian, Ping An, Xin Zhang, Linxue Qian, Shao Li, Xiaolong Qi
Validation analysis revealed that the AUCs of DLRP were 0.91 for GEV (95% CI 0.90 to 0.93, p < 0.05) and 0.88 for HRV (95% CI 0.86 to 0.89, p < 0.01), which were significantly and robustly better than canonical risk indicators, including the value of LSM and SSM.
no code implementations • 8 Jun 2023 • Kun Wang, Zhiqiang Yan, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang
Our key contributions are: (1) We parameterize the geometry and appearance of the object using a multi-scale global feature extractor, which avoids frequent point-wise feature retrieval and camera dependency.
no code implementations • 8 Jun 2023 • Kun Wang, Tao Meng, Jiakun Lei, Weijia Wang
To address this issue, we propose a control strategy based on control barrier functions, summarized as a "safety check on kinematics" and "velocity tracking on dynamics" approach.
no code implementations • 31 May 2023 • Jiakun Lei, Tao Meng, Yang Zhu, Kun Wang, Weijia Wang
To tackle this problem, we propose a modified framework called Compatible Performance Control (CPC), which integrates the Prescribed Performance Control (PPC) scheme with a contradiction detection and alleviation strategy.
no code implementations • 31 May 2023 • Jiakun Lei, Tao Meng, Kun Wang, Weijia Wang, Shujian Sun
Further, the basic intermittent attitude controller is extended to a "constrained version" by introducing a strictly bounded virtual control law and an input saturation compensation auxiliary system.
no code implementations • 30 May 2023 • Huahui Yi, Ziyuan Qin, Wei Xu, Miaotian Guo, Kun Wang, Shaoting Zhang, Kang Li, Qicheng Lao
To achieve this, we propose a Concept Embedding Search (ConES) approach by optimizing prompt embeddings -- without the need of the text encoder -- to capture the 'concept' of the image modality through a variety of task objectives.
no code implementations • 17 May 2023 • Guiyu Zhao, Bo Qiu, A-Li Luo, XIAOYU GUO, Lin Yao, Kun Wang, Yuanbo Liu
The Wide-field Infrared Survey Explorer (WISE) has detected hundreds of millions of sources over the entire sky.
1 code implementation • 12 May 2023 • Zhengqing Yuan, Yunhong He, Kun Wang, Yanfang Ye, Lichao Sun
However, a grand challenge of exploiting LLMs for multimodal learning is the size of pre-trained LLMs, which always have billions of parameters.
1 code implementation • CVPR 2023 • Zeren Chen, Gengshi Huang, Wei Li, Jianing Teng, Kun Wang, Jing Shao, Chen Change Loy, Lu Sheng
In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR.
no code implementations • 23 Mar 2023 • Shaobo Lin, Kun Wang, Xingyu Zeng, Rui Zhao
To construct a representative synthetic training dataset, we maximize the diversity of the selected images via a sample-based and cluster-based method.
no code implementations • 9 Mar 2023 • Yi-Rui Yang, Kun Wang, Wu-Jun Li
Based on ConSpar, we further propose a novel FL framework called FedREP, which is Byzantine-robust, communication-efficient and privacy-preserving.
no code implementations • 28 Feb 2023 • Shaobo Lin, Kun Wang, Xingyu Zeng, Rui Zhao
Specifically, we first discover the base images which contain the FP of novel categories and select a certain amount of samples from them to balance the base and novel categories.
no code implementations • 21 Feb 2023 • Kun Wang, Zi Wang, Zhang Li, Ang Su, Xichao Teng, Erting Pan, Minhao Liu, Qifeng Yu
Given the rapid development of this field, this paper presents a comprehensive survey of recent advances in oriented object detection.
no code implementations • 20 Nov 2022 • Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang
Unsupervised depth completion aims to recover dense depth from the sparse one without using the ground-truth annotation.
no code implementations • 10 Nov 2022 • Jiakun Lei, Tao Meng, Kun Wang, Weijia Wang, Zhonghe Jin
The prescribed performance control (PPC) scheme is often employed for the control with guaranteed performance.
1 code implementation • 22 Oct 2022 • Hao Wang, Yixin Cao, Yangguang Li, Zhen Huang, Kun Wang, Jing Shao
Document-level natural language inference (DOCNLI) is a new challenging task in natural language processing, aiming at judging the entailment relationship between a pair of hypothesis and premise documents.
no code implementations • 22 Oct 2022 • Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen
Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware.
no code implementations • 13 Sep 2022 • Kun Wang, William R. Johnson III, Shiyang Lu, Xiaonan Huang, Joran Booth, Rebecca Kramer-Bottiglio, Mridul Aanjaneya, Kostas Bekris
This strategy is based on a differentiable physics engine that can be trained given limited data from a real robot.
no code implementations • 29 May 2022 • Shiyang Lu, William R. Johnson III, Kun Wang, Xiaonan Huang, Joran Booth, Rebecca Kramer-Bottiglio, Kostas Bekris
To ensure that the pose estimates of rigid elements are physically feasible, i.e., they do not result in collisions between rods or with the environment, physical constraints are introduced during the optimization.
no code implementations • 8 May 2022 • Xueyuan Duan, Yu Fu, Kun Wang
To address the problem that traditional network traffic anomaly detection algorithms do not sufficiently mine potential features in the long time domain, an anomaly detection method based on multi-scale residual features of network traffic is proposed.
no code implementations • COLING 2022 • Meiqi Chen, Yixin Cao, Kunquan Deng, Mukai Li, Kun Wang, Jing Shao, Yan Zhang
In this paper, we propose a novel Event Relational Graph TransfOrmer (ERGO) framework for DECI, which improves on existing state-of-the-art (SOTA) methods in two aspects.
no code implementations • 12 Apr 2022 • Haonan Qiu, Siyu Chen, Bei Gan, Kun Wang, Huafeng Shi, Jing Shao, Ziwei Liu
Notably, our method is also validated to be robust to choices of majority and minority forgery approaches.
1 code implementation • 18 Mar 2022 • Zhiqiang Yan, Xiang Li, Kun Wang, Zhenyu Zhang, Jun Li, Jian Yang
To deal with the PDC task, we train a deep network that takes both depth and image as inputs for the dense panoramic depth recovery.
no code implementations • 18 Mar 2022 • Lida Li, Shuai Li, Kun Wang, Xiangchu Feng, Lei Zhang
2D convolution (Conv2d), which is responsible for extracting features from the input image, is one of the key modules of a convolutional neural network (CNN).
2 code implementations • 15 Mar 2022 • Yuanhan Zhang, Qinghong Sun, Yichun Zhou, Zexin He, Zhenfei Yin, Kun Wang, Lu Sheng, Yu Qiao, Jing Shao, Ziwei Liu
This work thus proposes a novel active learning framework for realistic dataset annotation.
Ranked #1 on Image Classification on Food-101 (using extra training data)
no code implementations • 28 Feb 2022 • Kun Wang, Mridul Aanjaneya, Kostas Bekris
A model of NASA's icosahedron SUPERballBot on MuJoCo is used as the ground truth system to collect training data.
1 code implementation • ACL 2022 • Yubo Ma, Zehao Wang, Yixin Cao, Mukai Li, Meiqi Chen, Kun Wang, Jing Shao
We have conducted extensive experiments on three benchmarks, including both sentence- and document-level EAE.
1 code implementation • NeurIPS 2021 • Yuan Liang, Weikun Han, Liang Qiu, Chen Wu, Yiting shao, Kun Wang, Lei He
In this work, we pioneer the study of deep learning for dental forensic identification based on panoramic radiographs.
no code implementations • 16 Nov 2021 • Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Jianing Teng, Qinghong Sun, Mengya Gao, Jihao Liu, Gengshi Huang, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao
Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society.
no code implementations • 11 Oct 2021 • Shujun Liu, Hai Zhu, Kun Wang, Huajun Wang
For the phoneme encoder, based on the analysis that the same phonemes at varying pitches can produce similar pronunciations, this encoder is followed by an adversarially trained pitch classifier that forces identical phonemes with different pitches to map into the same phoneme feature space.
no code implementations • 30 Aug 2021 • Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He
Different from single object reconstruction from photos, this task has the unique challenge of constructing multiple objects at high resolutions.
1 code implementation • 22 Aug 2021 • Zhengyong Wang, Liquan Shen, Mei Yu, Kun Wang, Yufei Lin, Mai Xu
However, these methods ignore the significant domain gap between the synthetic and real data (i.e., interdomain gap), and thus the models trained on synthetic data often fail to generalize well to real underwater scenarios.
2 code implementations • ICCV 2021 • Kun Wang, Zhenyu Zhang, Zhiqiang Yan, Xiang Li, Baobei Xu, Jun Li, Jian Yang
Monocular depth estimation aims at predicting depth from a single image or video.
1 code implementation • 4 Aug 2021 • Suofei Zhang, Zirui Yin, Xiofu Wu, Kun Wang, Quan Zhou, Bin Kang
In this paper, we propose a lightweight Feature Pyramid Branch (FPB) to extract features from different layers of networks and aggregate them in a bidirectional pyramid structure.
Ranked #6 on Person Re-Identification on CUHK03 labeled
no code implementations • 29 Jul 2021 • Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang
However, blurry guidance in the image and unclear structure in the depth still impede the performance of the image guided frameworks.
Ranked #2 on Depth Completion on KITTI Depth Completion
no code implementations • 24 May 2021 • Kun Wang, Jing Dong, Baoxiang Wang, Shuai Li, Shuo Shao
This paper studies \emph{differential privacy (DP)} and \emph{local differential privacy (LDP)} in cascading bandits.
no code implementations • 17 May 2021 • Jonas Kornprobst, Kun Wang, Gerhard Hamberger, Thomas F. Eibert
The wide half power beamwidth is achieved by suitably designed parasitic patches for the first resonant mode.
no code implementations • 17 Apr 2021 • Kun Wang, Canzhe Zhao, Shuai Li, Shuo Shao
We propose the novel \emph{conservative contextual combinatorial cascading bandit ($C^4$-bandit)}, a cascading online learning game which incorporates the conservative mechanism.
1 code implementation • 5 Feb 2021 • Hanqing Chao, Kun Wang, Yiwei He, Junping Zhang, Jianfeng Feng
In this paper, we present a novel perspective that utilizes gait as a deep set, which means that a set of gait frames is integrated by a global-local fused deep network, inspired by the way our left and right hemispheres process information, to learn information that can be used in identification.
no code implementations • 2 Feb 2021 • Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He
Second, we can largely boost the robustness of existing ConvNets, proved by: (i) testing on scans with synthetic pathologies, and (ii) training and evaluation on scans of different scanning setups across datasets.
1 code implementation • 28 Dec 2020 • Kun Wang, Zhixin Song, Xuanqiang Zhao, Zihe Wang, Xin Wang
Firstly, it decomposes a positive map into a combination of quantum operations implementable on near-term quantum devices.
Quantum Physics Strongly Correlated Electrons
no code implementations • 23 Dec 2020 • Jiawei Yang, Yuan Liang, Yao Zhang, Weinan Song, Kun Wang, Lei He
The ability of deep learning to predict with uncertainty is recognized as key for its adoption in clinical routines.
no code implementations • 10 Nov 2020 • Kun Wang, Mridul Aanjaneya, Kostas Bekris
The results indicate that only 0.25\% of ground truth data are needed to train a policy that works on the ground truth system when the differentiable engine is used for training, compared with training the policy directly on the ground truth system.
no code implementations • 9 Nov 2020 • Kun Wang, Mridul Aanjaneya, Kostas Bekris
We propose a novel differentiable physics engine for system identification of complex spring-rod assemblies.
1 code implementation • EMNLP 2020 • Dandan Huang, Leyang Cui, Sen yang, Guangsheng Bao, Kun Wang, Jun Xie, Yue Zhang
Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years.
no code implementations • 6 Oct 2020 • Yanchang Gao, Gang Ni, Kun Wang, Yiqing Liu, Chong He, Ronghong Jin, Xianling Liang
The time-modulated module is implemented by adding periodic phase modulation to 2-bit phase shifters, which is simpler, without performance loss, than existing SSB time-modulated methods.
no code implementations • 13 Jun 2020 • Shengyun Peng, Yunxuan Yu, Kun Wang, Lei He
Specifically, a target object is defined by a bounding box center, tracking offset, and object size.
no code implementations • L4DC 2020 • Kun Wang, Mridul Aanjaneya, Kostas Bekris
We propose a novel differentiable physics engine for system identification of complex spring-rod assemblies.
no code implementations • 13 Apr 2020 • Kun Wang, WaiChing Sun, Qiang Du
The evaluation of constitutive models, especially for high-risk and high-regret engineering applications, requires efficient and rigorous third-party calibration, validation and falsification.
2 code implementations • 13 Apr 2020 • Kun Wang, Jun He, Lei Zhang
Recently, several attention mechanisms have been proposed to handle weakly labeled human activity data, which do not require accurate data annotation.
no code implementations • 18 Mar 2020 • Weinan Song, Yuan Liang, Jiawei Yang, Kun Wang, Lei He
In this paper, we propose a framework, named Oral-3D, to reconstruct the 3D oral cavity from a single PX image and prior information of the dental arch.
no code implementations • ECCV 2020 • Peng Su, Kun Wang, Xingyu Zeng, Shixiang Tang, Dapeng Chen, Di Qiu, Xiaogang Wang
Then this domain-vector is used to encode the features from another domain through a conditional normalization, resulting in different domains' features carrying the same domain attribute.
Ranked #1 on Unsupervised Domain Adaptation on SIM10K to BDD100K
no code implementations • 19 Feb 2020 • Weinan Song, Yuan Liang, Jiawei Yang, Kun Wang, Lei He
The encoder-decoder network is widely used to learn deep feature representations from pixel-wise annotations in biomedical image analysis.
no code implementations • 3 Feb 2020 • Xinhe Jiang, Kun Wang, Kaiyi Qian, Zhaozhong Chen, Zhiyu Chen, Liangliang Lu, Lijun Xia, Fangmin Song, Shining Zhu, Xiaosong Ma
We experimentally obtain the scaling parameter of $r=-0.88\pm0.03$ and $-0.78\pm0.07$ for nonadaptive and adaptive strategies, respectively.
Quantum Physics Optics
no code implementations • 7 Jan 2020 • Yinqiu Liu, Kai Qian, Jianli Chen, Kun Wang, Lei He
As an emerging technology, blockchain has achieved great success in numerous application scenarios, from intelligent healthcare to smart cities.
Cryptography and Security Distributed, Parallel, and Cluster Computing 68M14 C.2.2
no code implementations • 10 Oct 2019 • Yuan Liang, Weinan Song, J. P. Dym, Kun Wang, Lei He
Label propagation is a popular technique for anatomical segmentation.
no code implementations • 14 May 2019 • Yujia Chen, Yang Lou, Kun Wang, Matthew A. Kupinski, Mark A. Anastasio
In this work, a sparsity-driven observer (SDO) that can be employed to optimize hardware by use of a stochastic object model describing object sparsity is described and investigated.
no code implementations • 3 May 2019 • Wenmian Yang, Kun Wang, Na Ruan, Wenyuan Gao, Weijia Jia, Wei Zhao, Nan Liu, Yunyong Zhang
Finally, we gain the weight of each word by combining Semantic Weight (SW) and Inverse Document Frequency (IDF).
1 code implementation • NAACL 2019 • Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, Min Zhang
Leveraging user-provided translation to constrain NMT has practical significance.
no code implementations • 24 Mar 2019 • Kun Wang, Jun He, Lei Zhang
Unlike image or video data, which can be easily labeled by humans, sensor data annotation is a time-consuming process.
no code implementations • 8 Mar 2019 • Kun Wang, WaiChing Sun, Qiang Du
We introduce a multi-agent meta-modeling game to generate data, knowledge, and models that make predictions on constitutive responses of elasto-plastic materials.
no code implementations • 24 Oct 2018 • Kun Wang, WaiChing Sun
This paper presents a new meta-modeling framework to employ deep reinforcement learning (DRL) to generate mechanical constitutive models for interfaces.
no code implementations • 15 Aug 2018 • Kun Wang
The Ricean channel model is widely used in wireless communications to characterize channels with a line-of-sight path.
no code implementations • 8 Apr 2018 • Xiaogang Cheng, Guoqing Liu, Anders Hedman, Kun Wang, Hai-Bo Li
We assume fog and haze cause blurred images and that fog and haze can be considered as a piecewise stationary signal.
no code implementations • ICCV 2017 • Wanli Ouyang, Kun Wang, Xin Zhu, Xiaogang Wang
In this CC-Net, there are many cascade stages.
1 code implementation • ICCV 2017 • Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, Xiaogang Wang
Object detection, scene graph generation and region captioning, which are three scene understanding tasks at different semantic levels, are tied together: scene graphs are generated on top of objects detected in an image with their pairwise relationship predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information.
Ranked #3 on Object Detection on Visual Genome
1 code implementation • 8 Oct 2016 • Xingyu Zeng, Wanli Ouyang, Junjie Yan, Hongsheng Li, Tong Xiao, Kun Wang, Yu Liu, Yucong Zhou, Bin Yang, Zhe Wang, Hui Zhou, Xiaogang Wang
The effectiveness of GBD-Net is shown through experiments on three object detection datasets, ImageNet, Pascal VOC2007 and Microsoft COCO.