no code implementations • ACM Multimedia Asia 2024 • Jingbin Xu, Junwen Chen, Keiji Yanai
The Panoptic Scene Graph generation (PSG) task aims to extract the triplets composed of subject, object, and relation based on panoptic segmentation.
no code implementations • 30 Oct 2023 • Zhengliang Liu, Yiwei Li, Qian Cao, Junwen Chen, Tianze Yang, Zihao Wu, John Hale, John Gibbs, Khaled Rasheed, Ninghao Liu, Gengchen Mai, Tianming Liu
Recent advances in artificial general intelligence (AGI), particularly large language models and creative image generation systems, have demonstrated impressive capabilities on diverse tasks spanning the arts and humanities.
no code implementations • 5 Sep 2023 • Junwen Chen, Jie Zhu, Yu Kong
Despite significant progress in video question answering (VideoQA), existing methods fall short on questions that require causal/temporal reasoning across frames.
Ranked #27 on Video Question Answering on NExT-QA
no code implementations • 26 Jul 2023 • Junwen Chen, Xingxing Wei
In this paper, we analyse the properties of adversarial patches, and find that: on the one hand, adversarial patches will lead to the appearance or contextual inconsistency in the target objects; on the other hand, the patch region will show abnormal changes on the high-level feature maps of the objects extracted by a backbone network.
1 code implementation • 5 Jul 2023 • Junwen Chen, Yingcheng Wang, Keiji Yanai
However, the current methods redirect the detection target of the object decoder, and the box target is not explicitly separated from the query embeddings, which leads to long and hard training.
Ranked #3 on Human-Object Interaction Detection on HICO-DET
no code implementations • CVPR 2022 • Junwen Chen, Gaurav Mittal, Ye Yu, Yu Kong, Mei Chen
We present GateHUB, Gated History Unit with Background Suppression, that comprises a novel position-guided gated cross-attention mechanism to enhance or suppress parts of the history as per how informative they are for current frame prediction.
Ranked #1 on Online Action Detection on TVSeries
no code implementations • 10 May 2022 • Jing Yang, Junwen Chen, Keiji Yanai
In this paper, we present a cross-modal recipe retrieval framework, Transformer-based Network for Large Batch Training (TNLBT), which is inspired by ACME (Adversarial Cross-Modal Embedding) and H-T (Hierarchical Transformer).
1 code implementation • 16 Dec 2021 • Junwen Chen, Keiji Yanai
Human-object interaction (HOI) detection, as a downstream task of object detection, requires localizing pairs of humans and objects and extracting the semantic relationships between humans and objects from an image.
Ranked #9 on Human-Object Interaction Detection on HICO-DET
no code implementations • ICCV 2021 • Junwen Chen, Yu Kong
Video entailment aims at determining if a hypothesis textual statement is entailed or contradicted by a premise video.
1 code implementation • ECCV 2020 • Junwen Chen, Wentao Bao, Yu Kong
Our model explicitly anticipates both activity features and positions by two graph auto-encoders, aiming to learn a discriminative group representation for group activity prediction.
no code implementations • 15 Jul 2020 • Junwen Chen, Yi Lu, Yaran Chen, Dongbin Zhao, Zhonghua Pang
A good object segmentation should contain clear contours and complete regions.
no code implementations • 25 Mar 2020 • Haiyang Xu, Yahao He, Kun Han, Junwen Chen, Xiangang Li
Our approach has the following contributions: first, we incorporate syntactic information such as constituency parsing trees into the encoding sequence to learn both the semantic and syntactic information from the document, resulting in a more accurate summary; second, we propose a dynamic gate network to select the salient information based on the context of the decoder state, which is essential to document summarization.
no code implementations • 25 Mar 2020 • Haiyang Xu, Junwen Chen, Kun Han, Xiangang Li
Multi-class text classification is one of the key problems in machine learning and natural language processing.
no code implementations • 18 Mar 2020 • Haiyang Xu, Yun Wang, Kun Han, Baochang Ma, Junwen Chen, Xiangang Li
Abstractive text summarization is a challenging task, and one needs to design a mechanism to effectively extract salient information from the source text and then generate a summary.
2 code implementations • 2 Aug 2019 • Kun Han, Junwen Chen, Hui Zhang, Haiyang Xu, Yiping Peng, Yun Wang, Ning Ding, Hui Deng, Yonghu Gao, Tingwei Guo, Yi Zhang, Yahao He, Baochang Ma, Yu-Long Zhou, Kangli Zhang, Chao Liu, Ying Lyu, Chenxi Wang, Cheng Gong, Yunbo Wang, Wei Zou, Hui Song, Xiangang Li
In this paper we present DELTA, a deep learning based language technology platform.
Ranked #3 on Named Entity Recognition on CoNLL 2003 (English)