1 code implementation • 28 May 2022 • Yaochen Zhu, Xubin Ren, Jing Yi, Zhenzhong Chen
We first establish a causal graph to represent the relations among uploader, UGC, and tag, where the uploaders are identified as confounders that spuriously correlate UGC and tag selections.
no code implementations • 20 Apr 2022 • Jing Yi, Xubin Ren, Zhenzhong Chen
Recommending appropriate tags to items can facilitate content organization, retrieval, consumption and other applications, where hybrid tag recommender systems have been utilized to integrate collaborative information and content information for better recommendations.
no code implementations • 28 Mar 2022 • Leitian Tao, Zhenzhong Chen
To better handle the challenges of complex and large motions, instead of aligning features at each scale separately, lower-scale motion information is used to guide the higher-scale motion estimation.
1 code implementation • 6 Jan 2022 • Yaochen Zhu, Jing Yi, Jiayi Xie, Zhenzhong Chen
As with all observational studies, hidden confounders, which are factors that affect both item exposures and user ratings, lead to a systematic bias in the estimation.
no code implementations • CVPR 2022 • Yangjun Ou, Li Mi, Zhenzhong Chen
By combining an object-level graph (OG) and a relation-level graph (RG), the proposed OR2G catches the attribute transitions of objects and reasons about the relationship transitions between objects simultaneously.
1 code implementation • CVPR 2022 • Yaosi Hu, Chong Luo, Zhenzhong Chen
With both controllable appearance and motion, TI2V aims at generating videos from a static image and a text description.
no code implementations • 13 Oct 2021 • Yuantong Zhang, Huairui Wang, Zhenzhong Chen
To solve the above problem of the existing methods, we propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
Optical Flow Estimation
Space-time Video Super-resolution
no code implementations • 15 Jul 2021 • Jing Yi, Yaochen Zhu, Jiayi Xie, Zhenzhong Chen
Moreover, the multimodal information is fused by the product-of-experts (PoE) principle, where the semantic information in the visual and textual modalities of the micro-video is weighted according to the variance estimates, such that the modality with a lower noise level is given more weight.
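The precision weighting behind PoE fusion can be illustrated with a small sketch. It assumes each modality contributes a one-dimensional Gaussian estimate (mean and variance); the function name `poe_fuse` and the numbers are hypothetical and are not taken from the paper, which may also include a prior expert or operate on full latent vectors.

```python
def poe_fuse(means, variances):
    """Fuse independent Gaussian experts N(mean_i, var_i) by multiplying their densities.

    The product of Gaussians is again Gaussian: its precision is the sum of
    the expert precisions, and its mean is the precision-weighted average.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_var = 1.0 / total_precision
    fused_mean = fused_var * sum(m * p for m, p in zip(means, precisions))
    return fused_mean, fused_var


# Hypothetical estimates: the visual expert is less noisy than the textual one.
visual_mean, visual_var = 1.0, 0.25
textual_mean, textual_var = 3.0, 1.0

fused_mean, fused_var = poe_fuse(
    [visual_mean, textual_mean], [visual_var, textual_var]
)
print(fused_mean, fused_var)
```

Because the visual expert has the smaller variance, the fused mean lands closer to the visual estimate, which is the behavior the abstract describes.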
no code implementations • 6 Jul 2021 • Leitian Tao, Li Mi, Nannan Li, Xianhang Cheng, Yaosi Hu, Zhenzhong Chen
For a typical Scene Graph Generation (SGG) method, there is often a large gap in the performance of the predicates' head classes and tail classes.
no code implementations • 2 Jul 2021 • Li Mi, Yangjun Ou, Zhenzhong Chen
To evaluate the VRF task, we introduce two video datasets named VRF-AG and VRF-VidOR, with a series of spatio-temporally localized visual relation annotations in a video.
1 code implementation • 9 Jun 2021 • Jingyuan Chen, Guanchen Ding, Yuchen Yang, Wenwei Han, Kangmin Xu, Tianyi Gao, Zhe Zhang, Wanping Ouyang, Hao Cai, Zhenzhong Chen
For the vehicle detection and tracking module, we adopted YOLOv5 and multi-scale tracking to localize the anomalies.
1 code implementation • 17 May 2021 • Yaochen Zhu, Zhenzhong Chen
Moreover, by considering the fusion of collaborative and feature variables as a virtual communication channel from an information-theoretic perspective, we introduce a user-dependent channel to dynamically control the information allowed to be accessed from the feature embeddings.
1 code implementation • 21 Jul 2020 • Nannan Li, Zhenzhong Chen
Constructing adversarial examples in a black-box threat model injures the original images by introducing visual distortion.
1 code implementation • 15 Jun 2020 • Xianhang Cheng, Zhenzhong Chen
During the learning process, different intermediate time steps can be involved as a control variable by means of an extension of the coord-conv trick, allowing the estimated components to vary with different input temporal information.
no code implementations • 27 May 2020 • Xiaoying Ding, Zhenzhong Chen
Traditional 3D mesh saliency detection algorithms and corresponding databases were proposed under several constraints such as providing limited viewing directions and not taking the subject's movement into consideration.
1 code implementation • 28 Mar 2020 • Yaochen Zhu, Jiayi Xie, Zhenzhong Chen
As an emerging type of user-generated content, micro-video drastically enriches people's entertainment experiences and social interactions.
no code implementations • 24 Mar 2020 • Nannan Li, Zhenzhong Chen
Adversarial learning has shown its advances in generating natural and diverse descriptions in image captioning.
3 code implementations • 22 Jul 2019 • Wanjie Sun, Zhenzhong Chen
The proposed resampler network generates content adaptive image resampling kernels that are applied to the original HR input to generate pixels on the downscaled image.
Ranked #1 on Image Super-Resolution on Set14 - 2x upscaling
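As a rough illustration of applying per-pixel resampling kernels to an HR input, the sketch below downscales a small grayscale image by a factor of 2. It assumes the kernels are already given, one 2x2 kernel per output pixel; in the paper they are generated by the resampler network, which is not modeled here. The function name `downscale_with_kernels` and the sample data are hypothetical.

```python
def downscale_with_kernels(image, kernels, scale=2):
    """Downscale a 2D image using one kernel per output pixel.

    image   : list of rows of pixel intensities (the HR input)
    kernels : kernels[i][j] is a scale x scale weight grid for output pixel (i, j)
    Each output pixel is the normalized weighted sum of its HR block.
    """
    out_h = len(image) // scale
    out_w = len(image[0]) // scale
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            kernel = kernels[i][j]
            weight_sum = sum(sum(row) for row in kernel)
            acc = 0.0
            for di in range(scale):
                for dj in range(scale):
                    acc += kernel[di][dj] * image[i * scale + di][j * scale + dj]
            out[i][j] = acc / weight_sum
    return out


# Hypothetical 4x4 HR image.
hr = [
    [0, 2, 4, 4],
    [2, 4, 4, 4],
    [1, 1, 8, 0],
    [1, 1, 0, 0],
]

# With identical uniform kernels the operation reduces to box-filter averaging;
# a content-adaptive method would supply a different kernel at each location.
box = [[1.0, 1.0], [1.0, 1.0]]
uniform_kernels = [[box for _ in range(2)] for _ in range(2)]
lr = downscale_with_kernels(hr, uniform_kernels)
print(lr)
```

Swapping the uniform kernels for location-specific ones is what makes the resampling content adaptive; the loop structure stays the same.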
no code implementations • 2 Jul 2019 • Canwen Xu, Zhenzhong Chen, Chenliang Li
Recently, with the prevalence of large-scale image datasets, the co-occurrence information among classes has become rich, calling for a new way to exploit it to facilitate inference.
no code implementations • 1 Jul 2019 • Chenliang Li, Xichuan Niu, Xiangyang Luo, Zhenzhong Chen, Cong Quan
Given a sequence of a user's historically purchased items, we devise a novel hierarchical attention-over-attention mechanism to capture sequential patterns at both the union level and the individual level.
no code implementations • CVPR 2018 • Yicheng Wang, Zhenzhong Chen, Feng Wu, Gang Wang
In this paper, a novel deep architecture named BraidNet is proposed for person re-identification.
no code implementations • CVPR 2018 • Bin Xu, Zhenzhong Chen
In this paper, we present an end-to-end deep learning based framework for 3D object detection from a single monocular image.
Ranked #12 on Vehicle Pose Estimation on KITTI Cars Hard
3D Object Detection
3D Object Detection From Monocular Images
no code implementations • 28 Feb 2015 • Weiyao Lin, Ming-Ting Sun, Hongxiang Li, Zhenzhong Chen, Wei Li, Bing Zhou
We demonstrate that this low-computation-complexity method can efficiently catch the characteristics of the frame.
no code implementations • 21 Feb 2015 • Weiyao Lin, Hang Chu, Jianxin Wu, Bin Sheng, Zhenzhong Chen
In this paper, a new heat-map-based (HMB) algorithm is proposed for group activity recognition.
no code implementations • 21 Feb 2015 • Yuanzhe Chen, Weiyao Lin, Chongyang Zhang, Zhenzhong Chen, Ning Xu, Jun Xie
In this paper, we propose a new intra-and-inter-constraint-based video enhancement approach aiming to 1) achieve high intra-frame quality of the entire picture, where multiple regions of interest (ROIs) can be adaptively and simultaneously enhanced, and 2) guarantee inter-frame quality consistency among video frames.