no code implementations • 20 Sep 2023 • Jie Wang, Hanzhu Chen, Qitan Lv, Zhihao Shi, Jiajun Chen, Huarui He, Hongtao Xie, Yongdong Zhang, Feng Wu
This implies the great potential of the semantic correlations for the entity-independent inductive link prediction task.
no code implementations • 7 Sep 2023 • An-An Liu, Guokai Zhang, Yuting Su, Ning Xu, Yongdong Zhang, Lanjun Wang
Furthermore, we strengthen the watermark robustness of our approach by subjecting the compound image to various post-processing attacks, with minimal pixel distortion observed in the revealed watermark.
1 code implementation • 22 Aug 2023 • Zhihai Wang, Lei Chen, Jie Wang, Xing Li, Yinqi Bai, Xijun Li, Mingxuan Yuan, Jianye Hao, Yongdong Zhang, Feng Wu
In particular, we notice that the runtime of the Resub and Mfs2 operators often dominates the overall runtime of LS optimization processes.
1 code implementation • 4 Aug 2023 • Tianhao Qi, Hongtao Xie, Pandeng Li, Jiannan Ge, Yongdong Zhang
In this paper, we contend that the learning bias originates from two factors: 1) the unequal competition arising from the imbalanced distribution of foreground categories, and 2) the lack of sample diversity in tail categories.
Ranked #1 on
Long-tailed Object Detection
on LVIS v1.0 val
1 code implementation • 6 Jul 2023 • Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang
Video moment retrieval pursues an efficient and generalized solution to identify the specific temporal segments within an untrimmed video that correspond to a given language description.
no code implementations • 1 Jul 2023 • Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao
While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centric images, an intractable problem is how to preserve the face identity for conditioned face images.
1 code implementation • CVPR 2023 • Huan Ren, Wenfei Yang, Tianzhu Zhang, Yongdong Zhang
Weakly-supervised temporal action localization aims to localize and recognize actions in untrimmed videos with only video-level category labels during training.
Multiple Instance Learning
Weakly Supervised Action Localization
+2
1 code implementation • 24 May 2023 • Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Yongdong Zhang, Zhendong Mao
The answering quality of an aligned large language model (LLM) can be drastically improved if treated with proper crafting of prompts.
1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.
1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook.
1 code implementation • 9 May 2023 • Boqiang Zhang, Hongtao Xie, Yuxin Wang, Jianjun Xu, Yongdong Zhang
Vision model have gained increasing attention due to their simplicity and efficiency in Scene Text Recognition (STR) task.
1 code implementation • 26 Apr 2023 • Yang Zhang, Tianhao Shi, Fuli Feng, Wenjie Wang, Dingxian Wang, Xiangnan He, Yongdong Zhang
However, such a manner inevitably learns unstable feature interactions, i. e., the ones that exhibit strong correlations in historical data but generalize poorly for future serving.
1 code implementation • 24 Mar 2023 • Benfeng Xu, Quan Wang, Zhendong Mao, Yajuan Lyu, Qiaoqiao She, Yongdong Zhang
In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing utilization of LLMs.
1 code implementation • 19 Feb 2023 • Jie Wang, Rui Yang, Zijie Geng, Zhihao Shi, Mingxuan Ye, Qi Zhou, Shuiwang Ji, Bin Li, Yongdong Zhang, Feng Wu
The appealing features of RSD-OA include that: (1) RSD-OA is invariant to visual distractions, as it is conditioned on the predefined subsequent action sequence without task-irrelevant information from transition dynamics, and (2) the reward sequence captures long-term task-relevant information in both rewards and transition dynamics.
1 code implementation • 2 Feb 2023 • Zijie Geng, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Jie Wang, Yongdong Zhang, Feng Wu, Tie-Yan Liu
The obtained motif vocabulary consists of not only molecular motifs (i. e., the frequent fragments), but also their connection information, indicating how the motifs are connected with each other.
no code implementations • 1 Feb 2023 • Zhihai Wang, Xijun Li, Jie Wang, Yufei Kuang, Mingxuan Yuan, Jia Zeng, Yongdong Zhang, Feng Wu
Cut selection -- which aims to select a proper subset of the candidate cuts to improve the efficiency of solving MILPs -- heavily depends on (P1) which cuts should be preferred, and (P2) how many cuts should be selected.
1 code implementation • CVPR 2023 • Sun-Ao Liu, Yiheng Zhang, Zhaofan Qiu, Hongtao Xie, Yongdong Zhang, Ting Yao
POP builds a set of orthogonal prototypes, each of which represents a semantic class, and makes the prediction for each class separately based on the features projected onto its prototype.
no code implementations • CVPR 2023 • Yuchen Ren, Zhendong Mao, Shancheng Fang, Yan Lu, Tong He, Hao Du, Yongdong Zhang, Wanli Ouyang
In this paper, we introduce a new setting called Domain Generalization for Image Captioning (DGIC), where the data from the target domain is unseen in the learning process.
1 code implementation • CVPR 2023 • Zheren Fu, Zhendong Mao, Yan Song, Yongdong Zhang
Image-text matching, a bridge connecting image and language, is an important task, which generally learns a holistic cross-modal embedding to achieve a high-quality semantic alignment between the two modalities.
no code implementations • CVPR 2023 • Weiwei Feng, Nanqing Xu, Tianzhu Zhang, Yongdong Zhang
Concretely, the former adopts a dynamic convolution kernel and a static convolution kernel for the specific instance and the global dataset, respectively, which can inherit the advantages of both instance-specific and instance-agnostic attacks.
1 code implementation • 5 Dec 2022 • Yadong Qu, Qingfeng Tan, Hongtao Xie, Jianjun Xu, Yuxin Wang, Yongdong Zhang
Moreover, two new datasets (Tamper-Syn2k and Tamper-Scene) are proposed to fill the blank of public evaluation datasets.
1 code implementation • 29 Nov 2022 • Zheren Fu, Zhendong Mao, Bo Hu, An-An Liu, Yongdong Zhang
They have overlooked the wide characteristic changes of different classes and can not model abundant intra-class variations for generations.
1 code implementation • 19 Nov 2022 • Shancheng Fang, Zhendong Mao, Hongtao Xie, Yuxin Wang, Chenggang Yan, Yongdong Zhang
In this paper, we argue that the limited capacity of language models comes from 1) implicit language modeling; 2) unidirectional feature representation; and 3) language model with noise input.
Ranked #3 on
Text Spotting
on SCUT-CTW1500
no code implementations • ECCV 2022 • Kongzhu Jiang, Tianzhu Zhang, Xiang Liu, Bingqiao Qian, Yongdong Zhang, Feng Wu ;
To alleviate the above issues, we propose a novel Cross-Modality Transformer (CMT) to jointly explore a modality-level alignment module and an instance-level module for VI-ReID.
1 code implementation • 20 Oct 2022 • Jiahao Li, Quan Wang, Zhendong Mao, Junbo Guo, Yanyan Yang, Yongdong Zhang
In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task.
1 code implementation • 12 Oct 2022 • Zhiying Lu, Hongtao Xie, Chuanbin Liu, Yongdong Zhang
On channel aspect, we introduce a dynamic feature aggregation module in MLP and a brand new "head token" design in multi-head self-attention module to help re-calibrate channel representation and make different channel group representation interacts with each other.
no code implementations • 3 Sep 2022 • Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang
Text-to-image generation aims at generating realistic images which are semantically consistent with the given text.
no code implementations • 1 Sep 2022 • Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei
In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features.
no code implementations • 1 Sep 2022 • Quanwei Yang, Xinchen Liu, Wu Liu, Hongtao Xie, Xiaoyan Gu, Lingyun Yu, Yongdong Zhang
Human Video Motion Transfer (HVMT) aims to, given an image of a source person, generate his/her video that imitates the motion of the driving person.
1 code implementation • 2022 2022 • Jiaqi Zheng, Xi Zhang, Sanchuan Guo, Quan Wang, Wenyu Zang, Yongdong Zhang
Rumor spreaders are increasingly taking advantage of multimedia content to attract and mislead news consumers on social media.
no code implementations • 14 Jul 2022 • Weijian Chen, Yixin Cao, Fuli Feng, Xiangnan He, Yongdong Zhang
On the one hand, their performance will dramatically degrade along with the increasing sparsity of KGs.
1 code implementation • 13 May 2022 • Xiangnan He, Yang Zhang, Fuli Feng, Chonggang Song, Lingling Yi, Guohui Ling, Yongdong Zhang
We demonstrate DCR on the backbone model of neural factorization machine (NFM), showing that DCR leads to more accurate prediction of user preference with small inference time cost.
no code implementations • 19 Apr 2022 • Yuan Gao, Xiang Wang, Xiangnan He, Huamin Feng, Yongdong Zhang
At the core is to model the rumor characteristics inherent in rich information, such as propagation patterns in social network and semantic patterns in post content, and differentiate them from the truth.
no code implementations • 9 Mar 2022 • Xiaodong Chen, Xinchen Liu, Wu Liu, Kun Liu, Dong Wu, Yongdong Zhang, Tao Mei
Therefore, researchers start to focus on a new task, Part-level Action Parsing (PAP), which aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video.
no code implementations • CVPR 2022 • Jiamin Wu, Tianzhu Zhang, Zhe Zhang, Feng Wu, Yongdong Zhang
To address this issue, we propose an end-to-end Motion-modulated Temporal Fragment Alignment Network (MTFAN) by jointly exploring the task-specific motion modulation and the multi-level temporal fragment alignment for Few-Shot Action Recognition (FSAR).
1 code implementation • CVPR 2022 • Kun Zhang, Zhendong Mao, Quan Wang, Yongdong Zhang
Image-text matching, as a fundamental task, bridges the gap between vision and language.
1 code implementation • CVPR 2022 • Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian
Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.
3 code implementations • ICCV 2021 • Yuxin Wang, Hongtao Xie, Shancheng Fang, Jing Wang, Shenggao Zhu, Yongdong Zhang
Such operation guides the vision model to use not only the visual texture of characters, but also the linguistic information in visual context for recognition when the visual cues are confused (e. g. occlusion, noise, etc.).
1 code implementation • 16 Aug 2021 • Sihao Ding, Fuli Feng, Xiangnan He, Yong Liao, Jun Shi, Yongdong Zhang
Towards the goal, we propose a \textit{Causal Incremental Graph Convolution} approach, which consists of two new operators named \textit{Incremental Graph Convolution} (IGC) and \textit{Colliding Effect Distillation} (CED) to estimate the output of full graph convolution.
1 code implementation • 24 Jun 2021 • Yuxin Wang, Hongtao Xie, Shancheng Fang, Yadong Qu, Yongdong Zhang
However, there exists two problems: 1) the implicit erasure guidance causes the excessive erasure to non-text areas; 2) the one-stage erasure lacks the exhaustive removal of text region.
no code implementations • CVPR 2021 • Rui Sun, Yihao Li, Tianzhu Zhang, Zhendong Mao, Feng Wu, Yongdong Zhang
First, to the best of our knowledge, this is the first work to formulate lesion discovery as a weakly supervised lesion localization problem via a transformer decoder.
no code implementations • CVPR 2021 • Wenfei Yang, Tianzhu Zhang, Xiaoyuan Yu, Tian Qi, Yongdong Zhang, Feng Wu
To alleviate this problem, we propose a novel Uncertainty Guided Collaborative Training (UGCT) strategy, which mainly includes two key designs: (1) The first design is an online pseudo label generation module, in which the RGB and FLOW streams work collaboratively to learn from each other.
no code implementations • 13 Jun 2021 • Shaobo Min, Qi Dai, Hongtao Xie, Chuang Gan, Yongdong Zhang, Jingdong Wang
Cross-modal correlation provides an inherent supervision for video unsupervised representation learning.
no code implementations • CVPR 2021 • Yulin Li, Jianfeng He, Tianzhu Zhang, Xiang Liu, Yongdong Zhang, Feng Wu
To address these issues, we propose a novel end-to-end Part-Aware Transformer (PAT) for occluded person Re-ID through diverse part discovery via a transformer encoderdecoder architecture, including a pixel context based transformer encoder and a part prototype based transformer decoder.
1 code implementation • 13 May 2021 • Yang Zhang, Fuli Feng, Xiangnan He, Tianxin Wei, Chonggang Song, Guohui Ling, Yongdong Zhang
This work studies an unexplored problem in recommendation -- how to leverage popularity bias to improve the recommendation accuracy.
no code implementations • CVPR 2021 • Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang
In this paper, we present an Action Unit Memory Network (AUMN) for weakly supervised temporal action localization, which can mitigate the above two challenges by learning an action unit memory bank.
Ranked #6 on
Weakly Supervised Action Localization
on THUMOS14
Weakly Supervised Action Localization
Weakly-supervised Temporal Action Localization
+1
no code implementations • CVPR 2021 • Jiaming Li, Hongtao Xie, Jiahong Li, Zhongyuan Wang, Yongdong Zhang
Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries.
3 code implementations • CVPR 2021 • Shancheng Fang, Hongtao Xie, Yuxin Wang, Zhendong Mao, Yongdong Zhang
Additionally, based on the ensemble of iterative predictions, we propose a self-training method which can learn from unlabeled images effectively.
1 code implementation • ICCV 2021 • Xiaodong Chen, Xinchen Liu, Wu Liu, Xiao-Ping Zhang, Yongdong Zhang, Tao Mei
In this paper, we propose a post-hoc method, named Attribute-guided Metric Distillation (AMD), to explain existing ReID models.
Ranked #44 on
Person Re-Identification
on DukeMTMC-reID
no code implementations • ICCV 2021 • Jiamin Wu, Tianzhu Zhang, Yongdong Zhang, Feng Wu
The task-aware part filters can adapt to any individual task and automatically mine task-related local parts even for an unseen task.
no code implementations • ICCV 2021 • Meng Meng, Tianzhu Zhang, Qi Tian, Yongdong Zhang, Feng Wu
To the best of our knowledge, this is the first work that can achieve remarkable performance for both tasks by optimizing them jointly via FAM for WSOL.
no code implementations • ICCV 2021 • Weiwei Feng, Baoyuan Wu, Tianzhu Zhang, Yong Zhang, Yongdong Zhang
To tackle these issues, we propose a class-agnostic and model-agnostic physical adversarial attack model (Meta-Attack), which is able to not only generate robust physical adversarial examples by simulating color and shape distortions, but also generalize to attacking novel images and novel DNN models by accessing a few digital and physical images.
1 code implementation • 17 Dec 2020 • Xi Zhu, Zhendong Mao, Chunxiao Liu, Peng Zhang, Bin Wang, Yongdong Zhang
Our method can compensate for the data biases by generating balanced data without introducing external annotations.
no code implementations • NeurIPS 2020 • Shaobo Min, Hongtao Xie, Hantao Yao, Xuran Deng, Zheng-Jun Zha, Yongdong Zhang
In this paper, we introduce a new task, named Hierarchical Granularity Transfer Learning (HGTL), to recognize sub-level categories with basic-level annotations and semantic descriptions for hierarchical categories.
1 code implementation • 11 Sep 2020 • Weijian Chen, Fuli Feng, Qifan Wang, Xiangnan He, Chonggang Song, Guohui Ling, Yongdong Zhang
In this paper, we propose a new GCN model named CatGCN, which is tailored for graph learning when the node features are categorical.
no code implementations • 9 Aug 2020 • Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang
The depth images denoising are increasingly becoming the hot research topic nowadays because they reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision.
no code implementations • ACL 2020 • Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang
With the great success of pre-trained language models, the pretrain-finetune paradigm now becomes the undoubtedly dominant solution for natural language understanding (NLU) tasks.
no code implementations • 31 May 2020 • Hantao Yao, Shaobo Min, Yongdong Zhang, Changsheng Xu
Then, an attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories, which utilizes the graph operation to capture the semantic relationship between categories.
1 code implementation • 27 May 2020 • Yang Zhang, Fuli Feng, Chenxu Wang, Xiangnan He, Meng Wang, Yan Li, Yongdong Zhang
Nevertheless, normal training on new data only may easily cause overfitting and forgetting issues, since the new data is of a smaller scale and contains fewer information on long-term user preference.
1 code implementation • CVPR 2020 • Yuxin Wang, Hongtao Xie, Zheng-Jun Zha, Mengting Xing, Zilong Fu, Yongdong Zhang
Then a novel Local Orthogonal Texture-aware Module (LOTM) models the local texture information of proposal features in two orthogonal directions and represents text region with a set of contour points.
1 code implementation • CVPR 2020 • Chunxiao Liu, Zhendong Mao, Tianzhu Zhang, Hongtao Xie, Bin Wang, Yongdong Zhang
The GSMN explicitly models object, relation and attribute as a structured phrase, which not only allows to learn correspondence of object, relation and attribute separately, but also benefits to learn fine-grained correspondence of structured phrase.
Ranked #14 on
Cross-Modal Retrieval
on Flickr30k
1 code implementation • CVPR 2020 • Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang
Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating the biased recognition problem.
1 code implementation • 30 Mar 2020 • Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang
In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity.
1 code implementation • 10 Feb 2020 • Hongmin Zhu, Fuli Feng, Xiangnan He, Xiang Wang, Yan Li, Kai Zheng, Yongdong Zhang
We term this framework as Bilinear Graph Neural Network (BGNN), which improves GNN representation ability with bilinear interactions between neighbor nodes.
12 code implementations • 6 Feb 2020 • Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, Meng Wang
We propose a new model named LightGCN, including only the most essential component in GCN -- neighborhood aggregation -- for collaborative filtering.
Ranked #6 on
Recommendation Systems
on Yelp2018
no code implementations • 25 Dec 2019 • Yu Li, Sheng Tang, Rui Zhang, Yongdong Zhang, Jintao Li, Shuicheng Yan
While in situations where two domains are asymmetric in complexity, i. e., the amount of information between two domains is different, these approaches pose problems of poor generation quality, mapping ambiguity, and model sensitivity.
9 code implementations • 21 Nov 2019 • Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, Jie Wang
HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy.
Ranked #1 on
Knowledge Graph Completion
on WN18RR
no code implementations • 23 Sep 2019 • Zhaofan Qiu, Ting Yao, Yiheng Zhang, Yongdong Zhang, Tao Mei
Moreover, we enlarge the search space of SDAS particularly for video recognition by devising several unique operations to encode spatio-temporal dynamics and demonstrate the impact in affecting the architecture search of SDAS.
no code implementations • 23 Aug 2019 • Yanhao Zhu, Zhineng Chen, Shuai Zhao, Hongtao Xie, Wenming Guo, Yongdong Zhang
Nowadays U-net-like FCNs predominate various biomedical image segmentation applications and attain promising performance, largely due to their elegant architectures, e. g., symmetric contracting and expansive paths as well as lateral skip-connections.
1 code implementation • 12 Aug 2019 • Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang
In contrast to previous methods, the DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between two domains.
no code implementations • 29 Jul 2019 • Tianyi Wu, Sheng Tang, Rui Zhang, Guodong Guo, Yongdong Zhang
However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category.
1 code implementation • 24 Jun 2019 • Feng Dai, Hao liu, Yike Ma, Juan Cao, Qiang Zhao, Yongdong Zhang
The key component of our network is the dense dilated convolution block, in which each dilation layer is densely connected with the others to preserve information from continuously varied scales.
1 code implementation • 6 Jun 2019 • Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, Feng Wu
With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i. e., the task of image captioning.
2 code implementations • 29 Apr 2019 • Xin Xin, Xiangnan He, Yongfeng Zhang, Yongdong Zhang, Joemon Jose
In this work, we propose Relational Collaborative Filtering (RCF), a general framework to exploit multiple relations between items in recommender system.
no code implementations • 1 Jan 2019 • Jiarong Dong, Ke Gao, Xiaokai Chen, Junbo Guo, Juan Cao, Yongdong Zhang
To address this issue, we propose a novel learning strategy called Information Loss, which focuses on the relationship between the video-specific visual content and corresponding representative words.
3 code implementations • 20 Nov 2018 • Tianyi Wu, Sheng Tang, Rui Zhang, Yongdong Zhang
To tackle this problem, we propose a novel Context Guided Network (CGNet), which is a light-weight and efficient network for semantic segmentation.
Ranked #8 on
Semantic Segmentation
on EventScape
no code implementations • 19 Nov 2018 • Jiawei Liu, Zheng-Jun Zha, Hongtao Xie, Zhiwei Xiong, Yongdong Zhang
An appearance network is developed to learn appearance features from the full body, horizontal and vertical body parts of pedestrians with spatial dependencies among body parts.
2 code implementations • 7 Nov 2018 • Rui Zhang, Sheng Tang, Yu Li, Junbo Guo, Yongdong Zhang, Jintao Li, Shuicheng Yan
The S3-GAN consists of an encoder network, a generator network, and an adversarial network.
1 code implementation • 16 Aug 2018 • Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, Feng Wu
To fill the gap, we propose a Context-Aware Visual Policy network (CAVP) for sequence-level image captioning.
no code implementations • 31 Jul 2018 • Shaobo Min, Xuejin Chen, Zheng-Jun Zha, Feng Wu, Yongdong Zhang
\begin{abstract} Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation.
no code implementations • IJCAI 2018 • An-An Liu1, Ning Xu1, Hanwang Zhang2, Weizhi Nie1, Yuting Su1, Yongdong Zhang
Image captioning is one of the most challenging hallmarks of AI, due to its complexity in visual and natural language understanding.
no code implementations • 12 Dec 2017 • Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Tao Mei
We evaluate our approach on two large-scale Flickr image datasets with over 1. 8 million photos in total, for the task of popularity prediction.
1 code implementation • 12 Dec 2017 • Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Qiushi Huang, Jintao Li, Tao Mei
With a joint embedding network, we obtain a unified deep representation of multi-modal user-post data in a common embedding space.
no code implementations • Mountain View, CA, USA 2017 • Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang
In this paper, we propose a novel Recurrent Neural Network with an at- tention mechanism (att-RNN) to fuse multimodal features for e ective rumor detection.
no code implementations • ICCV 2017 • Rui Zhang, Sheng Tang, Yongdong Zhang, Jintao Li, Shuicheng Yan
Through adding a new scale regression layer, we can dynamically infer the position-adaptive scale coefficients which are adopted to resize the convolutional patches.
3 code implementations • 18 Jul 2017 • Shiwei Shen, Guoqing Jin, Ke Gao, Yongdong Zhang
Although neural networks could achieve state-of-the-art performance while recongnizing images, they often suffer a tremendous defeat from adversarial examples--inputs generated by utilizing imperceptible but intentional perturbation to clean samples from the datasets.
no code implementations • 4 Jul 2017 • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately.
Ranked #93 on
Person Re-Identification
on Market-1501
no code implementations • 4 Jul 2017 • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR).
no code implementations • CVPR 2017 • Xishan Zhang, Ke Gao, Yongdong Zhang, Dongming Zhang, Jintao Li, Qi Tian
This paper contributes to: 1)The first in-depth study of the weakness inherent in data-driven static fusion methods for video captioning.
1 code implementation • 19 Feb 2017 • Hantao Yao, Feng Dai, Dongming Zhang, Yike Ma, Shiliang Zhang, Yongdong Zhang, Qi Tian
Accordingly, DR$^{2}$-Net consists of two components, \emph{i. e.,} linear mapping network and residual network, respectively.
no code implementations • 16 Nov 2016 • Zhiwei Jin, Juan Cao, Jiebo Luo, Yongdong Zhang
In order to overcome the scarcity of training samples of fake images, we first construct a large-scale auxiliary dataset indirectly related to this task.
1 code implementation • 7 Nov 2016 • Depeng Liang, Yongdong Zhang
Recently deeplearning models have been shown to be capable of making remarkable performance in sentences and documents classification tasks.
no code implementations • 19 Jun 2015 • Xuehui Wang, Jinli Suo, Jingyi Yu, Yongdong Zhang, Qionghai Dai
Firstly, we capture the scene with a pinhole and analyze the scene content to determine primary edge orientations.
no code implementations • CVPR 2015 • Wu Liu, Tao Mei, Yongdong Zhang, Cherry Che, Jiebo Luo
Given the tremendous growth of online videos, video thumbnail, as the common visualization form of video content, is becoming increasingly important to influence user's browsing and searching experience.
no code implementations • CVPR 2013 • Lei Zhang, Yongdong Zhang, Jinhu Tang, Ke Lu, Qi Tian
In this paper, we propose a weighted Hamming distance ranking algorithm (WhRank) to rank the binary codes of hashing methods.