1 code implementation • 13 Mar 2023 • Dingfeng Shi, Yujie Zhong, Qiong Cao, Lin Ma, Jia Li, DaCheng Tao
In this paper, we present a one-stage framework TriDet for temporal action detection.
Ranked #1 on Temporal Action Localization on EPIC-KITCHENS-100
1 code implementation • 5 Feb 2023 • Sifan Zhou, Zhi Tian, Xiangxiang Chu, Xinyu Zhang, Bo Zhang, Xiaobo Lu, Chengjian Feng, Zequn Jie, Patrick Yin Chiang, Lin Ma
The deployment of 3D detectors poses one of the major challenges in real-world self-driving scenarios.
1 code implementation • 24 Dec 2022 • Dengjie Li, Siyu Chen, Yujie Zhong, Lin Ma
In person re-identification (ReID) tasks, many works explore the learning of part features to improve the performance over global image features.
Ranked #2 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • 7 Dec 2022 • Feng Yan, Zhiheng Li, Weixin Luo, Zequn Jie, Fan Liang, Xiaolin Wei, Lin Ma
This is a brief technical report of our proposed method for Multiple-Object Tracking (MOT) Challenge in Complex Environments.
Ranked #2 on Multi-Object Tracking on DanceTrack (using extra training data)
1 code implementation • 22 Nov 2022 • Chengjian Feng, Zequn Jie, Yujie Zhong, Xiangxiang Chu, Lin Ma
However, the typical convolution ignores the radial symmetry of the BEV features and increases the difficulty of the detector optimization.
no code implementations • 22 Oct 2022 • Jiaming Chen, Weixin Luo, Xiaolin Wei, Lin Ma, Wei zhang
To simplify the pipeline, we carefully investigate 3D visual grounding and summarize three fundamental problems about how to develop an end-to-end model with high performance for this task.
no code implementations • 17 Oct 2022 • Jing Zhang, Zhao Li, Jiqiang Zhang, Lin Ma, Guozhong Zheng, Li Chen
Here we show that oscillatory behaviors naturally emerge if incomplete information is incorporated into the cooperation evolution of a non-Markov model.
1 code implementation • 11 Oct 2022 • Lin Ma, Jiangtao Gong, Hao Xu, Hao Chen, Hao Zhao, Wenbing Huang, Guyue Zhou
In this paper, we present a graph-transformer based framework for the ASP problem which is trained and demonstrated on a self-collected ASP database.
no code implementations • 10 Oct 2022 • Zixu Wang, Yujie Zhong, Yishu Miao, Lin Ma, Lucia Specia
However, even in paired video-text segments, only a subset of the frames is semantically relevant to the corresponding text, with the remainder representing noise, and the ratio of noisy frames is higher for longer videos.
no code implementations • 8 Oct 2022 • Yufeng Zhong, Long Xu, Jiebo Luo, Lin Ma
With such global and local contextual modeling strategies, our proposed model can effectively characterize the object representations and contextual information and thereby generate comprehensive and detailed descriptions of the located objects.
7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.
no code implementations • 19 Sep 2022 • Ruobing Xie, Lin Ma, Shaoliang Zhang, Feng Xia, Leyu Lin
Precisely, we first define a new behavior named valid read, which helps to select high-quality click instances for different users and items via dwell time.
1 code implementation • 16 Sep 2022 • Jinlong Li, Zequn Jie, Xu Wang, Xiaolin Wei, Lin Ma
To tackle this issue, this paper proposes an Expansion and Shrinkage scheme based on the offset learning in the deformable convolution, to sequentially improve the recall and precision of the located object in the two respective stages.
Weakly-Supervised Semantic Segmentation
1 code implementation • 16 Sep 2022 • Jinlong Li, Zequn Jie, Xu Wang, Yu Zhou, Xiaolin Wei, Lin Ma
"Progressive Patch Learning" further extends the feature destruction and patch learning to multi-level granularities in a progressive manner.
Weakly-Supervised Semantic Segmentation
1 code implementation • 7 Sep 2022 • Yang Jiao, Zequn Jie, Shaoxiang Chen, Jingjing Chen, Lin Ma, Yu-Gang Jiang
Recent approaches aim at exploring the semantic densities of camera features through lifting points in 2D camera images (referred to as seeds) into 3D space, and then incorporate 2D semantics via cross-modal interaction or fusion techniques.
no code implementations • 30 Aug 2022 • Shuqiang Cao, Weixin Luo, Bairui Wang, Wei zhang, Lin Ma
In this paper, we advocate a novel and efficient principle for online action detection.
1 code implementation • 23 Jul 2022 • Qian Yang, Yunxin Li, Baotian Hu, Lin Ma, Yuxing Ding, Min Zhang
CSI), a relation inferrer, and a Lexical Constraint-aware Generator (arr.
1 code implementation • 14 Jul 2022 • Dingfeng Shi, Yujie Zhong, Qiong Cao, Jing Zhang, Lin Ma, Jia Li, DaCheng Tao
Moreover, we propose two losses to facilitate and stabilize the training of action classification.
Ranked #6 on Temporal Action Localization on THUMOS’14
no code implementations • 11 Jul 2022 • Shaoxiang Chen, Zequn Jie, Xiaolin Wei, Lin Ma
In this technical report, we introduce our submission to the Waymo 3D Detection leaderboard.
no code implementations • 3 Jul 2022 • Zhangkai Ni, Wenhan Yang, Hanli Wang, Shiqi Wang, Lin Ma, Sam Kwong
Getting rid of the fundamental limitations in fitting to the paired training data, recent unsupervised low-light enhancement methods excel in adjusting illumination and contrast of images.
no code implementations • 13 May 2022 • Mahdieh Kazemimoghadam, Zi Yang, Lin Ma, Mingli Chen, Weiguo Lu, Xuejun Gu
We proposed to leverage the consistency of organs' anatomical shape and position information in medical images.
2 code implementations • 30 Mar 2022 • Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma
The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations.
1 code implementation • 10 Mar 2022 • Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
3D dense captioning is a recently-proposed novel task, where point clouds contain more geometric information than the 2D counterpart.
no code implementations • 10 Mar 2022 • Xiaohan Lan, Yitian Yuan, Xin Wang, Long Chen, Zhi Wang, Lin Ma, Wenwu Zhu
New benchmarking results indicate that our proposed evaluation protocols can better monitor the research progress.
no code implementations • 10 Mar 2022 • Yang Jiao, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
However, exploring relationships among these suspected objects in the one-stage visual grounding paradigm is non-trivial due to two core problems: (1) no object proposals are available as the basis on which to select suspected objects and perform relationship modeling; (2) compared with those irrelevant to the text query, suspected objects are more confusing, as they may share similar semantics, be entangled with certain relationships, etc., and thereby more easily mislead the model's prediction.
no code implementations • 12 Feb 2022 • Guozhong Zheng, Jiqiang Zhang, Rizhou Liang, Lin Ma, Li Chen
Behavioral experiments on the Ultimatum Game have shown that human beings have a remarkable preference for fair play, contradicting the predictions of game theory.
1 code implementation • 2 Dec 2021 • Yitian Yuan, Lin Ma, Wenwu Zhu
Enhancing the diversity of sentences to describe video contents is an important problem arising in recent video captioning research.
1 code implementation • 2 Dec 2021 • Yitian Yuan, Lin Ma, Jingwen Wang, Wenwu Zhu
In this paper, we investigate a novel and challenging task, namely controllable video captioning with an exemplar sentence.
no code implementations • 9 Oct 2021 • Wei zhang, Debin Huang, Hantao Li, Lipeng Wang, Yanzhao Wei, Kang Pan, Lin Ma, Huanhuan Feng, Jing Pan, Yuzhu Guo
The accurate and reliable detection or prediction of freezing of gait (FOG) is important for fall prevention in Parkinson's Disease (PD) and for studying the physiological transitions during the occurrence of FOG.
1 code implementation • 9 Oct 2021 • Yang Jiao, Zequn Jie, Weixin Luo, Jingjing Chen, Yu-Gang Jiang, Xiaolin Wei, Lin Ma
Referring Image Segmentation (RIS) aims at segmenting the target object from an image referred by one given natural language expression.
no code implementations • 1 Oct 2021 • Siyu Chen, Dengjie Li, Lishuai Gao, Fan Liang, Wei zhang, Lin Ma
This paper is a technical report on our submission to the ICCV 2021 VIPriors Re-identification Challenge.
1 code implementation • 4 Aug 2021 • Chen Zhang, Runmin Cong, Qinwei Lin, Lin Ma, Feng Li, Yao Zhao, Sam Kwong
For the cross-modality interaction in feature encoder, existing methods either indiscriminately treat RGB and depth modalities, or only habitually utilize depth cues as auxiliary information of the RGB branch.
no code implementations • 27 Jul 2021 • Xuan Xia, Xizhou Pan, Xing He, Jingfei Zhang, Ning Ding, Lin Ma
As a kind of generative self-supervised learning method, generative adversarial nets have been widely studied in the field of anomaly detection.
no code implementations • 4 Jul 2021 • Yunxin Li, Qian Yang, Qingcai Chen, Lin Ma, Baotian Hu, Xiaolong Wang, Yuxin Ding
Single online handwritten Chinese character recognition (single OLHCCR) has achieved prominent performance.
no code implementations • 1 Jul 2021 • Yunxin Li, Yu Zhao, Baotian Hu, Qingcai Chen, Yang Xiang, Xiaolong Wang, Yuxin Ding, Lin Ma
Previous works indicate that the glyph of Chinese characters contains rich semantic information and has the potential to enhance the representation of Chinese characters.
no code implementations • 9 May 2021 • Kaihao Zhang, Wenhan Luo, Yanjiang Yu, Wenqi Ren, Fang Zhao, Changsheng Li, Lin Ma, Wei Liu, Hongdong Li
We first use a coarse deraining network to reduce the rain streaks on the input images, and then adopt a pre-trained semantic segmentation network to extract semantic features from the coarse derained image.
1 code implementation • CVPR 2021 • Yongfei Liu, Bo Wan, Lin Ma, Xuming He
Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding.
1 code implementation • 18 Feb 2021 • Bo Liu, Li-Ming Zhan, Li Xu, Lin Ma, Yan Yang, Xiao-Ming Wu
We show that SLAKE can be used to facilitate the development and evaluation of Med-VQA systems.
no code implementations • 12 Jan 2021 • Xuanyu He, Wei zhang, Ran Song, Qian Zhang, Xiangyuan Lan, Lin Ma
By studying two unsupervised person re-ID methods in a cross-method way, we point out that a hard negative problem is handled implicitly by their designs of data augmentations and PK sampler respectively.
1 code implementation • 5 Jan 2021 • Haiwen Diao, Ying Zhang, Lin Ma, Huchuan Lu
Image-text matching plays a critical role in bridging the vision and language, and great progress has been made by exploiting the global alignment between image and sentence, or local alignments between regions and words.
Ranked #2 on Image Retrieval on Flickr30K 1K test
1 code implementation • 30 Dec 2020 • Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, Sam Kwong
In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning on a large number of paired images.
no code implementations • 30 Dec 2020 • Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, Sam Kwong
The key novelty of the proposed QAGAN lies in the injected QAM for the generator such that it learns domain-relevant quality attention directly from the two domains.
3 code implementations • 18 Nov 2020 • Wen Liu, Zhixin Piao, Zhi Tu, Wenhan Luo, Lin Ma, Shenghua Gao
Also, we build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
no code implementations • NeurIPS 2018 • Xing Yan, Weizhong Zhang, Lin Ma, Wei Liu, Qi Wu
We propose a parsimonious quantile regression framework to learn the dynamic tail behaviors of financial asset returns.
no code implementations • 2 Sep 2020 • Kui Fu, Jia Li, Lin Ma, Kai Mu, Yonghong Tian
In this paper, we propose a novel context reasoning approach for small object detection which models and infers the intrinsic semantic and spatial layout relationships between objects.
1 code implementation • ECCV 2020 • Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma
In this paper, we propose a Consensus-aware Visual-Semantic Embedding (CVSE) model to incorporate the consensus information, namely the commonsense knowledge shared between both modalities, into image-text matching.
1 code implementation • CVPR 2020 • Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Bjorn Stenger, Wei Liu, Hongdong Li
To address this problem, we propose a new method which combines two GAN models, i.e., a learning-to-Blur GAN (BGAN) and learning-to-DeBlur GAN (DBGAN), in order to learn a better model for image deblurring by primarily learning how to blur images.
Ranked #14 on Deblurring on HIDE (trained on GOPRO)
no code implementations • 18 Mar 2020 • Xu Li, Jingwen Wang, Lin Ma, Kaihao Zhang, Fengzong Lian, Zhanhui Kang, Jinjun Wang
Such a design enables efficient spatio-temporal modeling and maintains a small model scale.
no code implementations • 16 Mar 2020 • Yijun Song, Jingwen Wang, Lin Ma, Zhou Yu, Jun Yu
The task of temporally grounding textual queries in videos is to localize one video segment that semantically corresponds to the given query.
no code implementations • CVPR 2020 • Zhenfang Chen, Peng Wang, Lin Ma, Kwan-Yee K. Wong, Qi Wu
To bridge the gap, we propose a new dataset for visual reasoning in context of referring expression comprehension with two main features.
no code implementations • 25 Jan 2020 • Zhenfang Chen, Lin Ma, Wenhan Luo, Peng Tang, Kwan-Yee K. Wong
In this paper, we study the problem of weakly-supervised temporal grounding of sentence in video.
no code implementations • CVPR 2020 • Wei Xiong, Yutong He, Yixuan Zhang, Wenhan Luo, Lin Ma, Jiebo Luo
In this paper, we aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image, which can thereby benefit the subsequent fine-grained image recognition and few-shot learning tasks.
no code implementations • 18 Dec 2019 • Tianrui Liu, Wenhan Luo, Lin Ma, Jun-Jie Huang, Tania Stathaki, Tianhong Dai
Ablation studies have validated the effectiveness of both the proposed gated multi-layer feature extraction sub-network and the deformable occlusion handling sub-network.
1 code implementation • 26 Nov 2019 • Yang Yang, Xiaojie Guo, Jiayi Ma, Lin Ma, Haibin Ling
It is challenging to inpaint face images in the wild, due to the large variation of appearance, such as different poses, expressions and occlusions.
1 code implementation • 24 Nov 2019 • Yuanbin Fu, Jiayi Ma, Lin Ma, Xiaojie Guo
The underlying principle is that, for images from multiple domains, the content features can be obtained by a uniform extractor, while (re-)stylization is achieved by mapping the extracted features specifically to different purposes (domains and exemplars).
1 code implementation • NeurIPS 2019 • Xu Wang, Jingming He, Lin Ma
In this paper, we propose one novel model for point cloud semantic segmentation, which exploits both the local and global structures within the point cloud based on the contextual point representations.
1 code implementation • NeurIPS 2019 • Yitian Yuan, Lin Ma, Jingwen Wang, Wei Liu, Wenwu Zhu
Temporal sentence grounding in videos aims to detect and localize one target video segment, which semantically corresponds to a given sentence.
1 code implementation • ECCV 2020 • Xudong Lin, Lin Ma, Wei Liu, Shih-Fu Chang
As such, being aware of the global context, the modulated convolution kernel of our proposed CGC can better extract representative local patterns and compose discriminative features.
Ranked #59 on Image Classification on ObjectNet (using extra training data)
2 code implementations • ICCV 2019 • Wen Liu, Zhixin Piao, Jie Min, Wenhan Luo, Lin Ma, Shenghua Gao
In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose and shape, which can not only model the joint location and rotation but also characterize the personalized body shape.
1 code implementation • 11 Sep 2019 • Jingwen Wang, Lin Ma, Wenhao Jiang
The task of temporally grounding language queries in videos is to temporally localize the best matched video segment corresponding to a given language (sentence).
1 code implementation • ICCV 2019 • Bairui Wang, Lin Ma, Wei zhang, Wenhao Jiang, Jingwen Wang, Wei Liu
In this paper, we propose to guide the video caption generation with Part-of-Speech (POS) information, based on a gated fusion of multiple representations of input videos.
no code implementations • 14 Aug 2019 • Yifu Chen, Zongsheng Wang, Bowen Wu, Mengyuan Li, huan zhang, Lin Ma, Feng Liu, Qihang Feng, Baoxun Wang
Chinese meme-face is a special kind of internet subculture widely spread in Chinese Social Community Networks.
1 code implementation • 12 Aug 2019 • Yitian Yuan, Lin Ma, Wenwu Zhu
With the tremendous growth of videos over the Internet, video thumbnails, providing video content previews, are becoming increasingly crucial to influencing users' online searching experiences.
1 code implementation • 23 Jul 2019 • Yaxiong Wang, Hao Yang, Xueming Qian, Lin Ma, Jing Lu, Biao Li, Xin Fan
Then, an attention mechanism is proposed to model the relations between the image region and blocks and generate the valuable position feature, which will be further utilized to enhance the region expression and model a more reliable relationship between the visual image and the textual sentence.
1 code implementation • ACL 2019 • Zhenfang Chen, Lin Ma, Wenhan Luo, Kwan-Yee K. Wong
In this paper, we address a novel task, namely weakly-supervised spatio-temporally grounding natural sentence in video.
no code implementations • 3 Jun 2019 • Wei Zhang, Bairui Wang, Lin Ma, Wei Liu
Unlike previous video captioning work mainly exploiting the cues of video contents to make a language description, we propose a reconstruction network (RecNet) in a novel encoder-decoder-reconstructor architecture, which leverages both forward (video to sentence) and backward (sentence to video) flows for video captioning.
1 code implementation • CVPR 2019 • Zitian Chen, Yanwei Fu, Yu-Xiong Wang, Lin Ma, Wei Liu, Martial Hebert
Humans can robustly learn novel visual concepts even when images undergo various deformations and lose certain information.
1 code implementation • 28 May 2019 • Yongyi Tang, Lin Ma, Lianqiang Zhou
However, extracting motion information, specifically in the form of optical flow features, is extremely computationally expensive, especially for large-scale video classification.
no code implementations • CVPR 2019 • Yang Feng, Lin Ma, Wei Liu, Jiebo Luo
The need for efficiently finding the video content a user wants is increasing because of the explosion of user-generated videos on the Web.
18 code implementations • 28 Feb 2019 • Xiaojie Guo, Siyuan Li, Jinke Yu, Jiawan Zhang, Jiayi Ma, Lin Ma, Wei Liu, Haibin Ling
Being accurate, efficient, and compact is essential to a facial landmark detector for practical use.
no code implementations • 2 Feb 2019 • Bairui Wang, Lin Ma, Wei zhang, Wenhao Jiang, Feng Zhang
In this paper, we propose a novel model with a hierarchical photo-scene encoder and a reconstructor for the task of album storytelling.
no code implementations • 9 Dec 2018 • Xinpeng Chen, Lin Ma, Jingyuan Chen, Zequn Jie, Wei Liu, Jiebo Luo
Experiments on RefCOCO, RefCOCO+, and RefCOCOg datasets demonstrate that our proposed SSG without relying on any region proposals can achieve comparable performance with other advanced models.
no code implementations • NeurIPS 2018 • Wenqi Ren, Jiawei Zhang, Lin Ma, Jinshan Pan, Xiaochun Cao, WangMeng Zuo, Wei Liu, Ming-Hsuan Yang
In this paper, we present a deep convolutional neural network to capture the inherent properties of image degradation, which can handle different kernels and saturated pixels in a unified framework.
no code implementations • CVPR 2019 • Yuan Liu, Lin Ma, Yifeng Zhang, Wei Liu, Shih-Fu Chang
In this paper, we propose a multi-granularity generator (MGG) to perform the temporal action proposal from different granularity perspectives, relying on the video visual features equipped with the position embedding information.
Ranked #2 on Action Recognition on THUMOS’14
1 code implementation • CVPR 2019 • Yang Feng, Lin Ma, Wei Liu, Jiebo Luo
Instead of relying on manually labeled image-sentence pairs, our proposed model merely requires an image set, a sentence corpus, and an existing visual concept detector.
no code implementations • EMNLP 2018 • Jingyuan Chen, Xinpeng Chen, Lin Ma, Zequn Jie, Tat-Seng Chua
We introduce an effective and efficient method that grounds (i.e., localizes) natural sentences in long, untrimmed video sequences.
no code implementations • 29 Sep 2018 • Yongyi Tang, Xing Zhang, Jingwen Wang, Shaoxiang Chen, Lin Ma, Yu-Gang Jiang
This paper describes our solution for the 2nd YouTube-8M video understanding challenge organized by Google AI.
1 code implementation • 30 Aug 2018 • Fan Zhu, Lin Ma, Xin Xu, Dingfeng Guo, Xiao Cui, Qi Kong
Since manual calibration is not sustainable once entering into mass production stage for industrial purposes, we here introduce a machine-learning based auto-calibration system for autonomous driving vehicles.
1 code implementation • ECCV 2018 • Yang Feng, Lin Ma, Wei Liu, Tong Zhang, Jiebo Luo
We first exploit and reorganize the videos in ActivityNet to form a new dataset for video re-localization research, which consists of about 10,000 videos of diverse visual appearances associated with localized boundary information.
no code implementations • ECCV 2018 • Wenhao Jiang, Lin Ma, Yu-Gang Jiang, Wei Liu, Tong Zhang
In this paper, in order to exploit the complementary information from multiple encoders, we propose a novel Recurrent Fusion Network (RFNet) for tackling image captioning.
no code implementations • ECCV 2018 • Minjun Li, Hao-Zhi Huang, Lin Ma, Wei Liu, Tong Zhang, Yu-Gang Jiang
Recent studies on unsupervised image-to-image translation have made remarkable progress by training a pair of generative adversarial networks with a cycle-consistent loss.
2 code implementations • 2 Jun 2018 • Yunzhe Tao, Lin Ma, Weizhong Zhang, Jian Liu, Wei Liu, Qiang Du
Time series prediction has been studied in a variety of domains.
no code implementations • ICML 2018 • Weizhong Zhang, Bin Hong, Lin Ma, Wei Liu, Tong Zhang
Relying on this study, we subsequently propose a novel safe screening method to quickly identify the elements guaranteed to be included (we refer to them as active) or excluded (inactive) in the final optimal solution of SFM during the optimization process.
no code implementations • 7 May 2018 • Yongyi Tang, Lin Ma, Wei Liu, Wei-Shi Zheng
Human motion prediction aims at generating future frames of human motion based on an observed sequence of skeletons.
no code implementations • 4 Apr 2018 • Xinpeng Chen, Jingyuan Chen, Lin Ma, Jian Yao, Wei Liu, Jiebo Luo, Tong Zhang
First, we demonstrate that video attractiveness and different engagements present different relationships.
no code implementations • 3 Apr 2018 • Wenhao Jiang, Lin Ma, Xinpeng Chen, Hanwang Zhang, Wei Liu
Recently, much progress has been made in image captioning, and an encoder-decoder framework has achieved outstanding performance for this task.
no code implementations • CVPR 2018 • Wenqi Ren, Lin Ma, Jiawei Zhang, Jinshan Pan, Xiaochun Cao, Wei Liu, Ming-Hsuan Yang
The proposed algorithm hinges on an end-to-end trainable neural network that consists of an encoder and a decoder.
Ranked #13 on Image Dehazing on SOTS Outdoor
1 code implementation • CVPR 2018 • Jingwen Wang, Wenhao Jiang, Lin Ma, Wei Liu, Yong Xu
We propose a bidirectional proposal method that effectively exploits both past and future contexts to make proposal predictions.
3 code implementations • CVPR 2018 • Bairui Wang, Lin Ma, Wei zhang, Wei Liu
Unlike previous video captioning work mainly exploiting the cues of video contents to make a language description, we propose a reconstruction network (RecNet) with a novel encoder-decoder-reconstructor architecture, which leverages both the forward (video to sentence) and backward (sentence to video) flows for video captioning.
1 code implementation • CVPR 2018 • Xinpeng Chen, Lin Ma, Wenhao Jiang, Jian Yao, Wei Liu
Recently, caption generation with an encoder-decoder framework has been extensively studied and applied in different domains, such as image captioning, code captioning, and so on.
1 code implementation • 28 Mar 2018 • Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Wei Liu, Hongdong Li
To tackle the second challenge, we leverage the developed DBLRNet as a generator in the GAN (generative adversarial network) architecture, and employ a content loss in addition to an adversarial loss for efficient adversarial training.
no code implementations • ECCV 2018 • Xinyu Gong, HaoZhi Huang, Lin Ma, Fumin Shen, Wei Liu, Tong Zhang
While each view of the stereoscopic pair is processed in an individual path, a novel feature aggregation strategy is proposed to effectively share information between the two paths.
3 code implementations • CVPR 2018 • Wei Xiong, Wenhan Luo, Lin Ma, Wei Liu, Jiebo Luo
The first stage generates videos of realistic contents for each frame.
1 code implementation • ICLR 2018 • Zijun Zhang, Lin Ma, Zongpeng Li, Chuan Wu
Adaptive optimization algorithms, such as Adam and RMSprop, have shown better optimization performance than stochastic gradient descent (SGD) in some scenarios.
no code implementations • CVPR 2017 • Hao-Zhi Huang, Hao Wang, Wenhan Luo, Lin Ma, Wenhao Jiang, Xiaolong Zhu, Zhifeng Li, Wei Liu
More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames.
no code implementations • 13 Apr 2017 • Lin Ma, Caifa Zhou, Xi Liu, Yubin Xu
Simulations that embed the Swiss roll from R^3 into R^2 with the LLE and ISOMAP algorithms show that the proposed adaptive neighboring selection algorithm is feasible and can find the optimal value of K, yielding a relatively small residual variance and a better visualization of the results.
no code implementations • 12 Apr 2017 • Caifa Zhou, Lin Ma, Xuezhi Tan
Another significant innovation of this paper is combining the fingerprint-based algorithm with the CM-SDE algorithm to improve the accuracy of indoor localization.
no code implementations • ICCV 2015 • Lin Ma, Xiaoqin Zhang, Weiming Hu, Junliang Xing, Jiwen Lu, Jie zhou
To address this, this paper presents a local subspace collaborative tracking method for robust visual tracking, where multiple linear and nonlinear subspaces are learned to better model the nonlinear relationship of object appearances.
no code implementations • ICCV 2015 • Lin Ma, Jiwen Lu, Jianjiang Feng, Jie zhou
It is desirable to combine multiple feature descriptors to improve the visual tracking performance because different features can provide complementary information to describe objects of interest.
no code implementations • 1 Jun 2015 • Lin Ma, Zhengdong Lu, Hang Li
We demonstrate the efficacy of our proposed model on the DAQUAR and COCO-QA datasets, two benchmark datasets for image QA, where it significantly outperforms the state of the art.
2 code implementations • ICCV 2015 • Lin Ma, Zhengdong Lu, Lifeng Shang, Hang Li
In this paper, we propose multimodal convolutional neural networks (m-CNNs) for matching image and sentence.
Ranked #15 on Image Retrieval on Flickr30K 1K test