Search Results for author: Long Chen

Found 177 papers, 67 papers with code

Vietnamese Event Detection Based on Chinese Information and Vietnamese Syntax Guidance

no code implementations • CCL 2021 • Long Chen, Junjun Guo, Yafei Zhang, Chengxiang Gao, Zhengtao Yu

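Current deep-learning-based event detection models all rely on a sufficient amount of annotated data, and the scarcity of annotated data together with the ambiguity of event types poses a great challenge for Vietnamese event detection. Building on the multilingual consistency property that "sentences expressing the same idea in different languages usually share the same or similar semantic components", this paper proposes a Vietnamese event detection framework guided by Chinese information and Vietnamese syntax. First, Chinese information is incorporated into Vietnamese through a shared-encoder strategy and a cross-attention network; then a graph convolutional network incorporates Vietnamese dependency syntax information; finally, Vietnamese event detection is performed under the guidance of Chinese event types. Experimental results show that Vietnamese event detection achieves good results under the guidance of Chinese information and Vietnamese syntax.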

Event Detection

Paper
Add Code

Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation

no code implementations • 3 Apr 2024 • Xiaoshuang Huang, Hongxiang Li, Meng Cao, Long Chen, Chenyu You, Dong An

Recent developments underscore the potential of textual information in enhancing learning models for a deeper understanding of medical visual semantics.

Image Segmentation Medical Image Segmentation +2

Paper
Add Code

Long and Short-Term Constraints Driven Safe Reinforcement Learning for Autonomous Driving

no code implementations • 27 Mar 2024 • Xuemin Hu, Pan Chen, Yijun Wen, Bo Tang, Long Chen

Reinforcement learning (RL) has been widely used in decision-making tasks, but it cannot guarantee the agent's safety in the training process due to the requirements of interaction with the environment, which seriously limits its industrial applications such as autonomous driving.

Autonomous Driving Decision Making +2

Paper
Add Code

View-Consistent 3D Editing with Gaussian Splatting

no code implementations • 18 Mar 2024 • Yuxuan Wang, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, Hanwang Zhang

The advent of 3D Gaussian Splatting (3DGS) has revolutionized 3D editing, offering efficient, high-fidelity rendering and enabling precise local manipulations.

Paper
Add Code

Distributionally Generative Augmentation for Fair Facial Attribute Classification

1 code implementation • 11 Mar 2024 • Fengda Zhang, Qianpei He, Kun Kuang, Jiashuo Liu, Long Chen, Chao Wu, Jun Xiao, Hanwang Zhang

This work proposes a novel, generation-based two-stage framework to train a fair FAC model on biased data without additional annotation.

Attribute Classification +2

Paper
Code

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

no code implementations • 3 Mar 2024 • Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang

We study the problem of procedure planning in instructional videos, which aims to make a goal-oriented sequence of action steps given partial visual state observations.

Contrastive Learning

Paper
Add Code

GenAD: Generative End-to-End Autonomous Driving

1 code implementation • 18 Feb 2024 • Wenzhao Zheng, Ruiqi Song, Xianda Guo, Chenming Zhang, Long Chen

We then employ a variational autoencoder to learn the future trajectory distribution in a structural latent space for trajectory prior modeling.

Autonomous Driving motion prediction

Paper
Code

Improving Data Augmentation for Robust Visual Question Answering with Effective Curriculum Learning

no code implementations • 28 Jan 2024 • Yuhang Zheng, Zhen Wang, Long Chen

Compared to training on the entire augmented dataset, our ECL strategy can further enhance VQA models' performance with fewer training samples.

Data Augmentation Question Answering +1

Paper
Add Code

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

no code implementations • 26 Jan 2024 • Jinhan Wang, Long Chen, Aparna Khare, Anirudh Raju, Pranav Dheram, Di He, Minhua Wu, Andreas Stolcke, Venkatesh Ravichandran

We propose an approach for continuous prediction of turn-taking and backchanneling locations in spoken dialogue by fusing a neural acoustic model with a large language model (LLM).

Language Modelling Large Language Model

Paper
Add Code

Boundary and Relation Distillation for Semantic Segmentation

no code implementations • 24 Jan 2024 • Dong Zhang, Pingcheng Dong, Xinting Hu, Long Chen, Kwang-Ting Cheng

Concurrently, the relation distillation transfers implicit relations from the teacher model to the student model using pixel-level self-relation as a bridge, ensuring that the student's mask has strong target region connectivity.

Implicit Relations Knowledge Distillation +2

Paper
Add Code

Two-pass Endpoint Detection for Speech Recognition

no code implementations • 17 Jan 2024 • Anirudh Raju, Aparna Khare, Di He, Ilya Sklyar, Long Chen, Sam Alptekin, Viet Anh Trinh, Zhe Zhang, Colin Vaz, Venkatesh Ravichandran, Roland Maas, Ariya Rastrow

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands.

speech-recognition Speech Recognition

Paper
Add Code

Multiperson Detection and Vital-Sign Sensing Empowered by Space-Time-Coding RISs

no code implementations • 15 Jan 2024 • Xinyu Li, Jian Wei You, Ze Gu, Qian Ma, Jingyuan Zhang, Long Chen, Tie Jun Cui

Passive human sensing using wireless signals has attracted increasing attention due to its superiorities of non-contact and robustness in various lighting conditions.

Human Detection

Paper
Add Code

LingoQA: Video Question Answering for Autonomous Driving

2 code implementations • 21 Dec 2023 • Ana-Maria Marcu, Long Chen, Jan Hünermann, Alice Karnsund, Benoit Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski

To fill this gap, we introduce LingoQA, a benchmark specifically for autonomous driving Video QA.

Autonomous Driving Decision Making +3

301

Paper
Code

Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models

1 code implementation • 9 Dec 2023 • Hongzhan Lin, Ziyang Luo, Jing Ma, Long Chen

The age of social media is rife with memes.

Multimodal Reasoning

Paper
Code

OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline

1 code implementation • 1 Dec 2023 • Xianda Guo, Juntao Lu, Chenming Zhang, Yiqi Wang, Yiqun Duan, Tian Yang, Zheng Zhu, Long Chen

Based on OpenStereo, we conducted experiments and have achieved or surpassed the performance metrics reported in the original paper.

Autonomous Driving Autonomous Navigation +1

240

Paper
Code

Automatic Detection of Alzheimer's Disease with Multi-Modal Fusion of Clinical MRI Scans

no code implementations • 30 Nov 2023 • Long Chen, Liben Chen, Binfeng Xu, Wenxin Zhang, Narges Razavian

Notably, literature on the application of deep learning in the automatic detection of the disease has been proliferating.

Paper
Add Code

DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism

no code implementations • 25 Nov 2023 • Zhen Wang, Xinyun Jiang, Jun Xiao, Tao Chen, Long Chen

The denoising process involves the explicit predictions of edit operations and corresponding content words, refining reference captions through iterative step-wise editing.

Caption Generation Denoising +1

Paper
Add Code

Compositional Zero-shot Learning via Progressive Language-based Observations

no code implementations • 23 Nov 2023 • Lin Li, Guikun Chen, Jun Xiao, Long Chen

Compositional zero-shot learning aims to recognize unseen state-object compositions by leveraging known primitives (state and object) during training.

Compositional Zero-Shot Learning

Paper
Add Code

Passive Human Sensing Enhanced by Reconfigurable Intelligent Surface: Opportunities and Challenges

no code implementations • 14 Nov 2023 • Xinyu Li, Jian Wei You, Ze Gu, Qian Ma, Long Chen, Jingyuan Zhang, Shi Jin, Tie Jun Cui

Reconfigurable intelligent surfaces (RISs) have flexible and exceptional performance in manipulating electromagnetic waves and customizing wireless channels.

Activity Recognition

Paper
Add Code

Video Referring Expression Comprehension via Transformer with Content-conditioned Query

no code implementations • 25 Oct 2023 • Ji Jiang, Meng Cao, Tengtao Song, Long Chen, Yi Wang, Yuexian Zou

Video Referring Expression Comprehension (REC) aims to localize a target object in videos based on the queried natural language.

Referring Expression Referring Expression Comprehension +1

Paper
Add Code

Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond

no code implementations • 23 Oct 2023 • Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

Vision-language (VL) understanding tasks evaluate models' comprehension of complex visual scenes through multiple-choice questions.

counterfactual Multiple-choice +2

Paper
Add Code

V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric Heterogenous Distillation Network

1 code implementation • 10 Oct 2023 • Caizhen He, Hai Wang, Long Chen, Tong Luo, Yingfeng Cai

The V2X-AHD can effectively improve the accuracy of 3D object detection and reduce the number of network parameters, according to this study, which serves as a benchmark for cooperative perception.

Ranked #2 on 3D Object Detection on V2XSet

3D Object Detection object-detection

Paper
Code

On the Cognition of Visual Question Answering Models and Human Intelligence: A Comparative Study

no code implementations • 4 Oct 2023 • Liben Chen, Long Chen, Tian Ellison-Chen, Zhuoyuan Xu

Visual Question Answering (VQA) is a challenging task that requires cross-modal understanding and reasoning of visual image and natural language question.

Question Answering Visual Question Answering

Paper
Add Code

Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

1 code implementation • 3 Oct 2023 • Long Chen, Oleg Sinavski, Jan Hünermann, Alice Karnsund, Andrew James Willmott, Danny Birch, Daniel Maund, Jamie Shotton

Large Language Models (LLMs) have shown promise in the autonomous driving sector, particularly in generalization and interpretability.

Action Generation Autonomous Driving +1

301

Paper
Code

SortedAP: Rethinking evaluation metrics for instance segmentation

1 code implementation • 9 Sep 2023 • Long Chen, Yuli Wu, Johannes Stegmaier, Dorit Merhof

Designing metrics for evaluating instance segmentation revolves around comprehensively considering object detection and segmentation accuracy.

Instance Segmentation Object +4

Paper
Code

Semi-supervised Instance Segmentation with a Learned Shape Prior

no code implementations • 9 Sep 2023 • Long Chen, Weiwen Zhang, Yuli Wu, Martin Strauch, Dorit Merhof

To date, most instance segmentation approaches are based on supervised learning that requires a considerable amount of annotated object contours as training ground truth.

Cell Segmentation Instance Segmentation +4

Paper
Add Code

UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory

1 code implementation • 28 Aug 2023 • Haiwen Diao, Bo Wan, Ying Zhang, Xu Jia, Huchuan Lu, Long Chen

Parameter-efficient transfer learning (PETL), i.e., fine-tuning a small portion of parameters, is an effective strategy for adapting pre-trained models to downstream domains.

Question Answering Retrieval +5

Paper
Code

MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for Long-tailed Semantic Segmentation

no code implementations • 16 Aug 2023 • Junao Shen, Long Chen, Kun Kuang, Fei Wu, Tian Feng, Wei Zhang

The proposed two-stage framework comprises a multi-expert decoder (MED) and a multi-expert output ensemble (MOE).

Segmentation Semantic Segmentation

Paper
Add Code

FusionPlanner: A Multi-task Motion Planner for Mining Trucks via Multi-sensor Fusion

no code implementations • 14 Aug 2023 • Siyu Teng, Luxi Li, Yuchen Li, Xuemin Hu, Lingxi Li, Yunfeng Ai, Long Chen

Firstly, we propose a multi-task motion planning algorithm, called FusionPlanner, for autonomous mining trucks by the multi-sensor fusion method to adapt both lateral and longitudinal control tasks for unmanned transportation.

Motion Planning Scheduling +1

Paper
Add Code

Compositional Feature Augmentation for Unbiased Scene Graph Generation

1 code implementation • ICCV 2023 • Lin Li, Guikun Chen, Jun Xiao, Yi Yang, Chunping Wang, Long Chen

Specifically, we first decompose each relation triplet feature into two components: intrinsic feature and extrinsic feature, which correspond to the intrinsic characteristics and extrinsic contexts of a relation triplet, respectively.

Graph Generation Relation +1

Paper
Code

A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

no code implementations • 18 Jul 2023 • Chaoyang Zhu, Long Chen

By "open-vocabulary", we mean that the models can classify objects beyond pre-defined categories.

Knowledge Distillation object-detection +6

Paper
Add Code

In Defense of Clip-based Video Relation Detection

no code implementations • 18 Jul 2023 • Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Roger Zimmermann

While recent video-based methods utilizing video tubelets have shown promising results, we argue that the effective modeling of spatial and temporal context plays a more significant role than the choice between clip tubelets and video tubelets.

Feature Compression Object Tracking +2

Paper
Add Code

Machine Learning Study of the Extended Drug-target Interaction Network informed by Pain Related Voltage-Gated Sodium Channels

1 code implementation • 11 Jul 2023 • Long Chen, Jian Jiang, Bozheng Dou, Hongsong Feng, Jie Liu, Yueying Zhu, Bengong Zhang, Tianshou Zhou, Guo-Wei Wei

Pain is a significant global health issue, and the current treatment options for pain management have limitations in terms of effectiveness, side effects, and potential for addiction.

Management

Paper
Code

Improving Reference-based Distinctive Image Captioning with Contrastive Rewards

no code implementations • 25 Jun 2023 • Yangjun Mao, Jun Xiao, Dong Zhang, Meng Cao, Jian Shao, Yueting Zhuang, Long Chen

A recent DIC method proposes to generate distinctive captions by comparing the target image with a set of semantic-similar reference images, i.e., reference-based DIC (Ref-DIC).

Benchmarking Contrastive Learning +1

Paper
Add Code

Milestones in Autonomous Driving and Intelligent Vehicles Part II: Perception and Planning

no code implementations • 3 Jun 2023 • Long Chen, Siyu Teng, Bai Li, Xiaoxiang Na, Yuchen Li, Zixuan Li, Jinjun Wang, Dongpu Cao, Nanning Zheng, Fei-Yue Wang

Growing interest in autonomous driving (AD) and intelligent vehicles (IVs) is fueled by their promise for enhanced safety, efficiency, and economic benefits.

Autonomous Driving Ethics

Paper
Add Code

Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs

no code implementations • 29 May 2023 • Mingyang Zhou, Yi R. Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang

Building cross-model intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community.

Chart Question Answering Question Answering +1

Paper
Add Code

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models

1 code implementation • 24 May 2023 • Haoxuan You, Rui Sun, Zhecan Wang, Long Chen, Gengyu Wang, Hammad A. Ayyubi, Kai-Wei Chang, Shih-Fu Chang

Specifically, IdealGPT utilizes an LLM to generate sub-questions, a VLM to provide corresponding sub-answers, and another LLM to reason to achieve the final answer.

Paper
Code

Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models

1 code implementation • NeurIPS 2023 • Lin Li, Jun Xiao, Guikun Chen, Jian Shao, Yueting Zhuang, Long Chen

To dynamically fuse different cues, we further introduce a chain-of-thought method that prompts LLMs to generate reasonable weights for different visual cues.

Relation

Paper
Code

TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding

no code implementations • 19 May 2023 • Chenchi Zhang, Jun Xiao, Lei Chen, Jian Shao, Long Chen

In this paper, we argue that their poor interpretability is attributed to the holistic prompt generation and inference process.

Sentence Visual Grounding

Paper
Add Code

Milestones in Autonomous Driving and Intelligent Vehicles Part I: Control, Computing System Design, Communication, HD Map, Testing, and Human Behaviors

no code implementations • 12 May 2023 • Long Chen, Yuchen Li, Chao Huang, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Siyu Teng, Chen Lv, Jinjun Wang, Dongpu Cao, Nanning Zheng, Fei-Yue Wang

Our work is divided into 3 independent articles and the first part is a Survey of Surveys (SoS) for total technologies of AD and IVs that involves the history, summarizes the milestones, and provides the perspectives, ethics, and future research directions.

Autonomous Driving Ethics

Paper
Add Code

Multi-Prompt with Depth Partitioned Cross-Modal Learning

1 code implementation • 10 May 2023 • Yingjie Tian, Yiqi Wang, Xianda Guo, Zheng Zhu, Long Chen

In recent years, soft prompt learning methods have been proposed to fine-tune large-scale vision-language pre-trained models for various downstream tasks.

Domain Generalization

Paper
Code

How Simulation Helps Autonomous Driving: A Survey of Sim2real, Digital Twins, and Parallel Intelligence

no code implementations • 2 May 2023 • Xuemin Hu, Shen Li, Tingyu Huang, Bo Tang, Rouxing Huai, Long Chen

In general, a large scale of testing in simulation environment is conducted and then the learned driving knowledge is transferred to the real world, so how to adapt driving knowledge learned in simulation to reality becomes a critical issue.

Autonomous Driving

Paper
Add Code

Discrepancy-Guided Reconstruction Learning for Image Forgery Detection

no code implementations • 26 Apr 2023 • Zenan Shi, Haipeng Chen, Long Chen, Dong Zhang

In this paper, we propose a novel image forgery detection paradigm for boosting the model learning capacity on both forgery-sensitive and genuine compact visual patterns.

Image Forgery Detection

Paper
Add Code

Milestones in Autonomous Driving and Intelligent Vehicles: Survey of Surveys

no code implementations • 30 Mar 2023 • Long Chen, Yuchen Li, Chao Huang, Bai Li, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Xiaoxiang Na, Zixuan Li, Siyu Teng, Chen Lv, Jinjun Wang, Dongpu Cao, Nanning Zheng, Fei-Yue Wang

Interest in autonomous driving (AD) and intelligent vehicles (IVs) is growing at a rapid pace due to the convenience, safety, and economic benefits.

Autonomous Driving Ethics

Paper
Add Code

Cross-utterance ASR Rescoring with Graph-based Label Propagation

no code implementations • 27 Mar 2023 • Srinath Tankasala, Long Chen, Andreas Stolcke, Anirudh Raju, Qianli Deng, Chander Chandak, Aparna Khare, Roland Maas, Venkatesh Ravichandran

We propose a novel approach for ASR N-best hypothesis rescoring with graph-based label propagation by leveraging cross-utterance acoustic similarity.

Fairness Language Modelling

Paper
Add Code

Decomposed Prototype Learning for Few-Shot Scene Graph Generation

no code implementations • 20 Mar 2023 • Xingchen Li, Long Chen, Guikun Chen, Yinfu Feng, Yi Yang, Jun Xiao

To this end, we propose a novel Decomposed Prototype Learning (DPL).

Few-Shot Learning Graph Generation +1

Paper
Add Code

Motion Planning for Autonomous Driving: The State of the Art and Future Perspectives

no code implementations • 17 Mar 2023 • Siyu Teng, Xuemin Hu, Peng Deng, Bai Li, Yuchen Li, Dongsheng Yang, Yunfeng Ai, Lingxi Li, Zhe XuanYuan, Fenghua Zhu, Long Chen

Intelligent vehicles (IVs) have gained worldwide attention due to their increased convenience, safety advantages, and potential commercial value.

Autonomous Driving Motion Planning

Paper
Add Code

A Simple Baseline for Supervised Surround-view Depth Estimation

no code implementations • 14 Mar 2023 • Xianda Guo, Wenjie Yuan, Yunpeng Zhang, Tian Yang, Chenming Zhang, Zheng Zhu, Long Chen

The former is achieved by the self-attention module within each view, while the latter is realized by the adjacent attention module, which computes the attention across multi-cameras to exchange the multi-scale representations across surround-view feature maps.

Autonomous Driving Monocular Depth Estimation

Paper
Add Code

CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention

1 code implementation • 13 Mar 2023 • Wenxiao Wang, Wei Chen, Qibo Qiu, Long Chen, Boxi Wu, Binbin Lin, Xiaofei He, Wei Liu

On the one hand, CEL blends each token with multiple patches of different scales, providing the self-attention module itself with cross-scale features.

Image Classification Instance Segmentation +3

317

Paper
Code

Learning Combinatorial Prompts for Universal Controllable Image Captioning

no code implementations • 11 Mar 2023 • Zhen Wang, Jun Xiao, Yueting Zhuang, Fei Gao, Jian Shao, Long Chen

To this end, we propose a novel prompt-based framework for CIC by learning Combinatorial Prompts, dubbed as ComPro.

controllable image captioning Language Modelling +1

Paper
Add Code

Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection

1 code implementation • 1 Feb 2023 • Kaifeng Gao, Long Chen, Hanwang Zhang, Jun Xiao, Qianru Sun

Without bells and whistles, our RePro achieves a new state-of-the-art performance on two VidVRD benchmarks of not only the base training object and predicate categories, but also the unseen ones.

Object Relation +1

Paper
Code

Iterative Proposal Refinement for Weakly-Supervised Video Grounding

no code implementations • CVPR 2023 • Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang

Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.

Sentence Video Grounding

Paper
Add Code

TempCLR: Temporal Alignment Representation with Contrastive Learning

1 code implementation • 28 Dec 2022 • Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang

For long videos, given a paragraph of description where the sentences describe different segments of the video, by matching all sentence-clip pairs, the paragraph and the full video are aligned implicitly.

Ranked #2 on Long Video Retrieval (Background Removed) on YouCook2

Contrastive Learning Dynamic Time Warping +7

Paper
Code

MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding

no code implementations • 26 Dec 2022 • Wei Ji, Long Chen, Yinwei Wei, Yiming Wu, Tat-Seng Chua

In this work, we propose a novel multi-resolution temporal video sentence grounding network: MRTNet, which consists of a multi-modal feature encoder, a Multi-Resolution Temporal (MRT) module, and a predictor module.

Descriptive Sentence

Paper
Add Code

Short term prediction of demand for ride hailing services: A deep learning approach

no code implementations • 7 Dec 2022 • Long Chen, Piyushimita Thakuriah, Konstantinos Ampountolas

UberNet employs a multivariate framework that utilises a number of temporal and spatial features that have been found in the literature to explain demand for ride-hailing services.

Paper
Add Code

Line Drawing Guided Progressive Inpainting of Mural Damages

1 code implementation • 12 Nov 2022 • Luxi Li, Qin Zou, Fan Zhang, Hongkai Yu, Long Chen, Chengfang Song, Xianfeng Huang, Xiaoguang Wang

Mural image inpainting refers to repairing the damage or missing areas in a mural image to restore the visual appearance.

Image Inpainting

Paper
Code

Sequential Transformer for End-to-End Person Search

no code implementations • 6 Nov 2022 • Long Chen, Jinhua Xu

Person Search aims to simultaneously localize and recognize a target person from realistic and uncropped gallery images.

Human Detection Person Re-Identification +1

Paper
Add Code

Respecting Transfer Gap in Knowledge Distillation

no code implementations • 23 Oct 2022 • Yulei Niu, Long Chen, Chang Zhou, Hanwang Zhang

The network response serves as additional supervision to formulate the machine domain, which uses the data collected from the human domain as a transfer set.

Knowledge Distillation

Paper
Add Code

Weakly-Supervised Temporal Article Grounding

1 code implementation • 22 Oct 2022 • Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang

Specifically, given an article and a relevant video, WSAG aims to localize all "groundable" sentences to the video, and these sentences are possibly at different semantic scales.

Natural Language Queries Sentence +1

Paper
Code

Instance Segmentation of Dense and Overlapping Objects via Layering

1 code implementation • 7 Oct 2022 • Long Chen, Yuli Wu, Dorit Merhof

Instance segmentation aims to delineate each individual object of interest in an image.

Instance Segmentation Object +1

Paper
Code

Transformer Meets Boundary Value Inverse Problems

1 code implementation • 29 Sep 2022 • Ruchi Guo, Shuhao Cao, Long Chen

A Transformer-based deep direct sampling method is proposed for electrical impedance tomography, a well-known severely ill-posed nonlinear boundary value inverse problem.

Paper
Code

Cross-Skeleton Interaction Graph Aggregation Network for Representation Learning of Mouse Social Behaviour

no code implementations • 7 Aug 2022 • Feixiang Zhou, Xinyu Yang, Fang Chen, Long Chen, Zheheng Jiang, Hui Zhu, Reiko Heckel, Haikuan Wang, Minrui Fei, Huiyu Zhou

Furthermore, we design a novel Interaction-Aware Transformer (IAT) to dynamically learn the graph-level representation of social behaviours and update the node-level representation, guided by our proposed interaction-aware self-attention mechanism.

Representation Learning Self-Supervised Learning

Paper
Add Code

Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation

no code implementations • 7 Aug 2022 • Lin Li, Long Chen, Hanrong Shi, Wenxiao Wang, Jian Shao, Yi Yang, Jun Xiao

To this end, we propose a novel model-agnostic Label Semantic Knowledge Distillation (LS-KD) for unbiased SGG.

Graph Generation Knowledge Distillation +3

Paper
Add Code

Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation

1 code implementation • 3 Aug 2022 • Xingchen Li, Long Chen, Wenbo Ma, Yi Yang, Jun Xiao

However, we argue that most existing WSSGG works only focus on object-consistency, which means the grounded regions should have the same object category label as text entities.

Graph Generation Object +1

Paper
Code

Rethinking the Evaluation of Unbiased Scene Graph Generation

no code implementations • 3 Aug 2022 • Xingchen Li, Long Chen, Jian Shao, Shaoning Xiao, Songyang Zhang, Jun Xiao

Current Scene Graph Generation (SGG) methods tend to predict frequent predicate categories and fail to recognize rare ones due to the severe imbalanced distribution of predicates.

Graph Generation Unbiased Scene Graph Generation

Paper
Add Code

A Transformer-based Generative Adversarial Network for Brain Tumor Segmentation

no code implementations • 28 Jul 2022 • Liqun Huang, Long Chen, Baihai Zhang, Senchun Chai

Our architecture consists of a generator and a discriminator, which are trained in min-max game progress.

Brain Tumor Segmentation Generative Adversarial Network +3

Paper
Add Code

NICEST: Noisy Label Correction and Training for Robust Scene Graph Generation

no code implementations • 27 Jul 2022 • Lin Li, Long Chen, Hanrong Shi, Hanwang Zhang, Yi Yang, Wei Liu, Jun Xiao

To this end, we propose a novel NoIsy label CorrEction and Sample Training strategy for SGG: NICEST.

Graph Generation Knowledge Distillation +1

Paper
Add Code

Rethinking the Reference-based Distinctive Image Captioning

1 code implementation • 22 Jul 2022 • Yangjun Mao, Long Chen, Zhihong Jiang, Dong Zhang, Zhimeng Zhang, Jian Shao, Jun Xiao

Unfortunately, reference images used by existing Ref-DIC works are easy to distinguish: these reference images only resemble the target image at scene-level and have few common objects, such that a Ref-DIC model can trivially generate distinctive captions even without considering the reference images.

Attribute Benchmarking +1

Paper
Code

Correspondence Matters for Video Referring Expression Comprehension

1 code implementation • 21 Jul 2022 • Meng Cao, Ji Jiang, Long Chen, Yuexian Zou

Extensive experiments demonstrate that our DCNet achieves state-of-the-art performance on both video and image REC benchmarks.

Contrastive Learning Referring Expression +3

Paper
Code

Explicit Image Caption Editing

1 code implementation • 20 Jul 2022 • Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, Jun Xiao

Given an image and a reference caption, the image caption editing task aims to correct the misalignment errors and generate a refined caption.

Sentence

Paper
Code

Balancing the trade-off between cost and reliability for wireless sensor networks: a multi-objective optimized deployment method

1 code implementation • 19 Jul 2022 • Long Chen, Yingying Xu, Fangyi Xu, Qian Hu, Zhenzhou Tang

In addition, this work fully considers the heterogeneity of SNs (i.e., differentiated sensing range and deployment cost) and three-dimensional (3-D) deployment scenarios.

Multiobjective Optimization

Paper
Code

Rethinking Data Augmentation for Robust Visual Question Answering

1 code implementation • 18 Jul 2022 • Long Chen, Yuhang Zheng, Jun Xiao

Unfortunately, to guarantee augmented samples have reasonable ground-truth answers, they manually design a set of heuristic rules for several question types, which extremely limits its generalization abilities.

Data Augmentation Knowledge Distillation +2

Paper
Code

Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification

no code implementations • 8 Jul 2022 • Long Chen, Yixiong Meng, Venkatesh Ravichandran, Andreas Stolcke

Speaker identification (SID) in the household scenario (e.g., for smart speakers) is an important but challenging problem due to the limited number of labeled (enrollment) utterances, confusable voices, and demographic imbalances.

Fairness Speaker Identification +1

Paper
Add Code

Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities

no code implementations • 14 Jun 2022 • Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Long Chen, Yulei Niu, Xudong Lin, Xuande Feng, Jaywon Koo, Sounak Ray, Shih-Fu Chang

To support research on this task, we introduce the Multimodal Hierarchical Events (MultiHiEve) dataset.

Paper
Add Code

The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation

1 code implementation • CVPR 2022 • Lin Li, Long Chen, Yifeng Huang, Zhimeng Zhang, Songyang Zhang, Jun Xiao

Then, in Pos-NSD, we use a clustering-based algorithm to divide all positive samples into multiple sets, and treat the samples in the noisiest set as noisy positive samples.

Graph Generation Out-of-Distribution Detection +2

Paper
Code

The scope for AI-augmented interpretation of building blueprints in commercial and industrial property insurance

no code implementations • 29 Apr 2022 • Long Chen, Mao Ye, Alistair Milne, John Hillier, Frances Oglesby

This report, commissioned by the WTW research network, investigates the use of AI in property risk assessment.

BIG-bench Machine Learning

Paper
Add Code

Rethinking Multi-Modal Alignment in Video Question Answering from Feature and Sample Perspectives

no code implementations • 25 Apr 2022 • Shaoning Xiao, Long Chen, Kaifeng Gao, Zhao Wang, Yi Yang, Zhimeng Zhang, Jun Xiao

From the view of feature, we break down the video into trajectories and first leverage trajectory feature in VideoQA to enhance the alignment between two modalities.

Question Answering Video Question Answering

Paper
Add Code

Proximal Implicit ODE Solvers for Accelerating Learning Neural ODEs

no code implementations • 19 Apr 2022 • Justin Baker, Hedi Xia, Yiwei Wang, Elena Cherkaev, Akil Narayan, Long Chen, Jack Xin, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

Learning neural ODEs often requires solving very stiff ODE systems, primarily using explicit adaptive step size ODE solvers.

Computational Efficiency

Paper
Add Code

Learning 3D Semantics from Pose-Noisy 2D Images with Hierarchical Full Attention Network

1 code implementation • 17 Apr 2022 • Yuhang He, Lin Chen, Junkun Xie, Long Chen

This motivates us to conduct a "task transfer" paradigm so that 3D semantic segmentation benefits from aggregating 2D semantic cues, albeit pose noises are contained in 2D image observations.

2D Semantic Segmentation 3D Semantic Segmentation

Paper
Code

Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting

no code implementations • 16 Apr 2022 • Guangxing Han, Long Chen, Jiawei Ma, Shiyuan Huang, Rama Chellappa, Shih-Fu Chang

Our approach is motivated by the high-level conceptual similarity of (metric-based) meta-learning and prompt-based learning to learn generalizable few-shot and zero-shot object detection models respectively without fine-tuning.

Few-Shot Learning Few-Shot Object Detection +3

Paper
Add Code

Few-Shot Object Detection with Fully Cross-Transformer

1 code implementation • CVPR 2022 • Guangxing Han, Jiawei Ma, Shiyuan Huang, Long Chen, Shih-Fu Chang

Inspired by the recent work on vision transformers and vision-language transformers, we propose a novel Fully Cross-Transformer based model (FCT) for FSOD by incorporating cross-transformer into both the feature backbone and detection head.

Few-Shot Object Detection Metric Learning +2

Paper
Code

SATr: Slice Attention with Transformer for Universal Lesion Detection

no code implementations • 13 Mar 2022 • Han Li, Long Chen, Hu Han, S. Kevin Zhou

Universal Lesion Detection (ULD) in computed tomography plays an essential role in computer-aided diagnosis.

Lesion Detection

Paper
Add Code

A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach

no code implementations • 10 Mar 2022 • Xiaohan Lan, Yitian Yuan, Xin Wang, Long Chen, Zhi Wang, Lin Ma, Wenwu Zhu

New benchmarking results indicate that our proposed evaluation protocols can better monitor the research progress.

Benchmarking Sentence +1

Paper
Add Code

openFEAT: Improving Speaker Identification by Open-set Few-shot Embedding Adaptation with Transformer

no code implementations • 24 Feb 2022 • Kishan K C, Zhenning Tan, Long Chen, Minho Jin, Eunjung Han, Andreas Stolcke, Chul Lee

Household speaker identification with few enrollment utterances is an important yet challenging problem, especially when household members share similar voice characteristics and room acoustics.

Open Set Learning Speaker Identification

Paper
Add Code

AutoMine: An Unmanned Mine Dataset

no code implementations • CVPR 2022 • Yuchen Li, Zixuan Li, Siyu Teng, Yu Zhang, YuHang Zhou, Yuchang Zhu, Dongpu Cao, Bin Tian, Yunfeng Ai, Zhe XuanYuan, Long Chen

The main contributions of the AutoMine dataset are as follows: 1. The first autonomous driving dataset for perception and localization in mine scenarios.

Autonomous Driving

Paper
Add Code

Rethinking the Two-Stage Framework for Grounded Situation Recognition

1 code implementation • 10 Dec 2021 • Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Tat-Seng Chua

Since each verb is associated with a specific set of semantic roles, all existing GSR methods resort to a two-stage framework: predicting the verb in the first stage and detecting the semantic roles in the second stage.

Ranked #3 on Situation Recognition on imSitu

Grounded Situation Recognition Object Recognition +1

Paper
Code

Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs

1 code implementation • CVPR 2022 • Kaifeng Gao, Long Chen, Yulei Niu, Jian Shao, Jun Xiao

To this end, we propose a new classification-then-grounding framework for VidSGG, which can avoid all the three overlooked drawbacks.

Predicate Classification

Paper
Code

Unified Group Fairness on Federated Learning

no code implementations • 9 Nov 2021 • Fengda Zhang, Kun Kuang, Yuxuan Liu, Long Chen, Chao Wu, Fei Wu, Jiaxun Lu, Yunfeng Shao, Jun Xiao

We validate the advantages of the FMDA-M algorithm with various kinds of distribution shift settings in experiments, and the results show that the FMDA-M algorithm outperforms the existing fair FL algorithms on unified group fairness.

Attribute Fairness +1

Paper
Add Code

High-throughput Phenotyping of Nematode Cysts

no code implementations • 13 Oct 2021 • Long Chen, Matthias Daub, Hans-Georg Luigs, Marcus Jansen, Martin Strauch, Dorit Merhof

The beet cyst nematode (BCN) Heterodera schachtii is a plant pest responsible for crop loss on a global scale.

Instance Segmentation Semantic Segmentation +1

Paper
Add Code

Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering

1 code implementation • 3 Oct 2021 • Long Chen, Yuhang Zheng, Yulei Niu, Hanwang Zhang, Jun Xiao

Specifically, CSST is composed of two parts: Counterfactual Samples Synthesizing (CSS) and Counterfactual Samples Training (CST).

counterfactual Question Answering +1

Paper
Code

Natural Language Video Localization with Learnable Moment Proposals

1 code implementation • EMNLP 2021 • Shaoning Xiao, Long Chen, Jian Shao, Yueting Zhuang, Jun Xiao

Given an untrimmed video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by the query.

Paper
Code

Image Deraining and Denoising Convolutional Neural Network For Autonomous Driving

no code implementations • 15 Sep 2021 • Kaige Wang, Long Chen, Tianming Wang, Qixiang Meng, Huatao Jiang, Lin Chang

Perception plays an important role in reliable decision-making for autonomous vehicles.

Autonomous Vehicles Decision Making +4

Paper
Add Code

On Pursuit of Designing Multi-modal Transformer for Video Grounding

no code implementations • EMNLP 2021 • Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou

Almost all existing video grounding methods fall into two frameworks: 1) Top-down model: It predefines a set of segment candidates and then conducts segment classification and regression.

Sentence Video Grounding

Paper
Add Code

Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation

no code implementations • 3 Sep 2021 • Jiahui Li, Kun Kuang, Lin Li, Long Chen, Songyang Zhang, Jian Shao, Jun Xiao

Deep neural networks have demonstrated remarkable performance in many data-driven and prediction-oriented applications, and sometimes even perform better than humans.

Medical Diagnosis

Paper
Add Code

Video Relation Detection via Tracklet based Visual Transformer

1 code implementation • 19 Aug 2021 • Kaifeng Gao, Long Chen, Yifeng Huang, Jun Xiao

Video Visual Relation Detection (VidVRD) has received significant attention from our community in recent years.

Relation Video Visual Relation Detection

Paper
Code

Deep Motion Prior for Weakly-Supervised Temporal Action Localization

no code implementations • 12 Aug 2021 • Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou

In this paper, we find through analysis that the motion cues behind the optical flow features carry complementary information.

Optical Flow Estimation Weakly-supervised Temporal Action Localization +1

Paper
Add Code

FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

no code implementations • NeurIPS 2021 • Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang

For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.

Language Modelling

Paper
Add Code

CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention

3 code implementations • ICLR 2022 • Wenxiao Wang, Lu Yao, Long Chen, Binbin Lin, Deng Cai, Xiaofei He, Wei Liu

On the one hand, CEL blends each embedding with multiple patches of different scales, providing the self-attention module itself with cross-scale features.

Ranked #42 on Semantic Segmentation on ADE20K val

Image Classification Instance Segmentation +4

317

Paper
Code

Graph-based Label Propagation for Semi-Supervised Speaker Identification

no code implementations • 15 Jun 2021 • Long Chen, Venkatesh Ravichandran, Andreas Stolcke

We show in experiments on the VoxCeleb dataset that this approach makes effective use of unlabeled data and improves speaker identification accuracy compared to two state-of-the-art scoring methods as well as their semi-supervised variants based on pseudo-labels.

Speaker Identification Speaker Recognition

Paper
Add Code

Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning

no code implementations • 1 Jun 2021 • Jiahui Li, Kun Kuang, Baoxiang Wang, Furui Liu, Long Chen, Fei Wu, Jun Xiao

Specifically, Shapley Value and its desired properties are leveraged in deep MARL to credit any combinations of agents, which grants us the capability to estimate the individual credit for each agent.

counterfactual Multi-agent Reinforcement Learning +4

Paper
Add Code

SDNet: mutil-branch for single image deraining using swin

3 code implementations • 31 May 2021 • Fuxiang Tan, YuTing Kong, Yingying Fan, Feng Liu, Daxin Zhou, Hao Zhang, Long Chen, Liang Gao, Yurong Qian

The former implements the basic rain pattern feature extraction, while the latter fuses different features to further extract and process the image features.

Autonomous Driving Single Image Deraining

Paper
Code

Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey

no code implementations • 26 May 2021 • Feifei Shao, Long Chen, Jian Shao, Wei Ji, Shaoning Xiao, Lu Ye, Yueting Zhuang, Jun Xiao

With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention.

Object object-detection +2

Paper
Add Code

SimNet: Learning Reactive Self-driving Simulations from Real-world Observations

1 code implementation • 26 May 2021 • Luca Bergamini, Yawei Ye, Oliver Scheel, Long Chen, Chih Hu, Luca Del Pero, Blazej Osinski, Hugo Grimmett, Peter Ondruska

We train our system directly from 1,000 hours of driving logs and measure both the realism and the reactivity of the simulation, its two key properties.

Paper
Code

What data do we need for training an AV motion planner?

no code implementations • 26 May 2021 • Long Chen, Lukas Platinsky, Stefanie Speichert, Blazej Osinski, Oliver Scheel, Yawei Ye, Hugo Grimmett, Luca Del Pero, Peter Ondruska

If cheaper sensors could be used for collection instead, data availability would go up, which is crucial in a field where data volume requirements are large and availability is small.

Imitation Learning Motion Planning

Paper
Add Code

VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching

no code implementations • 12 May 2021 • Chenchi Zhang, Wenbo Ma, Jun Xiao, Hanwang Zhang, Jian Shao, Yueting Zhuang, Long Chen

In this paper, we argue that these methods overlook an obvious mismatch between the roles of proposals in the two stages: they generate proposals solely based on the detection confidence (i.e., query-agnostic), hoping that the proposals contain all instances mentioned in the text query (i.e., query-aware).

Image-text matching Referring Expression +2

Paper
Add Code

Textual Analysis of Communications in COVID-19 Infected Community on Social Media

no code implementations • 3 May 2021 • YuHan Liu, Yuhan Gao, Zhifan Nan, Long Chen

During the COVID-19 pandemic, people started to discuss pandemic-related topics on social media.

Paper
Add Code

Conditional Training with Bounding Map for Universal Lesion Detection

no code implementations • 23 Mar 2021 • Han Li, Long Chen, Hu Han, S. Kevin Zhou

Universal Lesion Detection (ULD) in computed tomography plays an essential role in computer-aided diagnosis.

Ranked #3 on Medical Object Detection on DeepLesion

Lesion Detection Medical Object Detection

Paper
Add Code

Human-like Controllable Image Captioning with Verb-specific Semantic Roles

1 code implementation • CVPR 2021 • Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu

However, we argue that almost all existing objective control signals have overlooked two indispensable characteristics of an ideal control signal: 1) Event-compatible: all visual contents referred to in a single sentence should be compatible with the described activity.

Caption Generation controllable image captioning +3

Paper
Code

Boundary Proposal Network for Two-Stage Natural Language Video Localization

no code implementations • 15 Mar 2021 • Shaoning Xiao, Long Chen, Songyang Zhang, Wei Ji, Jian Shao, Lu Ye, Jun Xiao

State-of-the-art NLVL methods are almost all one-stage and can typically be grouped into two categories: 1) anchor-based approach: it first pre-defines a series of video segment candidates (e.g., by sliding window), and then does classification for each candidate; 2) anchor-free approach: it directly predicts the probabilities for each video frame as a boundary or intermediate frame inside the positive segment.

Vocal Bursts Valence Prediction

Paper
Add Code

A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric

no code implementations • 22 Jan 2021 • Yitian Yuan, Xiaohan Lan, Xin Wang, Long Chen, Zhi Wang, Wenwu Zhu

All the results demonstrate that the re-organized dataset splits and new metric can better monitor the progress in TSGV.

Benchmarking Sentence +1

Paper
Add Code

The electric dipole moment of the tau lepton revisited

no code implementations • 20 Jan 2021 • Werner Bernreuther, Long Chen, Otto Nachtmann

We reconsider the issue of the search for a nonzero electric dipole form factor (EDM) $d_\tau(s)$ using optimal observables in $\tau^+\tau^-$ production by $e^+ e^-$ collisions in the center-of-mass energy range from the $\tau$-pair threshold to about $\sqrt{s} \sim 15$ GeV.

High Energy Physics - Phenomenology High Energy Physics - Experiment

Paper
Add Code

Class balanced underwater object detection dataset generated by class-wise style augmentation

no code implementations • 20 Jan 2021 • Long Chen, Junyu Dong, Huiyu Zhou

CWSA is a new kind of data augmentation technique which augments the training data for the minority classes by generating various colors, textures and contrasts for the minority classes.

Data Augmentation object-detection +1

Paper
Add Code

Structured Context Enhancement Network for Mouse Pose Estimation

1 code implementation • 1 Dec 2020 • Feixiang Zhou, Zheheng Jiang, Zhihua Liu, Fang Chen, Long Chen, Lei Tong, Zhile Yang, Haikuan Wang, Minrui Fei, Ling Li, Huiyu Zhou

However, quantifying mouse behaviours from videos or images remains a challenging problem, where pose estimation plays an important role in describing mouse behaviours.

Animal Pose Estimation

Paper
Code

Trading Personalization for Accuracy: Data Debugging in Collaborative Filtering

1 code implementation • NeurIPS 2020 • Long Chen, Yuan YAO, Feng Xu, Miao Xu, Hanghang Tong

Collaborative filtering has been widely used in recommender systems.

Collaborative Filtering Recommendation Systems

Paper
Code

$ZH$ production in gluon fusion: two-loop amplitudes with full top quark mass dependence

no code implementations • 24 Nov 2020 • Long Chen, Gudrun Heinrich, Stephen P. Jones, Matthias Kerner, Jonas Klappert, Johannes Schlenk

We present results for the two-loop helicity amplitudes entering the NLO QCD corrections to the production of a Higgs boson in association with a $Z$-boson in gluon fusion.

High Energy Physics - Phenomenology

Paper
Add Code

Lightweight Single-Image Super-Resolution Network with Attentive Auxiliary Feature Learning

1 code implementation • 13 Nov 2020 • Xuehui Wang, Qing Wang, Yuzhi Zhao, Junchi Yan, Lei Fan, Long Chen

In this paper, we develop a computation efficient yet accurate network based on the proposed attentive auxiliary features (A$^2$F) for SISR.

Image Super-Resolution

Paper
Code

Multi-View Adaptive Fusion Network for 3D Object Detection

1 code implementation • 2 Nov 2020 • Guojun Wang, Bin Tian, Yachen Zhang, Long Chen, Dongpu Cao, Jian Wu

3D object detection based on LiDAR-camera fusion is becoming an emerging research theme for autonomous driving.

3D Object Detection Autonomous Driving +3

Paper
Code

SWIPENET: Object detection in noisy underwater images

1 code implementation • 19 Oct 2020 • Long Chen, Feixiang Zhou, Shengke Wang, Junyu Dong, Ning li, Haiping Ma, Xin Wang, Huiyu Zhou

Moreover, inspired by the human education process that drives the learning from easy to hard concepts, we here propose the CMA training paradigm that first trains a clean detector which is free from the influence of noisy data.

Object object-detection +1

Paper
Code

Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework

no code implementations • 10 Oct 2020 • Wenxiao Wang, Minghao Chen, Shuai Zhao, Long Chen, Jinming Hu, Haifeng Liu, Deng Cai, Xiaofei He, Wei Liu

Specifically, it first casts the relationships between a certain model's accuracy and depth/width/resolution into a polynomial regression and then maximizes the polynomial to acquire the optimal values for the three dimensions.

Network Pruning Neural Architecture Search +1

Paper
Add Code

AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

3 code implementations • 15 Sep 2020 • Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Maitreya Suin, Kuldeep Purohit, A. N. Rajagopalan, Xiaochuan Li, Zhiqiang Lang, Jiangtao Nie, Wei Wei, Lei Zhang, Abdul Muqeet, Jiwon Hwang, Subin Yang, JungHeum Kang, Sung-Ho Bae, Yongwoo Kim, Geun-Woo Jeon, Jun-Ho Choi, Jun-Hyuk Kim, Jong-Seok Lee, Steven Marty, Eric Marty, Dongliang Xiong, Siang Chen, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Haicheng Wang, Vineeth Bhaskara, Alex Levinshtein, Stavros Tsogkas, Allan Jepson, Xiangzhen Kong, Tongtong Zhao, Shanshan Zhao, Hrishikesh P. S, Densen Puthussery, Jiji C. V, Nan Nan, Shuai Liu, Jie Cai, Zibo Meng, Jiaming Ding, Chiu Man Ho, Xuehui Wang, Qiong Yan, Yuzhi Zhao, Long Chen, Jiangtao Zhang, Xiaotong Luo, Liang Chen, Yanyun Qu, Long Sun, Wenhao Wang, Zhenbing Liu, Rushi Lan, Rao Muhammad Umer, Christian Micheloni

This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results.

Image Super-Resolution

2,710

Paper
Code

Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding

1 code implementation • 3 Sep 2020 • Long Chen, Wenbo Ma, Jun Xiao, Hanwang Zhang, Shih-Fu Chang

The prevailing framework for solving referring expression grounding is based on a two-stage process: 1) detecting proposals with an object detector and 2) grounding the referent to one of the proposals.

Referring Expression Vocal Bursts Valence Prediction

Paper
Code

Perceptual underwater image enhancement with deep learning and physical priors

no code implementations • 21 Aug 2020 • Long Chen, Zheheng Jiang, Lei Tong, Zhihua Liu, Aite Zhao, Qianni Zhang, Junyu Dong, Huiyu Zhou

Underwater image enhancement, as a pre-processing step to improve the accuracy of the following object detection task, has drawn considerable attention in the field of underwater navigation and ocean exploration.

Image Enhancement Image Generation +2

Paper
Add Code

Defining Digital Quadruplets in the Cyber-Physical-Social Space for Parallel Driving

no code implementations • 26 Jul 2020 • Teng Liu, Yang Xing, Long Chen, Dongpu Cao, Fei-Yue Wang

The objectives of the three virtual digital vehicles are to interact with, guide, simulate, and improve the real vehicles.

Descriptive

Paper
Add Code

Digital Quadruplets for Cyber-Physical-Social Systems based Parallel Driving: From Concept to Applications

no code implementations • 21 Jul 2020 • Teng Liu, Xing Yang, Hong Wang, Xiaolin Tang, Long Chen, Huilong Yu, Fei-Yue Wang

The three virtual vehicles (descriptive, predictive, and prescriptive) dynamically interact with the real one in order to enhance the safety and performance of the real vehicle.

Descriptive

Paper
Add Code

Deep Learning Based Brain Tumor Segmentation: A Survey

1 code implementation • 18 Jul 2020 • Zhihua Liu, Lei Tong, Zheheng Jiang, Long Chen, Feixiang Zhou, Qianni Zhang, Xiangrong Zhang, Yaochu Jin, Huiyu Zhou

Brain tumor segmentation is one of the most challenging problems in medical image analysis.

Brain Tumor Segmentation Image Classification +4

Paper
Code

Comparison of Different Methods for Time Sequence Prediction in Autonomous Vehicles

no code implementations • 16 Jul 2020 • Teng Liu, Bin Tian, Yunfeng Ai, Long Chen, Fei Liu, Dongpu Cao

As a combination of various kinds of technologies, autonomous vehicles could complete a series of driving tasks by themselves, such as perception, decision-making, planning, and control.

Autonomous Vehicles Decision Making +2

Paper
Add Code

CANet: Context Aware Network for 3D Brain Glioma Segmentation

1 code implementation • 15 Jul 2020 • Zhihua Liu, Lei Tong, Long Chen, Feixiang Zhou, Zheheng Jiang, Qianni Zhang, Yinhai Wang, Caifeng Shan, Ling Li, Huiyu Zhou

Automated segmentation of brain glioma plays an active role in diagnosis decision, progression monitoring and surgery planning.

Brain Tumor Segmentation Segmentation +1

Paper
Code

Improving Pixel Embedding Learning through Intermediate Distance Regression Supervision for Instance Segmentation

no code implementations • 13 Jul 2020 • Yuli Wu, Long Chen, Dorit Merhof

A distance regression module is incorporated into our architecture to generate seeds for fast clustering.

Clustering Distance regression +4

Paper
Add Code

CenterNet3D: An Anchor Free Object Detector for Point Cloud

2 code implementations • 13 Jul 2020 • Guojun Wang, Jian Wu, Bin Tian, Siyu Teng, Long Chen, Dongpu Cao

However, because of the inherent sparsity of point clouds, 3D object center points are likely to be in empty space, which makes it difficult to estimate accurate boundaries.

3D Object Detection Autonomous Driving +3

117

Paper
Code

On Connections between Regularizations for Improving DNN Robustness

no code implementations • 4 Jul 2020 • Yiwen Guo, Long Chen, Yurong Chen, Chang-Shui Zhang

This paper analyzes regularization terms proposed recently for improving the adversarial robustness of deep neural networks (DNNs), from a theoretical point of view.

Adversarial Robustness BIG-bench Machine Learning +1

Paper
Add Code

A Benchmark dataset for both underwater image enhancement and underwater object detection

no code implementations • 29 Jun 2020 • Long Chen, Lei Tong, Feixiang Zhou, Zheheng Jiang, Zhenyang Li, Jialin Lv, Junyu Dong, Huiyu Zhou

To investigate how the underwater image enhancement methods influence the following underwater object detection tasks, in this paper, we provide a large-scale underwater object detection dataset with both bounding box annotations and high quality reference images, namely OUC dataset.

Image Enhancement Image Quality Assessment +3

Paper
Add Code

One Thousand and One Hours: Self-driving Motion Prediction Dataset

3 code implementations • 25 Jun 2020 • John Houston, Guido Zuidhof, Luca Bergamini, Yawei Ye, Long Chen, Ashesh Jain, Sammy Omari, Vladimir Iglovikov, Peter Ondruska

Motivated by the impact of large-scale datasets on ML systems we present the largest self-driving dataset for motion prediction to date, containing over 1,000 hours of data.

Autonomous Vehicles Motion Forecasting +2

Paper
Code

Hierarchical Fashion Graph Network for Personalized Outfit Recommendation

1 code implementation • 26 May 2020 • Xingchen Li, Xiang Wang, Xiangnan He, Long Chen, Jun Xiao, Tat-Seng Chua

Fashion outfit recommendation has attracted increasing attention from online shopping services and fashion communities. Distinct from other scenarios (e.g., social networking or content sharing) which recommend a single item (e.g., a friend or picture) to a user, outfit recommendation predicts user preference on a set of well-matched fashion items. Hence, performing high-quality personalized outfit recommendation should satisfy two requirements: 1) good compatibility among the fashion items and 2) consistency with user preference.

Paper
Code

Underwater object detection using Invert Multi-Class Adaboost with deep learning

1 code implementation • 23 May 2020 • Long Chen, Zhihua Liu, Lei Tong, Zheheng Jiang, Shengke Wang, Junyu Dong, Huiyu Zhou

In addition, we propose a novel sample-weighted loss function which can model sample weights for SWIPENet, which uses a novel sample re-weighting algorithm, namely Invert Multi-Class Adaboost (IMA), to reduce the influence of noise on the proposed SWIPENet.

Object object-detection +1

Paper
Code

A CNN Framenwork Based on Line Annotations for Detecting Nematodes in Microscopic Images

no code implementations • 21 Apr 2020 • Long Chen, Martin Strauch, Matthias Daub, Xiaochen Jiang, Marcus Jansen, Hans-Georg Luigs, Susanne Schultz-Kuhlmann, Stefan Krüssel, Dorit Merhof

The endpoints serve to untangle the skeletons from which segmentation masks are reconstructed by estimating the body width at each location along the skeleton.

Paper
Add Code

In the Eyes of the Beholder: Analyzing Social Media Use of Neutral and Controversial Terms for COVID-19

no code implementations • 21 Apr 2020 • Long Chen, Hanjia Lyu, Tongyu Yang, Yu Wang, Jiebo Luo

To model the substantive difference of tweets with controversial terms and those with non-controversial terms, we apply topic modeling and LIWC-based sentiment analysis.

Sentiment Analysis

Paper
Add Code

Instance Segmentation of Biomedical Images with an Object-aware Embedding Learned with Local Constraints

2 code implementations • 21 Apr 2020 • Long Chen, Martin Strauch, Dorit Merhof

The network is trained to output embedding vectors of similar directions for pixels from the same object, while adjacent objects are orthogonal in the embedding space, which effectively avoids the fusion of objects in a crowd.

Cell Segmentation Instance Segmentation +4

Paper
Code

MixNet: Multi-modality Mix Network for Brain Segmentation

1 code implementation • 21 Apr 2020 • Long Chen, Dorit Merhof

Automated brain structure segmentation is important to many clinical quantitative analyses and diagnoses.

Brain Segmentation

Paper
Code

Deep Learning for Image and Point Cloud Fusion in Autonomous Driving: A Review

no code implementations • 10 Apr 2020 • Yaodong Cui, Ren Chen, Wenbo Chu, Long Chen, Daxin Tian, Ying Li, Dongpu Cao

Autonomous vehicles have experienced rapid development in the past few years.

Autonomous Driving Depth Completion +3

Paper
Add Code

Multi-Task Learning via Co-Attentive Sharing for Pedestrian Attribute Recognition

no code implementations • 7 Apr 2020 • Haitian Zeng, Haizhou Ai, Zijie Zhuang, Long Chen

In this paper, we propose a novel Co-Attentive Sharing (CAS) module which extracts discriminative channels and spatial regions for more effective feature sharing in multi-task learning.

Attribute Multi-Task Learning +1

Paper
Add Code

Location-Enabled IoT (LE-IoT): A Survey of Positioning Techniques, Error Sources, and Mitigation

no code implementations • 7 Apr 2020 • You Li, Yuan Zhuang, Xin Hu, Zhouzheng Gao, Jia Hu, Long Chen, Zhe He, Ling Pei, Kejie Chen, Maosong Wang, Xiaoji Niu, Ruizhi Chen, John Thompson, Fadhel Ghannouchi, Naser El-Sheimy

Compared to the related surveys, this paper has a more comprehensive and state-of-the-art review on IoT localization methods, an original review on IoT localization error sources and mitigation, an original review on IoT localization performance evaluation, and a more comprehensive review of IoT localization applications, opportunities, and challenges.

Networking and Internet Architecture Signal Processing

Paper
Add Code

Distinguish Confusing Law Articles for Legal Judgment Prediction

1 code implementation • ACL 2020 • Nuo Xu, Pinghui Wang, Long Chen, Li Pan, Xiaoyan Wang, Junzhou Zhao

Legal Judgment Prediction (LJP) is the task of automatically predicting a law case's judgment results given a text describing its facts, which has excellent prospects in judicial assistance systems and convenient services for the public.

Paper
Code

Counterfactual Samples Synthesizing for Robust Visual Question Answering

2 code implementations • CVPR 2020 • Long Chen, Xin Yan, Jun Xiao, Hanwang Zhang, ShiLiang Pu, Yueting Zhuang

To reduce the language biases, several recent works introduce an auxiliary question-only model to regularize the training of targeted VQA model, and achieve dominating performance on VQA-CP.

Ranked #1 on Visual Question Answering (VQA) on VQA-CP (using extra training data)

counterfactual Question Answering +1

Paper
Code

Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS

2 code implementations • CVPR 2020 • Long Chen, Haizhou Ai, Rui Chen, Zijie Zhuang, Shuang Liu

To further verify the scalability of our method, we propose a new large-scale multi-human dataset with 12 to 28 camera views.

Ranked #9 on 3D Multi-Person Pose Estimation on Campus

3D Multi-Person Pose Estimation 3D Pose Estimation

136

Paper
Code

Transductive Zero-Shot Hashing for Multilabel Image Retrieval

1 code implementation • 17 Nov 2019 • Qin Zou, Zheng Zhang, Ling Cao, Long Chen, Song Wang

Given semantic annotations such as class labels and pairwise similarities of the training data, hashing methods can learn and generate effective and compact binary codes.

Multi-Label Image Retrieval Quantization +1

Paper
Code

DEBUG: A Dense Bottom-Up Grounding Approach for Natural Language Video Localization

no code implementations • IJCNLP 2019 • Chujie Lu, Long Chen, Chilie Tan, Xiaolin Li, Jun Xiao

In this paper, we focus on natural language video localization: localizing (i.e., grounding) a natural language description in a long and untrimmed video sequence.

Paper
Add Code

Learning Lightweight Pedestrian Detector with Hierarchical Knowledge Distillation

no code implementations • 20 Sep 2019 • Rui Chen, Haizhou Ai, Chong Shang, Long Chen, Zijie Zhuang

It remains very challenging to build a pedestrian detection system for real-world applications, which demand both accuracy and speed.

Knowledge Distillation Pedestrian Detection

Paper
Add Code

Extreme Low Resolution Activity Recognition with Confident Spatial-Temporal Attention Transfer

no code implementations • 9 Sep 2019 • Yucai Bai, Qin Zou, Xieyuanli Chen, Lingxi Li, Zhengming Ding, Long Chen

Given that the same activity may be represented by videos in both high resolution (HR) and extreme low resolution (eLR), it is worth studying how to utilize the relevant HR data to improve eLR activity recognition.

Activity Recognition Privacy Preserving +1

Paper
Add Code

Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data

no code implementations • ACL 2019 • Wei Ye, Bo Li, Rui Xie, Zhonghao Sheng, Long Chen, Shikun Zhang

In practical scenarios, relation extraction needs to first identify entity pairs that have a relation and then assign a correct relation class.

Multi-Task Learning named-entity-recognition +7

Paper
Add Code

Detection and Tracking of Multiple Mice Using Part Proposal Networks

no code implementations • 6 Jun 2019 • Zheheng Jiang, Zhihua Liu, Long Chen, Lei Tong, Xiangrong Zhang, Xiangyuan Lan, Danny Crookes, Ming-Hsuan Yang, Huiyu Zhou

The study of mouse social behaviours has been increasingly undertaken in neuroscience research.

Object Tracking

Paper
Add Code

Cost-sensitive Boosting Pruning Trees for depression detection on Twitter

1 code implementation • 2 Jun 2019 • Lei Tong, Zhihua Liu, Zheheng Jiang, Feixiang Zhou, Long Chen, Jialin Lyu, Xiangrong Zhang, Qianni Zhang, Abdul Sadka Senior, Yinhai Wang, Ling Li, Huiyu Zhou

Depression is one of the most common mental health disorders, and a large number of depressed people commit suicide each year.

Depression Detection General Classification

Paper
Code

MR-GNN: Multi-Resolution and Dual Graph Neural Network for Predicting Structured Entity Interactions

2 code implementations • 23 May 2019 • Nuo Xu, Pinghui Wang, Long Chen, Jing Tao, Junzhou Zhao

To resolve these problems, we present MR-GNN, an end-to-end graph neural network with the following features: i) it uses a multi-resolution based architecture to extract node features from different neighborhoods of each node, and ii) it uses dual graph-state long short-term memory networks (LSTMs) to summarize local features of each graph and extract the interaction features between pairwise graphs.

699

Paper
Code

A prescription for projectors to compute helicity amplitudes in D dimensions

no code implementations • 1 Apr 2019 • Long Chen

The usage of these D-dimensional polarized amplitude projectors results in helicity amplitudes that can be expressed solely in terms of external momenta, but that differ from those defined in the existing dimensional regularization schemes.

High Energy Physics - Phenomenology High Energy Physics - Theory

Paper
Add Code

Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks

2 code implementations • 6 Mar 2019 • Qin Zou, Hanwen Jiang, Qiyu Dai, Yuanhao Yue, Long Chen, Qian Wang

Specifically, information of each frame is abstracted by a CNN block, and the CNN features of multiple continuous frames, holding the property of time-series, are then fed into the RNN block for feature learning and lane prediction.

Lane Detection Time Series +1

209

Paper
Code

Monocular Outdoor Semantic Mapping with a Multi-task Network

no code implementations • 17 Jan 2019 • Yucai Bai, Lei Fan, Ziyu Pan, Long Chen

First, with the correlation of underlying information between depth and semantic prediction, a novel multi-task Convolutional Neural Network (CNN) is designed for joint prediction.

3D Reconstruction Autonomous Driving +2

Paper
Add Code

Counterfactual Critic Multi-Agent Training for Scene Graph Generation

no code implementations • ICCV 2019 • Long Chen, Hanwang Zhang, Jun Xiao, Xiangnan He, ShiLiang Pu, Shih-Fu Chang

CMAT is a multi-agent policy gradient method that frames objects as cooperative agents, and then directly maximizes a graph-level metric as the reward.

counterfactual Graph Generation +2

Paper
Add Code

Cross-Resolution Person Re-identification with Deep Antithetical Learning

no code implementations • 24 Oct 2018 • Zijie Zhuang, Haizhou Ai, Long Chen, Chong Shang

One paradigm for dealing with this problem is to use complicated methods to map all images into an artificial image space, which, however, disrupts the natural image distribution and requires heavy image preprocessing.

Person Re-Identification

Paper
Add Code

Real-time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification

3 code implementations • 12 Sep 2018 • Long Chen, Haizhou Ai, Zijie Zhuang, Chong Shang

Online multi-object tracking is a fundamental problem in time-critical video analysis applications.

Ranked #4 on Online Multi-Object Tracking on MOT16

Large-Scale Person Re-Identification Multi-Object Tracking +2

531

Paper
Code

End-to-end driving simulation via angle branched network

no code implementations • 19 May 2018 • Qing Wang, Long Chen, Wei Tian

Imitation learning for end-to-end autonomous driving has drawn attention from academic communities.

Autonomous Driving Imitation Learning +1

Paper
Add Code

Deep Dynamic Boosted Forest

no code implementations • 19 Apr 2018 • Haixin Wang, Xingzhang Ren, Jinan Sun, Wei Ye, Long Chen, Muzhi Yu, Shikun Zhang

Specifically, we propose to measure the quality of each leaf node of every decision tree in the random forest to determine hard examples.

Ensemble Learning

Paper
Add Code

Self-Supervised Monocular Image Depth Learning and Confidence Estimation

no code implementations • 14 Mar 2018 • Long Chen, Wen Tang, Nigel John

Convolutional Neural Networks (CNNs) need large amounts of data with ground truth annotation, which is a challenging problem that has limited the development and fast deployment of CNNs for many computer vision tasks.

Depth Estimation

Paper
Add Code

Context-Aware Mixed Reality: A Framework for Ubiquitous Interaction

1 code implementation • 14 Mar 2018 • Long Chen, Wen Tang, Nigel John, Tao Ruan Wan, Jian Jun Zhang

Mixed Reality (MR) is a powerful interactive technology that yields new types of user experience.

Mixed Reality

Paper
Code

Improved Deep Hashing with Soft Pairwise Similarity for Multi-label Image Retrieval

1 code implementation • 8 Mar 2018 • Zheng Zhang, Qin Zou, Yuewei Lin, Long Chen, Song Wang

In this paper, a new deep hashing method is proposed for multi-label image retrieval by re-defining the pairwise similarity into an instance similarity, where the instance similarity is quantified into a percentage based on the normalized semantic labels.

Deep Hashing Multi-Label Image Retrieval

Paper
Code

Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks

1 code implementation • CVPR 2018 • Long Chen, Hanwang Zhang, Jun Xiao, Wei Liu, Shih-Fu Chang

We propose a novel framework called Semantics-Preserving Adversarial Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test images and their classes are both unseen during training.

General Classification Zero-Shot Learning

Paper
Code

Improving Negative Sampling for Word Representation using Self-embedded Features

no code implementations • 26 Oct 2017 • Long Chen, Fajie Yuan, Joemon M. Jose, Wei-Nan Zhang

Although the word-popularity based negative sampler has shown superb performance in the skip-gram model, the theoretical motivation behind oversampling popular (non-observed) words as negative samples is still not well understood.

Paper
Add Code

Maximum Principle Based Algorithms for Deep Learning

2 code implementations • 26 Oct 2017 • Qianxiao Li, Long Chen, Cheng Tai, Weinan E

The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms.

Paper
Code

Semantic Augmented Reality Environment with Material-Aware Physical Interactions

no code implementations • 3 Aug 2017 • Long Chen, Karl Francis, Wen Tang

In an Augmented Reality (AR) environment, realistic interactions between virtual and real objects play a crucial role in user experience.

Scene Understanding

Paper
Add Code

Real-time Geometry-Aware Augmented Reality in Minimally Invasive Surgery

no code implementations • 3 Aug 2017 • Long Chen, Wen Tang, Nigel W. John

The potential of Augmented Reality (AR) technology to assist minimally invasive surgeries (MIS) lies in its computational performance and accuracy in dealing with challenging MIS scenes.

Stereo Matching Stereo Matching Hand +1

Paper
Add Code

Recent Developments and Future Challenges in Medical Mixed Reality

no code implementations • 3 Aug 2017 • Long Chen, Thomas Day, Wen Tang, Nigel W. John

Mixed Reality (MR) is of increasing interest within technology-driven modern medicine but is not yet used in everyday practice.

Classification General Classification +1

Paper
Add Code

Video Question Answering via Attribute-Augmented Attention Network Learning

no code implementations • 20 Jul 2017 • Yunan Ye, Zhou Zhao, Yimeng Li, Long Chen, Jun Xiao, Yueting Zhuang

Video Question Answering is a challenging problem in visual information retrieval, which provides the answer to the referenced video content according to the question.

Attribute Information Retrieval +6

Paper
Add Code

Planecell: Representing the 3D Space with Planes

no code implementations • 30 Mar 2017 • Lei Fan, Ziyu Pan, Long Chen, Kai Huang

Reconstruction based on the stereo camera has received considerable attention recently, but two particular challenges still remain.

Image Segmentation Semantic Segmentation

Paper
Add Code

Augmented Reality for Depth Cues in Monocular Minimally Invasive Surgery

no code implementations • 1 Mar 2017 • Long Chen, Wen Tang, Nigel W. John, Tao Ruan Wan, Jian Jun Zhang

In vivo laparoscopic videos used in the tests have demonstrated the robustness and accuracy of our proposed framework on both camera tracking and surface reconstruction, illustrating the potential of our algorithm for depth augmentation and depth-corrected augmented reality in MIS with monocular endoscopes.

Simultaneous Localization and Mapping Surface Reconstruction

Paper
Add Code

Cascade one-vs-rest detection network for fine-grained recognition without part annotations

no code implementations • 28 Feb 2017 • Long Chen, Junyu Dong, Shengke Wang, Kin-Man Lam, Muwei Jian, Hua Zhang, Xiaochun Cao

To bridge this gap, we introduce a cascaded structure to eliminate background and exploit a one-vs-rest loss to capture more minute variances among different subordinate categories.

Object

Paper
Add Code

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

2 code implementations • CVPR 2017 • Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, Tat-Seng Chua

Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image.

Image Captioning Sentence

207

Paper
Code

Who Leads the Clothing Fashion: Style, Color, or Texture? A Computational Study

no code implementations • 26 Aug 2016 • Qin Zou, Zheng Zhang, Qian Wang, Qingquan Li, Long Chen, Song Wang

Specifically, a classification-based model is proposed to quantify the influence of different visual stimuli, in which each visual stimulus's influence is quantified by its corresponding accuracy in fashion classification.

General Classification

Paper
Add Code

Learning Bilingual Sentiment Word Embeddings for Cross-language Sentiment Classification

no code implementations • IJCNLP 2015 • Huiwei Zhou, Long Chen, Fulin Shi, Degen Huang

Classification General Classification +4

Paper
Add Code

Using Collocations and K-means Clustering to Improve the N-pos Model for Japanese IME

no code implementations • WS 2012 • Long Chen, Xianchao Wu, Jingzhou He

Clustering Language Modelling +1

Paper
Add Code
