Search Results for author: Jun Xiao

Found 68 papers, 27 papers with code

De-Biased Court's View Generation with Causality

no code implementations EMNLP 2020 Yiquan Wu, Kun Kuang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Jun Xiao, Yueting Zhuang, Luo Si, Fei Wu

Court's view generation is a novel but essential task for legal AI, aiming at improving the interpretability of judgment prediction results and enabling automatic legal document generation.

Text Generation

Learning Combinatorial Prompts for Universal Controllable Image Captioning

no code implementations 11 Mar 2023 Zhen Wang, Jun Xiao, Lei Chen, Fei Gao, Jian Shao, Long Chen

Due to its simplicity, our ComPro can easily be extended to more complex combined control signals by concatenating these prompts.

Image Captioning Language Modelling

Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection

1 code implementation 1 Feb 2023 Kaifeng Gao, Long Chen, Hanwang Zhang, Jun Xiao, Qianru Sun

Without bells and whistles, our RePro achieves a new state-of-the-art performance on two VidVRD benchmarks of not only the base training object and predicate categories, but also the unseen ones.

Video Visual Relation Detection

Further Improving Weakly-supervised Object Localization via Causal Knowledge Distillation

1 code implementation 3 Jan 2023 Feifei Shao, Yawei Luo, Shengjian Wu, Qiyi Li, Fei Gao, Yi Yang, Jun Xiao

Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only the image-level labels.

Knowledge Distillation Weakly-Supervised Object Localization

SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization

no code implementations NIPS 2022 Zheng Chuanyang, Zheyang Li, Kai Zhang, Zhi Yang, Wenming Tan, Jun Xiao, Ye Ren, ShiLiang Pu

In this paper, we introduce joint importance, which integrates essential structural-aware interactions between components for the first time, to perform collaborative pruning.

Object Detection

DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis

no code implementations 13 Aug 2022 Jingliang Li, Zhengda Lu, Yiqun Wang, Ying Wang, Jun Xiao

To mine the information in probability volume, we creatively synthesize the source depths by splattering the probability volume and depth hypotheses to source views.

Multi-scale Sampling and Aggregation Network For High Dynamic Range Imaging

no code implementations 4 Aug 2022 Jun Xiao, Qian Ye, Tianshan Liu, Cong Zhang, Kin-Man Lam

High dynamic range (HDR) imaging is a fundamental problem in image processing, which aims to generate well-exposed images, even in the presence of varying illumination in the scenes.

Online Video Super-Resolution with Convolutional Kernel Bypass Graft

no code implementations 4 Aug 2022 Jun Xiao, Xinyang Jiang, Ningxin Zheng, Huan Yang, Yifan Yang, Yuqing Yang, Dongsheng Li, Kin-Man Lam

Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with "kernel grafts", which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models.

Transfer Learning Video Super-Resolution

Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation

1 code implementation 3 Aug 2022 Xingchen Li, Long Chen, Wenbo Ma, Yi Yang, Jun Xiao

However, we argue that most existing WSSGG works only focus on object-consistency, which means the grounded regions should have the same object category label as text entities.

Graph Generation Scene Graph Generation

Rethinking the Evaluation of Unbiased Scene Graph Generation

no code implementations 3 Aug 2022 Xingchen Li, Long Chen, Jian Shao, Shaoning Xiao, Songyang Zhang, Jun Xiao

Current Scene Graph Generation (SGG) methods tend to predict frequent predicate categories and fail to recognize rare ones due to the severely imbalanced distribution of predicates.

Graph Generation Unbiased Scene Graph Generation

Unified Normalization for Accelerating and Stabilizing Transformers

1 code implementation 2 Aug 2022 Qiming Yang, Kai Zhang, Chaoxiang Lan, Zhi Yang, Zheyang Li, Wenming Tan, Jun Xiao, ShiLiang Pu

To tackle these issues, we propose Unified Normalization (UN), which can speed up inference by being fused with other linear operations and achieve performance on par with LN.

Rethinking the Reference-based Distinctive Image Captioning

1 code implementation 22 Jul 2022 Yangjun Mao, Long Chen, Zhihong Jiang, Dong Zhang, Zhimeng Zhang, Jian Shao, Jun Xiao

Unfortunately, reference images used by existing Ref-DIC works are easy to distinguish: these reference images only resemble the target image at scene-level and have few common objects, such that a Ref-DIC model can trivially generate distinctive captions even without considering the reference images.

Benchmarking Image Captioning

Explicit Image Caption Editing

1 code implementation 20 Jul 2022 Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, Jun Xiao

Given an image and a reference caption, the image caption editing task aims to correct the misalignment errors and generate a refined caption.

Rethinking Data Augmentation for Robust Visual Question Answering

1 code implementation 18 Jul 2022 Long Chen, Yuhang Zheng, Jun Xiao

Unfortunately, to guarantee augmented samples have reasonable ground-truth answers, they manually design a set of heuristic rules for several question types, which severely limits their generalization ability.

Data Augmentation Knowledge Distillation +2

Learning Regularized Multi-Scale Feature Flow for High Dynamic Range Imaging

no code implementations 6 Jul 2022 Qian Ye, Masanori Suganuma, Jun Xiao, Takayuki Okatani

Reconstructing ghosting-free high dynamic range (HDR) images of dynamic scenes from a set of multi-exposure images is a challenging task, especially with large object motion and occlusions, leading to visible artifacts using existing methods.

The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation

1 code implementation CVPR 2022 Lin Li, Long Chen, Yifeng Huang, Zhimeng Zhang, Songyang Zhang, Jun Xiao

Then, in Pos-NSD, we use a clustering-based algorithm to divide all positive samples into multiple sets, and treat the samples in the noisiest set as noisy positive samples.

Graph Generation Out-of-Distribution Detection +2

A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis

no code implementations 31 May 2022 Qi Zhang, Jie Zhou, Qin Chen, Qingchun Bai, Jun Xiao, Liang He

Notably, we propose a Knowledge-Enhanced Adversarial Model (KEAM) with both implicit distributed and explicit structural knowledge to enhance the cross-lingual transfer.

Cross-Lingual Transfer Sentiment Analysis

Rethinking Multi-Modal Alignment in Video Question Answering from Feature and Sample Perspectives

no code implementations 25 Apr 2022 Shaoning Xiao, Long Chen, Kaifeng Gao, Zhao Wang, Yi Yang, Zhimeng Zhang, Jun Xiao

From the feature perspective, we break down the video into trajectories and, for the first time, leverage trajectory features in VideoQA to enhance the alignment between the two modalities.

Question Answering Video Question Answering

Bidirectional Self-Training with Multiple Anisotropic Prototypes for Domain Adaptive Semantic Segmentation

1 code implementation 16 Apr 2022 Yulei Lu, Yawei Luo, Li Zhang, Zheyang Li, Yi Yang, Jun Xiao

A thriving trend for domain adaptive segmentation endeavors to generate high-quality pseudo labels for the target domain and retrain the segmentor on them.

Pseudo Label Semantic Segmentation +2

DepthGAN: GAN-based Depth Generation of Indoor Scenes from Semantic Layouts

no code implementations 22 Mar 2022 Yidi Li, Yiqun Wang, Zhengda Lu, Jun Xiao

Limited by the computational efficiency and accuracy, generating complex 3D scenes remains a challenging problem for existing generation networks.

Active Learning for Point Cloud Semantic Segmentation via Spatial-Structural Diversity Reasoning

no code implementations 25 Feb 2022 Feifei Shao, Yawei Luo, Ping Liu, Jie Chen, Yi Yang, Yulei Lu, Jun Xiao

To deploy SSDR-AL in a more practical scenario, we design a noise-aware iterative labeling strategy to confront the "noisy annotation" problem introduced by the previous "dominant labeling" strategy in superpoints.

Active Learning Semantic Segmentation

Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs

1 code implementation CVPR 2022 Kaifeng Gao, Long Chen, Yulei Niu, Jian Shao, Jun Xiao

To this end, we propose a new classification-then-grounding framework for VidSGG, which can avoid all the three overlooked drawbacks.

Predicate Classification

Relational Graph Learning for Grounded Video Description Generation

no code implementations 2 Dec 2021 Wenqiao Zhang, Xin Eric Wang, Siliang Tang, Haizhou Shi, Haocheng Shi, Jun Xiao, Yueting Zhuang, William Yang Wang

Such a setting can help explain the decisions of captioning models and prevent the model from hallucinating object words in its description.

Graph Learning Video Description

Consensus Graph Representation Learning for Better Grounded Image Captioning

no code implementations 2 Dec 2021 Wenqiao Zhang, Haochen Shi, Siliang Tang, Jun Xiao, Qiang Yu, Yueting Zhuang

Contemporary visual captioning models frequently hallucinate objects that are not actually in a scene, due to visual misclassification or over-reliance on priors, which results in semantic inconsistency between the visual information and the target lexical words.

Graph Representation Learning Image Captioning

Unified Group Fairness on Federated Learning

no code implementations 9 Nov 2021 Fengda Zhang, Kun Kuang, Yuxuan Liu, Long Chen, Chao Wu, Fei Wu, Jiaxun Lu, Yunfeng Shao, Jun Xiao

We validate the advantages of the FMDA-M algorithm with various kinds of distribution shift settings in experiments, and the results show that the FMDA-M algorithm outperforms the existing fair FL algorithms on unified group fairness.

Fairness Federated Learning

Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering

1 code implementation 3 Oct 2021 Long Chen, Yuhang Zheng, Yulei Niu, Hanwang Zhang, Jun Xiao

Specifically, CSST is composed of two parts: Counterfactual Samples Synthesizing (CSS) and Counterfactual Samples Training (CST).

Question Answering Visual Question Answering (VQA)

Natural Language Video Localization with Learnable Moment Proposals

1 code implementation EMNLP 2021 Shaoning Xiao, Long Chen, Jian Shao, Yueting Zhuang, Jun Xiao

Given an untrimmed video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by the query.

Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation

no code implementations 3 Sep 2021 Jiahui Li, Kun Kuang, Lin Li, Long Chen, Songyang Zhang, Jian Shao, Jun Xiao

Deep neural networks have demonstrated remarkable performance in many data-driven and prediction-oriented applications, and sometimes even perform better than humans.

Medical Diagnosis

Video Relation Detection via Tracklet based Visual Transformer

1 code implementation 19 Aug 2021 Kaifeng Gao, Long Chen, Yifeng Huang, Jun Xiao

Video Visual Relation Detection (VidVRD) has received significant attention from our community over recent years.

Video Visual Relation Detection

Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning

no code implementations 1 Jun 2021 Jiahui Li, Kun Kuang, Baoxiang Wang, Furui Liu, Long Chen, Fei Wu, Jun Xiao

Specifically, Shapley Value and its desired properties are leveraged in deep MARL to credit any combinations of agents, which grants us the capability to estimate the individual credit for each agent.

Multi-agent Reinforcement Learning reinforcement-learning +3
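
The Shapley value named in the entry above can be illustrated with a minimal sketch of the classic definition: each player's average marginal contribution over all join orders. This is not the paper's counterfactual credit method; the function names, the three hypothetical agents "a", "b", "c", and the toy reward rule are assumptions made purely for illustration.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by enumerating every join order (small games only)."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    # Hypothetical team reward: 1 only when agents "a" and "b" cooperate;
    # agent "c" never changes the outcome.
    return 1.0 if {"a", "b"} <= coalition else 0.0


credits = shapley_values(["a", "b", "c"], toy_value)
```

In this toy game the credits of "a" and "b" are equal, "c" receives zero, and the credits sum to the reward of the full team, which are the symmetry, null-player, and efficiency properties usually cited for the Shapley value.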

VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching

no code implementations 12 May 2021 Chenchi Zhang, Wenbo Ma, Jun Xiao, Hanwang Zhang, Jian Shao, Yueting Zhuang, Long Chen

In this paper, we argue that these methods overlook an obvious mismatch between the roles of proposals in the two stages: they generate proposals solely based on the detection confidence (i.e., query-agnostic), hoping that the proposals contain all instances mentioned in the text query (i.e., query-aware).

Referring Expression Text Matching

Improving Weakly-supervised Object Localization via Causal Intervention

1 code implementation 21 Apr 2021 Feifei Shao, Yawei Luo, Li Zhang, Lu Ye, Siliang Tang, Yi Yang, Jun Xiao

The recently emerged weakly supervised object localization (WSOL) methods can learn to localize an object in the image using only image-level labels.

Weakly-Supervised Object Localization

Efficient Ring-topology Decentralized Federated Learning with Deep Generative Models for Industrial Artificial Intelligent

no code implementations 15 Apr 2021 Zhao Wang, Yifan Hu, Jun Xiao, Chao Wu

A novel ring FL topology as well as a map-reduce based synchronizing method are designed in the proposed RDFL to improve decentralized FL performance and bandwidth utilization.

Federated Learning

Human-like Controllable Image Captioning with Verb-specific Semantic Roles

1 code implementation CVPR 2021 Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu

However, we argue that almost all existing objective control signals have overlooked two indispensable characteristics of an ideal control signal: 1) Event-compatible: all visual contents referred to in a single sentence should be compatible with the described activity.

Image Captioning Semantic Role Labeling

Boundary Proposal Network for Two-Stage Natural Language Video Localization

no code implementations 15 Mar 2021 Shaoning Xiao, Long Chen, Songyang Zhang, Wei Ji, Jian Shao, Lu Ye, Jun Xiao

State-of-the-art NLVL methods are almost in one-stage fashion, which can be typically grouped into two categories: 1) anchor-based approach: it first pre-defines a series of video segment candidates (e.g., by sliding window), and then does classification for each candidate; 2) anchor-free approach: it directly predicts the probabilities for each video frame as a boundary or intermediate frame inside the positive segment.

Kinetic Energy Distribution of Fragments for Thermal Neutron-Induced $^{235}$U and $^{239}$Pu Fission Reactions

no code implementations 24 Dec 2020 Xiaojun Sun, Haiyuan Peng, Liying Xie, Kai Zhang, Yan Liang, Yinlu Han, Nengchuan Su, Jie Yan, Jun Xiao, Junjie Sun

(2) Every complementary pair of the primary fission fragments is approximately described as two ellipsoids with large deformation at the scission moment.

Nuclear Theory

ROBY: Evaluating the Robustness of a Deep Model by its Decision Boundaries

no code implementations 18 Dec 2020 Jinyin Chen, Zhen Wang, Haibin Zheng, Jun Xiao, Zhaoyan Ming

This work proposes a generic evaluation metric ROBY, a novel attack-independent robustness measure based on the model's decision boundaries.

GFL: A Decentralized Federated Learning Framework Based On Blockchain

no code implementations 21 Oct 2020 Yifan Hu, YuHang Zhou, Jun Xiao, Chao Wu

Federated learning (FL) is a rapidly growing field and many centralized and decentralized FL frameworks have been proposed.

Data Poisoning Federated Learning

Federated Unsupervised Representation Learning

no code implementations 18 Oct 2020 Fengda Zhang, Kun Kuang, Zhaoyang You, Tao Shen, Jun Xiao, Yin Zhang, Chao Wu, Yueting Zhuang, Xiaolin Li

FURL poses two new challenges: (1) data distribution shift (Non-IID distribution) among clients would make local models focus on different categories, leading to the inconsistency of representation spaces.

Federated Learning Representation Learning

Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding

1 code implementation 3 Sep 2020 Long Chen, Wenbo Ma, Jun Xiao, Hanwang Zhang, Shih-Fu Chang

The prevailing framework for solving referring expression grounding is based on a two-stage process: 1) detecting proposals with an object detector and 2) grounding the referent to one of the proposals.

Referring Expression

Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling

no code implementations 11 Aug 2020 Jiacheng Li, Siliang Tang, Juncheng Li, Jun Xiao, Fei Wu, ShiLiang Pu, Yueting Zhuang

In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting.

Meta-Learning Visual Storytelling

Accurate Lung Nodules Segmentation with Detailed Representation Transfer and Soft Mask Supervision

no code implementations 29 Jul 2020 Changwei Wang, Rongtao Xu, Shibiao Xu, Weiliang Meng, Jun Xiao, Xiaopeng Zhang

Then, a novel Network with detailed representation transfer and Soft Mask supervision (DSNet) is proposed to process the input low-resolution images of lung nodules into high-quality segmentation results.

Computed Tomography (CT) Lesion Segmentation +2

Hierarchical Fashion Graph Network for Personalized Outfit Recommendation

1 code implementation 26 May 2020 Xingchen Li, Xiang Wang, Xiangnan He, Long Chen, Jun Xiao, Tat-Seng Chua

Fashion outfit recommendation has attracted increasing attention from online shopping services and fashion communities. Distinct from other scenarios (e.g., social networking or content sharing) which recommend a single item (e.g., a friend or picture) to a user, outfit recommendation predicts user preference on a set of well-matched fashion items. Hence, performing high-quality personalized outfit recommendation should satisfy two requirements: 1) good compatibility among the fashion items and 2) consistency with user preference.

Counterfactual Samples Synthesizing for Robust Visual Question Answering

2 code implementations CVPR 2020 Long Chen, Xin Yan, Jun Xiao, Hanwang Zhang, ShiLiang Pu, Yueting Zhuang

To reduce the language biases, several recent works introduce an auxiliary question-only model to regularize the training of targeted VQA model, and achieve dominating performance on VQA-CP.

 Ranked #1 on Visual Question Answering (VQA) on VQA-CP (using extra training data)

Question Answering Visual Question Answering (VQA)

Evaluation Framework For Large-scale Federated Learning

1 code implementation 3 Mar 2020 Lifeng Liu, Fengda Zhang, Jun Xiao, Chao Wu

Federated learning is proposed as a machine learning setting to enable distributed edge devices, such as mobile phones, to collaboratively learn a shared prediction model while keeping all the training data on device, which can not only take full advantage of data distributed across millions of nodes to train a good model but also protect data privacy.

Federated Learning

Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States

1 code implementation 9 Feb 2020 Yunan Ye, Hengzhi Pei, Boxin Wang, Pin-Yu Chen, Yada Zhu, Jun Xiao, Bo Li

Our framework aims to address two unique challenges in financial PM: (1) data heterogeneity -- the collected information for each asset is usually diverse, noisy and imbalanced (e.g., news articles); and (2) environment uncertainty -- the financial market is versatile and non-stationary.

Management reinforcement-learning +1

Video Dialog via Progressive Inference and Cross-Transformer

no code implementations IJCNLP 2019 Weike Jin, Zhou Zhao, Mao Gu, Jun Xiao, Furu Wei, Yueting Zhuang

Video dialog is a new and challenging task, which requires the agent to answer questions combining video information with dialog history.

Answer Generation Question Answering +4

DEBUG: A Dense Bottom-Up Grounding Approach for Natural Language Video Localization

no code implementations IJCNLP 2019 Chujie Lu, Long Chen, Chilie Tan, Xiaolin Li, Jun Xiao

In this paper, we focus on natural language video localization: localizing (i.e., grounding) a natural language description in a long and untrimmed video sequence.

Weak Supervision Enhanced Generative Network for Question Generation

no code implementations 1 Jul 2019 Yutong Wang, Jiyuan Zheng, Qijiong Liu, Zhou Zhao, Jun Xiao, Yueting Zhuang

More specifically, we devise a discriminator, Relation Guider, to capture the relations between the whole passage and the associated answer and then the Multi-Interaction mechanism is deployed to transfer the knowledge dynamically for our question generation system.

Question Answering Question Generation +1

Galaxy Learning -- A Position Paper

no code implementations 22 Apr 2019 Chao Wu, Jun Xiao, Gang Huang, Fei Wu

Model training, as well as the communication, is achieved with blockchain and its smart contracts.

BIG-bench Machine Learning

Counterfactual Critic Multi-Agent Training for Scene Graph Generation

no code implementations ICCV 2019 Long Chen, Hanwang Zhang, Jun Xiao, Xiangnan He, ShiLiang Pu, Shih-Fu Chang

CMAT is a multi-agent policy gradient method that frames objects as cooperative agents, and then directly maximizes a graph-level metric as the reward.

Graph Generation Scene Graph Generation +1

Textually Guided Ranking Network for Attentional Image Retweet Modeling

no code implementations 24 Oct 2018 Zhou Zhao, Hanbing Zhan, Lingtao Meng, Jun Xiao, Jun Yu, Min Yang, Fei Wu, Deng Cai

In this paper, we study the problem of image retweet prediction in social media, which predicts the image sharing behavior that the user reposts the image tweets from their followees.

Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks

1 code implementation CVPR 2018 Long Chen, Hanwang Zhang, Jun Xiao, Wei Liu, Shih-Fu Chang

We propose a novel framework called Semantics-Preserving Adversarial Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test images and their classes are both unseen during training.

General Classification Zero-Shot Learning

Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks

6 code implementations 15 Aug 2017 Jun Xiao, Hao Ye, Xiangnan He, Hanwang Zhang, Fei Wu, Tat-Seng Chua

Factorization Machines (FMs) are a supervised learning approach that enhances the linear regression model by incorporating the second-order feature interactions.
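
The second-order feature interactions mentioned above can be illustrated with a minimal sketch of a plain factorization machine score, not the attentional variant this paper proposes. The function names, the toy weights, and the pure-Python layout are assumptions for illustration only. The first function sums the pairwise term directly; the second uses the well-known algebraic identity that computes the same term in time linear in the number of features.

```python
from itertools import combinations


def fm_score_naive(x, w0, w, V):
    """Bias + linear term + sum over pairs i<j of <V[i], V[j]> * x[i] * x[j]."""
    n = len(x)
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for i, j in combinations(range(n), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        pairwise += dot * x[i] * x[j]
    return linear + pairwise


def fm_score_fast(x, w0, w, V):
    """Same score, using 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


# Hypothetical toy parameters: 3 features, 2 latent factors per feature.
x = [1.0, 2.0, 0.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.5, 1.0], [2.0, 2.0]]

naive = fm_score_naive(x, w0, w, V)
fast = fm_score_fast(x, w0, w, V)
```

Both functions return the same score for the toy input. Because each pair's weight is the inner product of two latent vectors rather than a free parameter, every pair is weighted by the same rule.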


Graph-Theoretic Spatiotemporal Context Modeling for Video Saliency Detection

no code implementations 25 Jul 2017 Lina Wei, Fangfang Wang, Xi Li, Fei Wu, Jun Xiao

As a result, a key issue in video saliency detection is how to effectively capture the intrinsic properties of atomic video structures as well as their associated contextual interactions along the spatial and temporal dimensions.

Video Saliency Detection

Video Question Answering via Attribute-Augmented Attention Network Learning

no code implementations 20 Jul 2017 Yunan Ye, Zhou Zhao, Yimeng Li, Long Chen, Jun Xiao, Yueting Zhuang

Video Question Answering is a challenging problem in visual information retrieval, which provides the answer to the referenced video content according to the question.

Information Retrieval Multiple-choice +5

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

2 code implementations CVPR 2017 Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, Tat-Seng Chua

Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image.

Image Captioning

Metric Learning Driven Multi-Task Structured Output Optimization for Robust Keypoint Tracking

no code implementations 4 Dec 2014 Liming Zhao, Xi Li, Jun Xiao, Fei Wu, Yueting Zhuang

As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework.

Metric Learning Object Tracking
