Search Results for author: Ming Jiang

Found 48 papers, 20 papers with code

SALICON: Saliency in Context

no code implementations CVPR 2015 Ming Jiang, Shengsheng Huang, Juanyong Duan, Qi Zhao

Saliency in Context (SALICON) is an ongoing effort that aims at understanding and predicting visual attention.

Saliency Prediction

Says Who...? Identification of Expert versus Layman Critics' Reviews of Documentary Films

no code implementations COLING 2016 Ming Jiang, Jana Diesner

We extend classic review mining work by building a binary classifier that predicts whether a review of a documentary film was written by an expert or a layman with 90.70% accuracy (F1 score), and compare the characteristics of the predicted classes.

Decision Making Recommendation Systems

Learning Visual Attention to Identify People With Autism Spectrum Disorder

no code implementations ICCV 2017 Ming Jiang, Qi Zhao

This paper presents a novel method for quantitative and objective diagnoses of Autism Spectrum Disorder (ASD) using eye tracking and deep neural networks.

Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy

no code implementations CVPR 2018 Guanglu Song, Yu Liu, Ming Jiang, Yujie Wang, Junjie Yan, Biao Leng

Fully convolutional networks (FCNs) have dominated face detection for several years thanks to their inherent ability to perform sliding-window search with shared kernels, which eliminates redundant computation, and most recent state-of-the-art methods such as Faster-RCNN, SSD, YOLO and FPN use an FCN as their backbone.

Face Detection Philosophy +1

Emotional Attention: A Study of Image Sentiment and Visual Attention

no code implementations CVPR 2018 Shaojing Fan, Zhiqi Shen, Ming Jiang, Bryan L. Koenig, Juan Xu, Mohan S. Kankanhalli, Qi Zhao

In this paper, we present the first study to focus on the relation between emotional properties of an image and visual attention.

Saliency Prediction

A Neural Network Aided Approach for LDPC Coded DCO-OFDM with Clipping Distortion

no code implementations 4 Sep 2018 Yuan He, Ming Jiang, Chunming Zhao

In this paper, a neural network-aided bit-interleaved coded modulation (NN-BICM) receiver is designed to mitigate the nonlinear clipping distortion in LDPC coded direct current-biased optical orthogonal frequency division multiplexing (DCO-OFDM) systems.

Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal?

1 code implementation IEEE Transactions on Multimedia 2018 Dingquan Li, Tingting Jiang, Weisi Lin, Ming Jiang

The proposed method, SFA, is compared with nine representative blur-specific NR-IQA methods, two general-purpose NR-IQA methods, and two extra full-reference IQA methods on Gaussian blur images (with and without Gaussian noise/JPEG compression) and realistic blur images from multiple databases, including LIVE, TID2008, TID2013, MLIVE1, MLIVE2, BID, and CLIVE.

Blind Image Quality Assessment Image Classification +3

Exploiting High-Level Semantics for No-Reference Image Quality Assessment of Realistic Blur Images

1 code implementation 18 Oct 2018 Dingquan Li, Tingting Jiang, Ming Jiang

To guarantee a satisfying Quality of Experience (QoE) for consumers, it is required to measure image quality efficiently and reliably.

Blind Image Quality Assessment Image Quality Estimation +1

Quality Assessment for Tone-Mapped HDR Images Using Multi-Scale and Multi-Layer Information

1 code implementation 19 Oct 2018 Qin He, Dingquan Li, Tingting Jiang, Ming Jiang

So we propose a new no-reference method of tone-mapped image quality assessment based on multi-scale and multi-layer features that are extracted from a pre-trained deep convolutional neural network model.

Blind Image Quality Assessment No-Reference Image Quality Assessment Multimedia

Parsing R-CNN for Instance-Level Human Analysis

2 code implementations CVPR 2019 Lu Yang, Qing Song, Zhihui Wang, Ming Jiang

Models need to distinguish different human instances in the image panel and learn rich features to represent the details of each instance.

Human Part Segmentation Multi-Human Parsing +1

Single Image Blind Deblurring Using Multi-Scale Latent Structure Prior

no code implementations 11 Jun 2019 Yuanchao Bai, Huizhu Jia, Ming Jiang, Xian-Ming Liu, Xiaodong Xie, Wen Gao

Blind image deblurring is a challenging problem in computer vision, which aims to restore both the blur kernel and the latent sharp image from only a blurry observation.

Blind Image Deblurring Image Deblurring +3

Quality Assessment of In-the-Wild Videos

2 code implementations 1 Aug 2019 Dingquan Li, Tingting Jiang, Ming Jiang

We propose an objective no-reference video quality assessment method by integrating both effects into a deep neural network.

Image Classification Video Quality Assessment

REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning

1 code implementation IJCNLP 2019 Ming Jiang, Junjie Hu, Qiuyuan Huang, Lei Zhang, Jana Diesner, Jianfeng Gao

In this study, we present a fine-grained evaluation method REO for automatically measuring the performance of image captioning systems.

Image Captioning

LabelFool: A Trick in the Label Space

no code implementations 25 Sep 2019 Yujia Liu, Tingting Jiang, Ming Jiang

It is widely known that well-designed perturbations can cause state-of-the-art machine learning classifiers to mis-label an image, with sufficiently small perturbations that are imperceptible to the human eyes.

Improving Scholarly Knowledge Representation: Evaluating BERT-based Models for Scientific Relation Classification

no code implementations 13 Apr 2020 Ming Jiang, Jennifer D'Souza, Sören Auer, J. Stephen Downie

With the rapid growth of research publications, there is a vast amount of scholarly knowledge that needs to be organized in digital libraries.

Classification General Classification +2

Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention

no code implementations CVPR 2020 Ming Jiang, Shi Chen, Jinhui Yang, Qi Zhao

The Immersive Question-directed Visual Attention (IQVA) dataset features visual attention and corresponding task performance (i.e., answer correctness).

Decision Making

Leveraging Bottom-Up and Top-Down Attention for Few-Shot Object Detection

no code implementations 23 Jul 2020 Xianyu Chen, Ming Jiang, Qi Zhao

Few-shot object detection aims at detecting objects with few annotated examples, which remains a challenging research problem yet to be explored.

Few-Shot Learning Few-Shot Object Detection +2

Saliency Prediction with External Knowledge

no code implementations 27 Jul 2020 Yifeng Zhang, Ming Jiang, Qi Zhao

At the core of the method is a new Graph Semantic Saliency Network (GraSSNet) that constructs a graph that encodes semantic relationships learned from external knowledge.

Graph Attention Saliency Prediction

AiR: Attention with Reasoning Capability

1 code implementation ECCV 2020 Shi Chen, Ming Jiang, Jinhui Yang, Qi Zhao

In this work, we propose an Attention with Reasoning capability (AiR) framework that uses attention to understand and improve the process leading to task outcomes.

Norm-in-Norm Loss with Faster Convergence and Better Performance for Image Quality Assessment

1 code implementation 10 Aug 2020 Dingquan Li, Tingting Jiang, Ming Jiang

Experiments on two relevant datasets (KonIQ-10k and CLIVE) show that, compared to MAE or MSE loss, the new loss enables the IQA model to converge about 10 times faster and the final model achieves better performance.

Blind Image Quality Assessment No-Reference Image Quality Assessment +1

Unified Quality Assessment of In-the-Wild Videos with Mixed Datasets Training

1 code implementation 9 Nov 2020 Dingquan Li, Tingting Jiang, Ming Jiang

We focus on automatically assessing the quality of in-the-wild videos, which is a challenging problem due to the absence of reference videos, the complexity of distortions, and the diversity of video contents.

Video Quality Assessment

Self-Distillation for Few-Shot Image Captioning

1 code implementation IEEE Winter Conference on Applications of Computer Vision 2021 Xianyu Chen, Ming Jiang, Qi Zhao

We propose an ensemble-based self-distillation method that allows image captioning models to be trained with unpaired images and captions.

Image Captioning

A Portable, Self-Contained Neuroprosthetic Hand with Deep Learning-Based Finger Control

no code implementations 24 Mar 2021 Anh Tuan Nguyen, Markus W. Drealan, Diu Khue Luu, Ming Jiang, Jian Xu, Jonathan Cheng, Qi Zhao, Edward W. Keefer, Zhi Yang

This enables the implementation of the neuroprosthetic hand as a portable and self-contained unit with real-time control of individual finger movements.

Edge-computing

Predicting Human Scanpaths in Visual Question Answering

1 code implementation CVPR 2021 Xianyu Chen, Ming Jiang, Qi Zhao

Conditioned on a task guidance map, the proposed model learns question-specific attention patterns to generate scanpaths.

Question Answering Scanpath prediction +2

Explicit Knowledge Incorporation for Visual Reasoning

no code implementations CVPR 2021 Yifeng Zhang, Ming Jiang, Qi Zhao

Existing explainable and explicit visual reasoning methods only perform reasoning based on visual evidence but do not take into account knowledge beyond what is in the visual scene.

Visual Reasoning

Leveraging Human Attention in Novel Object Captioning

1 code implementation Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence 2021 Xianyu Chen, Ming Jiang, Qi Zhao

Image captioning models depend on training with paired image-text corpora, which poses various challenges in describing images containing novel objects absent from the training data.

Image Captioning Object

A Speaker-aware Parallel Hierarchical Attentive Encoder-Decoder Model for Multi-turn Dialogue Generation

no code implementations 13 Oct 2021 ZiHao Wang, Ming Jiang, Junli Wang

Differing from prior work that solely relies on the content of conversation history to generate a response, we argue that capturing relative social relations among utterances (i.e., generated by either the same speaker or different persons) benefits the machine capturing fine-grained context information from a conversation history to improve context coherence in the generated response.

Dialogue Generation

VisualHow: Multimodal Problem Solving

1 code implementation CVPR 2022 Jinhui Yang, Xianyu Chen, Ming Jiang, Shi Chen, Louis Wang, Qi Zhao

With an overarching goal of developing intelligent systems to assist humans in various daily activities, we propose VisualHow, a free-form and open-ended research that focuses on understanding a real-life problem and deriving its solution by incorporating key components across multiple modalities.

Query and Attention Augmentation for Knowledge-Based Explainable Reasoning

1 code implementation CVPR 2022 Yifeng Zhang, Ming Jiang, Qi Zhao

Explainable visual question answering (VQA) models have been developed with neural modules and query-based knowledge incorporation to answer knowledge-requiring questions.

Question Answering Visual Question Answering

Artificial Intelligence Enables Real-Time and Intuitive Control of Prostheses via Nerve Interface

no code implementations 16 Mar 2022 Diu Khue Luu, Anh Tuan Nguyen, Ming Jiang, Markus W. Drealan, Jian Xu, Tong Wu, Wing-kin Tam, Wenfeng Zhao, Brian Z. H. Lim, Cynthia K. Overstreet, Qi Zhao, Jonathan Cheng, Edward W. Keefer, Zhi Yang

Objective: The next generation prosthetic hand that moves and feels like a real hand requires a robust neural interconnection between the human minds and machines.

Attention in Reasoning: Dataset, Analysis, and Modeling

1 code implementation 20 Apr 2022 Shi Chen, Ming Jiang, Jinhui Yang, Qi Zhao

In this work, we propose an Attention with Reasoning capability (AiR) framework that uses attention to understand and improve the process leading to task outcomes.

Question Answering Visual Question Answering

Cross-Modality Gated Attention Fusion for Multimodal Sentiment Analysis

no code implementations 25 Aug 2022 Ming Jiang, Shaoxiong Ji

Multimodal sentiment analysis is an important research task to predict the sentiment score based on the different modality data from a specific opinion video.

Multimodal Sentiment Analysis

Proportionate Recursive Maximum Correntropy Criterion Adaptive Filtering Algorithms and their Performance Analysis

no code implementations 22 Oct 2022 Zhen Qin, Jun Tao, Le Yang, Ming Jiang

Motivated by the success of our recently proposed proportionate recursive least squares (PRLS) algorithm for sparse system identification, we propose to introduce the proportionate updating (PU) mechanism into the RMCC, leading to two sparsity-aware RMCC algorithms: the proportionate recursive MCC (PRMCC) algorithm and the combinational PRMCC (CPRMCC) algorithm.

Benchmarking LLM-based Machine Translation on Cultural Awareness

no code implementations 23 May 2023 Binwei Yao, Ming Jiang, Diyi Yang, Junjie Hu

Furthermore, we devise a novel evaluation metric to assess the understandability of translations in a reference-free manner by GPT-4.

Benchmarking In-Context Learning +3

Key Gene Mining in Transcriptional Regulation for Specific Biological Processes with Small Sample Sizes Using Multi-network pipeline Transformer

no code implementations 7 Aug 2023 Kerui Huang, Jianhong Tian, Lei Sun, Li Zeng, Peng Xie, Aihua Deng, Ping Mo, Zhibo Zhou, Ming Jiang, Yun Wang, Xiaocheng Jiang

Gene mining is an important topic in the field of life sciences, but traditional machine learning methods cannot consider the regulatory relationships between genes.

Data Augmentation

What Do Deep Saliency Models Learn about Visual Attention?

1 code implementation NeurIPS 2023 Shi Chen, Ming Jiang, Qi Zhao

In recent years, deep saliency models have made significant progress in predicting human visual attention.

Saliency Prediction

CPopQA: Ranking Cultural Concept Popularity by LLMs

no code implementations 14 Nov 2023 Ming Jiang, Mansi Joshi

In this study, we introduce a novel few-shot question-answering task (CPopQA) that examines LLMs' statistical ranking abilities for long-tail cultural concepts (e.g., holidays), with a specific focus on these concepts' popularity in the United States and the United Kingdom, respectively.

Question Answering

Prompting Large Vision-Language Models for Compositional Reasoning

1 code implementation 20 Jan 2024 Timothy Ossowski, Ming Jiang, Junjie Hu

Vision-language models such as CLIP have shown impressive capabilities in encoding texts and images into aligned embeddings, enabling the retrieval of multimodal data in a shared embedding space.

Retrieval Visual Reasoning

GAN Based Near-Field Channel Estimation for Extremely Large-Scale MIMO Systems

no code implementations 27 Feb 2024 Ming Ye, Xiao Liang, Cunhua Pan, Yinfei Xu, Ming Jiang, ChunGuo Li

The mixed line-of-sight (LoS) and non-line-of-sight (NLoS) XL-MIMO near-field channel model is adopted to describe the XL-MIMO near-field channel accurately.

Generative Adversarial Network

Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

1 code implementation 2 Apr 2024 Zihan Wang, Xiangyang Li, Jiahao Yang, Yeqi Liu, Junjie Hu, Ming Jiang, Shuqiang Jiang

Vision-and-language navigation (VLN) enables the agent to navigate to a remote location following the natural language instruction in 3D environments.

Navigate Vision and Language Navigation +1

Beyond Average: Individualized Visual Scanpath Prediction

no code implementations 18 Apr 2024 Xianyu Chen, Ming Jiang, Qi Zhao

Understanding how attention varies across individuals has significant scientific and societal impacts.
