Search Results for author: Ji Zhang

Found 109 papers, 42 papers with code

Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision

no code implementations • EMNLP 2021 • Mieradilijiang Maimaiti, Yang Liu, Yuanhang Zheng, Gang Chen, Kaiyu Huang, Ji Zhang, Huanbo Luan, Maosong Sun

Besides, the robustness of the previous neural methods is limited by the large-scale annotated data.

Chinese Word Segmentation Language Modelling +1

Turn-Level User Satisfaction Estimation in E-commerce Customer Service

no code implementations • ACL (ECNLP) 2021 • Runze Liang, Ryuichi Takanobu, Feng-Lin Li, Ji Zhang, Haiqing Chen, Minlie Huang

To this end, we formalize the turn-level satisfaction estimation as a reinforcement learning problem, in which the model can be optimized with only session-level satisfaction labels.

Incorporating Causal Analysis into Diversified and Logical Response Generation

no code implementations • COLING 2022 • Jiayi Liu, Wei Wei, Zhixuan Chu, Xing Gao, Ji Zhang, Tan Yan, Yulin Kang

Although the Conditional Variational Auto-Encoder (CVAE) model can generate more diversified responses than the traditional Seq2Seq model, the responses often have low relevance with the input words or are illogical with the question.

Response Generation

Continual Few-shot Intent Detection

no code implementations • COLING 2022 • Guodun Li, Yuchen Zhai, Qianglong Chen, Xing Gao, Ji Zhang, Yin Zhang

Intent detection is at the core of task-oriented dialogue systems.

Intent Detection Task-Oriented Dialogue Systems +1

ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy

no code implementations • 21 Mar 2024 • Zonghan Yang, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

In WebShop, the 1-shot performance of the A³T agent matches the human average, and 4 rounds of iterative refinement bring its performance close to that of human experts.

Policy Gradient Methods

RoleInteract: Evaluating the Social Interaction of Role-Playing Agents

1 code implementation • 20 Mar 2024 • Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Xing Gao, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we introduce RoleInteract, the first benchmark designed to systematically evaluate the sociality of role-playing conversational agents at both individual and group levels of social interactions.

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

1 code implementation • 19 Mar 2024 • Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou

In this work, we emphasize the importance of structure information in Visual Document Understanding and propose the Unified Structure Learning to boost the performance of MLLMs.

document understanding Optical Character Recognition (OCR)

From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News

no code implementations • 14 Mar 2024 • YuHan Liu, Xiuying Chen, Xiaoqing Zhang, Xing Gao, Ji Zhang, Rui Yan

Our simulation results uncover patterns in fake news propagation related to topic relevance and individual traits, aligning with real-world observations.

OccFusion: Depth Estimation Free Multi-sensor Fusion for 3D Occupancy Prediction

no code implementations • 8 Mar 2024 • Ji Zhang, Yiran Ding

We introduce OccFusion, a multi-modal fusion method free from depth estimation, and a corresponding point cloud sampling algorithm for dense integration of image features.

Autonomous Driving Depth Estimation +1

Improving Cross-lingual Representation for Semantic Retrieval with Code-switching

no code implementations • 3 Mar 2024 • Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang

Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario.

Question Answering Retrieval +3

Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training

no code implementations • 1 Mar 2024 • Haowei Liu, Yaya Shi, Haiyang Xu, Chunfeng Yuan, Qinghao Ye, Chenliang Li, Ming Yan, Ji Zhang, Fei Huang, Bing Li, Weiming Hu

In vision-language pre-training (VLP), masked image modeling (MIM) has recently been introduced for fine-grained cross-modal alignment.

Representation Learning

Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval

no code implementations • 26 Feb 2024 • Haowei Liu, Yaya Shi, Haiyang Xu, Chunfeng Yuan, Qinghao Ye, Chenliang Li, Ming Yan, Ji Zhang, Fei Huang, Bing Li, Weiming Hu

In this work, we propose the UNIFY framework, which learns lexicon representations to capture fine-grained semantics and combines the strengths of latent and lexicon representations for video-text retrieval.

Retrieval Text Retrieval +1

Budget-Constrained Tool Learning with Planning

1 code implementation • 25 Feb 2024 • Yuanhang Zheng, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Despite intensive efforts devoted to tool learning, the problem of budget-constrained tool learning, which focuses on resolving user queries within a specific budget constraint, has been widely overlooked.

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models

no code implementations • 24 Feb 2024 • Chaoya Jiang, Wei Ye, Mengfan Dong, Hongrui Jia, Haiyang Xu, Ming Yan, Ji Zhang, Shikun Zhang

Large Vision Language Models exhibit remarkable capabilities but struggle with hallucinations, that is, inconsistencies between images and their descriptions.

Hallucination Hallucination Evaluation

PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs

no code implementations • 20 Feb 2024 • An Liu, Zonghan Yang, Zhenhe Zhang, Qingyuan Hu, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

While large language models (LLMs) have demonstrated considerable capabilities across various natural language tasks, they often fall short of the performance achieved by domain-specific state-of-the-art models.

text-classification Text Classification

Model Composition for Multimodal Large Language Models

no code implementations • 20 Feb 2024 • Chi Chen, Yiyang Du, Zheng Fang, Ziyue Wang, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu

In this paper, we propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model.

Meta Ranking: Less Capable Language Models are Capable for Single Response Judgement

1 code implementation • 19 Feb 2024 • Zijun Liu, Boqun Kou, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Although Large Language Models (LLMs) have demonstrated strong performance on a wide range of tasks, they still face reliability challenges such as hallucination.

Hallucination

Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion

1 code implementation • 19 Feb 2024 • Ziyue Wang, Chi Chen, Yiqi Zhu, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu

With the bloom of Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) that incorporate LLMs with pre-trained vision models have recently demonstrated impressive performance across diverse vision-language tasks.

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

1 code implementation • 29 Jan 2024 • Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

To assess the performance of Mobile-Agent, we introduced Mobile-Eval, a benchmark for evaluating mobile device operations.

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

1 code implementation • 14 Jan 2024 • Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang

Each component is implemented by a single LLM that focuses on a specific capability and collaborates with others to accomplish the task.

Language Modelling Large Language Model

Knowledge Distillation for Closed-Source Language Models

no code implementations • 13 Jan 2024 • Hongzhan Chen, Xiaojun Quan, Hehong Chen, Ming Yan, Ji Zhang

The prior estimation aims to derive a prior distribution by utilizing the corpus generated by closed-source language models, while the posterior estimation employs a proxy model to update the prior distribution and derive a posterior distribution.

Knowledge Distillation

TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training

1 code implementation • 14 Dec 2023 • Chaoya Jiang, Wei Ye, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Shikun Zhang

Self-supervised Multi-modal Contrastive Learning (SMCL) remarkably advances modern Vision-Language Pre-training (VLP) models by aligning visual and linguistic modalities.

Contrastive Learning Data Augmentation

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

1 code implementation • 12 Dec 2023 • Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

We first analyzed the representation distribution of textual and visual tokens in MLLM, revealing two important findings: 1) there is a significant gap between textual and visual representations, indicating unsatisfactory cross-modal representation alignment; 2) representations of texts that contain and do not contain hallucinations are entangled, making it challenging to distinguish them.

Ranked #72 on Visual Question Answering on MM-Vet

Contrastive Learning Hallucination +4

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

1 code implementation • 30 Nov 2023 • Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang

In this work, towards a more versatile copilot for academic paper writing, we mainly focus on strengthening the multi-modal diagram analysis ability of Multimodal LLMs.

Language Modelling Large Language Model

Class Gradient Projection For Continual Learning

1 code implementation • 25 Nov 2023 • Cheng Chen, Ji Zhang, Jingkuan Song, Lianli Gao

Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).

Continual Learning Contrastive Learning

AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation

1 code implementation • 13 Nov 2023 • Junyang Wang, Yuhang Wang, Guohai Xu, Jing Zhang, Yukai Gu, Haitao Jia, Jiaqi Wang, Haiyang Xu, Ming Yan, Ji Zhang, Jitao Sang

Despite making significant progress in multi-modal tasks, current Multi-modal Large Language Models (MLLMs) encounter the significant challenge of hallucinations, which may lead to harmful consequences.

Attribute Hallucination +2

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

2 code implementations • 7 Nov 2023 • Qinghao Ye, Haiyang Xu, Jiabo Ye, Ming Yan, Anwen Hu, Haowei Liu, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou

Multi-modal Large Language Models (MLLMs) have demonstrated impressive instruction abilities across various open-ended tasks.

Ranked #11 on Visual Question Answering (VQA) on InfiMM-Eval

Language Modelling Large Language Model +1

CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment

no code implementations • 25 Oct 2023 • Jixiang Hong, Quan Tu, Changyu Chen, Xing Gao, Ji Zhang, Rui Yan

With in-context learning (ICL) as the core of the cycle, the black-box models are able to rank the model-generated responses guided by human-crafted instructions and demonstrations about their preferences.

In-Context Learning Instruction Following +2

MCC-KD: Multi-CoT Consistent Knowledge Distillation

1 code implementation • 23 Oct 2023 • Hongzhan Chen, Siyue Wu, Xiaojun Quan, Rui Wang, Ming Yan, Ji Zhang

Large language models (LLMs) have showcased remarkable capabilities in complex reasoning through chain of thought (CoT) prompting.

Knowledge Distillation Mathematical Reasoning

Improving Seq2Seq Grammatical Error Correction via Decoding Interventions

1 code implementation • 23 Oct 2023 • Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

In this paper, we propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally, and then dynamically influence the choice of the next token.

Ranked #1 on Grammatical Error Correction on MuCGEC

Grammatical Error Correction Language Modelling
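
The decoding-intervention idea described in this entry can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `intervene`, the `weight` parameter, and the way the critic score is added to the generator's log-probability are all assumptions chosen for illustration.

```python
import math


def intervene(candidates, critic, weight=1.0):
    """Pick the next token after combining generator and critic scores.

    candidates: dict mapping token -> generator log-probability.
    critic: callable token -> score in (0, 1]; higher means more appropriate.
    weight: how strongly the critic influences the choice (hypothetical knob).
    """
    best_token, best_score = None, -math.inf
    for token, logprob in candidates.items():
        # Add the critic's log-score to the generator's log-probability.
        score = logprob + weight * math.log(critic(token))
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Toy data: the generator slightly prefers "their", the critic prefers "there".
candidates = {"their": math.log(0.55), "there": math.log(0.45)}
critic_scores = {"their": 0.2, "there": 0.9}

chosen = intervene(candidates, critic_scores.get)
unchanged = intervene(candidates, critic_scores.get, weight=0.0)
```

With the critic switched off (`weight=0.0`) the generator's own preference wins; with it switched on, the critic's judgment changes the selected token.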

UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model

2 code implementations • 8 Oct 2023 • Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang

Text is ubiquitous in our visual world, conveying crucial information, such as in documents, websites, and everyday photographs.

Language Modelling Large Language Model +1

DePT: Decoupled Prompt Tuning

1 code implementation • 14 Sep 2023 • Ji Zhang, Shihan Wu, Lianli Gao, Heng Tao Shen, Jingkuan Song

Specifically, through an in-depth analysis of the learned features of the base and new tasks, we observe that the BNT stems from a channel bias issue, i.e., the vast majority of feature channels are occupied by base-specific knowledge, resulting in the collapse of task-shared knowledge important to new tasks.

Zero-shot Generalization

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

1 code implementation • 2 Sep 2023 • Chenliang Li, Hehong Chen, Ming Yan, Weizhou Shen, Haiyang Xu, Zhikai Wu, Zhicheng Zhang, Wenmeng Zhou, Yingda Chen, Chen Cheng, Hongzhu Shi, Ji Zhang, Fei Huang, Jingren Zhou

Large language models (LLMs) have recently demonstrated remarkable capabilities to comprehend human intentions, engage in reasoning, and design planning-like behavior.

Evaluation and Analysis of Hallucination in Large Vision-Language Models

1 code implementation • 29 Aug 2023 • Junyang Wang, Yiyang Zhou, Guohai Xu, Pengcheng Shi, Chenlin Zhao, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Jihua Zhu, Jitao Sang, Haoyu Tang

In this paper, we propose Hallucination Evaluation based on Large Language Models (HaELM), an LLM-based hallucination evaluation framework.

Hallucination Hallucination Evaluation

From Global to Local: Multi-scale Out-of-distribution Detection

1 code implementation • 20 Aug 2023 • Ji Zhang, Lianli Gao, Bingguang Hao, Hao Huang, Jingkuan Song, HengTao Shen

Out-of-distribution (OOD) detection aims to detect "unknown" data whose labels have not been seen during the in-distribution (ID) training process.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +1

Improving Anomaly Segmentation with Multi-Granularity Cross-Domain Alignment

no code implementations • 16 Aug 2023 • Ji Zhang, Xiao Wu, Zhi-Qi Cheng, Qi He, Wei Li

Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems.

Autonomous Driving Contrastive Learning

CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility

1 code implementation • 19 Jul 2023 • Guohai Xu, Jiayi Liu, Ming Yan, Haotian Xu, Jinghui Si, Zhuoran Zhou, Peng Yi, Xing Gao, Jitao Sang, Rong Zhang, Ji Zhang, Chao Peng, Fei Huang, Jingren Zhou

In this paper, we present CValues, the first Chinese human values evaluation benchmark to measure the alignment ability of LLMs in terms of both safety and responsibility criteria.

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

1 code implementation • 4 Jul 2023 • Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Yuhao Dan, Chenlin Zhao, Guohai Xu, Chenliang Li, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang

Nevertheless, without in-domain training, these models tend to ignore fine-grained OCR features, such as sophisticated tables or large blocks of text, which are essential for OCR-free document understanding.

document understanding Language Modelling +2

DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations

no code implementations • 29 Jun 2023 • Ang Lv, Jinpeng Li, Yuhan Chen, Xing Gao, Ji Zhang, Rui Yan

In open-domain dialogue generation tasks, contexts and responses in most datasets are one-to-one mapped, violating an important many-to-many characteristic: a context leads to various responses, and a response answers multiple contexts.

Data Augmentation Dialogue Generation +2

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

1 code implementation • 7 Jun 2023 • Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Guangwei Xu, Chenliang Li, Qi Qian, Maofei Que, Ji Zhang, Xiao Zeng, Fei Huang

In addition, to facilitate a comprehensive evaluation of video-language models, we carefully build the largest human-annotated Chinese benchmarks covering three popular video-language tasks of cross-modal retrieval, video captioning, and video category classification.

Cross-Modal Retrieval Language Modelling +3

Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering

no code implementations • 14 May 2023 • Qianglong Chen, Guohai Xu, Ming Yan, Ji Zhang, Fei Huang, Luo Si, Yin Zhang

Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks by obtaining diverse knowledge from different knowledge bases.

Explanation Generation Question Answering

AMTSS: An Adaptive Multi-Teacher Single-Student Knowledge Distillation Framework For Multilingual Language Inference

no code implementations • 13 May 2023 • Qianglong Chen, Feng Ji, Feng-Lin Li, Guohai Xu, Ming Yan, Ji Zhang, Yin Zhang

To support cost-effective language inference in multilingual settings, we propose AMTSS, an adaptive multi-teacher single-student distillation framework, which allows distilling knowledge from multiple teachers to a single student.

Knowledge Distillation
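
The multi-teacher, single-student setup described in this entry can be sketched as follows. This is an illustration under stated assumptions, not the AMTSS method itself: the teacher weights here are fixed by hand, whereas the entry describes an adaptive framework whose weighting strategy is not given in this listing, and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_teachers(teacher_probs, weights):
    """Weighted average of teacher distributions (weights assumed to sum to 1)."""
    n = len(teacher_probs[0])
    return [sum(w * p[i] for w, p in zip(weights, teacher_probs)) for i in range(n)]


def kl_divergence(target, predicted):
    """KL(target || predicted), the usual distillation loss on soft labels."""
    return sum(t * math.log(t / q) for t, q in zip(target, predicted) if t > 0)


# Two hypothetical teachers that disagree on the most likely class.
teachers = [softmax([2.0, 0.0, 0.0]), softmax([0.0, 2.0, 0.0])]
target = combine_teachers(teachers, weights=[0.7, 0.3])

# A hypothetical student prediction and its distillation loss.
student = softmax([1.0, 0.5, 0.0])
loss = kl_divergence(target, student)
```

The single student is trained against one combined target, so only one model needs to be served at inference time regardless of how many teachers were used.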

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

1 code implementation • 27 Apr 2023 • Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, Yiyang Zhou, Junyang Wang, Anwen Hu, Pengcheng Shi, Yaya Shi, Chenliang Li, Yuanhong Xu, Hehong Chen, Junfeng Tian, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou

Our code, pre-trained model, instruction-tuned models, and evaluation set are available at https://github.com/X-PLUG/mPLUG-Owl.

Ranked #3 on Visual Question Answering (VQA) on HallusionBench

Visual Question Answering (VQA) Zero-Shot Video Question Answer

ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale LiDAR Point Clouds

no code implementations • 25 Apr 2023 • Xiangze Jia, Hui Zhou, Xinge Zhu, Yandong Guo, Ji Zhang, Yuexin Ma

In this paper, we propose a novel self-supervised motion estimator for LiDAR-based autonomous driving via BEV representation.

Autonomous Driving Contrastive Learning +2

ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human

1 code implementation • 16 Apr 2023 • Junfeng Tian, Hehong Chen, Guohai Xu, Ming Yan, Xing Gao, Jianhai Zhang, Chenliang Li, Jiayi Liu, Wenshen Xu, Haiyang Xu, Qi Qian, Wei Wang, Qinghao Ye, Jiejing Zhang, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that is instruction-finetuned on a wide range of dialogue tasks in a unified internet-augmented format.

World Knowledge

DETA: Denoised Task Adaptation for Few-Shot Learning

2 code implementations • ICCV 2023 • Ji Zhang, Lianli Gao, Xu Luo, HengTao Shen, Jingkuan Song

Test-time task adaptation in few-shot learning aims to adapt a pre-trained task-agnostic model for capturing task-specific knowledge of the test task, relying only on a few labeled support samples.

Denoising Few-Shot Learning

Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance

1 code implementation • 28 Feb 2023 • Xueyi Liu, Ji Zhang, Ruizhen Hu, Haibin Huang, He Wang, Li Yi

Category-level articulated object pose estimation aims to estimate a hierarchy of articulation-aware object poses of an unseen articulated object from a known category.

Disentanglement Object +1

Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits

no code implementations • 24 Feb 2023 • Siddharth Ancha, Gaurav Pathak, Ji Zhang, Srinivasa Narasimhan, David Held

To navigate in an environment safely and autonomously, robots must accurately estimate where obstacles are and how they move.

Multi-Armed Bandits Navigate +1

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

4 code implementations • 1 Feb 2023 • Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou

In contrast to predominant paradigms of solely relying on sequence-to-sequence generation or encoder-based instance discrimination, mPLUG-2 introduces a multi-module composition network by sharing common universal modules for modality collaboration and disentangling different modality modules to deal with modality entanglement.

Ranked #1 on Video Captioning on MSR-VTT

Action Classification Image Classification +7

A Closer Look at Few-shot Classification Again

2 code implementations • 28 Jan 2023 • Xu Luo, Hao Wu, Ji Zhang, Lianli Gao, Jing Xu, Jingkuan Song

Few-shot classification consists of a training phase where a model is learned on a relatively large dataset and an adaptation phase where the learned model is adapted to previously-unseen tasks with limited labeled samples.

Classification Representation Learning +1

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training

no code implementations • ICCV 2023 • Qinghao Ye, Guohai Xu, Ming Yan, Haiyang Xu, Qi Qian, Ji Zhang, Fei Huang

We achieve state-of-the-art results on 15 well-established video-language understanding and generation tasks, especially on temporal-oriented datasets (e.g., SSv2-Template and SSv2-Label) with 8.6% and 11.1% improvement respectively.

Ranked #1 on Visual Question Answering (VQA) on TGIF-QA

TGIF-Action TGIF-Frame +7

Intelligent Computing: The Latest Advances, Challenges and Future

no code implementations • 21 Nov 2022 • Shiqiang Zhu, Ting Yu, Tao Xu, Hongyang Chen, Schahram Dustdar, Sylvain Gigan, Deniz Gunduz, Ekram Hossain, Yaochu Jin, Feng Lin, Bo Liu, Zhiguo Wan, Ji Zhang, Zhifeng Zhao, Wentao Zhu, Zuoning Chen, Tariq Durrani, Huaimin Wang, Jiangxing Wu, Tongyi Zhang, Yunhe Pan

In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting digital revolution in the era of big data, artificial intelligence and internet-of-things with new computing theories, architectures, methods, systems, and applications.

Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment

no code implementations • 14 Nov 2022 • Junyang Wang, Yi Zhang, Ming Yan, Ji Zhang, Jitao Sang

We further propose Anchor Augment to guide the generative model's attention to the fine-grained information in the representation of CLIP.

Computational Efficiency Image Captioning +2

MUI-TARE: Multi-Agent Cooperative Exploration with Unknown Initial Position

no code implementations • 22 Sep 2022 • Jingtian Yan, Xingqiao Lin, Zhongqiang Ren, Shiqi Zhao, Jieqiong Yu, Chao Cao, Peng Yin, Ji Zhang, Sebastian Scherer

To intelligently balance the robustness of sub-map merging and exploration efficiency, we develop a new approach for lidar-based multi-agent exploration, which can direct one agent to repeat another agent's trajectory in an adaptive manner based on the quality indicator of the sub-map merging process.

Position

Incorporating Causal Analysis into Diversified and Logical Response Generation

no code implementations • 20 Sep 2022 • Jiayi Liu, Wei Wei, Zhixuan Chu, Xing Gao, Ji Zhang, Tan Yan, Yulin Kang

Although the Conditional Variational AutoEncoder (CVAE) model can generate more diversified responses than the traditional Seq2Seq model, the responses often have low relevance with the input words or are illogical with the question.

Response Generation

Generating Persuasive Responses to Customer Reviews with Multi-Source Prior Knowledge in E-commerce

no code implementations • 20 Sep 2022 • Bo Chen, Jiayi Liu, Mieradilijiang Maimaiti, Xing Gao, Ji Zhang

A multi-aspect attentive network is proposed to automatically attend to different aspects in a review and ensure most of the issues are tackled.

Response Generation

iSimLoc: Visual Global Localization for Previously Unseen Environments with Simulated Images

no code implementations • 14 Sep 2022 • Peng Yin, Ivan Cisneros, Ji Zhang, Howie Choset, Sebastian Scherer

Visual cameras are attractive devices for beyond visual line of sight (B-VLOS) drone operation, since they are low in size, weight, power, and cost, and can provide a redundant modality when GPS fails.

Retrieval Visual Localization

Class-Level Logit Perturbation

1 code implementation • 13 Sep 2022 • Mengyang Li, Fengguang Su, Ou Wu, Ji Zhang

However, few studies have explicitly explored the perturbation of logit vectors.

Data Augmentation Image Classification +1

DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning

no code implementations • 1 Aug 2022 • Qianglong Chen, Feng-Lin Li, Guohai Xu, Ming Yan, Ji Zhang, Yin Zhang

We evaluate our approach on a variety of knowledge driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE.

Contrastive Learning Language Modelling +2

RCA: Ride Comfort-Aware Visual Navigation via Self-Supervised Learning

no code implementations • 29 Jul 2022 • Xinjie Yao, Ji Zhang, Jean Oh

Under shared autonomy, wheelchair users expect vehicles to provide safe and comfortable rides while following users' high-level navigation plans.

Self-Supervised Learning Visual Navigation

Scene Recognition with Objectness, Attribute and Category Learning

no code implementations • 20 Jul 2022 • Ji Zhang, Jean-Paul Ainam, Li-hui Zhao, Wenai Song, Xin Wang

Based on the complementarity of attribute and category labels, we propose a Multi-task Attribute-Scene Recognition (MASR) network which learns a category embedding and at the same time predicts scene attributes.

Attribute Scene Classification +1

ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization

1 code implementation • 19 Jul 2022 • Ivan Cisneros, Peng Yin, Ji Zhang, Howie Choset, Sebastian Scherer

We present the ALTO dataset, a vision-focused dataset for the development and benchmarking of Visual Place Recognition and Localization methods for Unmanned Aerial Vehicles.

Benchmarking Image Registration +2

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval

1 code implementation • 15 Jul 2022 • Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji

However, cross-grained contrast, which is the contrast between coarse-grained representations and fine-grained representations, has rarely been explored in prior research.

Ranked #12 on Video Retrieval on MSVD

Contrastive Learning Retrieval +2
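
The cross-grained contrast mentioned in this entry compares one coarse-grained representation (for example, a whole sentence) with several fine-grained ones (for example, individual video frames). The sketch below shows one way such a similarity could be computed. The softmax-weighted aggregation, the `temperature` value, and all function names are assumptions for illustration and are not taken from this listing.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross_grained_similarity(coarse, fine, temperature=0.1):
    """Score one coarse-grained vector against a list of fine-grained vectors.

    Each fine-grained cosine similarity is weighted by a softmax over the
    similarities themselves, so the best-matching pieces dominate the score.
    """
    c = normalize(coarse)
    sims = [dot(c, normalize(f)) for f in fine]
    weights = [math.exp(s / temperature) for s in sims]
    total = sum(weights)
    return sum(w / total * s for w, s in zip(weights, sims))


# Toy data: a sentence vector and two sets of frame vectors.
sentence = [1.0, 0.0]
frames_match = [[1.0, 0.0], [0.0, 1.0]]      # one frame aligns with the sentence
frames_mismatch = [[0.0, 1.0], [0.0, 1.0]]   # no frame aligns

score_match = cross_grained_similarity(sentence, frames_match)
score_mismatch = cross_grained_similarity(sentence, frames_mismatch)
```

A contrastive objective would then push scores for matched sentence-video pairs above those for mismatched pairs.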

AutoMerge: A Framework for Map Assembling and Smoothing in City-scale Environments

no code implementations • 14 Jul 2022 • Peng Yin, Haowen Lai, Shiqi Zhao, Ruohai Ge, Ji Zhang, Howie Choset, Sebastian Scherer

We present AutoMerge, a LiDAR data processing framework for assembling a large number of map segments into a complete map.

Loop Closure Detection Retrieval

SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild

no code implementations • 11 Jul 2022 • Jie Qin, Shuaihang Yuan, Jiaxin Chen, Boulbaba Ben Amor, Yi Fang, Nhat Hoang-Xuan, Chi-Bien Chu, Khoi-Nguyen Nguyen-Ngoc, Thien-Tri Cao, Nhat-Khang Ngo, Tuan-Luc Huynh, Hai-Dang Nguyen, Minh-Triet Tran, Haoyang Luo, Jianning Wang, Zheng Zhang, Zihao Xin, Yang Wang, Feng Wang, Ying Tang, Haiqin Chen, Yan Wang, Qunying Zhou, Ji Zhang, Hongyuan Wang

We define two SBSR tasks and construct two benchmarks consisting of more than 46,000 CAD models, 1,700 realistic models, and 145,000 sketches in total.

3D Object Retrieval 3D Shape Retrieval +1

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections

3 code implementations • 24 May 2022 • Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si

Large-scale pretrained foundation models have been an emerging paradigm for building artificial intelligence (AI) systems, which can be quickly adapted to a wide range of downstream tasks.

Ranked #1 on Image Captioning on COCO Captions

Computational Efficiency Image Captioning +6

MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification

no code implementations • NAACL 2022 • Jianhai Zhang, Mieradilijiang Maimaiti, Xing Gao, Yuanhang Zheng, Ji Zhang

They also ignore the importance of capturing the inter-dependency between the query and the support set for few-shot text classification.

Few-Shot Learning Few-Shot Text Classification +1

Auto-MLM: Improved Contrastive Learning for Self-supervised Multi-lingual Knowledge Retrieval

no code implementations • 30 Mar 2022 • Wenshen Xu, Mieradilijiang Maimaiti, Yuanhang Zheng, Xin Tang, Ji Zhang

Unexpectedly, MLM ignores sentence-level training, and CL also neglects extraction of the internal information from the query.

Contrastive Learning Language Modelling +3

Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding

1 code implementation • CVPR 2022 • Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin

Moreover, since the backbones are query-agnostic, it is difficult to completely avoid the inconsistency issue by training the visual backbone end-to-end in the visual grounding framework.

Multimodal Reasoning Visual Grounding

LQoCo: Learning to Optimize Cache Capacity Overloading in Storage Systems

no code implementations • 21 Mar 2022 • Ji Zhang, Xijun Li, Xiyao Zhou, Mingxuan Yuan, Zhuo Cheng, Keji Huang, YiFan Li

Cache plays an important role in maintaining high and stable performance (i.e., high throughput, low tail latency and throughput jitter) in storage systems.

Management

Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

no code implementations • 8 Mar 2022 • Xi Weng, Yan Yan, Genshun Dong, Chang Shu, Biao Wang, Hanzi Wang, Ji Zhang

This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.

Real-Time Semantic Segmentation Segmentation

Paper
Add Code

Achieving Human Parity on Visual Question Answering

no code implementations • 17 Nov 2021 • Ming Yan, Haiyang Xu, Chenliang Li, Junfeng Tian, Bin Bi, Wei Wang, Weihua Chen, Xianzhe Xu, Fan Wang, Zheng Cao, Zhicheng Zhang, Qiyu Zhang, Ji Zhang, Songfang Huang, Fei Huang, Luo Si, Rong Jin

The Visual Question Answering (VQA) task utilizes both visual image and language analysis to answer a textual question with respect to an image.

Ranked #7 on Visual Question Answering (VQA) on VQA v2 test-dev

Question Answering Visual Question Answering

Paper
Add Code

A Generic Knowledge Based Medical Diagnosis Expert System

no code implementations • 9 Oct 2021 • Xin Huang, Xuejiao Tang, Wenbin Zhang, Shichao Pei, Ji Zhang, Mingli Zhang, Zhen Liu, Ruijun Chen, Yiyi Huang

The proposed disease diagnosis system also uses a graphical user interface (GUI) to help users interact with the expert system.

Medical Diagnosis

Paper
Add Code

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

no code implementations • 22 Sep 2021 • Fu Sun, Feng-Lin Li, Ruize Wang, Qianglong Chen, Xingyi Cheng, Ji Zhang

Knowledge enhanced pre-trained language models (K-PLMs) are shown to be effective for many public tasks in the literature but few of them have been successfully applied in practice.

Knowledge Distillation Question Answering +4

Paper
Add Code

AliMe MKG: A Multi-modal Knowledge Graph for Live-streaming E-commerce

no code implementations • 13 Sep 2021 • Guohai Xu, Hehong Chen, Feng-Lin Li, Fu Sun, Yunzhou Shi, Zhixiong Zeng, Wei Zhou, Zhongzhou Zhao, Ji Zhang

Live streaming is becoming an increasingly popular trend of sales in E-commerce.

Multi-modal Knowledge Graph Question Answering

Paper
Add Code

GGP: A Graph-based Grouping Planner for Explicit Control of Long Text Generation

no code implementations • 18 Aug 2021 • Xuming Lin, Shaobo Cui, Zhongzhou Zhao, Wei Zhou, Ji Zhang, Haiqing Chen

With these two synergic representations, we then regroup these phrases into a fine-grained plan, based on which we generate the final long text.

Story Generation

Paper
Add Code

SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts

no code implementations • 17 Aug 2021 • Shaobo Cui, Xintong Bao, Xuming Lin, Zhongzhou Zhao, Ji Zhang, Wei Zhou, Haiqing Chen

Each one-to-one mapping is associated with a conditional generation pattern and is modeled with an expert in SPMoE.

Paraphrase Generation

Paper
Add Code

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

1 code implementation • 16 Aug 2021 • Yuhao Cui, Zhou Yu, Chunqi Wang, Zhongzhou Zhao, Ji Zhang, Meng Wang, Jun Yu

Nevertheless, most existing VLP approaches have not fully utilized the intrinsic knowledge within the image-text pairs, which limits the effectiveness of the learned alignments and further restricts the performance of their models.

Visual Reasoning

Paper
Code

KACE: Generating Knowledge Aware Contrastive Explanations for Natural Language Inference

no code implementations • ACL 2021 • Qianglong Chen, Feng Ji, Xiangji Zeng, Feng-Lin Li, Ji Zhang, Haiqing Chen, Yin Zhang

In order to better understand the reason behind model behaviors (i.e., making predictions), most recent works have exploited generative models to provide complementary explanations.

counterfactual Language Modelling +1

Paper
Add Code

Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory

1 code implementation • 4 Jul 2021 • Xuejiao Tang, Xin Huang, Wenbin Zhang, Travers B. Child, Qiong Hu, Zhen Liu, Ji Zhang

Moreover, the proposed model provides intuitive interpretation into visual commonsense reasoning.

Question Answering Scene Understanding +2

Paper
Code

Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid Loss

no code implementations • CVPR 2021 • Lu Zhang, Shuigeng Zhou, Jihong Guan, Ji Zhang

Most object detection methods require huge amounts of annotated data and can detect only the categories that appear in the training set.

Few-Shot Object Detection object-detection

Paper
Add Code

i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

no code implementations • 27 May 2021 • Peng Yin, Lingyun Xu, Ji Zhang, Howie Choset, Sebastian Scherer

Based on such features, we further design a spherical convolution network to learn viewpoint-invariant symmetric place descriptors.

Generative Adversarial Network Visual Localization

Paper
Add Code

State-Promoted Investment for Industrial Reforms: an Information Design Approach

no code implementations • 20 May 2021 • Keeyoung Rhee, Myungkyu Shim, Ji Zhang

We analyze the optimal strategy for a government to promote large-scale investment projects under information frictions.

Paper
Add Code

AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

1 code implementation • 5 May 2021 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto del Bimbo

Experimental results demonstrate that our adapted margin cosine loss can greatly enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of the answer feature space learning.

Question Answering Visual Question Answering

Paper
Code

LSTM Based Sentiment Analysis for Cryptocurrency Prediction

no code implementations • 27 Mar 2021 • Xin Huang, Wenbin Zhang, Xuejiao Tang, Mingli Zhang, Jayachander Surbiryala, Vasileios Iosifidis, Zhen Liu, Ji Zhang

Recent studies in big data analytics and natural language processing develop automatic techniques for analyzing sentiment in social media information.

Sentiment Analysis

Paper
Add Code

OneStop QAMaker: Extract Question-Answer Pairs from Text in a One-Stop Approach

no code implementations • 24 Feb 2021 • Shaobo Cui, Xintong Bao, Xinxing Zu, Yangyang Guo, Zhongzhou Zhao, Ji Zhang, Haiqing Chen

This pipeline approach, however, is undesired in mining the most appropriate QA pairs from documents since it ignores the connection between question generation and answer extraction, which may lead to incompatible QA pair generation, i.e., the selected answer span is inappropriate for question generation.

Machine Reading Comprehension Question Answering +2

Paper
Add Code

Using Machine Learning to Automate Mammogram Images Analysis

no code implementations • 6 Dec 2020 • Xuejiao Tang, Liuhua Zhang, Wenbin Zhang, Xin Huang, Vasileios Iosifidis, Zhen Liu, Mingli Zhang, Enza Messina, Ji Zhang

Early detection of breast cancer in X-ray mammography is believed to have effectively reduced the mortality rate.

BIG-bench Machine Learning Classification +4

Paper
Add Code

A Data-driven Human Responsibility Management System

no code implementations • 6 Dec 2020 • Xuejiao Tang, Jiong Qiu, Ruijun Chen, Wenbin Zhang, Vasileios Iosifidis, Zhen Liu, Wei Meng, Mingli Zhang, Ji Zhang

An ideal safe workplace is described as a place where staff fulfill responsibilities in a well-organized order, potentially hazardous events are monitored in real time, and the number of accidents and related damages is minimized.

Management

Paper
Add Code

Distant Supervision for E-commerce Query Segmentation via Attention Network

no code implementations • 9 Nov 2020 • Zhao Li, Donghui Ding, Pengcheng Zou, Yu Gong, Xi Chen, Ji Zhang, Jianliang Gao, Youxi Wu, Yucong Duan

The booming online e-commerce platforms demand highly accurate approaches to segment queries that carry the product requirements of consumers.

Segmentation

Paper
Add Code

AI Marker-based Large-scale AI Literature Mining

no code implementations • 1 Nov 2020 • Rujing Yao, Yingchun Ye, Ji Zhang, Shuxiao Li, Ou Wu

Inspired by the idea of molecular marker tracing in the field of biochemistry, three named entities, namely methods, datasets, and metrics, are used as AI markers for AI literature.

Clustering Literature Mining +1

Paper
Add Code

Method and Dataset Entity Mining in Scientific Literature: A CNN + Bi-LSTM Model with Self-attention

no code implementations • 26 Oct 2020 • Linlin Hou, Ji Zhang, Ou Wu, Ting Yu, Zhen Wang, Zhao Li, Jianliang Gao, Yingchun Ye, Rujing Yao

We finally apply our model to PAKDD papers published from 2009 to 2019 to mine insightful results from scientific papers published in a longer time span.

Data Augmentation

Paper
Add Code

AliMe KG: Domain Knowledge Graph Construction and Application in E-commerce

no code implementations • 24 Sep 2020 • Feng-Lin Li, Hehong Chen, Guohai Xu, Tian Qiu, Feng Ji, Ji Zhang, Haiqing Chen

Pre-sales customer service is of importance to E-commerce platforms as it contributes to optimizing customers' buying process.

graph construction Question Answering

Paper
Add Code

Character Matters: Video Story Understanding with Character-Aware Relations

no code implementations • 9 May 2020 • Shijie Geng, Ji Zhang, Zuohui Fu, Peng Gao, Hang Zhang, Gerard de Melo

Without identifying the connection between appearing people and character names, a model is not able to obtain a genuine understanding of the plots.

Question Answering

Paper
Add Code

Monocular Camera Localization in Prior LiDAR Maps with 2D-3D Line Correspondences

1 code implementation • 1 Apr 2020 • Huai Yu, Weikun Zhen, Wen Yang, Ji Zhang, Sebastian Scherer

With the pose prediction from VIO, we can efficiently obtain coarse 2D-3D line correspondences.

Camera Localization Pose Prediction +1

Paper
Code

Target-Guided Structured Attention Network for Target-Dependent Sentiment Analysis

no code implementations • TACL 2020 • Ji Zhang, Chengyao Chen, PengFei Liu, Chao He, Cane Wing-Ki Leung

Second, it shows a strong advantage in determining the sentiment of a target when the context sentence contains multiple semantic segments.

Sentence Sentiment Analysis +1

Paper
Add Code

Method and Dataset Mining in Scientific Papers

no code implementations • 29 Nov 2019 • Rujing Yao, Linlin Hou, Yingchun Ye, Ou Wu, Ji Zhang, Jian Wu

In the field of machine learning, the involved methods (M) and datasets (D) are key information in papers.

Paper
Add Code

Following Social Groups: Socially Compliant Autonomous Navigation in Dense Crowds

no code implementations • 27 Nov 2019 • Xinjie Yao, Ji Zhang, Jean Oh

The underlying system incorporates a deep neural network to track social groups and join the flow of a social group in facilitating the navigation.

Autonomous Navigation Collision Avoidance +1

Paper
Add Code

Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication

no code implementations • COLING 2020 • Ruize Wang, Zhongyu Wei, Ying Cheng, Piji Li, Haijun Shan, Ji Zhang, Qi Zhang, Xuanjing Huang

Visual storytelling aims to generate a narrative paragraph from a sequence of images automatically.

Ranked #9 on Visual Storytelling on VIST

Image Captioning Question Generation +1

Paper
Add Code

2nd Place Solution to the GQA Challenge 2019

no code implementations • 16 Jul 2019 • Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas

We present a simple method that achieves unexpectedly superior performance for Complex Reasoning involved Visual Question Answering.

Question Answering Visual Question Answering +1

Paper
Add Code

Graphical Contrastive Losses for Scene Graph Parsing

3 code implementations • CVPR 2019 • Ji Zhang, Kevin J. Shih, Ahmed Elgammal, Andrew Tao, Bryan Catanzaro

The first, Entity Instance Confusion, occurs when the model confuses multiple instances of the same type of entity (e.g., multiple cups).

Relationship Detection Scene Graph Generation +1

12,970

Paper
Code

A Deep Cascade Model for Multi-Document Reading Comprehension

no code implementations • 28 Nov 2018 • Ming Yan, Jiangnan Xia, Chen Wu, Bin Bi, Zhongzhou Zhao, Ji Zhang, Luo Si, Rui Wang, Wei Wang, Haiqing Chen

To address this problem, we develop a novel deep cascade learning model, which progressively evolves from the document-level and paragraph-level ranking of candidate texts to more precise answer extraction with machine reading comprehension.

Ranked #2 on Question Answering on MS MARCO

Machine Reading Comprehension Question Answering +2

Paper
Add Code

An Interpretable Model for Scene Graph Generation

no code implementations • 21 Nov 2018 • Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal

We propose an efficient and interpretable scene graph generator.

Graph Generation Image Captioning +3

Paper
Add Code

Introduction to the 1st Place Winning Model of OpenImages Relationship Detection Challenge

no code implementations • 1 Nov 2018 • Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal

This article describes the model we built that achieved 1st place in the OpenImage Visual Relationship Detection Challenge on Kaggle.

Relationship Detection Visual Relationship Detection

Paper
Add Code

Improving Multilingual Semantic Textual Similarity with Shared Sentence Encoder for Low-resource Languages

no code implementations • 20 Oct 2018 • Xin Tang, Shanbo Cheng, Loc Do, Zhiyu Min, Feng Ji, Heng Yu, Ji Zhang, Haiqin Chen

Our approach is extended from a basic monolingual STS framework to a shared multilingual encoder pretrained with a translation task to incorporate rich-resource language data.

Machine Translation Semantic Similarity +4

Paper
Add Code

Semi-Autoregressive Neural Machine Translation

1 code implementation • EMNLP 2018 • Chunqi Wang, Ji Zhang, Haiqing Chen

Existing approaches to neural machine translation are typically autoregressive models.

Machine Translation Translation

Paper
Code

Exploiting Effective Representations for Chinese Sentiment Analysis Using a Multi-Channel Convolutional Neural Network

no code implementations • 8 Aug 2018 • Pengfei Liu, Ji Zhang, Cane Wing-Ki Leung, Chao He, Thomas L. Griffiths

Effective representation of a text is critical for various natural language processing tasks.

Chinese Sentiment Analysis Sentiment Analysis

Paper
Add Code

Large-Scale Visual Relationship Understanding

2 code implementations • 27 Apr 2018 • Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny

Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.

Relationship Detection

113

Paper
Code

Relationship Proposal Networks

no code implementations • CVPR 2017 • Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal

We demonstrate the ability of our Rel-PN to localize relationships with only a few thousand proposals.

Scene Understanding

Paper
Add Code

LOAM: Lidar Odometry and Mapping in Real-Time

1 code implementation • Robotics: Science and Systems Conference 2014 • Ji Zhang, Sanjiv Singh

We propose a real-time method for odometry and mapping using range measurements from a 2-axis lidar moving in 6-DOF.

Motion Estimation Simultaneous Localization and Mapping

Paper
Code
