Search Results for author: Bin Xu

Found 39 papers, 18 papers with code

CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations

1 code implementation 6 Feb 2024 Ji Qi, Ming Ding, Weihan Wang, Yushi Bai, Qingsong Lv, Wenyi Hong, Bin Xu, Lei Hou, Juanzi Li, Yuxiao Dong, Jie Tang

Vision-Language Models (VLMs) have demonstrated their widespread viability thanks to extensive training in aligning visual instructions to answers.

Visual Reasoning

CogAgent: A Visual Language Model for GUI Agents

1 code implementation 14 Dec 2023 Wenyi Hong, Weihan Wang, Qingsong Lv, Jiazheng Xu, Wenmeng Yu, Junhui Ji, Yan Wang, Zihan Wang, Yuxuan Zhang, Juanzi Li, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

People are spending an enormous amount of time on digital devices through graphical user interfaces (GUIs), e.g., computer or smartphone screens.

Language Modelling, Visual Question Answering

When does In-context Learning Fall Short and Why? A Study on Specification-Heavy Tasks

no code implementations 15 Nov 2023 Hao Peng, Xiaozhi Wang, Jianhui Chen, Weikai Li, Yunjia Qi, Zimu Wang, Zhili Wu, Kaisheng Zeng, Bin Xu, Lei Hou, Juanzi Li

In this paper, we find that ICL falls short of handling specification-heavy tasks, which are tasks with complicated and extensive task specifications, requiring several hours for ordinary humans to master, such as traditional information extraction tasks.

In-Context Learning

Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

no code implementations 16 Oct 2023 Ji Qi, Kaixuan Ji, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Lei Hou, Juanzi Li, Bin Xu

Open Information Extraction (OIE) aims to extract objective structured knowledge from natural texts, which has attracted growing attention to build dedicated models with human experience.

In-Context Learning, Open Information Extraction

BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation

no code implementations 16 Oct 2023 Ji Qi, Kaixuan Ji, Jifan Yu, Duokang Wang, Bin Xu, Lei Hou, Juanzi Li

Building models that generate textual responses to user instructions for videos is a practical and challenging topic, as it requires both vision understanding and knowledge reasoning.

Caption Generation, Descriptive +3

ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation

no code implementations ICCV 2023 Weihan Wang, Zhen Yang, Bin Xu, Juanzi Li, Yankui Sun

Vision-language pre-training (VLP) methods have blossomed recently; their crucial goal is to jointly learn visual and textual features via a transformer-based architecture, and they demonstrate promising improvements on a variety of vision-language tasks.

Image-text matching, Language Modelling +2

Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction

1 code implementation 23 May 2023 Ji Qi, Chuchun Zhang, Xiaozhi Wang, Kaisheng Zeng, Jifan Yu, Jinxin Liu, Jiuding Sun, Yuxiang Chen, Lei Hou, Juanzi Li, Bin Xu

In this paper, we present the first benchmark that simulates the evaluation of open information extraction models in the real world, where the syntactic and expressive distributions under the same knowledge meaning may drift variously.

Language Modelling, Large Language Model +1

GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation

1 code implementation 26 Mar 2023 Ji Qi, Jifan Yu, Teng Tu, Kunyu Gao, Yifan Xu, Xinyu Guan, Xiaozhi Wang, Yuxiao Dong, Bin Xu, Lei Hou, Juanzi Li, Jie Tang, Weidong Guo, Hui Liu, Yu Xu

Despite the recent emergence of video captioning models, how to generate vivid, fine-grained video descriptions based on the background knowledge (i.e., long and informative commentary about the domain-specific scenes with appropriate reasoning) is still far from being solved, which however has great applications such as automatic sports narrative.

Video Captioning

Syntactically Robust Training on Partially-Observed Data for Open Information Extraction

1 code implementation 17 Jan 2023 Ji Qi, Yuxiang Chen, Lei Hou, Juanzi Li, Bin Xu

In this paper, we propose a syntactically robust training framework that enables models to be trained on a syntactic-abundant distribution based on diverse paraphrase generation.

Open Information Extraction, Paraphrase Generation +2

ConstGCN: Constrained Transmission-based Graph Convolutional Networks for Document-level Relation Extraction

no code implementations 8 Oct 2022 Ji Qi, Bin Xu, Kaisheng Zeng, Jinxin Liu, Jifan Yu, Qi Gao, Juanzi Li, Lei Hou

Document-level relation extraction with graph neural networks faces a fundamental graph construction gap between training and inference: the golden graph structure is only available during training, which leads most methods to adopt heuristic or syntactic rules to construct a prior graph as a pseudo proxy.

Document-level Relation Extraction, graph construction +1

DSCA: A Dual-Stream Network with Cross-Attention on Whole-Slide Image Pyramids for Cancer Prognosis

1 code implementation 12 Jun 2022 Pei Liu, Bo Fu, Feng Ye, Rui Yang, Bin Xu, Luping Ji

Our experiments and ablation studies verify that (i) the proposed DSCA could outperform existing state-of-the-art methods in cancer prognosis, by an average C-Index improvement of around 4.6%; (ii) our DSCA network is more efficient in computation: it has more learnable parameters (6.31M vs. 860.18K) but lower computational cost (2.51G vs. 4.94G), compared to a typical existing multi-resolution network.

whole slide images

A Critical Analysis of Image-based Camera Pose Estimation Techniques

no code implementations 15 Jan 2022 Meng Xu, Youchen Wang, Bin Xu, Jun Zhang, Jian Ren, Stefan Poslad, Pengfei Xu

Localization of a camera, and of the associated objects within its field of view, could benefit many computer vision fields, such as autonomous driving, robot navigation, and augmented reality (AR).

Autonomous Driving, Camera Localization +3

OpenQA: Hybrid QA System Relying on Structured Knowledge Base as well as Non-structured Data

no code implementations 31 Dec 2021 Gaochen Wu, Bin Xu, Yuxin Qin, Yang Liu, Lingyu Liu, Ziwei Wang

Search engines based on keyword retrieval can no longer adapt to the way information is acquired in the era of the intelligent Internet of Things, because they return keyword-related Internet pages.

Answer Selection, Machine Reading Comprehension +3

Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking

no code implementations 11 Jul 2021 Gaochen Wu, Bin Xu, Yuxin Qin, Fei Kong, Bangchang Liu, Hongwen Zhao, Dejie Chang

To address this issue, we propose a Cross-Lingual Transposition ReThinking (XLTT) model by modelling existing high-quality extractive reading comprehension datasets in a multilingual environment.

Reading Comprehension

PatentMiner: Patent Vacancy Mining via Context-enhanced and Knowledge-guided Graph Attention

no code implementations 10 Jul 2021 Gaochen Wu, Bin Xu, Yuxin Qin, Fei Kong, Bangchang Liu, Hongwen Zhao, Dejie Chang

In this paper, we propose a new patent vacancy prediction approach named PatentMiner to mine rich semantic knowledge and predict new potential patents based on knowledge graph (KG) and graph attention mechanism.

Graph Attention, Link Prediction +3

Bilateral Grid Learning for Stereo Matching Networks

no code implementations CVPR 2021 Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo

In this paper, we present a novel edge-preserving cost volume upsampling module based on the slicing operation in the learned bilateral grid.

Robot Navigation, Stereo Matching

A Multilingual Modeling Method for Span-Extraction Reading Comprehension

no code implementations 31 May 2021 Gaochen Wu, Bin Xu, Dejie Chang, Bangchang Liu

In this paper, to address the scarcity of extractive reading comprehension training data in the target language, we propose a multilingual extractive reading comprehension approach called XLRC, which simultaneously models the existing extractive reading comprehension training data in a multilingual environment using self-adaptive attention and multilingual attention.

Multilingual NLP, Reading Comprehension

DiaKG: an Annotated Diabetes Dataset for Medical Knowledge Graph Construction

1 code implementation 31 May 2021 Dejie Chang, Mosha Chen, Chaozhen Liu, LiPing Liu, Dongdong Li, Wei Li, Fei Kong, Bangchang Liu, Xiaobin Luo, Ji Qi, Qiao Jin, Bin Xu

In order to accelerate the research for domain-specific knowledge graphs in the medical domain, we introduce DiaKG, a high-quality Chinese dataset for Diabetes knowledge graph, which contains 22,050 entities and 6,890 relations in total.

graph construction, Knowledge Graphs +4

Energy Consumption and Battery Aging Minimization Using a Q-learning Strategy for a Battery/Ultracapacitor Electric Vehicle

no code implementations 27 Oct 2020 Bin Xu, Junzhe Shi, Sixu Li, Huayi Li, Zhe Wang

Then, the result from a vehicle without ultracapacitor is used as the baseline, which is compared with the results from the vehicle with ultracapacitor using Q-learning, and two heuristic methods as the energy management strategies.

energy management, Management +1

Learning Time Reduction Using Warm Start Methods for a Reinforcement Learning Based Supervisory Control in Hybrid Electric Vehicle Applications

no code implementations 27 Oct 2020 Bin Xu, Jun Hou, Junzhe Shi, Huayi Li, Dhruvang Rathod, Zhe Wang, Zoran Filipi

This study aims to reduce the learning iterations of Q-learning in HEV application and improve fuel consumption in initial learning phases utilizing warm start methods.

Q-Learning, Reinforcement Learning (RL)

Model reconstruction from temporal data for coupled oscillator networks

no code implementations 4 May 2019 Mark J Panaggio, Maria-Veronica Ciocanel, Lauren Lazarus, Chad M Topaz, Bin Xu

In a complex system, the interactions between individual agents often lead to emergent collective behavior like spontaneous synchronization, swarming, and pattern formation.

Convergence analysis of beetle antennae search algorithm and its applications

1 code implementation 4 Apr 2019 Yinyan Zhang, Shuai Li, Bin Xu

The beetle antennae search algorithm was recently proposed and investigated for solving global optimization problems.

Stitching Videos from a Fisheye Lens Camera and a Wide-Angle Lens Camera for Telepresence Robots

no code implementations 15 Mar 2019 Yanmei Dong, Mingtao Pei, Lijia Zhang, Bin Xu, Yuwei Wu, Yunde Jia

In this paper, we propose to stitch videos from the FF-camera with a wide-angle lens and the DF-camera with a fisheye lens for telepresence robots.

Magic numbers in polymer phase separation -- the importance of being rigid

1 code implementation 27 Jan 2019 Bin Xu, Guanhua He, Benjamin G. Weiner, Pierre Ronceray, Yigal Meir, Martin C. Jonikas, Ned S. Wingreen

One class of such condensates is composed of two polymer species, where each consists of repeated binding sites that interact in a one-to-one fashion with the binding sites of the other polymer.

Biological Physics, Soft Condensed Matter, Subcellular Processes

Robustness of Maximum Correntropy Estimation Against Large Outliers

no code implementations 23 Mar 2017 Badong Chen, Lei Xing, Haiquan Zhao, Bin Xu, Jose C. Principe

The maximum correntropy criterion (MCC) has recently been successfully applied in robust regression, classification and adaptive filtering, where the correntropy is maximized instead of minimizing the well-known mean square error (MSE) to improve the robustness with respect to outliers (or impulsive noises).

Maximum Correntropy Unscented Filter

no code implementations 26 Aug 2016 Xi Liu, Badong Chen, Bin Xu, Zongze Wu, Paul Honeine

To improve the robustness of the UKF against impulsive noises, a new filter for nonlinear systems is proposed in this work, namely the maximum correntropy unscented filter (MCUF).

Kernel Risk-Sensitive Loss: Definition, Properties and Application to Robust Adaptive Filtering

no code implementations 1 Aug 2016 Badong Chen, Lei Xing, Bin Xu, Haiquan Zhao, Nanning Zheng, Jose C. Principe

Nonlinear similarity measures defined in kernel space, such as correntropy, can extract higher-order statistics of data and offer potentially significant performance improvement over their linear counterparts especially in non-Gaussian signal processing and machine learning.

Social cycling and conditional responses in the Rock-Paper-Scissors game

1 code implementation 21 Apr 2014 Zhijian Wang, Bin Xu, Hai-Jun Zhou

How humans make decisions in non-cooperative strategic interactions is a challenging question.

Physics and Society, Computer Science and Game Theory
