Search Results for author: Xingyu Fu

Found 18 papers, 9 papers with code

There’s a Time and Place for Reasoning Beyond the Image

1 code implementation ACL 2022 Xingyu Fu, Ben Zhou, Ishaan Chandratreya, Carl Vondrick, Dan Roth

Images are often more significant than only the pixels to human eyes, as we can infer, associate, and reason with contextual information from other sources to establish a more complete picture.

16k Image Clustering

BLINK: Multimodal Large Language Models Can See but Not Perceive

no code implementations 18 Apr 2024 Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna

We introduce Blink, a new benchmark for multimodal language models (LLMs) that focuses on core visual perception abilities not found in other evaluations.

Depth Estimation Multiple-choice +1

Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?

1 code implementation 16 Nov 2023 Bangzheng Li, Ben Zhou, Fei Wang, Xingyu Fu, Dan Roth, Muhao Chen

During the construction of the evidence, we purposefully replace semantic clues (entities) that may lead to the correct answer with distractor clues (evidence) that will not directly lead to the correct answer but require a chain-like reasoning process.

Hallucination Sentence

ImagenHub: Standardizing the evaluation of conditional image generation models

2 code implementations 2 Oct 2023 Max Ku, Tianle Li, Kai Zhang, Yujie Lu, Xingyu Fu, Wenwen Zhuang, Wenhu Chen

Recently, a myriad of conditional image generation and editing models have been developed to serve different downstream tasks, including text-to-image generation, text-guided image editing, subject-driven image generation, control-guided image generation, etc.

Conditional Image Generation text-guided-image-editing

Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality

no code implementations 31 Aug 2023 Xingyu Fu, Mingze Xi

This paper proposes and validates a deep-learning-based approach that enables AR applications to accurately predict keystrokes from the user-perspective RGB video stream that can be captured by any AR headset.

Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering

no code implementations 24 May 2023 Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, Dan Roth

We propose the Dynamic Clue Bottleneck Model (DCLUB), a method that is designed towards an inherently interpretable VQA system.

Question Answering Visual Question Answering

There is a Time and Place for Reasoning Beyond the Image

1 code implementation 1 Mar 2022 Xingyu Fu, Ben Zhou, Ishaan Preetam Chandratreya, Carl Vondrick, Dan Roth

For example, in Figure 1, we can find a way to identify the news articles related to the picture through segment-wise understandings of the signs, the buildings, the crowds, and more.

16k Image Clustering +1

An FEA surrogate model with Boundary Oriented Graph Embedding approach

1 code implementation 30 Aug 2021 Xingyu Fu, Fengfeng Zhou, Dheeraj Peddireddy, Zhengyang Kang, Martin Byung-Guk Jun, Vaneet Aggarwal

In this work, we present a Boundary Oriented Graph Embedding (BOGE) approach for the Graph Neural Network (GNN) to serve as a general surrogate model for regressing physical fields and solving boundary value problems.

Cantilever Beam Decision Making +2

ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition

1 code implementation 2 Sep 2020 Jiuniu Wang, Wenjia Xu, Xingyu Fu, Guangluan Xu, Yirong Wu

Under such circumstances, how to make full use of the information extracted by word embedding requires more in-depth research.

named-entity-recognition Named Entity Recognition +1

SRQA: Synthetic Reader for Factoid Question Answering

1 code implementation 2 Sep 2020 Jiuniu Wang, Wenjia Xu, Xingyu Fu, Yang Wei, Li Jin, Ziyan Chen, Guangluan Xu, Yirong Wu

This model enhances the question answering system in the multi-document scenario from three aspects: model structure, optimization goal, and training method, corresponding to Multilayer Attention (MA), Cross Evidence (CE), and Adversarial Training (AT) respectively.

Question Answering

Design Challenges in Low-resource Cross-lingual Entity Linking

1 code implementation EMNLP 2020 Xingyu Fu, Weijia Shi, Xiaodong Yu, Zian Zhao, Dan Roth

Cross-lingual Entity Linking (XEL), the problem of grounding mentions of entities in a foreign language text into an English knowledge base such as Wikipedia, has seen a lot of research in recent years, with a range of promising techniques.

Cross-Lingual Entity Linking Entity Linking

AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning

1 code implementation 27 Sep 2018 Zheng Xie, Xingyu Fu, JinYuan Yu

In this project, we combine the AlphaGo algorithm with Curriculum Learning to crack the game of Gomoku.

A Machine Learning Framework for Stock Selection

no code implementations 5 Jun 2018 XingYu Fu, JinHong Du, Yifeng Guo, Mingwen Liu, Tao Dong, XiuWen Duan

The effectiveness of the stock selection strategy is validated in Chinese stock market in both statistical and practical aspects, showing that: 1) Stacking outperforms other models reaching an AUC score of 0.972; 2) Genetic Algorithm picks a subset of 114 features and the prediction performances of all models remain almost unchanged after the selection procedure, which suggests some features are indeed redundant; 3) LR and DNN are radical models; RF is risk-neutral model; Stacking is somewhere between DNN and RF.

BIG-bench Machine Learning feature selection

Robust Log-Optimal Strategy with Reinforcement Learning

no code implementations 1 May 2018 Yifeng Guo, Xingyu Fu, Yuyan Shi, Mingwen Liu

We proposed a new Portfolio Management method termed as Robust Log-Optimal Strategy (RLOS), which ameliorates the General Log-Optimal Strategy (GLOS) by approximating the traditional objective function with quadratic Taylor expansion.

Management reinforcement-learning +1

Language Distribution Prediction based on Batch Markov Monte Carlo Simulation with Migration

no code implementations 26 Feb 2018 XingYu Fu, ZiYi Yang, XiuWen Duan

To model the randomness of language spreading, we propose the Batch Markov Monte Carlo Simulation with Migration (BMMCSM) algorithm, in which each agent is treated as a language stack.

Cultural Vocal Bursts Intensity Prediction
