Search Results for author: Wenxuan Xie

Found 10 papers, 2 papers with code

Learning to Update for Object Tracking with Recurrent Meta-learner

no code implementations • 19 Jun 2018 • Bi Li, Wenxuan Xie, Wen-Jun Zeng, Wenyu Liu

Generally, model update is formulated as an online learning problem where a target model is learned over the online training set.

Ranked #1 on Visual Tracking on OTB-100

Meta-Learning, Visual Object Tracking, +1

Detect or Track: Towards Cost-Effective Video Object Detection/Tracking

no code implementations • 13 Nov 2018 • Hao Luo, Wenxuan Xie, Xinggang Wang, Wen-Jun Zeng

Trackers are in general more efficient than detectors but bear the risk of drifting.

Object, object-detection, +1

A Semi-supervised Sensing Rate Learning based CMAB Scheme to Combat COVID-19 by Trustful Data Collection in the Crowd

no code implementations • 17 Jan 2023 • Jianheng Tang, Kejia Fan, Wenxuan Xie, Luomin Zeng, Feijiang Han, Guosheng Huang, Tian Wang, Anfeng Liu, Shaobo Zhang

In this paper, an incentive mechanism named Semi-supervision based Combinatorial Multi-Armed Bandit reverse Auction (SCMABA) is proposed to solve the recruitment problem of multiple unknown and strategic workers in mobile crowdsensing (MCS).

Unifying Layout Generation with a Decoupled Diffusion Model

no code implementations • CVPR 2023 • Mude Hui, Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yuwang Wang, Yan Lu

Since different attributes have their individual semantics and characteristics, we propose to decouple the diffusion processes for them to improve the diversity of training samples and learn the reverse process jointly to exploit global-scope contexts for facilitating generation.

Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators

no code implementations • 2 Jun 2023 • Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yan Lu

Specifically, we present Responsible Task Automation (ResponsibleTA) as a fundamental framework to facilitate responsible collaboration between LLM-based coordinators and executors for task automation with three empowered capabilities: 1) predicting the feasibility of the commands for executors; 2) verifying the completeness of executors; 3) enhancing the security (e.g., the protection of users' privacy).

Prompt Engineering

Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API

no code implementations • 7 Oct 2023 • Zhizheng Zhang, Wenxuan Xie, Xiaoyi Zhang, Yan Lu

In this work, we build a multimodal model to ground natural language instructions in given UI screenshots as a generic UI task automation executor.

Decoder, Document Understanding, +1

Retrieval-based Video Language Model for Efficient Long Video Question Answering

no code implementations • 8 Dec 2023 • Jiaqi Xu, Cuiling Lan, Wenxuan Xie, Xuejin Chen, Yan Lu

To address these issues, we introduce a simple yet effective retrieval-based video language model (R-VLM) for efficient and interpretable long video QA.

Language Modelling, Natural Language Understanding, +4

Slot-VLM: SlowFast Slots for Video-Language Modeling

no code implementations • 20 Feb 2024 • Jiaqi Xu, Cuiling Lan, Wenxuan Xie, Xuejin Chen, Yan Lu

A pivotal challenge is the development of an efficient method to encapsulate video content into a set of representative tokens to align with LLMs.

Language Modelling, Object, +3

Unsupervised Visual Representation Learning by Tracking Patches in Video

1 code implementation • CVPR 2021 • Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong

The proxy task is to estimate the position and size of the image patch in a sequence of video frames, given only the target bounding box in the first frame.

Action Classification, Action Recognition, +1

Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?

2 code implementations • 12 Sep 2021 • Chuanxin Tang, Yucheng Zhao, Guangting Wang, Chong Luo, Wenxuan Xie, Wenjun Zeng

Specifically, we replace the MLP module in the token-mixing step with a novel sparse MLP (sMLP) module.

Ranked #394 on Image Classification on ImageNet

Image Classification
