Search Results for author: ran Xu

Found 55 papers, 27 papers with code

DocQueryNet: Value Retrieval with Arbitrary Queries for Form-like Documents

1 code implementation • COLING 2022 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, ran Xu, Caiming Xiong

Unlike previous methods that only address a fixed set of field items, our method predicts target value for an arbitrary query based on the understanding of the layout and semantics of a form.

document understanding Language Modelling +1

Paper
Code

SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant

1 code implementation • 17 Mar 2024 • Guohao Sun, Can Qin, Jiamian Wang, Zeyuan Chen, ran Xu, Zhiqiang Tao

Recent advancements in the vision-language model have shown notable generalization in vision-language tasks after visual instruction tuning.

Ranked #35 on Visual Question Answering on MM-Vet

Language Modelling Question Answering +2

Paper
Code

NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation

no code implementations • 13 Mar 2024 • ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong

To address these challenges, we introduce a comprehensive benchmark, NrVLM, comprising 15 distinct manipulation tasks, containing over 4500 episodes meticulously annotated with fine-grained language instructions.

Robot Manipulation

Paper
Add Code

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

1 code implementation • 28 Feb 2024 • Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, ran Xu, Wenpeng Yin, Caiming Xiong

This paper presents FoFo, a pioneering benchmark for evaluating large language models' (LLMs) ability to follow complex, domain-specific formats, a crucial yet underexamined capability for their application as AI agents.

Paper
Code

RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records

no code implementations • 25 Feb 2024 • ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Bowen Jin, May D. Wang, Joyce C. Ho, Carl Yang

We present RAM-EHR, a Retrieval AugMentation pipeline to improve clinical predictions on Electronic Health Records (EHRs).

Retrieval

Paper
Add Code

Multimodal Fusion of EHR in Structures and Semantics: Integrating Clinical Records and Notes with Hypergraph and LLM

no code implementations • 19 Feb 2024 • Hejie Cui, Xinyu Fang, ran Xu, Xuan Kan, Joyce C. Ho, Carl Yang

While there has been a lot of research on representation learning of structured EHR data, the fusion of different types of EHR data (multimodal fusion) is not well studied.

Decision Making Representation Learning

Paper
Add Code

MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation

no code implementations • 14 Jan 2024 • Jiaqi Chen, Bingqian Lin, ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee K. Wong

Embodied agents equipped with GPT as their brain have exhibited extraordinary decision-making and generalization abilities across various tasks.

Decision Making Vision and Language Navigation

Paper
Add Code

EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records

1 code implementation • 13 Jan 2024 • Wenqi Shi, ran Xu, Yuchen Zhuang, Yue Yu, Jieyu Zhang, Hang Wu, Yuanda Zhu, Joyce Ho, Carl Yang, May D. Wang

Large language models (LLMs) have demonstrated exceptional capabilities in planning and tool utilization as autonomous agents, but few have been developed for medical problem-solving.

Code Generation Few-Shot Learning +1

Paper
Code

TrustLLM: Trustworthiness in Large Language Models

1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao liu, Heng Ji, Hongyi Wang, huan zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.

Ethics Fairness

265

Paper
Code

Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation

no code implementations • 19 Dec 2023 • Jiaming Liu, ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang

To tackle these issues, we propose a continual self-supervised method, Adaptive Distribution Masked Autoencoders (ADMA), which enhances the extraction of target domain knowledge while mitigating the accumulation of distribution shifts.

Self-Supervised Learning Test-time Adaptation

Paper
Add Code

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

1 code implementation • 30 Nov 2023 • Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles

Vision-language pre-training and instruction tuning have demonstrated general-purpose capabilities in 2D visual reasoning tasks by aligning visual encoders with state-of-the-art large language models (LLMs).

Visual Reasoning

Paper
Code

Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models

1 code implementation • 1 Nov 2023 • ran Xu, Hejie Cui, Yue Yu, Xuan Kan, Wenqi Shi, Yuchen Zhuang, Wei Jin, Joyce Ho, Carl Yang

Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts.

Clinical Knowledge Knowledge Graphs +1

Paper
Code

Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation

no code implementations • 24 Sep 2023 • Jiayi Ni, Senqiao Yang, ran Xu, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, Zehui Chen, Yi Liu, Shanghang Zhang

In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications.

Autonomous Driving Semantic Segmentation +1

Paper
Add Code

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

2 code implementations • 11 Aug 2023 • Zhiwei Liu, Weiran Yao, JianGuo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

The massive successes of large language models (LLMs) encourage the emerging exploration of LLM-augmented Autonomous Agents (LAAs).

Benchmarking Decision Making

267

Paper
Code

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

no code implementations • 4 Aug 2023 • Weiran Yao, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, JianGuo Zhang, Devansh Arpit, ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

This demonstrates that using policy gradient optimization to improve language agents, for which we believe our work is one of the first, seems promising and can be applied to optimize other models in the agent architecture to enhance agent performances over time.

Language Modelling

Paper
Add Code

REX: Rapid Exploration and eXploitation for AI Agents

no code implementations • 18 Jul 2023 • Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

In this paper, we propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.

Decision Making Reinforcement Learning (RL)

Paper
Add Code

Weakly-Supervised Scientific Document Classification via Retrieval-Augmented Multi-Stage Training

1 code implementation • 12 Jun 2023 • ran Xu, Yue Yu, Joyce C. Ho, Carl Yang

To address this challenge, we propose a weakly-supervised approach for scientific document classification using label names only.

Document Classification Retrieval

Paper
Code

A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises

no code implementations • 7 Jun 2023 • Hejie Cui, Jiaying Lu, Shiyu Wang, ran Xu, Wenjing Ma, Shaojun Yu, Yue Yu, Xuan Kan, Chen Ling, Tianfan Fu, Liang Zhao, Joyce Ho, Fei Wang, Carl Yang

This work aims to serve as a valuable resource for understanding the potential and opportunities of HKG in health research.

Knowledge Graphs

Paper
Add Code

R-Mixup: Riemannian Mixup for Biological Networks

no code implementations • 5 Jun 2023 • Xuan Kan, Zimu Li, Hejie Cui, Yue Yu, ran Xu, Shaojun Yu, Zilong Zhang, Ying Guo, Carl Yang

Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities.

Data Augmentation

Paper
Add Code

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

1 code implementation • NeurIPS 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, ran Xu

Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.

Image Generation

577

Paper
Code

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

1 code implementation • 14 May 2023 • Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, ran Xu, Juan Carlos Niebles, Silvio Savarese

Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.

Ranked #4 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)

3D Point Cloud Classification Representation Learning +1

350

Paper
Code

Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations

no code implementations • CVPR 2023 • Vibashan VS, Ning Yu, Chen Xing, Can Qin, Mingfei Gao, Juan Carlos Niebles, Vishal M. Patel, ran Xu

In summary, an OV method learns task-specific information using strong supervision from base annotations and novel category information using weak supervision from image-captions pairs.

Image Captioning Instance Segmentation +2

Paper
Add Code

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

1 code implementation • ICCV 2023 • Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, ran Xu

Empirical results show that GlueNet can be trained efficiently and enables various capabilities beyond previous state-of-the-art models: 1) multilingual language models such as XLM-Roberta can be aligned with existing T2I models, allowing for the generation of high-quality images from captions beyond English; 2) GlueNet can align multi-modal encoders such as AudioCLIP with the Stable Diffusion model, enabling sound-to-image generation; 3) it can also upgrade the current text encoder of the latent diffusion model for challenging case generation.

Image Generation

Paper
Code

HIVE: Harnessing Human Feedback for Instructional Visual Editing

1 code implementation • 16 Mar 2023 • Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, ran Xu

Incorporating human feedback has been shown to be crucial to align text generated by large language models to human preferences.

Text-based Image Editing

Paper
Code

Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation

no code implementations • ICCV 2023 • Qichen Fu, Xingyu Liu, ran Xu, Juan Carlos Niebles, Kris M. Kitani

Accurately estimating 3D hand pose is crucial for understanding how humans interact with the world.

Hand Pose Estimation

Paper
Add Code

Neighborhood-Regularized Self-Training for Learning with Few Labels

1 code implementation • 10 Jan 2023 • ran Xu, Yue Yu, Hejie Cui, Xuan Kan, Yanqiao Zhu, Joyce Ho, Chao Zhang, Carl Yang

Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36. 8% and saves 57. 3% of the time when compared with the best baseline.

Paper
Code

Model-Agnostic Hierarchical Attention for 3D Object Detection

no code implementations • 6 Jan 2023 • Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, ran Xu

By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.

3D Object Detection Object +1

Paper
Add Code

LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer

1 code implementation • 19 Dec 2022 • Ning Yu, Chia-Chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul Josel, Juan Carlos Niebles, Caiming Xiong, ran Xu

Graphic layout designs play an essential role in visual communication.

Paper
Code

ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding

1 code implementation • CVPR 2023 • Le Xue, Mingfei Gao, Chen Xing, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, ran Xu, Juan Carlos Niebles, Silvio Savarese

Then, ULIP learns a 3D representation space aligned with the common image-text space, using a small number of automatically synthesized triplets.

Ranked #3 on Training-free 3D Point Cloud Classification on ModelNet40 (using extra training data)

3D Architecture 3D Classification +5

350

Paper
Code

Tackling Data Heterogeneity in Federated Learning with Class Prototypes

1 code implementation • 6 Dec 2022 • Yutong Dai, Zeyuan Chen, Junnan Li, Shelby Heinecke, Lichao Sun, ran Xu

We propose FedNH, a novel method that improves the local models' performance for both personalization and generalization by combining the uniformity and semantics of class prototypes.

Personalized Federated Learning

Paper
Code

Learning Task-Aware Effective Brain Connectivity for fMRI Analysis with Graph Neural Networks

1 code implementation • 1 Nov 2022 • Yue Yu, Xuan Kan, Hejie Cui, ran Xu, Yujia Zheng, Xiangchen Song, Yanqiao Zhu, Kun Zhang, Razieh Nabi, Ying Guo, Chao Zhang, Carl Yang

To better adapt GNNs for fMRI analysis, we propose TBDS, an end-to-end framework based on \underline{T}ask-aware \underline{B}rain connectivity \underline{D}AG (short for Directed Acyclic Graph) \underline{S}tructure generation for fMRI analysis.

Time Series Time Series Analysis

Paper
Code

Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

1 code implementation • 15 Sep 2022 • Yue Yu, Rongzhi Zhang, ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang

Large Language Models have demonstrated remarkable few-shot performance, but the performance can be sensitive to the selection of few-shot instances.

Language Modelling Text Classification

Paper
Code

TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation

1 code implementation • 3 Aug 2022 • Jun Wang, Mingfei Gao, Yuqian Hu, Ramprasaath R. Selvaraju, Chetan Ramaiah, ran Xu, Joseph F. JaJa, Larry S. Davis

To address this deficiency, we develop a new method to generate high-quality and diverse QA pairs by explicitly utilizing the existing rich text available in the scene context of each image.

Ranked #3 on Visual Question Answering (VQA) on TextVQA test-standard

Answer Generation Question-Answer-Generation +3

Paper
Code

Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework

1 code implementation • CVPR 2022 • Shu Zhang, ran Xu, Caiming Xiong, Chetan Ramaiah

Current contrastive learning frameworks focus on leveraging a single supervisory signal to learn representations, which limits the efficacy on unseen data and downstream tasks.

Contrastive Learning Representation Learning

141

Paper
Code

SmartAdapt: Multi-Branch Object Detection Framework for Videos on Mobiles

no code implementations • CVPR 2022 • ran Xu, Fangzhou Mu, Jayoung Lee, Preeti Mukherjee, Somali Chaterji, Saurabh Bagchi, Yin Li

In this paper, we ask, and answer, the wide-ranging question across all MBODFs: How to expose the right set of execution branches and then how to schedule the optimal one at inference time?

object-detection Video Object Detection

Paper
Add Code

Virtuoso: Video-based Intelligence for real-time tuning on SOCs

no code implementations • 24 Dec 2021 • Jayoung Lee, Pengcheng Wang, ran Xu, Venkat Dasari, Noah Weston, Yin Li, Saurabh Bagchi, Somali Chaterji

First, the system does not consider energy consumption of the models while making a decision on which model to run.

Image Classification object-detection +1

Paper
Add Code

Value Retrieval with Arbitrary Queries for Form-like Documents

1 code implementation • 15 Dec 2021 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, ran Xu, Caiming Xiong

Unlike previous methods that only address a fixed set of field items, our method predicts target value for an arbitrary query based on the understanding of the layout and semantics of a form.

document understanding Language Modelling +1

Paper
Code

Burn After Reading: Online Adaptation for Cross-domain Streaming Data

no code implementations • 8 Dec 2021 • Luyu Yang, Mingfei Gao, Zeyuan Chen, ran Xu, Abhinav Shrivastava, Chetan Ramaiah

In the context of online privacy, many methods propose complex privacy and security preserving measures to protect sensitive data.

Unsupervised Domain Adaptation

Paper
Add Code

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels

1 code implementation • 18 Nov 2021 • Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, ran Xu, Wenhao Liu, Caiming Xiong

To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.

Object object-detection +1

Paper
Code

Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks

1 code implementation • 8 Oct 2021 • Le Xue, Mingfei Gao, Zeyuan Chen, Caiming Xiong, ran Xu

We propose a novel framework to evaluate the robustness of transformer-based form field extraction methods via form attacks.

Optical Character Recognition (OCR)

4,291

Paper
Code

Field Extraction from Forms with Unlabeled Data

2 code implementations • SpaNLP (ACL) 2022 • Mingfei Gao, Zeyuan Chen, Nikhil Naik, Kazuma Hashimoto, Caiming Xiong, ran Xu

We propose a novel framework to conduct field extraction from forms with unlabeled data.

Pseudo Label

Paper
Code

MetaHistoSeg: A Python Framework for Meta Learning in Histopathology Image Segmentation

no code implementations • 29 Sep 2021 • Zheng Yuan, Andre Esteva, ran Xu

We also curate a histopathology meta dataset - a benchmark dataset for training and validating models on out-of-distribution performance across a range of cancer types.

Domain Generalization Few-Shot Learning +3

Paper
Add Code

JANUS: Benchmarking Commercial and Open-Source Cloud and Edge Platforms for Object and Anomaly Detection Workloads

no code implementations • 9 Dec 2020 • Karthick Shankar, Pengcheng Wang, ran Xu, Ashraf Mahgoub, Somali Chaterji

In addition, we also look at the pros and cons of some of the proprietary deep-learning object detection packages, such as Amazon Rekognition, Google Vision, and Azure Cognitive Services, to contrast with open-source and tunable solutions, such as Faster R-CNN (FRCNN).

Anomaly Detection Benchmarking +4

Paper
Add Code

ApproxDet: Content and Contention-Aware Approximate Object Detection for Mobiles

1 code implementation • 21 Oct 2020 • ran Xu, Chen-Lin Zhang, Pengcheng Wang, Jayoung Lee, Subrata Mitra, Somali Chaterji, Yin Li, Saurabh Bagchi

In this paper we introduce ApproxDet, an adaptive video object detection framework for mobile devices to meet accuracy-latency requirements in the face of changing content and resource contention scenarios.

Object object-detection +3

Paper
Code

WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos

no code implementations • CVPR 2021 • Mingfei Gao, Yingbo Zhou, ran Xu, Richard Socher, Caiming Xiong

Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications.

Ranked #5 on Online Action Detection on THUMOS'14

Action Recognition Online Action Detection

Paper
Add Code

Proposal Learning for Semi-Supervised Object Detection

no code implementations • 15 Jan 2020 • Peng Tang, Chetan Ramaiah, Yan Wang, ran Xu, Caiming Xiong

two-stage object detectors) by training on both labeled and unlabeled data.

Object object-detection +2

Paper
Add Code

Context-aware Active Multi-Step Reinforcement Learning

no code implementations • 11 Nov 2019 • Gang Chen, Dingcheng Li, ran Xu

Then given the selected samples, we propose the adaptive multi-step TD, which generalizes TD($\lambda$), but adaptively switch on/off the backups from future returns of different steps.

Active Learning Decision Making +2

Paper
Add Code

ApproxNet: Content and Contention-Aware Video Analytics System for Embedded Clients

no code implementations • 28 Aug 2019 • Ran Xu, Rakesh Kumar, Pengcheng Wang, Peter Bai, Ganga Meghanath, Somali Chaterji, Subrata Mitra, Saurabh Bagchi

None of the current approximation techniques for object classification DNNs can adapt to changing runtime conditions, e. g., changes in resource availability on the device, the content characteristics, or requirements from the user.

Object Detection

Paper
Add Code

TempoCave: Visualizing Dynamic Connectome Datasets to Support Cognitive Behavioral Therapy

1 code implementation • 18 Jun 2019 • Ran Xu, Manu Mathew Thomas, Alex Leow, Olusola Ajilore, Angus G. Forbes

We introduce TempoCave, a novel visualization application for analyzing dynamic brain networks, or connectomes.

Human-Computer Interaction Neurons and Cognition

Paper
Code

Collecting and Annotating the Large Continuous Action Dataset

no code implementations • 18 Nov 2015 • Daniel Paul Barrett, ran Xu, Haonan Yu, Jeffrey Mark Siskind

We make available to the community a new dataset to support action-recognition research.

Action Recognition Temporal Action Localization

Paper
Add Code

Human Action Segmentation With Hierarchical Supervoxel Consistency

no code implementations • CVPR 2015 • Jiasen Lu, ran Xu, Jason J. Corso

Detailed analysis of human action, such as action classification, detection and localization has received increasing attention from the community; datasets like JHMDB have made it plausible to conduct studies analyzing the impact that such deeper information has on the greater action understanding problem.

Action Classification Action Segmentation +3

Paper
Add Code

Sequential Labeling with online Deep Learning

no code implementations • 10 Dec 2014 • Gang Chen, ran Xu, Sargur Srihari

Deep learning has attracted great attention recently and yielded the state of the art performance in dimension reduction and classification problems.

Dimensionality Reduction

Paper
Add Code

Compositional Structure Learning for Action Understanding

no code implementations • 21 Oct 2014 • Ran Xu, Gang Chen, Caiming Xiong, Wei Chen, Jason J. Corso

The focus of the action understanding literature has predominately been classification, how- ever, there are many applications demanding richer action understanding such as mobile robotics and video search, with solutions to classification, localization and detection.

Action Detection Action Understanding +1

Paper
Add Code

Evaluating Term Extraction Methods for Interpreters

no code implementations • WS 2014 • Ran Xu, Serge Sharoff

Term Extraction

Paper
Add Code

Actionness Ranking with Lattice Conditional Ordinal Random Fields

no code implementations • CVPR 2014 • Wei Chen, Caiming Xiong, ran Xu, Jason J. Corso

Action analysis in image and video has been attracting more and more attention in computer vision.

Action Analysis

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.