1 code implementation • COLING 2022 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, ran Xu, Caiming Xiong
Unlike previous methods that only address a fixed set of field items, our method predicts target value for an arbitrary query based on the understanding of the layout and semantics of a form.
1 code implementation • 17 Mar 2024 • Guohao Sun, Can Qin, Jiamian Wang, Zeyuan Chen, ran Xu, Zhiqiang Tao
Recent advancements in the vision-language model have shown notable generalization in vision-language tasks after visual instruction tuning.
Ranked #35 on Visual Question Answering on MM-Vet
no code implementations • 13 Mar 2024 • ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong
To address these challenges, we introduce a comprehensive benchmark, NrVLM, comprising 15 distinct manipulation tasks, containing over 4500 episodes meticulously annotated with fine-grained language instructions.
1 code implementation • 28 Feb 2024 • Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, ran Xu, Wenpeng Yin, Caiming Xiong
This paper presents FoFo, a pioneering benchmark for evaluating large language models' (LLMs) ability to follow complex, domain-specific formats, a crucial yet underexamined capability for their application as AI agents.
no code implementations • 25 Feb 2024 • ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Bowen Jin, May D. Wang, Joyce C. Ho, Carl Yang
We present RAM-EHR, a Retrieval AugMentation pipeline to improve clinical predictions on Electronic Health Records (EHRs).
no code implementations • 19 Feb 2024 • Hejie Cui, Xinyu Fang, ran Xu, Xuan Kan, Joyce C. Ho, Carl Yang
While there has been a lot of research on representation learning of structured EHR data, the fusion of different types of EHR data (multimodal fusion) is not well studied.
no code implementations • 14 Jan 2024 • Jiaqi Chen, Bingqian Lin, ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee K. Wong
Embodied agents equipped with GPT as their brain have exhibited extraordinary decision-making and generalization abilities across various tasks.
1 code implementation • 13 Jan 2024 • Wenqi Shi, ran Xu, Yuchen Zhuang, Yue Yu, Jieyu Zhang, Hang Wu, Yuanda Zhu, Joyce Ho, Carl Yang, May D. Wang
Large language models (LLMs) have demonstrated exceptional capabilities in planning and tool utilization as autonomous agents, but few have been developed for medical problem-solving.
1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao liu, Heng Ji, Hongyi Wang, huan zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.
no code implementations • 19 Dec 2023 • Jiaming Liu, ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang
To tackle these issues, we propose a continual self-supervised method, Adaptive Distribution Masked Autoencoders (ADMA), which enhances the extraction of target domain knowledge while mitigating the accumulation of distribution shifts.
1 code implementation • 30 Nov 2023 • Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles
Vision-language pre-training and instruction tuning have demonstrated general-purpose capabilities in 2D visual reasoning tasks by aligning visual encoders with state-of-the-art large language models (LLMs).
1 code implementation • 1 Nov 2023 • ran Xu, Hejie Cui, Yue Yu, Xuan Kan, Wenqi Shi, Yuchen Zhuang, Wei Jin, Joyce Ho, Carl Yang
Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts.
no code implementations • 24 Sep 2023 • Jiayi Ni, Senqiao Yang, ran Xu, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, Zehui Chen, Yi Liu, Shanghang Zhang
In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications.
2 code implementations • 11 Aug 2023 • Zhiwei Liu, Weiran Yao, JianGuo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese
The massive successes of large language models (LLMs) encourage the emerging exploration of LLM-augmented Autonomous Agents (LAAs).
no code implementations • 4 Aug 2023 • Weiran Yao, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, JianGuo Zhang, Devansh Arpit, ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese
This demonstrates that using policy gradient optimization to improve language agents, for which we believe our work is one of the first, seems promising and can be applied to optimize other models in the agent architecture to enhance agent performances over time.
no code implementations • 18 Jul 2023 • Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese
In this paper, we propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.
1 code implementation • 12 Jun 2023 • ran Xu, Yue Yu, Joyce C. Ho, Carl Yang
To address this challenge, we propose a weakly-supervised approach for scientific document classification using label names only.
no code implementations • 7 Jun 2023 • Hejie Cui, Jiaying Lu, Shiyu Wang, ran Xu, Wenjing Ma, Shaojun Yu, Yue Yu, Xuan Kan, Chen Ling, Tianfan Fu, Liang Zhao, Joyce Ho, Fei Wang, Carl Yang
This work aims to serve as a valuable resource for understanding the potential and opportunities of HKG in health research.
no code implementations • 5 Jun 2023 • Xuan Kan, Zimu Li, Hejie Cui, Yue Yu, ran Xu, Shaojun Yu, Zilong Zhang, Ying Guo, Carl Yang
Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities.
1 code implementation • NeurIPS 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, ran Xu
Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.
1 code implementation • 14 May 2023 • Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, ran Xu, Juan Carlos Niebles, Silvio Savarese
Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.
Ranked #4 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)
no code implementations • CVPR 2023 • Vibashan VS, Ning Yu, Chen Xing, Can Qin, Mingfei Gao, Juan Carlos Niebles, Vishal M. Patel, ran Xu
In summary, an OV method learns task-specific information using strong supervision from base annotations and novel category information using weak supervision from image-captions pairs.
1 code implementation • ICCV 2023 • Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, ran Xu
Empirical results show that GlueNet can be trained efficiently and enables various capabilities beyond previous state-of-the-art models: 1) multilingual language models such as XLM-Roberta can be aligned with existing T2I models, allowing for the generation of high-quality images from captions beyond English; 2) GlueNet can align multi-modal encoders such as AudioCLIP with the Stable Diffusion model, enabling sound-to-image generation; 3) it can also upgrade the current text encoder of the latent diffusion model for challenging case generation.
1 code implementation • 16 Mar 2023 • Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, ran Xu
Incorporating human feedback has been shown to be crucial to align text generated by large language models to human preferences.
no code implementations • ICCV 2023 • Qichen Fu, Xingyu Liu, ran Xu, Juan Carlos Niebles, Kris M. Kitani
Accurately estimating 3D hand pose is crucial for understanding how humans interact with the world.
1 code implementation • 10 Jan 2023 • ran Xu, Yue Yu, Hejie Cui, Xuan Kan, Yanqiao Zhu, Joyce Ho, Chao Zhang, Carl Yang
Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36. 8% and saves 57. 3% of the time when compared with the best baseline.
no code implementations • 6 Jan 2023 • Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, ran Xu
By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.
1 code implementation • 19 Dec 2022 • Ning Yu, Chia-Chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul Josel, Juan Carlos Niebles, Caiming Xiong, ran Xu
Graphic layout designs play an essential role in visual communication.
1 code implementation • CVPR 2023 • Le Xue, Mingfei Gao, Chen Xing, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, ran Xu, Juan Carlos Niebles, Silvio Savarese
Then, ULIP learns a 3D representation space aligned with the common image-text space, using a small number of automatically synthesized triplets.
Ranked #3 on Training-free 3D Point Cloud Classification on ModelNet40 (using extra training data)
1 code implementation • 6 Dec 2022 • Yutong Dai, Zeyuan Chen, Junnan Li, Shelby Heinecke, Lichao Sun, ran Xu
We propose FedNH, a novel method that improves the local models' performance for both personalization and generalization by combining the uniformity and semantics of class prototypes.
1 code implementation • 1 Nov 2022 • Yue Yu, Xuan Kan, Hejie Cui, ran Xu, Yujia Zheng, Xiangchen Song, Yanqiao Zhu, Kun Zhang, Razieh Nabi, Ying Guo, Chao Zhang, Carl Yang
To better adapt GNNs for fMRI analysis, we propose TBDS, an end-to-end framework based on \underline{T}ask-aware \underline{B}rain connectivity \underline{D}AG (short for Directed Acyclic Graph) \underline{S}tructure generation for fMRI analysis.
1 code implementation • 15 Sep 2022 • Yue Yu, Rongzhi Zhang, ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang
Large Language Models have demonstrated remarkable few-shot performance, but the performance can be sensitive to the selection of few-shot instances.
1 code implementation • 3 Aug 2022 • Jun Wang, Mingfei Gao, Yuqian Hu, Ramprasaath R. Selvaraju, Chetan Ramaiah, ran Xu, Joseph F. JaJa, Larry S. Davis
To address this deficiency, we develop a new method to generate high-quality and diverse QA pairs by explicitly utilizing the existing rich text available in the scene context of each image.
1 code implementation • CVPR 2022 • Shu Zhang, ran Xu, Caiming Xiong, Chetan Ramaiah
Current contrastive learning frameworks focus on leveraging a single supervisory signal to learn representations, which limits the efficacy on unseen data and downstream tasks.
no code implementations • CVPR 2022 • ran Xu, Fangzhou Mu, Jayoung Lee, Preeti Mukherjee, Somali Chaterji, Saurabh Bagchi, Yin Li
In this paper, we ask, and answer, the wide-ranging question across all MBODFs: How to expose the right set of execution branches and then how to schedule the optimal one at inference time?
no code implementations • 24 Dec 2021 • Jayoung Lee, Pengcheng Wang, ran Xu, Venkat Dasari, Noah Weston, Yin Li, Saurabh Bagchi, Somali Chaterji
First, the system does not consider energy consumption of the models while making a decision on which model to run.
1 code implementation • 15 Dec 2021 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, ran Xu, Caiming Xiong
Unlike previous methods that only address a fixed set of field items, our method predicts target value for an arbitrary query based on the understanding of the layout and semantics of a form.
no code implementations • 8 Dec 2021 • Luyu Yang, Mingfei Gao, Zeyuan Chen, ran Xu, Abhinav Shrivastava, Chetan Ramaiah
In the context of online privacy, many methods propose complex privacy and security preserving measures to protect sensitive data.
1 code implementation • 18 Nov 2021 • Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, ran Xu, Wenhao Liu, Caiming Xiong
To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.
1 code implementation • 8 Oct 2021 • Le Xue, Mingfei Gao, Zeyuan Chen, Caiming Xiong, ran Xu
We propose a novel framework to evaluate the robustness of transformer-based form field extraction methods via form attacks.
2 code implementations • SpaNLP (ACL) 2022 • Mingfei Gao, Zeyuan Chen, Nikhil Naik, Kazuma Hashimoto, Caiming Xiong, ran Xu
We propose a novel framework to conduct field extraction from forms with unlabeled data.
no code implementations • 29 Sep 2021 • Zheng Yuan, Andre Esteva, ran Xu
We also curate a histopathology meta dataset - a benchmark dataset for training and validating models on out-of-distribution performance across a range of cancer types.
no code implementations • 9 Dec 2020 • Karthick Shankar, Pengcheng Wang, ran Xu, Ashraf Mahgoub, Somali Chaterji
In addition, we also look at the pros and cons of some of the proprietary deep-learning object detection packages, such as Amazon Rekognition, Google Vision, and Azure Cognitive Services, to contrast with open-source and tunable solutions, such as Faster R-CNN (FRCNN).
1 code implementation • 21 Oct 2020 • ran Xu, Chen-Lin Zhang, Pengcheng Wang, Jayoung Lee, Subrata Mitra, Somali Chaterji, Yin Li, Saurabh Bagchi
In this paper we introduce ApproxDet, an adaptive video object detection framework for mobile devices to meet accuracy-latency requirements in the face of changing content and resource contention scenarios.
no code implementations • CVPR 2021 • Mingfei Gao, Yingbo Zhou, ran Xu, Richard Socher, Caiming Xiong
Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications.
Ranked #5 on Online Action Detection on THUMOS'14
no code implementations • 15 Jan 2020 • Peng Tang, Chetan Ramaiah, Yan Wang, ran Xu, Caiming Xiong
two-stage object detectors) by training on both labeled and unlabeled data.
no code implementations • 11 Nov 2019 • Gang Chen, Dingcheng Li, ran Xu
Then given the selected samples, we propose the adaptive multi-step TD, which generalizes TD($\lambda$), but adaptively switch on/off the backups from future returns of different steps.
no code implementations • 28 Aug 2019 • Ran Xu, Rakesh Kumar, Pengcheng Wang, Peter Bai, Ganga Meghanath, Somali Chaterji, Subrata Mitra, Saurabh Bagchi
None of the current approximation techniques for object classification DNNs can adapt to changing runtime conditions, e. g., changes in resource availability on the device, the content characteristics, or requirements from the user.
1 code implementation • 18 Jun 2019 • Ran Xu, Manu Mathew Thomas, Alex Leow, Olusola Ajilore, Angus G. Forbes
We introduce TempoCave, a novel visualization application for analyzing dynamic brain networks, or connectomes.
Human-Computer Interaction Neurons and Cognition
no code implementations • 18 Nov 2015 • Daniel Paul Barrett, ran Xu, Haonan Yu, Jeffrey Mark Siskind
We make available to the community a new dataset to support action-recognition research.
no code implementations • CVPR 2015 • Jiasen Lu, ran Xu, Jason J. Corso
Detailed analysis of human action, such as action classification, detection and localization has received increasing attention from the community; datasets like JHMDB have made it plausible to conduct studies analyzing the impact that such deeper information has on the greater action understanding problem.
no code implementations • 10 Dec 2014 • Gang Chen, ran Xu, Sargur Srihari
Deep learning has attracted great attention recently and yielded the state of the art performance in dimension reduction and classification problems.
no code implementations • 21 Oct 2014 • Ran Xu, Gang Chen, Caiming Xiong, Wei Chen, Jason J. Corso
The focus of the action understanding literature has predominately been classification, how- ever, there are many applications demanding richer action understanding such as mobile robotics and video search, with solutions to classification, localization and detection.
no code implementations • CVPR 2014 • Wei Chen, Caiming Xiong, ran Xu, Jason J. Corso
Action analysis in image and video has been attracting more and more attention in computer vision.