Search Results for author: Zichen Zhang

Found 26 papers, 9 papers with code

Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks

no code implementations 27 Sep 2024 Min Yang, Zichen Zhang, LiMin Wang

With this unified token representation, Temporal2Seq can train a generalist model within a single architecture on different video understanding tasks.

Action Detection Action Segmentation +5

PEAR: Phrase-Based Hand-Object Interaction Anticipation

no code implementations 31 Jul 2024 Zichen Zhang, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

To address this, we propose a novel model, PEAR (Phrase-Based Hand-Object Interaction Anticipation), which jointly anticipates interaction intention and manipulation.

Object

SoupLM: Model Integration in Large Language and Multi-Modal Models

no code implementations 11 Jul 2024 Yue Bai, Zichen Zhang, Jiasen Lu, Yun Fu

Training large language models (LLMs) and multimodal LLMs necessitates significant computing resources, and existing publicly available LLMs are typically pre-trained on diverse, privately curated datasets spanning various tasks.

Chatbot

PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators

no code implementations 28 Jun 2024 Kuo-Hao Zeng, Zichen Zhang, Kiana Ehsani, Rose Hendrix, Jordi Salvador, Alvaro Herrasti, Ross Girshick, Aniruddha Kembhavi, Luca Weihs

We present PoliFormer (Policy Transformer), an RGB-only indoor navigation agent trained end-to-end with reinforcement learning at scale that generalizes to the real world without adaptation despite being trained purely in simulation.

Decoder Object +1

Bidirectional Progressive Transformer for Interaction Intention Anticipation

no code implementations 9 May 2024 Zichen Zhang, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

Building upon this relationship, a novel Bidirectional prOgressive Transformer (BOT), which introduces a Bidirectional Progressive mechanism into the anticipation of interaction intention, is established.

Prediction Trajectory Forecasting

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

1 code implementation 28 Dec 2023 Jiasen Lu, Christopher Clark, Sangho Lee, Zichen Zhang, Savya Khosla, Ryan Marten, Derek Hoiem, Aniruddha Kembhavi

We present Unified-IO 2, the first autoregressive multimodal model that is capable of understanding and generating image, text, audio, and action.

Decoder Image Generation +1

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

no code implementations 12 Oct 2023 Zichen Zhang, Yunshuang Li, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Yecheng Jason Ma, Luca Weihs

Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks.

reinforcement-learning

When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning

no code implementations 30 Mar 2023 Zichen Zhang, Luca Weihs

Finally, we find that our proposed approach can dramatically reduce the number of resets required for training other embodied tasks; in particular, for RoboTHOR ObjectNav we obtain higher success rates than episodic approaches using 99.97% fewer resets.

Reinforcement Learning (RL)

Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning

no code implementations 24 Dec 2022 Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang

We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network.

reinforcement-learning Reinforcement Learning +3

A Simple Decentralized Cross-Entropy Method

1 code implementation 16 Dec 2022 Zichen Zhang, Jun Jin, Martin Jagersand, Jun Luo, Dale Schuurmans

To tackle this issue, we propose Decentralized CEM (DecentCEM), a simple but effective improvement over classical CEM, by using an ensemble of CEM instances running independently from one another, and each performing a local improvement of its own sampling distribution.

continuous-control Continuous Control +1

VIMA: General Robot Manipulation with Multimodal Prompts

2 code implementations 6 Oct 2022 Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

We show that a wide spectrum of robot manipulation tasks can be expressed with multimodal prompts, interleaving textual and visual tokens.

Imitation Learning Language Modelling +3

Drawing Inductor Layout with a Reinforcement Learning Agent: Method and Application for VCO Inductors

no code implementations 23 Feb 2022 Cameron Haigh, Zichen Zhang, Negar Hassanpour, Khurram Javed, Yingying Fu, Shayan Shahramian, Shawn Zhang, Jun Luo

In light of the need to tweak the target specifications throughout the circuit design cycle, we also develop a variant in which the agent can learn to quickly adapt to draw new inductors for moderately different target specifications.

Reinforcement Learning (RL)

Decentralized Cross-Entropy Method for Model-Based Reinforcement Learning

no code implementations 29 Sep 2021 Zichen Zhang, Jun Jin, Martin Jagersand, Jun Luo, Dale Schuurmans

Further, we extend the decentralized approach to sequential decision-making problems where we show in 13 continuous control benchmark environments that it matches or outperforms the state-of-the-art CEM algorithms in most cases, under the same budget of the total number of samples for planning.

continuous-control Continuous Control +6

Sample Efficient Learning of Image-Based Diagnostic Classifiers Using Probabilistic Labels

no code implementations 11 Feb 2021 Roberto Vega, Pouneh Gorji, Zichen Zhang, Xuebin Qin, Abhilash Rakkunedeth Hareendranathan, Jeevesh Kapur, Jacob L. Jaremko, Russell Greiner

This complicates its use in tasks like image-based medical diagnosis, where the small training datasets are usually insufficient to learn appropriate data representations.

Medical Diagnosis

Boundary-Aware Segmentation Network for Mobile and Web Applications

5 code implementations 12 Jan 2021 Xuebin Qin, Deng-Ping Fan, Chenyang Huang, Cyril Diagne, Zichen Zhang, Adrià Cabeza Sant'Anna, Albert Suàrez, Martin Jagersand, Ling Shao

In this paper, we propose a simple yet powerful Boundary-Aware Segmentation Network (BASNet), which comprises a predict-refine architecture and a hybrid loss, for highly accurate image segmentation.

Camouflaged Object Segmentation Decoder +4

Reducing Selection Bias in Counterfactual Reasoning for Individual Treatment Effects Estimation

no code implementations 19 Dec 2019 Zichen Zhang, Qingfeng Lan, Lei Ding, Yue Wang, Negar Hassanpour, Russell Greiner

We learn two groups of latent random variables, where one group corresponds to variables that only cause selection bias, and the other group is relevant for outcome prediction.

counterfactual Counterfactual Reasoning +1

2017 Robotic Instrument Segmentation Challenge

3 code implementations 18 Feb 2019 Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, Luis Herrera, Wenqi Li, Vladimir Iglovikov, Huoling Luo, Jian Yang, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel, Mahdi Azizian

In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison.

Benchmarking Person Re-Identification +2

Robot eye-hand coordination learning by watching human demonstrations: a task function approximation approach

1 code implementation 29 Sep 2018 Jun Jin, Laura Petrich, Masood Dehghan, Zichen Zhang, Martin Jagersand

Our proposed method can directly learn from raw videos, which removes the need for hand-engineered task specification.

Robotics

End-to-end detection-segmentation network with ROI convolution

1 code implementation 8 Jan 2018 Zichen Zhang, Min Tang, Dana Cobzas, Dornoosh Zonoobi, Martin Jagersand, Jacob L. Jaremko

We propose an end-to-end neural network that improves the segmentation accuracy of fully convolutional networks by incorporating a localization unit.

Object Localization Segmentation

Incremental 3D Line Segment Extraction from Semi-dense SLAM

1 code implementation 10 Aug 2017 Shida He, Xuebin Qin, Zichen Zhang, Martin Jagersand

This approach reduces a 3D line segment fitting problem into two 2D line segment fitting problems and takes advantage of both images and depth maps.

Clustering Simultaneous Localization and Mapping +1
