Search Results for author: Han Lin

Found 18 papers, 7 papers with code

DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation

no code implementations 25 Nov 2024 Zun Wang, Jialu Li, Han Lin, Jaehong Yoon, Mohit Bansal

To address these challenges, we propose DreamRunner, a novel story-to-video generation method: First, we structure the input script using a large language model (LLM) to facilitate both coarse-grained scene planning as well as fine-grained object-level layout and motion planning.

Large Language Model Motion Planning +4

VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning

no code implementations 4 Oct 2024 Han Lin, Tushar Nagarajan, Nicolas Ballas, Mido Assran, Mojtaba Komeili, Mohit Bansal, Koustuv Sinha

In this work, we show that a strong off-the-shelf frozen pretrained visual encoder, along with a well designed prediction model, can achieve state-of-the-art (SoTA) performance in forecasting and procedural planning without the need for pretraining the prediction model, nor requiring additional supervision from language or ASR.

Action Anticipation Denoising +2

Fast Tree-Field Integrators: From Low Displacement Rank to Topological Transformers

1 code implementation 22 Jun 2024 Krzysztof Choromanski, Arijit Sehanobish, Somnath Basu Roy Chowdhury, Han Lin, Avinava Dubey, Tamas Sarlos, Snigdha Chaturvedi

We present a new class of fast polylog-linear algorithms based on the theory of structured matrices (in particular low displacement rank) for integrating tensor fields defined on weighted trees.

Graph Classification

MCM: Multi-condition Motion Synthesis Framework

1 code implementation 19 Apr 2024 Zeyu Ling, Bo Han, Yongkang Wongkan, Han Lin, Mohan Kankanhalli, Weidong Geng

Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions.

Motion Synthesis

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

no code implementations 15 Apr 2024 Han Lin, Jaemin Cho, Abhay Zala, Mohit Bansal

ControlNets are widely used for adding spatial control to text-to-image diffusion models with different conditions, such as depth maps, scribbles/sketches, and human poses.

Image Generation Style Transfer +3

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

no code implementations 18 Mar 2024 Abhay Zala, Jaemin Cho, Han Lin, Jaehong Yoon, Mohit Bansal

Then, we enable the LLM to continuously adapt the generated environments to progressively improve the skills that the agent is weak at, by providing feedback to the LLM in the form of the agent's performance.

Reinforcement Learning (RL) World Knowledge

DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

no code implementations 18 Oct 2023 Abhay Zala, Han Lin, Jaemin Cho, Mohit Bansal

In the second stage, we use a diagram generator, DiagramGLIGEN, and a text label rendering module to generate diagrams (with clear text labels) following the diagram plans.

VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning

no code implementations 26 Sep 2023 Han Lin, Abhay Zala, Jaemin Cho, Mohit Bansal

Our experiments demonstrate that our proposed VideoDirectorGPT framework substantially improves layout and movement control in both single- and multi-scene video generation and can generate multi-scene videos with consistency, while achieving competitive performance with SOTAs in open-domain single-scene T2V generation.

Image Generation Video Generation

Supervised Masked Knowledge Distillation for Few-Shot Transformers

1 code implementation CVPR 2023 Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang

Vision Transformers (ViTs) emerge to achieve impressive performance on many data-abundant computer vision tasks by capturing long-range dependencies among local features.

Few-Shot Learning Inductive Bias +1

Self-supervised Learning for Segmentation and Quantification of Dopamine Neurons in Parkinson's Disease

no code implementations 11 Jan 2023 Fatemeh Haghighi, Soumitra Ghosh, Hai Ngu, Sarah Chu, Han Lin, Mohsen Hejrati, Baris Bingol, Somaye Hashemifar

To this end, we propose an end-to-end deep learning framework based on self-supervised learning for the segmentation and quantification of dopaminergic neurons in PD animal models.

Deep Learning Self-Supervised Learning

TANDEM3D: Active Tactile Exploration for 3D Object Recognition

no code implementations 19 Sep 2022 Jingxi Xu, Han Lin, Shuran Song, Matei Ciocarlie

In this work, we propose TANDEM3D, a method that applies a co-training framework for exploration and decision making to 3D object recognition with tactile signals.

3D Object Recognition Decision Making +1

Hybrid Random Features

1 code implementation ICLR 2022 Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide most accurate approximation in the defined regions of interest.

Benchmarking

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers

1 code implementation 16 Jul 2021 Krzysztof Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamas Sarlos, Adrian Weller, Thomas Weingarten

In this paper we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformers architectures in a scalable way.

Graph Attention

The Tsinghua University-Ma Huateng Telescopes for Survey: Overview and Performance of the System

no code implementations 21 Dec 2020 Ji-Cheng Zhang, Xiao-Feng Wang, Jun Mo, Gao-Bo Xi, Jie Lin, Xiao-Jun Jiang, Xiao-Ming Zhang, Wen-Xiong Li, Sheng-Yu Yan, Zhi-Hao Chen, Lei Hu, Xue Li, Wei-Li Lin, Han Lin, Cheng Miao, Li-Ming Rui, Han-Na Sai, Dan-Feng Xiang, Xing-Han Zhang

The TMTS system can have a FoV of about 9 deg² when monitoring the sky with two bands (i.e., SDSS g and r filters) at the same time, and a maximum FoV of ~18 deg² when four telescopes monitor different sky areas in monochromatic filter mode.

Instrumentation and Methods for Astrophysics

Demystifying Orthogonal Monte Carlo and Beyond

no code implementations NeurIPS 2020 Han Lin, Haoxian Chen, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski

Orthogonal Monte Carlo (OMC) is a very effective sampling algorithm imposing structural geometric conditions (orthogonality) on samples for variance reduction.
