Search Results for author: Emad Barsoum

Found 23 papers, 7 papers with code

Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE

1 code implementation · 10 Feb 2025 · Haiduo Huang, Fuwei Yang, Zhenhua Liu, Yixing Xu, Jinze Li, Yang Liu, Xuanwu Yin, Dong Li, Pengju Ren, Emad Barsoum

Speculative decoding (SD) accelerates large language model inference by using a smaller draft model to predict multiple tokens, which are then verified in parallel by the larger target model.

Diversity Language Modeling +2

Edit as You See: Image-guided Video Editing via Masked Motion Modeling

no code implementations · 8 Jan 2025 · Zhi-Lin Huang, Yixuan Liu, Chujun Qin, Zhongdao Wang, Dong Zhou, Dong Li, Emad Barsoum

In this paper, we propose a novel Image-guided Video Editing Diffusion model, termed IVEDiff, for image-guided video editing.

Optical Flow Estimation Self-Supervised Learning +1

Agent Laboratory: Using LLM Agents as Research Assistants

no code implementations · 8 Jan 2025 · Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Zicheng Liu, Emad Barsoum

Historically, scientific discovery has been a lengthy and costly process, demanding substantial time and resources from initial conception to final results.

scientific discovery

MSWA: Refining Local Attention with Multi-Scale Window Attention

no code implementations · 2 Jan 2025 · Yixing Xu, Shivank Nag, Dong Li, Lu Tian, Emad Barsoum

Sliding window attention (SWA) solves this problem by restricting the attention range to a fixed-size local context window.

Common Sense Reasoning Language Modeling +1

ReNeg: Learning Negative Embedding with Reward Guidance

2 code implementations · 27 Dec 2024 · Xiaomin Li, Yixuan Liu, Takashi Isobe, Xu Jia, Qinpeng Cui, Dong Zhou, Dong Li, You He, Huchuan Lu, Zhongdao Wang, Emad Barsoum

In text-to-image (T2I) generation applications, negative embeddings have proven to be a simple yet effective approach for enhancing generation quality.

FTP: A Fine-grained Token-wise Pruner for Large Language Models via Token Routing

no code implementations · 16 Dec 2024 · Zekai Li, Jintu Zheng, Ji Liu, Han Liu, Haowei Zhu, Zeping Li, Fuwei Yang, Haiduo Huang, Jinzhang Peng, Dong Li, Lu Tian, Emad Barsoum

To address these issues, we propose a fine-grained token-wise pruning approach for the LLMs, which presents a learnable router to adaptively identify the less important tokens and skip them across model blocks to reduce computational cost during inference.

SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer

1 code implementation · 14 Dec 2024 · Hao Chen, Ze Wang, Xiang Li, Ximeng Sun, Fangyi Chen, Jiang Liu, Jindong Wang, Bhiksha Raj, Zicheng Liu, Emad Barsoum

With its fully-differentiable design and semantic-rich latent space, our experiment demonstrates that SoftVQ-VAE achieves efficient tokenization without compromising generation quality, paving the way for more efficient generative models.

Denoising Image Generation

Fast Occupancy Network

no code implementations · 10 Dec 2024 · Mingjie Lu, Yuanxian Huang, Ji Liu, Xingliang Huang, Dong Li, Jinzhang Peng, Lu Tian, Emad Barsoum

To address this problem, we analyze the bottleneck of Occupancy Network inference cost and present a simple and fast Occupancy Network model, which adopts a deformable 2D convolutional layer to lift BEV features to 3D voxel features and an efficient voxel feature pyramid network (FPN) module to improve performance at little computational cost.

Autonomous Driving

Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs

2 code implementations · 26 Sep 2024 · Qinpeng Cui, Yixuan Liu, Xinyi Zhang, Qiqi Bao, Qingmin Liao, Li Wang, Tian Lu, Zicheng Liu, Zhongdao Wang, Emad Barsoum

In this paper, we present DoSSR, a Domain Shift diffusion-based SR model that capitalizes on the generative powers of pretrained diffusion models while significantly enhancing efficiency by initiating the diffusion process with low-resolution (LR) images.

Image Restoration Image Super-Resolution

Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism

no code implementations · 20 Aug 2024 · Guanchen Li, Xiandong Zhao, Lian Liu, Zeping Li, Dong Li, Lu Tian, Jie He, Ashish Sirasao, Emad Barsoum

Next, we reconstruct a dense model featuring a pruning-friendly weight distribution by reactivating pruned connections with sparse regularization.

Amphista: Bi-directional Multi-head Decoding for Accelerating LLM Inference

no code implementations · 19 Jun 2024 · Zeping Li, Xinlong Yang, Ziheng Gao, Ji Liu, Guanchen Li, Zhuang Liu, Dong Li, Jinzhang Peng, Lu Tian, Emad Barsoum

On MT-Bench, Amphista delivers up to 2.75$\times$ speedup over vanilla autoregressive decoding and 1.40$\times$ over Medusa on Vicuna 33B in wall-clock time.

LADDER: An Efficient Framework for Video Frame Interpolation

no code implementations · 17 Apr 2024 · Tong Shen, Dong Li, Ziheng Gao, Lu Tian, Emad Barsoum

Video Frame Interpolation (VFI) is a crucial technique in various applications such as slow-motion generation, frame rate conversion, video frame restoration, etc.

Decoder Motion Generation +1

Sparse Laneformer

no code implementations · 11 Apr 2024 · Ji Liu, Zifeng Zhang, Mingjie Lu, Hongyang Wei, Dong Li, Yile Xie, Jinzhang Peng, Lu Tian, Ashish Sirasao, Emad Barsoum

We find that dense anchors are not necessary for lane detection, and propose a transformer-based lane detection framework based on a sparse anchor mechanism.

Autonomous Driving Lane Detection

3D Human motion anticipation and classification

no code implementations · 31 Dec 2020 · Emad Barsoum, John Kender, Zicheng Liu

Our model learns to predict multiple future sequences of human poses from the same input sequence.

Action Recognition Classification +6

Scaling Distributed Training with Adaptive Summation

no code implementations · 4 Jun 2020 · Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi, Tianju Xu, Vadim Eksarevskiy, Jaliya Ekanayake, Emad Barsoum

This paper introduces a novel method to combine gradients called Adasum (for adaptive sum) that converges faster than prior work.

16k

NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques

no code implementations · 1 Oct 2019 · Wenlei Bao, Li-Wen Chang, Yang Chen, Ke Deng, Amit Agarwal, Emad Barsoum, Abe Taha

Various approaches have been developed by leveraging techniques such as vectorization and memory layout to improve the performance of integer GEMM.

Deep Learning Quantization

Object Localization with a Weakly Supervised CapsNet

no code implementations · 20 May 2018 · Weitang Liu, Emad Barsoum, John D. Owens

Our model can learn and derive the coordinates of the digits better than its convolution counterpart that lacks a routing-by-agreement algorithm, and can also perform well when testing on the multi-digit moving MNIST and KTH datasets.

Object Object Localization +3

HP-GAN: Probabilistic 3D human motion prediction via GAN

3 code implementations · 27 Nov 2017 · Emad Barsoum, John Kender, Zicheng Liu

Our model, which we call HP-GAN, learns a probability density function of future human poses conditioned on previous poses.

Autonomous Vehicles Human motion prediction +6

Articulated Hand Pose Estimation Review

no code implementations · 21 Apr 2016 · Emad Barsoum

In this paper, we focus on reviewing recent progress in hand pose estimation from depth sensors.

Hand Pose Estimation
