Search Results for author: Susan Liang

Found 8 papers, 1 paper with code

Approximated Likelihood Ratio: A Forward-Only and Parallel Framework for Boosting Neural Network Training

no code implementations • 18 Mar 2024 • Zeliang Zhang, Jinyang Jiang, Zhuo Liu, Susan Liang, Yijie Peng, Chenliang Xu

In this paper, we introduce an approximation technique for the likelihood ratio (LR) method to alleviate computational and memory demands in gradient estimation.

Video Understanding with Large Language Models: A Survey

1 code implementation • 29 Dec 2023 • Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.

Video Understanding

Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores

no code implementations • 22 Nov 2023 • Zeliang Zhang, Zhuo Liu, Susan Liang, Zhiyuan Wang, Yifan Zhu, Chen Ding, Chenliang Xu

However, the application of tensor decomposition is largely hindered by the exponential growth in computational complexity and storage consumption with the size of tensors.

Computational Efficiency • Tensor Decomposition

Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields

no code implementations • 27 Sep 2023 • Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu

Room impulse response (RIR), which measures the sound propagation within an environment, is critical for synthesizing high-fidelity audio for a given environment.

Room Impulse Response (RIR)

DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models

no code implementations • 31 Jul 2023 • Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu

We propose DAVIS, a Diffusion model-based Audio-VIsual Separation framework that solves the audio-visual sound source separation task in a generative manner.

UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022

no code implementations • 22 Jun 2022 • Yuanhang Zhang, Susan Liang, Shuang Yang, Shiguang Shan

This report presents a brief description of our winning solution to the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2022.

Ranked #2 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker

Audio-Visual Active Speaker Detection

UniCon: Unified Context Network for Robust Active Speaker Detection

no code implementations • 5 Aug 2021 • Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan, Xilin Chen

Our solution is a novel, unified framework that focuses on jointly modeling multiple types of contextual information: spatial context to indicate the position and scale of each candidate's face, relational context to capture the visual relationships among the candidates and contrast audio-visual affinities with each other, and temporal context to aggregate long-term information and smooth out local uncertainties.

Ranked #10 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker

Audio-Visual Active Speaker Detection

ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021

no code implementations • The ActivityNet Large-Scale Activity Recognition Challenge Workshop, CVPR 2021 • Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan

This report presents a brief description of our method for the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2021.

Ranked #6 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker

Audio-Visual Active Speaker Detection
