Search Results for author: Zhifeng Kong

Found 19 papers, 7 papers with code

Audio Dialogues: Dialogues dataset for audio and music understanding

no code implementations | 11 Apr 2024 | Arushi Goel, Zhifeng Kong, Rafael Valle, Bryan Catanzaro

Existing datasets for audio understanding primarily focus on single-turn interactions (i.e., audio captioning, audio question answering) for describing audio in natural language, thus limiting understanding audio via interactive dialogue.

Audio captioning, Audio Question Answering, +3 more

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

no code implementations | 2 Feb 2024 | Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro

Augmenting large language models (LLMs) to understand audio -- including non-speech sounds and non-verbal speech -- is critically important for diverse real-world applications of LLMs.

Few-Shot Learning, In-Context Learning, +2 more

CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram

no code implementations | 12 Sep 2023 | Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro

In this work, we present CleanUNet 2, a speech denoising model that combines the advantages of waveform denoiser and spectrogram denoiser and achieves the best of both worlds.

Denoising, Speech Denoising, +1 more

Data Redaction from Conditional Generative Models

no code implementations | 18 May 2023 | Zhifeng Kong, Kamalika Chaudhuri

Deep generative models are known to produce undesirable samples such as harmful content.

Can Membership Inferencing be Refuted?

no code implementations | 7 Mar 2023 | Zhifeng Kong, Amrita Roy Chowdhury, Kamalika Chaudhuri

Given a machine learning model, a data point and some auxiliary information, the goal of an MI attack is to determine whether the data point was used to train the model.

Approximate Data Deletion in Generative Models

no code implementations | 29 Jun 2022 | Zhifeng Kong, Scott Alfeld

Using this framework, we introduce a fast method for approximate data deletion and a statistical test for estimating whether or not training points have been deleted.

Data Redaction from Pre-trained GANs

no code implementations | 29 Jun 2022 | Zhifeng Kong, Kamalika Chaudhuri

Large pre-trained generative models are known to occasionally output undesirable samples, which undermines their trustworthiness.

Speech Denoising in the Waveform Domain with Self-Attention

1 code implementation | 15 Feb 2022 | Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro

In this work, we present CleanUNet, a causal speech denoising model on the raw waveform.

Denoising, Speech Denoising

On Fast Sampling of Diffusion Probabilistic Models

1 code implementation | ICML Workshop INNF 2021 | Zhifeng Kong, Wei Ping

In this work, we propose FastDPM, a unified framework for fast sampling in diffusion probabilistic models.

Understanding Instance-based Interpretability of Variational Auto-Encoders

1 code implementation | NeurIPS 2021 | Zhifeng Kong, Kamalika Chaudhuri

Instance-based interpretation methods have been widely studied for supervised learning methods as they help explain how black box neural networks predict.

Universal Approximation of Residual Flows in Maximum Mean Discrepancy

no code implementations | ICML Workshop INNF 2021 | Zhifeng Kong, Kamalika Chaudhuri

Normalizing flows are a class of flexible deep generative models that offer easy likelihood computation.

A Geometry-Aware Algorithm to Learn Hierarchical Embeddings in Hyperbolic Space

no code implementations | ICLR Workshop GTRL 2021 | Zhangyu Wang, Lantian Xu, Zhifeng Kong, Weilong Wang, Xuyu Peng, Enyang Zheng

Hyperbolic embeddings are a class of representation learning methods that offer competitive performances when data can be abstracted as a tree-like graph.

Representation Learning

DiffWave: A Versatile Diffusion Model for Audio Synthesis

11 code implementations | ICLR 2021 | Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro

In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional and unconditional waveform generation.

Audio Synthesis, Speech Synthesis

The Expressive Power of a Class of Normalizing Flow Models

no code implementations | 31 May 2020 | Zhifeng Kong, Kamalika Chaudhuri

Normalizing flows have received a great deal of recent attention as they allow flexible generative modeling as well as easy likelihood computation.

Fastened CROWN: Tightened Neural Network Robustness Certificates

1 code implementation | 2 Dec 2019 | Zhaoyang Lyu, Ching-Yun Ko, Zhifeng Kong, Ngai Wong, Dahua Lin, Luca Daniel

We draw inspiration from such work and further demonstrate the optimality of deterministic CROWN (Zhang et al. 2018) solutions in a given linear programming problem under mild constraints.

Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization

1 code implementation | 19 Nov 2017 | Zhifeng Kong

In this paper, we made an extension to the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output.

WristAuthen: A Dynamic Time Wrapping Approach for User Authentication by Hand-Interaction through Wrist-Worn Devices

no code implementations | 22 Oct 2017 | Qi Lyu, Zhifeng Kong, Chao Shen, Tianwei Yue

This paper presents a novel user authentication system through wrist-worn devices by analyzing the interaction behavior with users, which is both accurate and efficient for future usage.

Generative Adversarial Networks with Inverse Transformation Unit

no code implementations | 27 Sep 2017 | Zhifeng Kong, Shuo Ding

In this paper we introduce a new structure to Generative Adversarial Networks by adding an inverse transformation unit behind the generator.
