Search Results for author: Ziyang Chen

Found 23 papers, 13 papers with code

PSDNet and DPDNet: Efficient channel expansion, Depthwise-Pointwise-Depthwise Inverted Bottleneck Block

no code implementations • 3 Sep 2019 • Guoqing Li, Meng Zhang, Qianru Zhang, Ziyang Chen, Wenzhao Liu, Jiaojie Li, Xuzhao Shen, Jianjun Li, Zhenyu Zhu, Chau Yuen

To design a more efficient lightweight convolutional neural network, the Depthwise-Pointwise-Depthwise inverted bottleneck block (DPD block) is proposed, and DPDNet is designed by stacking DPD blocks.

Reconstructing Images of Two Adjacent Objects through Scattering Medium Using Generative Adversarial Network

no code implementations • 24 Jul 2021 • Xuetian Lai, Qiongyao Li, Ziyang Chen, Xiaopeng Shao, Jixiong Pu

It is shown that based on the trained YGAN, we can reconstruct images of two adjacent objects from one speckle pattern with high fidelity.

Generative Adversarial Network • Image Classification • +2

Structure from Silence: Learning Scene Structure from Ambient Sound

1 code implementation • 10 Nov 2021 • Ziyang Chen, Xixi Hu, Andrew Owens

From whirling ceiling fans to ticking clocks, the sounds that we hear subtly vary as we move through a scene.

Sound Localization by Self-Supervised Time Delay Estimation

1 code implementation • 26 Apr 2022 • Ziyang Chen, David F. Fouhey, Andrew Owens

We adapt the contrastive random walk of Jabri et al. to learn a cycle-consistent representation from unlabeled stereo sounds, resulting in a model that performs on par with supervised methods on "in the wild" internet recordings.

Contrastive Learning • Visual Tracking

A Robotic Visual Grasping Design: Rethinking Convolution Neural Network with High-Resolutions

1 code implementation • 15 Sep 2022 • Zhangli Zhou, Shaochen Wang, Ziyang Chen, Mingyu Cai, Zhen Kan

We demonstrate that using parallel branches as opposed to serial stacked convolutional layers will be a more powerful design for robotic visual grasping tasks.

Robotic Grasping

Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters

1 code implementation • 13 Nov 2022 • Nuo Chen, Yan Wang, Haiyun Jiang, Deng Cai, Yuhan Li, Ziyang Chen, Longyue Wang, Jia Li

In this paper, we introduce the Harry Potter Dialogue (HPD) dataset, designed to advance the study of dialogue agents and character alignment.

Ranked #1 on Persona Dialogue in Story on Harry Potter Dialogue Dataset

Dialogue Generation • In-Context Learning • +2

Mix and Localize: Localizing Sound Sources in Mixtures

no code implementations • CVPR 2022 • Xixi Hu, Ziyang Chen, Andrew Owens

This task requires a model to both group a sound mixture into individual sources, and to associate them with a visual signal.

Self-Supervised Video Forensics by Audio-Visual Anomaly Detection

no code implementations • CVPR 2023 • Chao Feng, Ziyang Chen, Andrew Owens

Manipulated videos often contain subtle inconsistencies between their visual and audio signals.

Ranked #3 on DeepFake Detection on FakeAVCeleb

Anomaly Detection • DeepFake Detection • +1

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

2 code implementations • ICCV 2023 • Ziyang Chen, Shengyi Qian, Andrew Owens

In this paper, we use these cues to solve a problem we call Sound Localization from Motion (SLfM): jointly estimating camera rotation and localizing sound sources.

UniSeg: A Prompt-driven Universal Segmentation Model as well as A Strong Representation Learner

1 code implementation • 7 Apr 2023 • Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Yong Xia

Moreover, UniSeg also beats other pre-trained models on two downstream datasets, providing the community with a high-quality pre-trained model for 3D medical image segmentation.

Image Segmentation • Medical Image Segmentation • +2

Reconstruction-driven Dynamic Refinement based Unsupervised Domain Adaptation for Joint Optic Disc and Cup Segmentation

no code implementations • 10 Apr 2023 • Ziyang Chen, Yongsheng Pan, Yong Xia

The reconstruction alignment (RA) module uses a variational auto-encoder (VAE) to reconstruct the input image and thus boosts the image representation ability of the network in a self-supervised way.

Edge Detection • Segmentation • +1

Conditional Generation of Audio from Video via Foley Analogies

1 code implementation • CVPR 2023 • Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens

Second, we propose a model for generating a soundtrack for a silent input video, given a user-supplied example that specifies what the video should "sound like".

Treasure in Distribution: A Domain Randomization based Multi-Source Domain Generalization for 2D Medical Image Segmentation

1 code implementation • 31 May 2023 • Ziyang Chen, Yongsheng Pan, Yiwen Ye, Hengfei Cui, Yong Xia

In this paper, we propose a multi-source DG method called Treasure in Distribution (TriD), which constructs an unprecedented search space to obtain the model with strong robustness by randomly sampling from a uniform distribution.

Domain Generalization • Image Segmentation • +2

MaxMin-L2-SVC-NCH: A Novel Approach for Support Vector Classifier Training and Parameter Selection

no code implementations • 14 Jul 2023 • Linkai Luo, Qiaoling Yang, Hong Peng, Yiding Wang, Ziyang Chen

We first formulate the training and parameter selection of SVC as a minimax optimization problem named MaxMin-L2-SVC-NCH, in which the minimization problem finds the closest points between two normal convex hulls (L2-SVC-NCH) while the maximization problem finds the optimal Gaussian kernel parameters.

SAMN: A Sample Attention Memory Network Combining SVM and NN in One Architecture

no code implementations • 25 Sep 2023 • Qiaoling Yang, Linkai Luo, Haoyu Zhang, Hong Peng, Ziyang Chen

To address this, we propose a sample attention memory network (SAMN) that effectively combines SVM and NN by incorporating a sample attention module, class prototypes, and a memory block into the NN.

A Survey of Large Language Models Attribution

1 code implementation • 7 Nov 2023 • Dongfang Li, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Ziyang Chen, Baotian Hu, Aiguo Wu, Min Zhang

Open-domain generative systems have gained significant attention in the field of conversational AI (e.g., generative search engines).

Temporal Knowledge Question Answering via Abstract Reasoning Induction

no code implementations • 15 Nov 2023 • Ziyang Chen, Dongfang Li, Xiang Zhao, Baotian Hu, Min Zhang

In this paper, we tackle the significant challenge of temporal knowledge reasoning in Large Language Models (LLMs), an area where such models frequently encounter difficulties.

Question Answering

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

1 code implementation • 29 Nov 2023 • Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Qi Wu, Yong Xia

In this paper, we reconsider versatile self-supervised learning from the perspective of continual learning and propose MedCoSS, a continuous self-supervised learning approach for multi-modal medical data.

Continual Learning • Representation Learning • +1

Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation

1 code implementation • 30 Nov 2023 • Ziyang Chen, Yiwen Ye, Mengkang Lu, Yongsheng Pan, Yong Xia

Distribution shift widely exists in medical images acquired from different medical centres and poses a significant obstacle to deploying the pre-trained semantic segmentation model in real-world applications.

Image Segmentation • Medical Image Segmentation • +2

SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

1 code implementation • 15 Dec 2023 • Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, Jin Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, Nchongmaje Ndipenoch, Alina Miron, Yongmin Li, Yimeng Zhang, Yu Chen, Lu Bai, Jinlong Huang, Chengyang An, Lisheng Wang, Kaiwen Huang, Yunqi Gu, Tao Zhou, Mu Zhou, Shichuan Zhang, Wenjun Liao, Guotai Wang, Shaoting Zhang

The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis.

Computed Tomography (CT) • Image Segmentation • +3

Binding Touch to Everything: Learning Unified Multimodal Tactile Representations

no code implementations • 31 Jan 2024 • Fengyu Yang, Chao Feng, Ziyang Chen, Hyoungseob Park, Daniel Wang, Yiming Dou, Ziyao Zeng, Xien Chen, Rit Gangopadhyay, Andrew Owens, Alex Wong

We introduce UniTouch, a unified tactile model for vision-based touch sensors connected to multiple modalities, including vision, language, and sound.

Question Answering • Visual Question Answering (VQA)

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

no code implementations • 27 Mar 2024 • Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard

The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms.

Few-Shot Learning • Pose Tracking • +1

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

2 code implementations • 10 Apr 2024 • Ziyang Chen, Wei Long, He Yao, Yongjun Zhang, Bingshu Wang, Yongbin Qin, Jia Wu

In addition, because edge variations in potential feature channels of the reconstruction error map also affect detail matching, we propose the Reconstruction Error Motif Penalty (REMP) module to further refine the full-resolution disparity estimation.

Ranked #1 on Stereo Depth Estimation on KITTI 2015

Disparity Estimation • Stereo Depth Estimation • +2
