Search Results for author: Ziyang Chen

Found 23 papers, 13 papers with code

PSDNet and DPDNet: Efficient channel expansion, Depthwise-Pointwise-Depthwise Inverted Bottleneck Block

no code implementations 3 Sep 2019 Guoqing Li, Meng Zhang, Qianru Zhang, Ziyang Chen, Wenzhao Liu, Jiaojie Li, Xuzhao Shen, Jianjun Li, Zhenyu Zhu, Chau Yuen

To design more efficient lightweight convolutional neural networks, the Depthwise-Pointwise-Depthwise inverted bottleneck block (DPD block) is proposed, and DPDNet is designed by stacking DPD blocks.

Structure from Silence: Learning Scene Structure from Ambient Sound

1 code implementation 10 Nov 2021 Ziyang Chen, Xixi Hu, Andrew Owens

From whirling ceiling fans to ticking clocks, the sounds that we hear subtly vary as we move through a scene.

Sound Localization by Self-Supervised Time Delay Estimation

1 code implementation 26 Apr 2022 Ziyang Chen, David F. Fouhey, Andrew Owens

We adapt the contrastive random walk of Jabri et al. to learn a cycle-consistent representation from unlabeled stereo sounds, resulting in a model that performs on par with supervised methods on "in the wild" internet recordings.

Contrastive Learning, Visual Tracking

A Robotic Visual Grasping Design: Rethinking Convolution Neural Network with High-Resolutions

1 code implementation 15 Sep 2022 Zhangli Zhou, Shaochen Wang, Ziyang Chen, Mingyu Cai, Zhen Kan

We demonstrate that using parallel branches as opposed to serial stacked convolutional layers will be a more powerful design for robotic visual grasping tasks.

Robotic Grasping

Mix and Localize: Localizing Sound Sources in Mixtures

no code implementations CVPR 2022 Xixi Hu, Ziyang Chen, Andrew Owens

This task requires a model to both group a sound mixture into individual sources, and to associate them with a visual signal.

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

2 code implementations ICCV 2023 Ziyang Chen, Shengyi Qian, Andrew Owens

In this paper, we use these cues to solve a problem we call Sound Localization from Motion (SLfM): jointly estimating camera rotation and localizing sound sources.

UniSeg: A Prompt-driven Universal Segmentation Model as well as A Strong Representation Learner

1 code implementation 7 Apr 2023 Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Yong Xia

Moreover, UniSeg also beats other pre-trained models on two downstream datasets, providing the community with a high-quality pre-trained model for 3D medical image segmentation.

Image Segmentation, Medical Image Segmentation +2

Reconstruction-driven Dynamic Refinement based Unsupervised Domain Adaptation for Joint Optic Disc and Cup Segmentation

no code implementations 10 Apr 2023 Ziyang Chen, Yongsheng Pan, Yong Xia

The reconstruction alignment (RA) module uses a variational auto-encoder (VAE) to reconstruct the input image and thus boosts the image representation ability of the network in a self-supervised way.

Edge Detection, Segmentation +1

Conditional Generation of Audio from Video via Foley Analogies

1 code implementation CVPR 2023 Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens

Second, we propose a model for generating a soundtrack for a silent input video, given a user-supplied example that specifies what the video should "sound like".

Treasure in Distribution: A Domain Randomization based Multi-Source Domain Generalization for 2D Medical Image Segmentation

1 code implementation 31 May 2023 Ziyang Chen, Yongsheng Pan, Yiwen Ye, Hengfei Cui, Yong Xia

In this paper, we propose a multi-source DG method called Treasure in Distribution (TriD), which constructs an unprecedented search space to obtain the model with strong robustness by randomly sampling from a uniform distribution.

Domain Generalization, Image Segmentation +2

MaxMin-L2-SVC-NCH: A Novel Approach for Support Vector Classifier Training and Parameter Selection

no code implementations 14 Jul 2023 Linkai Luo, Qiaoling Yang, Hong Peng, Yiding Wang, Ziyang Chen

We first formulate the training and parameter selection of SVC as a minimax optimization problem named MaxMin-L2-SVC-NCH, in which the minimization problem finds the closest points between two normal convex hulls (L2-SVC-NCH), while the maximization problem finds the optimal Gaussian kernel parameters.

SAMN: A Sample Attention Memory Network Combining SVM and NN in One Architecture

no code implementations 25 Sep 2023 Qiaoling Yang, Linkai Luo, Haoyu Zhang, Hong Peng, Ziyang Chen

To address this, we propose a sample attention memory network (SAMN) that effectively combines SVM and NN by incorporating a sample attention module, class prototypes, and a memory block into the NN.

A Survey of Large Language Models Attribution

1 code implementation 7 Nov 2023 Dongfang Li, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Ziyang Chen, Baotian Hu, Aiguo Wu, Min Zhang

Open-domain generative systems have gained significant attention in the field of conversational AI (e.g., generative search engines).

Temporal Knowledge Question Answering via Abstract Reasoning Induction

no code implementations 15 Nov 2023 Ziyang Chen, Dongfang Li, Xiang Zhao, Baotian Hu, Min Zhang

In this paper, we tackle the significant challenge of temporal knowledge reasoning in Large Language Models (LLMs), an area where such models frequently encounter difficulties.

Question Answering

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

1 code implementation 29 Nov 2023 Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Qi Wu, Yong Xia

In this paper, we reconsider versatile self-supervised learning from the perspective of continual learning and propose MedCoSS, a continuous self-supervised learning approach for multi-modal medical data.

Continual Learning, Representation Learning +1

Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation

1 code implementation 30 Nov 2023 Ziyang Chen, Yiwen Ye, Mengkang Lu, Yongsheng Pan, Yong Xia

Distribution shift widely exists in medical images acquired from different medical centres and poses a significant obstacle to deploying the pre-trained semantic segmentation model in real-world applications.

Image Segmentation, Medical Image Segmentation +2

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

no code implementations 27 Mar 2024 Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard

The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms.

Few-Shot Learning, Pose Tracking +1

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

2 code implementations 10 Apr 2024 Ziyang Chen, Wei Long, He Yao, Yongjun Zhang, Bingshu Wang, Yongbin Qin, Jia Wu

In addition, since edge variations in potential feature channels of the reconstruction error map also affect detail matching, we propose the Reconstruction Error Motif Penalty (REMP) module to further refine the full-resolution disparity estimation.

Disparity Estimation, Stereo Depth Estimation +2
