no code implementations • CCL 2021 • Dongzhen Wen, Fan Zhang, Xiao Zhang, Liang Yang, Yuan Lin, Bo Xu, Hongfei Lin
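Understanding software source code is central to collaborative software development and maintenance, and identifiers, which make up more than half of source code, play an important role in software comprehension. Traditional software engineering mainly studies how naming conventions constrain the naming process so that identifiers are easier to understand and communicate. Building on a survey and analysis of the naming conventions of common programming languages, this paper proposes a new evaluation criterion for identifier comprehensibility. Specifically, we first summarize the naming conventions of mainstream programming languages and, by analogy with the concept of morphemes in natural language, propose a software-morpheme-based account of identifier formation, in which an identifier is formed by generating, ordering, and concatenating software morphemes. On this basis, we propose a method for evaluating the normativeness of software identifiers that incorporates natural-language corpora, in order to measure whether identifiers are easy to understand. Finally, we run validation experiments on a source code comprehension dataset and on open-source projects from the GitHub platform; the results show that the proposed normativeness score measures the comprehensibility of software projects well.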
no code implementations • ICML 2020 • Nian Si, Fan Zhang, Zhengyuan Zhou, Jose Blanchet
We first present a policy evaluation procedure in the ambiguous environment and also give a heuristic algorithm to solve the distributionally robust policy learning problems efficiently.
1 code implementation • 20 Jan 2025 • Guankun Wang, Long Bai, Junyi Wang, Kun Yuan, Zhen Li, Tianxu Jiang, Xiting He, Jinlin Wu, Zhen Chen, Zhen Lei, Hongbin Liu, Jiazheng Wang, Fan Zhang, Nicolas Padoy, Nassir Navab, Hongliang Ren
Recently, Multimodal Large Language Models (MLLMs) have demonstrated their immense potential in computer-aided diagnosis and decision-making.
no code implementations • 20 Jan 2025 • Zixuan Chen, Yujin Wang, Xin Cai, Zhiyuan You, Zheming Lu, Fan Zhang, Shi Guo, Tianfan Xue
In this work, we propose UltraFusion, the first exposure fusion technique that can merge inputs with a 9-stop difference.
no code implementations • 12 Jan 2025 • Ruizhe Ou, Yuan Hu, Fan Zhang, Jiaxin Chen, Yu Liu
In addition, to address the absence of large-scale datasets for training pixel-level RS MLLMs, we construct the GeoPixInstruct dataset, comprising 65,463 images and 140,412 instances, with each instance annotated with text descriptions, bounding boxes, and masks.
no code implementations • 12 Jan 2025 • Wenqi Zhou, Kai Cao, Hao Zheng, Xinyi Zheng, Miao Liu, Per Ola Kristensson, Walterio Mayol-Cuevas, Fan Zhang, Weizhe Lin, Junxiao Shen
Leveraging the advanced text processing capabilities of large language models (LLMs), X-LeBench develops a life-logging simulation pipeline that produces realistic, coherent daily plans aligned with real-world video data.
no code implementations • 6 Jan 2025 • Zhuo Chen, Yuyang Gong, Miaokun Chen, Haotan Liu, Qikai Cheng, Fan Zhang, Wei Lu, Xiaozhong Liu, Jiawei Liu
By leveraging instruction engineering, we obtain partial retrieval model outputs from a black-box RAG system, facilitating the training of surrogate models to enhance the effectiveness of the opinion manipulation attack.
no code implementations • 6 Jan 2025 • Nantheera Anantrasirichai, Fan Zhang, David Bull
This paper explores the significant technological shifts since our previous review in 2022, highlighting how these developments have expanded creative opportunities and efficiency.
1 code implementation • 20 Dec 2024 • Ke Yan, Qing Cai, Fan Zhang, Ziyan Cao, Zhi Liu
To address these issues, we propose a novel Semantic-Guided Triplet Co-training (SGTC) framework, which achieves high-end medical image segmentation by only annotating three orthogonal slices of a few volumetric samples, significantly alleviating the burden on radiologists.
1 code implementation • 15 Dec 2024 • Haisheng Lu, Yujie Fu, Fan Zhang, Le Zhang
Medical image segmentation is a critical component of clinical practice, and the state-of-the-art MedSAM model has significantly advanced this field.
no code implementations • 11 Dec 2024 • Xihua Zhu, Yiqian Yang, Fan Zhang
With the rapid development of gravitational wave astronomy, the increasing number of detected events necessitates efficient methods for parameter estimation and model updates.
1 code implementation • 10 Dec 2024 • Fan Zhang, Shulin Tian, Ziqi Huang, Yu Qiao, Ziwei Liu
Moreover, existing evaluation methods rely on rigid pipelines that overlook specific user needs and provide numerical results without clear explanations.
no code implementations • 10 Dec 2024 • Yiqian Yang, Xihua Zhu, Fan Zhang
Adversarial training with Normalizing Flow (NF) models is an emerging research area aimed at improving model robustness through adversarial samples.
no code implementations • 9 Dec 2024 • Shanshan Wang, Shoujun Yu, Jian Cheng, Sen Jia, Changjun Tie, Jiayu Zhu, Haohao Peng, Yijing Dong, Jianzhong He, Fan Zhang, Yaowen Xing, Xiuqin Jia, Qi Yang, Qiyuan Tian, Hua Guo, Guobin Li, Hairong Zheng
Diffusion magnetic resonance imaging (dMRI) provides critical insights into the microstructural and connectional organization of the human brain.
no code implementations • 4 Dec 2024 • YuXuan Jiang, Ho Man Kwan, Tianhao Peng, Ge Gao, Fan Zhang, Xiaoqing Zhu, Joel Sole, David Bull
Recent advances in implicit neural representations (INRs) have shown significant promise in modeling visual signals for various low-vision tasks including image super-resolution (ISR).
no code implementations • 28 Nov 2024 • Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon, Shreyas Hampali, Shangchen Han, Fan Zhang, Linguang Zhang, Jade Fountain, Edward Miller, Selen Basol, Richard Newcombe, Robert Wang, Jakob Julian Engel, Tomas Hodan
We introduce HOT3D, a publicly available dataset for egocentric hand and object tracking in 3D.
no code implementations • 23 Nov 2024 • Fan Zhang, Siyuan Zhao, Naye Ji, Zhaohan Wang, Jingmei Wu, Fuxing Gao, Zhenqing Ye, Leyao Yan, Lanxin Dai, Weidong Geng, Xin Lyu, Bozuo Zhao, Dingguo Yu, Hui Du, Bin Hu
DiM-Gestor features a dual-component framework: (1) a fuzzy feature extractor and (2) a speech-to-gesture mapping module, both built on Mamba-2.
no code implementations • 20 Nov 2024 • YuXuan Jiang, Jakub Nawała, Chen Feng, Fan Zhang, Xiaoqing Zhu, Joel Sole, David Bull
To address this issue, this paper proposes a low-complexity SR method, RTSR, designed to enhance the visual quality of compressed video content, focusing on resolution up-scaling (a) from 360p to 1080p and (b) from 540p to 4K.
1 code implementation • 20 Nov 2024 • Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, LiMin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
Video generation has witnessed significant advancements, yet evaluating these models remains a challenge.
no code implementations • 19 Nov 2024 • Yuanyuan Tian, Wenwen Li, Lei Hu, Xiao Chen, Michael Brook, Michael Brubaker, Fan Zhang, Anna K. Liljedahl
Retrieval and recommendation are two essential tasks in modern search tools.
no code implementations • 19 Nov 2024 • Zhixiang Wang, Xudong Li, Yizhai Zhang, Fan Zhang, Panfeng
Event cameras, when combined with inertial sensors, show significant potential for motion estimation in challenging scenarios, such as high-speed maneuvers and low-light environments.
no code implementations • 17 Nov 2024 • Ge Gao, Adrian Azzarelli, Ho Man Kwan, Nantheera Anantrasirichai, Fan Zhang, Oliver Moolan-Feroze, David Bull
However, the development and validation of efficient 3D data compression methods are constrained by the lack of comprehensive and high-quality volumetric video datasets, which typically require much more effort to acquire and consume increased resources compared to 2D image and video databases.
1 code implementation • 14 Nov 2024 • Nancy R. Newlin, Kurt Schilling, Serge Koudoro, Bramsh Qamar Chandio, Praitayini Kanakaraj, Daniel Moyer, Claire E. Kelly, Sila Genc, Jian Chen, Joseph Yuan-Mou Yang, Ye Wu, Yifei He, Jiawei Zhang, Qingrun Zeng, Fan Zhang, Nagesh Adluru, Vishwesh Nath, Sudhir Pathak, Walter Schneider, Anurag Gade, Yogesh Rathi, Tom Hendriks, Anna Vilanova, Maxime Chamberland, Tomasz Pieciak, Dominika Ciupek, Antonio Tristán Vega, Santiago Aja-Fernández, Maciej Malawski, Gani Ouedraogo, Julia Machnio, Christian Ewert, Paul M. Thompson, Neda Jahanshad, Eleftherios Garyfallidis, Bennett A. Landman
There is a pressing need to harmonize the preprocessing of DW-MRI datasets to ensure the derivation of robust quantitative diffusion metrics across acquisitions.
no code implementations • 4 Nov 2024 • Jin Wang, Bocheng Guo, Yijie Li, Junyi Wang, Yuqian Chen, Jarrett Rushmore, Nikos Makris, Yogesh Rathi, Lauren J O'Donnell, Fan Zhang
Tractography fiber clustering using diffusion MRI (dMRI) is a crucial strategy for white matter (WM) parcellation.
no code implementations • 1 Nov 2024 • Zihong He, Weizhe Lin, Hao Zheng, Fan Zhang, Matt W. Jones, Laurence Aitchison, Xuhai Xu, Miao Liu, Per Ola Kristensson, Junxiao Shen
With the rapid advancement of AI systems, their abilities to store, retrieve, and utilize information over the long term - referred to as long-term memory - have become increasingly significant.
no code implementations • 30 Oct 2024 • Yujin Wang, Tianyi Xu, Fan Zhang, Tianfan Xue, Jinwei Gu
Based on this, AdaptiveISP utilizes deep reinforcement learning to automatically generate an optimal ISP pipeline and the associated ISP parameters to maximize the detection performance.
1 code implementation • 29 Oct 2024 • Yui Lo, Yuqian Chen, Dongnan Liu, Jon Haitz Legarreta, Leo Zekelman, Fan Zhang, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Alexandra J. Golby, Weidong Cai, Lauren J. O'Donnell
In this work, we investigate the possibility of utilizing a deep learning model to compute shape measures of the brain's white matter connections.
no code implementations • 19 Oct 2024 • Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, Jarrett Rushmore, Fan Zhang, Yogesh Rathi, Nikos Makris, Alexandra J. Golby, Weidong Cai, Lauren J. O'Donnell
Our results demonstrate that shape measures are predictive of individual cognitive performance.
no code implementations • 15 Oct 2024 • Youshen Xiao, Sheng Liao, Xuanyang Tian, Fan Zhang, Xinlong Dong, Yunhui Jiang, Xiyu Chen, Ruixi Sun, Yuyao Zhang, Fei Gao
Acoustic-Resolution Photoacoustic Microscopy (AR-PAM) is promising for subcutaneous vascular imaging, but its spatial resolution is constrained by the Point Spread Function (PSF).
no code implementations • 11 Oct 2024 • Chen Xu, Qiming Huang, Yuqi Hou, Jiangxing Wu, Fan Zhang, Hyung Jin Chang, Jianbo Jiao
Medical image segmentation poses challenges due to domain gaps, data modality variations, and dependency on domain knowledge or experts, especially for low- and middle-income countries (LMICs).
no code implementations • 2 Oct 2024 • Haoran Wang, Nantheera Anantrasirichai, Fan Zhang, David Bull
3D Gaussian splatting (3DGS) offers the capability to achieve real-time high quality 3D scene rendering.
1 code implementation • 27 Sep 2024 • Ruikang Li, Yujin Wang, Shiqi Chen, Fan Zhang, Jinwei Gu, Tianfan Xue
The raw domain denoising adapts to sensor-specific noise as well as spatially varying noise levels, while the sRGB domain denoising adapts to ISP variations and removes residual noise amplified by the ISP.
Ranked #1 on Image Denoising on DND
2 code implementations • 27 Sep 2024 • Xinlong Wang, Xiaosong Zhang, Zhengxiong Luo, Quan Sun, Yufeng Cui, Jinsheng Wang, Fan Zhang, Yueze Wang, Zhen Li, Qiying Yu, Yingli Zhao, Yulong Ao, Xuebin Min, Tao Li, Boya Wu, Bo Zhao, BoWen Zhang, Liangdong Wang, Guang Liu, Zheqi He, Xi Yang, Jingjing Liu, Yonghua Lin, Tiejun Huang, Zhongyuan Wang
While next-token prediction is considered a promising path towards artificial general intelligence, it has struggled to excel in multimodal tasks, which are still dominated by diffusion models (e.g., Stable Diffusion) and compositional approaches (e.g., CLIP combined with LLMs).
Ranked #132 on Visual Question Answering on MM-Vet
no code implementations • 21 Sep 2024 • Fei Ma, Yuqiang Feng, Fan Zhang, Yongsheng Zhou
Common Perlin noise based cloud generation is a random, non-optimizable process, which cannot be directly used to attack the target models.
no code implementations • 11 Sep 2024 • Ho Man Kwan, Ge Gao, Fan Zhang, Andrew Gower, David Bull
In this paper, rather than focusing on representation architectures as in many existing works, we propose a novel INR-based video compression framework, Neural Video Representation Compression (NVRC), targeting compression of the representation.
1 code implementation • 11 Sep 2024 • Chenjun Li, Dian Yang, Shun Yao, Shuyue Wang, Ye Wu, Le Zhang, Qiannuo Li, Kang Ik Kevin Cho, Johanna Seitz-Holland, Lipeng Ning, Jon Haitz Legarreta, Yogesh Rathi, Carl-Fredrik Westin, Lauren J. O'Donnell, Nir A. Sochen, Ofer Pasternak, Fan Zhang
To do so, we design an evidence-based ensemble learning framework for uncertainty-aware parcellation to leverage the multiple dMRI parameters derived from diffusion MRI.
no code implementations • 9 Sep 2024 • Fan Zhang, Lingling Li, Licheng Jiao, Xu Liu, Fang Liu, Shuyuan Yang, Biao Hou
In a series of FPN experiments on the scale-preferred tasks, we found that the "divide-and-conquer" idea of FPN severely hampers the detector's learning in the right direction due to the large number of large-scale negative samples and interference from background noise.
no code implementations • 5 Sep 2024 • Shunyu Li, Fan Zhang, Tianqi Mao, Rui Na, Zhaocheng Wang, George K. Karagiannidis
This paper proposes a transmit beamforming strategy for the integrated sensing and communication (ISAC) systems enabled by the novel stacked intelligent metasurface (SIM) architecture, where the base station (BS) simultaneously performs downlink communication and radar target detection via different beams.
no code implementations • 2 Sep 2024 • Fan Zhang, Michael Gienger
We present a framework for assistive robot manipulation, which focuses on two fundamental challenges: first, efficiently adapting large-scale models to downstream scene affordance understanding tasks, especially in daily living scenarios where gathering multi-task data involving humans requires strenuous effort; second, effectively learning robot trajectories by grounding the visual affordance model.
no code implementations • 2 Sep 2024 • Ge Gao, Ho Man Kwan, Fan Zhang, David Bull
Neural video compression has recently demonstrated significant potential to compete with conventional video codecs in terms of rate-quality performance.
no code implementations • 23 Aug 2024 • Xi Zhu, Wei Zhang, Yijie Li, Lauren J. O'Donnell, Fan Zhang
This achievement underscores a substantial progression in enhancing dMRI quality, highlighting the potential of our novel generative approach to revolutionize dMRI imaging standards.
no code implementations • 13 Aug 2024 • Zihao Qi, Chen Feng, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull
Based on this collected subjective data, we benchmarked the performance of 10 full-reference and 11 no-reference quality metrics.
no code implementations • 10 Aug 2024 • Fan Zhang, Ziyue Ji, Weiguang Kang, Weiqing Li, Zhiyong Su
Specifically, based on the construction of a synthetic eyeglasses frame dataset, we first define a class-specific eyeglasses frame template with pre-defined keypoints.
no code implementations • 9 Aug 2024 • Siyue Teng, YuXuan Jiang, Ge Gao, Fan Zhang, Thomas Davis, Zoe Liu, David Bull
Recent advances in video compression have seen significant coding performance improvements with the development of new standards and learning-based video codecs.
1 code implementation • 6 Aug 2024 • Jakub Nawała, YuXuan Jiang, Fan Zhang, Xiaoqing Zhu, Joel Sole, David Bull
Deep learning is now playing an important role in enhancing the performance of conventional hybrid video codecs.
no code implementations • 1 Aug 2024 • Fan Zhang, Naye Ji, Fuxing Gao, Bozuo Zhao, Jingmei Wu, Yanbing Jiang, Hui Du, Zhenqing Ye, Jiayang Zhu, WeiFan Zhong, Leyao Yan, Xiaomeng Ma
Speech-driven gesture generation is an emerging domain within virtual human creation, where current methods predominantly utilize Transformer-based architectures that necessitate extensive memory and are characterized by slow inference speeds.
1 code implementation • 29 Jul 2024 • Wenxuan Wang, Quan Sun, Fan Zhang, Yepeng Tang, Jing Liu, Xinlong Wang
We demonstrate that DIVA substantially improves CLIP's performance (e.g., by 3-7%) on the challenging MMVP-VLM benchmark, which assesses fine-grained visual abilities, and enhances the performance of MLLMs and vision models on multimodal understanding and segmentation tasks.
no code implementations • 28 Jul 2024 • Yui Lo, Yuqian Chen, Fan Zhang, Dongnan Liu, Leo Zekelman, Suheyla Cetin-Karayumak, Yogesh Rathi, Weidong Cai, Lauren J. O'Donnell
In this work, we propose a novel deep-learning model to impute tissue microstructure: the White Matter Geometry-guided Diffusion (WMG-Diff) model.
no code implementations • 21 Jul 2024 • Ari Tchetchenian, Leo Zekelman, Yuqian Chen, Jarrett Rushmore, Fan Zhang, Edward H. Yeterian, Nikos Makris, Yogesh Rathi, Erik Meijering, Yang Song, Lauren J. O'Donnell
We refer to our method as Deep Multimodal Saliency Parcellation (DeepMSP), as it computes the saliency of structural measures for predicting cognitive and motor functional performance, with these saliencies being applied to the task of parcellation.
no code implementations • 18 Jul 2024 • Zhuo Chen, Jiawei Liu, Haotan Liu, Qikai Cheng, Fan Zhang, Wei Lu, Xiaozhong Liu
Retrieval-Augmented Generation (RAG) is applied to solve hallucination problems and real-time constraints of large language models, but it also induces vulnerabilities against retrieval corruption attacks.
1 code implementation • 17 Jul 2024 • QiHao Zhao, Yalun Dai, Shen Lin, Wei Hu, Fan Zhang, Jun Liu
In real-world scenarios, knowledge distributions exhibit a long tail.
no code implementations • 11 Jul 2024 • Yuqian Chen, Fan Zhang, Meng Wang, Leo R. Zekelman, Suheyla Cetin-Karayumak, Tengfei Xue, Chaoyi Zhang, Yang Song, Nikos Makris, Yogesh Rathi, Weidong Cai, Lauren J. O'Donnell
The proposed approach highlights the potential of integrating local anatomical information and global feature dependencies to improve prediction performance in machine learning with diffusion MRI tractography.
1 code implementation • 11 Jul 2024 • Xiaotong Li, Fan Zhang, Haiwen Diao, Yueze Wang, Xinlong Wang, Ling-Yu Duan
To facilitate the cutting-edge research of MLLMs on comprehensive vision perception, we thereby propose Perceptual Fusion, using a low-budget but highly effective caption engine for complete and accurate image descriptions.
Ranked #123 on Visual Question Answering on MM-Vet
no code implementations • 5 Jul 2024 • Qian Zeng, Le Zhang, Yipeng Liu, Ce Zhu, Fan Zhang
Glaucoma is a leading cause of irreversible blindness worldwide.
no code implementations • 1 Jul 2024 • Crispian Morris, Nantheera Anantrasirichai, Fan Zhang, David Bull
In many real-world scenarios, recorded videos suffer from accidental focus blur, and while video deblurring methods exist, most specifically target motion blur.
1 code implementation • 19 Jun 2024 • Fan Zhang, Xin Zhang
A massive number of applications involve data with underlying relationships embedded in non-Euclidean space.
no code implementations • 13 Jun 2024 • Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon, Shreyas Hampali, Fan Zhang, Jade Fountain, Edward Miller, Selen Basol, Richard Newcombe, Robert Wang, Jakob Julian Engel, Tomas Hodan
The dataset offers over 833 minutes (more than 3.7M images) of multi-view RGB/monochrome image streams showing 19 subjects interacting with 33 diverse rigid objects, multi-modal signals such as eye gaze or scene point clouds, as well as comprehensive ground truth annotations including 3D poses of objects, hands, and cameras, and 3D models of hands and objects.
no code implementations • 31 May 2024 • Chen Feng, Duolikun Danier, Fan Zhang, Alex Mackin, Andrew Collins, David Bull
In this paper, we propose a Multiple Visual Artifact Detector, MVAD, for video streaming which, for the first time, is able to detect multiple artifacts using a single framework that is not reliant on video quality assessment models.
1 code implementation • 29 May 2024 • Fan Zhang, Carlos Esteve-Yagüe, Sören Dittmer, Carola-Bibiane Schönlieb, Michael Roberts
This study contributes to PFL by establishing a solid theoretical foundation for the proposed method and offering a robust, ready-to-use framework that effectively addresses the challenges posed by non-IID data in FL.
no code implementations • 23 May 2024 • Zhibo Chen, Heming Sun, Li Zhang, Fan Zhang
This paper provides a survey of the latest developments in visual signal coding and processing with generative models.
1 code implementation • 21 May 2024 • Yihong Huang, Yuang Zhang, Liping Wang, Fan Zhang, Xuemin Lin
Most deep UOD models are trained exclusively on clean datasets to learn the distribution of the normal data, which requires huge manual effort to clean the real-world data, if that is possible at all.
no code implementations • 19 May 2024 • Fan Zhang, Xian-Sheng Hua, Chong Chen, Xiao Luo
Image-text matching has been a long-standing problem, which seeks to connect vision and language through semantic understanding.
no code implementations • 14 May 2024 • Tianhao Peng, Chen Feng, Duolikun Danier, Fan Zhang, Benoit Vallade, Alex Mackin, David Bull
The proposed method, RMT-BVQA, has been evaluated on the VDPVE (VQA Dataset for Perceptual Video Enhancement) database through a five-fold cross validation.
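For readers unfamiliar with the protocol, five-fold cross validation can be sketched as follows. This is a generic, minimal illustration and not the evaluation code used for RMT-BVQA; the function name `k_fold_splits`, the shuffling seed, and the item count are assumptions made for the example.

```python
import random


def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Hypothetical helper: indices are shuffled once, dealt into k folds,
    and each fold is held out for testing exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx
                 for fold_id, fold in enumerate(folds)
                 if fold_id != held_out
                 for idx in fold]
        yield train, test


# Usage: 10 items (e.g. video sequences) split into 5 train/test partitions.
for train, test in k_fold_splits(10, k=5):
    print(sorted(test), len(train))
```

A metric is then trained on each `train` partition and scored on the matching `test` partition, and the five scores are averaged.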
no code implementations • 29 Apr 2024 • Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-Baptiste Alayrac, Neil Houlsby, Nenad Tomasev, Jan Freyberg, Charles Lau, Jonas Kemp, Jeremy Lai, Shekoofeh Azizi, Kimberly Kanada, SiWai Man, Kavita Kulkarni, Ruoxi Sun, Siamak Shakeri, Luheng He, Ben Caine, Albert Webson, Natasha Latysheva, Melvin Johnson, Philip Mansfield, Jian Lu, Ehud Rivlin, Jesper Anderson, Bradley Green, Renee Wong, Jonathan Krause, Jonathon Shlens, Ewa Dominowska, S. M. Ali Eslami, Katherine Chou, Claire Cui, Oriol Vinyals, Koray Kavukcuoglu, James Manyika, Jeff Dean, Demis Hassabis, Yossi Matias, Dale Webster, Joelle Barral, Greg Corrado, Christopher Semturs, S. Sara Mahdavi, Juraj Gottweis, Alan Karthikesalingam, Vivek Natarajan
We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin.
Ranked #1 on Question Answering on MedQA (using extra training data)
no code implementations • 24 Apr 2024 • Xiangci Li, Sihao Chen, Rajvi Kapadia, Jessica Ouyang, Fan Zhang
Claim verification in real-world settings (e.g., against a large collection of candidate evidence retrieved from the web) typically requires identifying and aggregating a complete set of evidence pieces that collectively provide full support to the claim.
no code implementations • 23 Apr 2024 • Fan Zhang, Zhi-Qi Cheng, Jian Zhao, Xiaojiang Peng, Xuelong Li
LEAF introduces a hierarchical expression-aware aggregation strategy that operates at three levels: semantic, instance, and category.
no code implementations • 16 Apr 2024 • Fan Zhang, Jinfeng Chen, Yu Hu, Zhiqiang Gao, Ge Lv, Qin Lin
On the other hand, machine learning benefits from an additional assurance layer provided by the ESO, as any imperfections in the machine learning model can be compensated for by the ESO.
no code implementations • 15 Apr 2024 • YuXuan Jiang, Chen Feng, Fan Zhang, David Bull
Knowledge distillation (KD) has emerged as a promising technique in deep learning, typically employed to enhance a compact student network through learning from its high-performance but more complex teacher variant.
no code implementations • 6 Apr 2024 • Yijie Li, Wei Zhang, Ye Wu, Li Yin, Ce Zhu, Yuqian Chen, Suheyla Cetin-Karayumak, Kang Ik K Cho, Leo R. Zekelman, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Lauren J. O'Donnell, Fan Zhang
However, a comprehensive investigation into WM fiber tracts between Eastern and Western populations is challenging due to the lack of a cross-population WM atlas and the large site-specific variability of dMRI data.
no code implementations • 27 Mar 2024 • Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, Fan Zhang, Yogesh Rathi, Nikos Makris, Alexandra J. Golby, Weidong Cai, Lauren J. O'Donnell
Overall, our results indicate that the shape of the brain's connections is predictive of human language function.
no code implementations • 25 Mar 2024 • Kaikang Zhao, Xi Chen, Wei Huang, Liuxin Ding, Xianglong Kong, Fan Zhang
In this work, we aim to enhance ensemble diversity by reducing attack transferability.
1 code implementation • 22 Mar 2024 • Shuhao Li, Yue Cui, Jingyi Xu, Libin Li, Lingkai Meng, Weidong Yang, Fan Zhang, Xiaofang Zhou
Traffic prediction has long been a focal and pivotal area in research, witnessing significant strides from city-level to road-level predictions in recent years.
no code implementations • 16 Mar 2024 • Fan Zhang, Zhaohan Wang, Xin Lyu, Siyuan Zhao, Mengjian Li, Weidong Geng, Naye Ji, Hui Du, Fuxing Gao, Hao Wu, Shunman Li
Finally, we employ the diffusion model to train and infer various gestures.
1 code implementation • 14 Mar 2024 • Fan Zhang, Wei Qin, Weijieying Ren, Lei Wang, Zetong Chen, Richang Hong
Additionally, we find that most of the solutions to long-tailed problems are still biased towards head classes in the end, and we propose a simple, post hoc prediction re-balancing strategy to further mitigate the bias toward head classes.
no code implementations • CVPR 2024 • QiHao Zhao, Yalun Dai, Hao Li, Wei Hu, Fan Zhang, Jun Liu
Long-tail recognition is challenging because it requires the model to learn good representations from tail categories and address imbalances across all categories.
2 code implementations • 6 Feb 2024 • Quan Sun, Jinsheng Wang, Qiying Yu, Yufeng Cui, Fan Zhang, Xiaosong Zhang, Xinlong Wang
Scaling up contrastive language-image pretraining (CLIP) is critical for empowering both vision and multimodal models.
Ranked #1 on Zero-Shot Transfer Image Classification on SUN
1 code implementation • 2 Feb 2024 • Ho Man Kwan, Fan Zhang, Andrew Gower, David Bull
In this paper we, for the first time, extend their application to immersive (multi-view) videos, by proposing MV-HiNeRV, a new INR-based immersive video codec.
no code implementations • 14 Jan 2024 • Fan Zhang, Shuyi Mao, Qing Li, Xiaojiang Peng
Comparative evaluations with popular point-based methods on HPoint103 and the public dataset DHP19 demonstrate the dramatic outperformance of our D-CPT.
no code implementations • 14 Jan 2024 • Fan Zhang, Xiaobao Guo, Xiaojiang Peng, Alex Kot
In addition, when compared with the domain disparity existing between face datasets and FER datasets, the divergence between general datasets and FER datasets is more pronounced.
2 code implementations • 10 Jan 2024 • Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu
Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.
no code implementations • 9 Jan 2024 • Yuxiang Wei, Yuqian Chen, Tengfei Xue, Leo Zekelman, Nikos Makris, Yogesh Rathi, Weidong Cai, Fan Zhang, Lauren J. O'Donnell
We present an explainable multi-view network (EMV-Net) that can use different anatomical views to improve prediction performance.
no code implementations • CVPR 2024 • Fan Zhang, Xian-Sheng Hua, Chong Chen, Xiao Luo
In this paper, we propose a semi-supervised approach named Fine-grained Prototypical Voting with Heterogeneous Mixup (FIVE), which maps both 2D and 3D data into a common embedding space for cross-modal retrieval.
no code implementations • 31 Dec 2023 • YuXuan Jiang, Jakub Nawala, Fan Zhang, David Bull
Deep learning techniques have been applied in the context of image super-resolution (SR), achieving remarkable advances in terms of reconstruction performance.
no code implementations • 22 Dec 2023 • Zhangyin Feng, Runyi Hu, Liangxin Liu, Fan Zhang, Duyu Tang, Yong Dai, Xiaocheng Feng, Jiwei Li, Bing Qin, Shuming Shi
Compared with autoregressive baselines that need to run one thousand times, our model only runs 16 times to generate images of competitive quality with an order of magnitude lower inference latency.
no code implementations • 22 Dec 2023 • Akshit Gupta, Simone Mora, Fan Zhang, Martine Rutten, R. Venkatesha Prasad, Carlo Ratti
Healthy urban greenery is a fundamental asset to mitigate climate change phenomena such as extreme heat and air pollution.
1 code implementation • CVPR 2024 • Quan Sun, Yufeng Cui, Xiaosong Zhang, Fan Zhang, Qiying Yu, Zhengxiong Luo, Yueze Wang, Yongming Rao, Jingjing Liu, Tiejun Huang, Xinlong Wang
The human ability to easily solve multimodal tasks in context (i.e., with only a few demonstrations or simple instructions) is what current multimodal systems have largely struggled to imitate.
Ranked #3 on Personalized Image Generation on DreamBooth
1 code implementation • CVPR 2024 • Fan Zhang, ShaoDi You, Yu Li, Ying Fu
Nonetheless, the performance of these methods is often constrained by the domain gap and looser constraints.
no code implementations • 19 Dec 2023 • Zihao Qi, Chen Feng, Duolikun Danier, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull
In this work, we observe that existing full-/no-reference quality metrics fail to accurately predict the perceptual quality difference between transcoded UGC content and the corresponding unpristine references.
no code implementations • 15 Dec 2023 • Fan Zhang, Jining Chen, Kunlun Wang, Wen Chen
We formulate a joint device scheduling and power allocation problem to maximize the number of scheduled devices.
no code implementations • 14 Dec 2023 • Chen Feng, Duolikun Danier, Fan Zhang, Alex Mackin, Andy Collins, David Bull
Professionally generated content (PGC) streamed online can contain visual artefacts that degrade the quality of user experience.
no code implementations • 14 Dec 2023 • Chen Feng, Duolikun Danier, Haoran Wang, Fan Zhang, Benoit Vallade, Alex Mackin, David Bull
Deep learning-based video quality assessment (deep VQA) has demonstrated significant potential in surpassing conventional metrics, with promising improvements in terms of correlation with human perception.
no code implementations • 6 Dec 2023 • Xinwei Yuan, Shu Han, Wei Huang, Hongliang Ye, Xianglong Kong, Fan Zhang
In this paper, we propose a novel IDS architecture that can enhance the robustness of IDS against adversarial attacks by combining conventional machine learning (ML) models and deep learning models.
no code implementations • 5 Dec 2023 • Tianhao Peng, Ge Gao, Heming Sun, Fan Zhang, David Bull
In recent years, end-to-end learnt video codecs have demonstrated their potential to compete with conventional coding algorithms in terms of compression efficiency.
1 code implementation • 29 Nov 2023 • Haoran Ma, Yan Zhang, Pengyuan Liu, Fan Zhang, Pengyu Zhu
In this work, a spatially dependent graph neural network (GNN) approach is proposed to reveal the relation between spatial structure and restoration quality on an urban scale.
1 code implementation • CVPR 2024 • Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, LiMin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
We will open-source VBench, including all prompts, evaluation methods, generated videos, and human preference annotations, and also include more video generation models in VBench to drive forward the field of video generation.
no code implementations • 25 Nov 2023 • Haolin He, Ce Zhu, Le Zhang, Yipeng Liu, Xiao Xu, Yuqian Chen, Leo Zekelman, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Lauren J. O'Donnell, Fan Zhang
The amygdala plays a vital role in emotional processing and exhibits structural diversity that necessitates fine-scale parcellation for a comprehensive understanding of its anatomico-functional correlations.
no code implementations • 8 Nov 2023 • Fan Zhang, Tianqi Mao, Ruiqi Liu, Zhu Han, Sheng Chen, Zhaocheng Wang
For the communication-centric design, to maximize the achievable data rate, a fraction of REs are optimally allocated for communications according to prior knowledge of the communication channel.
1 code implementation • CVPR 2024 • Qiying Yu, Quan Sun, Xiaosong Zhang, Yufeng Cui, Fan Zhang, Yue Cao, Xinlong Wang, Jingjing Liu
To provide higher-quality and more scalable multimodal pretraining data, we propose CapsFusion, an advanced framework that leverages large language models to consolidate and refine information from both web-based image-text pairs and synthetic captions.
1 code implementation • 29 Oct 2023 • Zhen Qian, Min Chen, Zhuo Sun, Fan Zhang, Qingsong Xu, Jinzhao Guo, Zhiwei Xie, Zhixin Zhang
Understanding urban dynamics and promoting sustainable development requires comprehensive insights about buildings.
no code implementations • 4 Oct 2023 • Fan Zhang, Daniel Kreuter, Yichen Chen, Sören Dittmer, Samuel Tull, Tolou Shadbahr, BloodCounts! Collaboration, Jacobus Preller, James H. F. Rudd, John A. D. Aston, Carola-Bibiane Schönlieb, Nicholas Gleadall, Michael Roberts
We give detailed recommendations to help improve the quality of the methodology development for federated learning in healthcare.
no code implementations • 4 Sep 2023 • Zixiong Wang, Yunxiao Zhang, Rui Xu, Fan Zhang, PengShuai Wang, Shuangmin Chen, Shiqing Xin, Wenping Wang, Changhe Tu
Our approach enforces the Hessian of the neural implicit function to have a zero determinant for points near the surface.
no code implementations • 26 Aug 2023 • Fan Zhang, Kebing Jin, Hankz Hankui Zhuo
Although large language models excel at generating natural language text, they struggle to generate text with correct logic for a given task, because neural models have difficulty capturing implied rules from free-form text.
no code implementations • 24 Aug 2023 • Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira, Harry Lanaras, Henry Howard-Jenkins, Huixuan Tang, Hyo Jin Kim, Jaime Rivera, Ji Luo, Jing Dong, Julian Straub, Kevin Bailey, Kevin Eckenhoff, Lingni Ma, Luis Pesqueira, Mark Schwesinger, Maurizio Monge, Nan Yang, Nick Charron, Nikhil Raina, Omkar Parkhi, Peter Borschowa, Pierre Moulon, Prince Gupta, Raul Mur-Artal, Robbie Pennington, Sachin Kulkarni, Sagar Miglani, Santosh Gondi, Saransh Solanki, Sean Diener, Shangyi Cheng, Simon Green, Steve Saarinen, Suvam Patra, Tassos Mourikis, Thomas Whelan, Tripti Singh, Vasileios Balntas, Vijay Baiyya, Wilson Dreewes, Xiaqing Pan, Yang Lou, Yipu Zhao, Yusuf Mansour, Yuyang Zou, Zhaoyang Lv, Zijian Wang, Mingfei Yan, Carl Ren, Renzo De Nardi, Richard Newcombe
Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception.
no code implementations • 21 Aug 2023 • Yubiao Yue, Xinyu Zeng, Xiaoqiang Shi, Meiping Zhang, Fan Zhang, Yunxin Liang, Yan Liu, Zhenzhang Li, Yang Li
Deep learning-based ear disease diagnosis technology has proven effective and affordable.
1 code implementation • ICCV 2023 • QiHao Zhao, Chen Jiang, Wei Hu, Fan Zhang, Jun Liu
In the analysis and ablation study, we demonstrate that, compared with previous work, our method effectively increases the diversity of experts, significantly reduces the variance of the model, and improves recognition accuracy.
Ranked #5 on Long-tail Learning on CIFAR-10-LT (ρ=50)
no code implementations • 11 Aug 2023 • Fan Zhang, Naye Ji, Fuxing Gao, Siyuan Zhao, Zhaohan Wang, Shunman Li
Firstly, considering that speech audio not only contains acoustic and semantic features but also conveys personality traits, emotions, and more subtle information related to accompanying gestures, we pioneer the adaptation of WavLM, a large-scale pre-trained model, to extract low-level and high-level audio information.
2 code implementations • 10 Aug 2023 • Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Shihao Ma, Adamo Young, Cheng Zhu, Kangkang Meng, Xin Yang, Ziyan Huang, Fan Zhang, Wentao Liu, YuanKe Pan, Shoujin Huang, Jiacheng Wang, Mingze Sun, Weixin Xu, Dengqiang Jia, Jae Won Choi, Natália Alves, Bram de Wilde, Gregor Koehler, Yajun Wu, Manuel Wiesenfarth, Qiongjie Zhu, Guoqiang Dong, Jian He, the FLARE Challenge Consortium, Bo Wang
The best-performing algorithms successfully generalized to holdout external validation sets, achieving a median DSC of 89.5%, 90.9%, and 88.3% on North American, European, and Asian cohorts, respectively.
no code implementations • 4 Aug 2023 • Fan Zhang
The design of deep neural networks remains more of an art than a precise science.
no code implementations • 19 Jul 2023 • Qingyao Ai, Ting Bai, Zhao Cao, Yi Chang, Jiawei Chen, Zhumin Chen, Zhiyong Cheng, Shoubin Dong, Zhicheng Dou, Fuli Feng, Shen Gao, Jiafeng Guo, Xiangnan He, Yanyan Lan, Chenliang Li, Yiqun Liu, Ziyu Lyu, Weizhi Ma, Jun Ma, Zhaochun Ren, Pengjie Ren, Zhiqiang Wang, Mingwen Wang, Ji-Rong Wen, Le Wu, Xin Xin, Jun Xu, Dawei Yin, Peng Zhang, Fan Zhang, Weinan Zhang, Min Zhang, Xiaofei Zhu
The research field of Information Retrieval (IR) has evolved significantly, expanding beyond traditional search to meet diverse user information needs.
no code implementations • 18 Jul 2023 • Tengfei Xue, Yuqian Chen, Chaoyi Zhang, Alexandra J. Golby, Nikos Makris, Yogesh Rathi, Weidong Cai, Fan Zhang, Lauren J. O'Donnell
TractCloud achieves efficient and consistent whole-brain white matter parcellation across the lifespan (from neonates to elderly subjects, including brain tumor patients) without the need for registration.
no code implementations • 15 Jul 2023 • Ao Jin, Fan Zhang, Panfeng Huang
To avoid the complex constraints of the traditional nonlinear method for tethered space robot (TSR) deployment, this paper proposes a data-driven optimal control framework with an improved deep learning-based Koopman operator that could be applied to complex environments.
no code implementations • 11 Jul 2023 • Kun Li, Fan Zhang, Wei Guo
In order to defend against malware attacks, researchers have proposed many Windows malware detection models based on deep learning.
2 code implementations • 11 Jul 2023 • Quan Sun, Qiying Yu, Yufeng Cui, Fan Zhang, Xiaosong Zhang, Yueze Wang, Hongcheng Gao, Jingjing Liu, Tiejun Huang, Xinlong Wang
We present Emu, a Transformer-based multimodal foundation model, which can seamlessly generate images and texts in multimodal context.
Ranked #1 on Visual Question Answering on VizWiz
no code implementations • 8 Jul 2023 • Yuqian Chen, Leo R. Zekelman, Chaoyi Zhang, Tengfei Xue, Yang Song, Nikos Makris, Yogesh Rathi, Alexandra J. Golby, Weidong Cai, Fan Zhang, Lauren J. O'Donnell
We evaluate the effectiveness of the proposed method by predicting individual performance on two neuropsychological assessments of language using a dataset of 20 association white matter fiber tracts from 806 subjects from the Human Connectome Project.
no code implementations • 3 Jul 2023 • Haixing Dai, Mengxuan Hu, Qing Li, Lu Zhang, Lin Zhao, Dajiang Zhu, Ibai Diez, Jorge Sepulcre, Fan Zhang, Xingyu Gao, Manhua Liu, Quanzheng Li, Sheng Li, Tianming Liu, Xiang Li
Alzheimer's disease (AD) is a neurodegenerative disorder that begins with amyloidosis, followed by neuronal loss and deterioration in structure, function, and cognition.