1 code implementation • 20 Dec 2024 • Jiaming Ji, Jiayi Zhou, Hantao Lou, Boyuan Chen, Donghai Hong, Xuyao Wang, Wenqi Chen, Kaile Wang, Rui Pan, Jiahao Li, Mohan Wang, Josef Dai, Tianyi Qiu, Hua Xu, Dong Li, WeiPeng Chen, Jun Song, Bo Zheng, Yaodong Yang
In this work, we make the first attempt to fine-tune all-modality models (i.e., input and output with any modality, also called any-to-any models) using human preference data across all modalities (including text, image, audio, and video), ensuring that their behavior aligns with human intentions.
no code implementations • 16 Dec 2024 • Jianxiang Yu, Jiaqi Tan, Zichen Ding, Jiapeng Zhu, Jiahao Li, Yao Cheng, Qier Cui, Yunshi Lan, Xiang Li
Peer review, as a cornerstone of scientific research, ensures the integrity and quality of scholarly work by providing authors with objective feedback for refinement.
no code implementations • 2 Dec 2024 • Wenxin Su, Song Tang, Xiaofeng Liu, Xiaojing Yi, Mao Ye, Chunxiao Zu, Jiahao Li, Xiatian Zhu
Specifically, we first theoretically reformulate conventional perturbation optimization in a generative way--learning a perturbation generation function with a latent input variable.
no code implementations • 19 Oct 2024 • Chao Li, Jiahao Li, Jinwei Zhang, Eddy Solomon, Alexey V. Dimov, Pascal Spincemaille, Thanh D. Nguyen, Martin R. Prince, Yi Wang
Purpose: To develop an MRI technique for free-breathing 3D whole-liver quantification of water T1, water T2, proton density fat fraction (PDFF), and R2*.
no code implementations • 8 Aug 2024 • Xiaoyang Ji, Yuchen Zhou, Haofu Yang, Shiyue Xu, Jiahao Li
Graph clustering, a classical task in graph learning, involves partitioning the nodes of a graph into distinct clusters.
no code implementations • 26 May 2024 • Chao Li, Jinwei Zhang, Hang Zhang, Jiahao Li, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
Purpose: To develop a pipeline for motion artifact correction in mGRE and quantitative susceptibility mapping (QSM).
1 code implementation • 17 May 2024 • Jiahao Li, Quan Wang, Licheng Zhang, Guoqing Jin, Zhendong Mao
In this paper, we propose a feature-adaptive and data-scalable in-context learning framework (FADS-ICL), which can leverage task-adaptive features to promote inference on the downstream task, with the supervision of beyond-context samples.
1 code implementation • 14 May 2024 • Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, Zheng Fang, Weiyan Wang, Jinbao Xue, Yangyu Tao, Jianchen Zhu, Kai Liu, Sihuan Lin, Yifu Sun, Yun Li, Dongdong Wang, Mingtao Chen, Zhichao Hu, Xiao Xiao, Yan Chen, Yuhong Liu, Wei Liu, Di Wang, Yong Yang, Jie Jiang, Qinglin Lu
For fine-grained language understanding, we train a Multimodal Large Language Model to refine the captions of the images.
no code implementations • 7 May 2024 • Chen Qian, Jiahao Li, Yufan Dang, Wei Liu, Yifei Wang, Zihao Xie, Weize Chen, Cheng Yang, Yingli Zhang, Zhiyuan Liu, Maosong Sun
We propose two fundamental patterns: the successive pattern, refining based on nearest experiences within a task batch, and the cumulative pattern, acquiring experiences across all previous task batches.
no code implementations • 28 Mar 2024 • Wufei Ma, Jiahao Li, Bin Li, Yan Lu
Deep learning-based video compression is a challenging task, and many previous state-of-the-art learning-based video codecs use optical flows to exploit the temporal correlation between successive frames and then compress the residual error.
1 code implementation • CVPR 2024 • Jiahao Li, Bin Li, Yan Lu
This results in a better learning of the quantization scaler and helps our NVC support about 11.4 dB PSNR range.
no code implementations • CVPR 2024 • Zhaoyang Jia, Jiahao Li, Bin Li, Houqiang Li, Yan Lu
To address this issue we introduce a Generative Latent Coding (GLC) architecture which performs transform coding in the latent space of a generative vector-quantized variational auto-encoder (VQ-VAE) instead of in the pixel space.
1 code implementation • CVPR 2024 • Yue Gao, Jiahao Li, Lei Chu, Yan Lu
Recent advancements in video modeling extensively rely on optical flow to represent the relationships across frames but this approach often lacks efficiency and fails to model the probability of the intrinsic motion of objects.
no code implementations • CVPR 2024 • Xin Kang, Lei Chu, Jiahao Li, Xuejin Chen, Yan Lu
Recent methods for label-free 3D semantic segmentation aim to assist 3D model training by leveraging the open-world recognition ability of pre-trained vision language models.
1 code implementation • 28 Dec 2023 • Chen Qian, Yufan Dang, Jiahao Li, Wei Liu, Zihao Xie, Yifei Wang, Weize Chen, Cheng Yang, Xin Cong, Xiaoyin Che, Zhiyuan Liu, Maosong Sun
Recent advancements in large language models (LLMs) have brought significant changes to various domains, especially through LLM-driven autonomous agents.
no code implementations • CVPR 2024 • Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman
To this end, we introduce Carve3D, an improved RLFT algorithm coupled with a novel Multi-view Reconstruction Consistency (MRC) metric, to enhance the consistency of multi-view diffusion models.
no code implementations • 23 Nov 2023 • Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, Yongdong Zhang
In this paper, the inherent discrepancies are manifested in two aspects, namely, accuracy of data annotation and diversity of potential annotations.
no code implementations • 15 Nov 2023 • Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang
We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.
no code implementations • 10 Nov 2023 • Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi
Text-to-3D with diffusion models has achieved remarkable progress in recent years.
1 code implementation • ICCV 2023 • Jiahao Li, Zongxin Yang, Xiaohan Wang, Jianxin Ma, Chang Zhou, Yi Yang
Our method includes an encoder-decoder transformer architecture that fuses 2D and 3D representations to achieve 2D&3D aligned results in a coarse-to-fine manner, and a novel 3D joint contrastive learning approach that adds explicit global supervision for the 3D feature space.
1 code implementation • 16 Jul 2023 • Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun
Numerous studies used deep learning to improve specific phases in a waterfall model, such as design, coding, and testing.
no code implementations • 5 Jul 2023 • Yuanyou Xu, Jiahao Li, Zongxin Yang, Yi Yang, Yueting Zhuang
MSDeAOT efficiently propagates object masks from previous frames to the current frame using two feature scales of 16 and 8.
no code implementations • 5 Jul 2023 • Jiahao Li, Yuanyou Xu, Zongxin Yang, Yi Yang, Yueting Zhuang
The Associating Objects with Transformers (AOT) framework has exhibited exceptional performance in a wide range of complex scenarios for video object segmentation.
1 code implementation • 5 Jun 2023 • Hang Zhang, Renjiu Hu, Xiang Chen, Rongguang Wang, Jinwei Zhang, Jiahao Li
Specifically, the network incorporating DAGrid has realized a 70.8% reduction in network parameter size and a 96.8% decrease in FLOPs, while concurrently improving the Dice score for skin lesion segmentation by 1.0% compared to state-of-the-art transformers.
1 code implementation • Information Processing & Management 2023 • Jiahao Li, Yong Zhang, Xingyu Yang, LiangWei Chen
In addition, while the vast majority of SOTA strategies maintain a poor turnover rate of more than 50% on average, our framework enjoys a relatively low turnover rate on all datasets. Efficiency analysis illustrates that our framework no longer has the quadratic dependency limitation.
no code implementations • 7 Apr 2023 • Jinwei Zhang, Thanh D. Nguyen, Eddy Solomon, Chao Li, Qihao Zhang, Jiahao Li, Hang Zhang, Pascal Spincemaille, Yi Wang
Results: The retrospective ablation study showed improved image sharpness of mcLARO compared to the baseline network without multi-contrast sampling pattern optimization or image feature fusion, and negligible bias and narrow 95% limits of agreement on regional T1, T2, T2* and QSM values were obtained by the under-sampled reconstructions compared to the fully sampled reconstruction.
1 code implementation • 15 Mar 2023 • Hang Zhang, Rongguang Wang, Renjiu Hu, Jinwei Zhang, Jiahao Li
Chronic active multiple sclerosis lesions, also termed rim+ lesions, can be characterized by a hyperintense rim at the edge of the lesion on quantitative susceptibility maps.
2 code implementations • CVPR 2023 • Jiahao Li, Bin Li, Yan Lu
Better yet, our codec has surpassed ECM, the next-generation traditional codec still under development, in both RGB and YUV420 colorspaces in terms of PSNR.
1 code implementation • 10 Feb 2023 • Guo-Hua Wang, Jiahao Li, Bin Li, Yan Lu
Both mask decay and residual representation learning greatly improve the RD performance of our scalable encoder.
no code implementations • 19 Jan 2023 • Hang Zhang, Rongguang Wang, Jinwei Zhang, Dongdong Liu, Chao Li, Jiahao Li
Compared to natural images, medical images usually show stronger visual patterns and therefore this adds flexibility and elasticity to resource-limited clinical applications by injecting proper priors into neural networks.
no code implementations • CVPR 2023 • Linfeng Qi, Jiahao Li, Bin Li, Houqiang Li, Yan Lu
Meanwhile, besides assisting frame coding at the current time step, the feature from context generation will be propagated as motion condition when coding the subsequent motion latent.
no code implementations • CVPR 2023 • Fei Xie, Lei Chu, Jiahao Li, Yan Lu, Chao Ma
Existing Siamese tracking methods, which are built on pair-wise matching between two single frames, rely heavily on additional sophisticated mechanisms to exploit temporal information among successive video frames, which hinders their efficiency and industrial deployment.
1 code implementation • 3 Dec 2022 • Jiahao Li, Zhourun Wu, Wenhao Lin, Jiawei Luo, Jun Zhang, Qingcai Chen, Junjie Chen
Although many feature extraction methods have been proposed to improve the performance of enhancer identification, they cannot learn position-related multiscale contextual information from raw DNA sequences.
1 code implementation • CVPR 2023 • Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A. Yeh, Greg Shakhnarovich
We propose to apply chain rule on the learned gradients, and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate to be a voxel radiance field.
Ranked #6 on Text to 3D on T³Bench
1 code implementation • 1 Nov 2022 • Jinwei Zhang, Pascal Spincemaille, Hang Zhang, Thanh D. Nguyen, Chao Li, Jiahao Li, Ilhami Kovanlikaya, Mert R. Sabuncu, Yi Wang
In this paper, we present our new framework, called Learned Acquisition and Reconstruction Optimization (LARO), which aims to accelerate the multi-echo gradient echo (mGRE) pulse sequence for QSM.
1 code implementation • 20 Oct 2022 • Jiahao Li, Quan Wang, Zhendong Mao, Junbo Guo, Yanyan Yang, Yongdong Zhang
In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task.
1 code implementation • 13 Jul 2022 • Jiahao Li, Bin Li, Yan Lu
Besides estimating the probability distribution, our entropy model also generates the quantization step in a spatial-channel-wise manner.
1 code implementation • 7 Apr 2022 • Jiahao Li, Greg Shakhnarovich, Raymond A. Yeh
Our method for phrase localization requires no human annotations or additional training.
no code implementations • CVPR 2022 • Cong Huang, Jiahao Li, Bin Li, Dong Liu, Yan Lu
The temporal features usually contain various noisy and uncorrelated information, and they may interfere with the restoration of the current frame.
1 code implementation • 27 Nov 2021 • Xihua Sheng, Jiahao Li, Bin Li, Li Li, Dong Liu, Yan Lu
From the stored propagated features, we propose to learn multi-scale temporal contexts, and re-fill the learned temporal contexts into the modules of our compression scheme, including the contextual encoder-decoder, the frame generator, and the temporal context encoder.
no code implementations • 19 Oct 2021 • Zhaoliang Zheng, Jiahao Li, Parth Agrawal, Zhao Lei, Aaron John-Sabu, Ankur Mehta
Designing a controllable airship for non-expert users or preemptively evaluating the performance of desired airships has always been a very challenging problem.
1 code implementation • NeurIPS 2021 • Jiahao Li, Bin Li, Yan Lu
In this paper, we propose a deep contextual video compression framework to enable a paradigm shift from predictive coding to conditional coding.
no code implementations • 19 Feb 2021 • Erva Ulu, Nurcan Gecer Ulu, Jiahao Li, Walter Hsiao
Combined with the build orientation optimization, Curvy provides a practical solution to the design of support structures with minimal perceptual or functional impact on the target part to be printed.
Graphics • Computational Geometry
1 code implementation • 18 Feb 2021 • Wencheng Zhu, Jiahao Li, Jiwen Lu, Jie Zhou
Specifically, we first compute a pixel-wise similarity matrix by using representations of reference and target pixels and then select top-rank reference pixels for target pixel classification.
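The two steps in this snippet (build a pixel-wise similarity matrix, then keep only the top-ranked reference pixels when classifying each target pixel) can be illustrated with a small sketch. This is not the authors' implementation: cosine similarity, majority voting, the value of `k`, and the names `cosine` and `classify_target_pixels` are all assumptions chosen for illustration, and real features would come from a learned network rather than hand-written vectors.

```python
import heapq
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_target_pixels(ref_feats, ref_labels, tgt_feats, k=3):
    """Label each target pixel by majority vote over its k most similar reference pixels."""
    # similarity[i][j] is the similarity between target pixel i and reference pixel j.
    similarity = [[cosine(t, r) for r in ref_feats] for t in tgt_feats]
    predictions = []
    for row in similarity:
        # Indices of the k reference pixels with the highest similarity to this target pixel.
        top = heapq.nlargest(k, range(len(row)), key=row.__getitem__)
        votes = Counter(ref_labels[j] for j in top)
        predictions.append(votes.most_common(1)[0][0])
    return similarity, predictions


# Toy data (hypothetical): four reference pixels with 2-D features and binary labels.
ref_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
ref_labels = [1, 1, 0, 0]
tgt_feats = [[1.0, 0.1], [0.0, 1.0]]
similarity, predictions = classify_target_pixels(ref_feats, ref_labels, tgt_feats, k=3)
print(predictions)
```

Restricting the vote to the top-ranked reference pixels keeps dissimilar pixels from influencing the label, which is the intuition the snippet describes; how the paper weights or aggregates those pixels is not stated here.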
no code implementations • 8 Feb 2021 • Xin Zhao, Jifeng Guo, Lin Wang, Fanqi Li, Jiahao Li, Junteng Zheng, Bo Yang
Solid texture synthesis (STS), an effective way to extend a 2D exemplar to a 3D solid volume, exhibits advantages in computational photography.
1 code implementation • 1 Dec 2020 • Wencheng Zhu, Jiwen Lu, Jiahao Li, Jie Zhou
In this paper, we propose a Detect-to-Summarize network (DSNet) framework for supervised video summarization.
Ranked #2 on Video Summarization on TvSum (using extra training data)
no code implementations • 14 Sep 2020 • Jiacheng Ruan, Jiahao Li
Ensemble methods, which are common in machine learning, train multiple models on a data set and obtain better results by combining them through certain strategies.
1 code implementation • ECCV 2020 • Jiahao Li, Changhao Zhang, Ziyao Xu, Hangning Zhou, Chi Zhang
In this paper, we propose a novel learning-based pipeline for partially overlapping 3D point cloud registration.
2 code implementations • 22 Sep 2018 • Junfeng Jiang, Jiahao Li
In particular, during the Chinese market crash in 2015, the Pearson correlation coefficient between the adjusted sentiment factor and the SSE is 0.5844, which suggests that our model can provide solid guidance, especially in special periods when the market is influenced greatly by public sentiment.
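For readers unfamiliar with the statistic quoted in this snippet, the Pearson correlation coefficient between two equally long series can be computed as in the minimal sketch below. The function name `pearson` and the two toy series are hypothetical and are not the paper's data; the sketch only shows the standard definition (covariance divided by the product of the standard deviations).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and centered squares; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy series standing in for a sentiment factor and an index level.
sentiment = [0.1, 0.4, 0.3, 0.8, 0.6]
index_level = [1.0, 2.0, 1.5, 3.0, 2.5]
r = pearson(sentiment, index_level)
print(round(r, 4))
```

A value near 1 indicates that the two series move together, a value near -1 that they move in opposite directions, and a value near 0 that there is little linear relationship.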