1 code implementation • 2 Apr 2025 • Yi-Long Lu, Chunhui Zhang, Jiajun Song, Lifeng Fan, Wei Wang
Theory of Mind (ToM), the ability to attribute mental states to others, is fundamental for human social intelligence and a critical capability for advanced Artificial Intelligence.
1 code implementation • 31 Dec 2024 • Ling Fu, Biao Yang, Zhebin Kuang, Jiajun Song, Yuzhe Li, Linghao Zhu, Qidi Luo, Xinyu Wang, Hao Lu, Mingxin Huang, Zhang Li, Guozhi Tang, Bin Shan, Chunhui Lin, Qi Liu, Binghong Wu, Hao Feng, Hao Liu, Can Huang, Jingqun Tang, Wei Chen, Lianwen Jin, Yuliang Liu, Xiang Bai
Scoring the Optical Character Recognition (OCR) capabilities of Large Multimodal Models (LMMs) has witnessed growing interest recently.
no code implementations • 14 Dec 2024 • Chi Zhang, Jiajun Song, Siyu Li, Yitao Liang, Yuxi Ma, Wei Wang, Yixin Zhu, Song-Chun Zhu
Mathematics olympiads are prestigious competitions, with problem proposing and solving highly honored.
1 code implementation • CVPR 2025 • Pengfei Zhou, Xiaopeng Peng, Jiajun Song, Chuanhao Li, Zhaopan Xu, Yue Yang, Ziyao Guo, Hao Zhang, Yuqi Lin, Yefei He, Lirui Zhao, Shuo Liu, Tianhua Li, Yuxuan Xie, Xiaojun Chang, Yu Qiao, Wenqi Shao, Kaipeng Zhang
While the progress in unified models offers new solutions, existing benchmarks are insufficient for evaluating these methods due to limitations in data size and diversity.
1 code implementation • 18 Aug 2024 • Jiajun Song, Zhuoyan Xu, Yiqiao Zhong
We empirically examined the training dynamics of Transformers on a synthetic example and conducted extensive experiments on a variety of pretrained LLMs, focusing on a type of component known as induction heads.
1 code implementation • 6 Jul 2024 • Jiajun Song, Jiajun Luo, Rongwei Lu, Shuzhao Xie, Bin Chen, Zhi Wang
Asynchronous Federated Learning (AFL) confronts inherent challenges arising from the heterogeneity of devices (e.g., their computation capacities) and low-bandwidth environments, both potentially causing stale model updates (e.g., local gradients) for global aggregation.
1 code implementation • 21 May 2024 • Hiba Maryam, Ling Fu, Jiajun Song, Tajrian ABM Shafayet, Qidi Luo, Xiang Bai, Yuliang Liu
The development of Urdu scene text detection, recognition, and Visual Question Answering (VQA) technologies is crucial for advancing accessibility, information retrieval, and linguistic diversity in digital content, facilitating better understanding and interaction with Urdu-language visual data.
1 code implementation • 14 Feb 2024 • Pengfei Zhou, Weiqing Min, Jiajun Song, Yang Zhang, Shuqiang Jiang
The complexity of food semantic attributes further makes it more difficult for current ZSD methods to distinguish various food categories.
1 code implementation • 7 Oct 2023 • Pengfei Zhou, Weiqing Min, Yang Zhang, Jiajun Song, Ying Jin, Shuqiang Jiang
To tackle this, we propose the Semantic Separable Diffusion Synthesizer (SeeDS) framework for Zero-Shot Food Detection (ZSFD).
Ranked #1 on Generalized Zero-Shot Object Detection on MS-COCO
1 code implementation • 7 Oct 2023 • Jiajun Song, Yiqiao Zhong
Given an embedding vector $\boldsymbol{h}_{c, t} \in \mathbb{R}^d$ at sequence position $t \le T$ in a sequence (or context) $c \le C$, extracting the mean effects yields the decomposition \[ \boldsymbol{h}_{c, t} = \boldsymbol{\mu} + \mathbf{pos}_t + \mathbf{ctx}_c + \mathbf{resid}_{c, t} \] where $\boldsymbol{\mu}$ is the global mean vector, $\mathbf{pos}_t$ and $\mathbf{ctx}_c$ are the mean vectors across contexts and across positions respectively, and $\mathbf{resid}_{c, t}$ is the residual vector.
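The decomposition above can be illustrated with a minimal NumPy sketch. It assumes the embeddings are stacked in an array of shape (C, T, d), and that the positional and context terms are centered by subtracting the global mean so that the four terms sum back to the original vector. That centering is an assumption made here for the identity to hold, and the function name `mean_decomposition` and the random test data are hypothetical, not taken from the paper's code.

```python
import numpy as np

def mean_decomposition(h):
    """Split embeddings h of shape (C, T, d) into mean effects.

    Assumption: pos and ctx are centered (global mean subtracted),
    so that h = mu + pos[t] + ctx[c] + resid[c, t] holds exactly.
    """
    mu = h.mean(axis=(0, 1))        # global mean, shape (d,)
    pos = h.mean(axis=0) - mu       # mean across contexts, shape (T, d)
    ctx = h.mean(axis=1) - mu       # mean across positions, shape (C, d)
    # Residual is whatever the three mean effects do not explain.
    resid = h - mu - pos[None, :, :] - ctx[:, None, :]
    return mu, pos, ctx, resid

# Hypothetical data: C = 4 contexts, T = 6 positions, d = 3 dimensions.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 6, 3))
mu, pos, ctx, resid = mean_decomposition(h)

# Reassemble the four terms with broadcasting.
recon = mu + pos[None, :, :] + ctx[:, None, :] + resid
```

Under this centering, the residual averages to zero over contexts at each position and over positions within each context, which is what makes the four terms separable.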
1 code implementation • journal 2023 • Jiajun Song, Zhuo Li, Weiqing Min, Shuqiang Jiang
Therefore, it is challenging to study the generalization of the model in food image retrieval.
2 code implementations • 9 Feb 2021 • Zhuo Li, Weiqing Min, Jiajun Song, Yaohui Zhu, Liping Kang, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang
Limited by the definition of AP, such methods consider both negative and positive instances ranking before each positive instance.
Ranked #3 on Vehicle Re-Identification on VehicleID Large
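To make the remark about the definition of AP concrete, here is a small sketch of the standard Average Precision computation over a ranked list. It is the textbook definition, not the method proposed in the paper, and the function name `average_precision` and the example label lists are hypothetical. Note that the denominator of each precision term is the rank, which counts every instance (positive or negative) placed at or before that positive.

```python
def average_precision(labels):
    """Standard AP for a ranked list of relevance flags.

    labels: iterable of 1 (positive) or 0 (negative), best-ranked first.
    Returns 0.0 when the list contains no positives.
    """
    hits = 0
    precisions = []
    for rank, is_positive in enumerate(labels, start=1):
        if is_positive:
            hits += 1
            # Precision at this positive: positives so far / all items so far.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# A negative ranked between two positives lowers the second precision term.
mixed = average_precision([1, 0, 1, 0])    # (1/1 + 2/3) / 2
perfect = average_precision([1, 1, 0, 0])  # (1/1 + 2/2) / 2
```

In the first list the second positive sits at rank 3, so its precision term is 2/3 even though only one negative precedes it; both the earlier positive and the earlier negative enter the computation.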