no code implementations • 2 Apr 2025 • Zhixin Cheng, Jiacheng Deng, Xinjun Li, Baoqun Yin, Tianzhu Zhang
In the AMAM, we design an adversarial approach to reduce the domain gap between image and point cloud.
no code implementations • 22 Jan 2025 • Chen Chen, Xinlong Hao, Weiwen Liu, Xu Huang, Xingshan Zeng, Shuai Yu, Dexun Li, Shuai Wang, Weinan Gan, Yuefeng Huang, Wulong Liu, Xinzhi Wang, Defu Lian, Baoqun Yin, Yasheng Wang, Wu Liu
Normal evaluates function calls in basic scenarios; Special evaluates function calls in scenarios with vague or incomplete instructions; Agent introduces multi-agent interactions to simulate function calling evaluation in real-world multi-turn interactions.
no code implementations • 25 Nov 2024 • Yuanyang Yin, Yaqi Zhao, Mingwu Zheng, Ke Lin, Jiarong Ou, Rui Chen, Victor Shea-Jay Huang, Jiahao Wang, Xin Tao, Pengfei Wan, Di Zhang, Baoqun Yin, Wentao Zhang, Kun Gai
Achieving optimal performance of video diffusion transformers within a given data and compute budget is crucial due to their high training costs.
no code implementations • 25 Nov 2024 • Yaqi Zhao, Yuanyang Yin, Lin Li, MingAn Lin, Victor Shea-Jay Huang, Siwei Chen, WeiPeng Chen, Baoqun Yin, Zenan Zhou, Wentao Zhang
Specifically, the VE's representation of visual information may not fully align with the LLM's cognitive framework, leading to a mismatch where visual features exceed the language model's interpretive range.
no code implementations • 21 Aug 2024 • Yuanyang Yin, Yaqi Zhao, YaJie Zhang, Ke Lin, Jiahao Wang, Xin Tao, Pengfei Wan, Di Zhang, Baoqun Yin, Wentao Zhang
Multimodal Large Language Models (MLLMs) have recently demonstrated remarkable perceptual and reasoning abilities, typically comprising a Vision Encoder, an Adapter, and a Large Language Model (LLM).
Ranked #71 on Visual Question Answering on MM-Vet
no code implementations • 13 Dec 2023 • Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang
Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs.
no code implementations • 10 Oct 2023 • Munawar Ali, Baoqun Yin, Hazrat Bilal, Aakash Kumar, Ali Muhammad, Avinash Rohra
The study proposes a combination of spiked and normal convolution layers as an energy-efficient and reliable object detector model.
no code implementations • 5 Dec 2021 • Yun Li, Chen Zhang, Shihao Han, Li Lyna Zhang, Baoqun Yin, Yunxin Liu, Mengwei Xu
Human brains are known to be capable of speeding up visual recognition of repeatedly presented objects through faster memory encoding and accessing procedures on activated neurons.
no code implementations • 4 Jul 2020 • Yun Li, Zechun Liu, Weiqun Wu, Haotian Yao, Xiangyu Zhang, Chi Zhang, Baoqun Yin
In this paper, a simple yet effective network pruning framework is proposed to simultaneously address the problems of pruning indicator, pruning ratio, and efficiency constraint.