1 code implementation • 17 Dec 2023 • Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Gang Yu, Tao Chen
Furthermore, we establish a new benchmark for assessing the performance of large models in understanding multi-modal 3D prompts.
1 code implementation • 30 Nov 2023 • Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
However, developing LMMs that can comprehend, reason, and plan in complex and diverse 3D environments remains challenging, especially given the need to understand permutation-invariant point cloud representations of the 3D scene.
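As a rough illustration of what "permutation-invariant" means for point clouds, the sketch below pools per-point features with a symmetric function (max), so reordering the points leaves the result unchanged. The function name `pooled_descriptor` and the random `weights` matrix are assumptions made for this example; this is not the architecture used in the paper.

```python
import numpy as np


def pooled_descriptor(points, weights):
    """Order-independent descriptor of a point cloud (illustrative only).

    points:  array of shape (N, 3), one row per 3D point.
    weights: array of shape (3, D), a hypothetical per-point linear map.
    """
    # Apply the same map to every point, followed by a ReLU.
    per_point = np.maximum(points @ weights, 0.0)  # shape (N, D)
    # Max over the point axis is symmetric, so row order does not matter.
    return per_point.max(axis=0)  # shape (D,)
```

Shuffling the rows of `points` before calling `pooled_descriptor` should give the same descriptor up to floating-point rounding, which is the property a model consuming raw point clouds has to respect.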
no code implementations • 29 Oct 2023 • Sijin Chen, Zhize Li, Yuejie Chi
To our knowledge, Power-EF is the first distributed and compressed SGD algorithm that provably escapes saddle points in heterogeneous FL without any data homogeneity assumptions.
1 code implementation • 6 Sep 2023 • Sijin Chen, Hongyuan Zhu, Mingsheng Li, Xin Chen, Peng Guo, Yinjie Lei, Gang Yu, Taihao Li, Tao Chen
Moreover, we argue that object localization and description generation require different levels of scene understanding, which could be challenging for a shared set of queries to capture.
1 code implementation • CVPR 2023 • Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Tao Chen, Gang Yu
Compared with prior art, our framework has several appealing advantages: 1) Without resorting to numerous hand-crafted components, our method is based on a full transformer encoder-decoder architecture with a learnable vote-query-driven object decoder and a caption decoder that produces the dense captions in a set-prediction manner.
no code implementations • 28 Dec 2021 • Sijin Chen, Xiwei Cheng, Anthony Man-Cho So
This paper proposes a Generalized Power Method (GPM) to tackle the problem of community detection and group synchronization simultaneously in a direct non-convex manner.
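To give a feel for the power-method idea, here is a minimal sketch of a projected power iteration on a much simpler toy problem: Z2 synchronization, where hidden signs in {+1, -1} are recovered from a noisy pairwise matrix. It is not the paper's GPM, which handles community detection and group synchronization jointly. The function names, the noise model, and the spectral initialization are all assumptions chosen for this example.

```python
import numpy as np


def power_method_z2(C, num_iters=50):
    """Projected power iteration for Z2 synchronization (illustrative only).

    C is a symmetric (n, n) matrix of noisy pairwise sign products.
    Returns a vector in {+1, -1}^n, defined only up to a global sign flip.
    """
    # Spectral initialization: eigh sorts eigenvalues in ascending order,
    # so the last column is the eigenvector of the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(C)
    x = np.where(eigvecs[:, -1] >= 0, 1.0, -1.0)
    for _ in range(num_iters):
        # Power step (multiply by C), then project entrywise onto {+1, -1}.
        x_new = np.where(C @ x >= 0, 1.0, -1.0)
        if np.array_equal(x_new, x):
            break  # reached a fixed point
        x = x_new
    return x


def make_z2_instance(n, sigma, seed=0):
    """Hypothetical toy data: C = z z^T + sigma * W with symmetric Gaussian W."""
    rng = np.random.default_rng(seed)
    z = rng.choice([-1.0, 1.0], size=n)
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2)
    return z, np.outer(z, z) + sigma * W
```

A typical use is `z, C = make_z2_instance(50, 0.2)` followed by `x = power_method_z2(C)`; comparing `abs(x @ z)` with `n` checks recovery up to the unavoidable global sign. The iteration works directly on the discrete, non-convex feasible set rather than on a convex relaxation, which is the sense of "direct non-convex manner" in the excerpt.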
1 code implementation • 5 Dec 2020 • Wu Zheng, Weiliang Tang, Sijin Chen, Li Jiang, Chi-Wing Fu
Existing single-stage detectors for locating objects in point clouds often treat object localization and category classification as separate tasks, so the localization accuracy and classification confidence may not align well.
Ranked #3 on Birds Eye View Object Detection on KITTI Cars Easy