no code implementations • WMT (EMNLP) 2020 • Tingxun Shi, Shiyu Zhao, Xiaopu Li, Xiaoxue Wang, Qian Zhang, Di Ai, Dawei Dang, Xue Zhengshan, Jie Hao
In this paper we demonstrate our (OPPO’s) machine translation systems for the WMT20 Shared Task on News Translation for all the 22 language pairs.
no code implementations • WMT (EMNLP) 2021 • Shiyu Zhao, Xiaopu Li, Minghui Wu, Jie Hao
This paper describes the Mininglamp neural machine translation systems for the WMT2021 news translation tasks.
1 code implementation • 31 Mar 2025 • Wenkang Ji, Huaben Chen, Mingyang Chen, Guobin Zhu, Lufeng Xu, Roderich Groß, Rui Zhou, Ming Cao, Shiyu Zhao
The development of control policies for multi-robot systems traditionally follows a complex and labor-intensive process, often lacking the flexibility to adapt to dynamic tasks.
no code implementations • 18 Mar 2025 • Yang Zhou, Shiyu Zhao, Yuxiao Chen, Zhenting Wang, Dimitris N. Metaxas
Large foundation models trained on large-scale visual-text data can significantly enhance Open Vocabulary Object Detection (OVD) through data generation.
1 code implementation • 10 Mar 2025 • Hanqing Guo, Xiuxiu Lin, Shiyu Zhao
Next, this motion difference map is combined with an RGB image using a bimodal fusion module, allowing for adaptive feature learning of the drone.
1 code implementation • 2 Feb 2025 • Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas
Visual prompting has gained popularity as a method for adapting pre-trained models to specific tasks, particularly in the realm of parameter-efficient tuning.
no code implementations • 31 Dec 2024 • Zhenting Wang, Shuming Hu, Shiyu Zhao, Xiaowen Lin, Felix Juefei-Xu, Zhuowei Li, Ligong Han, Harihar Subramanyam, Li Chen, Jianfa Chen, Nan Jiang, Lingjuan Lyu, Shiqing Ma, Dimitris N. Metaxas, Ankit Jain
To address these challenges, we propose an MLLM-based method that includes objectifying safety rules, assessing the relevance between rules and images, making quick judgments based on debiased token probabilities with logically complete yet simplified precondition chains for safety rules, and conducting more in-depth reasoning with cascaded chain-of-thought processes if necessary.
1 code implementation • 24 Dec 2024 • Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen
Reasoning is critical for large language models (LLMs) to excel in a wide range of tasks.
no code implementations • 30 Nov 2024 • Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu, Xide Xia, Miao Liu, Xiaofang Wang, Mingfu Liang, Ning Zhang, Dimitris N. Metaxas, Licheng Yu
For Scenario II, based on the reduction strategy from G-Search, we design a parametric sigmoid function (P-Sigmoid) to guide the reduction at each layer of the MLLM, whose parameters are optimized by Bayesian Optimization.
1 code implementation • 13 Nov 2024 • Peng Wang, Lingzhe Zhao, Yin Zhang, Shiyu Zhao, Peidong Liu
In our experiments, MBA-SLAM surpasses previous state-of-the-art methods in both camera localization and map reconstruction on synthetic and real datasets, covering sharp images as well as images affected by motion blur, which highlights the versatility and robustness of our approach.
no code implementations • 14 Oct 2024 • Hanqing Guo, Canlun Zheng, Shiyu Zhao
Our proposed method can effectively and efficiently detect extremely small MAVs from dynamic and complex backgrounds because it aggregates pixel-level motion features and eliminates false positives based on the motion and appearance features of MAVs.
3 code implementations • 9 Oct 2024 • Manling Li, Shiyu Zhao, Qineng Wang, Kangrui Wang, Yu Zhou, Sanjana Srivastava, Cem Gokmen, Tony Lee, Li Erran Li, Ruohan Zhang, Weiyu Liu, Percy Liang, Li Fei-Fei, Jiayuan Mao, Jiajun Wu
We aim to evaluate Large Language Models (LLMs) for embodied decision making.
no code implementations • 20 Jun 2024 • Can Jin, Hongwu Peng, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, Dimitris N. Metaxas
Existing automatic prompt engineering algorithms primarily focus on language modeling and classification tasks, leaving the domain of IR, particularly reranking, underexplored.
1 code implementation • 17 Jun 2024 • Shirley Wu, Shiyu Zhao, Qian Huang, Kexin Huang, Michihiro Yasunaga, Kaidi Cao, Vassilis N. Ioannidis, Karthik Subbian, Jure Leskovec, James Zou
Large language model (LLM) agents have demonstrated impressive capabilities in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.
1 code implementation • 19 Apr 2024 • Shirley Wu, Shiyu Zhao, Michihiro Yasunaga, Kexin Huang, Kaidi Cao, Qian Huang, Vassilis N. Ioannidis, Karthik Subbian, James Zou, Jure Leskovec
To address the gap, we develop STARK, a large-scale Semi-structure retrieval benchmark on Textual and Relational Knowledge Bases.
no code implementations • CVPR 2024 • Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker
This necessitates an expensive process of continuously curating and annotating data with significant human effort.
1 code implementation • 25 Mar 2024 • Yin Zhang, Jinhong Deng, Peidong Liu, Wen Li, Shiyu Zhao
A new benchmark for cross-domain MAV detection is proposed based on the proposed dataset.
1 code implementation • CVPR 2024 • Shiyu Zhao, Long Zhao, Vijay Kumar B. G, Yumin Suh, Dimitris N. Metaxas, Manmohan Chandraker, Samuel Schulter
The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations.
2 code implementations • 18 Dec 2023 • Hanqing Guo, Ye Zheng, Yin Zhang, Zhi Gao, Shiyu Zhao
In this paper, we propose a global-local MAV detector that can fuse both motion and appearance features for MAV detection under challenging conditions.
1 code implementation • 31 Oct 2023 • Huaben Chen, Wenkang Ji, Lufeng Xu, Shiyu Zhao
To that end, this work studies a consensus-seeking task where the state of each agent is a numerical value and they negotiate with each other to reach a consensus value.
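As a point of reference for the consensus-seeking task described above, the sketch below shows the classical linear averaging update, in which each agent repeatedly moves its value toward those of its neighbors. This is not the method of the listed work; the function names, the ring topology, and the step size are all assumptions chosen for illustration.

```python
def consensus_step(values, neighbors, step=0.3):
    """One synchronous update: each agent moves toward its neighbors' values.

    `neighbors[i]` lists the indices agent i can communicate with
    (hypothetical representation, assumed undirected).
    """
    return [
        x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


def run_consensus(values, neighbors, rounds=50, step=0.3):
    """Apply the averaging update for a fixed number of rounds."""
    for _ in range(rounds):
        values = consensus_step(values, neighbors, step)
    return values


# Four agents on a ring; each one talks to its two adjacent agents.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final_values = run_consensus([0.0, 10.0, 20.0, 30.0], ring)
```

On an undirected, connected graph with a sufficiently small step size, this update preserves the average of the initial values, so the agents settle on that average.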
no code implementations • 24 Aug 2023 • Jianan Li, Liang Li, Shiyu Zhao
Understanding how local interactions give rise to global collective behavior is of utmost importance in both biological and physical research.
2 code implementations • CVPR 2024 • Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B. G, Yumin Suh, Manmohan Chandraker, Dimitris N. Metaxas
This work identifies two challenges of using self-training in OVD: noisy PLs from VLMs and frequent distribution changes of PLs.
no code implementations • ICCV 2023 • Samuel Schulter, Vijay Kumar B G, Yumin Suh, Konstantinos M. Dafnis, Zhixing Zhang, Shiyu Zhao, Dimitris Metaxas
With more than 28K unique object descriptions on over 25K images, OmniLabel provides a challenging benchmark with diverse and complex object descriptions in a naturally open-vocabulary setting.
no code implementations • 9 Mar 2023 • Zhiyong Sun, Shiyu Zhao, Daniel Zelazo
These conditions involve the spectrum and null space of the associated bearing Laplacian matrix for a directed bearing formation.
1 code implementation • 16 Aug 2022 • Xiao Liu, Shiyu Zhao, Kai Su, Yukuo Cen, Jiezhong Qiu, Mengdi Zhang, Wei Wu, Yuxiao Dong, Jie Tang
In this work, we present the Knowledge Graph Transformer (kgTransformer) with masked pre-training and fine-tuning strategies.
1 code implementation • 18 Jul 2022 • Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B. G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images, effectively generating pseudo labels for object detection.
Ranked #20 on Open Vocabulary Object Detection on MSCOCO
1 code implementation • CVPR 2022 • Shiyu Zhao, Long Zhao, Zhixing Zhang, Enyu Zhou, Dimitris Metaxas
In this paper, inspired by the traditional matching-optimization methods where matching is introduced to handle large displacements before energy-based optimizations, we introduce a simple but effective global matching step before the direct regression and develop a learning-based matching-optimization framework, namely GMFlowNet.
Ranked #5 on Optical Flow Estimation on KITTI 2015
1 code implementation • 10 Nov 2021 • Hui Yu, Shiyu Zhao, JianYu Shi
Results: In this paper, by supposing that the interactions between two given drugs are caused by their local chemical structures (substructures) and that their DDI types are determined by the linkages between different substructure sets, we design a novel Substructure-aware Tensor Neural Network model for DDI prediction (STNN-DDI).
1 code implementation • CVPR 2021 • Li SiYao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu
In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.
1 code implementation • 20 Sep 2016 • Shiyu Zhao
The time derivative of a rotation matrix equals the product of a skew-symmetric matrix and the rotation matrix itself.
Robotics
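The identity stated in the last entry, that the time derivative of a rotation matrix equals a skew-symmetric matrix times the rotation matrix, can be checked numerically. The sketch below is a minimal check for the special case of rotation about the z-axis at a constant rate. It assumes the left-multiplication convention (angular velocity expressed in the fixed frame), and all function names are hypothetical.

```python
import math


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def skew(w):
    """Skew-symmetric matrix S such that S v equals the cross product w x v."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def rotation_derivative_error(rate=0.7, t=1.3, h=1e-6):
    """Largest entrywise gap between a finite-difference dR/dt and skew(w) R.

    R(t) = rot_z(rate * t), so the angular velocity is w = (0, 0, rate).
    """
    r_plus = rot_z(rate * (t + h))
    r_minus = rot_z(rate * (t - h))
    numeric = [
        [(r_plus[i][j] - r_minus[i][j]) / (2 * h) for j in range(3)]
        for i in range(3)
    ]
    analytic = matmul(skew((0.0, 0.0, rate)), rot_z(rate * t))
    return max(
        abs(numeric[i][j] - analytic[i][j]) for i in range(3) for j in range(3)
    )
```

Calling `rotation_derivative_error()` returns a value close to zero, limited only by finite-difference and floating-point error, which is consistent with the identity for this special case.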