Search Results for author: Shiyu Zhao

Found 30 papers, 19 papers with code

OPPO’s Machine Translation Systems for WMT20

no code implementations WMT (EMNLP) 2020 Tingxun Shi, Shiyu Zhao, Xiaopu Li, Xiaoxue Wang, Qian Zhang, Di Ai, Dawei Dang, Xue Zhengshan, Jie Hao

In this paper we demonstrate our (OPPO’s) machine translation systems for the WMT20 Shared Task on News Translation for all the 22 language pairs.

Machine Translation Translation

GenSwarm: Scalable Multi-Robot Code-Policy Generation and Deployment via Language Models

1 code implementation 31 Mar 2025 Wenkang Ji, Huaben Chen, Mingyang Chen, Guobin Zhu, Lufeng Xu, Roderich Groß, Rui Zhou, Ming Cao, Shiyu Zhao

The development of control policies for multi-robot systems traditionally follows a complex and labor-intensive process, often lacking the flexibility to adapt to dynamic tasks.

Zero-Shot Learning

LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation

no code implementations 18 Mar 2025 Yang Zhou, Shiyu Zhao, Yuxiao Chen, Zhenting Wang, Dimitris N. Metaxas

Large foundation models trained on large-scale visual-text data can significantly enhance Open Vocabulary Object Detection (OVD) through data generation.

object-detection Open-vocabulary object detection +3

YOLOMG: Vision-based Drone-to-Drone Detection with Appearance and Pixel-Level Motion Fusion

1 code implementation 10 Mar 2025 Hanqing Guo, Xiuxiu Lin, Shiyu Zhao

Next, this motion difference map is combined with an RGB image using a bimodal fusion module, allowing for adaptive feature learning of the drone.

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

1 code implementation 2 Feb 2025 Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas

Visual prompting has gained popularity as a method for adapting pre-trained models to specific tasks, particularly in the realm of parameter-efficient tuning.

Inductive Bias Visual Prompting

MLLM-as-a-Judge for Image Safety without Human Labeling

no code implementations 31 Dec 2024 Zhenting Wang, Shuming Hu, Shiyu Zhao, Xiaowen Lin, Felix Juefei-Xu, Zhuowei Li, Ligong Han, Harihar Subramanyam, Li Chen, Jianfa Chen, Nan Jiang, Lingjuan Lyu, Shiqing Ma, Dimitris N. Metaxas, Ankit Jain

To address these challenges, we propose an MLLM-based method that includes objectifying safety rules, assessing the relevance between rules and images, making quick judgments based on debiased token probabilities with logically complete yet simplified precondition chains for safety rules, and conducting more in-depth reasoning with cascaded chain-of-thought processes if necessary.

Image Generation

Token-Budget-Aware LLM Reasoning

1 code implementation 24 Dec 2024 Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen

Reasoning is critical for large language models (LLMs) to excel in a wide range of tasks.

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

no code implementations 30 Nov 2024 Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu, Xide Xia, Miao Liu, Xiaofang Wang, Mingfu Liang, Ning Zhang, Dimitris N. Metaxas, Licheng Yu

For Scenario II, based on the reduction strategy from G-Search, we design a parametric sigmoid function (P-Sigmoid) to guide the reduction at each layer of the MLLM, whose parameters are optimized by Bayesian Optimization.

Bayesian Optimization Token Reduction

MBA-SLAM: Motion Blur Aware Dense Visual SLAM with Radiance Fields Representation

1 code implementation 13 Nov 2024 Peng Wang, Lingzhe Zhao, Yin Zhang, Shiyu Zhao, Peidong Liu

In our experiments, we demonstrate that MBA-SLAM surpasses previous state-of-the-art methods in both camera localization and map reconstruction, showcasing superior performance across a range of datasets, including synthetic and real datasets featuring sharp images as well as those affected by motion blur, highlighting the versatility and robustness of our approach.

3DGS Camera Localization +2

Motion-guided small MAV detection in complex and non-planar scenes

no code implementations 14 Oct 2024 Hanqing Guo, Canlun Zheng, Shiyu Zhao

Our proposed method can effectively and efficiently detect extremely small MAVs from dynamic and complex backgrounds because it aggregates pixel-level motion features and eliminates false positives based on the motion and appearance features of MAVs.

Multi-Object Tracking

APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking

no code implementations 20 Jun 2024 Can Jin, Hongwu Peng, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, Dimitris N. Metaxas

Existing automatic prompt engineering algorithms primarily focus on language modeling and classification tasks, leaving the domain of IR, particularly reranking, underexplored.

Information Retrieval Language Modeling +4

AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning

1 code implementation 17 Jun 2024 Shirley Wu, Shiyu Zhao, Qian Huang, Kexin Huang, Michihiro Yasunaga, Kaidi Cao, Vassilis N. Ioannidis, Karthik Subbian, Jure Leskovec, James Zou

Large language model (LLM) agents have demonstrated impressive capabilities in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.

Language Modeling Language Modelling +3

Domain Adaptive Detection of MAVs: A Benchmark and Noise Suppression Network

1 code implementation 25 Mar 2024 Yin Zhang, Jinhong Deng, Peidong Liu, Wen Li, Shiyu Zhao

A new benchmark for cross-domain MAV detection is proposed based on the proposed dataset.

Pseudo Label

Generating Enhanced Negatives for Training Language-Based Object Detectors

1 code implementation CVPR 2024 Shiyu Zhao, Long Zhao, Vijay Kumar B. G, Yumin Suh, Dimitris N. Metaxas, Manmohan Chandraker, Samuel Schulter

The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations.

Object object-detection +2

Global-Local MAV Detection under Challenging Conditions based on Appearance and Motion

2 code implementations 18 Dec 2023 Hanqing Guo, Ye Zheng, Yin Zhang, Zhi Gao, Shiyu Zhao

In this paper, we propose a global-local MAV detector that can fuse both motion and appearance features for MAV detection under challenging conditions.

Computational Efficiency

Multi-Agent Consensus Seeking via Large Language Models

1 code implementation 31 Oct 2023 Huaben Chen, Wenkang Ji, Lufeng Xu, Shiyu Zhao

To that end, this work studies a consensus-seeking task where the state of each agent is a numerical value and they negotiate with each other to reach a consensus value.
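As a rough illustration of the consensus-seeking task described above, the sketch below uses a classical averaging update, not the LLM-based negotiation the paper studies. The function names, the `gain` parameter, and the stopping tolerance are all assumptions made for this example.

```python
def consensus_step(states, gain=0.5):
    """Move every agent's numerical state part of the way toward the group mean.

    Hypothetical update rule for illustration only (classical averaging).
    """
    mean = sum(states) / len(states)
    return [x + gain * (mean - x) for x in states]


def run_consensus(states, gain=0.5, tol=1e-6, max_rounds=1000):
    """Repeat the update until all states lie within `tol` of each other."""
    states = list(states)
    for _ in range(max_rounds):
        if max(states) - min(states) <= tol:
            break
        states = consensus_step(states, gain)
    return states


if __name__ == "__main__":
    # Three agents with different initial values agree on their average.
    print(run_consensus([10.0, 20.0, 60.0]))
```

Because each update preserves the sum of the states, the agents in this sketch converge to the average of their initial values.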

Predator-prey survival pressure is sufficient to evolve swarming behaviors

no code implementations 24 Aug 2023 Jianan Li, Liang Li, Shiyu Zhao

Understanding how local interactions give rise to global collective behavior is of utmost importance in both biological and physical research.

Diversity reinforcement-learning +1

OmniLabel: A Challenging Benchmark for Language-Based Object Detection

no code implementations ICCV 2023 Samuel Schulter, Vijay Kumar B G, Yumin Suh, Konstantinos M. Dafnis, Zhixing Zhang, Shiyu Zhao, Dimitris Metaxas

With more than 28K unique object descriptions on over 25K images, OmniLabel provides a challenging benchmark with diverse and complex object descriptions in a naturally open-vocabulary setting.

Object object-detection +1

Characterizing bearing equivalence in directed graphs

no code implementations 9 Mar 2023 Zhiyong Sun, Shiyu Zhao, Daniel Zelazo

These conditions involve the spectrum and null space of the associated bearing Laplacian matrix for a directed bearing formation.

Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries

1 code implementation 16 Aug 2022 Xiao Liu, Shiyu Zhao, Kai Su, Yukuo Cen, Jiezhong Qiu, Mengdi Zhang, Wei Wu, Yuxiao Dong, Jie Tang

In this work, we present the Knowledge Graph Transformer (kgTransformer) with masked pre-training and fine-tuning strategies.

Mixture-of-Experts

Exploiting Unlabeled Data with Vision and Language Models for Object Detection

1 code implementation 18 Jul 2022 Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B. G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas

We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images, effectively generating pseudo labels for object detection.

Object object-detection +4

Global Matching with Overlapping Attention for Optical Flow Estimation

1 code implementation CVPR 2022 Shiyu Zhao, Long Zhao, Zhixing Zhang, Enyu Zhou, Dimitris Metaxas

In this paper, inspired by the traditional matching-optimization methods where matching is introduced to handle large displacements before energy-based optimizations, we introduce a simple but effective global matching step before the direct regression and develop a learning-based matching-optimization framework, namely GMFlowNet.

Optical Flow Estimation regression

STNN-DDI: A Substructure-aware Tensor Neural Network to Predict Drug-Drug Interactions

1 code implementation 10 Nov 2021 Hui Yu, Shiyu Zhao, JianYu Shi

Results: In this paper, by supposing that the interactions between two given drugs are caused by their local chemical structures (substructures) and their DDI types are determined by the linkages between different substructure sets, we design a novel Substructure-aware Tensor Neural Network model for DDI prediction (STNN-DDI).

Vocal Bursts Type Prediction

Deep Animation Video Interpolation in the Wild

1 code implementation CVPR 2021 Li SiYao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu

In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.

Optical Flow Estimation Triplet +1

Time Derivative of Rotation Matrices: A Tutorial

1 code implementation 20 Sep 2016 Shiyu Zhao

The time derivative of a rotation matrix equals the product of a skew-symmetric matrix and the rotation matrix itself.

Robotics
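
The statement in the abstract above can be recovered in a few lines from the orthogonality of a rotation matrix. The derivation below is a standard sketch rather than a quotation from the tutorial; the symbols S and ω, and the reading of ω as the angular velocity expressed in the fixed frame, are conventions assumed here.

```latex
\begin{aligned}
R(t)\,R(t)^{\top} &= I \\
\dot{R}\,R^{\top} + R\,\dot{R}^{\top} &= 0 \\
S := \dot{R}\,R^{\top} &= -S^{\top} \\
\dot{R} &= S\,R, \qquad
S = [\omega]_{\times} =
\begin{bmatrix}
0 & -\omega_3 & \omega_2 \\
\omega_3 & 0 & -\omega_1 \\
-\omega_2 & \omega_1 & 0
\end{bmatrix}
\end{aligned}
```

Differentiating the orthogonality constraint shows that S is skew-symmetric, and right-multiplying S by R gives the product form stated in the abstract.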
