Search Results for author: Weixin Luo

Found 24 papers, 19 papers with code

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

2 code implementations 19 Jan 2024 Chenyu Wang, Weixin Luo, Qianyu Chen, Haonan Mai, Jindi Guo, Sixun Dong, Xiaohua Xuan, Zhengxin Li, Lin Ma, Shenghua Gao

Recently, the astonishing performance of large language models (LLMs) in natural language comprehension and generation tasks has triggered extensive exploration of their use as central controllers for building agent systems.

Language Modelling Large Language Model

SoccerNet 2023 Challenges Results

2 code implementations 12 Sep 2023 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei Zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting Camera Calibration +3

E2E-LOAD: End-to-End Long-form Online Action Detection

1 code implementation ICCV 2023 Shuqiang Cao, Weixin Luo, Bairui Wang, Wei Zhang, Lin Ma

Furthermore, we propose a novel and efficient inference mechanism that accelerates heavy spatial-temporal exploration.

Online Action Detection

Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking

2 code implementations 22 May 2023 Feng Yan, Weixin Luo, Yujie Zhong, Yiyang Gan, Lin Ma

Existing end-to-end Multi-Object Tracking (e2e-MOT) methods have not surpassed non-end-to-end tracking-by-detection methods.

Multi-Object Tracking Video Object Tracking

Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos

1 code implementation CVPR 2023 Sixun Dong, Huazhang Hu, Dongze Lian, Weixin Luo, Yicheng Qian, Shenghua Gao

Sequential video understanding, as an emerging video understanding task, has attracted considerable attention from researchers because of its goal-oriented nature.

Representation Learning Sentence +1

Multiple Object Tracking Challenge Technical Report for Team MT_IoT

1 code implementation 7 Dec 2022 Feng Yan, Zhiheng Li, Weixin Luo, Zequn Jie, Fan Liang, Xiaolin Wei, Lin Ma

This is a brief technical report of our proposed method for Multiple-Object Tracking (MOT) Challenge in Complex Environments.

Ranked #8 on Multi-Object Tracking on DanceTrack (using extra training data)

Human Detection Multi-Object Tracking +2

Learning Point-Language Hierarchical Alignment for 3D Visual Grounding

1 code implementation 22 Oct 2022 Jiaming Chen, Weixin Luo, Ran Song, Xiaolin Wei, Lin Ma, Wei Zhang

This paper presents a novel hierarchical alignment model (HAM) that learns multi-granularity visual and linguistic representations in an end-to-end manner.

Sentence Visual Grounding +1

SVIP: Sequence VerIfication for Procedures in Videos

1 code implementation CVPR 2022 Yicheng Qian, Weixin Luo, Dongze Lian, Xu Tang, Peilin Zhao, Shenghua Gao

In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations but still conducting the same task.

Action Detection Action Recognition

Two-stage Visual Cues Enhancement Network for Referring Image Segmentation

1 code implementation 9 Oct 2021 Yang Jiao, Zequn Jie, Weixin Luo, Jingjing Chen, Yu-Gang Jiang, Xiaolin Wei, Lin Ma

Referring Image Segmentation (RIS) aims at segmenting the target object from an image referred by one given natural language expression.

Image Segmentation Retrieval +2

Proxy-bridged Image Reconstruction Network for Anomaly Detection in Medical Images

no code implementations 5 Oct 2021 Kang Zhou, Jing Li, Weixin Luo, Zhengxin Li, Jianlong Yang, Huazhu Fu, Jun Cheng, Jiang Liu, Shenghua Gao

To mitigate this problem, in this paper, we propose a novel Proxy-bridged Image Reconstruction Network (ProxyAno) for anomaly detection in medical images.

Anomaly Detection Image Reconstruction

Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding

1 code implementation CVPR 2021 Binbin Huang, Dongze Lian, Weixin Luo, Shenghua Gao

Then we combine the contextual information from the landmark feature convolution module with the target's visual features for grounding.

Descriptive Object +1

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

1 code implementation CVPR 2021 Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao

This paper proposes a framework for the interactive video object segmentation (VOS) in the wild where users can choose some frames for annotations iteratively.

Interactive Video Object Segmentation

Amodal Segmentation Based on Visible Region Segmentation and Shape Prior

1 code implementation 10 Dec 2020 Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao

In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation.

Segmentation

SIRI: Spatial Relation Induced Network For Spatial Description Resolution

no code implementations NeurIPS 2020 Peiyao Wang, Weixin Luo, Yanyu Xu, Haojie Li, Shugong Xu, Jianyu Yang, Shenghua Gao

Spatial Description Resolution, as a language-guided localization task, is proposed for target location in a panoramic street view, given corresponding language descriptions.

Relation

Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images

1 code implementation ECCV 2020 Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao

In the end, we further utilize the reconstructed image to extract the structure and measure the difference between the structures extracted from the original and the reconstructed images.

Anatomy Anomaly Detection +2

Password-conditioned Anonymization and Deanonymization with Face Identity Transformers

1 code implementation 26 Nov 2019 Xiuye Gu, Weixin Luo, Michael S. Ryoo, Yong Jae Lee

Cameras are prevalent in our daily lives, and enable many useful systems built upon computer vision technologies such as smart cameras and home robots for service applications.

Multi-Cell Multi-Task Convolutional Neural Networks for Diabetic Retinopathy Grading

no code implementations 31 Aug 2018 Kang Zhou, Zaiwang Gu, Wen Liu, Weixin Luo, Jun Cheng, Shenghua Gao, Jiang Liu

To consider the relationships of images with different stages, we propose a Multi-Task learning strategy which predicts the label with both classification and regression.

Diabetic Retinopathy Grading General Classification +1

Face Aging With Identity-Preserved Conditional Generative Adversarial Networks

2 code implementations CVPR 2018 Zongwei Wang, Xu Tang, Weixin Luo, Shenghua Gao

By grouping faces with target age together, the objective of face aging is equivalent to transferring aging patterns of faces within the target age group to the face whose aged face is to be synthesized.

Future Frame Prediction for Anomaly Detection – A New Baseline

1 code implementation CVPR 2018 Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao

To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.

Anomaly Detection Optical Flow Estimation +1

Future Frame Prediction for Anomaly Detection – A New Baseline

1 code implementation 28 Dec 2017 Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao

To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.

Anomaly Detection Optical Flow Estimation +2

A Revisit of Sparse Coding Based Anomaly Detection in Stacked RNN Framework

1 code implementation ICCV 2017 Weixin Luo, Wen Liu, Shenghua Gao

Motivated by the capability of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC) where we enforce that similar neighbouring frames be encoded with similar reconstruction coefficients.

Anomaly Detection
