Search Results for author: Hongjie Zhang

Found 12 papers, 4 papers with code

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

1 code implementation • 24 Mar 2024 • Yifei HUANG, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, LiMin Wang, Yu Qiao

Along with the videos we record high-quality gaze data and provide detailed multimodal annotations, formulating a playground for modeling the human ability to bridge asynchronous procedural actions from different viewpoints.

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

2 code implementations • 22 Mar 2024 • Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei HUANG, Yu Qiao, Yali Wang, LiMin Wang

We introduce InternVideo2, a new video foundation model (ViFM) that achieves state-of-the-art performance in action recognition, video-text tasks, and video-centric dialogue.

 Ranked #1 on Audio Classification on ESC-50 (using extra training data)

Action Classification, Action Recognition, +12

MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding

no code implementations • 8 Dec 2023 • Hongjie Zhang, Yi Liu, Lu Dong, Yifei HUANG, Zhen-Hua Ling, Yali Wang, LiMin Wang, Yu Qiao

While several long-form VideoQA datasets have been introduced, the lengths of both the videos used to curate questions and the sub-clips of clues leveraged to answer those questions have not yet reached the criteria for genuine long-form video understanding.

Question Answering, Video Question Answering, +1

Multi-view Feature Extraction based on Triple Contrastive Heads

no code implementations • 22 Mar 2023 • Hongjie Zhang

Multi-view feature extraction is an efficient approach for alleviating the issue of dimensionality in high-dimensional multi-view data.

Contrastive Learning, Self-Supervised Learning

Multi-view Feature Extraction based on Dual Contrastive Head

no code implementations • 8 Feb 2023 • Hongjie Zhang

Multi-view feature extraction is an efficient approach for alleviating the issue of dimensionality in high-dimensional multi-view data.

Contrastive Learning, Self-Supervised Learning

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

1 code implementation • 6 Dec 2022 • Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, LiMin Wang, Yu Qiao

Specifically, InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives, and selectively coordinates video representations of these two complementary frameworks in a learnable manner to boost various video applications.

 Ranked #1 on Action Recognition on Something-Something V1 (using extra training data)

Action Classification, Contrastive Learning, +8

AcceRL: Policy Acceleration Framework for Deep Reinforcement Learning

no code implementations • 28 Nov 2022 • Hongjie Zhang

Inspired by the redundancy of neural networks, we propose a lightweight parallel training framework based on neural network compression, AcceRL, to accelerate the policy learning while ensuring policy quality.

Decision Making, General Reinforcement Learning, +3

Feature Extraction Framework based on Contrastive Learning with Adaptive Positive and Negative Samples

no code implementations • 11 Jan 2022 • Hongjie Zhang

In this study, we propose a feature extraction framework based on contrastive learning with adaptive positive and negative samples (CL-FEFA) that is suitable for unsupervised, supervised, and semi-supervised single-view feature extraction.

Contrastive Learning

Unified Framework for Feature Extraction based on Contrastive Learning

no code implementations • 25 Jan 2021 • Hongjie Zhang

In this study, we proposed a unified framework based on a new perspective of contrastive learning (CL) that is suitable for both unsupervised and supervised feature extraction.

Contrastive Learning, Graph Embedding, +1

Hybrid Models for Open Set Recognition

no code implementations • ECCV 2020 • Hongjie Zhang, Ang Li, Jie Guo, Yanwen Guo

We propose the OpenHybrid framework, which is composed of an encoder to encode the input data into a joint embedding space, a classifier to classify samples to inlier classes, and a flow-based density estimator to detect whether a sample belongs to the unknown category.

Open Set Learning, Out-of-Distribution Detection

Viewpoint Selection for Photographing Architectures

no code implementations • 6 Mar 2017 • Jingwu He, Linbo Wang, Wenzhe Zhou, Hongjie Zhang, Xiufen Cui, Yanwen Guo

Unlike previous efforts devoted to photo quality assessment, which mainly rely on 2D image features, we show in this paper that combining 2D image features extracted from images with 3D geometric features computed on the 3D models can result in a more reliable evaluation of viewpoint quality.

2k, Clustering
