Position-Aware Contrastive Alignment for Referring Image Segmentation

no code implementations • 27 Dec 2022 • Bo Chen, Zhiwei Hu, Zhilong Ji, Jinfeng Bai, WangMeng Zuo

The main challenge of this task is to understand the visual and linguistic content simultaneously and to find the referred object accurately among all instances in the image.

Image Segmentation Semantic Segmentation

1st Place Solution for YouTubeVOS Challenge 2022: Referring Video Object Segmentation

1 code implementation • 27 Dec 2022 • Zhiwei Hu, Bo Chen, Yuan Gao, Zhilong Ji, Jinfeng Bai

The task of referring video object segmentation aims to segment the object in the frames of a given video to which the referring expressions refer.

Referring Video Object Segmentation Semantic Segmentation +1

Transformer-based Entity Typing in Knowledge Graphs

1 code implementation • 20 Oct 2022 • Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Z. Pan

We investigate the knowledge graph entity typing task, which aims to infer plausible entity types.

Entity Typing Knowledge Graphs

Visual Subtitle Feature Enhanced Video Outline Generation

no code implementations • 24 Aug 2022 • Qi Lv, Ziqiang Cao, Wenrui Xie, Derui Wang, Jingwen Wang, Zhiwei Hu, Tangkun Zhang, Ba Yuan, Yuanhang Li, Min Cao, Wenjie Li, Sujian Li, Guohong Fu

Furthermore, based on the similarity between video outlines and textual outlines, we use a large number of articles with chapter headings to pretrain our model.

Headline generation Navigate +4

Type-aware Embeddings for Multi-Hop Reasoning over Knowledge Graphs

1 code implementation • 2 May 2022 • Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, XiaoLi Li, Ru Li, Jeff Z. Pan

Multi-hop reasoning over real-life knowledge graphs (KGs) is a highly challenging problem, as traditional subgraph matching methods cannot deal with noise and missing information.

Knowledge Graphs

Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation

no code implementations • 30 Mar 2022 • Guang Feng, Lihe Zhang, Zhiwei Hu, Huchuan Lu

To address this task, we first design a two-stream encoder to extract CNN-based visual features and transformer-based linguistic features hierarchically, and a vision-language mutual guidance (VLMG) module is inserted into the encoder multiple times to promote the hierarchical and progressive fusion of multi-modal features.

Referring Expression Segmentation Video Segmentation +1

Youling: an AI-Assisted Lyrics Creation System

no code implementations • EMNLP 2020 • Rongsheng Zhang, Xiaoxi Mao, Le Li, Lin Jiang, Lin Chen, Zhiwei Hu, Yadong Xi, Changjie Fan, Minlie Huang

In the lyrics generation process, Youling supports a traditional one-pass full-text generation mode as well as an interactive generation mode, which allows users to select satisfactory sentences from generated candidates conditioned on the preceding context.

Text Generation

Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation

no code implementations • CVPR 2021 • Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu

In this work, we propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network, and uses language to refine the multi-modal features progressively.

Image Segmentation Semantic Segmentation

Possible multi-orbital ground state in CeCu$_2$Si$_2$

no code implementations • 5 Oct 2020 • Andrea Amorese, Andrea Marino, Martin Sundermann, Kai Chen, Zhiwei Hu, Thomas Willers, Fadi Choukani, Philippe Ohresser, Javier Herrero-Martin, Stefano Agrestini, Chien-Te Chen, Hong-Ji Lin, Maurits W. Haverkort, Silvia Seiro, Christoph Geibel, Frank Steglich, Liu Hao Tjeng, Gertrud Zwicknagl, Andrea Severing

The crystal-field ground state wave function of CeCu$_2$Si$_2$ has been investigated with linearly polarized $M$-edge x-ray absorption spectroscopy from 250 mK to 250 K, thus covering the superconducting ($T_{\text{c}} = 0.6$ K), the Kondo ($T_{\text{K}} \approx 20$ K) as well as the Curie-Weiss regime.

Strongly Correlated Electrons

Bi-Directional Relationship Inferring Network for Referring Image Segmentation

no code implementations • CVPR 2020 • Zhiwei Hu, Guang Feng, Jiayu Sun, Lihe Zhang, Huchuan Lu

Combined with language-guided visual attention, a bi-directional cross-modal attention module (BCAM) is built to learn the relationship between multi-modal features.

Image Segmentation Referring Expression +2

