Search Results for author: Jiebo Luo

Found 252 papers, 90 papers with code

Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration

no code implementations 15 Apr 2024 Chenwei Lin, Hanjia Lyu, Jiebo Luo, Xian Xu

The emergence of Large Multimodal Models (LMMs) marks a significant milestone in the development of artificial intelligence.

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

2 code implementations 7 Apr 2024 Shenghai Yuan, Jinfa Huang, Yujun Shi, Yongqi Xu, Ruijie Zhu, Bin Lin, Xinhua Cheng, Li Yuan, Jiebo Luo

Recent advances in Text-to-Video generation (T2V) have achieved remarkable success in synthesizing high-quality general videos from textual descriptions.

Text-to-Video Generation Video Generation

Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution

no code implementations 25 Mar 2024 Zhikai Chen, Fuchen Long, Zhaofan Qiu, Ting Yao, Wengang Zhou, Jiebo Luo, Tao Mei

Technically, SATeCo freezes all the parameters of the pre-trained UNet and VAE, and only optimizes two deliberately-designed spatial feature adaptation (SFA) and temporal feature alignment (TFA) modules, in the decoder of UNet and VAE.

Denoising Image Super-Resolution +3

DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance

1 code implementation 20 Mar 2024 Zixuan Wang, Jia Jia, Shikun Sun, Haozhe Wu, Rong Han, Zhenyu Li, Di Tang, Jiaqing Zhou, Jiebo Luo

However, camera movement synthesis with music and dance remains an unsolved challenging problem due to the scarcity of paired data.

SoMeLVLM: A Large Vision Language Model for Social Media Processing

no code implementations 20 Feb 2024 Xinnong Zhang, Haoyu Kuang, Xinyi Mou, Hanjia Lyu, Kun Wu, Siming Chen, Jiebo Luo, Xuanjing Huang, Zhongyu Wei

The powerful Large Vision Language Models make it possible to handle a variety of tasks simultaneously, but even with carefully designed prompting methods, the general domain models often fall short in aligning with the unique speaking style and context of social media tasks.

Language Modelling

Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering

no code implementations 1 Feb 2024 Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu

To address the above problems, we propose the Efficient Monotonic Video Style Avatar (Emo-Avatar) through deferred neural rendering that enhances StyleGAN's capacity for producing dynamic, drivable portrait videos.

Contrastive Learning Neural Rendering

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

1 code implementation 28 Jan 2024 Shaofeng Zhang, Jinfa Huang, Qiang Zhou, Zhibin Wang, Fan Wang, Jiebo Luo, Junchi Yan

At inference, we generate images with arbitrary expansion multiples by inputting an anchor image and its corresponding positional embeddings.

Image Outpainting

Human vs. LMMs: Exploring the Discrepancy in Emoji Interpretation and Usage in Digital Communication

1 code implementation 16 Jan 2024 Hanjia Lyu, Weihong Qi, Zhongyu Wei, Jiebo Luo

Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications.

CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs

no code implementations 5 Jan 2024 Daoan Zhang, Junming Yang, Hanjia Lyu, Zijian Jin, Yuan Yao, Mingkai Chen, Jiebo Luo

When exploring the development of Artificial General Intelligence (AGI), a critical task for these models involves interpreting and processing information from multiple image inputs.

Image Comprehension Text Matching +1

Bring Metric Functions into Diffusion Models

no code implementations 4 Jan 2024 Jie An, Zhengyuan Yang, JianFeng Wang, Linjie Li, Zicheng Liu, Lijuan Wang, Jiebo Luo

The first module, similar to a standard DDPM, learns to predict the added noise and is unaffected by the metric function.

Denoising

Video Understanding with Large Language Models: A Survey

1 code implementation 29 Dec 2023 Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.

Video Understanding

SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation

2 code implementations 22 Dec 2023 Wenxi Yue, Jing Zhang, Kun Hu, Qiuxia Wu, ZongYuan Ge, Yong Xia, Jiebo Luo, Zhiyong Wang

Specifically, we achieve this by proposing (1) Collaborative Prompts that describe instrument structures via collaborating category-level and part-level texts; (2) Cross-Modal Prompt Encoder that encodes text prompts jointly with visual embeddings into discriminative part-level representations; and (3) Part-to-Whole Adaptive Fusion and Hierarchical Decoding that adaptively fuse the part-level representations into a whole for accurate instrument segmentation in surgical scenarios.

Segmentation Semantic Segmentation

GPT-4V(ision) as A Social Media Analysis Engine

1 code implementation 13 Nov 2023 Hanjia Lyu, Jinfa Huang, Daoan Zhang, Yongsheng Yu, Xinyi Mou, Jinsheng Pan, Zhengyuan Yang, Zhongyu Wei, Jiebo Luo

Our investigation begins with a preliminary quantitative analysis for each task using existing benchmark datasets, followed by a careful review of the results and a selection of qualitative samples that illustrate GPT-4V's potential in understanding multimodal social media content.

Hallucination Hate Speech Detection +1

A Survey of Large Language Models in Medicine: Progress, Application, and Challenge

1 code implementation 9 Nov 2023 Hongjian Zhou, Fenglin Liu, Boyang Gu, Xinyu Zou, Jinfa Huang, Jinge Wu, Yiru Li, Sam S. Chen, Peilin Zhou, Junling Liu, Yining Hua, Chengfeng Mao, Chenyu You, Xian Wu, Yefeng Zheng, Lei Clifton, Zheng Li, Jiebo Luo, David A. Clifton

Therefore, this review aims to provide a detailed overview of the development and deployment of LLMs in medicine, including the challenges and opportunities they face.

Mixture of Weak & Strong Experts on Graphs

no code implementations 9 Nov 2023 Hanqing Zeng, Hanjia Lyu, Diyi Hu, Yinglong Xia, Jiebo Luo

We propose to decouple the two modalities by mixture of weak and strong experts (Mowst), where the weak expert is a light-weight Multi-layer Perceptron (MLP), and the strong expert is an off-the-shelf Graph Neural Network (GNN).

Node Classification

Deceptive Fairness Attacks on Graphs via Meta Learning

1 code implementation 24 Oct 2023 Jian Kang, Yinglong Xia, Ross Maciejewski, Jiebo Luo, Hanghang Tong

We study deceptive fairness attacks on graphs to answer the following question: How can we achieve poisoning attacks on a graph learning model to exacerbate the bias deceptively?

Adversarial Robustness Fairness +3

OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation

no code implementations 11 Oct 2023 Jie An, Zhengyuan Yang, Linjie Li, JianFeng Wang, Kevin Lin, Zicheng Liu, Lijuan Wang, Jiebo Luo

We hope our proposed framework, benchmark, and LMM evaluation could help establish the intriguing interleaved image-text generation task.

Question Answering Text Generation

Understanding Divergent Framing of the Supreme Court Controversies: Social Media vs. News Outlets

no code implementations 18 Sep 2023 Jinsheng Pan, Zichen Wang, Weihong Qi, Hanjia Lyu, Jiebo Luo

Understanding the framing of political issues is of paramount importance as it significantly shapes how individuals perceive, interpret, and engage with these matters.

Decision Making

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

1 code implementation 17 Aug 2023 Wenxi Yue, Jing Zhang, Kun Hu, Yong Xia, Jiebo Luo, Zhiyong Wang

However, we observe two problems with this naive pipeline: (1) the domain gap between natural objects and surgical instruments leads to inferior generalisation of SAM; and (2) SAM relies on precise point or box locations for accurate segmentation, requiring either extensive manual guidance or a well-performing specialist detector for prompt preparation, which leads to a complex multi-stage pipeline.

Image Segmentation Segmentation +1

Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

1 code implementation 14 Aug 2023 Alexander Martin, Haitian Zheng, Jie An, Jiebo Luo

In this work, we use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps (longI2I), where large amounts of new visual features and new geometry need to be generated to enter the target domain.

Image-to-Image Translation

User-Controllable Recommendation via Counterfactual Retrospective and Prospective Explanations

1 code implementation 2 Aug 2023 Juntao Tan, Yingqiang Ge, Yan Zhu, Yinglong Xia, Jiebo Luo, Jianchao Ji, Yongfeng Zhang

Acknowledging the recent advancements in explainable recommender systems that enhance users' understanding of recommendation mechanisms, we propose leveraging these advancements to improve user controllability.

counterfactual Counterfactual Reasoning +1

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text

no code implementations 31 Jul 2023 Junchen Zhu, Huan Yang, Wenjing Wang, Huiguo He, Zixi Tuo, Yongsheng Yu, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu, Jiebo Luo

In the basic generation, we take advantage of the pretrained image diffusion model, and adapt it to a high-quality open-domain vertical video generator for mobile devices.

Video Generation

LLM-Rec: Personalized Recommendation via Prompting Large Language Models

no code implementations 24 Jul 2023 Hanjia Lyu, Song Jiang, Hanqing Zeng, Yinglong Xia, Qifan Wang, Si Zhang, Ren Chen, Christopher Leung, Jiajie Tang, Jiebo Luo

Notably, the success of LLM-Rec lies in its prompting strategies, which effectively tap into the language model's comprehension of both general and specific item characteristics.

Domain-Scalable Unpaired Image Translation via Latent Space Anchoring

1 code implementation 26 Jun 2023 Siyu Huang, Jie An, Donglai Wei, Zudi Lin, Jiebo Luo, Hanspeter Pfister

However, given a UNIT model trained on certain domains, it is difficult for current methods to incorporate new domains because they often need to train the full model on both existing and new domains.

Image-to-Image Translation Translation

Improving Video Colorization by Test-Time Tuning

1 code implementation 25 Jun 2023 Yaping Zhao, Haitian Zheng, Jiebo Luo, Edmund Y. Lam

With the advancements in deep learning, video colorization by propagating color information from a colorized reference frame to a monochrome video sequence has been well explored.

Colorization

Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA

no code implementations 31 May 2023 Ali Vosoughi, Shijian Deng, Songyang Zhang, Yapeng Tian, Chenliang Xu, Jiebo Luo

In this paper, we first model a confounding effect that causes language and vision bias simultaneously, then propose a counterfactual inference to remove the influence of this effect.

counterfactual Counterfactual Inference +2

Learning to Evaluate the Artness of AI-generated Images

no code implementations 8 May 2023 Junyu Chen, Jie An, Hanjia Lyu, Jiebo Luo

Assessing the artness of AI-generated images continues to be a challenge within the realm of image generation.

Image Generation

Meta-causal Learning for Single Domain Generalization

no code implementations CVPR 2023 Jin Chen, Zhi Gao, Xinxiao Wu, Jiebo Luo

Under this paradigm, we propose a meta-causal learning method to learn meta-knowledge, that is, how to infer the causes of domain shift between the auxiliary and source domains during training.

counterfactual Counterfactual Inference +2

Predicting Adverse Neonatal Outcomes for Preterm Neonates with Multi-Task Learning

no code implementations 28 Mar 2023 Jingyang Lin, Junyu Chen, Hanjia Lyu, Igor Khodak, Divya Chhabra, Colby L Day Richardson, Irina Prelipcean, Andrew M Dylag, Jiebo Luo

In this work, we first analyze the correlations between three adverse neonatal outcomes and then formulate the diagnosis of multiple neonatal outcomes as a multi-task learning (MTL) problem.

Feature Importance Multi-Task Learning

Bias or Diversity? Unraveling Fine-Grained Thematic Discrepancy in U.S. News Headlines

no code implementations 28 Mar 2023 Jinsheng Pan, Weihong Qi, Zichen Wang, Hanjia Lyu, Jiebo Luo

There is a broad consensus that news media outlets incorporate ideological biases in their news articles.

Grounding 3D Object Affordance from 2D Interactions in Images

1 code implementation ICCV 2023 Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Comprehensive experiments on PIAD demonstrate the reliability of the proposed task and the superiority of our method.

Object

Spatial-Aware Token for Weakly Supervised Object Localization

1 code implementation ICCV 2023 Pingyu Wu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Specifically, a spatial token is first introduced in the input space to aggregate representations for the localization task.

Object Weakly-Supervised Object Localization

SegPrompt: Using Segmentation Map as a Better Prompt to Finetune Deep Models for Kidney Stone Classification

no code implementations 15 Mar 2023 Wei Zhu, Runtao Zhou, Yao Yuan, Campbell Timothy, Rajat Jain, Jiebo Luo

However, the shortage of annotated training data poses a severe problem in improving the performance and generalization ability of the trained model.

Classification Segmentation

Adaptive Siamese Tracking with a Compact Latent Network

no code implementations 2 Feb 2023 Xingping Dong, Jianbing Shen, Fatih Porikli, Jiebo Luo, Ling Shao

From this view, we perform an in-depth analysis of them through visual simulations and real tracking examples, and find that the failure cases in some challenging situations can be attributed to missing decisive samples in offline training.

Computational Assessment of Hyperpartisanship in News Titles

no code implementations 16 Jan 2023 Hanjia Lyu, Jinsheng Pan, Zichen Wang, Jiebo Luo

Through an analysis of the topic distribution, we find that societal issues gradually receive more attention from all media groups.

Active Learning Language Modelling

PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3

no code implementations ICCV 2023 Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo

PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).

Image Captioning Question Answering +3

QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity

1 code implementation CVPR 2023 Siyu Huang, Jie An, Donglai Wei, Jiebo Luo, Hanspeter Pfister

The mechanism of existing style transfer algorithms is by minimizing a hybrid loss function to push the generated image toward high similarities in both content and style.

Quantization Style Transfer +1

A Unified Framework for Contrastive Learning from a Perspective of Affinity Matrix

no code implementations 26 Nov 2022 Wenbin Li, Meihao Kong, Xuesong Yang, Lei Wang, Jing Huo, Yang Gao, Jiebo Luo

In this study, we present a new unified contrastive learning representation framework (named UniCLR) suitable for all the above four kinds of methods from a novel perspective of basic affinity matrix.

Contrastive Learning Representation Learning

Improving Visual-textual Sentiment Analysis by Fusing Expert Features

no code implementations 23 Nov 2022 Junyu Chen, Jie An, Hanjia Lyu, Jiebo Luo

Visual-textual sentiment analysis aims to predict sentiment with the input of a pair of image and text.

Sentiment Analysis

Stare at What You See: Masked Image Modeling without Reconstruction

no code implementations CVPR 2023 Hongwei Xue, Peng Gao, Hongyang Li, Yu Qiao, Hao Sun, Houqiang Li, Jiebo Luo

However, unlike the low-level features such as pixel values, we argue the features extracted by powerful teacher models already encode rich semantic correlation across regions in an intact image. This raises one question: is reconstruction necessary in Masked Image Modeling (MIM) with a teacher model?

PromptCap: Prompt-Guided Task-Aware Image Captioning

1 code implementation 15 Nov 2022 Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A Smith, Jiebo Luo

PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).

Image Captioning Language Modelling +5

FeDXL: Provable Federated Learning for Deep X-Risk Optimization

1 code implementation 26 Oct 2022 Zhishuai Guo, Rong Jin, Jiebo Luo, Tianbao Yang

To this end, we propose an active-passive decomposition framework that decouples the gradient's components with two types, namely active parts and passive parts, where the active parts depend on local data that are computed with the local model and the passive parts depend on other machines that are communicated/computed based on historical models and samples.

Federated Learning

Learning a Grammar Inducer from Massive Uncurated Instructional Videos

1 code implementation 22 Oct 2022 Songyang Zhang, Linfeng Song, Lifeng Jin, Haitao Mi, Kun Xu, Dong Yu, Jiebo Luo

While previous work focuses on building systems for inducing grammars on text that is well-aligned with video content, we investigate the scenario in which text and video are only in loose correspondence.

Language Acquisition Video Alignment

TLDW: Extreme Multimodal Summarisation of News Videos

no code implementations 16 Oct 2022 Peggy Tang, Kun Hu, Lei Zhang, Jiebo Luo, Zhiyong Wang

Multimodal summarisation with multimodal output is drawing increasing attention due to the rapid growth of multimedia data.

Sentence

Contextual Modeling for 3D Dense Captioning on Point Clouds

no code implementations 8 Oct 2022 Yufeng Zhong, Long Xu, Jiebo Luo, Lin Ma

With such global and local contextual modeling strategies, our proposed model can effectively characterize the object representations and contextual information and thereby generate comprehensive and detailed descriptions of the located objects.

3D dense captioning Dense Captioning +2

Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications

no code implementations 29 Sep 2022 Junda Wang, Weijian Li, Han Wang, Hanjia Lyu, Caroline Thirukumaran, Addisu Mesfin, Jiebo Luo

Causal inference and model interpretability research are gaining increasing attention, especially in the domains of healthcare and bioinformatics.

Causal Inference

Bi-Calibration Networks for Weakly-Supervised Video Representation Learning

1 code implementation 21 Jun 2022 Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

The video-to-text/video-to-query projections over text prototypes/query vocabulary then start the text-to-query or query-to-text calibration to estimate the amendment to query or text.

Representation Learning

Stand-Alone Inter-Frame Attention in Video Models

1 code implementation CVPR 2022 Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Jiebo Luo, Tao Mei

In this paper, we present a new recipe of inter-frame attention block, namely Stand-alone Inter-Frame Attention (SIFA), that novelly delves into the deformation across frames to estimate local self-attention on each spatial location.

Action Classification Action Recognition +1

Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization

no code implementations 12 Jun 2022 Hang Hua, Xingjian Li, Dejing Dou, Cheng-Zhong Xu, Jiebo Luo

The advent of large-scale pre-trained language models has contributed greatly to the recent progress in natural language processing.

Domain Generalization Language Modelling +3

Automatic Relation-aware Graph Network Proliferation

1 code implementation CVPR 2022 Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang

However, the currently used graph search space overemphasizes learning node features and neglects mining hierarchical relational information.

Graph Classification Graph Learning +5

Localized Adversarial Domain Generalization

1 code implementation CVPR 2022 Wei Zhu, Le Lu, Jing Xiao, Mei Han, Jiebo Luo, Adam P. Harrison

Adversarial domain generalization is a popular approach to DG, but conventional approaches (1) struggle to sufficiently align features so that local neighborhoods are mixed across domains; and (2) can suffer from feature space over collapse which can threaten generalization performance.

Domain Generalization

Deep Federated Anomaly Detection for Multivariate Time Series Data

no code implementations 9 May 2022 Wei Zhu, Dongjin Song, Yuncong Chen, Wei Cheng, Bo Zong, Takehiko Mizoguchi, Cristian Lumezanu, Haifeng Chen, Jiebo Luo

Specifically, we first design an Exemplar-based Deep Neural network (ExDNN) to learn local time series representations based on their compatibility with an exemplar module which consists of hidden parameters learned to capture varieties of normal patterns on each edge device.

Constrained Clustering Federated Learning +3

Explainable Fairness in Recommendation

no code implementations 24 Apr 2022 Yingqiang Ge, Juntao Tan, Yan Zhu, Yinglong Xia, Jiebo Luo, Shuchang Liu, Zuohui Fu, Shijie Geng, Zelong Li, Yongfeng Zhang

In this paper, we study the problem of explainable fairness, which helps to gain insights about why a system is fair or unfair, and guides the design of fair recommender systems with a more informed and unified methodology.

counterfactual Fairness +1

CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training

1 code implementation 22 Mar 2022 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo

We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extract multi-scale feature representations from the input image with holes and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale level.

Image Inpainting

Breast Cancer Induced Bone Osteolysis Prediction Using Temporal Variational Auto-Encoders

no code implementations 20 Mar 2022 Wei Xiong, Neil Yeung, Shubo Wang, Haofu Liao, Liyun Wang, Jiebo Luo

Its ability of predicting the development of bone lesions in cancer-invading bones can assist in assessing the risk of impending fractures and choosing proper treatments in breast cancer bone metastasis.

Computed Tomography (CT)

RawlsGCN: Towards Rawlsian Difference Principle on Graph Convolutional Network

no code implementations 28 Feb 2022 Jian Kang, Yan Zhu, Yinglong Xia, Jiebo Luo, Hanghang Tong

Graph Convolutional Network (GCN) plays pivotal roles in many real-world applications.

Point Cloud Denoising via Momentum Ascent in Gradient Fields

1 code implementation 21 Feb 2022 Yaping Zhao, Haitian Zheng, Zhongrui Wang, Jiebo Luo, Edmund Y. Lam

To achieve point cloud denoising, traditional methods heavily rely on geometric priors, and most learning-based approaches suffer from outliers and loss of details.

Denoising Position

MANet: Improving Video Denoising with a Multi-Alignment Network

1 code implementation 20 Feb 2022 Yaping Zhao, Haitian Zheng, Zhongrui Wang, Jiebo Luo, Edmund Y. Lam

In video denoising, the adjacent frames often provide very useful information, but accurate alignment is needed before such information can be harnessed.

Denoising Video Denoising

Cross-modal Contrastive Distillation for Instructional Activity Anticipation

no code implementations 18 Jan 2022 Zhengyuan Yang, Jingen Liu, Jing Huang, Xiaodong He, Tao Mei, Chenliang Xu, Jiebo Luo

In this study, we aim to predict the plausible future action steps given an observation of the past and study the task of instructional activity anticipation.

Knowledge Distillation

SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Color Editing

no code implementations CVPR 2022 Jing Shi, Ning Xu, Haitian Zheng, Alex Smith, Jiebo Luo, Chenliang Xu

Recently, large pretrained models (e.g., BERT, StyleGAN, CLIP) show great knowledge transfer and generalization capability on various downstream tasks within their domains.

Image-to-Image Translation Retrieval +1

Multi-modal Dependency Tree for Video Captioning

no code implementations NeurIPS 2021 Wentian Zhao, Xinxiao Wu, Jiebo Luo

To this end, we propose a novel video captioning method that generates a sentence by first constructing a multi-modal dependency tree and then traversing the constructed tree, where the syntactic structure and semantic relationship in the sentence are represented by the tree topology.

Caption Generation Dependency Parsing +3

SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

no code implementations 30 Nov 2021 Jing Shi, Ning Xu, Haitian Zheng, Alex Smith, Jiebo Luo, Chenliang Xu

Recently, large pretrained models (e.g., BERT, StyleGAN, CLIP) have shown great knowledge transfer and generalization capability on various downstream tasks within their domains.

Image-to-Image Translation Retrieval +1

Music Sentiment Transfer

1 code implementation 12 Oct 2021 Miles Sigel, Michael Zhou, Jiebo Luo

Results and literature suggest that the task of music sentiment transfer is more difficult than image sentiment transfer because of the temporal characteristics of music and lack of existing datasets.

Style Transfer

Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning

no code implementations ICCV 2021 Jing Bi, Jiebo Luo, Chenliang Xu

In this work, we leverage instructional videos to study humans' decision-making processes, focusing on learning a model to plan goal-directed actions in real-life videos.

Action Recognition Bayesian Inference +1

CoSeg: Cognitively Inspired Unsupervised Generic Event Segmentation

1 code implementation 30 Sep 2021 Xiao Wang, Jingen Liu, Tao Mei, Jiebo Luo

Unlike the mainstream clustering-based methods, our framework exploits a transformer-based feature reconstruction scheme to detect event boundaries by reconstruction errors.

Boundary Detection Event Segmentation +1

Learning to Aggregate and Refine Noisy Labels for Visual Sentiment Analysis

no code implementations 15 Sep 2021 Wei Zhu, Zihe Zheng, Haitian Zheng, Hanjia Lyu, Jiebo Luo

The learned prototypes and their labels can be regarded as denoising features and labels for the local regions and can guide the training process to prevent the model from overfitting the noisy cases.

Denoising Learning with noisy labels +1

Federated Learning of Molecular Properties with Graph Neural Networks in a Heterogeneous Setting

no code implementations 15 Sep 2021 Wei Zhu, Jiebo Luo, Andrew White

FLIT(+) can align the local training across heterogeneous clients by improving the performance for uncertain samples.

Federated Learning

LibFewShot: A Comprehensive Library for Few-shot Learning

1 code implementation 10 Sep 2021 Wenbin Li, Ziyi Wang, Xuesong Yang, Chuanqi Dong, Pinzhuo Tian, Tiexin Qin, Jing Huo, Yinghuan Shi, Lei Wang, Yang Gao, Jiebo Luo

Furthermore, based on LibFewShot, we provide comprehensive evaluations on multiple benchmarks with various backbone architectures to evaluate common pitfalls and effects of different training tricks.

Data Augmentation Few-Shot Image Classification +2

Learning Fine-Grained Motion Embedding for Landscape Animation

no code implementations 6 Sep 2021 Hongwei Xue, Bei Liu, Huan Yang, Jianlong Fu, Houqiang Li, Jiebo Luo

To tackle this problem, we propose a model named FGLA to generate high-quality and realistic videos by learning Fine-Grained motion embedding for Landscape Animation.

Multi-Modulation Network for Audio-Visual Event Localization

no code implementations 26 Aug 2021 Hao Wang, Zheng-Jun Zha, Liang Li, Xuejin Chen, Jiebo Luo

We propose a novel Multi-Modulation Network (M2N) to learn the above correlation and leverage it as semantic guidance to modulate the related auditory, visual, and fused features.

audio-visual event localization

UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing

no code implementations 12 Aug 2021 Meng Cao, HaoZhi Huang, Hao Wang, Xuan Wang, Li Shen, Sheng Wang, Linchao Bao, Zhifeng Li, Jiebo Luo

Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

3D Reconstruction Face Reenactment +3

Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph

no code implementations 26 Jul 2021 Wentian Zhao, Yao Hu, HeDa Wang, Xinxiao Wu, Jiebo Luo

Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article.

Graph Attention Image Captioning +1

Adaptive Recursive Circle Framework for Fine-grained Action Recognition

no code implementations 25 Jul 2021 Hanxi Lin, Xinxiao Wu, Jiebo Luo

It inherits the operators and parameters of the original layer but is slightly different in the use of those operators and parameters.

Fine-grained Action Recognition

Trip-ROMA: Self-Supervised Learning with Triplets and Random Mappings

1 code implementation 22 Jul 2021 Wenbin Li, Xuesong Yang, Meihao Kong, Lei Wang, Jing Huo, Yang Gao, Jiebo Luo

However, in small data regimes, we cannot obtain a sufficient number of negative pairs or effectively avoid the over-fitting problem when negatives are not used at all.

Representation Learning Self-Supervised Learning +1

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

no code implementations CVPR 2021 Jing Wang, Jinhui Tang, Mingkun Yang, Xiang Bai, Jiebo Luo

Under the guidance of the geometrical relationship between OCR tokens, our LSTM-R capitalizes on a newly-devised relation-aware pointer network to select OCR tokens from the scene text for OCR-based image captioning.

Image Captioning Optical Character Recognition (OCR) +1

Structured Multi-Level Interaction Network for Video Moment Localization via Language Query

no code implementations CVPR 2021 Hao Wang, Zheng-Jun Zha, Liang Li, Dong Liu, Jiebo Luo

In particular, for cross-modal interaction, we interact the sentence-level query with the whole moment while interacting the word-level query with content and boundary, in a coarse-to-fine manner.

Sentence

How COVID-19 Has Changed Crowdfunding: Evidence From GoFundMe

no code implementations 18 Jun 2021 Junda Wang, Xupin Zhang, Jiebo Luo

More importantly, sentiment analysis and the paired sample t-test are performed to examine the differences in crowdfunding campaigns before and after the COVID-19 outbreak that started in March 2020.

counterfactual Sentiment Analysis

SAT: 2D Semantics Assisted Training for 3D Visual Grounding

1 code implementation ICCV 2021 Zhengyuan Yang, Songyang Zhang, LiWei Wang, Jiebo Luo

3D visual grounding aims at grounding a natural language description about a 3D scene, usually represented in the form of 3D point clouds, to the targeted object region.

Object Representation Learning +1

Few-shot Partial Multi-view Learning

no code implementations 5 May 2021 Yuan Zhou, Yanrong Guo, Shijie Hao, Richang Hong, Jiebo Luo

The challenges of this task are twofold: (i) it is difficult to overcome the impact of data scarcity under the interference of missing views; (ii) the limited amount of data exacerbates information scarcity, in turn making it harder to address the view-missing issue.

Few-Shot Learning MULTI-VIEW LEARNING

Video-aided Unsupervised Grammar Induction

1 code implementation NAACL 2021 Songyang Zhang, Linfeng Song, Lifeng Jin, Kun Xu, Dong Yu, Jiebo Luo

We investigate video-aided grammar induction, which learns a constituency parser from both unlabeled text and its corresponding video.

Optical Character Recognition (OCR)

Facial Attribute Transformers for Precise and Robust Makeup Transfer

no code implementations 7 Apr 2021 Zhaoyi Wan, Haoran Chen, Jielei Zhang, Wentao Jiang, Cong Yao, Jiebo Luo

In this paper, we address the problem of makeup transfer, which aims at transplanting the makeup from the reference face to the source face while preserving the identity of the source.

Attribute Face Generation

ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows

1 code implementation CVPR 2021 Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, Jiebo Luo

The forward inference projects input images into deep features, while the backward inference remaps deep features back to input images in a lossless and unbiased way.

Style Transfer

Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval

no code implementations 29 Mar 2021 Rui Zhao, Kecheng Zheng, Zheng-Jun Zha, Hongtao Xie, Jiebo Luo

The cross-modal memory module is employed to record the instance embeddings of all the datasets for global negative mining.

Retrieval Text Retrieval +1

When Few-Shot Learning Meets Video Object Detection

no code implementations 26 Mar 2021 Zhongjie Yu, Gaoang Wang, Lin Chen, Sebastian Raschka, Jiebo Luo

We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.

Few-Shot Video Object Detection Object +3

Group-aware Label Transfer for Domain Adaptive Person Re-identification

1 code implementation CVPR 2021 Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha

In this paper, we propose a Group-aware Label Transfer (GLT) algorithm, which enables the online interaction and mutual promotion of pseudo-label prediction and representation learning.

Attribute Clustering +5

Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning

1 code implementation 5 Mar 2021 Jinsong Su, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, Jiebo Luo

In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA)

DAIL: Dataset-Aware and Invariant Learning for Face Recognition

no code implementations 14 Jan 2021 Gaoang Wang, Lin Chen, Tianqiang Liu, Mingwei He, Jiebo Luo

To solve the first issue of identity overlapping, we propose a dataset-aware loss for multi-dataset training by reducing the penalty when the same person appears in multiple datasets.

Domain Adaptation Face Recognition

Semantic Layout Manipulation with High-Resolution Sparse Attention

1 code implementation 14 Dec 2020 Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Jianming Zhang, Ning Xu, Jiebo Luo

A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.

Vocal Bursts Intensity Prediction

TAP: Text-Aware Pre-training for Text-VQA and Text-Caption

1 code implementation CVPR 2021 Zhengyuan Yang, Yijuan Lu, JianFeng Wang, Xi Yin, Dinei Florencio, Lijuan Wang, Cha Zhang, Lei Zhang, Jiebo Luo

Due to this aligned representation learning, even pre-trained on the same downstream task dataset, TAP already boosts the absolute accuracy on the TextVQA dataset by +5.4%, compared with a non-TAP baseline.

Caption Generation Language Modelling +5

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language

1 code implementation 4 Dec 2020 Songyang Zhang, Houwen Peng, Jianlong Fu, Yijuan Lu, Jiebo Luo

It is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video.

XraySyn: Realistic View Synthesis From a Single Radiograph Through CT Priors

1 code implementation 4 Dec 2020 Cheng Peng, Haofu Liao, Gina Wong, Jiebo Luo, Shaohua Kevin Zhou, Rama Chellappa

A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane.

3D-Aware Image Synthesis Anatomy +3

Social Media Study of Public Opinions on Potential COVID-19 Vaccines: Informing Dissent, Disparities, and Dissemination

no code implementations 3 Dec 2020 Hanjia Lyu, Wei Wu, Junda Wang, Viet Duong, Xiyang Zhang, Jiebo Luo

People who have the worst personal pandemic experience are more likely to hold the anti-vaccine opinion.

Social and Information Networks

Learning Semantic-aware Normalization for Generative Adversarial Networks

1 code implementation NeurIPS 2020 Heliang Zheng, Jianlong Fu, Yanhong Zeng, Jiebo Luo, Zheng-Jun Zha

Such a model disentangles latent factors according to the semantics of feature channels by channel-/group-wise fusion of latent codes and feature channels.

Image Inpainting Unconditional Image Generation

Slender Object Detection: Diagnoses and Improvements

1 code implementation 17 Nov 2020 Zhaoyi Wan, Yimin Chen, Sutao Deng, Kunpeng Chen, Cong Yao, Jiebo Luo

In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.

Object object-detection +1

Content-based Analysis of the Cultural Differences between TikTok and Douyin

no code implementations 3 Nov 2020 Li Sun, Haoqi Zhang, Songyang Zhang, Jiebo Luo

Short-form video social media shifts away from the traditional media paradigm by telling the audience a dynamic story to attract their attention.

Object object-detection +1

Pose-based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation

no code implementations 30 Oct 2020 Zhengyuan Yang, Amanda Kay, Yuncheng Li, Wendi Cross, Jiebo Luo

We then evaluate the framework on a proposed URMC dataset, which consists of conversations between a standardized patient and a behavioral health professional, along with expert annotations of body language, emotions, and potential psychiatric symptoms.

Action Recognition Emotion Recognition

Region Comparison Network for Interpretable Few-shot Image Classification

1 code implementation 8 Sep 2020 Zhiyu Xue, Lixin Duan, Wen Li, Lin Chen, Jiebo Luo

To that end, in this work, we propose a metric-learning-based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works in a neural network and to identify the specific regions that are related to each other in images from the query and support sets.

Classification Few-Shot Image Classification +3

Dynamic Context-guided Capsule Network for Multimodal Machine Translation

1 code implementation 4 Sep 2020 Huan Lin, Fandong Meng, Jinsong Su, Yongjing Yin, Zhengyuan Yang, Yubin Ge, Jie Zhou, Jiebo Luo

In particular, we represent the input image with global and regional visual features, and we introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities.

Multimodal Machine Translation Representation Learning +1

Learning to Localize Actions from Moments

1 code implementation ECCV 2020 Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

In this paper, we introduce a new design of transfer learning type to learn action localization for a large set of action categories, but only on action moments from the categories of interest and temporal annotations of untrimmed videos from a small set of action classes.

Action Localization Transfer Learning

A Smartphone-based System for Real-time Early Childhood Caries Diagnosis

no code implementations 17 Aug 2020 Yi-Peng Zhang, Haofu Liao, Jin Xiao, Nisreen Al Jallad, Oriana Ly-Mapes, Jiebo Luo

The identification of ECC in an early stage usually requires expertise in the field, and hence is often ignored by parents.

Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification

5 code implementations ECCV 2020 Mang Ye, Jianbing Shen, David J. Crandall, Ling Shao, Jiebo Luo

In this paper, we propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.

Person Re-Identification Retrieval

Universal Model for Multi-Domain Medical Image Retrieval

no code implementations 14 Jul 2020 Yang Feng, Yubao Liu, Jiebo Luo

Usually, one image retrieval model is only trained to handle images from one modality or one source.

Medical Image Retrieval Retrieval

Task-agnostic Temporally Consistent Facial Video Editing

no code implementations 3 Jul 2020 Meng Cao, Hao-Zhi Huang, Hao Wang, Xuan Wang, Li Shen, Sheng Wang, Linchao Bao, Zhifeng Li, Jiebo Luo

Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

3D Reconstruction Video Editing

Monitoring Depression Trend on Twitter during the COVID-19 Pandemic

no code implementations 1 Jul 2020 Yi-Peng Zhang, Hanjia Lyu, Yubao Liu, Xiyang Zhang, Yu Wang, Jiebo Luo

The COVID-19 pandemic has severely affected people's daily lives and caused tremendous economic loss worldwide.

Global Image Sentiment Transfer

no code implementations 22 Jun 2020 Jie An, Tianlang Chen, Songyang Zhang, Jiebo Luo

This work proposes a novel framework consisting of a reference image retrieval step and a global sentiment transfer step to transfer sentiments of images according to a given sentiment tag.

Image Retrieval Retrieval +3

Image Sentiment Transfer

no code implementations 19 Jun 2020 Tianlang Chen, Wei Xiong, Haitian Zheng, Jiebo Luo

In this paper, we propose an effective and flexible framework that performs image sentiment transfer at the object level.

Disentanglement Image-to-Image Translation +2

Real-time Universal Style Transfer on High-resolution Images via Zero-channel Pruning

no code implementations 16 Jun 2020 Jie An, Tao Li, Hao-Zhi Huang, Li Shen, Xuan Wang, Yongyi Tang, Jinwen Ma, Wei Liu, Jiebo Luo

Extracting effective deep features to represent content and style information is the key to universal style transfer.

Style Transfer

Personalized Fashion Recommendation from Personal Social Media Data: An Item-to-Set Metric Learning Approach

no code implementations 25 May 2020 Haitian Zheng, Kefei Wu, Jong-Hwi Park, Wei Zhu, Jiebo Luo

In this work, we study the problem of personalized fashion recommendation from social media data, i.e., recommending new outfits to social media users that fit their fashion preferences.

Metric Learning

On Vocabulary Reliance in Scene Text Recognition

no code implementations CVPR 2020 Zhaoyi Wan, Jielei Zhang, Liang Zhang, Jiebo Luo, Cong Yao

This remedy alleviates the problem of vocabulary reliance and improves the overall scene text recognition performance.

Scene Text Recognition

Unsupervised Low-light Image Enhancement with Decoupled Networks

no code implementations 6 May 2020 Wei Xiong, Ding Liu, Xiaohui Shen, Chen Fang, Jiebo Luo

In this paper, we tackle the problem of enhancing real-world low-light images with significant noise in an unsupervised fashion.

Image-to-Image Translation Low-Light Image Enhancement

Alleviating the Incompatibility between Cross Entropy Loss and Episode Training for Few-shot Skin Disease Classification

no code implementations 21 Apr 2020 Wei Zhu, Haofu Liao, Wenbin Li, Weijian Li, Jiebo Luo

Inspired by the recent success of Few-Shot Learning (FSL) in natural image classification, we propose to apply FSL to skin disease identification to address the problem of extremely scarce training samples.

Few-Shot Learning General Classification +2

The Ivory Tower Lost: How College Students Respond Differently than the General Public to the COVID-19 Pandemic

no code implementations 21 Apr 2020 Viet Duong, Phu Pham, Tongyu Yang, Yu Wang, Jiebo Luo

Recently, the pandemic of the novel Coronavirus Disease-2019 (COVID-19) has presented governments with ultimate challenges.

In the Eyes of the Beholder: Analyzing Social Media Use of Neutral and Controversial Terms for COVID-19

no code implementations 21 Apr 2020 Long Chen, Hanjia Lyu, Tongyu Yang, Yu Wang, Jiebo Luo

To model the substantive difference of tweets with controversial terms and those with non-controversial terms, we apply topic modeling and LIWC-based sentiment analysis.

Sentiment Analysis

Unsupervised Learning of Landmarks based on Inter-Intra Subject Consistencies

1 code implementation 16 Apr 2020 Weijian Li, Haofu Liao, Shun Miao, Le Lu, Jiebo Luo

To recover from the transformed images back to the original subject, the landmark detector is forced to learn spatial locations that contain the consistent semantic meanings both for the paired intra-subject images and between the paired inter-subject images.

TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images

1 code implementation ECCV 2020 Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo

With TuiGAN, an image is translated in a coarse-to-fine manner where the generated image is gradually refined from global structures to local details.

Translation Unsupervised Image-To-Image Translation +1

Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection

no code implementations CVPR 2020 Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu

Instead of blindly trusting quality-inconsistent PAs, WS^2 employs a learning-based selection to select effective PAs and a novel region integrity criterion as a stopping condition for weakly-supervised training.

Action Segmentation Segmentation +3

Adaptive Offline Quintuplet Loss for Image-Text Matching

1 code implementation ECCV 2020 Tianlang Chen, Jiajun Deng, Jiebo Luo

For each image or text anchor in a training mini-batch, the model is trained to distinguish between a positive and the most confusing negative of the anchor mined from the mini-batch (i.e., online hard negative).

Image-text matching Text Matching

Expressing Objects just like Words: Recurrent Visual Embedding for Image-Text Matching

no code implementations 20 Feb 2020 Tianlang Chen, Jiebo Luo

Existing image-text matching approaches typically infer the similarity of an image-text pair by capturing and aggregating the affinities between the text and each independent object of the image.

Image-text matching Object +4

Asymmetric Distribution Measure for Few-shot Learning

no code implementations 1 Feb 2020 Wenbin Li, Lei Wang, Jing Huo, Yinghuan Shi, Yang Gao, Jiebo Luo

Given the natural asymmetric relation between a query image and a support class, we argue that an asymmetric measure is more suitable for metric-based few-shot learning.

Few-Shot Image Classification Few-Shot Learning

#MeToo on Campus: Studying College Sexual Assault at Scale Using Data Reported on Social Media

no code implementations 16 Jan 2020 Viet Duong, Phu Pham, Ritwik Bose, Jiebo Luo

Recently, the emergence of the #MeToo trend on social media has empowered thousands of people to share their own sexual harassment experiences.

Fine-grained Image-to-Image Transformation towards Visual Recognition

no code implementations CVPR 2020 Wei Xiong, Yutong He, Yixuan Zhang, Wenhan Luo, Lin Ma, Jiebo Luo

In this paper, we aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image, which can thereby benefit the subsequent fine-grained image recognition and few-shot learning tasks.

Few-Shot Learning Fine-Grained Image Recognition

TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning

no code implementations CVPR 2020 Zhongjie Yu, Lin Chen, Zhongwei Cheng, Jiebo Luo

Under the proposed framework, we develop a novel method for semi-supervised few-shot learning called TransMatch by instantiating the three components with Imprinting and MixMatch.

Few-Shot Learning Transfer Learning

Neural Simile Recognition with Cyclic Multitask Learning and Local Attention

1 code implementation 19 Dec 2019 Jiali Zeng, Linfeng Song, Jinsong Su, Jun Xie, Wei Song, Jiebo Luo

Simile recognition is to detect simile sentences and to extract simile components, i.e., tenors and vehicles.

Sentence Sentence Classification

Graph-based Neural Sentence Ordering

1 code implementation 16 Dec 2019 Yongjing Yin, Linfeng Song, Jinsong Su, Jiali Zeng, Chulun Zhou, Jiebo Luo

Sentence ordering is to restore the original paragraph from a set of sentences.

Sentence Sentence Ordering

Iterative Dual Domain Adaptation for Neural Machine Translation

no code implementations IJCNLP 2019 Jiali Zeng, Yang Liu, Jinsong Su, Yubin Ge, Yaojie Lu, Yongjing Yin, Jiebo Luo

Previous studies on domain adaptation for neural machine translation (NMT) mainly focus on one-pass transfer of out-of-domain translation knowledge to an in-domain NMT model.

Domain Adaptation Knowledge Distillation +4

Grounding-Tracking-Integration

no code implementations 13 Dec 2019 Zhengyuan Yang, Tushar Kumar, Tianlang Chen, Jinsong Su, Jiebo Luo

In this paper, we study Tracking by Language that localizes the target box sequence in a video based on a language query.

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization

2 code implementations 8 Dec 2019 Songyang Zhang, Houwen Peng, Le Yang, Jianlong Fu, Jiebo Luo

In this report, we introduce the winning method for the HACS Temporal Action Localization Challenge 2019.

Temporal Action Localization

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language

3 code implementations 8 Dec 2019 Songyang Zhang, Houwen Peng, Jianlong Fu, Jiebo Luo

We address the problem of retrieving a specific moment from an untrimmed video by a query sentence.

Sentence

Ultrafast Photorealistic Style Transfer via Neural Architecture Search

no code implementations 5 Dec 2019 Jie An, Haoyi Xiong, Jun Huan, Jiebo Luo

Our method consists of a construction step (C-step) to build a photorealistic stylization network and a pruning step (P-step) for acceleration.

Network Pruning Neural Architecture Search +1

Defensive Few-shot Learning

1 code implementation 16 Nov 2019 Wenbin Li, Lei Wang, Xingxing Zhang, Lei Qi, Jing Huo, Yang Gao, Jiebo Luo

(2) how to narrow the distribution gap between clean and adversarial examples under the few-shot setting?

Adversarial Defense Few-Shot Learning

Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation

no code implementations Findings of the Association for Computational Linguistics 2020 Yiming Xu, Lin Chen, Zhongwei Cheng, Lixin Duan, Jiebo Luo

A straightforward solution is to fine-tune a pre-trained source model by using those limited labeled target data, but it usually cannot work well due to the considerable difference between the data distributions of the source and target domains.

Domain Adaptation Question Answering +1

Learning Deep Bilinear Transformation for Fine-grained Image Representation

1 code implementation NeurIPS 2019 Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo

However, the computational cost of learning pairwise interactions between deep feature channels is prohibitively expensive, which restricts the use of this powerful transformation in deep neural networks.

Fine-Grained Image Recognition

SMP Challenge: An Overview of Social Media Prediction Challenge 2019

no code implementations 4 Oct 2019 Bo Wu, Wen-Huang Cheng, Peiye Liu, Bei Liu, Zhaoyang Zeng, Jiebo Luo

In the SMP Challenge at ACM Multimedia 2019, we introduce a novel prediction task Temporal Popularity Prediction, which focuses on predicting future interaction or attractiveness (in terms of clicks, views or likes etc.)

Multimedia recommendation

Unsupervised Pose Flow Learning for Pose Guided Synthesis

no code implementations 30 Sep 2019 Haitian Zheng, Lele Chen, Chenliang Xu, Jiebo Luo

Pose guided synthesis aims to generate a new image in an arbitrary target pose while preserving the appearance details from the source image.

Large-scale Tag-based Font Retrieval with Generative Feature Learning

no code implementations ICCV 2019 Tianlang Chen, Zhaowen Wang, Ning Xu, Hailin Jin, Jiebo Luo

In this paper, we address the problem of large-scale tag-based font retrieval which aims to bring semantics to the font selection process and enable people without expert knowledge to use fonts effectively.

Retrieval TAG

Exploiting Temporal Relationships in Video Moment Localization with Natural Language

1 code implementation 11 Aug 2019 Songyang Zhang, Jinsong Su, Jiebo Luo

We address the problem of video moment localization with natural language, i.e., localizing a video segment described by a natural language sentence.

Sentence

Semi-Supervised Adversarial Monocular Depth Estimation

no code implementations 6 Aug 2019 Rongrong Ji, Ke Li, Yan Wang, Xiaoshuai Sun, Feng Guo, Xiaowei Guo, Yongjian Wu, Feiyue Huang, Jiebo Luo

In this paper, we address the problem of monocular depth estimation when only a limited number of training image-depth pairs are available.

Monocular Depth Estimation

ADN: Artifact Disentanglement Network for Unsupervised Metal Artifact Reduction

2 code implementations 3 Aug 2019 Haofu Liao, Wei-An Lin, S. Kevin Zhou, Jiebo Luo

Current deep neural network based approaches to computed tomography (CT) metal artifact reduction (MAR) are supervised methods that rely on synthesized metal artifacts for training.

Computed Tomography (CT) Disentanglement +4

Weakly Supervised Body Part Segmentation with Pose based Part Priors

no code implementations 30 Jul 2019 Zhengyuan Yang, Yuncheng Li, Linjie Yang, Ning Zhang, Jiebo Luo

The core idea is to first convert the sparse weak labels, such as keypoints, into an initial estimate of body part masks, and then iteratively refine the part mask predictions.

Face Parsing Segmentation +1

Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment

no code implementations 22 Jul 2019 Jianbo Yuan, Haofu Liao, Rui Luo, Jiebo Luo

In addition, in order to enrich the decoder with descriptive semantics and enforce the correctness of the deterministic medical-related contents such as mentions of organs or diagnoses, we extract medical concepts based on the radiology reports in the training data and fine-tune the encoder to extract the most frequent medical concepts from the x-ray images.

Descriptive Image Captioning +2

Fast Universal Style Transfer for Artistic and Photorealistic Rendering

no code implementations 6 Jul 2019 Jie An, Haoyi Xiong, Jiebo Luo, Jun Huan, Jinwen Ma

Given a pair of images as the source of content and the reference of style, existing solutions usually first train an auto-encoder (AE) to reconstruct the image using deep features and then embed pre-defined style transfer modules into the AE reconstruction procedure to transfer the style of the reconstructed image by modifying the deep features.

Style Transfer

Uncovering Download Fraud Activities in Mobile App Markets

no code implementations 5 Jul 2019 Yingtong Dou, Weijian Li, Zhirong Liu, Zhenhua Dong, Jiebo Luo, Philip S. Yu

To the best of our knowledge, this is the first work that investigates the download fraud problem in mobile App markets.

DuDoNet: Dual Domain Network for CT Metal Artifact Reduction

no code implementations CVPR 2019 Wei-An Lin, Haofu Liao, Cheng Peng, Xiaohang Sun, Jingdan Zhang, Jiebo Luo, Rama Chellappa, Shaohua Kevin Zhou

The linkage between the sinogram and image domains is a novel Radon inversion layer that allows the gradients to back-propagate from the image domain to the sinogram domain during training.

Computed Tomography (CT) Medical Diagnosis +1

Generative Mask Pyramid Network for CT/CBCT Metal Artifact Reduction with Joint Projection-Sinogram Correction

no code implementations 29 Jun 2019 Haofu Liao, Wei-An Lin, Zhimin Huo, Levon Vogelsang, William J. Sehnert, S. Kevin Zhou, Jiebo Luo

A conventional approach to computed tomography (CT) or cone beam CT (CBCT) metal artifact reduction is to replace the X-ray projection data within the metal trace with synthesized data.

Computed Tomography (CT) Metal Artifact Reduction

Patch Transformer for Multi-tagging Whole Slide Histopathology Images

no code implementations 10 Jun 2019 Weijian Li, Viet-Duy Nguyen, Haofu Liao, Matt Wilder, Ke Cheng, Jiebo Luo

Automated whole slide image (WSI) tagging has become a growing demand due to the increasing volume and diversity of WSIs collected nowadays in histopathology.

TAG

StyleNAS: An Empirical Study of Neural Architecture Search to Uncover Surprisingly Fast End-to-End Universal Style Transfer Networks

no code implementations 6 Jun 2019 Jie An, Haoyi Xiong, Jinwen Ma, Jiebo Luo, Jun Huan

Finally, compared to existing universal style transfer networks for photorealistic rendering such as PhotoWCT, which stacks multiple well-trained auto-encoders and WCT transforms in a non-end-to-end manner, the architectures designed by StyleNAS produce better style-transferred images with preserved details, use a tiny number of operators/parameters, and enjoy around a 500x inference-time speed-up.

Image Classification Neural Architecture Search +4

Artifact Disentanglement Network for Unsupervised Metal Artifact Reduction

1 code implementation 5 Jun 2019 Haofu Liao, Wei-An Lin, Jianbo Yuan, S. Kevin Zhou, Jiebo Luo

Extensive experiments show that our method significantly outperforms the existing unsupervised models for image-to-image translation problems, and achieves comparable performance to existing supervised models on a synthesized dataset.

Computed Tomography (CT) Disentanglement +3

Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis

1 code implementation ACL 2019 Jialong Tang, Ziyao Lu, Jinsong Su, Yubin Ge, Linfeng Song, Le Sun, Jiebo Luo

In aspect-level sentiment classification (ASC), it is prevalent to equip dominant neural models with attention mechanisms, for the sake of acquiring the importance of each context word on the given aspect.

Aspect-Based Sentiment Analysis (ABSA) Sentiment Classification

Relational Reasoning using Prior Knowledge for Visual Captioning

no code implementations 4 Jun 2019 Jingyi Hou, Xinxiao Wu, Yayun Qi, Wentian Zhao, Jiebo Luo, Yunde Jia

Extensive experiments on the MS-COCO image captioning benchmark and the MSVD video captioning benchmark validate the superiority of our method in leveraging prior commonsense knowledge to enhance relational reasoning for visual captioning.

Image Captioning object-detection +4

Spatio-temporal Video Re-localization by Warp LSTM

no code implementations CVPR 2019 Yang Feng, Lin Ma, Wei Liu, Jiebo Luo

The need to efficiently find the video content a user wants is increasing because of the explosion of user-generated videos on the Web.

Retrieval Video Retrieval

Human-Centered Emotion Recognition in Animated GIFs

1 code implementation 27 Apr 2019 Zhengyuan Yang, Yixuan Zhang, Jiebo Luo

The framework consists of a facial attention module and a hierarchical segment temporal module.

Emotion Recognition

Revisiting Local Descriptor based Image-to-Class Measure for Few-shot Learning

1 code implementation CVPR 2019 Wenbin Li, Lei Wang, Jinglin Xu, Jing Huo, Yang Gao, Jiebo Luo

Its key difference from the literature is the replacement of the image-level feature based measure in the final layer by a local descriptor based image-to-class measure.

Few-Shot Image Classification Few-Shot Learning +1

Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

no code implementations 27 Mar 2019 Guo-Jun Qi, Jiebo Luo

Representation learning with small labeled data has emerged in many problems, since the success of deep neural networks often relies on the availability of a huge amount of labeled data that is expensive to collect.

Domain Adaptation Representation Learning +1

Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation ($\text{POINT}^2$)

no code implementations 10 Mar 2019 Haofu Liao, Wei-An Lin, Jiarui Zhang, Jingdan Zhang, Jiebo Luo, S. Kevin Zhou

As the POI tracker is shift-invariant, $\text{POINT}^2$ is more robust to the initial pose of the 3D pre-intervention image.

Foreground-aware Image Inpainting

no code implementations CVPR 2019 Wei Xiong, Jiahui Yu, Zhe Lin, Jimei Yang, Xin Lu, Connelly Barnes, Jiebo Luo

We show that by such disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting.

Disentanglement Image Inpainting

AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data

1 code implementation CVPR 2019 Liheng Zhang, Guo-Jun Qi, Liqiang Wang, Jiebo Luo

The success of deep neural networks often relies on a large amount of labeled examples, which can be difficult to obtain in many real scenarios.

Representation Learning

Joint Vertebrae Identification and Localization in Spinal CT Images by Combining Short- and Long-Range Contextual Information

no code implementations 9 Dec 2018 Haofu Liao, Addisu Mesfin, Jiebo Luo

For the long-range contextual information, we propose a multi-task bidirectional recurrent neural network (Bi-RNN) to encode the spatial and contextual information among the vertebrae of the visible spine column.

Joint Vertebrae Identification And Localization In Spinal CT Images

More Knowledge is Better: Cross-Modality Volume Completion and 3D+2D Segmentation for Intracardiac Echocardiography Contouring

no code implementations 9 Dec 2018 Haofu Liao, Yucheng Tang, Gareth Funka-Lea, Jiebo Luo, Shaohua Kevin Zhou

Using catheter ablation to treat atrial fibrillation increasingly relies on intracardiac echocardiography (ICE) for an anatomical delineation of the left atrium and the pulmonary veins that enter the atrium.

Anatomy

Adversarial Sparse-View CBCT Artifact Reduction

no code implementations 9 Dec 2018 Haofu Liao, Zhimin Huo, William J. Sehnert, Shaohua Kevin Zhou, Jiebo Luo

We present an effective post-processing method to reduce the artifacts from sparsely reconstructed cone-beam CT (CBCT) images.

CBCT Artifact Reduction

Real-Time Referring Expression Comprehension by Single-Stage Grounding Network

no code implementations 9 Dec 2018 Xinpeng Chen, Lin Ma, Jingyuan Chen, Zequn Jie, Wei Liu, Jiebo Luo

Experiments on RefCOCO, RefCOCO+, and RefCOCOg datasets demonstrate that our proposed SSG without relying on any region proposals can achieve comparable performance with other advanced models.

Attribute Referring Expression +1
