Search Results for author: Jianan Wang

Found 36 papers, 13 papers with code

Implicit Sentiment Analysis with Event-centered Text Representation

no code implementations EMNLP 2021 Deyu Zhou, Jianan Wang, Linhai Zhang, Yulan He

Implicit sentiment analysis, aiming at detecting the sentiment of a sentence without sentiment words, has become an attractive research topic in recent years.

Representation Learning Sentence +2

Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model

no code implementations 24 Feb 2025 Yaxuan Huang, Xili Dai, Jianan Wang, Xianbiao Qi, Yixing Yuan, Xiangyu Yue

Room layout estimation from multiple-perspective images is poorly investigated due to the complexities that emerge from multi-view geometry, which requires multi-step solutions such as camera intrinsic and extrinsic estimation, image matching, and triangulation.

3D Reconstruction Room Layout Estimation

OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation

no code implementations 15 Dec 2024 Bohan Li, Xin Jin, Jianan Wang, Yukai Shi, Yasheng Sun, XiaoFeng Wang, Zhuang Ma, Baao Xie, Chao Ma, Xiaokang Yang, Wenjun Zeng

Within OccScene, the perception module can be effectively improved with customized and diverse generated scenes, while the perception priors in return enhance the generation performance for mutual benefits.

Mamba Scene Generation

DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion

1 code implementation 25 Sep 2024 Yukun Huang, Jianan Wang, Ailing Zeng, Zheng-Jun Zha, Lei Zhang, Xihui Liu

The core of this framework lies in Skeleton-guided Score Distillation and Hybrid 3D Gaussian Avatar representation.

Text to 3D

BeSimulator: A Large Language Model Powered Text-based Behavior Simulator

no code implementations 24 Sep 2024 Jianan Wang, Bin Li, Xueying Wang, Fu Li, Yunlong Wu, Juan Chen, Xiaodong Yi

Traditional robot simulators focus on physical process modeling and realistic rendering, often suffering from high computational costs, inefficiencies, and limited adaptability.

Language Modeling Language Modelling +1

HeGTa: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding

no code implementations 28 Mar 2024 Rihui Jin, Yu Li, Guilin Qi, Nan Hu, Yuan-Fang Li, Jiaoyan Chen, Jianan Wang, Yongrui Chen, Dehai Min, Sheng Bi

Table understanding (TU) has achieved promising advancements, but it faces two challenges: the scarcity of manually labeled tables and the presence of complex table structures. To address these challenges, we propose HGT, a framework with a heterogeneous graph (HG)-enhanced large language model (LLM) to tackle few-shot TU tasks. It leverages the LLM by aligning the table semantics with the LLM's parametric knowledge through soft prompts and instruction tuning, and deals with complex tables through a multi-task pre-training scheme involving three novel multi-granularity self-supervised HG pre-training objectives. We empirically demonstrate the effectiveness of HGT, showing that it outperforms the SOTA for few-shot complex TU on several benchmarks.

Language Modeling Language Modelling +1

CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility

1 code implementation 18 Mar 2024 Bojia Zi, Shihao Zhao, Xianbiao Qi, Jianan Wang, Yukai Shi, Qianyu Chen, Bin Liang, Kam-Fai Wong, Lei Zhang

To this end, this paper proposes a novel text-guided video inpainting model that achieves better consistency, controllability and compatibility.

Image Inpainting Video Alignment +2

Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription

no code implementations 16 Mar 2024 Hongxiang Zhao, Xili Dai, Jianan Wang, Shengbang Tong, Jingyuan Zhang, Weida Wang, Lei Zhang, Yi Ma

This consequently limits the performance of downstream tasks, such as image-to-multiview generation and 3D reconstruction.

3D Reconstruction Novel View Synthesis

Stable Score Distillation for High-Quality 3D Generation

no code implementations 14 Dec 2023 Boshi Tang, Jianan Wang, Zhiyong Wu, Lei Zhang

Although Score Distillation Sampling (SDS) has exhibited remarkable performance in conditional 3D content generation, a comprehensive understanding of its formulation is still lacking, hindering the development of 3D generation.

3D Generation

Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

no code implementations 18 Oct 2023 Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan

Recent text-to-3D generation methods achieve impressive 3D content creation capacity thanks to the advances in image diffusion models and optimizing strategies.

3D Generation Text to 3D

Delta-LoRA: Fine-Tuning High-Rank Parameters with the Delta of Low-Rank Matrices

no code implementations 5 Sep 2023 Bojia Zi, Xianbiao Qi, Lingzhi Wang, Jianan Wang, Kam-Fai Wong, Lei Zhang

In this paper, we present Delta-LoRA, which is a novel parameter-efficient approach to fine-tune large language models (LLMs).

DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

no code implementations 21 Jun 2023 Yukun Huang, Jianan Wang, Yukai Shi, Boshi Tang, Xianbiao Qi, Lei Zhang

Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled 3D content creation by optimizing a randomly initialized differentiable 3D representation with score distillation.

3D Generation Diversity +1

Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant

no code implementations 15 Jun 2023 Xianbiao Qi, Jianan Wang, Lei Zhang

This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and training instability, respectively.

Deep Learning

detrex: Benchmarking Detection Transformers

1 code implementation 12 Jun 2023 Tianhe Ren, Shilong Liu, Feng Li, Hao Zhang, Ailing Zeng, Jie Yang, Xingyu Liao, Ding Jia, Hongyang Li, He Cao, Jianan Wang, Zhaoyang Zeng, Xianbiao Qi, Yuhui Yuan, Jianwei Yang, Lei Zhang

To address this issue, we develop a unified, highly modular, and lightweight codebase called detrex, which supports a majority of the mainstream DETR-based instance recognition algorithms, covering various fundamental tasks, including object detection, segmentation, and pose estimation.

Benchmarking object-detection +2

LipsFormer: Introducing Lipschitz Continuity to Vision Transformers

1 code implementation 19 Apr 2023 Xianbiao Qi, Jianan Wang, Yihao Chen, Yukai Shi, Lei Zhang

In contrast to previous practical tricks that address training instability by learning rate warmup, layer normalization, attention formulation, and weight initialization, we show that Lipschitz continuity is a more essential property to ensure training stability.

DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training

1 code implementation CVPR 2023 Yihao Chen, Xianbiao Qi, Jianan Wang, Lei Zhang

In this way, we can reduce the GPU memory consumption of contrastive loss computation from $\mathcal{O}(B^2)$ to $\mathcal{O}(\frac{B^2}{N})$, where $B$ and $N$ are the batch size and the number of GPUs used for training, respectively.

Contrastive Learning
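
As a rough illustration of the DisCo-CLIP memory claim above (O(B^2) reduced to O(B^2/N)), the sketch below counts the bytes in a similarity matrix. It assumes each GPU holds a (B/N) x B block of float32 values; that row-block reading, the helper name `similarity_matrix_bytes`, and the numbers used are assumptions for illustration, not details taken from the paper.

```python
def similarity_matrix_bytes(batch_size: int, num_gpus: int = 1,
                            bytes_per_elem: int = 4) -> int:
    """Bytes needed for a (batch_size / num_gpus) x batch_size block.

    Hypothetical helper: assumes float32 entries (4 bytes) and an even
    split of the rows of the similarity matrix across GPUs.
    """
    if batch_size % num_gpus != 0:
        raise ValueError("batch_size must be divisible by num_gpus")
    rows_per_gpu = batch_size // num_gpus
    return rows_per_gpu * batch_size * bytes_per_elem


# Illustrative values only: B = 32768, N = 64.
full = similarity_matrix_bytes(32768)         # B^2 elements on one GPU
sharded = similarity_matrix_bytes(32768, 64)  # B^2 / N elements per GPU
print(full // 2**20, "MiB vs", sharded // 2**20, "MiB")  # 4096 MiB vs 64 MiB
```

Under these assumptions the per-GPU footprint shrinks by exactly the factor N, which is the scaling the abstract states.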

HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation

3 code implementations ICCV 2023 Xuan Ju, Ailing Zeng, Chenchen Zhao, Jianan Wang, Lei Zhang, Qiang Xu

While such a plug-and-play approach is appealing, the inevitable and uncertain conflicts between the original images produced from the frozen SD branch and the given condition incur significant challenges for the learnable branch, which essentially conducts image feature editing for condition enforcement.

Denoising Image Generation

Entity-Level Text-Guided Image Manipulation

1 code implementation 22 Feb 2023 Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei Zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

A Multi-Source Information Learning Framework for Airbnb Price Prediction

no code implementations 1 Jan 2023 Lu Jiang, Yuanhan Li, Na Luo, Jianan Wang, Qiao Ning

Thirdly, we use the points of interest (POI) around the rental house to generate a variety of spatial network graphs, and learn the embedding of the network to obtain the spatial feature embedding.

Multi-View MOOC Quality Evaluation via Information-Aware Graph Representation Learning

no code implementations 1 Jan 2023 Lu Jiang, Yibin Wang, Jianan Wang, Pengyang Wang, Minghao Yin

To tackle these challenges, we formulate the problem as a course representation learning task and develop Information-aware Graph Representation Learning (IaGRL) for multi-view MOOC quality evaluation.

Graph Representation Learning

Exploring Vision Transformers as Diffusion Learners

no code implementations 28 Dec 2022 He Cao, Jianan Wang, Tianhe Ren, Xianbiao Qi, Yihao Chen, Yuan YAO, Lei Zhang

We further provide a hypothesis on the implication of disentangling the generative backbone as an encoder-decoder structure and show proof-of-concept experiments verifying the effectiveness of a stronger encoder for generative tasks with ASymmetriC ENcoder Decoder (ASCEND).

Decoder

Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning

no code implementations 24 Dec 2022 Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang

We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network.

reinforcement-learning Reinforcement Learning +3

ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation

1 code implementation CVPR 2022 Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Chunjing Xu, Yanwei Fu

Existing text-guided image manipulation methods aim to modify the appearance of the image or to edit a few objects in a virtual or simple scenario, which is far from practical application.

Image Manipulation

Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions

1 code implementation 15 Nov 2020 Jianan Wang, Boyang Li, Xiangyu Fan, Jing Lin, Yanwei Fu

The task of video and text sequence alignment is a prerequisite step toward joint understanding of movie videos and screenplays.

A Combinatorial Perspective on Transfer Learning

1 code implementation NeurIPS 2020 Jianan Wang, Eren Sezener, David Budden, Marcus Hutter, Joel Veness

Our main postulate is that the combination of task segmentation, modular learning and memory-based ensembling can give rise to generalization on an exponentially growing number of unseen tasks.

Continual Learning Transfer Learning

Online Learning in Contextual Bandits using Gated Linear Networks

no code implementations NeurIPS 2020 Eren Sezener, Marcus Hutter, David Budden, Jianan Wang, Joel Veness

We introduce a new and completely online contextual bandit algorithm called Gated Linear Contextual Bandits (GLCB).

Multi-Armed Bandits

Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds

1 code implementation NeurIPS 2019 Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni

The framework directly regresses 3D bounding boxes for all instances in a point cloud, while simultaneously predicting a point-level mask for each instance.

Ranked #14 on 3D Instance Segmentation on S3DIS (mPrec metric)

3D Instance Segmentation Clustering +2

Group Linguistic Bias Aware Neural Response Generation

no code implementations WS 2017 Jianan Wang, Xin Wang, Fang Li, Zhen Xu, Zhuoran Wang, Baoxun Wang

For practical chatbots, one of the essential factors for improving user experience is the capability of customizing the talking style of the agents, that is, to make chatbots provide responses meeting users' preferences on language styles, topics, etc.

Decoder Response Generation
