Search Results for author: Wei zhang

Found 534 papers, 161 papers with code

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction

no code implementations ECCV 2020 Tianjiao Li, Jun Liu, Wei zhang, Ling-Yu Duan

In this paper, we propose a novel Hardness-AwaRe Discrimination Network (HARD-Net) to specifically investigate the relationships between the similar activity pairs that are hard to be discriminated.

Activity Prediction Skeleton Based Action Recognition

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes

1 code implementation ECCV 2020 Ran Song, Wei zhang, Yitian Zhao, Yonghuai Liu

We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition named by us view-object consistency as it states that a 3D object and its projected 2D views always belong to the same object class.

Object

Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

no code implementations ECNLP (ACL) 2022 Zheng Liu, Wei zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

Recently, semantic search has been successfully applied to E-commerce product search and the learned semantic space for query and product encoding are expected to generalize well to unseen queries or products.

text similarity

Functional Programming Paradigm of Python for Scientific Computation Pipeline Integration

no code implementations27 May 2024 Chen Zhang, Lecheng Jia, Wei zhang, Ning Wen

The advent of modern data processing has led to an increasing tendency towards interdisciplinarity, which frequently involves the importation of different technical approaches.

Mamba4KT:An Efficient and Effective Mamba-based Knowledge Tracing Model

no code implementations26 May 2024 Yang Cao, Wei zhang

Recognizing the significance of prioritizing model efficiency and resource usage in knowledge tracing, we introduce Mamba4KT.

ECLIPSE: Semantic Entropy-LCS for Cross-Lingual Industrial Log Parsing

no code implementations22 May 2024 Wei zhang, Xianfu Cheng, Yi Zhang, Jian Yang, Hongcheng Guo, Zhoujun Li, Xiaolin Yin, Xiangyuan Guan, Xu Shi, Liangfan Zheng, Bo Zhang

These challenges are two-fold: 1) massive log templates: The performance and efficiency of most existing parsers will be significantly reduced when logs of growing quantities and different lengths; 2) Complex and changeable semantics: Traditional template-matching algorithms cannot accurately match the log templates of complicated industrial logs because they cannot utilize cross-language logs with similar semantics.

Language Modelling Large Language Model +2

Weakly-supervised causal discovery based on fuzzy knowledge and complex data complementarity

no code implementations14 May 2024 Wenrui Li, Wei zhang, Qinghao Zhang, Xuegong Zhang, Xiaowo Wang

Extensive experiments with different datasets demonstrate the superiority of KEEL over several state-of-the-art methods in accuracy, robustness and computational efficiency.

Causal Discovery Computational Efficiency

LingML: Linguistic-Informed Machine Learning for Enhanced Fake News Detection

no code implementations7 May 2024 Jasraj Singh, Fang Liu, Hong Xu, Bee Chin Ng, Wei zhang

In this paper, we enhance ML-based solutions with linguistics input and we propose LingML, linguistic-informed ML, for fake news detection.

Fake News Detection Misinformation

SLAM for Indoor Mapping of Wide Area Construction Environments

no code implementations26 Apr 2024 Vincent Ress, Wei zhang, David Skuddis, Norbert Haala, Uwe Soergel

We apply our state-of-the-art LiDAR and visual SLAM approaches and discuss the respective pros and cons of the different sensor types for trajectory estimation and dense map generation in such an environment.

Pose Estimation Simultaneous Localization and Mapping

NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

no code implementations25 Apr 2024 Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, HaoNing Wu, Yixuan Gao, Yuqin Cao, ZiCheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, Jianquan Yang, Weigang Wang, Xi Fang, Xiaoxin Lv, Jun Yan, Tianwu Zhi, Yabin Zhang, Yaohui Li, Yang Li, Jingwen Xu, Jianzhao Liu, Yiting Liao, Junlin Li, Zihao Yu, Yiting Lu, Xin Li, Hossein Motamednia, S. Farhad Hosseini-Benvidi, Fengbin Guan, Ahmad Mahmoudi-Aznaveh, Azadeh Mansouri, Ganzorig Gankhuyag, Kihwan Yoon, Yifang Xu, Haotian Fan, Fangyuan Kong, Shiling Zhao, Weifeng Dong, Haibing Yin, Li Zhu, Zhiling Wang, Bingchen Huang, Avinab Saha, Sandeep Mishra, Shashank Gupta, Rajesh Sureddi, Oindrila Saha, Luigi Celona, Simone Bianco, Paolo Napoletano, Raimondo Schettini, Junfeng Yang, Jing Fu, Wei zhang, Wenzhi Cao, Limei Liu, Han Peng, Weijun Yuan, Zhan Li, Yihang Cheng, Yifan Deng, Haohui Li, Bowen Qu, Yao Li, Shuqing Luo, Shunzhou Wang, Wei Gao, Zihao Lu, Marcos V. Conde, Xinrui Wang, Zhibo Chen, Ruling Liao, Yan Ye, Qiulin Wang, Bing Li, Zhaokun Zhou, Miao Geng, Rui Chen, Xin Tao, Xiaoyu Liang, Shangkun Sun, Xingyuan Ma, Jiaze Li, Mengduo Yang, Haoran Xu, Jie zhou, Shiding Zhu, Bohan Yu, Pengfei Chen, Xinrui Xu, Jiabin Shen, Zhichao Duan, Erfan Asadi, Jiahe Liu, Qi Yan, Youran Qu, Xiaohui Zeng, Lele Wang, Renjie Liao

A total of 196 participants have registered in the video track.

Image Quality Assessment Image Restoration +2

Data-free Knowledge Distillation for Fine-grained Visual Categorization

1 code implementation ICCV 2023 Renrong Shao, Wei zhang, Jianhua Yin, Jun Wang

Our approach utilizes an adversarial distillation framework with attention generator, mixed high-order attention distillation, and semantic feature contrast learning.

Data-free Knowledge Distillation Fine-Grained Visual Categorization +1

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

no code implementations14 Apr 2024 Lewei Yao, Renjie Pi, Jianhua Han, Xiaodan Liang, Hang Xu, Wei zhang, Zhenguo Li, Dan Xu

This is followed by a fine-tuning stage that leverages a small number of high-resolution samples to further enhance detection performance.

Dense Captioning Language Modelling +4

Fast Gradient Computation for Gromov-Wasserstein Distance

no code implementations13 Apr 2024 Wei zhang, ZiHao Wang, Jie Fan, Hao Wu, Yong Zhang

In this way, the original computational bottleneck is broken and the new entropic solution can be obtained with total quadratic time, which is almost optimal complexity.

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models

no code implementations12 Apr 2024 Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei zhang, Wei Chen

Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential weaknesses.

Map Optical Properties to Subwavelength Structures Directly via a Diffusion Model

no code implementations9 Apr 2024 Shijie Rao, Kaiyu Cui, Yidong Huang, Jiawei Yang, YaLi Li, Shengjin Wang, Xue Feng, Fang Liu, Wei zhang

The inverse design methods proposed for these subwavelength structures are vital to the development of new photonic devices.

Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS

1 code implementation9 Apr 2024 Afzal Ahmad, Linfeng Du, Zhiyao Xie, Wei zhang

We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins, making it possible to construct realistic NAS benchmarks for large-scale datasets.

Benchmarking Neural Architecture Search

A diffusion MRI tractography atlas for concurrent white matter mapping across Eastern and Western populations

no code implementations6 Apr 2024 Yijie Li, Wei zhang, Ye Wu, Li Yin, Ce Zhu, Yuqian Chen, Suheyla Cetin-Karayumak, Kang Ik K Cho, Leo R. Zekelman, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Lauren J. O'Donnell, Fan Zhang

However, a comprehensive investigation into WM fiber tracts between Eastern and Western populations is challenged due to the lack of a cross-population WM atlas and the large site-specific variability of dMRI data.

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

no code implementations26 Mar 2024 Jiacheng Zhang, Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

Additionally, we present a DepthGradient Projection (DGP) module to mitigate optimization conflicts caused by noisy depth supervision of pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients.

Monocular 3D Object Detection object-detection +1

Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection

1 code implementation ICCV 2023 Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, YingYing Li, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

To tackle the confirmation bias from incorrect pseudo labels of minority classes, the class-rebalancing sampling module resamples unlabeled data following the guidance of the gradient-based reweighting module.

object-detection Object Detection +1

SFOD: Spiking Fusion Object Detector

1 code implementation22 Mar 2024 Yimeng Fan, Wei zhang, Changsong Liu, Mingyang Li, Wenrui Lu

Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93. 7\% accuracy on the NCAR dataset.

Object object-detection +1

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation

no code implementations18 Mar 2024 Haochen Jiang, Yueming Xu, Yihan Zeng, Hang Xu, Wei zhang, Jianfeng Feng, Li Zhang

We model the geometric structure of the scene with occupancy representation and distill the pre-trained open vocabulary model into a 3D language field via volume rendering for zero-shot inference.

3D Reconstruction 3D Scene Reconstruction +3

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

no code implementations18 Mar 2024 Runhui Huang, Kaixin Cai, Jianhua Han, Xiaodan Liang, Renjing Pei, Guansong Lu, Songcen Xu, Wei zhang, Hang Xu

Specifically, an inter-layer attention module is designed to encourage information exchange and learning between layers, while a text-guided intra-layer attention module incorporates layer-specific prompts to direct the specific-content generation for each layer.

Image Generation Style Transfer

Affective Behaviour Analysis via Integrating Multi-Modal Knowledge

no code implementations16 Mar 2024 Wei zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu

Affective Behavior Analysis aims to facilitate technology emotionally smart, creating a world where devices can understand and react to our emotions as humans do.

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework

no code implementations13 Mar 2024 Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei zhang, Wenqiang Zhang

Contemporary Video Object Segmentation (VOS) approaches typically consist stages of feature extraction, matching, memory management, and multiple objects aggregation.

Management Semantic Segmentation +2

Query-guided Prototype Evolution Network for Few-Shot Segmentation

no code implementations11 Mar 2024 Runmin Cong, Hang Xiong, Jinpeng Chen, Wei zhang, Qingming Huang, Yao Zhao

To address this, we present the Query-guided Prototype Evolution Network (QPENet), a new method that integrates query features into the generation process of foreground and background prototypes, thereby yielding customized prototypes attuned to specific queries.

Segmentation

ClickVOS: Click Video Object Segmentation

no code implementations10 Mar 2024 Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei zhang, Wenqiang Zhang

To address these limitations, we propose the setting named Click Video Object Segmentation (ClickVOS) which segments objects of interest across the whole video according to a single click per object in the first frame.

Object Segmentation +3

Aligning Large Language Models for Controllable Recommendations

no code implementations8 Mar 2024 Wensheng Lu, Jianxun Lian, Wei zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie

Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems - systems that are conversational, explainable, and controllable.

Recommendation Systems

SMAUG: A Sliding Multidimensional Task Window-Based MARL Framework for Adaptive Real-Time Subtask Recognition

no code implementations4 Mar 2024 Wenjing Zhang, Wei zhang

Instead of making behavioral decisions directly from the exponentially expanding joint observational-action space, subtask-based multi-agent reinforcement learning (MARL) methods enable agents to learn how to tackle different subtasks.

Hierarchical Reinforcement Learning Multi-agent Reinforcement Learning +4

Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging

1 code implementation28 Feb 2024 Wei zhang, Hongcheng Guo, Anjie Le, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Shi Xu, Runqiang Zang, Liangfan Zheng, Bo Zhang

Log parsing, which entails transforming raw log messages into structured templates, constitutes a critical phase in the automation of log analytics.

Log Parsing

StaPep: an open-source tool for the structure prediction and feature extraction of hydrocarbon-stapled peptides

1 code implementation28 Feb 2024 Zhe Wang, Jianping Wu, Mengjun Zheng, Chenchen Geng, Borui Zhen, Wei zhang, Hui Wu, Zhengyang Xu, Gang Xu, Si Chen, Xiang Li

Many tools exist for extracting structural and physiochemical descriptors from linear peptides to predict their properties, but similar tools for hydrocarbon-stapled peptides are lacking. Here, we present StaPep, a Python-based toolkit designed for generating 2D/3D structures and calculating 21 distinct features for hydrocarbon-stapled peptides. The current version supports hydrocarbon-stapled peptides containing 2 non-standard amino acids (norleucine and 2-aminoisobutyric acid) and 6 nonnatural anchoring residues (S3, S5, S8, R3, R5 and R8). Then we established a hand-curated dataset of 201 hydrocarbon-stapled peptides and 384 linear peptides with sequence information and experimental membrane permeability, to showcase StaPep's application in artificial intelligence projects. A machine learning-based predictor utilizing above calculated features was developed with AUC of 0. 85, for identifying cell-penetrating hydrocarbon-stapled peptides. StaPep's pipeline spans data retrieval, cleaning, structure generation, molecular feature calculation, and machine learning model construction for hydrocarbon-stapled peptides. The source codes and dataset are freely available on Github: https://github. com/dahuilangda/stapep_package.

Retrieval

Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration

1 code implementation25 Feb 2024 Xin Mao, Feng-Lin Li, Huimin Xu, Wei zhang, Anh Tuan Luu

While Reinforcement Learning from Human Feedback (RLHF) significantly enhances the generation quality of Large Language Models (LLMs), recent studies have raised concerns regarding the complexity and instability associated with the Proximal Policy Optimization (PPO) algorithm, proposing a series of order-based calibration methods as viable alternatives.

Language Modelling

Do Large Language Models Understand Logic or Just Mimick Context?

no code implementations19 Feb 2024 Junbing Yan, Chengyu Wang, Jun Huang, Wei zhang

Over the past few years, the abilities of large language models (LLMs) have received extensive attention, which have performed exceptionally well in complicated scenarios such as logical reasoning and symbolic inference.

counterfactual In-Context Learning +1

Gaussian Process Neural Additive Models

1 code implementation19 Feb 2024 Wei zhang, Brian Barr, John Paisley

Deep neural networks have revolutionized many fields, but their black-box nature also occasionally prevents their wider adoption in fields such as healthcare and finance, where interpretable and explainable models are required.

Additive models Explainable Models

Pattern-wise Transparent Sequential Recommendation

no code implementations18 Feb 2024 Kun Ma, Cong Xu, Zeyuan Chen, Wei zhang

However, achieving both model transparency and recommendation performance simultaneously is challenging, especially for models that take the entire sequence of items as input without screening.

Decision Making Sequential Recommendation

Translating Images to Road Network:A Non-Autoregressive Sequence-to-Sequence Approach

2 code implementations13 Feb 2024 Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei zhang, Li Zhang

Instead, our work establishes a unified representation of both types of data domain by projecting both Euclidean and non-Euclidean data into an integer series called RoadNet Sequence.

Understanding the Role of Cross-Entropy Loss in Fairly Evaluating Large Language Model-based Recommendation

no code implementations9 Feb 2024 Cong Xu, Zhangchi Zhu, Jun Wang, Jianyong Wang, Wei zhang

Large language models (LLMs) have gained much attention in the recommendation community; some studies have observed that LLMs, fine-tuned by the cross-entropy loss with a full softmax, could achieve state-of-the-art performance already.

Language Modelling Large Language Model

EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

1 code implementation30 Jan 2024 Wei zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao

Multi-modal large language models (MLLMs) have demonstrated remarkable success in vision and visual-language tasks within the natural image domain.

Image Comprehension Instruction Following +2

Fortifying Ethical Boundaries in AI: Advanced Strategies for Enhancing Security in Large Language Models

no code implementations27 Jan 2024 Yunhong He, Jianling Qiu, Wei zhang, Zhengqing Yuan

Recent advancements in large language models (LLMs) have significantly enhanced capabilities in natural language processing and artificial intelligence.

Question Answering Text Generation

Contrastive Learning with Negative Sampling Correction

no code implementations13 Jan 2024 Lu Wang, Chao Du, Pu Zhao, Chuan Luo, Zhangchi Zhu, Bo Qiao, Wei zhang, QIngwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

To correct the negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).

Contrastive Learning Data Augmentation +2

Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

no code implementations9 Jan 2024 Xuzheng Yu, Chen Jiang, Wei zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.

Representation Learning Scene Recognition

Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach

no code implementations2 Jan 2024 Prince Aboagye, Yan Zheng, Junpeng Wang, Uday Singh Saini, Xin Dai, Michael Yeh, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Liang Wang, Wei zhang

The emergence of pre-trained models has significantly impacted Natural Language Processing (NLP) and Computer Vision to relational datasets.

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models

1 code implementation2 Jan 2024 Xinpeng Ding, Jinahua Han, Hang Xu, Xiaodan Liang, Wei zhang, Xiaomeng Li

BEV-InMLLM integrates multi-view, spatial awareness, and temporal semantics to enhance MLLMs' capabilities on NuInstruct tasks.

Autonomous Driving

Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform

no code implementations30 Dec 2023 Ting Zhu, Shufei Duan, Camille Dingam, HuiZhi Liang, Wei zhang

This algorithm effectively addresses the challenges of the imbalanced dataset and non-linearity in dysarthric speech and simultaneously provides a robust representation of the local pathological features of the vocal folds and tracts.

imbalanced classification

Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

1 code implementation30 Dec 2023 Junhao Shen, Hong Qian, Wei zhang, Aimin Zhou

The SCD framework incorporates the symbolic tree to explicably represent the complicated student-exercise interaction function, and utilizes gradient-based optimization methods to effectively learn the student and exercise parameters.

Attribute cognitive diagnosis

PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion

no code implementations27 Dec 2023 Guansong Lu, Yuanfan Guo, Jianhua Han, Minzhe Niu, Yihan Zeng, Songcen Xu, Zeyi Huang, Zhao Zhong, Wei zhang, Hang Xu

Current large-scale diffusion models represent a giant leap forward in conditional image synthesis, capable of interpreting diverse cues like text, human poses, and edges.

Computational Efficiency Denoising +1

SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields

no code implementations26 Dec 2023 Kaichen Zhou, Lanqing Hong, Enze Xie, Yongxin Yang, Zhenguo Li, Wei zhang

Although significant progress has been made in the field of 2D-based interactive editing, fine-grained 3D-based interactive editing remains relatively unexplored.

Interactive Segmentation Segmentation

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

no code implementations19 Dec 2023 Yuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang, Wei zhang

Numerous self-supervised learning paradigms, such as contrastive learning and masked image modeling, have been proposed to acquire powerful and general representations from unlabeled data.

Contrastive Learning Model Compression +1

Design, construction and evaluation of emotional multimodal pathological speech database

no code implementations14 Dec 2023 Ting Zhu, Shufei Duan, HuiZhi Liang, Wei zhang

The automatic recognition tested on speech and glottal data, with average accuracy of 78% for controls and 60% for patients in audio, while 51% for controls and 38% for patients in glottal data, indicating an influence of the disease on emotional expression.

Native Language Identification with Large Language Models

no code implementations13 Dec 2023 Wei zhang, Alexandre Salle

We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4.

Language Acquisition Native Language Identification

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

1 code implementation5 Dec 2023 Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei zhang, LiMin Wang

Now text-to-image foundation models are widely applied to various downstream image synthesis tasks, such as controllable image generation and image editing, while downstream video synthesis tasks are less explored for several reasons.

Image Generation Model Selection +3

Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning

1 code implementation1 Dec 2023 Jiajun Cui, Minghe Yu, Bo Jiang, Aimin Zhou, Jianyong Wang, Wei zhang

Knowledge tracing (KT) plays a crucial role in computer-aided education and intelligent tutoring systems, aiming to assess students' knowledge proficiency by predicting their future performance on new questions based on their past response records.

counterfactual Counterfactual Reasoning +1

Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper

no code implementations22 Nov 2023 Chengyu Wang, Junbing Yan, Wei zhang, Jun Huang

This paper delves into the pressing need in Parameter-Efficient Fine-Tuning (PEFT) for Large Language Models (LLMs).

Model Compression Position

Soft Random Sampling: A Theoretical and Empirical Analysis

no code implementations21 Nov 2023 Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei zhang, George Saon, Brian Kingsbury

Soft random sampling (SRS) is a simple yet effective approach for efficient training of large-scale deep neural networks when dealing with massive data.

Automatic Speech Recognition speech-recognition +1

Multi-Resolution Planar Region Extraction for Uneven Terrains

no code implementations21 Nov 2023 Yinghan Sun, Linfang Zheng, Hua Chen, Wei zhang

This paper studies the problem of extracting planar regions in uneven terrains from unordered point cloud measurements.

Computational Efficiency

Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta

no code implementations16 Nov 2023 Wei zhang, Dai Li, Chen Liang, Fang Zhou, Zhongke Zhang, Xuewei Wang, Ru Li, Yi Zhou, Yaning Huang, Dong Liang, Kai Wang, Zhangyuan Wang, Zhengxing Chen, Fenggang Wu, Minghai Chen, Huayu Li, Yunnan Wu, Zhan Shu, Mindi Yuan, Sri Reddy

To address these challenges, we present Scaling User Modeling (SUM), a framework widely deployed in Meta's ads ranking system, designed to facilitate efficient and scalable sharing of online user representation across hundreds of ads models.

Representation Learning

MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations

1 code implementation16 Nov 2023 Zhenglai Li, Chang Tang, Xinwang Liu, Changdong Li, Xianju Li, Wei zhang

How to capture the semantic variations associated with the changed and unchanged regions from the patch-level annotations to obtain promising change results is the critical challenge for the weakly supervised change detection task.

Change Detection

From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models

no code implementations12 Nov 2023 Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei zhang

Reasoning is a distinctive human capacity, enabling us to address complex problems by breaking them down into a series of manageable cognitive steps.

Language Modelling Logical Reasoning

VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis

no code implementations9 Nov 2023 Sen Wang, Wei zhang, Stefano Gasperini, Shun-Cheng Wu, Nassir Navab

Creating high-quality view synthesis is essential for immersive applications but continues to be problematic, particularly in indoor environments and for real-time deployment.

From Input to Output: A Multi-layer Knowledge Distillation Framework for Compressing Recommendation Models

no code implementations8 Nov 2023 Zhangchi Zhu, Wei zhang

In this paper, we decompose recommendation models into three layers, i. e., the input layer, the intermediate layer, and the output layer, and address deficiencies layer by layer.

Knowledge Distillation

Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation

1 code implementation7 Nov 2023 Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Wei zhang, Si Qin, Saravan Rajmohan, QIngwei Lin, Dongmei Zhang

To address these limitations, we introduce a novel thought prompting approach called "Everything of Thoughts" (XoT) to defy the law of "Penrose triangle of existing thought paradigms.

Decision Making

Temporal Treasure Hunt: Content-based Time Series Retrieval System for Discovering Insights

no code implementations5 Nov 2023 Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Yujie Fan, Vivian Lai, Junpeng Wang, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang

To facilitate this investigation, we introduce a CTSR benchmark dataset that comprises time series data from a variety of domains, such as motion, power demand, and traffic.

Retrieval Time Series +1

Ego-Network Transformer for Subsequence Classification in Time Series Data

no code implementations5 Nov 2023 Chin-Chia Michael Yeh, Huiyuan Chen, Yujie Fan, Xin Dai, Yan Zheng, Vivian Lai, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

The ego-networks of all subsequences collectively form a time series subsequence graph, and we introduce an algorithm to efficiently construct this graph.

Time Series Time Series Classification

Time Series Synthesis Using the Matrix Profile for Anonymization

no code implementations5 Nov 2023 Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

As a result, unmodified data mining tools can obtain near-identical performance on the synthesized time series as on the original time series.

Time Series

Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning

no code implementations2 Nov 2023 Yiran Li, Junpeng Wang, Prince Aboagye, Michael Yeh, Yan Zheng, Liang Wang, Wei zhang, Kwan-Liu Ma

On the one hand, by visually examining the captions automatically generated from language-image models for an image dataset, we gain deeper insights into the semantic underpinnings of the visual contents, unearthing data biases that may be entrenched within the dataset.

Caption Generation Efficient Exploration +1

Lightweight super resolution network for point cloud geometry compression

1 code implementation2 Nov 2023 Wei zhang, Dingquan Li, Ge Li, Wen Gao

This paper presents an approach for compressing point cloud geometry by leveraging a lightweight super-resolution network.

Decoder Point cloud reconstruction +2

BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities

1 code implementation23 Oct 2023 Binyu Zhao, Wei zhang, Zhaonian Zou

Collaborative perception enables agents to share complementary perceptual information with nearby agents.

Autonomous Driving

Learning Interpretable Rules for Scalable Data Representation and Classification

1 code implementation22 Oct 2023 Zhuo Wang, Wei zhang, Ning Liu, Jianyong Wang

Rule-based models, e. g., decision trees, are widely used in scenarios demanding high model interpretability for their transparent inner structures and good model expressivity.

Classification

Parallel compressive super-resolution imaging with wide field-of-view based on physics enhanced network

no code implementations20 Oct 2023 Xiao-Peng Jin, An-Dong Xiong, Wei zhang, Xiao-Qing Wang, Fan Liu, Chang-Heng Li, Xu-Ri Yao, Xue-Feng Liu, Qing Zhao

By training the network with the prior OTF of an arbitrary 128x128-pixel region and fine-tuning the network with other OTFs within rest regions of FOV, we realize both mask optimization and super-resolution imaging with up to 1020x1500 wide FOV.

Super-Resolution

Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks

no code implementations15 Oct 2023 Ziqiang Li, Pengfei Xia, Hong Sun, Yueqi Zeng, Wei zhang, Bin Li

In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.

Audio Classification Image Classification +2

HI-SLAM: Monocular Real-time Dense Mapping with Hybrid Implicit Fields

no code implementations7 Oct 2023 Wei zhang, Tiecheng Sun, Sen Wang, Qing Cheng, Norbert Haala

For global consistency, we propose an efficient Sim(3)-based pose graph bundle adjustment (PGBA) approach to run online loop closing and mitigate the pose and scale drift.

Simultaneous Localization and Mapping

An Efficient Content-based Time Series Retrieval System

no code implementations5 Oct 2023 Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Junpeng Wang, Vivian Lai, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang, Jeff M. Phillips

A Content-based Time Series Retrieval (CTSR) system is an information retrieval system for users to interact with time series emerged from multiple domains, such as finance, healthcare, and manufacturing.

Information Retrieval Retrieval +1

Toward a Foundation Model for Time Series Data

no code implementations5 Oct 2023 Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Audrey Der, Vivian Lai, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei zhang

A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised learning-based pre-training techniques, that can be adapted to various downstream tasks.

Self-Supervised Learning Time Series

Feature Interaction Aware Automated Data Representation Transformation

1 code implementation29 Sep 2023 Ehtesamul Azim, Dongjie Wang, Kunpeng Liu, Wei zhang, Yanjie Fu

Creating an effective representation space is crucial for mitigating the curse of dimensionality, enhancing model generalization, addressing data sparsity, and leveraging classical models more effectively.

Automated Feature Engineering Decision Making +4

Revealing the Power of Spatial-Temporal Masked Autoencoders in Multivariate Time Series Forecasting

no code implementations26 Sep 2023 Jiarui Sun, Yujie Fan, Chin-Chia Michael Yeh, Wei zhang, Girish Chowdhary

To address these issues, we propose Spatial-Temporal Masked Autoencoders (STMAE), an MTS forecasting framework that leverages masked autoencoders to enhance the performance of spatial-temporal baseline models.

Decoder Multivariate Time Series Forecasting +1

Graph-enhanced Optimizers for Structure-aware Recommendation Embedding Evolution

no code implementations24 Sep 2023 Cong Xu, Jun Wang, Jianyong Wang, Wei zhang

Embedding plays a critical role in modern recommender systems because they are virtual representations of real-world entities and the foundation for subsequent decision models.

Recommendation Systems

PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

1 code implementation21 Sep 2023 Shilin Yan, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei zhang

Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.

Autonomous Driving Segmentation +4

Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

no code implementations20 Sep 2023 Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.

Retrieval Video Retrieval

Multi-view Fuzzy Representation Learning with Rules based Model

2 code implementations20 Sep 2023 Wei zhang, Zhaohong Deng, Te Zhang, Kup-Sze Choi, Shitong Wang

Second, a new regularization method based on L_(2, 1)-norm regression is proposed to mine the consistency information between views, while the geometric structure of the data is preserved through the Laplacian graph.

Representation Learning

An Empirical Study of Attention Networks for Semantic Segmentation

no code implementations19 Sep 2023 Hao Guo, Hongbiao Si, Guilin Jiang, Wei zhang, Zhiyan Liu, Xuanyi Zhu, xulong Zhang, Yang Liu

What's more, various methods utilize attention in semantic segmentation, but the conclusion of these methods is lacking.

Segmentation Semantic Segmentation

SoccerNet 2023 Challenges Results

2 code implementations12 Sep 2023 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards are available on https://www. soccer-net. org.

Action Spotting Camera Calibration +3

HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving

no code implementations11 Sep 2023 Xinpeng Ding, Jianhua Han, Hang Xu, Wei zhang, Xiaomeng Li

For the first time, we leverage singular multimodal large language models (MLLMs) to consolidate multiple autonomous driving tasks from videos, i. e., the Risk Object Localization and Intention and Suggestion Prediction (ROLISP) task.

Autonomous Driving Object Localization

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation

no code implementations7 Sep 2023 Jiaxi Gu, Shicong Wang, Haoyu Zhao, Tianyi Lu, Xing Zhang, Zuxuan Wu, Songcen Xu, Wei zhang, Yu-Gang Jiang, Hang Xu

Conditioned on an initial video clip with a small number of frames, additional frames are iteratively generated by reusing the original latent features and following the previous diffusion process.

Action Recognition Decoder +4

Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models

1 code implementation27 Aug 2023 Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, Qizhi Pei, Jie Shao, Wei zhang

Generative pre-trained transformer (GPT) models have revolutionized the field of natural language processing (NLP) with remarkable performance in various tasks and also extend their power to multimodal domains.

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks

1 code implementation22 Aug 2023 Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei zhang, Tong Geng, Omer Khan, Caiwen Ding

Utilizing these principles, we formulated a kernel for sparse matrix multiplication (SpMM) in GCNs that employs block-level partitioning and combined warp strategy.

Computational Efficiency

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability

no code implementations ICCV 2023 Runhui Huang, Jianhua Han, Guansong Lu, Xiaodan Liang, Yihan Zeng, Wei zhang, Hang Xu

DiffDis first formulates the image-text discriminative problem as a generative diffusion process of the text embedding from the text encoder conditioned on the image.

Image Generation Zero-Shot Learning

Frequency Perception Network for Camouflaged Object Detection

2 code implementations17 Aug 2023 Runmin Cong, Mengyao Sun, Sanyi Zhang, Xiaofei Zhou, Wei zhang, Yao Zhao

Camouflaged object detection (COD) aims to accurately detect objects hidden in the surrounding environment.

Object object-detection +1

Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection

2 code implementations17 Aug 2023 Runmin Cong, Hongyu Liu, Chen Zhang, Wei zhang, Feng Zheng, Ran Song, Sam Kwong

By integrating complementary information from RGB image and depth map, the ability of salient object detection (SOD) for complex and challenging scenes can be improved.

object-detection RGB-D Salient Object Detection +1

SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

1 code implementation17 Aug 2023 Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei zhang, Yao Zhao, Sam Kwong

Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds.

Disentanglement Shadow Detection

Beyond Semantics: Learning a Behavior Augmented Relevance Model with Self-supervised Learning

1 code implementation10 Aug 2023 Zeyuan Chen, Wei Chen, Jia Xu, Zhongyi Liu, Wei zhang

Drawing inspiration from this, we devise a novel Behavior Augmented Relevance Learning model for Alipay Search (BARL-ASe) that leverages neighbor queries of target item and neighbor items of target query to complement target query-item semantic matching.

Self-Supervised Learning Semantic Similarity +1

Gaussian-based Probabilistic Deep Supervision Network for Noise-Resistant QoS Prediction

no code implementations3 Aug 2023 Ziliang Wang, Xiaohong Zhang, Sheng Huang, Wei zhang, Dan Yang, Meng Yan

Quality of Service (QoS) prediction is an essential task in recommendation systems, where accurately predicting unknown QoS values can improve user satisfaction.

Recommendation Systems

Dynamic Token-Pass Transformers for Semantic Segmentation

no code implementations3 Aug 2023 Yuang Liu, Qiang Zhou, Jing Wang, Fan Wang, Jun Wang, Wei zhang

Vision transformers (ViT) usually extract features via forwarding all the tokens in the self-attention layers from top to toe.

Segmentation Semantic Segmentation

Knowledge-aware Collaborative Filtering with Pre-trained Language Model for Personalized Review-based Rating Prediction

1 code implementation2 Aug 2023 Quanxiu Wang, Xinlei Cao, Jianyong Wang, Wei zhang

For the first issue, to utilize rich knowledge, KCF-PLM develops a transformer network to model the interactions of the extracted aspects w. r. t.

Collaborative Filtering Language Modelling

EmbeddingTree: Hierarchical Exploration of Entity Features in Embedding

no code implementations2 Aug 2023 Yan Zheng, Junpeng Wang, Chin-Chia Michael Yeh, Yujie Fan, Huiyuan Chen, Liang Wang, Wei zhang

The tool helps users discover nuance features of data entities, perform feature denoising/injecting in embedding training, and generate embeddings for unseen entities.

Denoising

Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

1 code implementation1 Aug 2023 Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei zhang, Hang Dong, Bo Qiao, QIngwei Lin, Saravan Rajmohan, Dongmei Zhang

To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first.

Learning and Evaluating Human Preferences for Conversational Head Generation

no code implementations20 Jul 2023 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

In this paper, we propose a novel learning-based evaluation metric named Preference Score (PS) for fitting human preference according to the quantitative evaluations across different dimensions.

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

3 code implementations CVPR 2023 Jiacheng Zhang, Xiangru Lin, Wei zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

Specifically, we propose a Stage-wise Hybrid Matching strategy that combines the one-to-many assignment and one-to-one assignment strategies to improve the training efficiency of the first stage and thus provide high-quality pseudo labels for the training of the second stage.

Object object-detection +3

Visual Analytics For Machine Learning: A Data Perspective Survey

no code implementations15 Jul 2023 Junpeng Wang, Shixia Liu, Wei zhang

The past decade has witnessed a plethora of works that leverage the power of visualization (VIS) to interpret machine learning (ML) models.

Contrastive Graph Pooling for Explainable Classification of Brain Networks

1 code implementation7 Jul 2023 Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei zhang, Wei Khang Jeremy Sim, Balázs Gulyás

Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions.

Classification

CityTrack: Improving City-Scale Multi-Camera Multi-Target Tracking by Location-Aware Tracking and Box-Grained Matching

no code implementations6 Jul 2023 Jincheng Lu, Xipeng Yang, Jin Ye, Yifu Zhang, Zhikang Zou, Wei zhang, Xiao Tan

Targets in urban traffic scenes often undergo occlusion, illumination changes, and perspective changes, making it difficult to associate targets across different cameras accurately.

Interactive Conversational Head Generation

no code implementations5 Jul 2023 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao

Based on ViCo and ViCo-X, we define three novel tasks targeting the interaction modeling during the face-to-face conversation: 1) responsive listening head generation making listeners respond actively to the speaker with non-verbal signals, 2) expressive talking head generation guiding speakers to be aware of listeners' behaviors, and 3) conversational head generation to integrate the talking/listening ability in one interlocutor.

Sentence Talking Head Generation

Understanding recent deep-learning techniques for identifying collective variables of molecular dynamics

no code implementations1 Jul 2023 Wei zhang, Christof Schütte

High-dimensional metastable molecular system can often be characterised by a few features of the system, i. e. collective variables (CVs).

Deep Equilibrium Multimodal Fusion

no code implementations29 Jun 2023 Jinhong Ni, Yalong Bai, Wei zhang, Ting Yao, Tao Mei

Multimodal fusion integrates the complementary information present in multiple modalities and has gained much attention recently.

Visual Question Answering (VQA)

A Theory of Complex Adaptive Learning Behavior in Complex Adaptive Systems and a Non-Localized Wave Equation in Quantum Mechanics

no code implementations27 Jun 2023 Leilei Shi, Xinshuai Guo, Jiuchang Wei, Wei zhang, Guocheng Wang, Bing-Hong Wang

Keywords: complex adaptive systems, complex adaptive learning, universal law, non-localized wave equation, interactively coherent entanglement, interactively coherent adaptation PACS: 89. 75.-k (Complex Systems); 89. 65. Gh (Economics, Econophysics, Financial Markets, Business and Management); 03. 65. Ud (Entanglement and Quantum Nonlocality)

A Collaborative Transfer Learning Framework for Cross-domain Recommendation

no code implementations26 Jun 2023 Wei zhang, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang

The disadvantage of the former is that the data from other domains is not utilized by a single domain model, while the latter leverage all the data from different domains, but the fine-tuned model of transfer learning may trap the model in a local optimum of the source domain, making it difficult to fit the target domain.

Click-Through Rate Prediction Recommendation Systems +1

FlowFace++: Explicit Semantic Flow-supervised End-to-End Face Swapping

no code implementations22 Jun 2023 Yu Zhang, Hao Zeng, Bowen Ma, Wei zhang, Zhimeng Zhang, Yu Ding, Tangjie Lv, Changjie Fan

The discriminator is shape-aware and relies on a semantic flow-guided operation to explicitly calculate the shape discrepancies between the target and source faces, thus optimizing the face swapping network to generate highly realistic results.

Decoder Face Swapping

Visual-Aware Text-to-Speech

no code implementations21 Jun 2023 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

Dynamically synthesizing talking speech that actively responds to a listening head is critical during the face-to-face interaction.

Speech Synthesis

CMLM-CSE: Based on Conditional MLM Contrastive Learning for Sentence Embeddings

no code implementations16 Jun 2023 Wei zhang, Xu Chen

Traditional comparative learning sentence embedding directly uses the encoder to extract sentence features, and then passes in the comparative loss function for learning.

Contrastive Learning Language Modelling +3

ScrollTimes: Tracing the Provenance of Paintings as a Window into History

no code implementations15 Jun 2023 Wei zhang, Wong Kam-Kwai, Yitian Chen, Ailing Jia, Luwei Wang, Jian-Wei Zhang, Lechao Cheng, Huamin Qu, Wei Chen

The study of cultural artifact provenance, tracing ownership and preservation, holds significant importance in archaeology and art history.

PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators

1 code implementation15 Jun 2023 Runmin Cong, Wenyu Yang, Wei zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong

Among existing UIE methods, Generative Adversarial Networks (GANs) based methods perform well in visual aesthetics, while the physical model-based methods have better scene adaptability.

Quantization UIE

Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation

1 code implementation14 Jun 2023 Xiao He, Chang Tang, Xinwang Liu, Wei zhang, Kun Sun, Jiangfeng Xu

S2ADet comprises a hyperspectral information decoupling (HID) module, a two-stream feature extraction network, and a one-stage detection head.

Object object-detection +1

A Proxy Attack-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks

no code implementations14 Jun 2023 Ziqiang Li, Hong Sun, Pengfei Xia, Beihao Xia, Xue Rui, Wei zhang, Qinglang Guo, Bin Li

This paper presents a Proxy attack-Free Strategy (PFS) designed to identify efficient poisoning samples based on individual similarity and ensemble diversity, effectively addressing the mentioned concern.

Active Learning Backdoor Attack

Approximate Maximum-Likelihood RIS-Aided Positioning

no code implementations13 Jun 2023 Wei zhang, Zhenni Wang, Wee Peng Tay

In this paper, we develop a RIS-aided positioning framework to locate a UE in environments where the LOS path may or may not be available.

E2E-LOAD: End-to-End Long-form Online Action Detection

1 code implementation ICCV 2023 Shuqiang Cao, Weixin Luo, Bairui Wang, Wei zhang, Lin Ma

Furthermore, we propose a novel and efficient inference mechanism that accelerates heavy spatial-temporal exploration.

Online Action Detection

NFTVis: Visual Analysis of NFT Performance

no code implementations5 Jun 2023 Fan Yan, Xumeng Wang, Ketian Mao, Wei zhang, Wei Chen

A non-fungible token (NFT) is a data unit stored on the blockchain.

Time Series

PDT: Pretrained Dual Transformers for Time-aware Bipartite Graphs

no code implementations2 Jun 2023 Xin Dai, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Chin-Chia Michael Yeh, Junpeng Wang, Liang Wang, Yan Zheng, Prince Osei Aboagye, Wei zhang

Pre-training on large models is prevalent and emerging with the ever-growing user-generated content in many machine learning application categories.

Contrastive Learning

Can LLMs like GPT-4 outperform traditional AI tools in dementia diagnosis? Maybe, but not today

no code implementations2 Jun 2023 Zhuo Wang, Rongzhen Li, Bowen Dong, Jie Wang, Xiuxing Li, Ning Liu, Chenhui Mao, Wei zhang, Liling Dong, Jing Gao, Jianyong Wang

In this paper, we explore the potential of LLMs such as GPT-4 to outperform traditional AI tools in dementia diagnosis.

Hybrid Driven Learning for Channel Estimation in Intelligent Reflecting Surface Aided Millimeter Wave Communications

no code implementations30 May 2023 Shuntian Zheng, Sheng Wu, Chunxiao Jiang, Wei zhang, Xiaojun Jing

Intelligent reflecting surfaces (IRS) have been proposed in millimeter wave (mmWave) and terahertz (THz) systems to achieve both coverage and capacity enhancement, where the design of hybrid precoders, combiners, and the IRS typically relies on channel state information.

Denoising