no code implementations • EMNLP 2021 • Sheng Zhang, Xin Zhang, Weiming Zhang, Anders Søgaard
Using data from English cloze tests, in which subjects also self-reported their gender, age, education, and race, we examine performance differences of pretrained language models across demographic groups, defined by these (protected) attributes.
no code implementations • 26 Jun 2025 • Yalun Dai, Yangyu Huang, Xin Zhang, Wenshan Wu, Chong Li, Wenhui Lu, Shijie Cao, Li Dong, Scarlett Li
This work introduces a general paradigm, DELT, for considering data efficacy in LM training, which highlights the significance of training data organization.
no code implementations • 25 Jun 2025 • Shen Tan, Xin Zhang, Liangxiu Han, Huaguo Huang, Han Wang
Accurate, cost-effective monitoring of plantation aboveground biomass (AGB) is crucial for supporting local livelihoods and carbon sequestration initiatives like the China Certified Emission Reduction (CCER) program.
no code implementations • 13 Jun 2025 • JinZhe Jiang, YaQian Zhao, Xin Zhang, Chen Li, Yunlong Yu, Hailing Liu
This paper explores the application of the parameter-shift rule (PSR) for computing gradients in unitary optical neural networks (UONNs).
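As a rough illustration of the parameter-shift rule mentioned in this abstract, the sketch below applies it to a toy expectation value. It is not the paper's UONN setup: the function `expectation`, the single rotation angle, and the closed form cos(theta) are assumptions chosen only to show how two shifted evaluations give an exact derivative when the output is sinusoidal in the parameter.

```python
import math


def expectation(theta: float) -> float:
    """Toy model (assumption): <Z> after an RX(theta) rotation of |0>, i.e. cos(theta)."""
    return math.cos(theta)


def parameter_shift_grad(f, theta: float, shift: float = math.pi / 2) -> float:
    """Derivative of f at theta from two shifted evaluations.

    Exact whenever f(theta) = a + b*cos(theta) + c*sin(theta), which is the
    form produced by a gate generated by a Pauli operator.
    """
    return (f(theta + shift) - f(theta - shift)) / (2.0 * math.sin(shift))


if __name__ == "__main__":
    theta = 0.3
    print("parameter-shift:", parameter_shift_grad(expectation, theta))
    print("analytic       :", -math.sin(theta))
```

Unlike a finite-difference estimate, the shift here does not need to be small, so the two evaluations are well separated and less sensitive to measurement noise.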
no code implementations • 11 Jun 2025 • Chen-Chia Chang, Wan-Hsuan Lin, Yikang Shen, Yiran Chen, Xin Zhang
Automation of analog topology design is crucial due to customized requirements of modern applications with heavily manual engineering efforts.
1 code implementation • 11 Jun 2025 • Chao-Hong Tan, Qian Chen, Wen Wang, Chong Deng, Qinglin Zhang, Luyao Cheng, Hai Yu, Xin Zhang, Xiang Lv, Tianyu Zhao, Chong Zhang, Yukun Ma, Yafeng Chen, Hui Wang, Jiaqing Liu, Jieping Ye
This paper presents OmniDRCA, a parallel speech-text foundation model based on joint autoregressive modeling, featuring dual-resolution speech representations and contrastive cross-modal alignment.
1 code implementation • 5 Jun 2025 • Yanzhao Zhang, Mingxin Li, Dingkun Long, Xin Zhang, Huan Lin, Baosong Yang, Pengjun Xie, An Yang, Dayiheng Liu, Junyang Lin, Fei Huang, Jingren Zhou
In this work, we introduce the Qwen3 Embedding series, a significant advancement over its predecessor, the GTE-Qwen series, in text embedding and reranking capabilities, built upon the Qwen3 foundation models.
1 code implementation • 4 Jun 2025 • Xiaomi LLM-Core Team, Zihao Yue, Zhenru Lin, YiFan Song, Weikun Wang, Shuhuai Ren, Shuhao Gu, Shicheng Li, Peidian Li, Liang Zhao, Lei LI, Kainan Bao, Hao Tian, Hailin Zhang, Gang Wang, Dawei Zhu, Cici, Chenhong He, Bowen Ye, Bowen Shen, Zihan Zhang, Zihan Jiang, Zhixian Zheng, Zhichao Song, Zhenbo Luo, Yue Yu, Yudong Wang, Yuanyuan Tian, Yu Tu, Yihan Yan, Yi Huang, Xu Wang, Xinzhe Xu, Xingchen Song, Xing Zhang, Xing Yong, Xin Zhang, Xiangwei Deng, Wenyu Yang, Wenhan Ma, Weiwei Lv, Weiji Zhuang, Wei Liu, Sirui Deng, Shuo Liu, Shimao Chen, Shihua Yu, Shaohui Liu, Shande Wang, Rui Ma, Qiantong Wang, Peng Wang, Nuo Chen, Menghang Zhu, Kangyang Zhou, Kang Zhou, Kai Fang, Jun Shi, Jinhao Dong, Jiebao Xiao, Jiaming Xu, Huaqiu Liu, Hongshen Xu, Heng Qu, Haochen Zhao, Hanglong Lv, Guoan Wang, Duo Zhang, Dong Zhang, Di Zhang, Chong Ma, Chang Liu, Can Cai, Bingquan Xia
We open-source MiMo-VL-7B-SFT and MiMo-VL-7B-RL, two powerful vision-language models delivering state-of-the-art performance in both general visual understanding and multimodal reasoning.
1 code implementation • 30 May 2025 • Xiaorui Wu, Xiaofeng Mao, Fei Li, Xin Zhang, Xuanhong Li, Chong Teng, Donghong Ji, Zhuang Li
Large Language Models (LLMs) excel in various natural language processing tasks but remain vulnerable to generating harmful content or being exploited for malicious purposes.
no code implementations • 29 May 2025 • Xiaorui Wu, Xiaofeng Mao, Fei Li, Xin Zhang, Xiaolu Zhang, Jun Zhou, Yuxiang Peng, Li Zheng, Chong Teng, Donghong Ji, Zhuang Li
Large language models (LLMs) frequently refuse to respond to pseudo-malicious instructions: semantically harmless input queries triggering unnecessary LLM refusals due to conservative safety alignment, significantly impairing user experience.
1 code implementation • 25 May 2025 • Shengdong Han, Shangdong Yang, Xin Zhang, YuXuan Li, Xiang Li, Jian Yang, Ming-Ming Cheng, Yimian Dai
Resolving closely-spaced small targets in dense clusters presents a significant challenge in infrared imaging, as the overlapping signals hinder precise determination of their quantity, sub-pixel positions, and radiation intensities.
no code implementations • 23 May 2025 • Licheng Pan, Yongqi Tong, Xin Zhang, Xiaolu Zhang, Jun Zhou, Zhixuan Chu
This approach not only provides a more precise and interpretable view of model safety decisions but also seamlessly extends to multilingual scenarios. We have explored the safety decision boundaries of various LLMs and constructed the MORBench evaluation set to facilitate robust assessment of model safety and helpfulness across multiple languages.
1 code implementation • 22 May 2025 • Runyang You, Yongqi Li, Xinyu Lin, Xin Zhang, Wenjie Wang, Wenjie Li, Liqiang Nie
To address these issues, we propose \name, a unified large recommender model with intrinsic reasoning capabilities.
1 code implementation • 21 May 2025 • Can Rong, Xin Zhang, Yanxin Xi, Hongjie Sui, Jingtao Ding, Yong Li
Surprisingly, we find that satellite imagery, publicly available across the globe, contains rich urban semantic signals to support high-quality OD flow generation, with over 98% expressiveness of traditional multisource hard-to-collect urban sociodemographic, economics, land use, and point of interest data.
no code implementations • 19 May 2025 • Wenhao Zhu, Yuhang Xie, Guojie Song, Xin Zhang
The rapid evolution of large language models (LLMs) has revolutionized various fields, including the identification and discovery of human values within text data.
no code implementations • 18 May 2025 • Junhao Liu, Haonan Yu, Xin Zhang
With large language models (LLMs) becoming increasingly prevalent in various applications, the need for interpreting their predictions has become a critical challenge.
no code implementations • 16 May 2025 • Xin Zhang, Ziruo Zhang, Jiawei Du, Zuozhu Liu, Joey Tianyi Zhou
Multimodal Dataset Distillation (MDD) seeks to condense large-scale image-text datasets into compact surrogates while retaining their effectiveness for cross-modal learning.
no code implementations • 16 May 2025 • Yuhang Liu, Yingxue Zhang, Xin Zhang, Ling Tian, Yanhua Li, Jun Luo
Understanding and predicting urban dynamics is crucial for managing transportation systems, optimizing urban planning, and enhancing public services.
1 code implementation • 13 May 2025 • Haoran Ye, Jing Jin, Yuhang Xie, Xin Zhang, Guojie Song
The rapid advancement of large language models (LLMs) has outpaced traditional evaluation methodologies.
1 code implementation • 12 May 2025 • Xiaomi LLM-Core Team, Bingquan Xia, Bowen Shen, Cici, Dawei Zhu, Di Zhang, Gang Wang, Hailin Zhang, Huaqiu Liu, Jiebao Xiao, Jinhao Dong, Liang Zhao, Peidian Li, Peng Wang, Shihua Yu, Shimao Chen, Weikun Wang, Wenhan Ma, Xiangwei Deng, Yi Huang, YiFan Song, Zihan Jiang, Bowen Ye, Can Cai, Chenhong He, Dong Zhang, Duo Zhang, Guoan Wang, Hao Tian, Haochen Zhao, Heng Qu, Hongshen Xu, Jun Shi, Kainan Bao, Qingkai Fang, Kang Zhou, Kangyang Zhou, Lei LI, Menghang Zhu, Nuo Chen, Qiantong Wang, Shaohui Liu, Shicheng Li, Shuhao Gu, Shuhuai Ren, Shuo Liu, Sirui Deng, Weiji Zhuang, Weiwei Lv, Wenyu Yang, Xin Zhang, Xing Yong, Xing Zhang, Xingchen Song, Xinzhe Xu, Xu Wang, Yihan Yan, Yu Tu, Yuanyuan Tian, Yudong Wang, Yue Yu, Zhenru Lin, Zhichao Song, Zihao Yue
We present MiMo-7B, a large language model born for reasoning tasks, with optimization across both pre-training and post-training stages.
1 code implementation • 30 Apr 2025 • Jiajia Li, Xinda Qi, Seyed Hamidreza Nabaei, Meiqi Liu, Dong Chen, Xin Zhang, Xunyuan Yin, Zhaojian Li
Through this review, we aim to provide insights into how these diverse 3D reconstruction techniques can be effectively leveraged for automated and high-throughput plant phenotyping, contributing to the next generation of agricultural technology.
no code implementations • 23 Apr 2025 • Shiyao Lv, Xin Zhang, Xingqun Zhan
Robust GNSS positioning in urban environments is still plagued by multipath effects, particularly due to the complex signal propagation induced by ubiquitous surfaces with varied radio frequency reflectivities.
no code implementations • 21 Apr 2025 • Zijin Yang, Xin Zhang, Kejiang Chen, Kai Zeng, Qiyi Yao, Han Fang, Weiming Zhang, Nenghai Yu
We propose a double-channel design that leverages pseudorandom error-correcting codes to encode the random seed required for watermark pseudorandomization, achieving performance-lossless watermarking under a fixed watermark key and overcoming key management challenges.
no code implementations • 18 Apr 2025 • Feiyang Li, Peng Fang, Zhan Shi, Arijit Khan, Fang Wang, Dan Feng, WeiHao Wang, Xin Zhang, Yongjian Cui
Chain-of-thought (CoT) reasoning boosts large language models' (LLMs) performance on complex tasks but faces two key limitations: a lack of reliability when solely relying on LLM-generated reasoning chains and interference from natural language reasoning steps with the models' inference process, also known as the inference logic of LLMs.
no code implementations • 17 Apr 2025 • Fanyi Yang, Jianfeng Liu, Xin Zhang, Haoyu Liu, Xixin Cao, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang
Instruction tuning has enabled large language models (LLMs) to achieve remarkable performance, but its success heavily depends on the availability of large-scale, high-quality instruction-response pairs.
no code implementations • 15 Apr 2025 • Zhihao Xu, Yongqi Tong, Xin Zhang, Jun Zhou, Xiting Wang
Multi-objective preference alignment in language models often encounters a challenging trade-off: optimizing for one human preference (e.g., helpfulness) frequently compromises others (e.g., harmlessness) due to the inherent conflicts between competing objectives.
1 code implementation • 14 Apr 2025 • Chenghao Xiao, Isaac Chung, Imene Kerboua, Jamie Stirling, Xin Zhang, Márton Kardos, Roman Solomatin, Noura Al Moubayed, Kenneth Enevoldsen, Niklas Muennighoff
We introduce the Massive Image Embedding Benchmark (MIEB) to evaluate the performance of image and image-text embedding models across the broadest spectrum to date.
3 code implementations • 14 Apr 2025 • Bin Ren, Hang Guo, Lei Sun, Zongwei Wu, Radu Timofte, Yawei Li, Yao Zhang, Xinning Chai, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Li Song, Hongyuan Yu, Pufan Xu, Cheng Wan, Zhijuan Huang, Peng Guo, Shuyuan Cui, Chenjun Li, Xuehai Hu, Pan Pan, Xin Zhang, Heng Zhang, Qing Luo, Linyan Jiang, Haibo Lei, Qifang Gao, Yaqing Li, Weihua Luo, Tsing Li, Qing Wang, Yi Liu, Yang Wang, Hongyu An, Liou Zhang, Shijie Zhao, Lianhong Song, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Jing Wei, Mengyang Wang, Ruilong Guo, Qian Wang, Qingliang Liu, Yang Cheng, Davinci, Enxuan Gu, Pinxin Liu, Yongsheng Yu, Hang Hua, Yunlong Tang, Shihao Wang, ZhiYu Zhang, Yukun Yang, Jiyu Wu, Jiancheng Huang, Yifan Liu, Yi Huang, Shifeng Chen, Rui Chen, Yi Feng, Mingxi Li, Cailu Wan, XiangJi Wu, Zibin Liu, Jinyang Zhong, Kihwan Yoon, Ganzorig Gankhuyag, Shengyun Zhong, Mingyang Wu, Renjie Li, Yushen Zuo, Zhengzhong Tu, Zongang Gao, Guannan Chen, Yuan Tian, Wenhui Chen, Weijun Yuan, Zhan Li, Yihang Chen, Yifan Deng, Ruting Deng, Yilin Zhang, Huan Zheng, Yanyan Wei, Wenxuan Zhao, Suiyi Zhao, Fei Wang, Kun Li, Yinggan Tang, Mengjie Su, Jae-Hyeon Lee, Dong-Hyeop Son, Ui-Jin Choi, Tiancheng Shao, Yuqing Zhang, Mengcheng Ma, Donggeun Ko, Youngsang Kwak, Jiun Lee, Jaehwa Kwak, YuXuan Jiang, Qiang Zhu, Siyue Teng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull, Jing Hu, Hui Deng, Xuan Zhang, Lin Zhu, Qinrui Fan, Weijian Deng, Junnan Wu, Wenqin Deng, Yuquan Liu, Zhaohong Xu, Jameer Babu Pinjari, Kuldeep Purohit, Zeyu Xiao, Zhuoyuan Li, Surya Vashisth, Akshay Dudhane, Praful Hambarde, Sachin Chaudhary, Satya Naryan Tazi, Prashant Patil, Santosh Kumar Vipparthi, Subrahmanyam Murala, Wei-Chen Shen, I-Hsiang Chen, Yunzhe Xu, Chen Zhao, Zhizhou Chen, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Alejandro Merino, Bruno Longarela, Javier Abad, Marcos V. Conde, Simone Bianco, Luca Cogo, Gianmarco Corti
This paper presents a comprehensive review of the NTIRE 2025 Challenge on Single-Image Efficient Super-Resolution (ESR).
no code implementations • 13 Apr 2025 • Xin Zhang, Wensheng Lin, Lixin Li, Zhu Han, Tad Matsumoto
Orthogonal time frequency space (OTFS) modulation is widely acknowledged as a prospective waveform for future wireless communication networks. To provide insights for the practical system design, this paper analyzes the outage probability of OTFS modulation with finite blocklength. To begin with, we present the system model and formulate the analysis of outage probability for OTFS with finite blocklength as an equivalent problem of calculating the outage probability with finite blocklength over parallel additive white Gaussian noise (AWGN) channels. Subsequently, we apply the equivalent noise approach to derive a lower bound on the outage probability of OTFS with finite blocklength under both average power allocation and water-filling power allocation strategies, respectively. Finally, the lower bounds of the outage probability are determined using the Monte-Carlo method for the two power allocation strategies. The impact of the number of resolvable paths and coding rates on the outage probability is analyzed, and the simulation results are compared with the theoretical lower bounds.
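The two power allocation strategies named in this abstract can be sketched for generic parallel AWGN channels as follows. This is a minimal illustration, not the paper's OTFS model or its outage analysis: the channel gains, the power budget, and the function names `average_allocation`, `water_filling`, and `sum_rate` are assumptions introduced here, and the rate is the usual infinite-blocklength Shannon rate rather than a finite-blocklength quantity.

```python
import math


def average_allocation(gains, total_power):
    """Split the power budget equally across all sub-channels."""
    return [total_power / len(gains) for _ in gains]


def water_filling(gains, total_power):
    """Powers maximizing sum(log2(1 + g*p)) subject to sum(p) == total_power.

    Assumes every gain is positive and total_power > 0.
    """
    # Inverse gains act as "floor heights"; the lowest floors fill first.
    floors = sorted(1.0 / g for g in gains)
    # Shrink the active set until the water level clears its highest floor.
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            break
    return [max(level - 1.0 / g, 0.0) for g in gains]


def sum_rate(gains, powers):
    """Sum of Shannon rates (bits per channel use) over the sub-channels."""
    return sum(math.log2(1.0 + g * p) for g, p in zip(gains, powers))


if __name__ == "__main__":
    gains = [2.0, 1.0, 0.1]  # hypothetical effective channel gains
    budget = 1.0             # hypothetical total transmit power
    for name, alloc in (("average", average_allocation), ("water-filling", water_filling)):
        powers = alloc(gains, budget)
        print(name, [round(p, 3) for p in powers], round(sum_rate(gains, powers), 3))
```

Water-filling gives no power to the weakest sub-channel in this example, which is why it attains a higher sum rate than the equal split at the same budget.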
1 code implementation • 10 Apr 2025 • Shihong Gao, Xin Zhang, Yanyan Shen, Lei Chen
In this paper, we introduce Apt-Serve, a scalable framework designed to enhance effective throughput in LLM inference serving.
no code implementations • 10 Apr 2025 • Kexin Zhang, Xin Zhang, Lixin Li, Wensheng Lin, Wenchi Cheng, Qinghe Du
However, the complex channel conditions in high-speed mobile scenarios significantly impact the reliability and efficiency of traditional communication systems.
1 code implementation • CVPR 2025 • Xin Zhang, Robby T. Tan
However, existing DGSS methods often rely exclusively on either VFMs or VLMs, overlooking their complementary strengths.
no code implementations • 1 Apr 2025 • Xin Zhang, Keren Fu, Qijun Zhao
The Segment Anything Model 2 (SAM2), a prompt-guided video foundation model, has performed remarkably well in video object segmentation, drawing significant attention in the community.
Ranked #2 on Camouflaged Object Segmentation on MoCA-Mask
no code implementations • 31 Mar 2025 • Xin Zhang, Siting Huang, Xiangyang Luo, Yifan Xie, Weijiang Yu, Heng Chang, Fei Ma, Fei Yu
The Text-to-Mask diffusion model provides diversity and flexibility to the framework, while the semantic-aware face editing model ensures controllability of the framework.
no code implementations • 30 Mar 2025 • Xiangyang Luo, Junhao Cheng, Yifan Xie, Xin Zhang, Tao Feng, Zhou Liu, Fei Ma, Fei Yu
Open-ended story visualization is a challenging task that involves generating coherent image sequences from a given storyline.
no code implementations • 18 Mar 2025 • Chunyu Yang, Shengben Bi, Yihui Xu, Xin Zhang
With the increasing demand for efficient and flexible robotic exploration solutions, Reinforcement Learning (RL) is becoming a promising approach in the field of autonomous robotic exploration.
no code implementations • 10 Mar 2025 • Mengzhe Hei, Zhouran Zhang, Qingbao Liu, Yan Pan, Xiang Zhao, Yongqian Peng, Yicong Ye, Xin Zhang, Shuxin Bai
Extracting high-quality structured information from scientific literature is crucial for advancing material design through data-driven methods.
no code implementations • 9 Mar 2025 • Wei Li, Xin Zhang, Zhongxin Guo, Shaoguang Mao, Wen Luo, Guangyue Peng, Yangyu Huang, Houfeng Wang, Scarlett Li
Implementing new features in repository-level codebases is a crucial application of code generation models.
1 code implementation • 9 Mar 2025 • Mingxiang Cao, Weiying Xie, Xin Zhang, Jiaqing Zhang, Kai Jiang, Jie Lei, Yunsong Li
Multi-modal fusion holds great promise for integrating information from different modalities.
Computational Efficiency
Hyperspectral Image Classification
no code implementations • 6 Mar 2025 • Xin Zhang, Qiyu Wei, Yingjie Zhu, Linhai Zhang, Deyu Zhou, Sophia Ananiadou
In this paper, we introduce SynGraph, a novel framework designed to address data sparsity in sentiment analysis on streaming reviews.
1 code implementation • 5 Mar 2025 • Shimao Zhang, Xiao Liu, Xin Zhang, Junxiao Liu, Zheheng Luo, ShuJian Huang, Yeyun Gong
Human-annotated preference data is used for training to further improve LLMs' performance, which is constrained by the upper limit of human performance.
no code implementations • 4 Mar 2025 • Jie Wu, Haoling Li, Xin Zhang, Jianwen Luo, Yangyu Huang, Ruihang Chu, Yujiu Yang, Scarlett Li
Preference learning enhances Code LLMs beyond supervised fine-tuning by leveraging relative quality comparisons.
no code implementations • 25 Feb 2025 • Xin Zhang, Liang Bai, Xian Yang, Jiye Liang
Low-Rank Adaptation (LoRA) is an efficient fine-tuning method that has been extensively applied in areas such as natural language processing and computer vision.
no code implementations • 24 Feb 2025 • Xin Zhang, Liangxiu Han, Stephen White, Saad Hassan, Philip A Kalra, James Ritchie, Carl Diver, Jennie Shorley
We introduce TabulaTime, a multimodal deep learning framework that enhances ACS risk prediction by combining clinical risk factors with air pollution data.
no code implementations • 20 Feb 2025 • Mingfu Liang, Xi Liu, Rong Jin, Boyang Liu, Qiuling Suo, Qinghai Zhou, Song Zhou, Laming Chen, Hua Zheng, Zhiyuan Li, Shali Jiang, Jiyan Yang, Xiaozhen Xia, Fan Yang, Yasmine Badr, Ellie Wen, Shuyu Xu, Hansey Chen, Zhengyu Zhang, Jade Nie, Chunzhi Yang, Zhichen Zeng, Weilin Zhang, Xingliang Huang, Qianru Li, Shiquan Wang, Evelyn Lyu, Wenjing Lu, Rui Zhang, Wenjun Wang, Jason Rudy, Mengyue Hang, Kai Wang, Yinbin Ma, Shuaiwen Wang, Sihan Zeng, Tongyi Tang, Xiaohan Wei, Longhao Jin, Jamey Zhang, Marcus Chen, Jiayi Xu, Angie Huang, Xihuan Zeng, Chi Zhang, Zhengli Zhao, Jared Yang, Qiang Jin, Xian Chen, Amit Anand Amlesahwaram, Lexi Song, Liang Luo, Yuchen Hao, Nan Xiao, Yavuz Yetim, Luoshang Pan, Gaoxiang Liu, Yuxi Hu, Yuzhen Huang, Jackie Xu, Rich Zhu, Xin Zhang, Yiqun Liu, Hang Yin, Yuxin Chen, Buyun Zhang, Xiaoyi Liu, Xingyuan Wang, Wenguang Mao, Zhijing Li, Zhehui Zhou, Feifan Gu, Qin Huang, Chonglin Sun, Nancy Yu, Shuo Gu, Shupin Mao, Benjamin Au, Jingzheng Qin, Peggy Yao, Jae-Woo Choi, Bin Gao, Ernest Wang, Lei Zhang, Wen-Yen Chen, Ted Lee, Jay Zha, Yi Meng, Alex Gong, Edison Gao, Alireza Vahdatpour, Yiping Han, Yantao Yao, Toshinari Kureha, Shuo Chang, Musharaf Sultan, John Bocharov, Sagar Chordia, Xiaorui Gan, Peng Sun, Rocky Liu, Bo Long, Wenlin Chen, Santanu Kolay, Huayu Li
Second, large-volume data arrive in a streaming mode with data distributions dynamically shifting, as new users/ads join and existing users/ads leave the system.
1 code implementation • 18 Feb 2025 • Xin Zhang, Ziqi Dai, Yongqi Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Jun Yu, Wenjie Li, Min Zhang
In this work, we introduce the text-image interleaved retrieval (TIIR) task, where the query and document are interleaved text-image sequences, and the model is required to understand the semantics from the interleaved context for effective retrieval.
no code implementations • 17 Feb 2025 • Guanghao Zhou, Panjia Qiu, Mingyuan Fan, Cen Chen, Mingyuan Chu, Xin Zhang, Jun Zhou
We define jailbreak attacks as an optimization problem within the embedding space of masked language models.
no code implementations • 16 Feb 2025 • Haonan Yu, Junhao Liu, Xin Zhang
Anchors is a popular local model-agnostic explanation technique whose applicability is limited by its computational inefficiency.
1 code implementation • 5 Feb 2025 • Bo Wen, Xin Zhang
This paper presents SOLOMON, a novel Neuro-inspired Large Language Model (LLM) Reasoning Network architecture that enhances the adaptability of foundation models for domain-specific applications.
no code implementations • 4 Feb 2025 • Haoran Ye, Tianze Zhang, Yuhang Xie, Liyuan Zhang, Yuanyi Ren, Xin Zhang, Guojie Song
Despite growing efforts in evaluating, understanding, and aligning LLM values, a psychologically grounded LLM value system remains underexplored.
no code implementations • 23 Jan 2025 • Xin Zhang, Weiliang Li, Rui Li, Zihang Fu, Tongyi Tang, Zhengyu Zhang, Wen-Yen Chen, Nima Noorshams, Nirav Jasapara, Xiaowen Ding, Ellie Wen, Xue Feng
In the realm of online advertising, optimizing conversions is crucial for delivering relevant products to users and enhancing business outcomes.
1 code implementation • 17 Jan 2025 • Tao Shan, Xin Zhang, Di wu
In this paper, we present a graph neural networks (GNNs)-based fast solver (GraphSolver) for solving combined field integral equations (CFIEs) of 3D conducting bodies.
3 code implementations • CVPR 2025 • Xin Zhang, Xue Yang, YuXuan Li, Jian Yang, Ming-Ming Cheng, Xiang Li
Our approach can effectively improve the performance of existing state-of-the-art weakly supervised methods and even surpasses fully supervised models on existing optical benchmarks (i.e., DOTA-v1.0 dataset).
no code implementations • 8 Jan 2025 • Yaoxiang Wang, Haoling Li, Xin Zhang, Jie Wu, Xiao Liu, Wenxiang Hu, Zhongxin Guo, Yangyu Huang, Ying Xin, Yujiu Yang, Jinsong Su, Qi Chen, Scarlett Li
Effective instruction tuning is indispensable for optimizing code LLMs, aligning model behavior with user expectations and enhancing model performance in real-world applications.
no code implementations • CVPR 2025 • Haochen Li, Rui Zhang, Hantao Yao, Xin Zhang, Yifan Hao, Xinkai Song, Shaohui Peng, Yongwei Zhao, Chen Zhao, Yanjun Wu, Ling Li
Domain adaptive object detection (DAOD) aims to generalize detectors trained on an annotated source domain to an unlabelled target domain.
no code implementations • CVPR 2025 • Xin Zhang, Yanzhao Zhang, Wen Xie, Mingxin Li, Ziqi Dai, Dingkun Long, Pengjun Xie, Meishan Zhang, Wenjie Li, Min Zhang
Last, we provide in-depth analyses of model scaling and training strategies, and perform ablation studies on both the model and synthetic data.
no code implementations • CVPR 2025 • Yifan Wang, Jian Zhao, Zhaoxin Fan, Xin Zhang, Xuecheng Wu, Yudian Zhang, Lei Jin, Xinyue Li, Gang Wang, Mengxi Jia, Ping Hu, Zheng Zhu, Xuelong Li
To benchmark this task, we introduce the TDUAV dataset, the largest dataset for joint UAV tracking and intent understanding, featuring 1,328 challenging video sequences, over 163K annotated thermal frames, and 3K VQA pairs.
no code implementations • 22 Dec 2024 • Xin Zhang, Yanzhao Zhang, Wen Xie, Mingxin Li, Ziqi Dai, Dingkun Long, Pengjun Xie, Meishan Zhang, Wenjie Li, Min Zhang
Last, we provide in-depth analyses of model scaling and training strategies, and perform ablation studies on both the model and synthetic data.
1 code implementation • 19 Dec 2024 • QiHao Zhao, Yangyu Huang, Tengchao Lv, Lei Cui, Qinzheng Sun, Shaoguang Mao, Xin Zhang, Ying Xin, Qiufeng Yin, Scarlett Li, Furu Wei
This benchmark reassesses LLMs' understanding of world knowledge by averting both unintentional and malicious data leakage.
no code implementations • 11 Dec 2024 • Yifan Xie, Tao Feng, Xin Zhang, Xiangyang Luo, Zixuan Guo, Weijiang Yu, Heng Chang, Fei Ma, Fei Richard Yu
Furthermore, we integrate the audio-point enhancement module, which not only ensures the synchronization of the audio signal with the corresponding lip point cloud within the feature space, but also facilitates a deeper understanding of the interrelations among cross-modal conditional features.
no code implementations • 10 Dec 2024 • Wan Jiang, He Wang, Xin Zhang, Dan Guo, Zhaoxin Fan, Yunfeng Diao, Richang Hong
To fill this gap, we first examine the current 'gold standard' in Machine Unlearning (MU), i.e., re-training the model after removing the undesirable training data, and find it does not work in SGMs.
no code implementations • 2 Dec 2024 • Xi Guo, Chenjing Ding, Haoxuan Dou, Xin Zhang, Weixuan Tang, Wei Wu
Comprehensive experiments in multiple datasets validate InfinityDrive's ability to generate complex and varied scenarios, highlighting its potential as a next-generation driving world model built for the evolving demands of autonomous driving.
no code implementations • 26 Nov 2024 • Jing-Yang Wei, Hao Huang, Xin Zhang, De-Mao Ye, Yi Li, Le Wang, Yao-Guang Ma, Yang-Hui Li
To provide a lightweight and cost-effective solution for the long-wave infrared imaging using a singlet, we develop a camera by integrating a High-Frequency-Enhancing Cycle-GAN neural network into a metalens imaging system.
no code implementations • 21 Nov 2024 • Ming Zhao, Xin Zhang, André Kaup
Detecting ships in synthetic aperture radar (SAR) images is challenging due to strong speckle noise, complex surroundings, and varying scales.
no code implementations • 21 Nov 2024 • Zheheng Luo, Xin Zhang, Xiao Liu, Haoling Li, Yeyun Gong, Chen Qi, Peng Cheng
To evaluate the effectiveness of Velocitune, we conduct experiments in a reasoning-focused dataset with CodeLlama, as well as in a corpus specialised for system command generation with Llama3 and Mistral.
1 code implementation • 19 Nov 2024 • Dimple Vijay Kochar, Hanrui Wang, Anantha Chandrakasan, Xin Zhang
Existing automation efforts using methods like Bayesian Optimization (BO) and Reinforcement Learning (RL) are sub-optimal and costly to generalize across different topologies and technology nodes.
no code implementations • 16 Nov 2024 • Sizhe Wang, Yongqi Tong, Hengyuan Zhang, Dawei Li, Xin Zhang, Tianlong Chen
Building on this, we further propose Balanced Preference Optimization (BPO), designed to dynamically augment the knowledge depth of each sample.
no code implementations • 15 Nov 2024 • Jiaqi Wang, Huan Zhao, Zhenyuan Yang, Peng Shu, JunHao Chen, Haobo Sun, Ruixi Liang, Shixin Li, Pengcheng Shi, Longjun Ma, Zongjia Liu, Zhengliang Liu, Tianyang Zhong, Yutong Zhang, Chong Ma, Xin Zhang, Tuo Zhang, Tianli Ding, Yudan Ren, Tianming Liu, Xi Jiang, Shu Zhang
In this paper, we review legal testing methods based on Large Language Models (LLMs), using the OPENAI o1 model as a case study to evaluate the performance of large models in applying legal provisions.
no code implementations • 11 Nov 2024 • Xin Zhang, Victor S. Sheng
This paper provides an in-depth analysis of Wave Network, a novel token representation method derived from the Wave Network, designed to capture both global and local semantics of input text through wave-inspired complex vectors.
no code implementations • 7 Nov 2024 • Xin Zhang, Victor S. Sheng
Explainability is an essential reason limiting the application of neural networks in many vital fields.
no code implementations • 7 Nov 2024 • Xin Zhang, Victor S. Sheng
Neuro-symbolic AI is an effective method for improving the overall performance of AI models by combining the advantages of neural networks and symbolic learning.
no code implementations • 4 Nov 2024 • Xin Zhang, Victor S. Sheng
We propose an innovative token representation and update method in a new ultra-small language model: the Wave network.
1 code implementation • 31 Oct 2024 • Minghui Chen, Meirui Jiang, Xin Zhang, Qi Dou, Zehua Wang, Xiaoxiao Li
To address these communication cost issues and increase the performance of pre-trained model adaptation in FL, we propose an innovative model interpolation-based local training technique called "Local Superior Soups."
no code implementations • 29 Oct 2024 • Xin Zhang, Zhen Xu, Yue Liu, Mengfang Sun, Tong Zhou, Wenying Sun
In the current context of accelerated globalization and digitalization, the complexity and uncertainty of financial markets are increasing, and the identification and prevention of economic risks have become a key link in maintaining the stability of the financial system.
no code implementations • 23 Oct 2024 • Jianjun Wei, Yue Liu, Xin Huang, Xin Zhang, Wenyi Liu, Xu Yan
This paper explores the applications and challenges of graph neural networks (GNNs) in processing complex graph data brought about by the rapid development of the Internet.
no code implementations • 16 Oct 2024 • Junhao Liu, Haonan Yu, Xin Zhang
With the rapid advancements of various machine learning models, there is a significant demand for model-agnostic explanation techniques, which can explain these models across different architectures.
1 code implementation • 11 Oct 2024 • Haochen Li, Rui Zhang, Hantao Yao, Xin Zhang, Yifan Hao, Xinkai Song, Xiaqing Li, Yongwei Zhao, Ling Li, Yunji Chen
Domain adaptive object detection (DAOD) aims to generalize detectors trained on an annotated source domain to an unlabelled target domain.
no code implementations • 11 Oct 2024 • Jialei Chen, Xin Zhang, Mobarakol Islam, Francisco Vasconcelos, Danail Stoyanov, Daniel S. Elson, Baoru Huang
Accurate 3D reconstruction of dynamic surgical scenes from endoscopic video is essential for robotic-assisted surgery.
no code implementations • 9 Oct 2024 • Xin Zhang, Xiang Lyu, Zhihao Du, Qian Chen, Dong Zhang, Hangrui Hu, Chaohong Tan, Tianyu Zhao, Yuxuan Wang, Bin Zhang, Heng Lu, Yaqian Zhou, Xipeng Qiu
Current methods of building LLMs with voice interaction capabilities rely heavily on explicit text autoregressive generation before or during speech response generation to maintain content quality, which unfortunately brings computational overhead and increases latency in multi-turn interactions.
no code implementations • 2 Oct 2024 • Yuna Yan, Lixin Li, Xin Zhang, Wensheng Lin, Wenchi Cheng, Zhu Han
Most current Deep Learning-based Semantic Communication (DeepSC) systems are designed and trained exclusively for particular single-channel conditions, which restricts their adaptability and overall bandwidth utilization.
no code implementations • 27 Sep 2024 • Tianyang Zhong, Zhengliang Liu, Yi Pan, Yutong Zhang, Yifan Zhou, Shizhe Liang, Zihao Wu, Yanjun Lyu, Peng Shu, Xiaowei Yu, Chao Cao, Hanqi Jiang, Hanxu Chen, Yiwei Li, JunHao Chen, Huawen Hu, Yihen Liu, Huaqin Zhao, Shaochen Xu, Haixing Dai, Lin Zhao, Ruidong Zhang, Wei Zhao, Zhenyuan Yang, Jingyuan Chen, Peilong Wang, Wei Ruan, Hui Wang, Huan Zhao, Jing Zhang, Yiming Ren, Shihuan Qin, Tong Chen, Jiaxi Li, Arif Hassan Zidan, Afrar Jahin, Minheng Chen, Sichen Xia, Jason Holmes, Yan Zhuang, Jiaqi Wang, Bochen Xu, Weiran Xia, Jichao Yu, Kaibo Tang, Yaxuan Yang, Bolun Sun, Tao Yang, Guoyu Lu, Xianqiao Wang, Lilong Chai, He Li, Jin Lu, Lichao Sun, Xin Zhang, Bao Ge, Xintao Hu, Lian Zhang, Hua Zhou, Lu Zhang, Shu Zhang, Ninghao Liu, Bei Jiang, Linglong Kong, Zhen Xiang, Yudan Ren, Jun Liu, Xi Jiang, Yu Bao, Wei zhang, Xiang Li, Gang Li, Wei Liu, Dinggang Shen, Andrea Sikora, Xiaoming Zhai, Dajiang Zhu, Tianming Liu
Impressive performance in chip design tasks, outperforming specialized models in areas such as EDA script generation and bug analysis.
1 code implementation • 26 Sep 2024 • Jiawei Du, Xin Zhang, Juncheng Hu, Wenxin Huang, Joey Tianyi Zhou
Specifically, we introduce a novel method that employs dynamic and directed weight adjustment techniques to modulate the synthesis process, thereby maximizing the representativeness and diversity of each synthetic instance.
2 code implementations • 26 Sep 2024 • Ge Wu, Xin Zhang, Zheng Li, Zhaowei Chen, Jiajun Liang, Jian Yang, Xiang Li
Prompt learning has surfaced as an effective approach to enhance the performance of Vision-Language Models (VLMs) like CLIP when applied to downstream tasks.
1 code implementation • 24 Sep 2024 • Zhiwei Liu, Xin Zhang, Kailai Yang, Qianqian Xie, Jimin Huang, Sophia Ananiadou
The emergence of social media has made the spread of misinformation easier.
1 code implementation • 18 Sep 2024 • Haoran Ye, Yuhang Xie, Yuanyi Ren, Hanjun Fang, Xin Zhang, Guojie Song
Human values and their measurement are a long-standing interdisciplinary inquiry.
no code implementations • IEEE Transactions on Intelligent Transportation Systems 2024 • Xin Zhang, Yunan Ling, Kaige Li, Weimin Shi, Zhong Zho
Unsupervised Domain Adaptation Vehicle Re-Identification (UDA vehicle re-ID) aims to enable the model trained in the source domain dataset to adapt to the target domain data and obtain accurate re-identification results, which has received widespread attention due to its practicality in the field of intelligent transportation systems.
1 code implementation • 10 Sep 2024 • Xin Zhang, Deval Mehta, Yanan Hu, Chao Zhu, David Darby, Zhen Yu, Daniel Merlo, Melissa Gresle, Anneke Van Der Walt, Helmut Butzkueven, ZongYuan Ge
Survival analysis holds a crucial role across diverse disciplines, such as economics, engineering and healthcare.
no code implementations • 2 Sep 2024 • Yang Zhang, Rui Zhang, Xuecheng Nie, Haochen Li, Jikun Chen, Yifan Hao, Xin Zhang, Luoqi Liu, Ling Li
We found that attribute confusion occurs when a certain region of the latent features attend to multiple or incorrect prompt tokens.
1 code implementation • 27 Aug 2024 • Changjian Zhou, Xin Zhang, Jiafeng Li, Jia Song, Wensheng Xiang
In addition, a dual-attention based feature fusion block is constructed to learn local joint interaction representations.
no code implementations • 26 Aug 2024 • Na Ren, Xin Zhang
The frequent occurrence of cyber risks and their serious economic consequences have created a growth market for cyber insurance.
no code implementations • 25 Aug 2024 • Xin Zhang, Teodor Boyadzhiev, Jinglei Shi, Jufeng Yang
In this paper, we leverage image complexity as a prior for refining segmentation features to achieve accurate real-time semantic segmentation.
no code implementations • 13 Aug 2024 • Xin Zhang, Jiawei Du, Ping Liu, Joey Tianyi Zhou
This leads to inefficient utilization of the distillation budget and oversight of inter-class feature distributions, which ultimately limits the effectiveness and efficiency, as demonstrated in our analysis.
no code implementations • 31 Jul 2024 • Yuna Yan, Xin Zhang, Lixin Li, Wensheng Lin, Rui Li, Wenchi Cheng, Zhu Han
In this paper, we address the problem of image semantic communication in a multi-user deployment scenario and propose a federated learning (FL) strategy for a Swin Transformer-based semantic communication system (FSSC).
no code implementations • 29 Jul 2024 • Xin Zhang, Yanzhao Zhang, Dingkun Long, Wen Xie, Ziqi Dai, Jialong Tang, Huan Lin, Baosong Yang, Pengjun Xie, Fei Huang, Meishan Zhang, Wenjie Li, Min Zhang
We first introduce a text encoder (base size) enhanced with RoPE and unpadding, pre-trained in a native 8192-token context (longer than 512 of previous multilingual encoders).
no code implementations • 24 Jul 2024 • Xin Zhang, Yuqi Song, Wyatt McCurdy, XiaoFeng Wang, Fei Zuo
These remarkable achievements are greatly attributed to the support of extensive datasets with precise labels.
no code implementations • 19 Jul 2024 • Chen-Chia Chang, Yikang Shen, Shaoze Fan, Jing Li, Shun Zhang, Ningyuan Cao, Yiran Chen, Xin Zhang
To this end, we introduce LaMAGIC, a pioneering language model-based topology generation model that leverages supervised finetuning for automated analog circuit design.
no code implementations • 18 Jul 2024 • Haiyong Chen, Yaxiu Zhang, Yan Zhang, Xin Zhang, Xingwei Yan
Therefore, we propose the Gather and Distribute Domain shift Suppression Network (GDDS).
no code implementations • 13 Jul 2024 • Siyan Liu, Chi-Kuang Yeh, Xin Zhang, Qinglong Tian, Pengfei Li
This study introduces a new approach to addressing positive and unlabeled (PU) data through the double exponential tilting model (DETM).
no code implementations • 9 Jul 2024 • Mengxiang Liu, Xin Zhang, Rui Zhang, Zhuoran Zhou, Zhenyong Zhang, Ruilong Deng
The proposed mitigation method can work even in the worst case where all communication links are under SFDIAs and only require extra current sensors.
no code implementations • 8 Jul 2024 • Yutong Zhang, Yi Pan, Tianyang Zhong, Peixin Dong, Kangni Xie, Yuxiao Liu, Hanqi Jiang, Zhengliang Liu, Shijie Zhao, Tuo Zhang, Xi Jiang, Dinggang Shen, Tianming Liu, Xin Zhang
Our experimental results demonstrated that Gemini-series models excelled in report generation and lesion detection but faced challenges in disease classification and anatomical localization.
no code implementations • 21 Jun 2024 • Haoling Li, Xin Zhang, Xiao Liu, Yeyun Gong, Yifan Wang, Yujiu Yang, Qi Chen, Peng Cheng
Large language models (LLMs) have revolutionized many fields of research.
no code implementations • 20 Jun 2024 • Xinbo Zhao, Yingxue Zhang, Xin Zhang, Yu Yang, Yiqun Xie, Yanhua Li, Jun Luo
MODA addresses the challenges of data scarcity and heterogeneity in a multi-task urban setting through Contrastive Data Sharing among tasks.
1 code implementation • 20 Jun 2024 • Jie Feng, Jun Zhang, Tianhui Liu, Xin Zhang, Tianjian Ouyang, Junbo Yan, Yuwei Du, Siqi Guo, Yong Li
The challenge in constructing a systematic evaluation benchmark for urban research lies in the diversity of urban data, the complexity of application scenarios and the highly dynamic nature of the urban environment.
1 code implementation • 19 Jun 2024 • Fan Zhang, Xin Zhang
A massive number of applications involve data with underlying relationships embedded in non-Euclidean space.
no code implementations • 17 Jun 2024 • Feng Huang, Xin Zhang, YiXuan Xu, Xuesong Wang, Xianyu Wu
Video Frame Interpolation (VFI) has been extensively explored and demonstrated, yet its application to polarization remains largely unexplored.
1 code implementation • 10 Jun 2024 • Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang
This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence.
no code implementations • 9 Jun 2024 • Yuxin Hong, Xiao Zhang, Xin Zhang, Joey Tianyi Zhou
In the medical field, managing high-dimensional massive medical imaging data and performing reliable medical analysis from it is a critical challenge, especially in resource-limited environments such as remote medical facilities and mobile devices.
1 code implementation • 6 Jun 2024 • Yuanyi Ren, Haoran Ye, Hanjun Fang, Xin Zhang, Guojie Song
This work introduces ValueBench, the first comprehensive psychometric benchmark for evaluating value orientations and value understanding in LLMs.
no code implementations • 6 Jun 2024 • Prashanth Vijayaraghavan, Luyao Shi, Ehsan Degan, Xin Zhang
Circuit topology generation plays a crucial role in the design of electronic circuits, influencing the fundamental functionality of the circuit.
no code implementations • 23 May 2024 • Guoyao Shen, Mengyu Li, Stephan Anderson, Chad W. Farris, Xin Zhang
Recent advancements in deep learning have enabled the development of generalizable models that achieve state-of-the-art performance across various imaging tasks.
1 code implementation • 19 May 2024 • Chun-Yin Huang, Kartik Srinivas, Xin Zhang, Xiaoxiao Li
Conventional Federated Learning (FL) involves collaborative training of a global model while maintaining user data privacy.
1 code implementation • 6 May 2024 • Xin Zhang, Daochen Zha, Qiaoyu Tan
Next, instead of directly combining their outputs for label inference, we train a simple multi-layer perceptron (MLP) model to mimic their predictions on both labeled and unlabeled nodes.
no code implementations • 26 Apr 2024 • Xin Zhang, Liangxiu Han, Tam Sobeih, Lianghao Han, Darren Dancey
This necessitates the development of innovative, spike-aware algorithms tailored for event cameras, a task compounded by the irregularity, continuity, noise, and spatial and temporal characteristics inherent in spiking data. Harnessing the strong generalization capabilities of transformer neural networks for spatiotemporal data, we propose a purely spike-driven spike transformer network for depth estimation from spiking camera data.
1 code implementation • 8 Apr 2024 • Dong Zhang, Zhaowei Li, ShiMin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu
However, the integration of human feedback to align speech outputs to human preferences is often neglected.
no code implementations • 2 Apr 2024 • Xin Zhang, Ling Chen, Xing Tang, Hongyu Shi
To this end, we propose a Dual-view Supergrid-aware Graph Neural Network (DSGNN) for regional air quality estimation, which can model the spatial dependencies of distant grid regions from dual views (i.e., satellite-derived aerosol optical depth (AOD) and meteorology).
no code implementations • 2 Apr 2024 • Xu Li, Ruiqi Sun, Jiameng Lv, Peng Jia, Nan Li, Chengliang Wei, Zou Hu, Xinzhong Er, Yun Chen, Zhang Ban, Yuedong Fang, Qi Guo, Dezi Liu, Guoliang Li, Lin Lin, Ming Li, Ran Li, Xiaobo Li, Yu Luo, Xianmin Meng, Jundan Nie, Zhaoxiang Qi, Yisheng Qiu, Li Shao, Hao Tian, Lei Wang, Wei Wang, Jingtian Xian, Youhua Xu, Tianmeng Zhang, Xin Zhang, Zhimin Zhou
To overcome these challenges, we have developed a framework based on a hierarchical visual Transformer with a sliding window technique to search for strong lensing systems within entire images.
no code implementations • 23 Mar 2024 • Xin Zhang, Tianjie Ju, Huijia Liang, Ying Fu, Qin Zhang
To tackle this challenge, we introduce a Sequential Fusion method to integrate knowledge from complex contexts into LLMs.
1 code implementation • 22 Mar 2024 • Lei Jiang, Weixin Yang, Xin Zhang, Hao Ni
Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision.
no code implementations • 21 Mar 2024 • Fanfan Lin, Junhua Liu, Xinze Li, Shuai Zhao, Bohui Zhao, Hao Ma, Xin Zhang
This paper proposes PE-GPT, a custom-tailored large language model uniquely adapted for power converter modulation design.
no code implementations • 15 Mar 2024 • Yanfei Li, Juejing Liu, Xiaodong Zhao, Wenjun Liu, Tong Geng, Ang Li, Xin Zhang
Traditional analysis of highly distorted micro-X-ray diffraction (μ-XRD) patterns from hydrothermal fluid environments is a time-consuming process, often requiring substantial data preprocessing and labeled experimental data.
no code implementations • 12 Mar 2024 • Ao Chen, Xin Zhang
Acoustic wave modulation plays a pivotal role in various applications, including sound-field reconstruction, wireless communication, and particle manipulation, among others.
no code implementations • 10 Mar 2024 • Shengxin Hong, Liang Xiao, Xin Zhang, Jianxia Chen
We construct a formal model of ArgMed-Agents and present conjectures for theoretical guarantees.
no code implementations • 10 Mar 2024 • Xin Zhang, Linhai Zhang, Deyu Zhou, Guoqiang Xu
Due to the sparsity of user data, sentiment analysis on user reviews in e-commerce platforms often suffers from poor performance, especially when faced with extremely sparse user data or long-tail labels.
1 code implementation • CVPR 2024 • Zheng Li, Xiang Li, Xinyi Fu, Xin Zhang, Weiqiang Wang, Shuo Chen, Jian Yang
To the best of our knowledge, we are the first to (1) perform unsupervised domain-specific prompt-driven knowledge distillation for CLIP, and (2) establish a practical pre-storing mechanism of text features as shared class vectors between teacher and student.
Ranked #1 on Prompt Engineering on Oxford-IIIT Pet Dataset
no code implementations • 4 Mar 2024 • Xin Zhang, Tao Xiao, GePeng Ji, Xuan Wu, Keren Fu, Qijun Zhao
The prompt fed to the motion stream is learned by supervising optical flow in a self-supervised manner.
no code implementations • 20 Feb 2024 • Penghai Zhao, Xin Zhang, Jiayue Cao, Ming-Ming Cheng, Jian Yang, Xiang Li
This paper presents a thorough analysis of these literature reviews within the PAMI field, and tries to address three core research questions: (1) What are the prevalent structural and statistical characteristics of PAMI literature reviews?
1 code implementation • 19 Feb 2024 • Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu
We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music.
no code implementations • 17 Feb 2024 • Yang Cao, Xinyi Chen, Xin Zhang, Siying Li
In this paper, we present a novel method for automatically generating sports news, which employs a unique algorithm that extracts pivotal moments from live text broadcasts and uses them to create an initial draft of the news.
no code implementations • 16 Feb 2024 • Xin Zhang, Keren Fu, Qijun Zhao
To facilitate the seamless integration of global classification features with the finely detailed local features selected by DPSM, we introduce a novel feature blending module (FBM).
Ranked #10 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • 14 Feb 2024 • Feifan Song, Yuxuan Fan, Xin Zhang, Peiyi Wang, Houfeng Wang
Large Language Models (LLMs) rely on Human Preference Alignment (HPA) to ensure the generation of safe content.
no code implementations • 6 Feb 2024 • Hao Wang, JinZhe Jiang, Xin Zhang, Chen Li
However, it has been shown that multimodal NLP models are vulnerable to adversarial attacks, where the outputs of a model can be dramatically changed by a perturbation to the input.
no code implementations • 29 Jan 2024 • Hwanwoo Kim, Xin Zhang, Jiwei Zhao, Qinglong Tian
This work focuses on the target shift problem in a regression setting (Zhang et al., 2013; Nguyen et al., 2016).
1 code implementation • 24 Jan 2024 • Dong Zhang, Xin Zhang, Jun Zhan, ShiMin Li, Yaqian Zhou, Xipeng Qiu
It comprises an autoregressive model based on LLM for semantic information modeling and a non-autoregressive model employing flow matching for perceptual information modeling.
no code implementations • 18 Jan 2024 • Xin Zhang, YeMing Cai, Tianzhi Jia
Text-to-image synthesis, a subfield of multimodal generation, has gained significant attention in recent years.
1 code implementation • 16 Jan 2024 • Xin Zhang, Yu Liu, Yuming Lin, Qingmin Liao, Yong Li
Urban villages, defined as informal residential areas in or around urban centers, are characterized by inadequate infrastructures and poor living conditions, closely related to the Sustainable Development Goals (SDGs) on poverty, adequate housing, and sustainable cities.
1 code implementation • 11 Jan 2024 • Xin Zhang, Xingqun Zhan, Jihong Huang, Jiahui Liu, Yingchao Xiao
Tightness remains the central quest in all modern estimation bounds.
1 code implementation • 8 Jan 2024 • Dong Zhang, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu
In this paper, we propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.
no code implementations • 6 Jan 2024 • Luyuan Xie, Cong Li, Xin Zhang, Shengfang Zhai, Yuejian Fang, Qingni Shen, Zhonghai Wu
Representation learning frameworks in unlabeled time series have been proposed for medical signal processing.
no code implementations • 4 Jan 2024 • Yiheng Liu, Hao He, Tianle Han, Xu Zhang, Mengyuan Liu, Jiaming Tian, Yutong Zhang, Jiaqi Wang, Xiaohui Gao, Tianyang Zhong, Yi Pan, Shaochen Xu, Zihao Wu, Zhengliang Liu, Xin Zhang, Shu Zhang, Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge
Low-cost training and deployment of LLMs represent the future development trend.
no code implementations • 2 Jan 2024 • Weidong Liu, Xiaojun Mao, Xiaofei Zhang, Xin Zhang
To fast solve the non-smooth loss under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation regression.
no code implementations • 29 Dec 2023 • Xin Zhang, Jinheng Xie, Yuan Yuan, Michael Bi Mi, Robby T. Tan
Further, to ensure the distinguishability among various regions, we introduce a region-level contrastive clustering loss to pull closer similar regions across images.
1 code implementation • 20 Dec 2023 • Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, Yishujie Zhao, Wenxiang Hu, Qiufeng Yin
Recent work demonstrates that, after instruction tuning, Code Large Language Models (Code LLMs) can obtain impressive capabilities to address a wide range of code-related tasks.
1 code implementation • 8 Dec 2023 • Tongxin Hu, Zhuang Li, Xin Jin, Lizhen Qu, Xin Zhang
Annually, e-commerce platforms incur substantial financial losses due to trademark infringements, making it crucial to identify and mitigate potential legal risks tied to merchant information registered to the platforms.
1 code implementation • 1 Dec 2023 • Weiying Xie, Xiaoyi Fan, Xin Zhang, Yunsong Li, Jie Lei, Leyuan Fang
Pruning-quantization joint learning always facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices.
no code implementations • 30 Nov 2023 • Jing Wang, Xiaofeng Liu, Fangyun Wang, Lin Zheng, Fengqiao Gao, Hanwen Zhang, Xin Zhang, Wanqing Xie, Binbin Wang
Our video-based model can diagnose with an accuracy of 93.9% (binary classification), and 92.1% (3-class classification) in a collected 2D video testing set, which does not need key-frame selection and view annotation in testing.
1 code implementation • 29 Nov 2023 • Xu Liu, Shu Zhou, Yurong Song, Wenzhe Luo, Xin Zhang
To tackle this issue, we propose a face liveness detection method based on image-text pairs and contrastive learning, dividing liveness attack problems in the financial field into eight categories and using text information to describe the images of these eight types of attacks.
no code implementations • 22 Nov 2023 • Hua Zheng, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Wen-Yen Chen, Wei Wen
Neural Architecture Search (NAS) has become a widely used tool for automating neural network design.
1 code implementation • CVPR 2024 • Xin Zhang, Jiawei Du, Yunsong Li, Weiying Xie, Joey Tianyi Zhou
Dataset pruning aims to construct a coreset capable of achieving performance comparable to the original, full dataset.
2 code implementations • 20 Nov 2023 • Xin Zhang, Yingze Song, Tingting Song, Degang Yang, Yichen Ye, Jie zhou, Liming Zhang
In response to the above questions, the Linear Deformable Convolution (LDConv) is explored in this work, which gives the convolution kernel an arbitrary number of parameters and arbitrary sampled shapes to provide richer options for the trade-off between network overhead and performance.
1 code implementation • 16 Nov 2023 • Guoyao Shen, Mengyu Li, Chad W. Farris, Stephan Anderson, Xin Zhang
In this paper, we propose a k-space cold diffusion model that performs image degradation and restoration in k-space without the need for Gaussian noise.
no code implementations • 14 Nov 2023 • Wei Wen, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen
In industry systems, such as the ranking system at Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1) scale - Meta ranking systems serve billions of users, (2) strong baselines - the baselines are production models optimized by hundreds to thousands of world-class engineers for years since the rise of deep learning, (3) dynamic baselines - engineers may have established new and stronger baselines during NAS search, and (4) efficiency - the search pipeline must yield results quickly in alignment with the productionization life cycle.
no code implementations • 10 Nov 2023 • Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang, Xinyu Wang, Xu Zhang, Lin Zhao, Yiheng Liu, Kai Zhang, Liheng Yan, Lichao Sun, Jun Liu, Ning Qiang, Bao Ge, Xiaoyan Cai, Shijie Zhao, Xintao Hu, Yixuan Yuan, Gang Li, Shu Zhang, Xin Zhang, Xi Jiang, Tuo Zhang, Dinggang Shen, Quanzheng Li, Wei Liu, Xiang Li, Dajiang Zhu, Tianming Liu
GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain.
no code implementations • 3 Nov 2023 • Tianqi Xiang, Zhiwei Jiang, Weijun Hong, Xin Zhang, Yuehong Gao
In this paper, Reconfigurable Intelligent Surface & Edge (RISE) is proposed to extend RIS' abilities of reflection and refraction over surfaces to diffraction around obstacles' edge for better adaptation to specific coverage scenarios.
no code implementations • 3 Nov 2023 • Weiying Lin, Che Liu, Xin Zhang, Zhen Wei, Sizhe Li, Xun Ma
The process begins with histogram equalization to enhance the original image, followed by the use of Mask RCNN to identify the preliminary positions and outlines of oil tanks, the ground, and areas of potential oil contamination.
no code implementations • 2 Nov 2023 • YiWen Chen, Tianqi Xiang, Xi Chen, Xin Zhang
For signal processing related to localization technologies, non-line-of-sight (NLOS) multipaths have a significant impact on the localization error level.
no code implementations • 27 Oct 2023 • Ziquan Zhu, Jing Tao, Shuihua Wang, Xin Zhang, Yudong Zhang
Five indexes are selected in this paper: accuracy, sensitivity, precision, F1-score, and specificity.
no code implementations • 16 Oct 2023 • Junpeng Tan, Xin Zhang, Yao Lv, Xiangmin Xu, Gang Li
Finally, the experimental results on real-world fetal brain MRI stacks demonstrate the state-of-the-art performance of our method.
1 code implementation • 12 Oct 2023 • Xin Zhang, Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang
As such cases span from English to other natural or programming languages, from retrieval to classification and beyond, it is desirable to build a unified embedding model rather than dedicated ones for each scenario.
no code implementations • 8 Oct 2023 • Tianyang Zhong, Wei Zhao, Yutong Zhang, Yi Pan, Peixin Dong, Zuowei Jiang, Xiaoyan Kui, Youlan Shang, Li Yang, Yaonai Wei, Longtao Yang, Hao Chen, Huan Zhao, Yuxiao Liu, Ning Zhu, Yiwei Li, Yisong Wang, Jiaqi Yao, Jiaqi Wang, Ying Zeng, Lei He, Chao Zheng, Zhixue Zhang, Ming Li, Zhengliang Liu, Haixing Dai, Zihao Wu, Lu Zhang, Shu Zhang, Xiaoyan Cai, Xintao Hu, Shijie Zhao, Xi Jiang, Xin Zhang, Xiang Li, Dajiang Zhu, Lei Guo, Dinggang Shen, Junwei Han, Tianming Liu, Jun Liu, Tuo Zhang
Radiology report generation, as a key step in medical image analysis, is critical to the quantitative analysis of clinically informed decision-making levels.
3 code implementations • 31 Aug 2023 • Xin Zhang, Dong Zhang, ShiMin Li, Yaqian Zhou, Xipeng Qiu
Therefore, we propose SpeechTokenizer, a unified speech tokenizer for speech large language models.
1 code implementation • 21 Aug 2023 • Guoyao Shen, Yancheng Zhu, Mengyu Li, Ryan McNaughton, Hernan Jara, Sean B. Andersson, Chad W. Farris, Stephan Anderson, Xin Zhang
Recent advances in MRI reconstruction have demonstrated remarkable success through deep learning-based models.
no code implementations • 8 Aug 2023 • Zixuan He, Salik Ram Khanal, Xin Zhang, Manoj Karkee, Qin Zhang
This study proposed a YOLOv5-based custom object detection model to detect strawberries in an outdoor environment.
no code implementations • 7 Aug 2023 • Zehan Li, Xin Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang
We present GTE, a general-purpose text embedding model trained with multi-stage contrastive learning.
no code implementations • 2 Aug 2023 • Xinze Li, Kezhi Mao, Fanfan Lin, Xin Zhang
Several adaptive VL strategies have been introduced with which the performance of PSO can be improved.
no code implementations • 1 Aug 2023 • Xinze Li, Xin Zhang, Fanfan Lin, Changjiang Sun, Kezhi Mao
ZVS range and efficiency are two significant performance indicators for DAB converter.
no code implementations • 1 Aug 2023 • Xinze Li, Xin Zhang, Fanfan Lin, Changjiang Sun, Kezhi Mao
However, to minimize the current stress when the DAB converter is under TPS modulation, two difficulties exist, in the analysis process and in the realization process, respectively.
no code implementations • 31 Jul 2023 • Xin Zhang, Yuqi Song, Fei Zuo, XiaoFeng Wang
In this work, we address the issue of label imbalance and investigate how to train classifiers using partial labels in large labeling spaces.
no code implementations • 30 Jul 2023 • Xinze Li, Josep Pou, Jiaxin Dong, Fanfan Lin, Changyun Wen, Suvajit Mukherjee, Xin Zhang
The D2EA approach is instantiated for the efficiency optimization of a hybrid modulation for a neutral-point-clamped dual-active-bridge (NPC-DAB) converter.
1 code implementation • 25 Jul 2023 • Hexuan Deng, Xin Zhang, Meishan Zhang, Xuebo Liu, Min Zhang
In this paper, we conduct a holistic exploration of the Universal Decompositional Semantic (UDS) Parsing.
1 code implementation • 25 Jul 2023 • Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen, Xu Liu, Peilong Wang, Pingkun Yan, Jun Liu, Bao Ge, Lichao Sun, Dajiang Zhu, Xiang Li, Wei Liu, Xiaoyan Cai, Xintao Hu, Xi Jiang, Shu Zhang, Xin Zhang, Tuo Zhang, Shijie Zhao, Quanzheng Li, Hongtu Zhu, Dinggang Shen, Tianming Liu
The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP).
1 code implementation • 22 Jul 2023 • Qiaoyu Tan, Xin Zhang, Xiao Huang, Hao Chen, Jundong Li, Xia Hu
Graph neural networks (GNNs) have shown prominent performance on attributed network embedding.
no code implementations • 20 Jul 2023 • Kaiwen Wei, Jie Yao, Jingyuan Zhang, Yangyang Kang, Fubang Zhao, Yating Zhang, Changlong Sun, Xin Jin, Xin Zhang
Firstly, the layout of existing datasets is relatively fixed and limited in the number of semantic entity categories, creating a significant gap between these datasets and the complex real-world scenarios.
no code implementations • 19 Jul 2023 • Qianqian Liu, Haixia Zhang, Xin Zhang, Dongfeng Yuan
Meeting the strict Quality of Service (QoS) requirements of terminals has imposed a significant challenge on Multiaccess Edge Computing (MEC) systems, due to the limited multidimensional resources.
no code implementations • 12 Jul 2023 • Zhuowen Yin, Xinyao Ding, Xin Zhang, Zhengwang Wu, Li Wang, Xiangmin Xu, Gang Li
Specifically, we propose a Siamese verification framework to extend the scarce data, and an unsupervised compressor to alleviate data imbalance by extracting key features.
no code implementations • 27 Jun 2023 • Xin Zhang, Liangxiu Han
The success of SSL is heavily dependent on a pre-designed pretext task, which introduces an inductive bias into the model from a large amount of unlabelled data.
1 code implementation • 25 Jun 2023 • Luyuan Xie, Cong Li, ZiRui Wang, Xin Zhang, Boyan Chen, Qingni Shen, Zhonghai Wu
CF module extracts and fuses the multi-scale features of SR images for classification.
Histopathological Image Classification
no code implementations • 21 Jun 2023 • Guoyao Shen, Boran Hao, Mengyu Li, Chad W. Farris, Ioannis Ch. Paschalidis, Stephan W. Anderson, Xin Zhang
However, the drawback of these structures is that they are not fully utilizing the information from both domains (k-space and image).
no code implementations • 13 Jun 2023 • Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, Jiaojiao Xu, Bo Liu, Xuemei Wang, Yao Zhang, Qiong Yan, Muhan Lv, Xiaomei Chen, Shuhua Zhang, Yihua Wang, Yang Liu, Li Yin, Yanni Liu, Yanqing Huang, Yunfang Liu, Kun Wang, Meiqin Su, Li Bian, Ping An, Xin Zhang, Linxue Qian, Shao Li, Xiaolong Qi
Validation analysis revealed that the AUCs of DLRP were 0.91 for GEV (95% CI 0.90 to 0.93, p < 0.05) and 0.88 for HRV (95% CI 0.86 to 0.89, p < 0.01), which were significantly and robustly better than canonical risk indicators, including the value of LSM and SSM.
no code implementations • 13 Jun 2023 • Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa
To guarantee the visual coherence of the generated or edited image, we introduce an inpainting and harmonizing module to guide the pre-trained diffusion model to seamlessly blend the inserted subject into the scene.
no code implementations • 30 May 2023 • Qinglong Tian, Xin Zhang, Jiwei Zhao
We study the domain adaptation problem with label shift in this work.
no code implementations • 26 May 2023 • Xuming Hu, Aiwei Liu, Zeqi Tan, Xin Zhang, Chenwei Zhang, Irwin King, Philip S. Yu
These techniques neither preserve the semantic consistency of the original sentences when rule-based augmentations are adopted, nor preserve the syntax structure of sentences when expressing relations using seq2seq models, resulting in less diverse augmentations.
1 code implementation • 18 May 2023 • Dong Zhang, ShiMin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu
Multi-modal large language models are regarded as a crucial step towards Artificial General Intelligence (AGI) and have garnered significant interest with the emergence of ChatGPT.
1 code implementation • 6 Apr 2023 • Xin Zhang, Chen Liu, Degang Yang, Tingting Song, Yichen Ye, Ke Li, Yingze Song
In this paper, we propose a new perspective on the effectiveness of spatial attention, which is that the spatial attention mechanism essentially solves the problem of convolutional kernel parameter sharing.
no code implementations • 3 Apr 2023 • Xin Zhang, Yuqi Song, XiaoFeng Wang, Fei Zuo
However, concerns have been raised with respect to the trustworthiness of these models: The standard testing method evaluates the performance of a model on a test set, while low-quality and insufficient test sets can lead to unreliable evaluation results, which can have unforeseeable consequences.
3 code implementations • 29 Mar 2023 • Zhengqing Miao, Xin Zhang, Meirong Zhao, Dong Ming
By incorporating two novel attention modules designed specifically for EEG signals, the channel attention module and the depth attention module, LMDA-Net can effectively integrate features from multiple dimensions, resulting in improved classification performance across various BCI tasks.
no code implementations • 26 Mar 2023 • Zhuoying Zhao, Ziling Tan, Pinghui Mo, Xiaonan Wang, Dan Zhao, Xin Zhang, Ming Tao, Jie Liu
This paper proposes a special-purpose system to achieve high-accuracy and high-efficiency machine learning (ML) molecular dynamics (MD) calculations.
no code implementations • 22 Mar 2023 • Hao Wang, Chen Li, JinZhe Jiang, Xin Zhang, YaQian Zhao, Weifeng Gong
Recently, the robustness of deep learning models has received widespread attention, and various methods for improving model robustness have been proposed, including adversarial training, model architecture modification, design of loss functions, certified defenses, and so on.
no code implementations • 20 Mar 2023 • Xiaodong Zhao, YiXuan Luo, Juejing Liu, Wenjun Liu, Kevin M. Rosso, Xiaofeng Guo, Tong Geng, Ang Li, Xin Zhang
This study highlighted the importance of labeled experimental patterns on the training of DNN models to solve μ-XRD mapping data from in-situ experiments involving a liquid phase.
no code implementations • 15 Mar 2023 • Congqi Cao, Yizhe WANG, Yue Lu, Xin Zhang, Yanning Zhang
Existing works in this field mainly suffer from two weaknesses: (1) They often neglect the multi-label case and only focus on temporal modeling.
no code implementations • 5 Mar 2023 • Zhuqing Liu, Xin Zhang, Songtao Lu, Jia Liu
Decentralized min-max optimization problems with domain constraints underpin many important ML applications, including multi-agent ML fairness assurance and policy evaluations in multi-agent reinforcement learning.
1 code implementation • 25 Feb 2023 • Yu Liu, Xin Zhang, Jingtao Ding, Yanxin Xi, Yong Li
To address such issues, in this paper, we propose a Knowledge-infused Contrastive Learning (KnowCL) model for urban imagery-based socioeconomic prediction.
1 code implementation • 20 Feb 2023 • Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, Wenjuan Han
Zero-shot information extraction (IE) aims to build IE systems from unannotated text.
no code implementations • 17 Feb 2023 • Xin Zhang, Liangxiu Han, Lianghao Han, Haoming Chen, Darren Dancey, Daoqiang Zhang
Specifically, it consists of two primary components: 1) A fast and efficient explainable patch selection mechanism for determining the most discriminative patches based on computing the SHapley Additive exPlanations (SHAP) contribution to a transfer learning model for AD diagnosis on massive medical data; and 2) A novel patch-based network for extracting deep features and AD classification from the selected patches with position embeddings to retain position information, capable of capturing the global and local information of inter- and intra-patches.
3 code implementations • 18 Jan 2023 • Biyang Guo, Xin Zhang, Ziyuan Wang, Minqi Jiang, Jinran Nie, Yuxuan Ding, Jianwei Yue, Yupeng Wu
We call the collected dataset the Human ChatGPT Comparison Corpus (HC3).
no code implementations • ICCV 2023 • Changsong Wen, Xin Zhang, Xingxu Yao, Jufeng Yang
Therefore, we propose a new paradigm, termed ordinal label distribution learning (OLDL).
1 code implementation • 23 Dec 2022 • Qiaoyu Tan, Xin Zhang, Ninghao Liu, Daochen Zha, Li Li, Rui Chen, Soo-Hyun Choi, Xia Hu
To bridge the gap, we introduce a Personalized Subgraph Selector (PS2) as a plug-and-play framework to automatically, personally, and inductively identify optimal subgraphs for different edges when performing GNNLP.
no code implementations • 16 Dec 2022 • Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang
To enhance the discriminative power of features, we propose a batch clustering based loss to encourage a clustering branch to generate distinct normal and abnormal clusters based on a batch of data.
no code implementations • 4 Nov 2022 • Xin Zhang, Iván Vallés-Pérez, Andreas Stolcke, Chengzhu Yu, Jasha Droppo, Olabanji Shonibare, Roberto Barra-Chicote, Venkatesh Ravichandran
By fine-tuning an ASR model on synthetic stuttered speech we are able to reduce word error by 5.7% relative on stuttered utterances, with only minor (<0.2% relative) degradation for fluent utterances.
no code implementations • 24 Oct 2022 • Xin Zhang, Rabab Abdelfattah, Yuqi Song, Samuel A. Dauchert, XiaoFeng Wang
Depth information is the foundation of perception, essential for autonomous driving, robotics, and other resource-constrained applications.
no code implementations • 24 Oct 2022 • Xin Zhang, Rabab Abdelfattah, Yuqi Song, XiaoFeng Wang
Through comprehensive experiments on three large-scale multi-label image datasets, i.e., MS-COCO, NUS-WIDE, and Pascal VOC12, we show that our method can handle the imbalance between positive labels and negative labels, while still outperforming existing missing-label learning approaches in most cases, and in some cases even approaches with fully labeled datasets.