no code implementations • ECCV 2020 • Linlin Chao, Jingdong Chen, Wei Chu
However, CTC tends to output spiky distributions since it prefers to output the blank symbol most of the time.
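To make the role of the blank symbol concrete, the following is a minimal sketch of the standard CTC collapsing rule (merge repeated symbols, then drop blanks). The function name `ctc_collapse` and the use of "-" as the blank are illustrative assumptions, not taken from the paper.

```python
BLANK = "-"  # assumed blank symbol for this illustration


def ctc_collapse(path, blank=BLANK):
    """Collapse a frame-level CTC path: merge repeats, then remove blanks."""
    out = []
    prev = None
    for sym in path:
        # Keep a symbol only when it differs from the previous frame
        # and is not the blank.
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)


# A path dominated by blanks, with short "spikes" of real symbols.
spiky_path = "---c----a--t---"
decoded = ctc_collapse(spiky_path)
```

In a spiky path like the one above, most frames carry the blank and each label occupies only a frame or two, which is the behavior the abstract describes.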
1 code implementation • 30 May 2025 • Shuyao Xu, Cheng Peng, Jiangxuan Long, Weidi Xu, Wei Chu, Yuan Qi
This paper addresses the critical question: How can both positive and negative distilled reasoning traces be effectively leveraged to maximize LLM reasoning performance in an offline setting?
1 code implementation • 10 Apr 2025 • Haozhe Wang, Chao Qu, Zuming Huang, Wei Chu, Fangzhen Lin, Wenhu Chen
By combining these two techniques, our model, VL-Rethinker, advances state-of-the-art scores on MathVista and MathVerse to 80.4% and 63.5%, respectively.
no code implementations • 11 Mar 2025 • Rui Xu, Mingyu Wang, Xintao Wang, Dakuan Lu, Xiaoyu Tan, Wei Chu, Yinghui Xu
To address this challenge, we propose MIRROR, a chain-of-thought approach that generates character thoughts by retrieving memories, predicting character reactions, and synthesizing motivations.
no code implementations • 17 Feb 2025 • Xiaoyu Tan, Tianchu Yao, Chao Qu, Bin Li, Minghao Yang, Dakuan Lu, Haozhe Wang, Xihe Qiu, Wei Chu, Yinghui Xu, Yuan Qi
In this paper, we present AURORA, a novel automated framework for training universal process reward models (PRMs) using ensemble prompting and reverse verification.
no code implementations • 2 Feb 2025 • Haozhe Wang, Long Li, Chao Qu, Fengming Zhu, Weidi Xu, Wei Chu, Fangzhen Lin
Recent research on tool integration for math Large Language Models (LLMs) aims to combine complementary strengths of chain-of-thought (CoT) reasoning and code execution.
1 code implementation • 26 Jan 2025 • Dakuan Lu, Xiaoyu Tan, Rui Xu, Tianchu Yao, Chao Qu, Wei Chu, Yinghui Xu, Yuan Qi
Recent breakthroughs in large language models (LLMs) exemplified by the impressive mathematical and scientific reasoning capabilities of the o1 model have spotlighted the critical importance of high-quality training data in advancing LLM performance across STEM disciplines.
1 code implementation • 25 Dec 2024 • Yingchen Wei, Xihe Qiu, Xiaoyu Tan, Jingjing Huang, Wei Chu, Yinghui Xu, Yuan Qi
Cross-attention combines image and text data for better feature extraction, and ordered regression loss ensures stable learning.
no code implementations • 7 Nov 2024 • Siming Huang, Tianhao Cheng, J. K. Liu, Jiaran Hao, Liuyihan Song, Yang Xu, J. Yang, Jiaheng Liu, Chenchen Zhang, Linzheng Chai, Ruifeng Yuan, Zhaoxiang Zhang, Jie Fu, Qian Liu, Ge Zhang, Zili Wang, Yuan Qi, Yinghui Xu, Wei Chu
To address the gap, we introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
no code implementations • 5 Sep 2024 • Yongxin Deng, Xihe Qiu, Xiaoyu Tan, Chao Qu, Jing Pan, Yuan Cheng, Yinghui Xu, Wei Chu
Cognitive psychology investigates perception, attention, memory, language, problem-solving, decision-making, and reasoning.
no code implementations • 20 Aug 2024 • Yongxin Deng, Xihe Qiu, Xiaoyu Tan, Jing Pan, Chen Jue, Zhijun Fang, Yinghui Xu, Wei Chu, Yuan Qi
Large language models (LLMs) are trained on extensive text corpora, which inevitably include biased information.
1 code implementation • 24 Jul 2024 • Xiaoyu Tan, Bin Li, Xihe Qiu, Jingjing Huang, Yinghui Xu, Wei Chu
To the best of our knowledge, this is the first study to successfully address both event and time label noise in deep Hawkes process models, offering a promising solution for medical applications, specifically in diagnosing OSAHS.
no code implementations • 18 Jul 2024 • Xiaoyu Tan, Yongxin Deng, Xihe Qiu, Weidi Xu, Chao Qu, Wei Chu, Yinghui Xu, Yuan Qi
To address these challenges, we introduce a novel learning framework, THOUGHT-LIKE-PRO. In this framework, we utilize imitation learning to imitate the Chain-of-Thought (CoT) process, which is verified and translated from reasoning trajectories generated by a symbolic Prolog logic engine.
no code implementations • 17 Jul 2024 • Xiaoyu Tan, Haoyu Wang, Xihe Qiu, Yuan Cheng, Yinghui Xu, Wei Chu, Yuan Qi
Structured data, rich in logical and relational information, has the potential to enhance the reasoning abilities of large language models (LLMs).
no code implementations • 17 Jul 2024 • Xihe Qiu, Haoyu Wang, Xiaoyu Tan, Chao Qu, Yujie Xiong, Yuan Cheng, Yinghui Xu, Wei Chu, Yuan Qi
During execution, multiple agents interact in a downstream environment and communicate intentions to enable coordinated behaviors.
no code implementations • 7 Jul 2024 • Rui Xu, Dakuan Lu, Xiaoyu Tan, Xintao Wang, Siyu Yuan, Jiangjie Chen, Wei Chu, Yinghui Xu
Large language models (LLMs) have demonstrated impressive performance in various applications, among which role-playing language agents (RPLAs) have engaged a broad user base.
no code implementations • 22 Jun 2024 • Meng Cao, Philip I. Pavlik Jr., Wei Chu, Liang Zhang
Although a recent study underscores the joint influence of memory and attentional factors on sequencing effects, there remains a scarcity of effective computational models integrating both attentional and memory considerations to comprehensively understand the effect of training sequences on students' performance.
1 code implementation • 31 Jan 2024 • Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu
By employing one shared BERT-type network to refine textual and cross-modal features simultaneously, SNP is lightweight and could support various downstream applications.
no code implementations • 9 Jan 2024 • Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu
With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.
1 code implementation • 12 Nov 2023 • Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi
Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.
Ranked #174 on Visual Question Answering on MM-Vet
1 code implementation • 27 Sep 2023 • Weidi Xu, Jingwei Wang, Lele Xie, Jianshan He, Hongting Zhou, Taifeng Wang, Xiaopei Wan, Jingdong Chen, Chao Qu, Wei Chu
Integrating first-order logic constraints (FOLCs) with neural networks is a crucial but challenging problem since it involves modeling intricate correlations to satisfy the constraints.
1 code implementation • 20 Sep 2023 • Chen Jiang, Hong Liu, Xuzheng Yu, Qing Wang, Yuan Cheng, Jia Xu, Zhongyi Liu, Qingpei Guo, Wei Chu, Ming Yang, Yuan Qi
We thereby present a new Triplet Partial Margin Contrastive Learning (TPM-CL) module to construct partial order triplet samples by automatically generating fine-grained hard negatives for matched text-video pairs.
Ranked #4 on Video Retrieval on MSR-VTT-1kA
no code implementations • 20 Sep 2023 • Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu
SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.
no code implementations • 25 Jun 2023 • Qingpei Guo, Kaisheng Yao, Wei Chu
They can achieve exceptional performance on specific tasks, but face a particularly challenging problem of modality mismatch because of the diversity of input modalities and their fixed structures.
1 code implementation • CVPR 2023 • Tan Pan, Furong Xu, Xudong Yang, Sifeng He, Chen Jiang, Qingpei Guo, Feng Qian, Xiaobo Zhang, Yuan Cheng, Lei Yang, Wei Chu
For traditional model upgrades, the old model will not be replaced by the new one until the embeddings of all the images in the database are re-computed by the new model, which takes days or weeks for a large amount of data.
no code implementations • 15 Apr 2023 • Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan
During inference, an error-based alignment sampling method is investigated in depth to reduce the alignment mismatch in the training and testing processes.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
1 code implementation • 28 Feb 2023 • Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng, Yuan Cheng, Wei Chu
In this paper, we propose a Diverse and Compact Transformer (DC-Former) that can achieve a similar effect by splitting embedding space into multiple diverse and compact subspaces.
1 code implementation • 10 Feb 2023 • Lei Zhang, Xiaodong Yan, Jianshan He, Ruopeng Li, Wei Chu
Our experimental results show that our model effectively relieves the problem of over-smoothing in deep GCNs and outperforms the state-of-the-art (SOTA) methods on various benchmark datasets.
no code implementations • 2 Feb 2023 • Zhixuan Chu, Jianmin Huang, Ruopeng Li, Wei Chu, Sheng Li
Causal inference has numerous real-world applications in many domains, such as health care, marketing, political science, and online advertising.
no code implementations • CVPR 2023 • Jiangwei Lao, Weixiang Hong, Xin Guo, Yingying Zhang, Jian Wang, Jingdong Chen, Wei Chu
In this work, we propose a novel feature enhancement network to simultaneously model short- and long-term temporal correlation.
no code implementations • ACL 2022 • Mingzhe Li, Xiexiong Lin, Xiuying Chen, Jinxiong Chang, Qishen Zhang, Feng Wang, Taifeng Wang, Zhongyi Liu, Wei Chu, Dongyan Zhao, Rui Yan
Contrastive learning has achieved impressive success in generation tasks to mitigate the "exposure bias" problem and discriminatively exploit the different quality of references.
2 code implementations • 29 Apr 2022 • Linlin Chao, Xiexiong Lin, Taifeng Wang, Wei Chu
Meanwhile, the inference time grows log-linearly with the number of entities, since all entities are traversed and compared.
1 code implementation • 3 Apr 2022 • Hao Wang, Tai-Wei Chang, Tianqiao Liu, Jianmin Huang, Zhichao Chen, Chao Yu, Ruopeng Li, Wei Chu
In this paper, we theoretically demonstrate that ESMM suffers from the following two problems: (1) Inherent Estimation Bias (IEB), where the estimated CVR of ESMM is inherently higher than the ground truth; (2) Potential Independence Priority (PIP) for CTCVR estimation, where there is a risk that the ESMM overlooks the causality from click to conversion.
2 code implementations • 13 Mar 2022 • Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu
The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models.
no code implementations • CVPR 2022 • Weixiang Hong, Jiangwei Lao, Wang Ren, Jian Wang, Jingdong Chen, Wei Chu
Instead of proposing a specific vision transformer based detector, in this work, our goal is to reveal the insights of training vision transformer based detectors from scratch.
4 code implementations • 1 Jul 2021 • TingTing Liang, Xiaojie Chu, Yudong Liu, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling
With multi-scale testing, we push the current best single model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data.
Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)
no code implementations • CVPR 2021 • Weixiang Hong, Qingpei Guo, Wei Zhang, Jingdong Chen, Wei Chu
Panoptic segmentation is a challenging task aiming to simultaneously segment objects (things) at instance level and background contents (stuff) at semantic level.
no code implementations • CVPR 2021 • Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu
Therefore, there is a need for a training mechanism that enforces the discriminativeness of all the elements in the feature to capture more of the subtle visual cues.
no code implementations • 18 Jun 2021 • Jinhan Wang, Yunzheng Zhu, Ruchao Fan, Wei Chu, Abeer Alwan
About 5 hours of transcribed data and about 60 hours of untranscribed data are provided to develop a German ASR system for children.
no code implementations • 18 Jun 2021 • Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, Abeer Alwan
For the analyses, we plot attention weight distributions in the decoders to visualize the relationships between token-level acoustic embeddings.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
1 code implementation • 23 May 2021 • Hao Huang, Yongtao Wang, Zhaoyu Chen, Yuze Zhang, Yuheng Li, Zhi Tang, Wei Chu, Jingdong Chen, Weisi Lin, Kai-Kuang Ma
Then, we design a two-level perturbation fusion strategy to alleviate the conflict between the adversarial watermarks generated by different facial images and models.
no code implementations • COLING 2020 • Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, Wei Chu
The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities.
2 code implementations • ACL 2021 • Linlin Chao, Jianshan He, Taifeng Wang, Wei Chu
Distance-based knowledge graph embedding methods show promising results on the link prediction task, on which two topics have been widely studied: one is the ability to handle complex relations, such as N-to-1, 1-to-N and N-to-N; the other is the ability to encode various relation patterns, such as symmetry/antisymmetry.
Ranked #12 on Link Property Prediction on ogbl-biokg
no code implementations • 28 Oct 2020 • Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao
The information is used to extract an acoustic representation for each token in parallel, referred to as the token-level acoustic embedding, which substitutes for the word embedding in the autoregressive transformer (AT) to achieve parallel generation in the decoder.
no code implementations • EMNLP 2020 • Kunlong Chen, Weidi Xu, Xingyi Cheng, Zou Xiaochuan, Yuyu Zhang, Le Song, Taifeng Wang, Yuan Qi, Wei Chu
Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation.
Ranked #1 on Question Answering on DROP Test
no code implementations • ACL 2020 • Xiexiong Lin, Weiyu Jian, Jianshan He, Taifeng Wang, Wei Chu
Experiments demonstrate that our model with fewer parameters yields significant improvements over competitive baselines on two datasets Wizard-of-Wikipedia (average Bleu +87%; abs.
no code implementations • 19 May 2020 • Shijun Wang, Baocheng Zhu, Chen Li, Mingzhe Wu, James Zhang, Wei Chu, Yuan Qi
In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems.
1 code implementation • ACL 2020 • Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi
This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN).
no code implementations • 19 Apr 2020 • Chao Qu, Hui Li, Chang Liu, Junwu Xiong, James Zhang, Wei Chu, Weiqiang Wang, Yuan Qi, Le Song
We propose a collaborative multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a joint policy through the interactions over agents.
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • 8 Sep 2019 • Weidi Xu, Xingyi Cheng, Kunlong Chen, Wei Wang, Bin Bi, Ming Yan, Chen Wu, Luo Si, Wei Chu, Taifeng Wang
To remedy this, we propose to augment the NSP task to a 3-class categorization task, which includes a category for previous sentence prediction (PSP).
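As a rough illustration of what a 3-class sentence-pair task could look like, the sketch below builds labeled pairs from an ordered list of sentences. Only the existence of a previous-sentence category comes from the abstract. The other two classes (next and random), the label ids, the sampling scheme, and the name `make_pairs` are assumptions made for this example.

```python
import random

NEXT, PREVIOUS, RANDOM = 0, 1, 2  # assumed label ids


def make_pairs(sentences, other_sentences, rng):
    """Build (first, second, label) examples from one document.

    `sentences` is an ordered list from a single document;
    `other_sentences` supplies unrelated text for the RANDOM class.
    """
    examples = []
    for i in range(len(sentences) - 1):
        a, b = sentences[i], sentences[i + 1]
        examples.append((a, b, NEXT))       # second segment follows the first
        examples.append((b, a, PREVIOUS))   # second segment precedes the first
        examples.append((a, rng.choice(other_sentences), RANDOM))
    return examples


doc = ["s1", "s2", "s3"]
pairs = make_pairs(doc, ["x1", "x2"], random.Random(0))
```

Swapping the order of an adjacent pair is what distinguishes the PREVIOUS class from the usual next-sentence positive.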
1 code implementation • 16 Aug 2019 • Weipeng Huang, Xingyi Cheng, Taifeng Wang, Wei Chu
Combining these three contributions, we enhance the information extracting ability of the multi-head selection model and achieve an F1-score of 0.876 on testset-1 with a single model.
no code implementations • 11 Mar 2019 • Xin Chen, Wei Chu, Jinxi Guo, Ning Xu
F0 and aperiodicity are obtained from the original singing voice, and used with acoustic features to reconstruct the target singing voice through a vocoder.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • 11 Mar 2019 • Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, Wei Chu
The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities.
no code implementations • 18 Dec 2018 • Yichao Zhou, Wei Chu, Sam Young, Xin Chen
In the learning stage, a sequence of stylistically uniform, multiple-channel music samples was modeled by an RNN.
1 code implementation • 26 Nov 2018 • Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong
Uplift modeling aims to directly model the incremental impact of a treatment on an individual response.
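For intuition about "incremental impact", here is a minimal sketch of a difference-in-means uplift estimate per customer segment. It is a simple baseline and not the method proposed in the paper. It assumes treatment was assigned at random, and the record layout and the name `segment_uplift` are hypothetical.

```python
from collections import defaultdict


def segment_uplift(records):
    """Estimate uplift per segment as mean(treated) - mean(control).

    Each record is (segment, treated: bool, response: float).
    Segments that lack either group are skipped.
    """
    # Per segment: [treated_sum, treated_count, control_sum, control_count]
    stats = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for segment, treated, response in records:
        s = stats[segment]
        if treated:
            s[0] += response
            s[1] += 1
        else:
            s[2] += response
            s[3] += 1
    return {
        seg: s[0] / s[1] - s[2] / s[3]
        for seg, s in stats.items()
        if s[1] > 0 and s[3] > 0
    }


records = [
    ("new", True, 1.0), ("new", True, 0.0),
    ("new", False, 0.0), ("new", False, 0.0),
    ("loyal", True, 1.0), ("loyal", False, 1.0),
]
uplift = segment_uplift(records)
```

In this toy data the "loyal" segment responds regardless of treatment, so its uplift is zero even though its response rate is high. That gap is what separates uplift modeling from plain response modeling.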
no code implementations • 21 Nov 2018 • Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu
In this paper, we propose a novel integrated framework for learning both text detection and recognition.
no code implementations • CONLL 2019 • Xingyi Cheng, Weidi Xu, Taifeng Wang, Wei Chu
By disentangling the latent representation into the aspect-specific sentiment and the lexical context, our method induces the underlying sentiment prediction for the unlabeled data, which then benefits the ATSA classifier.
Aspect-Based Sentiment Analysis (ABSA)
Natural Language Understanding
no code implementations • 23 Aug 2018 • Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong
Then we develop a variant of Latent Dirichlet Allocation (LDA) to infer latent variables under the current market environment, which represents the preferences of customers and strategies of competitors.
no code implementations • 13 Jul 2018 • Zhangyu Xiao, Zhijian Ou, Wei Chu, Hui Lin
In this paper, we present an end-to-end automatic speech recognition system, which successfully employs subword units in a hybrid CTC-Attention based system.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • 3 May 2018 • Qiangpeng Yang, Mengli Cheng, Wenmeng Zhou, Yan Chen, Minghui Qiu, Wei Lin, Wei Chu
To solve this problem, we propose a novel end-to-end scene text detector IncepText from an instance-aware segmentation perspective.
no code implementations • 12 Jan 2018 • Feng-Lin Li, Minghui Qiu, Haiqing Chen, Xiongwei Wang, Xing Gao, Jun Huang, Juwei Ren, Zhongzhou Zhao, Weipeng Zhao, Lei Wang, Guwei Jin, Wei Chu
We present AliMe Assist, an intelligent assistant designed for creating an innovative online shopping experience in E-commerce.
no code implementations • 23 Nov 2017 • Jianfei Yu, Minghui Qiu, Jing Jiang, Jun Huang, Shuangyong Song, Wei Chu, Haiqing Chen
In this paper, we study transfer learning for the PI and NLI problems, aiming to propose a general framework, which can effectively and efficiently adapt the shared knowledge learned from a resource-rich source domain to a resource-poor target domain.
no code implementations • ACL 2017 • Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu
We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models.
no code implementations • 20 Apr 2016 • Wei Chu, Ruxin Chen
The previously trained DNN of the matched speaker cluster is used for decoding utterances of the test speaker.
5 code implementations • 31 Mar 2010 • Lihong Li, Wei Chu, John Langford, Xuanhui Wang
Offline evaluation of the effectiveness of new algorithms in these applications is critical for protecting online user experiences but very challenging due to their "partial-label" nature.
12 code implementations • 28 Feb 2010 • Lihong Li, Wei Chu, John Langford, Robert E. Schapire
In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks.
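The select-observe-update loop described above can be sketched as follows. This is not the paper's algorithm: epsilon-greedy exploration with one linear click model per article stands in as a simple substitute, and the class name, hyperparameters, and simulated click rule are all assumptions made for illustration.

```python
import random


class EpsilonGreedyLinearBandit:
    """One linear click model per article, with epsilon-greedy exploration."""

    def __init__(self, n_arms, n_features, epsilon=0.1, lr=0.5, seed=0):
        self.weights = [[0.0] * n_features for _ in range(n_arms)]
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def predict(self, arm, context):
        """Predicted click value of `arm` for this context."""
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def greedy(self, context):
        """Arm with the highest predicted value (first one wins ties)."""
        scores = [self.predict(a, context) for a in range(len(self.weights))]
        return max(range(len(scores)), key=scores.__getitem__)

    def select(self, context):
        """Explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        return self.greedy(context)

    def update(self, arm, context, reward):
        """One gradient step on squared error, for the served arm only."""
        error = reward - self.predict(arm, context)
        self.weights[arm] = [
            w + self.lr * error * x for w, x in zip(self.weights[arm], context)
        ]


# Hypothetical simulation: user type k clicks only on article k.
bandit = EpsilonGreedyLinearBandit(n_arms=2, n_features=2)
for t in range(2000):
    user_type = t % 2
    context = [1.0, 0.0] if user_type == 0 else [0.0, 1.0]
    arm = bandit.select(context)
    click = 1.0 if arm == user_type else 0.0
    bandit.update(arm, context, click)
```

Only the served article receives feedback on each round, so some exploration is needed before the model can discover that the second user type prefers the second article.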
no code implementations • NeurIPS 2007 • Kai Yu, Wei Chu
In this paper we develop a Gaussian process (GP) framework to model a collection of reciprocal random variables defined on the edges of a network.