Search Results for author: Tao Lei

Found 57 papers, 23 papers with code

面向垂直领域的阅读理解数据增强方法 (A Data Augmentation Method for Reading Comprehension in Vertical Domains)

no code implementations CCL 2020 Zhengwei Lv, Lei Yang, Zhizhong Shi, Xiao Liang, Tao Lei, Duoxing Liu

Reading comprehension question answering (QA) systems use semantic understanding and other natural language processing techniques to analyze unstructured documents and generate an answer to an input question, and they have high research and application value. When applied in vertical domains, annotating reading comprehension QA data is costly and user questions are expressed in complex and diverse ways, so such systems suffer from low accuracy and poor robustness. To address this problem, this paper proposes a data augmentation method for vertical-domain reading comprehension QA data: based on real user questions, the method constructs reading comprehension training data, which both lowers annotation cost and increases the diversity of the training data, improving model accuracy and robustness. We validate the method experimentally on automotive-domain data; the results show that it effectively improves both the accuracy and the robustness of vertical-domain reading comprehension models.

Reading Comprehension

Representation Discrepancy Bridging Method for Remote Sensing Image-Text Retrieval

no code implementations22 May 2025 Hailong Ning, Siying Wang, Tao Lei, Xiaopeng Cao, Huanmin Dou, Bin Zhao, Asoke K. Nandi, Petia Radeva

On the one hand, a Cross-Modal Asymmetric Adapter (CMAA) is designed to enable modality-specific optimization and improve feature alignment.

cross-modal alignment Image-text Retrieval +2

IDEA Prune: An Integrated Enlarge-and-Prune Pipeline in Generative Language Model Pretraining

no code implementations7 Mar 2025 Yixiao Li, Xianzhi Du, Ajay Jaiswal, Tao Lei, Tuo Zhao, Chong Wang, Jianyu Wang

We study the enlarge-and-prune pipeline as an integrated system to address two critical questions: whether it is worth pretraining an enlarged model even when the model is never deployed, and how to optimize the entire pipeline for better pruned models.

Language Modeling Language Modelling

Instruction-Following Pruning for Large Language Models

no code implementations3 Jan 2025 Bairu Hou, Qibin Chen, Jianyu Wang, Guoli Yin, Chong Wang, Nan Du, Ruoming Pang, Shiyu Chang, Tao Lei

In our method, the pruning mask is input-dependent and adapts dynamically based on the information described in a user instruction.

Instruction Following Math
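The snippet above describes a pruning mask that is input-dependent and adapts to the user instruction. As a loose illustration only (the linear router, its per-unit granularity, and all names below are invented for this sketch, not taken from the paper), an instruction-conditioned binary mask could look like:

```python
import numpy as np

def dynamic_prune_mask(instr_emb, router_W, keep_ratio=0.5):
    """Hypothetical sketch of an input-dependent pruning mask: a linear
    router scores each prunable unit from the instruction embedding and
    only the top-scoring fraction stays active for this input."""
    scores = router_W @ instr_emb             # one score per prunable unit
    k = int(round(len(scores) * keep_ratio))  # units kept for this input
    mask = np.zeros_like(scores)
    mask[np.argsort(-scores)[:k]] = 1.0       # binary mask, decided per input
    return mask

rng = np.random.default_rng(0)
d_instr, n_units = 8, 10
router_W = rng.normal(size=(n_units, d_instr))
# two different instructions can yield two different masks
mask_a = dynamic_prune_mask(rng.normal(size=d_instr), router_W)
mask_b = dynamic_prune_mask(rng.normal(size=d_instr), router_W)
```

The appeal over static pruning is that capacity is spent where the instruction needs it, at the cost of running a router per request.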

Distribution alignment based transfer fusion frameworks on quantum devices for seeking quantum advantages

no code implementations4 Nov 2024 Xi He, Feiyu Du, Xiaohan Yu, Yang Zhao, Tao Lei

Two transfer fusion frameworks are proposed in this paper to predict the labels of target-domain data by aligning its distribution to a different but related labelled source domain on quantum devices.

Quantum Machine Learning

PMR-Net: Parallel Multi-Resolution Encoder-Decoder Network Framework for Medical Image Segmentation

no code implementations19 Sep 2024 Xiaogang Du, Dongxin Gu, Tao Lei, Yipeng Jiao, Yibin Zou

The multi-resolution context encoder fuses the global context semantic features of different receptive fields from different encoder branches to effectively maintain the integrity of global information.

Decoder Image Segmentation +3

Apple Intelligence Foundation Language Models

no code implementations29 Jul 2024 Tom Gunter, ZiRui Wang, Chong Wang, Ruoming Pang, Aonan Zhang, BoWen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng, Yael Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhao Tang Luo, Zhi Ouyang, Alp Aygar, Alvin Wan, Andrew Walkingshaw, Andy Narayanan, Antonie Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, Joris Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, Regan Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren

We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute.

Language Modeling Language Modelling

Learning to Skip for Language Modeling

no code implementations26 Nov 2023 Dewen Zeng, Nan Du, Tao Wang, Yuanzhong Xu, Tao Lei, Zhifeng Chen, Claire Cui

Overparameterized large-scale language models show impressive generalization performance in in-context few-shot learning.

Few-Shot Learning Language Modeling +1

TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation

1 code implementation7 Jun 2023 Rui Sun, Tao Lei, Weichuan Zhang, Yong Wan, Yong Xia, Asoke K. Nandi

The hybrid architecture of convolutional neural networks (CNNs) and Transformers has become the most popular method for medical image segmentation.

Image Segmentation Medical Image Segmentation +2

Lightweight Structure-aware Transformer Network for VHR Remote Sensing Image Change Detection

no code implementations3 Jun 2023 Tao Lei, Yetong Xu, Hailong Ning, Zhiyong Lv, Chongdan Min, Yaochu Jin, Asoke K. Nandi

Popular Transformer networks have been successfully applied to remote sensing (RS) image change detection (CD) and achieve better results than most convolutional neural networks (CNNs), but they still suffer from two main problems.

Change Detection

Rethinking the Role of Token Retrieval in Multi-Vector Retrieval

2 code implementations NeurIPS 2023 Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Y. Zhao

Multi-vector retrieval models such as ColBERT [Khattab and Zaharia, 2020] allow token-level interactions between queries and documents, and hence achieve state of the art on many information retrieval benchmarks.

Information Retrieval Retrieval
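The token-level interaction the snippet refers to is ColBERT-style late interaction ("MaxSim"): each query token is matched against its best document token and the per-token maxima are summed. A minimal sketch (function name and toy vectors are illustrative, not the paper's code):

```python
import numpy as np

def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: each query token takes its best-matching
    document token, and the per-token maxima are summed (ColBERT-style)."""
    # sim[i, j] = dot product between query token i and document token j
    sim = query_vecs @ doc_vecs.T
    return sim.max(axis=1).sum()

# toy example: 2 query tokens, 3 document tokens, 4-dim embeddings
q = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
d = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.0, 0.8, 0.2, 0.0],
              [0.1, 0.1, 0.1, 0.1]])
score = maxsim_score(q, d)  # 0.9 (token 0) + 0.8 (token 1) = 1.7
```

The paper's question is how much of this fine-grained matching is actually needed, which is why the token-retrieval stage is worth rethinking.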

CoLT5: Faster Long-Range Transformers with Conditional Computation

no code implementations17 Mar 2023 Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token.

Long-range modeling
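The conditional-computation idea in the snippet can be sketched as a feedforward layer with a cheap branch applied to every token and an expensive branch applied only to the highest-scoring tokens. The router scores and all names below are assumptions for illustration, not CoLT5's actual parameterization:

```python
import numpy as np

def conditional_ffn(x, scores, k, light_W, heavy_W):
    """Sketch of conditional computation: every token passes through a cheap
    'light' branch; only the k highest-scoring tokens additionally pass
    through the expensive 'heavy' branch."""
    out = x @ light_W                  # light path for all tokens
    top = np.argsort(-scores)[:k]      # tokens routed to the heavy path
    out[top] += x[top] @ heavy_W       # heavy path only for selected tokens
    return out

x = np.eye(4, 3)                       # 4 tokens, 3-dim embeddings
scores = np.array([0.1, 5.0, 0.2, 3.0])
out = conditional_ffn(x, scores, k=2,
                      light_W=np.eye(3), heavy_W=10 * np.eye(3))
```

With quadratic attention and per-token feedforward cost, spending the heavy computation on a fixed small fraction of tokens is what makes long inputs affordable.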

Lightweight Facial Attractiveness Prediction Using Dual Label Distribution

1 code implementation4 Dec 2022 Shu Liu, Enquan Huang, Ziyu Zhou, Yan Xu, Xiaoyan Kui, Tao Lei, Hongying Meng

The data processing is simplified to a minimum for a lightweight design, and MobileNetV2 is selected as our backbone.

Prediction

Multi-Vector Retrieval as Sparse Alignment

no code implementations2 Nov 2022 Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, Zhuyun Dai, Siddhartha Brahma, Iftekhar Naim, Tao Lei, Vincent Y. Zhao

With sparsified unary saliences, we are able to prune a large number of query and document token vectors and improve the efficiency of multi-vector retrieval.

Argument Retrieval Information Retrieval +1

Training Language Models with Memory Augmentation

1 code implementation25 May 2022 Zexuan Zhong, Tao Lei, Danqi Chen

Recent work has improved language models (LMs) remarkably by equipping them with a non-parametric memory component.

Language Modeling Language Modelling +1
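One common instance of such a non-parametric memory is kNN-LM-style interpolation: retrieve the nearest cached hidden states, turn their distances into a distribution over the next tokens they recorded, and mix with the parametric LM. This is a sketch of that general idea under assumed names, not the paper's specific training objective:

```python
import numpy as np

def knn_augmented_prob(p_lm, query, keys, vals, vocab_size, k=2, lam=0.25):
    """Sketch of a non-parametric memory component: the k nearest cached
    hidden states vote for their stored next tokens, weighted by distance,
    and the result is interpolated with the parametric LM distribution."""
    d = np.linalg.norm(keys - query, axis=1)  # distance to each memory key
    nn = np.argsort(d)[:k]                    # k nearest neighbours
    w = np.exp(-d[nn])
    w /= w.sum()
    p_knn = np.zeros(vocab_size)
    for weight, tok in zip(w, vals[nn]):
        p_knn[tok] += weight                  # mass on the stored tokens
    return (1 - lam) * p_lm + lam * p_knn

p_lm = np.array([0.7, 0.2, 0.1])              # parametric distribution
keys = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
vals = np.array([1, 1, 2])                    # next token recorded per key
p = knn_augmented_prob(p_lm, np.zeros(2), keys, vals, vocab_size=3)
```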

Simple Recurrence Improves Masked Language Models

no code implementations23 May 2022 Tao Lei, Ran Tian, Jasmijn Bastings, Ankur P. Parikh

In this work, we explore whether modeling recurrence into the Transformer architecture can both be beneficial and efficient, by building an extremely simple recurrent module into the Transformer.

Mixture-of-Experts with Expert Choice Routing

no code implementations18 Feb 2022 Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew Dai, Zhifeng Chen, Quoc Le, James Laudon

Prior work allocates a fixed number of experts to each token using a top-k function regardless of the relative importance of different tokens.

Mixture-of-Experts
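Expert-choice routing inverts the usual token-choice scheme described in the snippet: instead of each token picking its top-k experts, each expert picks its top tokens, so per-expert load is fixed by construction. A minimal sketch (router values and the capacity of 2 are illustrative):

```python
import numpy as np

def expert_choice_route(router_logits, capacity):
    """Expert-choice routing: each expert selects its top-`capacity` tokens
    by routing probability, so no expert can be overloaded. Returns an
    {expert: token indices} assignment."""
    n_tokens, n_experts = router_logits.shape
    # softmax over experts for each token
    probs = np.exp(router_logits - router_logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    assignment = {}
    for e in range(n_experts):
        # expert e takes the tokens that weight it most highly
        assignment[e] = np.argsort(-probs[:, e])[:capacity].tolist()
    return assignment

# 4 tokens, 2 experts, each expert processes exactly 2 tokens
logits = np.array([[2.0, 0.1],
                   [1.5, 0.2],
                   [0.1, 2.2],
                   [0.0, 1.8]])
routes = expert_choice_route(logits, capacity=2)
```

Note that under this scheme a token may be chosen by several experts or by none, which is exactly how capacity follows token importance rather than a fixed per-token k.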

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

no code implementations11 Oct 2021 Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Han, Shinji Watanabe

The Transformer architecture has been well adopted as a dominant architecture in most sequence transduction tasks including automatic speech recognition (ASR), since its attention mechanism excels in capturing long-range dependencies.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Channel-Temporal Attention for First-Person Video Domain Adaptation

no code implementations17 Aug 2021 Xianyuan Liu, Shuo Zhou, Tao Lei, Haiping Lu

Finally, we propose a Channel-Temporal Attention Network (CTAN) to integrate these blocks into existing architectures.

Action Recognition Unsupervised Domain Adaptation

Nutri-bullets Hybrid: Consensual Multi-document Summarization

no code implementations NAACL 2021 Darsh Shah, Lili Yu, Tao Lei, Regina Barzilay

We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.

Document Summarization Language Modeling +4

Nutribullets Hybrid: Multi-document Health Summarization

2 code implementations8 Apr 2021 Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay

We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.

Language Modeling Language Modelling +2

When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute

1 code implementation EMNLP 2021 Tao Lei

In this work, we present SRU++, a highly-efficient architecture that combines fast recurrence and attention for sequence modeling.

Language Modeling Language Modelling +1

Medical Image Segmentation Using Deep Learning: A Survey

2 code implementations28 Sep 2020 Risheng Wang, Tao Lei, Ruixia Cui, Bingtao Zhang, Hongying Meng, Asoke K. Nandi

Firstly, unlike traditional surveys that directly divide the literature on deep learning for medical image segmentation into many groups and introduce each group's literature in detail, we classify the currently popular literature according to a multi-level structure from coarse to fine.

Data Augmentation Deep Learning +8

Autoregressive Knowledge Distillation through Imitation Learning

2 code implementations EMNLP 2020 Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei

The performance of autoregressive models on natural language generation tasks has dramatically improved due to the adoption of deep, self-attentive architectures.

Imitation Learning Knowledge Distillation +3

Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

1 code implementation ACL 2020 Kyle Swanson, Lili Yu, Tao Lei

Selecting input features of top relevance has become a popular method for building self-explaining models.

Text Matching

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition

no code implementations21 May 2020 Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu J. Han, Tao Lei, Tao Ma

In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling.

Ranked #10 on Speech Recognition on LibriSpeech test-clean (using extra training data)

Data Augmentation Diversity +4

Structured Pruning of Large Language Models

2 code implementations EMNLP 2020 Ziheng Wang, Jeremy Wohlwend, Tao Lei

Large language models have recently achieved state of the art performance across a wide variety of natural language tasks.

Language Modeling Language Modelling +2

Adaptive Morphological Reconstruction for Seeded Image Segmentation

1 code implementation8 Apr 2019 Tao Lei, Xiaohong Jia, Tongliang Liu, Shigang Liu, Hongying Meng, Asoke K. Nandi

However, MR might mistakenly filter meaningful seeds that are required for generating accurate segmentation and it is also sensitive to the scale because a single-scale structuring element is employed.

Image Segmentation Segmentation +1

Adversarial Domain Adaptation for Duplicate Question Detection

1 code implementation EMNLP 2018 Darsh J Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov

We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions.

Domain Adaptation Question Similarity

Significantly Fast and Robust Fuzzy C-Means Clustering Algorithm Based on Morphological Reconstruction and Membership Filtering

no code implementations IEEE 2018 Tao Lei, Xiaohong Jia, Yanning Zhang, Lifeng He, Hongying Meng, Asoke K. Nandi

However, the introduction of local spatial information often leads to a high computational complexity, arising out of an iterative calculation of the distance between pixels within local spatial neighbors and clustering centers.

Clustering Image Segmentation +1
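For context, the membership matrix this paper filters is the one produced by the standard fuzzy c-means update below; the paper's speed-up comes from replacing the costly per-pixel neighborhood term with a morphological reconstruction of the image and a filter applied to this matrix, neither of which is shown here:

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Textbook fuzzy c-means membership update: each sample's membership
    in a cluster falls off with its relative distance to that cluster's
    center (fuzzifier m controls how soft the assignment is)."""
    # pairwise distances, samples x clusters (epsilon avoids division by 0)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

X = np.array([[0.0, 0.0], [0.1, 0.0], [4.0, 4.0]])
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
U = fcm_memberships(X, centers)   # rows sum to 1 across clusters
```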

Training RNNs as Fast as CNNs

2 code implementations ICLR 2018 Tao Lei, Yu Zhang, Yoav Artzi

Common recurrent neural network architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.

General Classification Language Modeling +5
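The SRU introduced here gets its speed by batching all matrix multiplications over the whole sequence up front, leaving only cheap element-wise operations inside the sequential loop. A minimal single-layer sketch, with the gate parameterization simplified relative to the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru_layer(x, Wx, Wf, Wr, bf, br):
    """Minimal SRU sketch: the three projections below are computed for all
    time steps at once; the remaining recurrence is purely element-wise,
    which is what lets the layer parallelize like a CNN."""
    xt = x @ Wx                         # candidate values, all steps at once
    f = sigmoid(x @ Wf + bf)            # forget gates, all steps at once
    r = sigmoid(x @ Wr + br)            # reset/highway gates
    c = np.zeros(Wx.shape[1])
    hs = []
    for t in range(x.shape[0]):         # only element-wise work remains here
        c = f[t] * c + (1.0 - f[t]) * xt[t]
        hs.append(r[t] * np.tanh(c) + (1.0 - r[t]) * x[t])  # highway skip
    return np.stack(hs)

rng = np.random.default_rng(1)
d = 4                                   # input width == hidden width here,
x = rng.normal(size=(3, d))             # so the highway skip type-checks
W = [rng.normal(size=(d, d)) for _ in range(3)]
h = sru_layer(x, *W, bf=np.zeros(d), br=np.zeros(d))
```

Unlike an LSTM, the state update never multiplies the previous state by a weight matrix, so the sequential part is trivially cheap.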

Style Transfer from Non-Parallel Text by Cross-Alignment

12 code implementations NeurIPS 2017 Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi Jaakkola

We demonstrate the effectiveness of this cross-alignment method on three tasks: sentiment modification, decipherment of word substitution ciphers, and recovery of word order.

Decipherment Machine Translation +3

Deriving Neural Architectures from Sequence and Graph Kernels

no code implementations ICML 2017 Tao Lei, Wengong Jin, Regina Barzilay, Tommi Jaakkola

The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process.

Graph Regression Language Modeling +2

Rationalizing Neural Predictions

3 code implementations EMNLP 2016 Tao Lei, Regina Barzilay, Tommi Jaakkola

Our approach combines two modular components, generator and encoder, which are trained to operate well together.

Prediction Retrieval +1

Semi-supervised Question Retrieval with Gated Convolutions

1 code implementation NAACL 2016 Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi Jaakkola, Katerina Tymoshenko, Alessandro Moschitti, Lluis Marquez

Question answering forums are rapidly growing in size with no effective automated ability to refer to and reuse answers already available for previous posted questions.

Decoder Question Answering +1
