We therefore suggest that task-specific data annotation should be part of an economical strategy when adapting an NLP model to a new domain.
We categorize examples in our corpus, and use these categories in a novel model that allows us to target specific regions of the input sentence to be split and edited.
From the perspective of climate dynamics, these findings suggest a dominant role for local processes and a negligible role for remote teleconnections at the spatial and temporal scales we consider.
In this paper, we propose to leverage the large-scale hyperlinks and anchor texts to pre-train the language model for ad-hoc retrieval.
To enable the discrimination of RFF from both known and unknown devices, we propose a new end-to-end deep learning framework for extracting RFFs from raw received signals.
1 code implementation • 19 Jul 2021 • Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, QinGhua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Maadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao
Crowd counting on the drone platform is an interesting topic in computer vision, which brings new challenges such as small object inference, background clutter and wide viewpoint.
Monolingual word alignment is important for studying fine-grained editing operations (i. e., deletion, addition, and substitution) in text-to-text generation tasks, such as paraphrase generation, text simplification, neutralizing biased language, etc.
We propose alternative methods that can help overcome these limitations and effectively help HCI professionals apply the HCAI approach to the development of AI systems.
Current weakly-supervised counting methods adopt the CNN to regress a total count of the crowd by an image-to-count paradigm.
Recently, particle-based variational inference (ParVI) methods have gained interest because they can avoid arbitrary parametric assumptions that are common in variational inference.
Our proposed framework is composed of two parts: the filter-based odometry and factor graph optimization.
Sensor Fusion Robotics
no code implementations • 2 Feb 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Data-to-Text Generation on WebNLG ru
We propose a hierarchical reinforcement learning method, HIDIO, that can learn task-agnostic options in a self-supervised manner while jointly learning to utilize them to solve sparse-reward tasks.
In both letter-level and word-level attacks, our experiments show that in addition to natural appearance, FAWA achieves a 100% attack success rate with 60% less perturbations and 78% fewer iterations on average.
Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.
To address this problem, we present a one-shot framework for organ and landmark localization in volumetric medical images, which does not need any annotation during the training stage and could be employed to locate any landmarks or organs in test images given a support (reference) image during the inference stage.
This paper presents the results of the wet lab information extraction task at WNUT 2020.
Ranked #1 on Relation Extraction on WNUT 2020
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Federated learning (FL) in a bandwidth-limited network with energy-limited user equipments (UEs) is under-explored.
To lower the computation load in the presence of large number of measurements, we present a new formula to compute the Kalman gain.
In order to provide adaptive and user-friendly solutions to robotic manipulation, it is important that the agent can learn to accomplish tasks even if they are only provided with very sparse instruction signals.
It is shown that this new paradigm is much simpler and more natural than existing methods based on quaternion parameterizations.
To solve this problem, a dual method is proposed, where the dual problem is obtained as a semidefinite programming problem.
An unmanned aerial vehicle (UAV)-aided mobile edge computing (MEC) framework is proposed, where several UAVs having different trajectories fly over the target area and support the user equipments (UEs) on the ground.
Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images.
Accurate channel state information (CSI) feedback plays a vital role in improving the performance gain of massive multiple-input multiple-output (m-MIMO) systems, where the dilemma is excessive CSI overhead versus limited feedback bandwith.
Simulation results verify the correctness of the obtained results and show that the proposed GA method has almost the same performance as the globally optimal solution.
This letter considers an unmanned aerial vehicle (UAV)-enabled relay communication system for delivering latency-critical messages with ultra-high reliability, where the relay is operating under amplifier-and-forward (AF) mode.
In this paper, the intelligent reflecting surface (IRS)-assisted unmanned aerial vehicle (UAV) communication system is studied, where an UAV is deployed to serve the user equipments (UEs) with the assistance of multiple IRSs mounted on several buildings to enhance the communication quality between UAV and UEs.
Meanwhile, we propose Gated-RGCN to accumulate evidence on the path-based reasoning graph, which contains a new question-aware gating mechanism to regulate the usefulness of information propagating across documents and add question information during reasoning.
We present a corpus of 7, 500 tweets annotated with COVID-19 events, including positive test results, denied access to testing, and more.
Building compact convolutional neural networks (CNNs) with reliable performance is a critical but challenging task, especially when deploying them in real-world applications.
The success of a text simplification system heavily depends on the quality and quantity of complex-simple sentence pairs in the training corpus, which are extracted by aligning sentences between parallel articles.
Ranked #1 on Text Simplification on Newsela
We also present the SoftNER model which achieves an overall 79. 10 F$_1$ score for code and named entity recognition on StackOverflow data.
In this network, multiple RISs are spatially distributed to serve wireless users and the energy efficiency of the network is maximized by dynamically controlling the on-off status of each RIS as well as optimizing the reflection coefficients matrix of the RISs.
Multilingual pre-trained Transformers, such as mBERT (Devlin et al., 2019) and XLM-RoBERTa (Conneau et al., 2020a), have been shown to enable the effective cross-lingual zero-shot transfer.
Based on the datasets, we propose novel tasks such as multi-hop knowledge abstraction (MKA), multi-hop knowledge concretization (MKC) and then design a comprehensive benchmark.
The performance of the model is assessed using simulations and applied to a human microbiome study, with results compared against a number of existing machine learning and distance-based approaches.
By taking full advantage of Computing, Communication and Caching (3C) resources at the network edge, Mobile Edge Computing (MEC) is envisioned as one of the key enablers for the next generation networks.
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
Using the Galaxy Zoo dataset we demonstrate that our method clearly reveals attention areas of the Discriminator when differentiating generated galaxy images from ground truth images.
We inspect various document and discourse factors associated with sentence deletion, using a new manually annotated sentence alignment corpus we collected.
Each random draw from our generative model is a neural network that instantiates the dynamic function, hence multiple draws would approximate the posterior, and the variance in the future prediction based on this posterior is used as an intrinsic reward for exploration.
Natural language processing covers a wide variety of tasks predicting syntax, semantics, and information content, and usually each type of output is generated with specially designed architectures.
Ranked #1 on Relation Extraction on WLPC
In this paper, we consider a platform of flying mobile edge computing (F-MEC), where unmanned aerial vehicles (UAVs) serve as equipment providing computation resource, and they enable task offloading from user equipment (UE).
This extended abstract presents a visualization system, which is designed for domain scientists to visually understand their deep learning model of extracting multiple attributes in x-ray scattering images.
Virtual advertising is an important and promising feature in the area of online advertising.
Quantized channel state information (CSI) plays a critical role in precoding design which helps reap the merits of multiple-input multiple-output (MIMO) technology.
Information Theory Signal Processing Information Theory
In this research, we propose CAMEL, a weakly supervised learning framework for histopathology image segmentation using only image-level labels.
Both the feedback controller and the iterative learning feed-forward controller are based on the aircraft acceleration model, which is directly measurable by the onboard accelerometer.
Systems and Control
Hashtags are often employed on social media and beyond to add metadata to a textual utterance with the goal of increasing discoverability, aiding search, or providing additional semantics.
Such techniques involve replacing an existing advertisement in a video frame, with a new advertisement.
Most existing event extraction (EE) methods merely extract event arguments within the sentence scope.
The rapid increase in the number of online videos provides the marketing and advertising agents ample opportunities to reach out to their audience.
In this paper we present our scientific discovery that good representation can be learned via continuous attention during the interaction between Unsupervised Learning(UL) and Reinforcement Learning(RL) modules driven by intrinsic motivation.
With the advent of faster internet services and growth of multimedia content, we observe a massive growth in the number of online videos.
Online video advertising gives content providers the ability to deliver compelling content, reach a growing audience, and generate additional revenue from online media.
In this paper, we propose a class of robust stochastic subgradient methods for distributed learning from heterogeneous datasets at presence of an unknown number of Byzantine workers.
To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.
Therefore, toaccelerate this research, we propose a newzero-shot transfer VQA(ZST-VQA)dataset by reorganizing the existing VQA v1. 0 dataset in the way that duringtraining, some words appear only in one module (i. e. questions) but not in theother (i. e. answers).
Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.
Current lexical simplification approaches rely heavily on heuristics and corpus level features that do not always align with human judgment.
Then the whole scene is decomposed into moving foreground and static background by compar- ing the estimated optical flow and rigid flow derived from the depth and ego-motion.
The separation of the task requires to define a hand-crafted training goal in affinity learning stage and a hand-crafted cost function of data association stage, which prevents the tracking goals from learning directly from the feature.
The four types of information, i. e. 2D flow, camera pose, segment mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for rigid background and moving objects are recovered.
In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks.
Ranked #1 on Paraphrase Identification on 2017_test set
Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning.
Sentence pair modeling is critical for many NLP tasks, such as paraphrase identification, semantic textual similarity, and natural language inference.
The uniqueness of our design is a sensor fusion scheme which integrates camera videos, motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robustness and efficiency of the system.
The Conflux consensus protocol represents relationships between blocks as a direct acyclic graph and achieves consensus on a total order of the blocks.
Distributed, Parallel, and Cluster Computing
Building intelligent agents that can communicate with and learn from humans in natural language is of great value.
The development of finger vein recognition algorithms heavily depends on large-scale real-world data sets.
Especially on KITTI dataset where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning.
Learning to reconstruct depths in a single image by watching unlabeled videos via deep convolutional network (DCN) is attracting significant attention in recent years.
Several pioneering approaches have been proposed based on traffic observations of the target location as well as its adjacent regions, but they obtain somewhat limited accuracy due to lack of mining road topology.
As natural language processing research is growing and largely driven by the availability of data, we expanded research from news and small-scale dialog corpora to web and social media.
The main advantage of our method is its simplicity, as it gets rid of the classifier or human in the loop needed to select data before annotation and subsequent application of paraphrase identification algorithms in the previous work.
This paper presents two unsupervised learning layers (UL layers) for label-free video analysis: one for fully connected layers, and the other for convolutional ones.
We believe that our results provide some preliminary insights on how to train an agent with similar abilities in a 3D environment.
We investigate the task of inferring conversational dependencies between messages in one-on-one online chat, which has become one of the most popular forms of customer service.
This paper presents the results of the Twitter Named Entity Recognition shared task associated with W-NUT 2016: a named entity tagging task with 10 teams participating.
In this paper, we propose a method to automatically and incrementally construct datasets from massive weakly labeled data of the target domain which are readily available on the Internet under the help of a pretrained face model.
However, we observe that directly feeding the hallucinated facial images into recog- nition models can even degrade the recognition performance despite the much better visualization quality.
While recent neural machine translation approaches have delivered state-of-the-art performance for resource-rich language pairs, they suffer from the data scarcity problem for resource-scarce language pairs.
While question answering (QA) with neural network, i. e. neural QA, has achieved promising results in recent years, lacking of large scale real-word QA dataset is still a challenge for developing and evaluating neural QA system.
While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation.
On the WMT'14 English-to-French task, we achieve BLEU=37. 7 with a single attention model, which outperforms the corresponding single shallow model by 6. 2 BLEU points.
Ranked #33 on Machine Translation on WMT2014 English-French
While deep convolutional neural networks (CNNs) have shown a great success in single-label image classification, it is important to note that real world images generally contain multiple labels, which could correspond to different objects, scenes, actions and attributes in an image.
In modern large-scale machine learning applications, the training data are often partitioned and stored on multiple machines.
Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus.
Ranked #6 on Text Simplification on TurkCorpus
The quality of the generated answers of our mQA model on this dataset is evaluated by human judges through a Turing Test.
While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to remember that the human visual contex contains generally more feedback connections than foward connections.
ABC-CNN determines an attention map for an image-question pair by convolving the image feature map with configurable convolutional kernels derived from the question's semantics.
We adapt a state-of-the-art semantic image segmentation model, which we jointly train with multi-scale input images and the attention model.
The quality of the generated answers of our mQA model on this dataset is evaluated by human judges through a Turing Test.
In particular, we propose a transposed weight sharing scheme, which not only improves performance on image captioning, but also makes the model more suitable for the novel concept learning task.
In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions.
In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images.
We present MultiP (Multi-instance Learning Paraphrase Model), a new model suited to identify paraphrases within the short messages on Twitter.