Biases continue to be prevalent in modern text and media, especially subjective bias – a special type of bias that introduces improper attitudes or presents a statement with the presupposition of truth.
Given the fixed point equation (FPE) derived from the variational inference on the Markov random fields, the deep GNNs, including JKNet, GCNII, DGCN, and the classical GNNs, such as GCN, GAT, and APPNP, can be regarded as different approximations of the FPE.
We present a human-in-the-loop evaluation framework for fact-checking novel misinformation claims and identifying social media messages that violate relevant policies.
Translating training data into many languages has emerged as a practical solution for improving cross-lingual transfer.
We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi, and Arabic annotated with stance towards 251 misinformation claims.
We first introduce arXivEdits, a new annotated corpus of 751 full papers from arXiv with gold sentence alignment across their multiple versions of revision, as well as fine-grained span-level edits and their underlying intentions for 1,000 sentence pairs.
However, the gaps between contents and difficulties of different tasks bring us challenges on both which tasks should share the parameters and what parameters should be shared, as well as the optimization challenges due to parameter sharing.
This paper addresses the quality issues in existing Twitter-based paraphrase datasets, and discusses the necessity of using two separate definitions of paraphrase for identification and generation tasks.
By treating K as a variable that can be adjusted according to a fitting function of some learnable coefficients, an intelligent MIMO detection network based on deep neural networks (DNN) is proposed to reduce complexity of the detection algorithm with little performance degradation.
Deep learning (DL) applied to a device's radio-frequency fingerprint (RFF) has attracted significant attention in physical-layer authentication due to its extraordinary classification performance.
In this paper, a new semi-supervised deep multiple-input multiple-output (MIMO) detection approach using a cycle-consistent generative adversarial network (cycleGAN) is proposed for communication systems without any prior knowledge of underlying channel distributions.
Accurate segmentation of Anatomical brain Barriers to Cancer spread (ABCs) plays an important role for automatic delineation of Clinical Target Volume (CTV) of brain tumors in radiotherapy.
In this paper, we consider a multiuser mobile edge computing (MEC) system, where a mixed-integer offloading strategy is used to assist the resource assignment for task offloading.
Since model bias and associated initialization shock are serious shortcomings that reduce prediction skills in state-of-the-art decadal climate prediction efforts, we pursue a complementary machine-learning-based approach to climate prediction.
While natural gradients have been widely studied from both theoretical and empirical perspectives, we argue that some fundamental theoretical issues regarding the existence of gradients in infinite dimensional function spaces remain underexplored.
On 12 Safety Gym tasks and 2 safe racing tasks, SEditor obtains a much higher overall safety-weighted-utility (SWU) score than the baselines, and demonstrates outstanding utility performance with constraint violation rates as low as once per 2k time steps, even in obstacle-dense environments.
On the other hand, our large-scale empirical study shows that using entropy regularization alone in policy improvement leads to comparable or even better performance and robustness than using it in both policy improvement and policy evaluation.
GPM can therefore leverage its generated multi-step plans for temporally coordinated exploration towards high value regions, which is potentially more effective than a sequence of actions generated by perturbing each action at single step level, whose consistent movement decays exponentially with the number of exploration steps.
With the help of the concept and framework, the paper analyzes the human factors issues in the ecosystem of autonomous vehicle co-driving and proposes an initial human factors solution.
To enable robots to instruct humans in collaborations, we identify several aspects of language processing that are not commonly studied in this context.
In practice, discounted episodic return from the training experience or discounted goal return from hindsight relabeling can serve as the value lower bound when the environment is deterministic.
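The discounted episodic return mentioned above can be computed with a single backward pass over one episode's rewards. The following is a minimal sketch, not the authors' implementation: the function names `discounted_returns` and `lower_bounded_target` are hypothetical, and clamping a bootstrapped target with `max` is only one assumed way to use the bound.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def lower_bounded_target(td_target: float, observed_return: float) -> float:
    """Assumed usage: never let a value target fall below a return that was
    actually achieved from the same state. This relies on the environment
    being deterministic, so the observed return is reproducible."""
    return max(td_target, observed_return)


if __name__ == "__main__":
    episode_returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
    print(episode_returns)
    print(lower_bounded_target(td_target=0.3, observed_return=episode_returns[0]))
```

The determinism condition matters: in a stochastic environment a single lucky episode could exceed the expected return, and the observed return would no longer be a valid lower bound.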
We categorize examples in our corpus, and use these categories in a novel model that allows us to target specific regions of the input sentence to be split and edited.
From the perspective of climate dynamics, these findings suggest a dominant role for local processes and a negligible role for remote teleconnections at the spatial and temporal scales we consider.
In this paper, we propose to leverage the large-scale hyperlinks and anchor texts to pre-train the language model for ad-hoc retrieval.
To enable the discrimination of RFF from both known and unknown devices, we propose a new end-to-end deep learning framework for extracting RFFs from raw received signals.
1 code implementation • 19 Jul 2021 • Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, QinGhua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Maadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao
Crowd counting on the drone platform is an interesting topic in computer vision, which brings new challenges such as small object inference, background clutter and wide viewpoint.
Monolingual word alignment is important for studying fine-grained editing operations (i.e., deletion, addition, and substitution) in text-to-text generation tasks, such as paraphrase generation, text simplification, neutralizing biased language, etc.
We propose alternative methods that can help overcome these limitations and effectively help HCI professionals apply the HCAI approach to the development of AI systems.
Current weakly-supervised counting methods adopt the CNN to regress a total count of the crowd by an image-to-count paradigm.
TAAC has two important features: a) persistent exploration, and b) a new compare-through Q operator for multi-step TD backup, specially tailored to the action repetition scenario.
Recently, particle-based variational inference (ParVI) methods have gained interest because they can avoid arbitrary parametric assumptions that are common in variational inference.
Our proposed framework is composed of two parts: the filter-based odometry and factor graph optimization.
Most regression-based methods utilize convolutional neural networks (CNNs) to regress a density map, which cannot accurately locate instances in extremely dense scenes, for two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, 2) severe overlaps exist in the dense region of the density map.
no code implementations • • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Extreme Summarization on GEM-XSum
We propose a hierarchical reinforcement learning method, HIDIO, that can learn task-agnostic options in a self-supervised manner while jointly learning to utilize them to solve sparse-reward tasks.
Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.
In both letter-level and word-level attacks, our experiments show that in addition to natural appearance, FAWA achieves a 100% attack success rate with 60% fewer perturbations and 78% fewer iterations on average.
To address this problem, we present a one-shot framework for organ and landmark localization in volumetric medical images, which does not need any annotation during the training stage and could be employed to locate any landmarks or organs in test images given a support (reference) image during the inference stage.
This paper presents the results of the wet lab information extraction task at WNUT 2020.
Ranked #1 on Relation Extraction on WNUT 2020
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Federated learning (FL) in a bandwidth-limited network with energy-limited user equipments (UEs) is under-explored.
To lower the computation load in the presence of a large number of measurements, we present a new formula to compute the Kalman gain.
In order to provide adaptive and user-friendly solutions to robotic manipulation, it is important that the agent can learn to accomplish tasks even if it is only provided with very sparse instruction signals.
It is shown that this new paradigm is much simpler and more natural than existing methods based on quaternion parameterizations.
To solve this problem, a dual method is proposed, where the dual problem is obtained as a semidefinite programming problem.
An unmanned aerial vehicle (UAV)-aided mobile edge computing (MEC) framework is proposed, where several UAVs having different trajectories fly over the target area and support the user equipments (UEs) on the ground.
Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images.
Accurate channel state information (CSI) feedback plays a vital role in improving the performance gain of massive multiple-input multiple-output (m-MIMO) systems, where the dilemma is excessive CSI overhead versus limited feedback bandwidth.
Simulation results verify the correctness of the obtained results and show that the proposed GA method has almost the same performance as the globally optimal solution.
This letter considers an unmanned aerial vehicle (UAV)-enabled relay communication system for delivering latency-critical messages with ultra-high reliability, where the relay operates in amplify-and-forward (AF) mode.
Meanwhile, we propose Gated-RGCN to accumulate evidence on the path-based reasoning graph, which contains a new question-aware gating mechanism to regulate the usefulness of information propagating across documents and add question information during reasoning.
In this paper, we present a manually annotated corpus of 10,000 tweets containing public reports of five COVID-19 events, including positive and negative tests, deaths, denied access to testing, claimed cures and preventions.
Building compact convolutional neural networks (CNNs) with reliable performance is a critical but challenging task, especially when deploying them in real-world applications.
The success of a text simplification system heavily depends on the quality and quantity of complex-simple sentence pairs in the training corpus, which are extracted by aligning sentences between parallel articles.
Ranked #1 on Text Simplification on Newsela
We also present the SoftNER model which achieves an overall 79.10 F$_1$ score for code and named entity recognition on StackOverflow data.
In this network, multiple RISs are spatially distributed to serve wireless users and the energy efficiency of the network is maximized by dynamically controlling the on-off status of each RIS as well as optimizing the reflection coefficients matrix of the RISs.
Multilingual pre-trained Transformers, such as mBERT (Devlin et al., 2019) and XLM-RoBERTa (Conneau et al., 2020a), have been shown to enable effective cross-lingual zero-shot transfer.
Based on the datasets, we propose novel tasks such as multi-hop knowledge abstraction (MKA), multi-hop knowledge concretization (MKC) and then design a comprehensive benchmark.
The performance of the model is assessed using simulations and applied to a human microbiome study, with results compared against a number of existing machine learning and distance-based approaches.
We propose several methods that incorporate both structured and textual information to represent relations for this task.
By taking full advantage of Computing, Communication and Caching (3C) resources at the network edge, Mobile Edge Computing (MEC) is envisioned as one of the key enablers for the next generation networks.
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
Using the Galaxy Zoo dataset we demonstrate that our method clearly reveals attention areas of the Discriminator when differentiating generated galaxy images from ground truth images.
We inspect various document and discourse factors associated with sentence deletion, using a new manually annotated sentence alignment corpus we collected.
Each random draw from our generative model is a neural network that instantiates the dynamic function, hence multiple draws would approximate the posterior, and the variance in the future prediction based on this posterior is used as an intrinsic reward for exploration.
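The idea of rewarding disagreement among sampled dynamics functions can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the sampled networks are replaced by plain Python callables, the name `intrinsic_reward` is hypothetical, and averaging the per-dimension variance is one assumed way to reduce the disagreement to a scalar.

```python
from statistics import pvariance
from typing import Callable, List, Sequence

# Stand-in for one sampled dynamics network: (state, action) -> next state.
Dynamics = Callable[[Sequence[float], Sequence[float]], List[float]]


def intrinsic_reward(models: Sequence[Dynamics],
                     state: Sequence[float],
                     action: Sequence[float]) -> float:
    """Mean per-dimension variance of the next-state predictions.

    High variance means the sampled models disagree, which is treated as a
    sign that the (state, action) pair is still poorly understood.
    """
    predictions = [model(state, action) for model in models]
    # zip(*predictions) groups the predictions by state dimension.
    per_dimension = [pvariance(values) for values in zip(*predictions)]
    return sum(per_dimension) / len(per_dimension)


if __name__ == "__main__":
    # Three toy "draws" that differ only in how strongly the action acts.
    draws = [lambda s, a, k=k: [s[0] + k * a[0]] for k in (1.0, 2.0, 3.0)]
    print(intrinsic_reward(draws, state=[0.0], action=[1.0]))  # models disagree
    print(intrinsic_reward(draws, state=[0.0], action=[0.0]))  # models agree
```

With a zero action the three toy models predict the same next state, so the reward vanishes; any action on which they differ is rewarded in proportion to the spread of their predictions.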
Natural language processing covers a wide variety of tasks predicting syntax, semantics, and information content, and usually each type of output is generated with specially designed architectures.
Ranked #1 on Relation Extraction on WLPC
In this paper, we consider a platform of flying mobile edge computing (F-MEC), where unmanned aerial vehicles (UAVs) serve as equipment providing computation resource, and they enable task offloading from user equipment (UE).
This extended abstract presents a visualization system, which is designed for domain scientists to visually understand their deep learning model of extracting multiple attributes in x-ray scattering images.
Virtual advertising is an important and promising feature in the area of online advertising.
Our results show that the mutual information between the context states and the states of interest can be an effective ingredient for overcoming challenges in robotic manipulation tasks with sparse rewards.
Quantized channel state information (CSI) plays a critical role in precoding design which helps reap the merits of multiple-input multiple-output (MIMO) technology.
Information Theory • Signal Processing
In this research, we propose CAMEL, a weakly supervised learning framework for histopathology image segmentation using only image-level labels.
Both the feedback controller and the iterative learning feed-forward controller are based on the aircraft acceleration model, which is directly measurable by the onboard accelerometer.
Systems and Control
Hashtags are often employed on social media and beyond to add metadata to a textual utterance with the goal of increasing discoverability, aiding search, or providing additional semantics.
Such techniques involve replacing an existing advertisement in a video frame with a new advertisement.
Most existing event extraction (EE) methods merely extract event arguments within the sentence scope.
Ranked #4 on Document-level Event Extraction on ChFinAnn
The rapid increase in the number of online videos provides the marketing and advertising agents ample opportunities to reach out to their audience.
In this paper we present our scientific discovery that good representation can be learned via continuous attention during the interaction between Unsupervised Learning (UL) and Reinforcement Learning (RL) modules driven by intrinsic motivation.
With the advent of faster internet services and growth of multimedia content, we observe a massive growth in the number of online videos.
Online video advertising gives content providers the ability to deliver compelling content, reach a growing audience, and generate additional revenue from online media.
In this paper, we propose a class of robust stochastic subgradient methods for distributed learning from heterogeneous datasets in the presence of an unknown number of Byzantine workers.
To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.
Therefore, to accelerate this research, we propose a new zero-shot transfer VQA (ZST-VQA) dataset by reorganizing the existing VQA v1.0 dataset in such a way that during training, some words appear only in one module (i.e., questions) but not in the other (i.e., answers).
Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.
Current lexical simplification approaches rely heavily on heuristics and corpus level features that do not always align with human judgment.
Then the whole scene is decomposed into moving foreground and static background by comparing the estimated optical flow and rigid flow derived from the depth and ego-motion.
The separation of the task requires defining a hand-crafted training goal in the affinity learning stage and a hand-crafted cost function in the data association stage, which prevents the tracking goals from being learned directly from the features.
The four types of information, i.e., 2D flow, camera pose, segment mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for rigid background and moving objects is recovered.
In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks.
Ranked #1 on Paraphrase Identification on 2017_test set
Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning.
Sentence pair modeling is critical for many NLP tasks, such as paraphrase identification, semantic textual similarity, and natural language inference.
The uniqueness of our design is a sensor fusion scheme which integrates camera videos, motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robustness and efficiency of the system.
The Conflux consensus protocol represents relationships between blocks as a directed acyclic graph and achieves consensus on a total order of the blocks.
Distributed, Parallel, and Cluster Computing
We make our annotated Wet Lab Protocol Corpus available to the research community.
Building intelligent agents that can communicate with and learn from humans in natural language is of great value.
The development of finger vein recognition algorithms heavily depends on large-scale real-world data sets.
Especially on the KITTI dataset, where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning.
Learning to reconstruct depths in a single image by watching unlabeled videos via a deep convolutional network (DCN) has attracted significant attention in recent years.
Several pioneering approaches have been proposed based on traffic observations of the target location as well as its adjacent regions, but they obtain somewhat limited accuracy because they do not exploit road topology.
As natural language processing research is growing and largely driven by the availability of data, we expanded research from news and small-scale dialog corpora to web and social media.
The main advantage of our method is its simplicity, as it gets rid of the classifier or human in the loop needed to select data before annotation and subsequent application of paraphrase identification algorithms in the previous work.
This paper presents two unsupervised learning layers (UL layers) for label-free video analysis: one for fully connected layers, and the other for convolutional ones.
We propose a dynamic computational time model to accelerate the average processing time for recurrent visual attention (RAM).
We believe that our results provide some preliminary insights on how to train an agent with similar abilities in a 3D environment.
We investigate the task of inferring conversational dependencies between messages in one-on-one online chat, which has become one of the most popular forms of customer service.
This paper presents the results of the Twitter Named Entity Recognition shared task associated with W-NUT 2016: a named entity tagging task with 10 teams participating.
In this paper, we propose a method to automatically and incrementally construct datasets from massive weakly labeled data of the target domain, which are readily available on the Internet, with the help of a pretrained face model.
However, we observe that directly feeding the hallucinated facial images into recognition models can even degrade the recognition performance despite the much better visualization quality.
While recent neural machine translation approaches have delivered state-of-the-art performance for resource-rich language pairs, they suffer from the data scarcity problem for resource-scarce language pairs.
While question answering (QA) with neural networks, i.e., neural QA, has achieved promising results in recent years, the lack of a large-scale real-world QA dataset is still a challenge for developing and evaluating neural QA systems.
While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation.
On the WMT'14 English-to-French task, we achieve BLEU=37.7 with a single attention model, which outperforms the corresponding single shallow model by 6.2 BLEU points.
Ranked #37 on Machine Translation on WMT2014 English-French
While deep convolutional neural networks (CNNs) have shown great success in single-label image classification, it is important to note that real-world images generally contain multiple labels, which could correspond to different objects, scenes, actions and attributes in an image.
In modern large-scale machine learning applications, the training data are often partitioned and stored on multiple machines.
Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus.
Ranked #7 on Text Simplification on TurkCorpus
While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to remember that the human visual cortex generally contains more feedback connections than forward connections.
The quality of the generated answers of our mQA model on this dataset is evaluated by human judges through a Turing Test.
ABC-CNN determines an attention map for an image-question pair by convolving the image feature map with configurable convolutional kernels derived from the question's semantics.
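The attention step described above can be sketched in plain Python for a single-channel feature map. This is a simplified illustration rather than the ABC-CNN architecture: the kernel is taken as given (how it is derived from the question is not shown here), the name `attention_map` is hypothetical, and zero padding, an odd-sized kernel, and softmax normalization are all assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def attention_map(features: Matrix, kernel: Matrix) -> Matrix:
    """Slide a (question-derived) kernel over a feature map, then apply a
    spatial softmax so the resulting weights sum to 1.

    Assumes an odd-sized kernel and zero padding, so the output has the same
    height and width as the input.
    """
    height, width = len(features), len(features[0])
    k_h, k_w = len(kernel), len(kernel[0])
    pad_h, pad_w = k_h // 2, k_w // 2

    scores = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    y, x = i + di - pad_h, j + dj - pad_w
                    # Positions outside the map contribute zero (padding).
                    if 0 <= y < height and 0 <= x < width:
                        total += features[y][x] * kernel[di][dj]
            scores[i][j] = total

    # Numerically stable softmax over all spatial positions.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    norm = sum(sum(row) for row in exps)
    return [[e / norm for e in row] for row in exps]


if __name__ == "__main__":
    toy_features = [[0.0, 0.0], [0.0, 5.0]]
    toy_kernel = [[1.0]]  # hypothetical kernel standing in for the question
    for row in attention_map(toy_features, toy_kernel):
        print(row)
```

In this toy case the bottom-right cell, where the feature response matches the kernel most strongly, receives nearly all of the attention weight.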
We adapt a state-of-the-art semantic image segmentation model, which we jointly train with multi-scale input images and the attention model.
In particular, we propose a transposed weight sharing scheme, which not only improves performance on image captioning, but also makes the model more suitable for the novel concept learning task.
In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions.
In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images.
Multiple damage identification in beams using curvature mode shape has become a research focus of increasing interest during the last few years.
We present MultiP (Multi-instance Learning Paraphrase Model), a new model suited to identify paraphrases within the short messages on Twitter.
Polyak and Juditsky (1992) showed that asymptotically the test performance of the simple average of the parameters obtained by stochastic gradient descent (SGD) is as good as that of the parameters which minimize the empirical cost.
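The averaging scheme referred to above keeps a running mean of the SGD iterates alongside the iterates themselves. The sketch below is a toy illustration on a one-dimensional quadratic with Gaussian gradient noise, not a reproduction of the cited analysis; the objective, step size, noise level, and the names `polyak_average` and `noisy_sgd` are all assumptions made for the example.

```python
import random
from typing import Iterable, Tuple


def polyak_average(iterates: Iterable[float]) -> float:
    """Running mean of the iterates, updated incrementally.

    Returns 0.0 for an empty sequence.
    """
    average = 0.0
    for k, theta in enumerate(iterates, start=1):
        average += (theta - average) / k
    return average


def noisy_sgd(target: float, steps: int, step_size: float,
              seed: int) -> Tuple[float, float]:
    """Run SGD on 0.5 * (theta - target)^2 with unit-variance gradient noise.

    Returns the last iterate and the simple average of all iterates.
    """
    rng = random.Random(seed)
    theta = 0.0
    trajectory = []
    for _ in range(steps):
        gradient = (theta - target) + rng.gauss(0.0, 1.0)
        theta -= step_size * gradient
        trajectory.append(theta)
    return theta, polyak_average(trajectory)


if __name__ == "__main__":
    last, averaged = noisy_sgd(target=3.0, steps=5000, step_size=0.1, seed=0)
    print("last iterate:", last)
    print("averaged iterate:", averaged)
```

With a constant step size the last iterate keeps fluctuating around the minimizer, while the average smooths out much of that noise, which is the intuition behind using the averaged parameters.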