Large language models internalize enormous parametric knowledge during pre-training.
Intensity-modulation and direct-detection (IM/DD) transmission is widely adopted for high-speed optical transmission scenarios due to its cost-effectiveness and simplicity.
Recent advancements in large language models (LLMs) have led to the creation of intelligent agents capable of performing complex tasks.
1 code implementation • 17 Nov 2023 • Thomas M. Moerland, Matthias Müller-Brockhausen, Zhao Yang, Andrius Bernatavicius, Koen Ponse, Tom Kouwenhoven, Andreas Sauter, Michiel van der Meer, Bram Renting, Aske Plaat
To solve this issue, we introduce EduGym, a set of educational reinforcement learning environments and associated interactive notebooks tailored for education.
Firstly, they cannot directly generate coherent motions and require additional operations such as interpolation to process the generated actions.
While deep reinforcement learning has shown important empirical success, it tends to learn relatively slowly due to the slow propagation of reward information and the slow updates of parametric neural networks.
Referring image segmentation segments an image from a language expression.
In this paper, we present a clear ablation study of post-exploration in a general intrinsically motivated goal exploration process (IMGEP) framework, which the Go-Explore paper did not provide.
Therefore, this paper introduces Continuous Episodic Control (CEC), a novel non-parametric episodic memory algorithm for sequential decision making in problems with a continuous action space.
Finally, a flexible Android malware detection model based on GANs with code tensor (MTFD-GANs) is proposed.
Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire a single cognitive ability from a single molecular modality.
Ranked #6 on Molecule Captioning on ChEBI-20
Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards.
Based on the above intuition, we first investigate types of end-to-end encoder-decoder models in the single-input dual-output (SIDO) multi-task framework, and then propose a novel asynchronous decoding method with fuzzy Pinyin sampling, based on the one-to-one correspondence between Pinyin and characters.
Validations on the gait recognition metric CASIA-B dataset further demonstrated the capability of our hybrid model.
Referring image segmentation is a fundamental vision-language task that aims to segment out an object referred to by a natural language expression from an image.
In this paper, we investigate the problem of video object segmentation from referring expressions (VOSRE).
Ranked #1 on Referring Expression Segmentation on J-HMDB (Precision@0.9 metric)
Meanwhile, since the reasoning process of deep models is inaccessible, researchers design various evaluation methods to demonstrate their arguments.
While previous work has investigated the use of expert knowledge to generate potential functions, in this work we study whether a search algorithm (A*) can automatically generate a potential function for reward shaping in Sokoban, a well-known planning task.
The delayed feedback problem is one of the imperative challenges in online advertising, which is caused by the highly diversified feedback delay of a conversion varying from a few minutes to several days.
Deep learning models have achieved great success on the task of Natural Language Inference (NLI), though only a few attempts try to explain their behaviors.
In reinforcement learning, learning actions for a behavior policy that can be applied to new environments is still a challenge, especially for tasks that involve much planning.
On the one hand, it utilizes UI relations and user neighborhood to capture both global and local information.
In this paper, we propose Helios, a heterogeneity-aware FL framework to tackle the straggler issue.
Distributed, Parallel, and Cluster Computing
Unsupervised video object segmentation has often been tackled by methods based on recurrent neural networks and optical flow.
Ranked #17 on Unsupervised Video Object Segmentation on DAVIS 2016 val
With the joint supervision of Cross-Entropy (CE) loss and HC loss, the network is trained to achieve two vital objectives: inter-class discrepancy and intra-class cross-modality similarity, each as large as possible.
In this paper, we model user behavior using an interest delay model, carefully study the embedding mechanism, and obtain two important results: (i) We theoretically prove that a small aggregation radius of the embedding vectors of items that belong to the same user interest domain results in good generalization performance of the deep CTR model.