Recently, data-driven inertial navigation approaches have demonstrated that well-trained neural networks can obtain accurate position estimates from inertial measurement unit (IMU) measurements.
no code implementations • 1 Oct 2021 • Yan Xia, Linhui Jiang, Lu Wang, Xue Chen, Jianjie Ye, Tangyan Hou, Liqiang Wang, Yibo Zhang, Mengying Li, Zhen Li, Zhe Song, Yaping Jiang, Weiping Liu, Pengfei Li, Daniel Rosenfeld, John H. Seinfeld, Shaocai Yu
Our results show that the ORRS measurements, assisted by the machine-learning-based ensemble model developed here, can realize day-to-day supervision of on-road vehicle-specific emissions.
Next, we use simulated observation sequences to query the simulation system to retrieve simulated projection sequences as knowledge.
This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for transfer learning to a downstream task.
In this paper, we address this limitation with an efficient learning objective that considers the discriminative feature distributions between the visual objects and sentence words.
They fail to improve object detectors in their vanilla forms due to the domain gap between the Web images and curated datasets.
White-box adversarial perturbations are sought via iterative optimization algorithms, most often by minimizing an adversarial loss on an $l_p$ neighborhood of the original image, the so-called distortion set.
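The iterative search described above can be illustrated with a minimal sketch. It assumes a toy linear classifier, an $l_\infty$ distortion set, and the classification margin as the adversarial loss; the function names, weights, and step sizes are hypothetical and are not taken from the paper.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def project_linf(delta, eps):
    """Clip each coordinate so delta stays inside the l_inf ball of radius eps."""
    return [max(-eps, min(eps, d)) for d in delta]


def margin(x, y, w, b):
    """Adversarial loss (assumed here): the signed margin y * (w.x + b)."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def iterative_attack(x, y, w, eps, step, iters):
    """Minimize the margin of a linear classifier over the distortion set."""
    delta = [0.0] * len(x)
    for _ in range(iters):
        # For a linear model the input gradient of the margin is y * w,
        # regardless of the current point.
        grad = [y * wi for wi in w]
        # Signed descent step, then projection back onto the distortion set.
        delta = [d - step * sign(g) for d, g in zip(delta, grad)]
        delta = project_linf(delta, eps)
    return delta


# Hypothetical toy model and input.
w, b = [1.0, -2.0, 0.5], 0.5
x, y = [0.2, 0.1, -0.3], 1
delta = iterative_attack(x, y, w, eps=0.1, step=0.05, iters=5)
x_adv = [xi + di for xi, di in zip(x, delta)]
```

The projection step is what keeps the perturbation inside the distortion set; a different $l_p$ norm would only change `project_linf`.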
The other is that the number of images used for the knowledge distillation should be small; otherwise, it violates our expectation of reducing the dependence on large-scale datasets.
We propose a new task toward more practical applications of image generation: high-quality image synthesis from salient object layout.
Object frequency in the real world often follows a power law, leading to a mismatch between datasets with long-tailed class distributions seen by a machine learning model and our expectation of the model to perform well on all classes.
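To make the mismatch concrete, the sketch below generates per-class sample counts from a power law. The exponent, class count, and head-class size are illustrative assumptions, not values from the paper.

```python
def long_tailed_counts(num_classes, max_count, exponent):
    """Per-class counts following max_count * k**(-exponent) for k = 1..num_classes."""
    return [
        max(1, int(max_count * k ** (-exponent)))
        for k in range(1, num_classes + 1)
    ]


# Hypothetical dataset: 10 classes, 1000 samples in the most frequent class.
counts = long_tailed_counts(num_classes=10, max_count=1000, exponent=1.0)

# Ratio between the most and least frequent class.
imbalance_ratio = counts[0] / counts[-1]

# Fraction of all samples that belong to the two most frequent classes.
head_share = sum(counts[:2]) / sum(counts)
```

A model trained on such counts sees the head classes far more often than the tail classes, while evaluation typically weights all classes equally.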
Ranked #23 on Long-tail Learning on Places-LT
Speaker diarization, the task of finding the speech segments of specific speakers, has been widely used in human-centered applications such as video conferences and human-computer interaction systems.
We evaluate the effectiveness of traditional attack methods such as FGSM and PGD. The results show that A-GEM still possesses strong continual learning ability in the presence of adversarial examples in the memory, and that simple defense techniques such as label smoothing can further alleviate the adversarial effects.
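Label smoothing, mentioned above as a simple defense, is commonly formulated as mixing the one-hot target with a uniform distribution. The sketch below shows that common formulation; the smoothing factor `alpha` is an assumed value, and the paper's exact configuration is not given here.

```python
import math


def smooth_labels(target, num_classes, alpha):
    """Mix a one-hot target with the uniform distribution over classes."""
    uniform = alpha / num_classes
    return [
        (1.0 - alpha) * (1.0 if k == target else 0.0) + uniform
        for k in range(num_classes)
    ]


def cross_entropy(probs, soft_target):
    """Cross-entropy between predicted probabilities and a soft target."""
    return -sum(t * math.log(p) for t, p in zip(soft_target, probs))


# Hypothetical 4-class example with true class 1 and alpha = 0.1.
soft = smooth_labels(target=1, num_classes=4, alpha=0.1)
loss = cross_entropy([0.1, 0.7, 0.1, 0.1], soft)
```

Because the smoothed target never assigns full probability to a single class, one mislabeled or adversarial example pulls the model less strongly toward an extreme prediction.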
Fine-tuning is a popular transfer learning technique for deep neural networks where a few rounds of training are applied to the parameters of a pre-trained model to adapt them to a new task.
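As a minimal sketch of the idea, the example below starts from "pre-trained" parameters of a one-dimensional linear model and applies a few rounds of gradient descent on data from a new task. The model, data, learning rate, and round count are all hypothetical stand-ins for a deep network and a real dataset.

```python
def mse(params, data):
    """Mean squared error of the linear model y = w * x + b on data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fine_tune(params, data, lr, rounds):
    """Run a few rounds of full-batch gradient descent from pre-trained parameters."""
    w, b = params
    n = len(data)
    for _ in range(rounds):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical pre-trained parameters and a small new-task dataset (y = 2.5x + 1).
pretrained = (2.0, 0.0)
new_task = [(0.0, 1.0), (1.0, 3.5), (2.0, 6.0), (3.0, 8.5)]
adapted = fine_tune(pretrained, new_task, lr=0.05, rounds=5)
```

The key point is the initialization: training begins from parameters that already solve a related task, so only a few rounds are needed.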
As deep neural networks (DNNs) have become increasingly important and popular, the robustness of DNNs is the key to the safety of both the Internet and the physical world.
In this work, we show that shrinking the model size through proper weight pruning can even be helpful to improve the DNN robustness under adversarial attack.
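The entry above does not say which pruning scheme is used. As an illustration only, the sketch below applies magnitude pruning, one widely used form of weight pruning, to a flat list of hypothetical weights.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest absolute value."""
    num_pruned = int(len(weights) * sparsity)
    # Indices sorted by increasing magnitude; the first num_pruned are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_pruned])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


# Hypothetical weight vector; half of the entries are pruned.
weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.6]
sparse = magnitude_prune(weights, sparsity=0.5)
```

In practice pruning is usually followed by retraining of the remaining weights, which this sketch omits.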
Compared with image inpainting, performing this task on video presents new challenges, such as how to preserve temporal consistency and spatial details and how to handle arbitrary input video sizes and lengths quickly and efficiently.
Powerful adversarial attack methods are vital for understanding how to construct robust deep neural networks (DNNs) and for thoroughly testing defense techniques.
Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series.
To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that every limit point is a critical point for both bounded and unbounded delays.
A model aware of the relationships between different domains can also be trained to work on new domains with fewer resources.
The success of deep neural networks often relies on a large amount of labeled examples, which can be difficult to obtain in many real scenarios.
With vast amounts of video content being uploaded to the Internet every minute, video summarization becomes critical for efficient browsing, searching, and indexing of visual content.
The large volume of video content and high viewing frequency demand automatic video summarization algorithms, of which a key property is the capability of modeling diversity.
Despite their impact on a variety of problems and applications, generative adversarial nets (GANs) are remarkably difficult to train.
In the first stage, we identify a small portion of images from the noisy training set whose labels are correct with high probability.
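The entry does not state how the likely-correct images are identified. One plausible selection rule, shown here purely as an assumption, keeps the samples for which a model's predicted probability of the given (possibly noisy) label exceeds a threshold; the probabilities and threshold below are made up.

```python
def select_confident(samples, threshold):
    """Return indices whose predicted probability for the given label is at least threshold."""
    return [
        i
        for i, (probs, label) in enumerate(samples)
        if probs[label] >= threshold
    ]


# Hypothetical (predicted class probabilities, noisy label) pairs.
samples = [
    ([0.95, 0.03, 0.02], 0),  # prediction agrees strongly with the label
    ([0.40, 0.35, 0.25], 1),  # uncertain prediction
    ([0.05, 0.92, 0.03], 1),  # prediction agrees strongly with the label
    ([0.10, 0.10, 0.80], 0),  # prediction contradicts the label
]
clean = select_confident(samples, threshold=0.9)
```

A high threshold keeps the selected subset small but raises the chance that its labels are correct, which matches the trade-off described in the sentence above.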