We introduce Softmax Gradient Tampering, a technique for modifying the gradients in the backward pass of neural networks in order to enhance their accuracy.
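As a rough illustration of what changing the backward pass of a softmax classifier can look like, the sketch below computes the cross-entropy gradient with respect to the logits from a reshaped copy of the predicted probabilities. The power transform, the function names, and the `alpha` parameter are assumptions made for this example; the excerpt above does not specify the exact rule.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def standard_logit_grad(logits, target):
    """Usual cross-entropy gradient w.r.t. the logits: p - one_hot(target)."""
    p = softmax(logits)
    return [p[i] - (1.0 if i == target else 0.0) for i in range(len(p))]

def tampered_logit_grad(logits, target, alpha=0.5):
    """Assumed variant: raise the probabilities to the power alpha and
    renormalise before forming the gradient. alpha = 1 gives the usual rule."""
    p = softmax(logits)
    powered = [q ** alpha for q in p]
    total = sum(powered)
    reshaped = [q / total for q in powered]
    return [reshaped[i] - (1.0 if i == target else 0.0) for i in range(len(p))]
```

In this sketch the forward pass is unchanged; only the gradient sent back to the logits differs, and a value of `alpha` below 1 flattens the probabilities so that a confidently predicted target class still receives a larger update.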
Hence, a method is proposed to estimate the human intent from the high-dimensional HG motion signal and reconstruct the signal at the RH to ensure object relocation.
We utilise a ResNet18 encoder, learning features for prediction of both depth and pose.
We propose a novel method, OneShotAu2AV, to generate an animated video of arbitrary length using an audio clip and a single unseen image of a person as input.
The multi-modal adaptive normalization uses features from audio and video, such as the Mel spectrogram, pitch, and energy from the audio signal, the predicted keypoint heatmap or optical flow, and a single image, to learn the respective affine parameters and generate highly expressive video.
High-quality, few-shot multi-speaker speech synthesis that accounts for prosody is an area of active research with many real-world applications.
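The general idea of an adaptive normalization layer can be sketched as follows: the input features are normalized, then scaled and shifted by affine parameters predicted from a conditioning vector. The linear predictors, the function names, and the single flat conditioning vector are assumptions for illustration, not the architecture described in the excerpt.

```python
import math

def normalize(x, eps=1e-5):
    """Zero-mean, unit-variance normalization of a feature vector."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def linear(cond, weights, bias):
    """Hypothetical linear predictor: one output per row of weights."""
    return [sum(w * c for w, c in zip(row, cond)) + b
            for row, b in zip(weights, bias)]

def adaptive_norm(x, cond, gamma_w, gamma_b, beta_w, beta_b):
    """Normalize x, then apply a scale (gamma) and shift (beta) that are
    predicted from the conditioning features instead of being fixed."""
    x_hat = normalize(x)
    gamma = linear(cond, gamma_w, gamma_b)
    beta = linear(cond, beta_w, beta_b)
    return [g * v + b for g, v, b in zip(gamma, x_hat, beta)]
```

Here `cond` stands in for whatever audio or visual features drive the modulation, for example a concatenation of pitch and energy values.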
A chest radiograph, commonly called chest x-ray (CxR), plays a vital role in the diagnosis of various lung diseases, such as lung cancer, tuberculosis, pneumonia, and many more.
Methods: The proposed work explores the performance of convolutional neural networks (CNNs) for the diagnosis of TB in Indian chest x-ray images.
We have curated a large, novel Indian heritage monuments dataset comprising images of historical, cultural and religious importance with subtypes of eras, dynasties and architectural styles.
Recent advances in end-to-end unsupervised learning have significantly improved the performance of monocular depth prediction and alleviated the requirement of ground truth depth.
The spectral information in the infrared image is incorporated by adding the feature maps over several layers in the encoder part of the fusion structure, which makes inference on both the visual and infrared images separately.
Fall detection holds immense importance in the field of healthcare, where timely detection allows for instant medical assistance.
Data abundance along with scarcity of machine learning experts and domain specialists necessitates progressive automation of end-to-end machine learning workflows.
Using the RetinaNet model as our base, we modify the anchor scales to better detect small, densely distributed objects.
Since the optimization of SLNNs is still a challenge, we show that using SLAF along with standard activations (like ReLU) can provide performance improvements with only a small increase in the number of parameters.
Recent advances in deep learning are mostly driven by the availability of large amounts of training data.
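To make the effect of anchor scales concrete, the sketch below enumerates anchor shapes for one feature-pyramid level. The default scales and aspect ratios follow the standard RetinaNet configuration; the smaller scale set is a hypothetical choice for this example, since the excerpt does not state the values actually used.

```python
import math

def anchor_shapes(base_size, scales, ratios):
    """Return (width, height) pairs, one per scale/ratio combination.
    Each anchor has area (base_size * scale)^2 and ratio = height / width."""
    shapes = []
    for s in scales:
        area = (base_size * s) ** 2
        for r in ratios:
            w = math.sqrt(area / r)
            h = w * r
            shapes.append((w, h))
    return shapes

RATIOS = [0.5, 1.0, 2.0]
# Standard RetinaNet scales: three octave steps per pyramid level.
DEFAULT_SCALES = [2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3)]
# Hypothetical smaller scales, shifted down one octave for small objects.
SMALL_OBJECT_SCALES = [2 ** -1, 2 ** (-2 / 3), 2 ** (-1 / 3)]

default_anchors = anchor_shapes(32, DEFAULT_SCALES, RATIOS)
small_anchors = anchor_shapes(32, SMALL_OBJECT_SCALES, RATIOS)
```

With a base size of 32 pixels, the smallest default square anchor is 32 by 32, while the shifted set starts at 16 by 16, so small objects are more likely to reach the overlap threshold needed to be matched to an anchor.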
Extensive experiments on data sources obtained in Delhi demonstrate that the proposed adaptive attention-based bidirectional LSTM network outperforms several baselines for classification and regression models.
Plant phenomics based on imaging techniques can be used to monitor the health and diseases of plants and crops.
In this paper, we present a novel end-to-end coupled Denoising based Saliency Prediction with Generative Adversarial Network (DSAL-GAN) framework to address the problem of salient object detection in noisy images.
The method comprises a deep network for learning a permutation-invariant representation of 3D points.
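One common way to obtain a permutation-invariant representation of a point set is to apply the same function to every point and then pool with a symmetric operation such as a maximum. The sketch below assumes that construction, with a hypothetical single-layer per-point map; the excerpt does not say which architecture the method uses.

```python
def point_feature(point, weights):
    """Shared per-point map: a linear layer followed by ReLU.
    `weights` holds one row of coefficients per output feature."""
    return [max(0.0, sum(w * c for w, c in zip(row, point))) for row in weights]

def global_feature(points, weights):
    """Max-pool each feature over all points. The maximum ignores the
    order of its inputs, so the result is permutation invariant."""
    feats = [point_feature(p, weights) for p in points]
    return [max(column) for column in zip(*feats)]

# Hypothetical weights and a toy cloud of three 3D points.
WEIGHTS = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.0, 1.0, 1.0)]
shuffled = [cloud[2], cloud[0], cloud[1]]
same = global_feature(cloud, WEIGHTS) == global_feature(shuffled, WEIGHTS)
```

Reordering the points leaves the pooled vector unchanged, which is the property the representation needs because a point cloud has no canonical ordering.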
To circumvent this problem, we construct a pose-oblivious shape signature which is fed to a sequence learning framework.
This paper discusses the experiments performed for predicting the emotion intensity in tweets using a generalized supervised learning approach.
Object cosegmentation addresses the problem of discovering similar objects from multiple images and segmenting them as foreground simultaneously.
In this paper we propose an ensemble of local and deep features for object classification.
1 code implementation • 17 Oct 2017 • Li Yi, Lin Shao, Manolis Savva, Haibin Huang, Yang Zhou, Qirui Wang, Benjamin Graham, Martin Engelcke, Roman Klokov, Victor Lempitsky, Yuan Gan, Pengyu Wang, Kun Liu, Fenggen Yu, Panpan Shui, Bingyang Hu, Yan Zhang, Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Minki Jeong, Jaehoon Choi, Changick Kim, Angom Geetchandra, Narasimha Murthy, Bhargava Ramu, Bharadwaj Manda, M. Ramanathan, Gautam Kumar, P Preetham, Siddharth Srivastava, Swati Bhugra, Brejesh lall, Christian Haene, Shubham Tulsiani, Jitendra Malik, Jared Lafer, Ramsey Jones, Siyuan Li, Jie Lu, Shi Jin, Jingyi Yu, Qi-Xing Huang, Evangelos Kalogerakis, Silvio Savarese, Pat Hanrahan, Thomas Funkhouser, Hao Su, Leonidas Guibas
We introduce a large-scale 3D shape understanding benchmark using data and annotation from the ShapeNet 3D object database.
Quantification of physiological changes in plants can capture different drought mechanisms and assist in the selection of tolerant varieties in a high-throughput manner.
In this paper, we propose a novel object proposal generation scheme by formulating a graph-based salient edge classification framework that utilizes the edge context.
We test on unknown objects, which were not seen during training, and perform clustering in the learned embedding space of supervoxels to effectively perform novel object discovery.
The crux of the problem in KDD Cup 2016 involves developing data mining techniques to rank research institutions based on publications.
In this paper, we propose a novel approach for feature generation by appropriately fusing KAZE and SIFT features.
While computing similarity between users, we make use of a combined similarity measure involving rating overlap as well as similarity in the latent topic space.
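A combined user similarity of this kind can be sketched as a weighted sum of two terms: how much the users' rated items overlap, and how close their latent topic vectors are. The Jaccard overlap, the cosine measure, and the `weight` parameter below are assumptions chosen for illustration; the excerpt does not give the exact formula.

```python
import math

def rating_overlap(items_a, items_b):
    """Jaccard overlap between the sets of items two users have rated."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(u, v):
    """Cosine similarity between two latent topic vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def combined_similarity(items_a, items_b, topics_a, topics_b, weight=0.5):
    """Blend the two signals; `weight` is a hypothetical mixing parameter."""
    overlap = rating_overlap(items_a, items_b)
    topical = cosine(topics_a, topics_b)
    return weight * overlap + (1.0 - weight) * topical
```

The topic term lets two users who share few rated items still count as similar when their interests lie in the same latent topics, which helps when the rating matrix is sparse.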