Lighter models and faster inference are the focus of current single image super-resolution (SISR) research.
A line-of-sight (LoS) path is essential for the reliability of air-to-ground (A2G) communications, but the existence of a LoS path is difficult to predict due to random obstacles on the ground.
Then, we design three specific networks, i.e., Global-Net, Semantic-Net and Expression-Net, to extract distinct emotional features from different stimuli simultaneously.
Images from many remote sensing satellites, including MODIS, Landsat-8, Sentinel-1, and Sentinel-2, are utilized in the experiments.
Computer vision tasks such as object detection and semantic/instance segmentation rely on the painstaking annotation of large training datasets.
In this paper, we systematically investigate the coupling of model-driven and data-driven methods, which has rarely been considered in the remote sensing image restoration and fusion communities.
Recent progress in 3D object detection from single images leverages monocular depth estimation as a way to produce 3D pointclouds, turning cameras into pseudo-lidar sensors.
Ranked #1 on Monocular 3D Object Detection on KITTI Cars Moderate (using extra training data)
Micro-expressions are spontaneous, unconscious facial movements that show people's true inner emotions and have great potential in related fields of psychological testing.
UGV-KPNet is computationally efficient with a small number of parameters and provides pixel-level accurate keypoint detection results in real time.
The ultra-fast optical pulse functions as a tweezer that collects samples of the signals at very low sampling rates, and each sample is short enough to maintain the statistical properties of the signals.
Integrated sensing and communication (ISAC) is a promising technology to improve the band-utilization efficiency via spectrum sharing or hardware sharing between radar and communication systems.
The data fusion technology aims to aggregate the characteristics of different data and obtain products with multiple data advantages.
3D semantic scene completion and 2D semantic segmentation are two tightly correlated tasks that are both essential for indoor scene understanding, because they predict the same semantic classes, using positively correlated high-level features.
Visual Emotion Analysis (VEA) has attracted increasing attention recently with the prevalence of sharing images on social networks.
In this paper, we introduce a new acoustic leakage dataset of gas pipelines, called GPLA-12, which has 12 categories over 684 training/testing acoustic signals.
Instead of adopting an iterative strategy, we make the blur kernel predictor trainable within the whole blind SR model, in which AMNet is well-trained.
Inspired by the theory of optimal control, we optimize the body states such that the simulated cloth motion is matched to the point cloud measurements, and the analytic gradient of the simulator is back-propagated to update the body states.
We use a hierarchical Lovász hinge loss to learn a low-dimensional embedding space structured into a unified semantic and instance hierarchy without requiring separate network branches or object proposals.
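As a rough illustration of the loss family named above, the following is a minimal sketch of the flat, binary Lovász hinge in plain Python. It is not the hierarchical variant described in the abstract, and the function names `lovasz_grad` and `lovasz_hinge` are chosen here for illustration only.

```python
def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss.

    gt_sorted: 0/1 labels ordered by descending hinge error.
    """
    total_pos = sum(gt_sorted)
    grad = []
    cum_pos = 0
    cum_neg = 0
    prev_jaccard = 0.0
    for g in gt_sorted:
        cum_pos += g
        cum_neg += 1 - g
        intersection = total_pos - cum_pos
        union = total_pos + cum_neg  # always >= 1 for non-empty input
        jaccard = 1.0 - intersection / union
        grad.append(jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return grad


def lovasz_hinge(logits, labels):
    """Binary Lovász hinge for flat lists of logits and 0/1 labels."""
    if not logits:
        return 0.0
    # Hinge errors: 1 - logit * sign, with sign = +1 for label 1, -1 for label 0.
    errors = [1.0 - z * (2 * y - 1) for z, y in zip(logits, labels)]
    # Sort pixel indices by descending error.
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    grad = lovasz_grad([labels[i] for i in order])
    # Only positive errors contribute (ReLU), weighted by the Lovász gradient.
    return sum(max(errors[i], 0.0) * g for i, g in zip(order, grad))
```

Predictions that are correct with a margin of at least 1 give zero loss, so the loss only penalises pixels that are misclassified or inside the margin.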
Semi-supervised learning acts as an effective way to leverage massive unlabeled data.
We argue this is due to the lack of rich information in the probability prediction and the overfitting caused by hard labels.
Although wireless channel models can be adopted for dataset generation, current channel models are mostly designed for communication rather than sensing.
Inspired by the common painting process of drawing a draft and revising the details, we introduce a novel feed-forward method named Laplacian Pyramid Network (LapStyle).
Simulators can efficiently generate large amounts of labeled synthetic data with perfect supervision for hard-to-label tasks like semantic segmentation.
We then propose a novel Transitive Learning method for blind Super-Resolution on transitive degradations (TLSR), by adaptively inferring a transitive transformation function to solve the unknown degradations without any iterative operations in inference.
In this work, we introduce an end-to-end trainable approach for joint object detection and tracking that is capable of such reasoning.
The equivalence holds given certain conditions about initial state distributions and policy formats, in which the system state is the estimation error, control input is the filter gain, and control objective function is the accumulated estimation error.
MPG contains two types of PG: 1) data-driven PG, which is obtained by directly calculating the derivative of the learned Q-value function with respect to actions, and 2) model-driven PG, which is calculated using BPTT based on the model-predictive return.
COVID-19, caused by the novel coronavirus (SARS-CoV-2), is an ongoing pandemic.
In this paper, we put forward a novel density-oriented PointNet (DPointNet) for 3D object detection in point clouds, in which the density of points increases layer by layer.
The applications of Normalized Difference Vegetation Index (NDVI) time-series data are inevitably hampered by cloud-induced gaps and noise.
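For context, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red). The sketch below shows that formula together with a simple linear interpolation over missing values in a time series. The gap filling is only an illustrative baseline, not the method of the paper, and the helper names `ndvi` and `fill_gaps` are hypothetical.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance."""
    denom = nir + red
    if denom == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / denom


def fill_gaps(series):
    """Fill None entries (e.g. cloud-masked dates) by linear interpolation.

    Gaps at the start or end are filled with the nearest valid value.
    This is an assumed baseline for illustration only.
    """
    filled = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        return filled  # nothing to interpolate from
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled
```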
Multi-scale biomedical knowledge networks are expanding with emerging experimental technologies that generate multi-scale biomedical big data.
The proposed method is a filter-based feature selection method, which directly utilises the Menger Curvature for ranking all the attributes in the given data set.
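The Menger curvature of three points is the reciprocal of the radius of the circle through them, equal to 4A / (|ab| |bc| |ca|), where A is the triangle area. A minimal 2D sketch follows. How the curvature values are aggregated to rank attributes is not stated in this snippet, so only the curvature itself is shown, and returning 0 for coincident points is an assumption made here.

```python
import math


def menger_curvature(a, b, c):
    """Menger curvature of three 2D points given as (x, y) tuples."""
    # Twice the signed triangle area via the 2D cross product.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    area = abs(cross) / 2.0
    sides = math.dist(a, b) * math.dist(b, c) * math.dist(c, a)
    if sides == 0:
        return 0.0  # assumed convention when two or more points coincide
    return 4.0 * area / sides
```

Three points on the unit circle give a curvature of 1, and three collinear points give 0.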
Second, we propose to learn a metric that combines the Mahalanobis and feature distances when comparing a track and a new detection in data association.
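To make the idea of combining the two distances concrete, here is a minimal sketch that mixes a 2D Mahalanobis distance with a cosine feature distance using a fixed weight. The abstract describes a learned metric, so the fixed weight `alpha`, the choice of cosine distance, and all function names are assumptions made for illustration.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2D point from a track state (mean, 2x2 cov)."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(q)


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
    return 1.0 - dot / norm


def combined_distance(det_pos, track_mean, track_cov, det_feat, track_feat, alpha=0.5):
    """Weighted sum of motion and appearance distances (alpha is hypothetical)."""
    motion = mahalanobis_2d(det_pos, track_mean, track_cov)
    appearance = cosine_distance(det_feat, track_feat)
    return alpha * motion + (1.0 - alpha) * appearance
```

In data association, a cost matrix of such distances between every track and every new detection would typically be passed to an assignment step.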
A new multifunctional 2D material is theoretically predicted based on systematic ab-initio calculations and model simulations for the honeycomb lattice of endohedral fullerene W@C28 molecules.
Materials Science Computational Physics
We study a multi-resonator optomechanical system, consisting of two SiN membranes coupled to a single optical cavity mode.
Optics Mesoscale and Nanoscale Physics
We measure a highest rotation frequency of about 4.3 GHz for the trapped nanoparticle without feedback cooling and a 6 GHz rotation with feedback cooling, which is the fastest mechanical rotation reported to date.
Optics Mesoscale and Nanoscale Physics Quantum Physics
Our work provides insights into the latest deployment of M-BGP in a major ISP network and it highlights the characteristics and effectiveness of M-BGP as a means to realize load sharing.
Networking and Internet Architecture
In practice, an initial semantic segmentation (SS) of a single sweep point cloud can be achieved by any appealing network and then flows into the semantic scene completion (SSC) module as the input.
Ranked #14 on LIDAR Semantic Segmentation on nuScenes
Due to the uncertainty in the distortion variation, restoring distorted images caused by liquify filter is a challenging task.
1 code implementation • 28 Oct 2020 • Steve Bryson, Michelle Kunimoto, Ravi K. Kopparapu, Jeffrey L. Coughlin, William J. Borucki, David Koch, Victor Silva Aguirre, Christopher Allen, Geert Barentsen, Natalie. M. Batalha, Travis Berger, Alan Boss, Lars A. Buchhave, Christopher J. Burke, Douglas A. Caldwell, Jennifer R. Campbell, Joseph Catanzarite, Hema Chandrasekharan, William J. Chaplin, Jessie L. Christiansen, Jorgen Christensen-Dalsgaard, David R. Ciardi, Bruce D. Clarke, William D. Cochran, Jessie L. Dotson, Laurance R. Doyle, Eduardo Seperuelo Duarte, Edward W. Dunham, Andrea K. Dupree, Michael Endl, James L. Fanson, Eric B. Ford, Maura Fujieh, Thomas N. Gautier III, John C. Geary, Ronald L Gilliland, Forrest R. Girouard, Alan Gould, Michael R. Haas, Christopher E. Henze, Matthew J. Holman, Andrew Howard, Steve B. Howell, Daniel Huber, Roger C. Hunter, Jon M. Jenkins, Hans Kjeldsen, Jeffery Kolodziejczak, Kipp Larson, David W. Latham, Jie Li, Savita Mathur, Soren Meibom, Chris Middour, Robert L. Morris, Timothy D. Morton, Fergal Mullally, Susan E. Mullally, David Pletcher, Andrej Prsa, Samuel N. Quinn, Elisa V. Quintana, Darin Ragozzine, Solange V. Ramirez, Dwight T. Sanderfer, Dimitar Sasselov, Shawn E. Seader, Megan Shabram, Avi Shporer, Jeffrey C. Smith, Jason H. Steffen, Martin Still, Guillermo Torres, John Troeltzsch, Joseph D. Twicken, Akm Kamal Uddin, Jeffrey E. Van Cleve, Janice Voss, Lauren Weiss, William F. Welsh, Bill Wohler, Khadeejah A Zamudio
We present occurrence rates for rocky planets in the habitable zones (HZ) of main-sequence dwarf stars based on the Kepler DR25 planet candidate catalog and Gaia-based stellar properties.
Earth and Planetary Astrophysics Solar and Stellar Astrophysics
To solve them, we propose a purposeful and interpretable detail-fidelity attention network that progressively processes smooth components and details in a divide-and-conquer manner. This offers a novel and specific perspective on image super-resolution aimed at improving detail fidelity, instead of blindly designing or employing deep CNN architectures merely for feature representation in local receptive fields.
Random selection based defenses can achieve certified robustness by averaging the classifiers' predictions on the sub-datasets sampled from the training set.
The experimental results confirm that the TC can help LsrKD and MrKD boost training, especially on the networks where they fail.
We present an improved version of PointRCNN for 3D object detection, in which a multi-branch backbone network is adopted to handle the non-uniform density of point clouds.
On the DOTA dataset, CenterFPANet achieves an mAP of 64.00% at 22.2 FPS, which is close to the accuracy of the anchor-based methods currently used while being much faster.
However, most existing methods rely on heuristically defined anchors with different scales, angles and aspect ratios, and usually suffer from severe misalignment between anchor boxes and axis-aligned convolutional features, which leads to the common inconsistency between the classification score and localization accuracy.
In autonomous driving, accurately estimating the state of surrounding obstacles is critical for safe and robust path planning.
Nanomagnets with giant magnetic anisotropy energy and long coherence time are desired for various technological innovations such as quantum information processing and storage.
Materials Science Computational Physics
The simulation results show that the TPI algorithm can converge to the optimal solution for the linear plant, and has high resistance to disturbances for the nonlinear plant.
Deep convolutional neural networks have attracted much attention in large-scale visual classification tasks and achieve significant performance improvements compared to traditional visual analysis methods.
Specifically, we devise a partial segment loss regarded as a loss sampling to learn integral action parts from labeled segments.
Further, we propose a context encoding module to utilize the global predictor from the error map to enhance the feature representation and regularize the networks.
Heterogeneous face recognition (HFR) refers to matching face images acquired from different domains with wide applications in security scenarios.
For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.
In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wise.
This paper investigates various applications of big data analytics, especially machine learning algorithms in wireless communications and channel modeling.
Instead of using semantic labels and proxy losses in a multi-task approach, we propose a new architecture leveraging fixed pretrained semantic segmentation networks to guide self-supervised representation learning via pixel-adaptive convolutions.
This paper tackles the problem of data fusion in the semantic scene completion (SSC) task, which can simultaneously deal with semantic labeling and scene completion.
In the SR processing, we first generated a group of FACs from the input LR face, and then reconstructed the HR face from this group of FACs.
Current state-of-the-art CNN methods usually treat the VSR problem as a large number of separate multi-frame super-resolution tasks, in which a batch of low-resolution (LR) frames is used to generate a single high-resolution (HR) frame, and running a sliding window to select LR frames over the entire video yields a series of HR frames.
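The sliding-window frame selection described above can be sketched as follows. Each output HR frame gets a window of neighbouring LR frame indices. The window radius and the border handling (repeating the first or last frame) are assumptions made for this sketch, and `sliding_windows` is a hypothetical helper name.

```python
def sliding_windows(num_frames, radius):
    """Return, for each target frame t, the LR frame indices t-radius..t+radius.

    Indices that fall outside the video are clamped to the nearest valid
    frame, which is one common border convention (assumed here).
    """
    windows = []
    for t in range(num_frames):
        window = [min(max(t + k, 0), num_frames - 1)
                  for k in range(-radius, radius + 1)]
        windows.append(window)
    return windows


# Example: a 4-frame video with one neighbour on each side.
for target, window in enumerate(sliding_windows(4, 1)):
    print(target, window)
```

Because every window is processed independently, each LR frame is read 2 * radius + 1 times, which is the redundancy that treating the video as separate multi-frame tasks implies.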
Besides, we devise a geometrical alignment constraint item to compensate for the pixel-based distance between prediction features and ground-truth ones.
Ranked #1 on Facial Inpainting on FFHQ
We propose PALNet, a novel hybrid network for SSC based on single depth.
Reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks.
Panoptic segmentation is a complex full scene parsing task requiring simultaneous instance and semantic segmentation at high resolution.
However, plenty of studies have shown that global information is crucial for image restoration tasks like image demosaicing and enhancing.
1 code implementation • 18 Nov 2019 • Andreas Lugmayr, Martin Danelljan, Radu Timofte, Manuel Fritsche, Shuhang Gu, Kuldeep Purohit, Praveen Kandula, Maitreya Suin, A. N. Rajagopalan, Nam Hyung Joon, Yu Seung Won, Guisik Kim, Dokyeong Kwon, Chih-Chung Hsu, Chia-Hsiang Lin, Yuanfei Huang, Xiaopeng Sun, Wen Lu, Jie Li, Xinbo Gao, Sefi Bell-Kligler
For training, only one set of source input images is therefore provided in the challenge.
In this paper, a novel spatiotemporal transformation deep learning method called Trident Segmentation CNN (TS-CNN) is proposed to segment PWML in MR images.
Learning depth and camera ego-motion from raw unlabeled RGB video streams is seeing exciting progress through self-supervision from strong geometric cues.
Dense depth estimation from a single image is a key problem in computer vision, with exciting applications in a multitude of robotic tasks.
As for the pose tracker, we propose a visual odometry system fusing both the feature matching and the virtual LiDAR scan matching results.
On the other hand, feature fusion modules are designed to combine different modalities of semantic features, leveraging the information from both inputs for better accuracy.
We present the design and implementation of a visual search system for real-time image retrieval on JD.com, the world's third largest and China's largest e-commerce site.
We introduce a novel single-shot object detector to ease the imbalance of foreground-background class by suppressing the easy negatives while increasing the positives.
In this paper, we propose a novel perceptual image super-resolution method that progressively generates visually high-quality results by constructing a stage-wise network.
Ranked #3 on Image Super-Resolution on Manga109 - 4x upscaling
In addition, we introduce a novel three-stage learning approach which enables the (cognitive) encoder to gradually distill useful knowledge from the paired (visual) encoder during the learning process.
In this paper, we construct an efficient two-stage PWML semantic segmentation network based on the characteristics of the lesion, called refined segmentation R-CNN (RS RCNN).
3D image segmentation is one of the most important and ubiquitous problems in medical image processing.
We train the deep encoder-decoder for landmark detection, and combine global landmark configuration with local high-resolution feature responses.
Because of data duplication, database decentralization, weak data relations, and sluggish data updates, the power asset management system is eager to adopt a new strategy to avoid information loss and bias, and to improve data storage efficiency and the extraction process.
No-reference image quality assessment (NR-IQA) aims to measure image quality without a reference image.
In this paper, we propose a deep quantization approach, which is among the early attempts of leveraging deep neural networks into quantization-based cross-modal similarity search.
Short-term load forecasting (STLF) is essential for the reliable and economic operation of power systems.
RGB images differ from depth images in that they carry more detail about color and texture, which can serve as a vital complement to depth for boosting the performance of 3D semantic scene completion (SSC).
We formulate the building thermal control as a cost-minimization problem which jointly considers the energy consumption of HVAC and the thermal comfort of the occupants.
We propose an end-to-end learning approach for panoptic segmentation, a novel task unifying instance (things) and semantic (stuff) segmentation.
Ranked #14 on Panoptic Segmentation on Cityscapes val (using extra training data)
1 code implementation • 14 Nov 2018 • Avi Shporer, Ian Wong, Chelsea X. Huang, Michael R. Line, Keivan G. Stassun, Tara Fetherolf, Stephen R. Kane, Luke G. Bouma, Tansu Daylan, Maximilian N. Guenther, George R. Ricker, David W. Latham, Roland Vanderspek, Sara Seager, Joshua N. Winn, Jon M. Jenkins, Ana Glidden, Zach Berta-Thompson, Eric B. Ting, Jie Li, Kari Haworth
The phase curve includes the transit, secondary eclipse, and sinusoidal modulations across the orbital phase shaped by the planet's atmospheric characteristics and the star-planet gravitational interaction.
Earth and Planetary Astrophysics Solar and Stellar Astrophysics
To close the gap, we propose an efficient tattoo search approach that is able to learn tattoo detection and compact representation jointly in a single convolutional neural network (CNN) via multi-task learning.
Deep Learning for Computer Vision depends mainly on the source of supervision. Photo-realistic simulators can generate large-scale automatically labeled synthetic data, but introduce a domain gap negatively impacting performance.
The existence of hybrid noise in hyperspectral images (HSIs) severely degrades the data quality, reduces the interpretation accuracy of HSIs, and restricts the subsequent HSIs applications.
To capture more informative features and maintain long-term information for image super-resolution, we propose a channel-wise and spatial feature modulation (CSFM) network in which a sequence of feature-modulation memory (FMM) modules is cascaded with a densely connected structure to transform low-resolution features to high informative features.
In the field of spatial-spectral fusion, the model-based method and the deep learning (DL)-based method are state-of-the-art.
In this work, we propose a new method for multi-person pose estimation which combines the traditional bottom-up and the top-down methods.
Aspect sentiment classification, a challenging task in sentiment analysis, has been attracting more and more attention in recent years.
Hyperspectral image (HSI) denoising is a crucial preprocessing procedure to improve the performance of the subsequent HSI interpretation and applications.
Based on the extended fingerprint database, the accuracy of the indoor localization system can be improved with reduced human effort.
Networking and Internet Architecture
To improve information flow and to capture sufficient knowledge for reconstructing the high-frequency details, we propose a cascaded multi-scale cross network (CMSC) in which a sequence of subnetworks is cascaded to infer high resolution features in a coarse-to-fine manner.
1 code implementation • 18 Oct 2017 • Susan E. Thompson, Jeffrey L. Coughlin, Kelsey Hoffman, Fergal Mullally, Jessie L. Christiansen, Christopher J. Burke, Steve Bryson, Natalie Batalha, Michael R. Haas, Joseph Catanzarite, Jason F. Rowe, Geert Barentsen, Douglas A. Caldwell, Bruce D. Clarke, Jon M. Jenkins, Jie Li, David W. Latham, Jack J. Lissauer, Savita Mathur, Robert L. Morris, Shawn E. Seader, Jeffrey C. Smith, Todd C. Klaus, Joseph D. Twicken, Bill Wohler, Rachel Akeson, David R. Ciardi, William D. Cochran, Thomas Barclay, Jennifer R. Campbell, William J. Chaplin, David Charbonneau, Christopher E. Henze, Steve B. Howell, Daniel Huber, Andrej Prsa, Solange V. Ramirez, Timothy D. Morton, Jorgen Christensen-Dalsgaard, Jessie L. Dotson, Laurance Doyle, Edward W. Dunham, Andrea K. Dupree, Eric B. Ford, John C. Geary, Forrest R. Girouard, Howard Isaacson, Hans Kjeldsen, Jason H. Steffen, Elisa V. Quintana, Darin Ragozzine, Megan Shabram, Avi Shporer, Victor Silva Aguirre, Martin Still, Peter Tenenbaum, William F. Welsh, Angie Wolfgang, Khadeejah A. Zamudio, David G. Koch, William J. Borucki
For orbital periods less than 100 days the Robovetter completeness (the fraction of simulated transits that are determined to be planet candidates) across all observed stars is greater than 85%.
Earth and Planetary Astrophysics
In this paper, to break the limit of the traditional linear models for synthetic aperture radar (SAR) image despeckling, we propose a novel deep learning approach by learning a non-linear end-to-end mapping between the noisy and clean SAR images with a dilated residual network (SAR-DRN).
Due to the depth-dependent water column effects inherent to underwater environments, we show that our end-to-end network implicitly learns a coarse depth estimate of the underwater scene from monocular underwater images.
The main computational cost of exemplar-based face sketch synthesis methods lies in the neighbor selection process.
FS followed by RP outperforms other combination methods in classification accuracy on most of the datasets.
An adaptive sparse graphical representation scheme is designed to represent heterogeneous face images, where a Markov networks model is constructed to generate adaptive sparse vectors.
In this paper, we proposed a synthesized face sketch recognition framework based on full-reference image quality assessment metrics.
Heterogeneous face recognition (HFR) refers to matching face images acquired from different sources (i.e., different sensors or different wavelengths) for identification.