Relative position embedding (RPE) is a successful method to explicitly and efficaciously encode position information into Transformer models.
Factual consistency assessment requires the reasoning capability to find subtle clues that indicate whether a model-generated summary is consistent with the original document.
Can Transformers perform 2D object- and region-level recognition from a pure sequence-to-sequence perspective with minimal knowledge about the 2D spatial structure?
The past year has witnessed the rapid development of applying the Transformer module to vision problems.
Instead of randomly dropping units, SelectScale identifies important features in the network and rescales them during training.
Recently, the rapid development of Solid-State LiDAR (SSL) has enabled low-cost and efficient acquisition of 3D point clouds from the environment, inspiring a large number of studies and applications.
The Quick Response (QR) code is one of the most widely used two-dimensional codes worldwide. Traditional QR codes appear as random collections of black-and-white modules that lack visual semantics and aesthetic elements, which has inspired recent work on beautifying the appearance of QR codes.
Although deep learning models like CNNs have achieved great success in medical image analysis, the small size of medical datasets remains a major bottleneck in this area.
Manga is a comic form that originated in Japan and is popular worldwide; it typically employs black-and-white stroke lines and geometric exaggeration to depict characters' appearances, poses, and actions.
Automatically designing computationally efficient neural networks has received much attention in recent years.