In semantic segmentation, most existing real-time deep models are trained on each frame independently and may therefore produce temporally inconsistent results on a video sequence.
While the satellite-based Global Positioning System (GPS) is adequate for some outdoor applications, many other applications are held back by its multi-meter positioning errors and poor indoor coverage.
We use those segmentation maps inside the network as a self-attention mechanism to weight the feature map used to produce the bounding boxes, suppressing the signal from non-relevant areas.
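A minimal sketch of the weighting idea described above, assuming a segmentation probability map is multiplied element-wise into the detection feature map; the function name, shapes, and toy inputs are illustrative, not the paper's actual architecture.

```python
import numpy as np

def attention_weight(features: np.ndarray, seg_map: np.ndarray) -> np.ndarray:
    """Scale each spatial location of `features` (C, H, W) by the
    foreground probability in `seg_map` (H, W), suppressing signal
    from non-relevant areas before the bounding-box head.

    Hypothetical helper illustrating segmentation-guided attention;
    shapes and naming are assumptions, not the original model's API.
    """
    assert features.shape[1:] == seg_map.shape
    # Broadcast the (H, W) mask across all C channels.
    return features * seg_map[np.newaxis, :, :]

# Toy example: a 2-channel feature map where the segmentation map
# marks the right column as background, zeroing its activations.
feat = np.ones((2, 2, 2))
seg = np.array([[1.0, 0.0],
                [1.0, 0.0]])
weighted = attention_weight(feat, seg)
```

In practice the mask would come from an internal segmentation head and the product would feed the box-regression layers; the element-wise broadcast shown here is the simplest form such a weighting can take.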
We present a novel technique to automatically generate annotated data for important robotic perception tasks such as object segmentation and 3D object reconstruction using a robot manipulator.
Wearable cameras are becoming more and more popular in several applications, increasing the interest of the research community in developing approaches for recognizing actions from a first-person point of view.
To prove convergence, we need a predictor for the dual variable based on (proximal) gradient flow.
To estimate depth in these scenarios, our algorithm models the motion of the dynamic scene as a set of independent rigid motions.