We introduce temporal multimodal multivariate learning, a new family of decision making models that can indirectly learn and transfer online information from simultaneous observations of a probability distribution with more than one peak or more than one outcome variable from one time stage to another.
At the end of each drive, a ground-in-the-loop (GITL) interaction is used to get a position update from human operators in a more global reference frame, by matching images or local maps from onboard the rover to orbital reconnaissance images or maps of a large region around the rover's current position.
We present MLNav, a learning-enhanced path planning framework for safety-critical and resource-limited systems operating in complex environments, such as rovers navigating on Mars.
Enhanced AutoNav (ENav), the baseline surface navigation software for NASA's Perseverance rover, sorts a list of candidate paths for the rover to traverse, then uses the Approximate Clearance Evaluation (ACE) algorithm to evaluate whether the most highly ranked paths are safe.
We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives.
We study the problem of learning sequential decision-making policies in settings with multiple state-action representations.
We found that the same result holds for stochastic optimal control problems with continuous state and action spaces. Furthermore, we show the randomization of control input can result in reduced cost when the optimization problem is nonconvex, and the cost reduction is equal to the duality gap.
The second capability is essential for the planner to solve problems with a continuous state space such as vehicle path planning.