Altitude-Loss Optimal Glides in Engine Failure Emergencies -- Accounting for Ground Obstacles and Wind

Engine failure is a recurring emergency in General Aviation and fixed-wing UAVs, often requiring the pilot or remote operator to carry out carefully planned glides to safely reach a candidate landing strip.

Cooperative Multi-Agent Path Finding: Beyond Path Planning and Collision Avoidance

We introduce the Cooperative Multi-Agent Path Finding (Co-MAPF) problem, an extension of the classical MAPF problem that incorporates cooperative behavior.

Deep Randomized Least Squares Value Iteration

Rather than using a hand-designed state representation, we use a state representation learned directly from the data by a DQN agent.

ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization

We consider shot-based video summarization where the summary consists of a subset of the video shots which can be of various lengths.

Learning Control for Air Hockey Striking using Deep Reinforcement Learning

We consider the task of learning control policies for a robotic mechanism striking a puck in an air hockey game.

The Max $K$-Armed Bandit: PAC Lower Bounds and Efficient Algorithms

Under the PAC framework, we provide a lower bound on the sample complexity of any $(\epsilon,\delta)$-correct algorithm, and propose an algorithm that attains this bound up to logarithmic factors.

The Max $K$-Armed Bandit: A PAC Lower Bound and tighter Algorithms

We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several sources (arms) of items (rewards) and is interested in finding the best item overall.

An Online Convex Optimization Approach to Blackwell's Approachability

The notion of approachability in repeated games with vector payoffs was introduced by Blackwell in the 1950s, along with geometric conditions for approachability and corresponding strategies that rely on computing {\em steering directions} as projections from the current average payoff vector to the (convex) target set.

Response-Based Approachability and its Application to Generalized No-Regret Algorithms

The first (primary) condition is a geometric separation condition, while the second (dual) condition requires that the set be {\em non-excludable}, namely that for every mixed action of the opponent there exists a mixed action of the agent (a {\em response}) such that the resulting payoff vector belongs to $S$.

Online Classification with Specificity Constraints

To the best of our knowledge, this is the first algorithm that addresses the problem of maximizing the average tp-rate under average fp-rate constraints in the online setting.

