Search Results for author: Mahammad Humayoo

Parameter Estimation with the Ordered $\ell_{2}$ Regularization via an Alternating Direction Method of Multipliers

The reason is that the ordered regularization can reject irrelevant variables and yield accurate parameter estimates.

One reason for the instability of off-policy learning is a discrepancy between the target ($\pi$) and behavior ($b$) policy distributions.