This paper introduces a robust mixing model to describe hyperspectral data resulting from the mixture of several pure spectral signatures. This new model not only generalizes the commonly used linear mixing model, but also allows for possible nonlinear effects to be easily handled, relying on mild assumptions regarding these nonlinearities. The standard nonnegativity and sum-to-one constraints inherent to spectral unmixing are coupled with a group-sparse constraint imposed on the nonlinearity component. This results in a new form of robust nonnegative matrix factorization. The data fidelity term is expressed as a beta-divergence, a continuous family of dissimilarity measures that takes the squared Euclidean distance and the generalized Kullback-Leibler divergence as special cases. The penalized objective is minimized with a block-coordinate descent that involves majorization-minimization updates. Simulation results obtained on synthetic and real data show that the proposed strategy competes with state-of-the-art linear and nonlinear unmixing methods.