Auditory attention to natural speech is a complex brain process. Its quantification from physiological signals can be valuable to improving and widening the range of applications of current brain-computer-interface systems, however it remains a challenging task. In this article, we present a dataset of physiological signals collected from an experiment on auditory attention to natural speech. In this experiment, auditory stimuli consisting of reproductions of English sentences in different auditory conditions were presented to 25 non-native participants, who were asked to transcribe the sentences. During the experiment, 14 channel electroencephalogram, galvanic skin response, and photoplethysmogram signals were collected from each participant. Based on the number of correctly transcribed words, an attention score was obtained for each auditory stimulus presented to subjects. A strong correlation ($p<<0.0001$) between the attention score and the auditory conditions was found. We also formulate four different predictive tasks involving the collected dataset and develop a feature extraction framework. The results for each predictive task are obtained using a Support Vector Machine with spectral features, and are better than chance level. The dataset has been made publicly available for further research, along with a python library - phyaat to facilitate the preprocessing, modeling, and reproduction of the results presented in this paper. The dataset and other resources are shared on webpage - https://phyaat.github.io.