Massive Lossless Data Compression and Multiple Parameter Estimation from Galaxy Spectra

6 Nov 1999 · Alan Heavens, Raul Jimenez, Ofer Lahav

We present a method for radical linear compression of datasets where the data are dependent on some number $M$ of parameters. We show that, if the noise in the data is independent of the parameters, we can form $M$ linear combinations of the data which contain as much information about all the parameters as the entire dataset, in the sense that the Fisher information matrices are identical; i.e. the method is lossless. We explore how these compressed numbers fare when the noise is dependent on the parameters, and show that the method, although not precisely lossless, increases errors by a very modest factor. The method is general, but we illustrate it with a problem for which it is well-suited: galaxy spectra, whose data typically consist of $\sim 10^3$ fluxes, and whose properties are set by a handful of parameters such as age, brightness and a parametrised star formation history. The spectra are reduced to a small number of data, which are connected to the physical processes entering the problem. This data compression offers the possibility of a large increase in the speed of determining physical parameters. This is an important consideration as datasets of galaxy spectra reach $10^6$ in size, and the complexity of model spectra increases. In addition to this practical advantage, the compressed data may offer a classification scheme for galaxy spectra which is based rather directly on physical processes.
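The lossless claim for parameter-independent noise can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the weight vectors are built by a Gram-Schmidt-style orthogonalisation of $C^{-1}\partial\mu/\partial\theta_m$, where $\mu$ is the mean data vector and $C$ the noise covariance. The toy two-parameter "spectrum" $\mu = A\,e^{-x/\tau}$, the diagonal noise model, and all function and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def compression_vectors(dmu, cov):
    """Build M weight vectors (rows) from the mean derivatives dmu (M x N)
    and a parameter-independent noise covariance cov (N x N).
    Assumed construction: Gram-Schmidt on C^{-1} dmu_m, normalised so the
    compressed data have unit variance and are mutually uncorrelated."""
    cinv_dmu = np.linalg.solve(cov, dmu.T).T   # row m holds C^{-1} dmu_m
    vectors = []
    for m in range(dmu.shape[0]):
        b = cinv_dmu[m].copy()
        norm2 = dmu[m] @ cinv_dmu[m]
        for bq in vectors:
            proj = dmu[m] @ bq                 # overlap with an earlier vector
            b -= proj * bq
            norm2 -= proj ** 2
        vectors.append(b / np.sqrt(norm2))
    return np.array(vectors)

# Toy model (hypothetical): N = 1000 "fluxes", M = 2 parameters (A, tau).
x = np.linspace(0.0, 5.0, 1000)
A, tau = 2.0, 1.5
dmu = np.array([
    np.exp(-x / tau),                          # d mu / d A
    A * x / tau ** 2 * np.exp(-x / tau),       # d mu / d tau
])
cov = np.diag((0.1 + 0.05 * x) ** 2)           # noise independent of (A, tau)

B = compression_vectors(dmu, cov)              # shape (2, 1000)

# Fisher matrix of the full dataset: F_ab = dmu_a^T C^{-1} dmu_b.
fisher_full = dmu @ np.linalg.solve(cov, dmu.T)

# Fisher matrix of the M compressed numbers y_m = b_m . x, whose
# covariance is the identity by construction.
overlap = B @ dmu.T                            # element [m, a] = b_m . dmu_a
fisher_compressed = overlap.T @ overlap

print(np.allclose(fisher_full, fisher_compressed))
```

In this sketch the 1000 fluxes are reduced to two numbers, one per parameter, and the two Fisher matrices agree to numerical precision. When the noise depends on the parameters, the full Fisher matrix gains an extra term that this construction does not capture, which is the regime where the abstract states the method is no longer precisely lossless.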
