.. _sparse2:
Sparse Representation (new!)
============================

.. include:: colors.rst


.. pull-quote::

    Sparse Representation of photometric redshift PDFs [1]_ is a novel technique to represent and store these PDFs in an efficient way without
    losing resolution or information. It works by creating a over determined dictionary of basis with Gaussian and `Voigt <http://en.wikipedia.org/wiki/Voigt_profile>`_ profiles
    which cover the redshift space for several values for the center and width for these distributions. Usually we create less than 2^16
    bases so we can represent them using 16 bits integers. By using an Orthogonal Matching Pursuit algorithm we look to minimize the number of bases
    used, without repetition, and the coefficients for each base can be also represented using another 16 bits integer. In this way, we can represent
    a PDF using only 32-bits integers per base when only :red:`10-20 bases is necessary`.


.. figure:: figures/32bit.png
    :scale: 80 %
    :align: center

    A single four-byte integer scheme to store a single basis
    function in the sparse representation method. The first sixteen bits
    store the coefficients (including sign), while the second sixteen bits
    store the location of the bases in the dictionary.



.. figure:: figures/bases.png
    :scale: 60 %
    :align: center

    A illustration of the bases used in the dictionary, each shaded region is composed by several functions of
    different widths, those colored shaded regions are full of bases. In a real dictionary the spacing between these
    functions is much tighter (~200 points in the redshift range)


    The number of bases and parameters depend strongly on the data based used, the module :class:`pdf_storage` contains
    most of the functions used to make the representation, for more information check our `Sparse represenation paper
    <http://adsabs.harvard.edu/abs/2014arXiv1404.6442C>`_

.. toctree::
   :maxdepth: 2

   sparse_mod


Example Case
............


    Based on our previous  :ref:`example <run_sdss>` on the SDSS data. we have added additional keys on the
    ref:`input-file`, including a key for computing the sparse representation. If this key is set to :red:`yes` then MLZ
    will run longer but it will produce a fits file with the extension :blueit:`.Psparse.fits` which have all information
    necessary on its header and have the representation for each galaxy, to visualize an example, type::

        $ ./plot/plot_sparse test/SDSS_MGS.inputs 0

    where a random galaxy will be displayed as well as the bases used. An example is shown below.
    To get a specific galaxy just add an extra argument at the end with the galaxy id, like::

        $ ./plot/plot_sparse test/SDSS_MGS.inputs 0 100

    Which will display the 100th galaxy from the sample.

.. image:: figures/ex_sp0.png
    :scale: 50%
.. image:: figures/ex_sp1.png
    :scale: 50%


**References**

.. [1] Carrasco Kind, M., & Brunner, R. J., 2014, :blueit:`"Sparse Representation of Photometric Redshift PDFs: Preparing for Petascale Astronomy"`, MNRAS in press. (`Link <http://adsabs.harvard.edu/abs/2014arXiv1404.6442C>`_)
