Why might the exponential family of distributions be important? In other words, if a real-world application turned out to have a distribution from this family, what interesting results or convenient truths might that bring to the table?
The exponential family is a very broad class of distributions, which makes it a basis for the majority of real-life problems in statistics. Most of the commonly used distributions form an exponential family or a subset of one.
The main motivation behind exponential family distributions is that they are the maximum entropy distributions given a support and the expected values of a set of sufficient statistics. In other words, they are the least assumptive distributions consistent with what you have measured.
For example, if you measure only the mean and variance of a real-valued quantity, the least assumptive modelling choice is a normal distribution.
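As a rough numerical illustration of the maximum entropy claim, the sketch below compares the differential entropy of three distributions that share the same variance, using their standard closed-form entropy formulas. The function names and the choice of Laplace and uniform as comparison cases are only assumptions made for this example.

```python
import math

def normal_entropy(sigma):
    # Differential entropy of N(mu, sigma^2); it does not depend on the mean.
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def laplace_entropy(sigma):
    # A Laplace distribution with scale b has variance 2 * b^2.
    b = sigma / math.sqrt(2)
    return 1 + math.log(2 * b)

def uniform_entropy(sigma):
    # A uniform distribution of width w has variance w^2 / 12.
    width = sigma * math.sqrt(12)
    return math.log(width)

sigma = 1.0  # common standard deviation for all three distributions
for name, fn in [("normal", normal_entropy),
                 ("laplace", laplace_entropy),
                 ("uniform", uniform_entropy)]:
    print(f"{name:8s} entropy = {fn(sigma):.4f} nats")
```

With the variance held fixed, the normal distribution comes out with the largest entropy of the three, which is consistent with it being the least assumptive choice under those constraints.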
From a computational standpoint, there are other advantages:
They are closed under "evidence combination". That is, the combination of two independent likelihoods from the same exponential family is always in the same exponential family and its natural parameters are merely the sum of the natural parameters of its components. This is convenient for Bayesian statistics.
The gradient of the cross entropy between two exponential family distributions, taken with respect to the natural parameters of the model distribution, is the difference of their expectation parameters. This means that a loss function that is such a cross entropy is a so-called matching loss function, which is convenient for optimization.
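Both computational points can be checked on small cases. The sketch below is only an illustration under stated assumptions: it uses a Gaussian with natural parameters (mean/variance, -1/(2*variance)) for evidence combination, and a Bernoulli with its logit as natural parameter for the gradient check. All function names are made up for this example.

```python
import math

# --- Evidence combination: natural parameters add ---

def to_natural(mean, var):
    # Natural parameters of a Gaussian: (mean / var, -1 / (2 * var)).
    return (mean / var, -0.5 / var)

def from_natural(eta1, eta2):
    # Invert the mapping above to recover (mean, variance).
    var = -0.5 / eta2
    return (eta1 * var, var)

def combine(a, b):
    # Multiplying two Gaussian likelihoods of the same unknown quantity
    # corresponds to adding their natural parameters.
    return tuple(x + y for x, y in zip(a, b))

posterior = from_natural(*combine(to_natural(0.0, 1.0), to_natural(2.0, 1.0)))
print("combined mean, variance:", posterior)

# --- Cross-entropy gradient: difference of expectation parameters ---

def sigmoid(theta):
    # Expectation parameter (mean) of a Bernoulli with natural parameter theta.
    return 1.0 / (1.0 + math.exp(-theta))

def bernoulli_cross_entropy(p, theta):
    # Cross entropy between Bernoulli(p) and a Bernoulli with logit theta:
    # log(1 + e^theta) - p * theta.
    return math.log1p(math.exp(theta)) - p * theta

def numeric_grad(f, x, h=1e-6):
    # Central finite difference, used only to check the analytic gradient.
    return (f(x + h) - f(x - h)) / (2 * h)

p, theta = 0.3, 0.8
analytic = sigmoid(theta) - p  # model mean minus target mean
numeric = numeric_grad(lambda t: bernoulli_cross_entropy(p, t), theta)
print("analytic gradient:", analytic, "numeric gradient:", numeric)
```

Combining two unit-variance measurements centred at 0 and 2 gives a mean of 1.0 with variance 0.5, and the finite-difference gradient agrees with the difference of the two means up to numerical error.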