Likelihood¶

We just looked at how the scipy.optimize.curve_fit function can be used to obtain the optimal parameters the agreement between some model and data. To achieve this, the curve_fit function is maximising the likelihood of the model to the data, $\mathcal{L}\big[\mathbf{D} | M(\mathbf{X})\big]$. For the specific case of Gaussian uncertainties, this may be called minimising the $\chi^2$-statistic.

The likelihood, given the model $M$ and a set of parameters $\mathbf{X}$, of the data $\mathbf{D}$ where these data are assumed to have Gaussian uncertainties can be found as,

\[ \ln\mathcal{L}\big[\mathbf{D} | M(\mathbf{X})\big] = -\frac{1}{2} \sum_n\Bigg({\bigg[\frac{y_n - M(\mathbf{X})}{\sigma_n}\bigg]^2 + \ln(2\pi\sigma_n)\Bigg)}, \]

where, $y_n$ is the $n$-th data value and $\sigma_n$ the associated uncertainty.

The likelihood is a probability distribution, that represents how well the parameter values reproduce the experimental data. In the figure below, we show how the likelihood is affected by a change in $A$ and $\gamma$ for the Lorentzian modelling of QENS data discussed previously. Note that the values of $x_0$ and $b$ are constrained to the optimal solutions found previously while $A$ and $\gamma$ were varied, the code to generate this figure can be found here.

$An example of a likelihood function, where $A$ and $\gamma$ are varied.$

It is clear that the likelihood is maximised at the optimised solutions for $A$ and $\gamma$ found previously, and indeed the shape of the likelihood appears to be Gaussian in nature. These aspects will become important later in the course.

Traditional Machine Learning Methods

Likelihood¶