Task: Bayesian approach to linear fitting

This notebook compares different approaches to linear regression for datapoints with errors in both axes. The problem is demonstated using the below shown dataset $-$ correlation between Bondi accretion power $P_{\text{Bondi}}$ and mechanical power of relativistic jets $P_{\text{jet}}$ which was obtained as a part of my diploma thesis. The individual data points are differentiated by the presence the of H$\alpha$ emission to: nuclear emission (NE), extended emission (E) and no emission (N).

The task is to "fit" a subgroup (extended + nuclear H$\alpha$ emission) of datapoints using various fitting techniques:

Libraries

Data import

Linearization

Since the LINMIX and other fitting packages are often sutiable only for linear fitting (do not confuse with linear regression), individual quantities need to be linearized. Since we expect the quantities to be related by a power-law model (which is apparent also from the plot above):

$$ \log_{10} \frac{P_{\mathrm{Bondi}}}{10^{43} \; \mathrm{erg \, / \, s}} = \alpha + \beta \cdot \log_{10} \frac{P_{\mathrm{jet}}}{10^{43} \; \mathrm{erg \, / \, s}},$$

we rescaled the relation (by $10^{43}$ erg/s) and linearized it by calculating the logarithm of both quantities:

$$ x = \frac{P_{\mathrm{Bondi}}}{10^{43} \; \mathrm{erg \, / \, s}}, \hspace{4mm} y = \frac{P_{\mathrm{jet}}}{10^{43} \; \mathrm{erg \, / \, s}},$$$$ u = \log_{10}(x), \hspace{4mm} v = \log_{10}(y).$$

And similarly their errors had to be recomputed based on the rules of error propagation:

$$ \sigma_u = \left( \frac{\partial u}{\partial x} \right) \sigma_x = \frac{\sigma_x}{x \, \log(10)}, \hspace{4mm} \sigma_v = \left( \frac{\partial v}{\partial y} \right) \sigma_y = \frac{\sigma_y}{y \, \log(10)}.$$

PYMC3

Approach to linear regression using the PyMC3.

The solution was inspired by GeostatsGuy (https://github.com/GeostatsGuy/PythonNumericalDemos).

Posterior probability

Fitted line

LINMIX

Now let's compare the results with a much more sophisticated way of bayesian linear fitting package LINMIX (https://github.com/jmeyers314/linmix).

Posterior probability

Fitted line

Results

The relation between the Bondi power and mechanical jet power can be described by the relation:

$$ \log \frac{P_{\mathrm{Bondi}}}{10^{43} \; \mathrm{erg \, / \, s}} = (1.13 \pm 0.20) + (1.04 \pm 0.22) \cdot \log \frac{P_{\mathrm{jet}}}{10^{43} \; \mathrm{erg \, / \, s}}$$

and the correlation coefficient between the two quantities is $\rho = 0.96$.

ODR

Orthogonal least squares https://docs.scipy.org/doc/scipy/reference/odr.html

BCES

Bivariate Correlated Errors and intrinsic Scatter https://github.com/rsnemmen/BCES

# Method Description
0 y|x Assumes x as the independent variable
1 x|y Assumes y as the independent variable
2 bissector Line that bisects the y|x and x|y. This approach is self-inconsistent, do not use this method, cf. Hogg, D. et al. 2010, arXiv:1008.4686.
3 orthogonal Orthogonal least squares: line that minimizes orthogonal distances. Should be used when it is not clear which variable should be treated as the independent one

Summary