%% Cell type:markdown id:d2193805-8359-4380-82da-4b1cb5358290 tags:
# Lecture 1
---
## Basic statistics
<br>
<br>
Hartmut Stadie
hartmut.stadie@uni-hamburg.de
%% Cell type:markdown id:668ad170-00ca-4d67-ba06-71aebd7092a3 tags:
# Parameter estimation
## Introduction
### Estimators
An estimator $\hat a$ for $a$ from a sample $x_1,\dots,x_n$
Requirements:
- unbiased:
$E[\hat a]= a$
- consistent:
$\lim_{n\to \infty } \hat a = a$
- efficient: $V[\hat a]$ as small as possible
Estimator for the mean:
$$\hat \mu = \bar x = \frac{1}{n}\sum_{i=1}^n x_i \text{ with } V[\hat \mu] = \frac{\sigma_x^2}{n}$$
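A quick numerical check of these properties (a minimal sketch; the sample size, true mean, and $\sigma_x$ are made up for illustration):
``` python
# Minimal sketch: the sample mean is unbiased and its variance
# scales like sigma_x^2 / n.
import numpy as np

rng = np.random.default_rng(42)
n, n_trials = 100, 10_000
sigma_x = 2.0

# n_trials independent samples of size n from a Gaussian with mean 5
samples = rng.normal(loc=5.0, scale=sigma_x, size=(n_trials, n))
mu_hats = samples.mean(axis=1)           # one estimate per sample

print(mu_hats.mean())                    # close to 5 (unbiased)
print(mu_hats.var(), sigma_x**2 / n)     # both close to 0.04
```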
%% Cell type:markdown id:7c7c6ecd-1611-4787-8985-da0881e90d9f tags:
# Method of least squares
## Derivation
### Method of least squares
$y(x) = mx + a$: find $\hat m$ and $\hat a$!
%% Cell type:markdown id:73e6e46a-4e41-4e66-857b-2946946498d1 tags:
<img src="./figures/11/line.png" style="width:90.0%" alt="image" />
%% Cell type:code id:55bf0cad-8a50-4831-8a37-daf9bcb39b71 tags:
``` python
#hideme
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

def f(x):
    return 2*x + 1

# generate n points on the line with independent Gaussian noise sigma_y
n = 10
xs = np.linspace(0, 4, n)
sigma_y = 0.4
ys = stats.multivariate_normal.rvs(f(xs), np.eye(n)*sigma_y**2, 1, random_state=42)

x_axis = np.linspace(0, 4, 100)
plt.errorbar(xs, ys, yerr=sigma_y, fmt=".")
plt.plot(x_axis, f(x_axis), '--')
plt.xlabel("x")
plt.ylabel("y")
plt.savefig("line.png")
plt.show()
```
%% Output
%% Cell type:markdown id:64e67452-e6bd-442c-a174-e12bdb18dba0 tags:
### Method of least squares
$$\chi^2 = \sum_i \left(\frac{y_i - \hat y(x_i)}{\sigma_i}\right)^2$$
quantifies the agreement between model and data
$\rightarrow$ $\hat m$ and $\hat a$ should minimize $\chi^2$.
<img src="./figures/11/line.png" style="width:80.0%"
alt="image" />
%% Cell type:markdown id:bdc0949d-791e-4e29-9adc-169578e62821 tags:
### Method of least squares II
Minimize
$\chi^2 = \sum_i \left(\frac{y_i - \hat y(x_i)}{\sigma_i}\right)^2 = \sum_i \frac{(y_i - m x_i - a)^2}{\sigma_i^2}$:
The first derivatives must vanish:
$$\begin{aligned}
\frac{d\chi^2}{dm} &=& -2\sum_i x_i\frac {y_i -\hat m x_i - \hat a}{\sigma_i^2} = 0\\
\frac{d\chi^2}{da} &=& -2\sum_i \frac{y_i - \hat m x_i - \hat a}{\sigma_i^2} = 0 \\
\sum_i\frac{x_iy_i}{\sigma_i^2} - \hat m \sum_i\frac{x_i^2}{\sigma_i^2}- \hat a \sum_i \frac{x_i}{\sigma_i^2} &=& 0 \\
\sum_i\frac{y_i}{\sigma_i^2} - \hat m \sum_i\frac{x_i}{\sigma_i^2}- \hat a \sum_i \frac{1}{\sigma_i^2} &=& 0
\end{aligned}$$
%% Cell type:markdown id:322c67fc-ddd1-4830-a5f5-6f118ef54c4c tags:
### Method of least squares III
Minimize
$\chi^2 = \sum_i \left(\frac{y_i - \hat y(x_i)}{\sigma_i}\right)^2 = \sum_i \frac{(y_i - m x_i - a)^2}{\sigma_i^2}$:
$$\begin{aligned}
\sum_i\frac{x_iy_i}{\sigma_i^2} - \hat m \sum_i\frac{x_i^2}{\sigma_i^2}- \hat a \sum_i \frac{x_i}{\sigma_i^2} &=& 0 \\
\sum_i\frac{y_i}{\sigma_i^2} - \hat m \sum_i\frac{x_i}{\sigma_i^2}- \hat a \sum_i \frac{1}{\sigma_i^2} &=& 0
\end{aligned}$$ with the weighted average
$\langle f \rangle = \frac{1}{\sum_i 1/\sigma_i^2} \sum_i \frac{f_i}{\sigma_i^2}$:
$$\begin{aligned}
\langle xy \rangle -\langle x^2 \rangle \hat m& - \langle x \rangle \hat a&= 0\\
\langle y \rangle - \langle x \rangle \hat m& - \hat a& = 0
\end{aligned}$$
%% Cell type:markdown id:fb25687c-4410-4281-b540-39369732fb26 tags:
### Method of least squares IV
$$\begin{aligned}
\hat m&=&\frac{\langle xy \rangle - \langle y \rangle\langle x \rangle}{\langle x^2 \rangle - \langle x \rangle^2} = \frac{1}{\sum_i 1/\sigma_i^2} \sum_i \frac{x_i - \langle x \rangle}{\sigma_i^2(\langle x^2 \rangle - \langle x \rangle^2)}y_i\\
\hat a &=& \frac{ \langle y \rangle \langle x^2 \rangle- \langle y \rangle \langle x \rangle^2- \langle x \rangle \langle xy \rangle+ \langle y \rangle \langle x \rangle^2}{ \langle x^2 \rangle- \langle x \rangle^2}\\
&=& \frac{ \langle y \rangle \langle x^2 \rangle - \langle x \rangle \langle xy \rangle}{ \langle x^2 \rangle - \langle x \rangle^2} = \frac{1}{\sum_i 1/\sigma_i^2} \sum_i \frac{\langle x^2 \rangle - \langle x \rangle x_i}{\sigma_i^2(\langle x^2 \rangle - \langle x \rangle^2)}y_i
\end{aligned}$$
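These closed-form expressions map directly onto a few lines of NumPy (a sketch reusing `xs`, `ys`, and `sigma_y` from the plotting cell above; with equal $\sigma_i$ the weighted averages reduce to plain means):
``` python
# weighted averages <f> = (sum_i f_i/sigma_i^2) / (sum_i 1/sigma_i^2)
w = np.full_like(ys, 1/sigma_y**2)

def avg(f):
    return np.sum(w*f) / np.sum(w)

# closed-form least-squares estimates for slope and intercept
D = avg(xs**2) - avg(xs)**2
m_hat = (avg(xs*ys) - avg(xs)*avg(ys)) / D
a_hat = (avg(ys)*avg(xs**2) - avg(xs)*avg(xs*ys)) / D
print(m_hat, a_hat)   # should be close to the true values 2 and 1
```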
%% Cell type:markdown id:b08ade05-e5eb-4a7e-9315-31a45c51aadb tags:
## Errors
### Errors
$$\begin{aligned}
V(\hat m) = \sum_i \left(\frac{d\hat m}{dy_i}\sigma_i\right)^2\text{; }\frac{d\hat m}{dy_i} & = & \frac{1}{\sum_i 1/\sigma_i^2} \frac{x_i - \langle x \rangle}{\sigma_i^2(\langle x^2 \rangle - \langle x \rangle^2)} \\
V(\hat a) = \sum_i \left(\frac{d\hat a}{dy_i}\sigma_i\right)^2\text{; }\frac{d\hat a}{dy_i} & = & \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x^2 \rangle - \langle x \rangle x_i}{\sigma_i^2(\langle x^2 \rangle - \langle x \rangle^2)}
\end{aligned}$$ $$\begin{aligned}
V(\hat m) &=& \left(\frac{1}{\sum_i 1/\sigma_i^2}\right)^2 \sum_i \left(\frac{x_i - \langle x \rangle}{\sigma_i^2(\langle x^2 \rangle - \langle x \rangle^2)}\right)^2 \sigma_i^2 \\
&=& \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x^2 \rangle - 2\langle x \rangle \langle x \rangle + \langle x \rangle^2}{(\langle x^2 \rangle - \langle x \rangle^2)^2}
= \frac{1}{\sum_i 1/\sigma_i^2} \frac{1}{\langle x^2 \rangle - \langle x \rangle^2} \\
V(\hat a) &=& \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x^2 \rangle^2 - 2\langle x^2 \rangle\langle x \rangle^2 + \langle x^2 \rangle\langle x \rangle^2}{(\langle x^2 \rangle - \langle x \rangle^2)^2}
= \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x^2 \rangle}{\langle x^2 \rangle - \langle x \rangle^2}
\end{aligned}$$
%% Cell type:markdown id:d8828d04-0af8-4dc3-a846-24acbc9a0f8e tags:
### Correlation
$$\begin{aligned}
V(\hat m) &=& \frac{1}{\sum_i 1/\sigma_i^2} \frac{1}{\langle x^2 \rangle - \langle x \rangle^2} \\
V(\hat a) &=& \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x^2 \rangle}{\langle x^2 \rangle - \langle x \rangle^2}\\
\text{cov}(\hat m, \hat a) &=& \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle (x-\langle x \rangle)(\langle x^2 \rangle - \langle x \rangle x)\rangle}{(\langle x^2 \rangle - \langle x \rangle^2)^2}\\
&=& \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x^2 \rangle \langle x \rangle - \langle x \rangle \langle x^2 \rangle - \langle x \rangle \langle x^2 \rangle + \langle x \rangle^2\langle x \rangle}{(\langle x^2 \rangle - \langle x \rangle^2)^2}\\
&=& - \frac{1}{\sum_i 1/\sigma_i^2} \frac{\langle x \rangle}{\langle x^2 \rangle - \langle x \rangle^2}
\end{aligned}$$
### Example in Jupyter
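A minimal sketch of these formulas in the notebook (reusing `xs`, `sigma_y`, and the `avg` helper defined above; the result can be compared with the covariance matrix returned by `curve_fit` below):
``` python
# analytic variances and covariance of the least-squares estimators
S = np.sum(w)                        # sum of the weights 1/sigma_i^2
D = avg(xs**2) - avg(xs)**2
V_m = 1/(S*D)
V_a = avg(xs**2)/(S*D)
cov_ma = -avg(xs)/(S*D)

V = np.array([[V_m, cov_ma],
              [cov_ma, V_a]])
rho = cov_ma/np.sqrt(V_m*V_a)        # correlation coefficient
print(V, rho)
```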
%% Cell type:markdown id:424bdd1f-53bf-422b-bc4a-0702231b976d tags:
### Minimal $\chi^2$
$$\begin{aligned}
\chi^2 &=& \sum_i \frac{(y_i - \hat m x_i - \hat a)^2}{\sigma_i^2} = \sum_i \frac{\left[y_i - \frac{\langle xy \rangle - \langle y \rangle\langle x \rangle}{\langle x^2 \rangle - \langle x \rangle^2} x_i - \frac{ \langle y \rangle \langle x^2 \rangle - \langle x \rangle \langle xy \rangle}{ \langle x^2 \rangle - \langle x \rangle^2} \right]^2}{\sigma_i^2}\\
& = & \sum_i \frac{\left[(\langle x^2 \rangle - \langle x \rangle^2)y_i - (\langle xy \rangle - \langle y \rangle\langle x \rangle)x_i - \langle y \rangle \langle x^2 \rangle + \langle x \rangle \langle xy \rangle\right]^2}{\sigma_i^2 ( \langle x^2 \rangle - \langle x \rangle^2)^2} \\
&=& \dots\\
& = & \left(\sum_i \frac{1}{\sigma_i^2}\right) V(y) (1- \rho^2_{xy})
\end{aligned}$$
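The identity is easy to verify numerically (a sketch using `m_hat`, `a_hat`, and the weighted averages from above; here $V(y)$ and $\rho_{xy}$ are the weighted variance and correlation):
``` python
# numerical check: chi2_min = (sum of weights) * V(y) * (1 - rho_xy^2)
chi2_min = np.sum(((ys - m_hat*xs - a_hat)/sigma_y)**2)

V_y = avg(ys**2) - avg(ys)**2
rho_xy = (avg(xs*ys) - avg(xs)*avg(ys)) / np.sqrt(D * V_y)
print(chi2_min, np.sum(w) * V_y * (1 - rho_xy**2))   # should agree
```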
%% Cell type:markdown id:e89f415e-8836-4b97-893c-a7335c3a21e5 tags:
### Example in Jupyter
## In Python
%% Cell type:markdown id:7311c0ff-0ce0-4427-a50e-b6698100e454 tags:
### With Python I
With scipy.optimize:
``` python
import scipy.optimize as opti

def fitf(x, m, a):
    return m*x + a

# sigma: per-point uncertainties; absolute_sigma=True treats them as
# absolute errors instead of relative weights
pfit, Vfit = opti.curve_fit(fitf, xs, ys,
                            sigma=[sigma_y]*len(ys), absolute_sigma=True)
print(pfit, Vfit)
```
Careful! The uncertainties are wrong without `absolute_sigma=True`.
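To see the effect, compare both settings directly (a sketch; with `absolute_sigma=False`, the default, scipy rescales the covariance matrix so that the reduced $\chi^2$ becomes one):
``` python
# compare covariance matrices with and without absolute_sigma
_, V_abs = opti.curve_fit(fitf, xs, ys, sigma=[sigma_y]*len(ys),
                          absolute_sigma=True)
_, V_rel = opti.curve_fit(fitf, xs, ys, sigma=[sigma_y]*len(ys),
                          absolute_sigma=False)
print(np.diag(V_abs))   # follows from the given sigma_y
print(np.diag(V_rel))   # rescaled by chi2_min / (n - 2)
```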
%% Cell type:markdown id:a5fec52e-bd3a-4437-bb43-620de44939b2 tags:
### With Python II
With scipy.optimize:
``` python
def chi2(x, y, sy, a, m):
    my = m * x + a
    r = (y - my)/sy
    return np.sum(r**2)

# minimize chi^2 over p = (m, a); the covariance is 2 * inverse Hessian
res = opti.minimize(lambda p: chi2(xs, ys, sigma_y, p[1], p[0]), x0=np.zeros(2))
print(res.x, res.hess_inv*2)
```
%% Cell type:markdown id:84b00301-c0d3-4595-858e-4f784d69c876 tags:
### Inverse Hessian and $\chi^2$
$\Delta \chi^2$ and covariance: the error ellipse around the minimum given by
the covariance matrix lies exactly at $\Delta \chi^2 = 1$:
$$1 = \Delta \chi^2 = (\vec a -\hat{\vec a})^T V^{-1} (\vec a-\hat{\vec a})$$
With
$\chi^2(\vec a) = \chi^2(\hat{\vec a}) + (\vec a -\hat{\vec a})^T V^{-1} (\vec a-\hat{\vec a})$
and
$H_{ij} = \frac{\partial^2 \chi^2(\vec a)}{\partial a_i \partial a_j}$:
$$H_{ij} = \frac{\partial^2 (a_k -\hat a_k) V^{-1}_{kl} (a_l -\hat a_l)}{\partial a_i \partial a_j} = \frac{\partial( \delta_{ik}V^{-1}_{kl} (a_l -\hat a_l) + (a_k -\hat a_k) V^{-1}_{kl} \delta_{il})}{\partial a_j}$$
$$H_{ij} = \delta_{ik}V^{-1}_{kl}\delta_{lj} + \delta_{jk}V^{-1}_{kl}\delta_{il} = 2V^{-1}_{ij} \text{ and } V_{ij} = 2 H^{-1}_{ij}$$
Careful! Some algorithms in `minimize` do not compute an inverse
Hessian.
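A minimal sketch of this caveat: `BFGS` (the default for unconstrained problems) returns an approximate inverse Hessian as `res.hess_inv`, while e.g. `Nelder-Mead` does not provide one:
``` python
# hess_inv is only available for some minimizers
res_bfgs = opti.minimize(lambda p: chi2(xs, ys, sigma_y, p[1], p[0]),
                         x0=np.zeros(2), method="BFGS")
res_nm = opti.minimize(lambda p: chi2(xs, ys, sigma_y, p[1], p[0]),
                       x0=np.zeros(2), method="Nelder-Mead")
print(hasattr(res_bfgs, "hess_inv"))   # True
print(hasattr(res_nm, "hess_inv"))     # False: no covariance estimate
```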
%% Cell type:markdown id:632583d4-1e97-498b-b8f3-91e259baa24d tags:
# Maximum likelihood
Maximum likelihood (ML); data: $x_1,...,x_N$
Probability of the data for a model with parameters $a$:
$$P(x_1,...,x_N; a) = \prod_i P(x_i ; a)$$
Likelihood function: $$L(a) = \prod_i P(x_i ; a)$$
ML estimator $\hat a$: the maximum of $L(a)$:
$$\left.\frac{dL}{da}\right|_{a = \hat a} = 0$$ (more practical: the
log-likelihood: $-\ln L = \sum_i -\ln P(x_i; a)$)
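In practice one minimizes the negative log-likelihood numerically (a minimal sketch for a Gaussian sample with known $\sigma = 1$ and unknown mean; the data set is made up for illustration):
``` python
# ML fit of a Gaussian mean by minimizing the negative log-likelihood
data = stats.norm.rvs(loc=3.0, scale=1.0, size=200, random_state=42)

def nll(mu):
    return -np.sum(stats.norm.logpdf(data, loc=mu, scale=1.0))

res = opti.minimize(nll, x0=np.array([0.0]))
print(res.x, data.mean())   # ML estimate agrees with the sample mean
```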
%% Cell type:markdown id:dd0afbf7-8504-46f8-b3da-4fb33487a635 tags:
### Example
$y(x) = mx + a$: find $\hat m$ and $\hat a$. Data: $y_1,...,y_N$ and
model: $$P(y_i; m, a) = G(y_i; \mu = m x_i + a, \sigma=\sigma_i)$$
$$L(m, a) = \prod_i G(y_i; \mu = m x_i + a, \sigma=\sigma_i)$$
<img src="./figures/11/line.png" style="width:49.0%" alt="image" />
<img src="./figures/11/like_a.png" style="width:49.0%"
alt="image" />
<img src="./figures/11/loglike_a.png" style="width:49.0%"
alt="image" />
ML estimator for the Poisson mean $\mu$: $$\begin{aligned}
L(\mu) & = & \prod_{i=1}^N P(k_i; \mu) = \prod_{i=1}^N \frac{\mu^{k_i}e^{-\mu}}{k_i!}\\
\ln L(\mu) & = & \sum_{i=1}^N \left( \ln \mu^{k_i} + \ln e^{-\mu} - \ln k_i!\right)\\
& = & \sum_{i=1}^N \left( k_i \ln \mu -\mu - \ln k_i!\right)\\
0 \stackrel{!}{=} \frac{d \ln L(\mu)}{d\mu} \Big|_{\hat \mu}& = & \sum_{i=1}^N \left( \frac{k_i}{\hat \mu} - 1\right) = \sum_{i=1}^N \frac{k_i}{\hat\mu} - N\\
N & = & \frac{1}{\hat\mu} \sum_{i=1}^N k_i \rightarrow \hat\mu = \frac{1} {N} \sum_{i=1}^N k_i
\end{aligned}$$
%% Cell type:markdown id:d9571970-772e-4e95-8474-82e0afb1dd68 tags:
Variance of the ML estimator
Rao-Cramér-Fréchet inequality: for an estimator $\hat a$ with bias
$b$:
$$V(\hat a) \geq \frac{\left(1+ \frac{\partial b}{\partial a} \right)^2}{E\left[-\frac{\partial^2 \ln L}{\partial a^2}\right]}$$
Fisher-Information:
$$I(\hat a) = E\left [-\frac{\partial^2 \ln L}{\partial a^2}\right]$$
ML estimator for the Poisson mean, $V(\hat \mu)$ (with bias $b = 0$): $$\begin{aligned}
V(\hat \mu) & \geq &\frac{\left(1+ \frac{\partial b}{\partial \mu} \right)^2}{E\left[-\frac{\partial^2 \ln L}{\partial \mu^2}\right]} \\
& = & \frac{1}{E\left[-\frac{\partial}{\partial \mu}\left(\sum_{i=1}^N \frac{k_i}{\mu} - N\right)\right]} \\
& = & \frac{1}{E\left[-\sum_{i=1}^N \frac{-k_i}{\hat \mu^2}\right]} = \frac{1}{E\left[\sum_{i=1}^N \frac{k_i}{\hat \mu^2}\right]} \\
& = & \frac{1}{\frac{1}{\hat \mu^2}E\left[\sum_{i=1}^N k_i \right]} = \frac{1}{\frac{1}{\hat \mu^2}E\left[N \hat \mu \right]}\\
& = & \frac{\hat \mu}{N}
\end{aligned}$$
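A quick toy check of both results (a sketch; the true mean $\mu = 4$, the sample size $N$, and the number of toy experiments are made up):
``` python
# toy check: ML estimate of a Poisson mean and its variance mu/N
rng = np.random.default_rng(1)
N, n_toys, mu_true = 50, 5000, 4.0
ks = rng.poisson(mu_true, size=(n_toys, N))
mu_hats = ks.mean(axis=1)           # ML estimator, one value per toy
print(mu_hats.mean())               # close to 4 (unbiased)
print(mu_hats.var(), mu_true/N)     # both close to 0.08
```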
Variance for several parameters $\vec \theta$
For an efficient and unbiased estimator:
$$\left(V^{-1}\right)_{ij} = E\left[ -\frac{\partial^2 \ln L(\theta)}{\partial \theta_i \partial \theta_j}\right]$$
Approximation for large data sets:
$$\left(\hat V^{-1}\right)_{ij} = -\frac{\partial^2 \ln L(\theta)}{\partial \theta_i \partial \theta_j}\Big|_{\theta=\hat \theta}$$
Graphically (the first-derivative term vanishes at the maximum $\hat \theta$):
$$\ln L(\theta) \approx \ln L(\hat \theta) + \frac{\partial \ln L}{\partial \theta}\Big|_{\hat \theta}(\theta - \hat \theta) + \frac{1}{2} \frac{\partial^2 \ln L}{\partial \theta^2}(\theta - \hat \theta)^2$$
$$\ln L(\hat \theta + \sigma_\theta) \approx \ln L(\hat \theta) + \frac{1}{2} \frac{\partial^2 \ln L}{\partial \theta^2}\sigma_\theta^2 = \ln L(\hat \theta) - \frac{1}{2}$$
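This is the familiar "drop by one half" rule: scan $\ln L$ and read off $\sigma_\theta$ where it has fallen by $1/2$ below the maximum (a sketch reusing the Poisson toy setup above):
``` python
# scan ln L(mu) for one Poisson data set and apply the -1/2 rule
ks1 = rng.poisson(mu_true, size=N)
mu_scan = np.linspace(3.0, 5.0, 1000)
lnL = np.array([np.sum(stats.poisson.logpmf(ks1, mu)) for mu in mu_scan])
mu_best = mu_scan[np.argmax(lnL)]
inside = mu_scan[lnL >= lnL.max() - 0.5]   # region where ln L drops < 1/2
print(mu_best, inside[0], inside[-1])      # estimate and ~68% interval
print(np.sqrt(mu_best/N))                  # expected sigma from V = mu/N
```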
Relation between ML and $\chi^2$
Likelihood ratio:
$$\lambda = -2 \ln \frac{L(\hat \theta)}{L(\hat \theta^\prime_\text{saturated})}$$
With a normal distribution: $$\begin{aligned}
\lambda &=& -2 \ln \frac{L(\hat \theta)}{L(\hat \theta^\prime_\text{saturated})} = -2 \ln \frac{\prod_i G(x_i; \hat \mu, \sigma_i)}{\prod_i G(x_i; x_i, \sigma_i)}\\
& = & -2 \ln \prod_i \frac{\frac{1}{\sqrt{2\pi}\sigma_i}\exp\left(-\frac{(x_i-\hat \mu)^2}{2\sigma_i^2}\right)}{\frac{1}{\sqrt{2\pi}\sigma_i}\exp\left(-\frac{(x_i-x_i)^2}{2\sigma_i^2}\right)} = -2\sum_i \ln \exp\left(-\frac{(x_i-\hat \mu)^2}{2\sigma_i^2}\right) \\
& = & \sum_i \frac{(x_i-\hat \mu)^2}{\sigma_i^2} = \chi^2 \text{; hence } \ln L(\theta) = - \chi^2(\theta) / 2
\end{aligned}$$
%% Cell type:markdown id:7ae85d99-d6fa-409b-abb2-144cc24570db tags:
# Summary and outlook
## Summary and outlook
Summary
- method of least squares ($\chi^2$)
- maximum likelihood
- relation between $\chi^2$ and ML
- minimization
- Literature:
  - Glen Cowan, Statistical Data Analysis,
    [pdf](https://www.sherrytowers.com/cowan_statistical_data_analysis.pdf)
  - Roger John Barlow, Statistics: A Guide to the Use of Statistical
    Methods in the Physical Sciences,
    [lecture notes](https://arxiv.org/pdf/1905.12362.pdf)
  - Volker Blobel, Erich Lohrmann, Statistische und numerische
    Methoden der Datenanalyse,
    [pdf](https://www.desy.de/~sschmitt/blobel/eBuch.pdf)