Lecture 1 Introduction
Statistics links physical theories with experiments
- Theory + response of measurement apparatus = model prediction
- Observations have uncertainty on many levels
- Statistics allows us to quantify this uncertainty with probability
Definition of probability
- Kolmogorov Axioms (1933)
- Defined in set theory
- Can be compacted to three axioms
- For all A ⊆ S, P(A) >= 0
- P(S) = 1
- If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)
- Conditional probability is also defined; it cannot be derived from the three axioms (see the formulas below)
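Written out (a compact summary of the statements above, with A and B subsets of the sample space S):

```latex
% Kolmogorov axioms
P(A) \ge 0 \quad \text{for all } A \subseteq S, \qquad
P(S) = 1, \qquad
A \cap B = \emptyset \;\Rightarrow\; P(A \cup B) = P(A) + P(B)
% Definition of conditional probability (for P(B) > 0)
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
```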
Interpretations of Probability
- The axioms provide no interpretations of the elements of the sample space.
- Frequentist statistics treats probability as a limiting frequency
- A, B, … are outcomes of an experiment
- That can be repeated an infinite number of times
- Probability is limiting frequency
- What does it mean to say whether a theory is favoured or disfavoured?
- Preferred theories predict a high probability for the data that is ‘like’ the data observed
- Bayesian or Subjective statistics treats probability as a degree of belief
- A, B, … are hypotheses
- P(A) = degree of belief that A is true
- S is sometimes called the hypothesis space
- In contrast to the frequentist interpretation, Bayes' theorem says: if your prior probabilities were p, then it tells you how these probabilities should change in light of the data
- There is NO recipe for finding the prior probabilities. This is the subjective nature of Bayesian statistics.
- In practice you can't enumerate all the hypotheses (needed for the denominator in Bayes' theorem)
- Both these interpretations are consistent with the Kolmogorov axioms
- and therefore also satisfy Bayes' theorem
Bayes theorem
- Relates P(A|B) to P(B|A) (written out below)
- link the essay
- Both interpretations of probability are consistent with Bayes' theorem
- add these things
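Written out (standard form, with A and B events or hypotheses as in the interpretations above):

```latex
% Bayes' theorem
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```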
Law of total probability
- We express P(B) as a sum of P(B|A_i) P(A_i) over a set of A_i that partition the sample space
- This is often used in the denominator of Bayes' theorem (see below)
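Written out (the A_i are disjoint and together cover the sample space):

```latex
% Law of total probability
P(B) = \sum_i P(B \mid A_i)\, P(A_i)
% Combined with Bayes' theorem (the usual form of the denominator)
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{\sum_i P(B \mid A_i)\, P(A_i)}
```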
Probability density and mass fn
- A probability density function describes a continuous variable; a probability mass function describes a discrete one
Cumulative Distribution fn
- We can integrate the density fn to get the cumulative distribution,
- or differentiate the cumulative distribution to get the density fn (see below)
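The relation, written out for a continuous variable x with pdf f and cumulative distribution F:

```latex
F(x) = P(X \le x) = \int_{-\infty}^{x} f(x')\, \mathrm{d}x',
\qquad
f(x) = \frac{\partial F(x)}{\partial x}
```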
Expectation values
- can be used to summarise a complicated pdf
- E(X) = mu “centre of gravity” of the pdf
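Written out for a continuous variable with pdf f(x); the variance V (used later for estimators) is the corresponding measure of width:

```latex
E[X] = \int x\, f(x)\, \mathrm{d}x = \mu,
\qquad
V[X] = E[(X - \mu)^2] = E[X^2] - \mu^2
```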
Lecture 2: Parameter estimation
Probabilities are assigned to outcomes of the data (the frequentist approach); there are also Bayesian methods for parameter estimation.
Hypotheses and likelihoods
- A hypothesis is a rule that assigns a probability to each data outcome
- P(x|H): "what is the probability of x under the assumption of some hypothesis H?"
- Viewed as a function of the hypothesis (or its parameters) with the data fixed, this is the likelihood function L
- We fix x in L, so the x dependence is hidden
- L is not a pdf for the parameter; P(x|H) is a pdf for the data! (see the sketch below)
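A minimal sketch (not from the lecture; the Gaussian pdf, the data values and sigma = 1 are assumed purely for illustration) showing that L is evaluated with the data held fixed and only the parameter varied:

```python
import numpy as np

# Hypothetical observed data, held fixed (assumed values for illustration only)
x = np.array([4.2, 5.1, 4.8, 5.5, 4.9])
sigma = 1.0  # assume the width is known

def log_likelihood(mu):
    """log L(mu) = sum_i log f(x_i; mu) for a Gaussian pdf with known sigma.
    A function of the parameter mu; the data x do not vary."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2))

# Evaluate for a few parameter values: a function of mu, not a pdf in mu
for mu in (4.0, 4.9, 6.0):
    print(mu, log_likelihood(mu))
```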
Parameters
- of a pdf are any constants that characterise it.
- we want a function, or estimator, to estimate the parameters.
- written θ̂(x), a function of the data x
Properties of estimators
- we want small bias
- E(theta hat) - theta
- we want small variance
- V(theta hat)
- These are often conflicting criteria; optimising both at once is difficult (see the formulas below)
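Written out (b is the bias; the mean squared error is one common way of combining the two criteria, an addition not stated explicitly above):

```latex
b = E[\hat{\theta}] - \theta,
\qquad
V[\hat{\theta}] = E[\hat{\theta}^{2}] - \big(E[\hat{\theta}]\big)^{2},
\qquad
\mathrm{MSE} = E[(\hat{\theta} - \theta)^{2}] = V[\hat{\theta}] + b^{2}
```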
Maximum likelihood estimators
- Find the value θ̂ of θ for which the likelihood L(θ) is maximised
- Equivalent to maximising the log-likelihood (see below)
- MLEs are not guaranteed to have optimal properties.
- ??
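Written out (for independent measurements x_1, ..., x_n with pdf f(x; θ)):

```latex
L(\theta) = \prod_{i=1}^{n} f(x_i; \theta),
\qquad
\hat{\theta} = \arg\max_{\theta} L(\theta)
\;\Leftrightarrow\;
\frac{\partial \ln L}{\partial \theta}\bigg|_{\theta = \hat{\theta}} = 0
% (the derivative condition assumes a maximum in the interior of the parameter space)
```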
Properties of MLE
- for the example estimator considered in the lecture, the bias is 0
- and the variance is θ²/n (see the worked example below)
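These numbers match the standard exponential decay-time example (an assumption here, not stated explicitly in the notes): data t_1, ..., t_n with pdf f(t; τ) = (1/τ) e^{-t/τ}, for which the ML estimator is the sample mean:

```latex
\hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} t_i,
\qquad
E[\hat{\tau}] = \tau \;\;(\text{bias } b = 0),
\qquad
V[\hat{\tau}] = \frac{\tau^{2}}{n}
```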
Monte carlo variance
- In most cases, calculating the variance of estimators is not easy. One way is to simulate the experiment many times
- The distribution of ML estimates is approximately Gaussian in the large-sample limit (sketch below)
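A minimal Monte Carlo sketch (assuming, for illustration only, the exponential decay-time example with true τ = 2 and n = 50 measurements per experiment; not taken from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)

tau_true = 2.0   # assumed true parameter value (illustration only)
n = 50           # measurements per pseudo-experiment
n_exp = 10000    # number of simulated experiments

# Simulate the experiment many times; the ML estimator of tau is the sample mean
estimates = np.array([rng.exponential(tau_true, n).mean() for _ in range(n_exp)])

print("mean of estimates:", estimates.mean())        # ~ tau_true (small bias)
print("std. dev. of estimates:", estimates.std())    # ~ tau_true / sqrt(n)
print("expected std. dev.:", tau_true / np.sqrt(n))
# A histogram of `estimates` would look approximately Gaussian (large-sample limit)
```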
Variance of estimators from information inequality
- Sets lower bound on variance for any estimator.
- For small bias, and for the MLE in the large-sample limit, the bound is (approximately) attained and the inequality becomes an equality
- ⇒ the MLE is 'efficient' (bound written out below)
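The bound (the Rao-Cramér-Frechet inequality), written out for a single parameter θ with bias b:

```latex
V[\hat{\theta}] \;\ge\;
\frac{\left(1 + \dfrac{\partial b}{\partial \theta}\right)^{2}}
{E\!\left[ -\,\dfrac{\partial^{2} \ln L}{\partial \theta^{2}} \right]}
```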
Lecture 3: Hypothesis testing & confidence intervals
Suppose a measurement produces some data x; consider a hypothesis H0 (and possibly an alternative H1). We can reject or accept H0.
Set up a critical region such that the probability of finding the data there, under H0, is less than or equal to (exact equality may not be possible for discrete data) a small probability alpha.
If the data x are observed in the critical region, reject H0.
The alternative hypothesis motivates the placement of the critical region
Test significance / goodness of fit
’Discovery’ at 5 sigma in particle physics, corresponding to a p-value of about 2.9 × 10⁻⁷ (see below)
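A minimal sketch of the conversion between a significance in sigma and a one-sided p-value (uses scipy; the values shown are standard conventions, not specific to this course):

```python
from scipy.stats import norm

# One-sided tail probability of a standard Gaussian beyond Z sigma
for z in (3, 5):
    print(f"{z} sigma -> p = {norm.sf(z):.2e}")
# 5 sigma corresponds to p ~ 2.9e-7, the conventional 'discovery' threshold
```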
Lecture 4: Machine Learning
Curve fitting
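A minimal least-squares curve-fitting sketch (the straight-line model, the fake data and the uncertainties are assumed purely for illustration; the lecture's own examples may differ):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Hypothetical data: straight line plus Gaussian noise with known sigma = 1
x = np.linspace(0, 10, 20)
y = 1.5 * x + 2.0 + rng.normal(0.0, 1.0, x.size)
sigma_y = np.full(x.size, 1.0)

def line(x, a, b):
    """Model to fit: y = a*x + b."""
    return a * x + b

# Least-squares fit; popt are the fitted parameters, pcov their covariance matrix
popt, pcov = curve_fit(line, x, y, sigma=sigma_y, absolute_sigma=True)
print("fitted a, b:", popt)
print("parameter uncertainties:", np.sqrt(np.diag(pcov)))
```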