The minimum amount of measure theory needed to understand the probability theory behind machine learning.

These notes are based on (Capinski & Kopp, 2013), (Rosenthal, 2006) and (Qian, 2016).

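**Definition ($\sigma$-algebra).** Let $\Omega$ be a set. Then a $\sigma$-algebra $\mathcal F$ is a nonempty collection of subsets of $\Omega$ such that

- $\Omega \in \mathcal F$.
- If $A$ is in $\mathcal F$, then so is the complement of $A$.
- If $(A_n)_{n = 1}^\infty$ is a sequence of elements of $\mathcal F$, then the union $\bigcup_{n = 1}^\infty A_n$ is in $\mathcal F$.

Call $(\Omega, \mathcal F)$ a measurable space.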
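On a finite set the axioms above can be checked by brute force. The following is a minimal sketch, not part of the cited texts: the helper name `is_sigma_algebra` and the example collections are assumptions made for illustration. Because the space is finite, countable unions reduce to finite unions, so checking pairwise unions is enough.

```python
from itertools import combinations

def is_sigma_algebra(omega, collection):
    """Check the sigma-algebra axioms for a collection of subsets of a finite set."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in collection}
    # Nonempty, and every member must be a subset of omega.
    if not sets or any(not s <= omega for s in sets):
        return False
    # Contains the whole space.
    if omega not in sets:
        return False
    # Closed under complement.
    if any(omega - s not in sets for s in sets):
        return False
    # Closed under union; on a finite space pairwise unions suffice.
    return all(a | b in sets for a, b in combinations(sets, 2))

omega = {1, 2, 3, 4}
coarse = [set(), {1, 2}, {3, 4}, omega]   # a valid sigma-algebra
not_closed = [set(), {1}, {2}, omega]     # complement of {1} is missing

print(is_sigma_algebra(omega, coarse))      # True
print(is_sigma_algebra(omega, not_closed))  # False
```

The second collection fails because the complement of `{1}`, namely `{2, 3, 4}`, is not a member.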

**Definition (Measure).**
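Let $(\Omega, \mathcal F)$ be a measurable space.
Let $\mu: \mathcal F \to \bar{\mathbb R}$ be a mapping, where $\bar{\mathbb R}$ denotes the set of extended real numbers.
Then $\mu$ is called a measure on $(\Omega, \mathcal F)$ if and only if it has the following properties:

- For every $S \in \mathcal F$, $\mu(S) \geq 0$.
- For every sequence of pairwise disjoint sets $(S_n)_{n = 1}^\infty$ in $\mathcal F$: \begin{align} \mu\left(\bigcup_{n = 1}^\infty S_n \right) = \sum_{n = 1}^\infty \mu(S_n). \end{align} (that is, $\mu$ is a countably additive function)
- $\mu(\emptyset) = 0$.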
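As a concrete instance, a measure on a finite set can be built from nonnegative point weights, with the power set as the $\sigma$-algebra. The sketch below is illustrative only: the names `power_set`, `make_measure`, `is_measure` and the weights are hypothetical, and on a finite space countable additivity is checked through additivity on disjoint pairs.

```python
from itertools import chain, combinations

def power_set(omega):
    """All subsets of a finite set, as frozensets."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def make_measure(weights):
    """Return mu(S) = sum of the point weights of the elements of S."""
    return lambda s: sum(weights[x] for x in s)

def is_measure(mu, sigma_algebra):
    """Check nonnegativity, mu(empty set) = 0, and additivity on disjoint pairs."""
    nonnegative = all(mu(s) >= 0 for s in sigma_algebra)
    null_empty = mu(frozenset()) == 0
    additive = all(
        mu(a | b) == mu(a) + mu(b)
        for a, b in combinations(sigma_algebra, 2)
        if not (a & b)
    )
    return nonnegative and null_empty and additive

weights = {"a": 2, "b": 0, "c": 5}   # hypothetical integer point masses
F = power_set(weights)               # power set of {"a", "b", "c"}
mu = make_measure(weights)

print(mu(frozenset({"a", "c"})))  # 7
print(is_measure(mu, F))          # True
```

Integer weights are used so that the equality checks are exact; with floating-point weights a tolerance would be needed.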

**Definition (Probability measure).**
Let $(\Omega, \mathcal F)$ be a measurable space.
A measure $\mathbb P$ on this space is called a probability measure if $\mathbb P(\Omega) = 1$.

Call $(\Omega, \mathcal F, \mathbb P)$ a probability triple.
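A standard example of a probability triple is a fair die: the sample space is $\{1, \dots, 6\}$, the $\sigma$-algebra is the power set, and each event gets probability proportional to its size. The sketch below assumes this uniform model; the function name `P` is chosen here for illustration.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))   # outcomes of one die roll

def P(event):
    """Uniform probability: |event| / |omega|, as an exact fraction."""
    return Fraction(len(frozenset(event) & omega), len(omega))

even = {2, 4, 6}
print(P(omega))                               # 1
print(P(even))                                # 1/2
print(P(even) + P(omega - even) == P(omega))  # True
```

Exact fractions avoid rounding issues when checking that the whole space has probability 1.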

**Definition (Measurable function).**
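Let $(\Omega, \mathcal F)$ be a measurable space.
Let $(E, \mathcal E)$ be another measurable space.
Let $f: \Omega \to E$ be a function.
Define $f^{-1}(B) = \{\omega \in \Omega : f(\omega) \in B\}$ for $B \subseteq E$.
$f$ is said to be $(\mathcal F, \mathcal E)$-measurable if $f^{-1}(B) \in \mathcal F$ for all $B \in \mathcal E$.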
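Measurability depends on the $\sigma$-algebra on the domain: a function is measurable only if the domain's $\sigma$-algebra is fine enough to contain every preimage. The following sketch checks this on a small finite example; the helpers `preimage` and `is_measurable` and the functions `high` and `parity` are assumptions made for illustration.

```python
def preimage(f, domain, target_set):
    """f^{-1}(B) = {x in domain : f(x) in B}."""
    return frozenset(x for x in domain if f(x) in target_set)

def is_measurable(f, domain, source_sets, target_sets):
    """True if the preimage of every set in target_sets lies in source_sets."""
    source = {frozenset(s) for s in source_sets}
    return all(preimage(f, domain, b) in source for b in target_sets)

omega = {1, 2, 3, 4}
F = [set(), {1, 2}, {3, 4}, omega]   # coarse sigma-algebra on omega
E = [set(), {0}, {1}, {0, 1}]        # power set of the codomain {0, 1}

def high(x):
    return int(x >= 3)   # constant on {1, 2} and on {3, 4}

def parity(x):
    return x % 2         # separates 1 from 2, which F cannot do

print(is_measurable(high, omega, F, E))    # True
print(is_measurable(parity, omega, F, E))  # False
```

Here `parity` fails because the preimage of `{0}` is `{2, 4}`, which is not in the coarse collection `F`.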

**Definition (Random variable).**
Let $(\Omega, \mathcal F, \mathbb P)$ be a probability triple.
Let $(E, \mathcal E)$ be a measurable space.
Then a function $X: \Omega \to E$ is called a random variable if it is $(\mathcal F, \mathcal E)$-measurable.

**Definition (Probability distribution).**
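Given a random variable $X$ on a probability triple $(\Omega, \mathcal F, \mathbb P)$ and the output space $(E, \mathcal E)$, the probability distribution of $X$ is $\mu = \mathbb P \circ X^{-1}$, that is, $\mu(B) = \mathbb P(X^{-1}(B))$ for all $B \in \mathcal E$.
We write $X \sim \mu$.

Note that $\mu$ is a valid measure on $(E, \mathcal E)$.

We also call $\mu$ the *law of $X$* and denote it $\mathcal L(X)$.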
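The distribution of a random variable is the probability measure pushed forward through it: the probability of a set of output values is the probability of its preimage. The sketch below computes this for two fair coin flips, with the number of heads as the random variable. The model and the names `P`, `X` and `law` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))   # four equally likely outcomes

def P(event):
    """Uniform probability measure on omega."""
    return Fraction(len(set(event)), len(omega))

def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

def law(B):
    """Pushforward of P under X: P(X^{-1}(B))."""
    return P([w for w in omega if X(w) in B])

print([str(law({k})) for k in (0, 1, 2)])  # ['1/4', '1/2', '1/4']
print(law({0, 1, 2}))                      # 1
```

The last line illustrates that the pushforward assigns total mass 1 to the output space, so it is itself a probability measure.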

**Definition (Integration).**

**Definition (Expectation).**

**Definition (Product measures).**

**Theorem (Radon-Nikodym).**

**Definition (Probability density).**

**Definition (Conditional expectation).**

**Definition (Conditional probability).**

**Theorem (Bayes’ rule).**

**Theorem (Sum rule).**

**Theorem (Product rule).**

**References**

- Capinski, M., & Kopp, P. E. (2013). *Measure, integral and probability*. Springer Science & Business Media.

- Rosenthal, J. S. (2006). *A first look at rigorous probability theory*. World Scientific.

- Qian, Z. (2016, September). Lecture notes on the course “B8.1 Martingales through Measure Theory.” Mathematical Institute, University of Oxford. https://courses.maths.ox.ac.uk/node/124
