Tuan Anh Le

Measure theory for probability (UNFINISHED)

The minimum amount of measure theory needed to understand the probability theory behind machine learning.

These notes are based on (Capinski & Kopp, 2013), (Rosenthal, 2006) and (Qian, 2016).

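Definition ($\sigma$-algebra). Let $\Omega$ be a set. Then a $\sigma$-algebra $\mathcal{F}$ is a nonempty collection of subsets of $\Omega$ such that

  1. $\Omega \in \mathcal{F}$.
  2. If $A$ is in $\mathcal{F}$, then so is the complement $A^c = \Omega \setminus A$ of $A$.
  3. If $(A_n)_{n \geq 1}$ is a sequence of elements of $\mathcal{F}$, then the union $\bigcup_{n = 1}^\infty A_n$ is in $\mathcal{F}$.

Call $(\Omega, \mathcal{F})$ a measurable space.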
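On a finite set the three axioms can be checked by brute force, which gives a feel for what the definition rules in and out. The sketch below assumes a finite underlying set, so closure under countable unions reduces to closure under pairwise unions; the function name `is_sigma_algebra` and the example collections are illustrative and not part of the notes.

```python
from itertools import combinations


def is_sigma_algebra(omega, collection):
    """Check the sigma-algebra axioms for a collection of subsets of a finite set."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in collection}
    # Every member must be a subset of the underlying set.
    if any(not s <= omega for s in sets):
        return False
    # Axiom 1: the whole set belongs to the collection.
    if omega not in sets:
        return False
    # Axiom 2: closed under complement.
    if any(omega - s not in sets for s in sets):
        return False
    # Axiom 3: closed under union (pairwise unions suffice for a finite collection).
    return all(a | b in sets for a, b in combinations(sets, 2))


omega = {1, 2, 3, 4}
trivial = [set(), omega]                      # smallest sigma-algebra on omega
generated = [set(), {1, 2}, {3, 4}, omega]    # generated by the event {1, 2}
not_closed = [set(), {1}, {2}, omega]         # complement of {1} is missing

print(is_sigma_algebra(omega, trivial))
print(is_sigma_algebra(omega, generated))
print(is_sigma_algebra(omega, not_closed))
```

The last collection fails because it contains the set {1} but not its complement {2, 3, 4}.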

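Definition (Measure). Let $(\Omega, \mathcal{F})$ be a measurable space. Let $\mu: \mathcal{F} \to \overline{\mathbb{R}}$ be a mapping, where $\overline{\mathbb{R}}$ denotes the set of extended real numbers. Then $\mu$ is called a measure on $(\Omega, \mathcal{F})$ if and only if it has the following properties:

  1. For every $S \in \mathcal{F}$, $\mu(S) \geq 0$.
  2. For every sequence $(S_n)_{n \geq 1}$ of pairwise disjoint sets in $\mathcal{F}$: \begin{align} \mu\left(\bigcup_{n = 1}^\infty S_n \right) = \sum_{n = 1}^\infty \mu(S_n). \end{align} (that is, $\mu$ is a countably additive function)
  3. $\mu(\emptyset) = 0$.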
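As a concrete instance, any assignment of nonnegative weights to the points of a finite set defines a measure on its power set by summing the weights of the points in each set. The sketch below is a minimal check of the three properties under that finiteness assumption, where countable additivity reduces to additivity over disjoint pairs; the helper names `power_set`, `make_measure` and `is_measure` are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations


def power_set(omega):
    """All subsets of a finite set, as frozensets."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]


def make_measure(weights):
    """Measure that sums the nonnegative weights of the points in a set."""
    def mu(s):
        return sum((weights[x] for x in s), Fraction(0))
    return mu


def is_measure(mu, sigma_algebra):
    """Check nonnegativity, additivity on disjoint pairs, and mu(empty set) = 0."""
    sets = [frozenset(s) for s in sigma_algebra]
    nonnegative = all(mu(s) >= 0 for s in sets)
    additive = all(
        mu(a | b) == mu(a) + mu(b)
        for a, b in combinations(sets, 2)
        if not (a & b)  # only disjoint pairs are constrained
    )
    return nonnegative and additive and mu(frozenset()) == 0


weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
F = power_set(weights.keys())
mu = make_measure(weights)

print(is_measure(mu, F))
print(is_measure(lambda s: Fraction(len(s)) ** 2, F))  # squared cardinality is not additive
```

Exact rational arithmetic via `Fraction` is used so that the additivity check is not disturbed by floating-point rounding.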

Definition (Probability measure). Let $(\Omega, \mathcal{F})$ be a measurable space. A measure $\mathbb{P}$ on this space is called a probability measure if $\mathbb{P}(\Omega) = 1$.

Call $(\Omega, \mathcal{F}, \mathbb{P})$ a probability triple.
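A fair six-sided die gives perhaps the simplest probability triple: the sample space is the set of faces, the $\sigma$-algebra is its power set, and the probability measure is uniform. The sketch below assumes this setup (the names `omega`, `prob` and `even` are illustrative) and shows the complement rule, which follows from additivity together with the normalization of the whole space to 1.

```python
from fractions import Fraction

# Sample space of a fair six-sided die; every subset is taken to be an event.
omega = frozenset(range(1, 7))


def prob(event):
    """Uniform probability measure on the power set of omega."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(omega))


even = frozenset({2, 4, 6})

print(prob(omega))         # normalization: the whole space has probability 1
print(prob(even))          # probability of rolling an even number
print(prob(omega - even))  # complement rule: 1 - prob(even)
```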

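Definition (Measurable function). Let $(\Omega, \mathcal{F})$ be a measurable space. Let $(E, \mathcal{E})$ be another measurable space. Let $f: \Omega \to E$ be a function. Define $f^{-1}(A) := \{\omega \in \Omega : f(\omega) \in A\}$ for $A \subseteq E$. $f$ is said to be $(\mathcal{F}, \mathcal{E})$-measurable if $f^{-1}(A) \in \mathcal{F}$ for all $A \in \mathcal{E}$.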
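Measurability depends on the $\sigma$-algebras involved, not only on the function. The sketch below assumes finite spaces and checks the definition directly by computing every preimage; the names `preimage`, `is_measurable`, `f` and `g` are hypothetical. With a coarse $\sigma$-algebra on the domain that can only distinguish {1, 2} from {3, 4}, a function that is constant on those two blocks is measurable, while the parity function is not.

```python
def preimage(f, domain, target_set):
    """Set of points in the domain that f maps into target_set."""
    return frozenset(x for x in domain if f(x) in target_set)


def is_measurable(f, domain, sigma_domain, sigma_target):
    """Check that the preimage of every target event is a domain event."""
    sigma_domain = {frozenset(s) for s in sigma_domain}
    return all(preimage(f, domain, a) in sigma_domain for a in sigma_target)


omega = {1, 2, 3, 4}
F = [set(), {1, 2}, {3, 4}, omega]        # coarse sigma-algebra on the domain
E_sets = [set(), {0}, {1}, {0, 1}]        # power set of the target space {0, 1}


def f(x):
    return 0 if x <= 2 else 1             # constant on the blocks {1, 2} and {3, 4}


def g(x):
    return x % 2                          # parity splits the blocks


print(is_measurable(f, omega, F, E_sets))
print(is_measurable(g, omega, F, E_sets))
```

The function `g` fails because the preimage of {0} is {2, 4}, which is not in the domain $\sigma$-algebra.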

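Definition (Random variable). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability triple. Let $(E, \mathcal{E})$ be a measurable space. Then a function $X: \Omega \to E$ is called a random variable if it is $(\mathcal{F}, \mathcal{E})$-measurable.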

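Definition (Probability distribution). Given a random variable $X$ on a probability triple $(\Omega, \mathcal{F}, \mathbb{P})$ and the output space $(E, \mathcal{E})$, the probability distribution of $X$ is $\mu_X := \mathbb{P} \circ X^{-1}$, that is, $\mu_X(A) = \mathbb{P}(X^{-1}(A))$ for $A \in \mathcal{E}$. We write $X \sim \mu_X$.

Note that $\mu_X$ is a valid measure on $(E, \mathcal{E})$.

We also call $\mu_X$ the law of $X$ and denote it $\mathcal{L}(X)$.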
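The distribution of a random variable assigns to each measurable set of outputs the probability of its preimage. As a sketch, assume two fair dice with the uniform probability measure on all 36 outcomes, and take the random variable to be their sum; the names `omega`, `X` and `law` are illustrative choices, and every subset is treated as an event so that measurability is automatic.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of faces of two fair dice, uniform probability.
omega = list(product(range(1, 7), repeat=2))


def X(outcome):
    """Random variable: the sum of the two faces."""
    return outcome[0] + outcome[1]


def law(A):
    """Distribution of X: the probability of the preimage of A under X."""
    preimage = [w for w in omega if X(w) in A]
    return Fraction(len(preimage), len(omega))


print(law({7}))               # six of the 36 outcomes sum to 7
print(law(set(range(2, 13)))) # the whole output space has probability 1
print(law({2, 3}) == law({2}) + law({3}))  # additivity on disjoint sets
```

The last line illustrates why the distribution is itself a measure on the output space: preimages of disjoint sets are disjoint, so additivity is inherited from the underlying probability measure.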

Definition (Integration).

Definition (Expectation).

Definition (Product measures).

Theorem (Radon-Nikodym).

Definition (Probability density).

Definition (Conditional expectation).

Definition (Conditional probability).

Theorem (Bayes’ rule).

Theorem (Sum rule).

Theorem (Product rule).


  1. Capinski, M., & Kopp, P. E. (2013). Measure, integral and probability. Springer Science & Business Media.
  2. Rosenthal, J. S. (2006). A first look at rigorous probability theory. World Scientific.
  3. Qian, Z. (2016, September). Lecture notes on the course “B8.1 Martingales through Measure Theory.” Mathematical Institute, University of Oxford. https://courses.maths.ox.ac.uk/node/124