Slide 11 of 23
Notes:
I’m avoiding too much mathematics, but the actual formula is pretty simple. If there is a set of N possibilities, and one of them must occur, let p(i) be the probability of the ith one. Then the entropy, in bits, is H = −Σ p(i) · log₂ p(i), summed over i from 1 to N.
It turns out that this (rounded up) is the minimum average number of bits needed to say which of the possibilities occurred.
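As a quick illustration (a sketch I’m adding here, not something on the slide), the formula is easy to compute directly. This little Python snippet assumes the probabilities are given as a list that sums to 1:

    import math

    def entropy(probabilities):
        # Shannon entropy in bits for a complete set of outcome probabilities.
        # Terms with p == 0 contribute nothing (the limit of p*log2(p) as p -> 0 is 0).
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    print(entropy([0.5, 0.5]))    # a fair coin: 1.0 bit
    print(entropy([1/8] * 8))     # a fair 8-sided die: 3.0 bits
    print(entropy([0.9, 0.1]))    # a biased coin: about 0.47 bits

For the 8 equally likely outcomes the entropy is exactly 3 bits, which matches the 3 bits you would need to number them 0 through 7.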
There are other definitions of entropy, but this is the one that is meant in this tutorial.
A great reference for this stuff is “Basic Concepts in Information Theory and Coding” (subtitled “the Adventures of Secret Agent 00111”) by Golomb, Peile, and Scholtz. It is surprisingly readable.