Compare commits: bff7640d0a...053eea8b11 (5 commits)

| SHA1 |
|---|
| 053eea8b11 |
| 9efb23dc3d |
| e5d60bfb94 |
| cbe5d6cde7 |
| c80bacb2fc |

@@ -17,4 +17,33 @@ Arithmetic coding works by taking a stream of data, and converting it into an in

For example, the probability of a coin flip resulting in tails is 50%, and the probability of a coin flip resulting in heads is 50%. The probability of a coin flip resulting in heads *or* tails is 100%.

If we wanted to keep track of the result of a series of coin flips, this could be done by subdividing a range. If the stored value is between $0$ and $0.5$, then we know that the first flip must have been tails.

If the value is between $0.5$ and $1$, then we know that the first flip must have been heads.

This subdivision process can be repeated indefinitely, dividing each range again, to store any number of coin flips.

To store two coin flips, you might have the first subdivision represent the outcome of the first coin flip, and the second subdivision represent the outcome of the second coin flip:

| Range | Result |
| ------------- | ------------ |
| $0.00 - 0.25$ | Tails, Tails |
| $0.25 - 0.50$ | Tails, Heads |
| $0.50 - 0.75$ | Heads, Tails |
| $0.75 - 1.00$ | Heads, Heads |
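
The table can be generated mechanically for any number of flips. The following is a minimal sketch, assuming fair coins and the Tails-before-Heads ordering used in the table; the function name `flip_table` is hypothetical and only for illustration.

```python
from itertools import product


def flip_table(n):
    """Return (low, high, outcomes) rows for every sequence of n fair coin flips."""
    width = 1 / 2 ** n  # every sequence gets an equal share of the range 0..1
    rows = []
    # product() yields sequences in order, with Tails (bottom half) before Heads
    for i, outcomes in enumerate(product(("Tails", "Heads"), repeat=n)):
        rows.append((i * width, (i + 1) * width, outcomes))
    return rows


for low, high, outcomes in flip_table(2):
    print(f"{low:.2f} - {high:.2f}  {', '.join(outcomes)}")
```

Calling `flip_table(3)` would give eight rows, each one eighth of the range wide.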

Imagine a situation where we want to store the outcome of three consecutive coin flips, *Heads, Heads, Tails*, as a single decimal number.
Encoding this would happen as follows:
1. First we subdivide the range by the probability of each event happening. The probability of each is 50%, so that's simple. Referring above, we know that heads is represented by the top half of the range, and tails is represented by the bottom half of the range.
> Because the *first* coin flip resulted in *Heads*, the output value must be between $0.50$ and $1.00$.
2. Subdividing the range between $0.50$ and $1.00$ again to store the result of the second flip, we end up with values between $0.50$ and $0.75$ representing the sequence *Heads, Tails*, and values between $0.75$ and $1.00$ representing the sequence *Heads, Heads*.
> Because the *second* coin flip resulted in *Heads*, we know that the output value must be between $0.75$ and $1.00$.
3. Subdividing the range between $0.75$ and $1.00$ yet again, a value in the range $0.750$ - $0.875$ means the third coin flip resulted in *Tails*, and a value in the range $0.875$ - $1.000$ means the third coin flip resulted in *Heads*.
> Because the *third* coin flip resulted in *Tails*, any value between $0.750$ and $0.875$ encodes the fact that the first three coin flips went *Heads, Heads, Tails*.
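
The encoding steps can be sketched in a few lines of code. This is only an illustration under the assumptions of this example (fair coins, Heads in the top half, Tails in the bottom half); the function name `encode` and the `"H"`/`"T"` notation are made up for the sketch.

```python
def encode(flips):
    """Narrow the range 0..1 once per flip and return the final (low, high)."""
    low, high = 0.0, 1.0
    for flip in flips:
        mid = (low + high) / 2
        if flip == "H":
            low = mid   # Heads: keep the top half of the current range
        else:
            high = mid  # Tails: keep the bottom half of the current range
    return low, high


print(encode("HHT"))  # (0.75, 0.875)
```

Any value inside the returned range identifies the same sequence of flips.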

The decoding process performs the same series of steps, but asks a question at each step instead of outputting a value.
1. Is the value between $0.00$ and $0.50$? If so, the first coin flip resulted in *Tails*. Otherwise, if the value is between $0.50$ and $1.00$, the first coin flip resulted in *Heads*.

The above process can be repeated, just like the encoding process, until we've determined the result of the first three coin flips.
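
A decoder along these lines might look like the sketch below. It assumes the decoder already knows how many flips were stored (the `count` parameter, a name chosen for this sketch), since the value alone does not say when to stop.

```python
def decode(value, count):
    """Recover `count` fair coin flips from a value in the range 0..1."""
    low, high = 0.0, 1.0
    flips = []
    for _ in range(count):
        mid = (low + high) / 2
        if value < mid:
            flips.append("Tails")  # value lies in the bottom half
            high = mid
        else:
            flips.append("Heads")  # value lies in the top half
            low = mid
    return flips


print(decode(0.8, 3))  # ['Heads', 'Heads', 'Tails']
```

Here $0.8$ was picked arbitrarily from the range $0.750$ - $0.875$; any other value in that range decodes to the same three flips.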

These subdivisions can be encoded using $0$ and $1$, where $0$ represents the bottom half of the range, and $1$ represents the top half of the range. For the example above, *Heads, Heads, Tails* becomes the bits $110$.
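
One way to see why this works is to read the bits as a binary fraction: the result lands at the bottom edge of the range that the same choices select. The sketch below assumes Heads maps to the top half ($1$) and Tails to the bottom half ($0$); both helper names are hypothetical.

```python
def flips_to_bits(flips):
    """Map each flip to one bit: 'H' (top half) -> 1, anything else -> 0."""
    return "".join("1" if flip == "H" else "0" for flip in flips)


def bits_to_value(bits):
    """Read the bits as a binary fraction 0.b1 b2 b3 ..."""
    return sum(int(bit) / 2 ** (i + 1) for i, bit in enumerate(bits))


bits = flips_to_bits("HHT")
print(bits, bits_to_value(bits))  # 110 0.75
```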

When the alphabet is large enough that a single bit can't select a particular outcome, multiple bits can be used instead, each one halving the range again.
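
As a rough illustration, the sketch below assigns a fixed-width bit pattern to each symbol, assuming all symbols are equally likely (a simplifying assumption made only for this example). The function name `symbol_bits` and the four-letter alphabet are hypothetical.

```python
from math import ceil, log2


def symbol_bits(symbol, alphabet):
    """Return the bits that select `symbol`, assuming equally likely symbols."""
    # Number of halvings needed to single out one symbol (at least one bit)
    width = max(1, ceil(log2(len(alphabet))))
    # The symbol's position, written as a zero-padded binary number
    return format(alphabet.index(symbol), f"0{width}b")


alphabet = ["A", "B", "C", "D"]
print([symbol_bits(s, alphabet) for s in alphabet])  # ['00', '01', '10', '11']
```

With four symbols, the first bit picks a half of the range and the second bit picks a quarter within it. When the alphabet size is not a power of two, some bit patterns in this sketch simply go unused.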