Compare commits

...

17 Commits

Author SHA1 Message Date
arc
145a871d1d vault backup: 2025-06-06 14:43:51 2025-06-06 14:43:51 -06:00
arc
b018456d64 vault backup: 2025-06-04 18:29:17 2025-06-04 18:29:17 -06:00
arc
6a8712dc7c vault backup: 2025-06-03 11:45:18 2025-06-03 11:45:18 -06:00
arc
892f40f37e vault backup: 2025-06-03 11:30:16 2025-06-03 11:30:16 -06:00
arc
83ceb7a5d0 vault backup: 2025-05-31 13:22:00 2025-05-31 13:22:00 -06:00
arc
3395a32204 vault backup: 2025-05-31 11:20:48 2025-05-31 11:20:48 -06:00
arc
4f1ea82d06 vault backup: 2025-05-30 16:10:23 2025-05-30 16:10:23 -06:00
arc
2857e5ea84 vault backup: 2025-05-30 16:05:23 2025-05-30 16:05:23 -06:00
arc
142ab93d04 vault backup: 2025-05-21 17:39:42 2025-05-21 17:39:42 -06:00
arc
90c9111a08 vault backup: 2025-05-09 13:07:48 2025-05-09 13:07:48 -06:00
arc
053eea8b11 vault backup: 2025-05-09 12:37:48 2025-05-09 12:37:48 -06:00
arc
9efb23dc3d vault backup: 2025-05-09 12:32:48 2025-05-09 12:32:48 -06:00
arc
e5d60bfb94 vault backup: 2025-05-09 12:27:48 2025-05-09 12:27:48 -06:00
arc
cbe5d6cde7 vault backup: 2025-05-09 12:22:48 2025-05-09 12:22:48 -06:00
arc
c80bacb2fc vault backup: 2025-05-09 12:17:48 2025-05-09 12:17:48 -06:00
arc
bff7640d0a vault backup: 2025-05-09 12:12:48 2025-05-09 12:12:48 -06:00
arc
27b25ff6dd vault backup: 2025-05-09 12:07:48 2025-05-09 12:07:48 -06:00
3 changed files with 38 additions and 4 deletions

2
.obsidian/app.json vendored

@ -2,7 +2,7 @@
"vimMode": true, "vimMode": true,
"promptDelete": false, "promptDelete": false,
"pdfExportSettings": { "pdfExportSettings": {
"includeName": true, "includeName": false,
"pageSize": "Letter", "pageSize": "Letter",
"landscape": false, "landscape": false,
"margin": "0", "margin": "0",


@ -14,7 +14,7 @@
"prevConfig": { "prevConfig": {
"pageSize": "A4", "pageSize": "A4",
"marginType": "1", "marginType": "1",
"showTitle": true, "showTitle": false,
"open": true, "open": true,
"scale": 100, "scale": 100,
"landscape": false, "landscape": false,


@ -1,8 +1,8 @@
<https://arxiv.org/abs/1311.2540> <https://arxiv.org/abs/1311.2540>
In standard numeral systems, different digits are treated as containing the same amount of information. A 7 stores the same amount of info as a 9, which stores the same amount of info as a 1. In standard numeral systems, different digits are treated as containing the same amount of information. A 7 is stored using the same amount of info as a 9, which is stored using the same amount of info as a 1, that is, 1 digit.
This makes the amount of information a single digit stores *uniform* across all digits. However, that's far from the most efficient way to represent most datasets. This makes the amount of information a single digit stores *uniform* across all digits. However, that's far from the most efficient way to represent most datasets, because real world data rarely follows a uniform distribution.
ANS theory is based around the idea that digits that occur more often can be stored in a way that requires less information, and digits that occur less often can be stored using more information. ANS theory is based around the idea that digits that occur more often can be stored in a way that requires less information, and digits that occur less often can be stored using more information.
@ -12,4 +12,38 @@ Taking a look at the standard binary numeral system, there are two digits in the
Given that $x$ represents a natural number, and $s$ is the digit we're adding. In a standard binary system, adding $s$ to the least significant position means that in the new number $x$ (before the addition) now represents the Nth appearance of an even (when $s = 0$), or odd (when $s = 1$). With ANS, the goal is to make that asymmetrical, so that you can represent more common values with a denser representation. Given that $x$ represents a natural number, and $s$ is the digit we're adding. In a standard binary system, adding $s$ to the least significant position means that in the new number $x$ (before the addition) now represents the Nth appearance of an even (when $s = 0$), or odd (when $s = 1$). With ANS, the goal is to make that asymmetrical, so that you can represent more common values with a denser representation.
# Arithmetic Coding
Arithmetic coding works by taking a stream of data and converting it into a single, arbitrarily precise number between $0.00$ and $1.00$. This relies on the fact that the probabilities of all possible outcomes of an event always sum to $100\%$.
For example, the probability of a coin flip resulting in tails is 50%, and the probability of a coin flip resulting in heads is 50%. The probability of a coin flip resulting in heads *or* tails is 100%.
If we wanted to keep track of the result of a series of coin flips, this could be done by subdividing a range. If the stored value is between $0$ and $0.5$, then we know that the first flip must have been tails.
If the stored value is between $0.5$ and $1$, then we know that the first flip must have been heads.
This subdivision process can be repeated indefinitely to store any number of coin flips by dividing each range again.
To store two coin flips, you might have the first subdivision represent the outcome of the first coin flip, and the second subdivision represent the outcome of the second coin flip:
| Range | Result |
| ------------- | ------------ |
| $0.00 - 0.25$ | Tails, Tails |
| $0.25 - 0.50$ | Tails, Heads |
| $0.50 - 0.75$ | Heads, Tails |
| $0.75 - 1.00$ | Heads, Heads |
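
The table above can be reproduced with a short Python sketch. It assumes a fair coin and the convention used in these notes (Tails takes the lower half of a range, Heads the upper half); the function name `flip_ranges` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def flip_ranges(n):
    """Map every sequence of n fair coin flips to its sub-range of [0, 1)."""
    width = Fraction(1, 2 ** n)
    table = {}
    # Tails is listed before Heads, so the sequences come out in the
    # same order as their ranges, from the bottom of [0, 1) to the top.
    for index, flips in enumerate(product(("Tails", "Heads"), repeat=n)):
        low = index * width
        table[flips] = (low, low + width)
    return table

for flips, (low, high) in flip_ranges(2).items():
    print(f"{float(low):.2f} - {float(high):.2f}: {', '.join(flips)}")
```

Exact fractions are used instead of floats so that repeated halving never picks up rounding error.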
Imagine a situation where we want to store the outcome of three consecutive coin flips, *Heads, Heads, Tails*, as a single decimal number.
Encoding this would happen as follows:
1. First we subdivide the range by the probability of each event happening. The probability of each is 50%, so that's simple. Referring above, we know that heads is represented by the top half of the range, and tails is represented by the bottom half of the range.
> Because the *first* coin flip resulted in *Heads*, the output value must be between $0.50$ and $1.00$.
2. Subdividing the range $0.50$ to $1.00$ again to store the result of the second flip, we end up with values between $0.50$ and $0.75$ representing the sequence *Heads, Tails*, and values between $0.75$ and $1.00$ representing the sequence *Heads, Heads*.
> Because the *second* coin flip resulted in *Heads*, we know that the output value must be between $0.75$ and $1.00$.
3. Subdividing the range $0.75$ to $1.00$ yet again, a value in the range $0.750$ - $0.875$ means the third coin flip resulted in *Tails*, and a value in the range $0.875$ - $1.000$ means the third coin flip resulted in *Heads*.
> Because the *third* coin flip resulted in *Tails*, any value between $0.750$ and $0.875$ encodes the fact that the first three coin flips went *Heads, Heads, Tails*.
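
The encoding steps can be sketched in Python as repeated halving of a range. This is a minimal illustration for a fair coin only, not a general arithmetic coder; the function name `encode` and the string labels are assumptions made for the example.

```python
from fractions import Fraction

def encode(flips):
    """Narrow [0, 1) once per flip and return the final (low, high) range."""
    low, high = Fraction(0), Fraction(1)
    for flip in flips:
        mid = (low + high) / 2
        if flip == "Heads":
            low = mid    # Heads keeps the upper half of the current range
        elif flip == "Tails":
            high = mid   # Tails keeps the lower half of the current range
        else:
            raise ValueError(f"unknown flip: {flip!r}")
    return low, high

low, high = encode(["Heads", "Heads", "Tails"])
print(float(low), float(high))  # 0.75 0.875
```

Any number inside the returned range identifies the sequence *Heads, Heads, Tails*.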
The decoding process performs the same series of steps, but asks a question at each step instead of outputting a value.
1. Is the value between $0.00$ and $0.50$? If so, the first coin flip resulted in *Tails*. Otherwise, if the value is between $0.50$ and $1.00$, the first coin flip resulted in *Heads*.
The same question can then be asked of the chosen half, just as in the encoding process, until we've determined the result of all three coin flips.
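
Decoding can be sketched the same way. The example below assumes a fair coin and that the decoder is told how many flips to recover, which the notes do not specify; `decode` and its parameters are hypothetical names.

```python
from fractions import Fraction

def decode(value, count):
    """Recover `count` fair coin flips from a value in [0, 1)."""
    low, high = Fraction(0), Fraction(1)
    flips = []
    for _ in range(count):
        mid = (low + high) / 2
        if value < mid:
            flips.append("Tails")  # value lies in the lower half
            high = mid
        else:
            flips.append("Heads")  # value lies in the upper half
            low = mid
    return flips

print(decode(Fraction(13, 16), 3))  # ['Heads', 'Heads', 'Tails']
```

Here $13/16 = 0.8125$ sits between $0.750$ and $0.875$, so it decodes to *Heads, Heads, Tails*.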
These subdivisions can be encoded using $0$ and $1$, where $0$ represents the bottom half of the range, and $1$ represents the top half of the range.
When the alphabet is too large to select a particular outcome using one bit, multiple bits can be used per symbol instead, with each bit moving up or down the range.
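
For the fair-coin case, the mapping between flips, bits, and the range can be sketched as follows. It assumes $1$ selects the top half (Heads) and $0$ the bottom half (Tails), as described above; the helper names are made up for illustration.

```python
from fractions import Fraction

def flips_to_bits(flips):
    """Write one bit per flip: 1 for Heads (top half), 0 for Tails (bottom half)."""
    return "".join("1" if flip == "Heads" else "0" for flip in flips)

def bits_to_low(bits):
    """Read the bits as the binary fraction 0.b1b2b3..., the low end of the range."""
    return sum(Fraction(int(bit), 2 ** (i + 1)) for i, bit in enumerate(bits))

bits = flips_to_bits(["Heads", "Heads", "Tails"])
print(bits, float(bits_to_low(bits)))  # 110 0.75
```

The bit string `110` read as a binary fraction is $0.75$, the bottom of the range $0.750$ - $0.875$ that represents *Heads, Heads, Tails*.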