Visualizing Huffman Coding Trees

Huffman Coding
Image of Huffman Tree

Here is a link to a script I finished to help visualize the Huffman Coding Algorithm.

What would you do if you wanted to transfer a message, say one written in English but you only had a limited set of characters. Suppose these characters are 0 and 1. The only way of doing this is by writing some type of procedure to transfer from our 26 letter alphabet to the 0-1 binary alphabet. There are several ways of developing these encoding functions, but we will focus on those that attempt to translate each individual character into a sequence of 0s and 1s. One of the more popular such codes today is the ASCII code, which maps each character to a binary string (of 0s and 1s) of length 8. For example, here is the ASCII code for the upper and lower case alphanumeric characters.

ASCIIEnglish
01100001a
01100010b
01100011c
01100100d
01100101e
01100110f
01100111g
01101000h
01101001i
01101010j
01101011k
01101100l
01101101m
01101110n
01101111o
01110000p
01110001q
01110010r
01110011s
01110100t
01110101u
01110110v
01110111w
01111000x
01111001y
01111010z

What you notice from this is that each of these encodings beings with “011”, which amounts to a lot of wasted space. ASCII code doesn’t care about this because the fixed length of each binary string allows for easy lookup of particular characters (i,e, you can start almost anywhere in the string with your decomposition as long as you start at a multiple of 8).

But what if we were interested in minimizing the total bits used by the encoded string? This is where the Huffman coding algorithm gains its fame. Unlike the ASCII coding scheme, Huffman codes assign shorter codes to the more frequently occurring characters in your string. Huffman was able to prove this tactic would guarantee the shortest possible encoding.

The Huffman Coding procedure operates as follows:
1. Input string to be encoded -> Input
2. For each character in the input string, calculate the frequency of that character (i.e. the number of times it occurs in the input)
3. Sort the array of characters in the input by their decreasing frequencies
4. Place the array of characters into the queue with each one represented by a node.
5. While there are two or more nodes remaining in the queue.
6. Remove the nodes representing the two characters with the lowest frequency from the queue.
7. Create a node which points to the two nodes just removed from the queue (node -> left points to one node; node -> right points to the other).
8. Insert this new node into the queue, with the frequency equal to the sum of the frequencies of the nodes it points to.
9. If the length of the queue is greater than 1, then goto 5.

Other Blogs that have covered this topic:
Techno Nutty
billatnapier

Understanding Bayes’ Theorem

An Image of Bayes' Theorem Script
An Image of Bayes’ Theorem Script

I’ve finished a script that helps understand Bayes’ Theorem.

If we have a set of mutually exclusive (aka non-overlapping) sets Bi for i {0, 1, 2, …, n} for some integer n, then the union of these sets forms a sample space. Lets call the sample space S. Suppose that we also have some set (also known as an event) A which is also a subset of S. Bayes’ Theorem considers the probability that one of these mutually exclusive events (one of the Bi‘s) caused the observed event (A).

This probability can be calculated by the formula

Pr(Bj | A) =
Pr(Bj) Pr(A | Bj)
Pr(Bi) Pr(A | Bi)

The theorem helps us determine the the probability of the event Bj given A, or in more plain English, the probability that the event Bj is the cause that gives rise to the observed event A. The numerator is given by the product of of the probability of the causal event (Pr(Bj) times the conditional probability of the observed event given the causal event (Pr(A | Bj)). This numerator could be replaced by its equivalent statement of the set A Bj. Likewise, the denominator the sum (over all the causal events) of the probaility of each causal event times the conditional probability of the observed event given that particular causal event. Each term in this denominator could be replaced b its equivalent staetment A Bi, which when summed give the total probability of A because each pair of the Bj‘s is mutually exclusive. So we are able to replace the probability of A with Pr(Bi) Pr(A | Bi) because of the fundamental law of probability.

An example that would use Bayes’ Theorem is analyzing the results of an election. The set of mutually exclusive events could be membership in a political party (Democrat, Republican, or Independent). The observed event could be the election of an individual. And the conditional distributions could be the percentage of each party that voted for this individual. If we want to calculate how significant each party was to the individual’s election, we’d use Bayes’ Theorem.

The script I’ve written to help understand Bayes’ Theorem works as follows:
– A set of mutually exclusive sets is randomly generated (the number of sets also varies). These sets are called Bi for i (0, …, n}.
– A set A is randomly generated from the union of the Bi‘s.
– A table is displayed showing:
Pr(Bi) for each i on line 1.
Pr(A | Bi) for each i on line 2.

– The user is given the option to select which of the mutually exclusive sets they would like to use to calculate the probability that this set caused the event A.
– Once a set is chosen, the user clicks the “Calculate Conditional” button and Bayes’ Theorem gives the result.
– If the “show work” checkbox was checked, then the steps used in this calculation are also shown.
– All work is done using fractions to give an idea of where the numbers come from.

Other Blogs that have covered this topic:
Better Explained
Bayes’ Theorem-qed

Learn Math Through Set Relations

This is an image of a script I wrote to help users understand mathematics through set theory and relations.

I have just finished a script that helps users understand mathematics through set theory and relations.

Much of our world deals with relationships – both in the sense of romantic ones or ones that show some interesting property between two sets. When mathematicians think of set theory, a relation between the set A and the set B is a set of ordered pairs, where the first element of the ordered pair is from the set A and the second element of the ordered pair id from the set B. So if we say that R is a relation on the sets A and B, that would mean that R consists of elements that look like (a, b) where a is in A and b is in B. Another way of writing this is that R is a subset of A x B. For more on subsets and cross product, I refer you to my earlier script work on set operations.

Relations can provide a useful means of relating an abstract concept to a real world one. I think of things like the QB rating system in the NFL as an example. We have a set of all quarterbacks in the NFL (or really all people who have thrown a pass) and we would like some means of saying that one QB is performing better than another. The set of statistics kept on a QB is a large set, so attempting to show that one QB is better by showing that every year that they played one is better in every statistical category can be (a) exhaustive, and (b) will lead to very few interesting comparisons. Most of the really good QBs have some areas that they are really good and others that they are not. The QB rating system provides a relation between the set of all QBs in the NFL and the set of real numbers. Once this relation was defined, we can say that one QB is performing better than another if his QB rating is higher. Similarly we can compare a QB to his own statistics at different points in his career to see the changes and trends.

This is just one example, and there are countless others that I could have used instead.

Once we understand what a relation is, we have several properties that we are interested in. Below I list four, although there are many more.

Properties of Relations:
A relation R is symmetric if whenever an element (a, b) belongs to R, then so does (b, a).

A relation R is reflexive if for every element a in the universe of the relation, the element (a, a) belongs to R.

A relation R is transitive if for every pair of elements (a, b) and (c, d) and b = c, then the element (a, d) belongs to R.

A relation R is anti-symmetric if the elements (a, b) and (b, a) do not belong to the relation whenever a is not equal to b.

Once we understand what a relation is, there are a few common ones that we are interested in. Below I list four, but again, I want to stress that these are some of the more common ones, but there are several others.

Types of Relations:
A relation R is a function (on its set of defined elements) if there do not exist elements (a, b) and (a, c) which both belong to R.

A relation R is an equivalence relation if R is symmetric, reflexive and transitive.

A relation R is a partial order set if R is anti-symmetric, reflexive and transitive.

A relation R is a total order set if it is a partial order set and for every pair of elements a and b, either (a, b) is in R or (b, a) is in R.

A partial order is just an ordering, but not everything can be compared to everything else. Think about the Olympics, and a sport like gymnastics. Consider the floor and the balance beam. One person can win gold on the floor and another person wins gold on the balance beam. That puts each of them in the “top” of the order for their particular section, but there’s no way of comparing the person who won the floor exercise to the person who won the balance beam. So we say the set is “partially ordered”. More formally, lets say that two people (person X and person Y) relate if they competed in the same event and the the first person (in this case person X) received an equal or higher medal in that event than the second person (in this case person Y). Obviously any person receives the same medal as themselves, so this relation is reflexive. And if Jamie received an equal or higher medal than Bobby and Bobby received an equal or higher medal than Chris, then Jamie must have received an equal or higher medal than Chris so this relation is transitive. To test this relation for anti-symmetry, suppose that Chuck received an equal or higher medal than Charlie and Charlie received an equal or higher medal than Chuck. This means that they must have received the same medal, but since only one medal is awarded at each color for each event (meaning one gold, one silver and one bronze…if this is not true, assume it is), this must mean that Chuck and Charlie are the same person, and this relation is thus anti-symmetric.

If we have a partial ordering where we can compare everything, then we say that the set is “totally ordered”.

An equivalence relation tries to mimic equality on our relation. So, staying with that example of the Olympics, an example of an equivalence relation could be to say that two athletes relate to one another if they both received the same color medal in their event (for the sake of argument lets assume that no athlete competes in more than one event). Then obviously an athlete receives the same medal as themselves, so this relation is reflexive. If two people received the same medal, then it doesn’t matter if we say Chris and Charlie or Charlie and Chris, so the relation is symmetric. And Finally if Chris received the same medal as Charlie an if Charlie received the same medal as Jesse, then all three people received the same medals, so Chris and Jesse received the same medals and this relation is transitive. Because this relation has these three properties, it is called an equivalence relation.