Blog

Shade The Cells Puzzle

A friend described this puzzle to me and I enjoyed it so much that I just had to write a script so that I could play it more.

The rules of this puzzle are simple. Cells can be in one of three states:

An UNSHADED (white) cell means that you have not considered this cell yet.
A DARK GREY SHADED cell means that the sum of the dark grey shaded cells in that row and column must equal the number in that cell.
A LIGHT GREY SHADED cell means that the sum of the dark grey shaded cells in all the connected cells must equal the number in that cell.

I Hope you Enjoy

October 6, 2012

Learn Duality in Linear Programming

I have just completed a script to help learn about duality in linear programming.

Many real world problems can be formulated as Linear Programming problems. There are often many different ways to formulate a single problem. Some of these alternative formulations can easily be proven to be equal via simple algebra and arithmetic. For example, one person may see a problem as maximizing profit while another may see the same problem as minimizing losses. The relationship between the two alternative formulations can then be shown to be that one simply has the negative objective function value of the other.

Sometimes, though, this relationship between alternative formulations is not as easy to detect. Two alternative formulations that arise regularly in linear programming problems are a primal problem and a dual problem. The dual of a linear programming problem is the problem of finding the best bound on the objective function in terms of the constraints. This dual is formulated so that every feasible solution to the dual is a bound on the primal objective function. The Weak Duality Theorem for Linear Programming says that the optimal solution for a minimization problem is always an upper bound for its dual (which is a corresponding maximization problem). Correspondingly, the optimal solution for a maximization problem is always a lower bound for its dual. This leads to the Strong Duality Theorem for Linear Programming which says that if we can find feasible solutions to the primal and dual problems with matching objective function values, then both of these solutions must be optimal for their respective problems.

This shows an importance of duality. What my script provides is a means of formulating the dual of a given problem to help understand the concept.

The rules for constructing a dual linear program are as follows:

Primal:	Dual:
The i^th constraint is [<=]	Variable y_i is [>=] 0
The i^th constraint is [>=]	Variable y_i is is [<=] 0
The i^th constraint is =	Variable y_i is unbounded
Variable x_j is [>=] 0	The j^th constraint is [>=]
Variable x_j is [<=] 0	The j^th constraint is [<=]
Variable x_j is unbounded	The j^th constraint is =

Other Blogs that have covered this topic:
A Narrow Margin
Optimization and data mining

October 6, 2012

Learning the Apriori Algorithm

I have finished a script that runs the Apriori algorithm.

When we are given a large set of transactions, we are often interested in discovering patterns inside these transactions. The Apriori algorithm provides a means for formulating what are known as “association rules” for the set of transactions. An association rule is an observation from the database between different items inside a transaction. For example, the statement “If a customer buys chips they are 60% more likely to also purchase dip” could be an association rule based on data from supermarket purchases. The number 60% is called the confidence we have in the rule and we are generally interested in rules with higher confidence.

The Apriori algorithm takes as input a transaction database and a “threshold”. The initial pass through the database performs a count on each single item in the database and checks how many transactions contain each item. The algorithm proceeds by the Apriori property which states that “any subset of a frequent itemset must also be frequent”. What this means is that when checking the subsets of length 2 (and greater), we can ignore those subsets that contain an element that does not meet the minimum support, as it cannot be a part of a frequent itemset.

So lets look at an example. Suppose our list of transactions for the items Chips, Dip, Soda, Napkins and Paper Plates are as follows:

(Chips, Dip, Soda, Napkins, Paper Plates)
1 1 0 1 0
0 0 0 1 0
1 1 1 0 1
0 1 1 0 1
1 0 0 1 0
1 0 0 0 1
1 1 1 0 1
0 0 1 0 1
1 0 0 0 0
1 0 1 1 0
0 1 1 1 0
1 1 0 0 0
1 0 1 1 0
0 0 0 1 1
1 1 1 1 0
0 0 0 0 0
0 0 0 0 0
1 1 1 0 0
0 0 0 1 1
1 1 1 0 1

And lets suppose that we’re interested in finding the collections that occur more than a quarter (25%) of the time.

What we see from simply summing the columns and dividing by the number of columns is that 60% of the people purchased chips, 45% purchased dip, 50% purchased soda, 45% purchased napkins, and 40% purchased paper plates. Since these are all above our minimum threshold, they are all possible as elements of future collections.

When looking at larger collections, we see that:
– 35% of the people purchased both chips and dip
– 35% of the people purchased both chips and soda
– 25% of the people purchased both chips and napkins
– 35% of the people purchased both dip and soda
– 25% of the people purchased both soda and paper plates
– no other size two collections are above the minimum threshold.

You’ll note that since only 20% of the people purchased chips and paper plates, it is not above the minimum threshold and so we can ignore it in future collections. From the set of size 2 itemsets, we can derive our list of size three itemsets.

In particular, we see that:
– 25% of the people purchased all three of chips, dip, and soda.
– no other size three collections are above the minimum threshold.

The hierarchy of the itemsets is displayed in the image above.

To play around with this algorithm and understand more of its properties, visit Apriori algorithm.

Other Blogs that have covered this topic:
Analytics and Visualization of Big Data
Statistical Research

October 2, 2012

Visualizing Huffman Coding Trees

Here is a link to a script I finished to help visualize the Huffman Coding Algorithm.

What would you do if you wanted to transfer a message, say one written in English but you only had a limited set of characters. Suppose these characters are 0 and 1. The only way of doing this is by writing some type of procedure to transfer from our 26 letter alphabet to the 0-1 binary alphabet. There are several ways of developing these encoding functions, but we will focus on those that attempt to translate each individual character into a sequence of 0s and 1s. One of the more popular such codes today is the ASCII code, which maps each character to a binary string (of 0s and 1s) of length 8. For example, here is the ASCII code for the upper and lower case alphanumeric characters.

ASCII	English
01100001	a
01100010	b
01100011	c
01100100	d
01100101	e
01100110	f
01100111	g
01101000	h
01101001	i
01101010	j
01101011	k
01101100	l
01101101	m
01101110	n
01101111	o
01110000	p
01110001	q
01110010	r
01110011	s
01110100	t
01110101	u
01110110	v
01110111	w
01111000	x
01111001	y
01111010	z

What you notice from this is that each of these encodings beings with “011”, which amounts to a lot of wasted space. ASCII code doesn’t care about this because the fixed length of each binary string allows for easy lookup of particular characters (i,e, you can start almost anywhere in the string with your decomposition as long as you start at a multiple of 8).

But what if we were interested in minimizing the total bits used by the encoded string? This is where the Huffman coding algorithm gains its fame. Unlike the ASCII coding scheme, Huffman codes assign shorter codes to the more frequently occurring characters in your string. Huffman was able to prove this tactic would guarantee the shortest possible encoding.

The Huffman Coding procedure operates as follows:
1. Input string to be encoded -> Input
2. For each character in the input string, calculate the frequency of that character (i.e. the number of times it occurs in the input)
3. Sort the array of characters in the input by their decreasing frequencies
4. Place the array of characters into the queue with each one represented by a node.
5. While there are two or more nodes remaining in the queue.
6. Remove the nodes representing the two characters with the lowest frequency from the queue.
7. Create a node which points to the two nodes just removed from the queue (node -> left points to one node; node -> right points to the other).
8. Insert this new node into the queue, with the frequency equal to the sum of the frequencies of the nodes it points to.
9. If the length of the queue is greater than 1, then goto 5.

Other Blogs that have covered this topic:
Techno Nutty
billatnapier

August 28, 2012

Understanding Bayes’ Theorem

An Image of Bayes’ Theorem Script

I’ve finished a script that helps understand Bayes’ Theorem.

If we have a set of mutually exclusive (aka non-overlapping) sets B_i for i {0, 1, 2, …, n} for some integer n, then the union of these sets forms a sample space. Lets call the sample space S. Suppose that we also have some set (also known as an event) A which is also a subset of S. Bayes’ Theorem considers the probability that one of these mutually exclusive events (one of the B_i‘s) caused the observed event (A).

This probability can be calculated by the formula

Pr(B_j | A) = Pr(B_j) Pr(A | B_j)

Pr(B_i) Pr(A | B_i)

The theorem helps us determine the the probability of the event B_j given A, or in more plain English, the probability that the event B_j is the cause that gives rise to the observed event A. The numerator is given by the product of of the probability of the causal event (Pr(B_j) times the conditional probability of the observed event given the causal event (Pr(A | B_j)). This numerator could be replaced by its equivalent statement of the set A B_j. Likewise, the denominator the sum (over all the causal events) of the probaility of each causal event times the conditional probability of the observed event given that particular causal event. Each term in this denominator could be replaced b its equivalent staetment A B_i, which when summed give the total probability of A because each pair of the B_j‘s is mutually exclusive. So we are able to replace the probability of A with Pr(B_i) Pr(A | B_i) because of the fundamental law of probability.

An example that would use Bayes’ Theorem is analyzing the results of an election. The set of mutually exclusive events could be membership in a political party (Democrat, Republican, or Independent). The observed event could be the election of an individual. And the conditional distributions could be the percentage of each party that voted for this individual. If we want to calculate how significant each party was to the individual’s election, we’d use Bayes’ Theorem.

The script I’ve written to help understand Bayes’ Theorem works as follows:
– A set of mutually exclusive sets is randomly generated (the number of sets also varies). These sets are called B_i for i (0, …, n}.
– A set A is randomly generated from the union of the B_i‘s.
– A table is displayed showing:
Pr(B_i) for each i on line 1.
Pr(A | B_i) for each i on line 2.

– The user is given the option to select which of the mutually exclusive sets they would like to use to calculate the probability that this set caused the event A.
– Once a set is chosen, the user clicks the “Calculate Conditional” button and Bayes’ Theorem gives the result.
– If the “show work” checkbox was checked, then the steps used in this calculation are also shown.
– All work is done using fractions to give an idea of where the numbers come from.

Other Blogs that have covered this topic:
Better Explained
Bayes’ Theorem-qed

August 16, 2012
Learn Math Through Set Relations

I have just finished a script that helps users understand mathematics through set theory and relations.

Much of our world deals with relationships – both in the sense of romantic ones or ones that show some interesting property between two sets. When mathematicians think of set theory, a relation between the set A and the set B is a set of ordered pairs, where the first element of the ordered pair is from the set A and the second element of the ordered pair id from the set B. So if we say that R is a relation on the sets A and B, that would mean that R consists of elements that look like (a, b) where a is in A and b is in B. Another way of writing this is that R is a subset of A x B. For more on subsets and cross product, I refer you to my earlier script work on set operations.

Relations can provide a useful means of relating an abstract concept to a real world one. I think of things like the QB rating system in the NFL as an example. We have a set of all quarterbacks in the NFL (or really all people who have thrown a pass) and we would like some means of saying that one QB is performing better than another. The set of statistics kept on a QB is a large set, so attempting to show that one QB is better by showing that every year that they played one is better in every statistical category can be (a) exhaustive, and (b) will lead to very few interesting comparisons. Most of the really good QBs have some areas that they are really good and others that they are not. The QB rating system provides a relation between the set of all QBs in the NFL and the set of real numbers. Once this relation was defined, we can say that one QB is performing better than another if his QB rating is higher. Similarly we can compare a QB to his own statistics at different points in his career to see the changes and trends.

This is just one example, and there are countless others that I could have used instead.

Once we understand what a relation is, we have several properties that we are interested in. Below I list four, although there are many more.

Properties of Relations:
A relation R is symmetric if whenever an element (a, b) belongs to R, then so does (b, a).

A relation R is reflexive if for every element a in the universe of the relation, the element (a, a) belongs to R.

A relation R is transitive if for every pair of elements (a, b) and (c, d) and b = c, then the element (a, d) belongs to R.

A relation R is anti-symmetric if the elements (a, b) and (b, a) do not belong to the relation whenever a is not equal to b.

Once we understand what a relation is, there are a few common ones that we are interested in. Below I list four, but again, I want to stress that these are some of the more common ones, but there are several others.

Types of Relations:
A relation R is a function (on its set of defined elements) if there do not exist elements (a, b) and (a, c) which both belong to R.

A relation R is an equivalence relation if R is symmetric, reflexive and transitive.

A relation R is a partial order set if R is anti-symmetric, reflexive and transitive.

A relation R is a total order set if it is a partial order set and for every pair of elements a and b, either (a, b) is in R or (b, a) is in R.

A partial order is just an ordering, but not everything can be compared to everything else. Think about the Olympics, and a sport like gymnastics. Consider the floor and the balance beam. One person can win gold on the floor and another person wins gold on the balance beam. That puts each of them in the “top” of the order for their particular section, but there’s no way of comparing the person who won the floor exercise to the person who won the balance beam. So we say the set is “partially ordered”. More formally, lets say that two people (person X and person Y) relate if they competed in the same event and the the first person (in this case person X) received an equal or higher medal in that event than the second person (in this case person Y). Obviously any person receives the same medal as themselves, so this relation is reflexive. And if Jamie received an equal or higher medal than Bobby and Bobby received an equal or higher medal than Chris, then Jamie must have received an equal or higher medal than Chris so this relation is transitive. To test this relation for anti-symmetry, suppose that Chuck received an equal or higher medal than Charlie and Charlie received an equal or higher medal than Chuck. This means that they must have received the same medal, but since only one medal is awarded at each color for each event (meaning one gold, one silver and one bronze…if this is not true, assume it is), this must mean that Chuck and Charlie are the same person, and this relation is thus anti-symmetric.

If we have a partial ordering where we can compare everything, then we say that the set is “totally ordered”.

An equivalence relation tries to mimic equality on our relation. So, staying with that example of the Olympics, an example of an equivalence relation could be to say that two athletes relate to one another if they both received the same color medal in their event (for the sake of argument lets assume that no athlete competes in more than one event). Then obviously an athlete receives the same medal as themselves, so this relation is reflexive. If two people received the same medal, then it doesn’t matter if we say Chris and Charlie or Charlie and Chris, so the relation is symmetric. And Finally if Chris received the same medal as Charlie an if Charlie received the same medal as Jesse, then all three people received the same medals, so Chris and Jesse received the same medals and this relation is transitive. Because this relation has these three properties, it is called an equivalence relation.

August 8, 2012
How Could You Possibly Love/Hate Math?
Growing up, I never really liked math. I saw it as one of those necessary evils of school. People always told me that if I wanted to do well and get into college, I needed to do well in math. So I took the courses required of a high school student, but I remember feeling utter confusion from being in those classes. My key problem was my inquisitive nature. I really didn’t like being “told” that certain things were true in math (I felt this way in most classes). I hated just memorizing stuff, or memorizing it incorrectly, and getting poor grades because I couldn’t regurgitate information precise enough. If this stuff was in fact “true”, I wanted to understand why. It seemed like so much was told to us without any explanation, that its hard to expect anybody to just buy into it. But that’s what teachers expected. And I was sent to the principal’s office a number of times for what they called “disturbing class”, but I’d just call it asking questions.

At the same time, I was taking a debate class. This class was quite the opposite of my math classes, or really any other class I’d ever had. We were introduced to philosophers like Immanuel Kant, John Stuart Mill, Thomas Hobbes, John Rawls, etc. The list goes on and on. We discussed theories, and spoke of how these concepts could be used to support or reject various propositions. Although these philosophies were quite complex, what I loved was the inquiries we were allowed to make into understanding the various positions. Several classmates and I would sit and point out apparent paradoxes in the theories. We’d ask about them and sometimes find that others (more famous than us) had pointed out the same paradoxes and other things that seemed like paradoxes could be resolved with a deeper understanding of the philosophy.

Hate is a strong word, but I remember feeling that mathematicians were inferior to computer programmers because “all math could be programmed”. This was based on the number of formulas I had learned through high school and I remember having a similar feeling through my early years of college. But things changed when I took a course called Set Theory. Last year, I wrote a piece that somewhat describes this change:
```
They Do Exist!

Let me tell you a story about when I was a kid
See, I was confused and here's what I did.
I said "irrational number, what’s that supposed to mean?
Infinite decimal, no pattern? Nah, can't be what it seems."
So I dismissed them and called the teacher wrong.
Said they can't exist, so let’s move along.
The sad thing is that nobody seemed to mind.
Or maybe they thought showing me was a waste of time.

Then one teacher said "I can prove they exist to you.
Let me tell you about my friend, the square root of two."
I figured it'd be the same ol' same ol', so I said,
"Trying to show me infinity is like making gold from lead"
So he replies, "Suppose you're right, what would that imply?"
And immediately I thought of calling all my teachers lies.
"What if it can be written in lowest terms, say p over q.
Then if we square both sides we get a fraction for two."

He did a little math and showed that p must be even.
Then he asked, "if q is even, will you start believing?"
I stood, amazed by what he was about to do.
But I responded, "but we don't know anything about q"
He says, "but we do know that p squared is a factor of 4.
And that is equal to 2 q squared, like we said before."
Then he divided by two and suddenly we knew something about q.
He had just shown that q must be even too.

Knowing now that the fraction couldn't be in lowest terms
a rational expression for this number cannot be confirmed.
So I shook his hand and called him a good man.
Because for once I yould finally understand
a concept that I had denied all my life,
a concept that had caused me such strife.
And as I walked away from the teacher's midst,
Excited, I called him an alchemist and exhaled "THEY DO EXIST!"
```
Aside from its lack of poetic content, I think that many mathematicians can relate to this poem, particularly the ones who go into the field for its theoretic principles. For many of us, Set Theory is somewhat of a “back to the basics” course where we learn what math is really about. The focus is no longer on how well you can memorize a formula. Instead, its more of a philosophy course on mathematics – like an introduction to the theory of mathematics, hence the name Set Theory.

The poem above focuses on a particular frustration of mine, irrational numbers. Early on, we’re asked to believe that these numbers exist, but we’re not given any answers as to why they should exist. The same could be said for a number of similar concepts though – basically, whenever a new concept is introduced, there is a reasonable question of how do we know this is true. This is not just a matter of practicality, but a necessity of mathematics. I mean I could say “lets now consider the set of all numbers for which X + 1 = X + 2”, but if this is true for any X, then it means that 1 equals 2, which we know is not true. So the set I’d be referring to is the empty set. We can still talk about it, but that’s the set I’d be talking about.

So why is this concept of answering the why’s of mathematics ignored, sometimes until a student’s college years? This gives students a false impression of what math really is, which leads to people making statements like “I hate math”, not really knowing what math is about.
July 24, 2012
Learn About Binary Search Trees

An Image of My Binary Search Tree Script

I just finished a script that shows the properties of the binary search tree data structure.

These data structures are organized such that the data lies in “nodes” and each node connects directly to up to two new nodes. These new nodes are called the children of the node, and the original node is called the parent. Because there are up to two children, we designate one child as the “left” child, and the other as the “right” child with the properties that the value stored in the left child is less than the value in the parent, which in itself is less than the value of its right child. If a parent has less than two children, then one (or both) of its children are given the value of null.

The insert and delete procedures need to make sure that they keep the elements of a binary search tree in sorted order.
To insert into a BST, we must first find the correct location where the new element will be placed. This means comparing the value of the new element to the current head of the tree, resulting in three possible outcomes.
if the head is null, then insert the new node at the current position because there is no subtree to compare it to.
if the value of the new element is less than the value at the head node, run the insert procedure on the left child of head.
if the value of the new element is greater than the value at the head node, run the insert procedure on the right child of head.

Similarly, the remove procedure for a binary search tree must first find the element to be removed. Once that element is found, there are three cases depending on the type of node we are dealing with.
if the node has no children, then simply remove the node from the tree.
if the node has only one child (either a left child or a right child), then have the parent of the node point to the child of the node (thus bypassing the node itself).
if the node has two children, then we have two options, either replace the node with the minimum value of the right subtree or the maximum value of the left subtree. The nodes that have these minimum and maximum values will have at most one child because by definition a value less than the minimum value in a right subtree would be a left child and thus would be less than the minimum value, contradicting the meaning of a minimum value. Because these nodes have at most one child, we can now use the procedures above to remove these nodes from the tree.
Because a binary search tree is different than a standard array, there are different methods for viewing the its contents. Three common such methods are preorder, inorder, and postorder traversal.
Preorder traversal visits the nodes of a binary search tree in the order (node), (left child), (right child).
Inorder traversal visits the nodes of a binary search tree in the order (left child), (node), (right child).
Postorder traversal visits the nodes of a binary search tree in the order (left child), (right child), (node).

We are also interested in the depth of a tree, which amounts to the amount of layers or levels of the tree. This can be computed by counting the longest path from the root of the tree to a leaf node (a node with no children) in the tree.

Other Blogs that have covered this topic:
Stoimen’s Web Log

July 8, 2012
Learning Math through Set Theory

In grade school, we’re taught that math is about numbers. When we get to college (the ones of us who are still interested in math), we’re taught that mathematics is about sets, operations on sets and properties of those sets.

Understanding Set Theory is fundamental to understanding advanced mathematics. Iv wrote these scripts so that users could begin to play with the different set operations that are taught in a basic set theory course. Here, the sets are limited to positive integers and we’re only looking at a few operations, in particular the union, intersection, difference, symmetric difference, and cross product of two sets. I will explain what each of these is below.

The union of the sets S1 and S2 is the set S1 [union] S2, which contains the elements that are in S1 or S2 (or in both).
Note: S1 [union] S2 is the same as S2 [union] S1.

The intersection of the sets S1 and S2 is the set S1 [intersect] S2, which contains the elements that are in BOTH S1 and S2.
Note: S1 [intersect] S2 is the same as S2 [intersect] S1.

The difference between the sets S1 and S2 is the set S1 / S2, which contains the elements that are in S1 and not in S2.
. Note. S1 / S2 IS NOT the same as S2 / S1.
Note. S1 / S2 is the same as S1 [intersect] [not]S2.

The symmetric difference between the sets S1 and S2 is the set S1 [symm diff] S2, which contains the elements that are in S1 and not in S2, or the elements that are in S2 and not in S1.
Note. S1 [symm diff] S2 is the same as S2 [symm diff] S1.
Note. S1 [symm diff] S2 is the same as (S1 [intersect] [not] S2) [union] (S2 [intersect] [not] S1).

The cartesian product of the two sets S1 and S2 is the set of all ordered pairs (a, b), where a [in] S1 and b [in] S2.

June 25, 2012
Linear Search Algorithm

I have published code that shows examples of the Linear Search Algorithm.

The linear search algorithm iterates through each item in our data structure in search for a specific value. If the current item matches, we can return, else we must continue to the next item.

In the worse case, this requires that we search through all items because in a unsorted structure, we cannot say whether an untesetd value is the value we are searching for.

Other Blogs that have covered this topic:
Dream.In.Code
Coding Bot

June 25, 2012

Pr(B_j \| A) =	Pr(B_j) Pr(A \| B_j)
	Pr(B_i) Pr(A \| B_i)