Category Archives: Blog

These are my thoughts on various topics.

Binary Puzzles

As you can probably tell, I’m a big fan of puzzles. On one hand, you could say that a good puzzle is nothing but a particular instance of a complex problem that we’re being asked to solve. What exactly makes a problem complex, though?

To a large extent that depends on the person playing the puzzles. Different puzzles are built on different concepts and are meant to highlight them. Some puzzles really focus on dynamic programming, like the Triangle Sum Puzzles or the Unidirectional TSP Puzzles.

Other puzzles are based on more complicated problems, in many cases instances of NP-complete problems. Unlike the puzzles mentioned above, there is generally no known optimal strategy for solving these puzzles quickly. Some basic examples of these are ones like Independent Set Puzzles, which just give a random (small) instance of the problem and ask users to solve it. Most approaches involve simply using logical deduction to reduce the number of possible choices until a “guess” must be made and then implementing some form of backtracking solution (which is not guessing since you can form a logical conclusion that if the guess you made were true, you reach either (a) a violation of the rules or (b) a completed puzzle).

One day a few months back I was driving home from work and traffic was so bad that I decided to stop at the store. While browsing the books, I noticed a puzzle collection. Among the puzzles I found in that book were the Range Puzzles I posted about earlier. However, I also found binary puzzles.

Filled binary puzzles are based on three simple rules:
1. No three adjacent cells in any row or column can contain the same value (so no 000 or 111 in any row or column).
2. Every row and column must have the same number of zeros and ones.
3. Each row must be unique, and each column must be unique.
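The three rules translate almost directly into a checker. The sketch below is only an illustration: the function name `is_valid_filled` and the list-of-lists grid are my assumptions here, not the code behind the generator. It tests a completely filled grid against each rule in turn, applying the balance rule to columns as well as rows.

```python
def is_valid_filled(grid):
    """Check a completely filled square grid of 0s and 1s against the three rules."""
    n = len(grid)
    rows = [tuple(r) for r in grid]
    cols = [tuple(grid[r][c] for r in range(n)) for c in range(n)]
    for line in rows + cols:
        # Rule 1: no three adjacent cells with the same value.
        for i in range(n - 2):
            if line[i] == line[i + 1] == line[i + 2]:
                return False
        # Rule 2: equal numbers of zeros and ones.
        if line.count(0) != line.count(1):
            return False
    # Rule 3: all rows are distinct and all columns are distinct.
    return len(set(rows)) == n and len(set(cols)) == n


good = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
repeated_rows = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
print(is_valid_filled(good), is_valid_filled(repeated_rows))
```

A checker like this is also the "violation of the rules" test that a backtracking solver needs after each guess.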

There is a paper from 2013 stating that Binary Puzzles are NP-complete. There is another paper that discusses strategies involved in Solving a Binary Puzzle.

Once I finished the puzzles in that book the question quickly became (as it always does): where can I get more? I began writing a generator for these puzzles and finished it earlier this year. Now I want to share it with you. You can visit the examples section to play those games at Binary Puzzles.

Below I will go over a sample puzzle and how I go about solving it. First let’s look at a 6 by 6 puzzle with some hints given:

 0 1 0 1 0 1 1 0 1 0 0 1 1 0

We look at this table and can first look for locations where we have a “forced move”. An obvious choice for these moves would be three adjacent cells in the same row or column where two have the same value. A second choice is that when we see that a row or column has the correct number of zeros or ones, the remaining cells in that row or column must have the opposite value.

So in the above puzzle, we can see that the values in cells (2, 2) and (2, 5) must both be 0 because cells (2, 3) and (2, 4) are both 1. Now we see that column 2 has 5 of its 6 necessary values, including three 0’s. So the last value in this column, cell (2, 6), must be a 1 in order for there to be an equal number of 0s and 1s.

For some easier puzzles these first two move types will get you far enough to completely fill in all the cells. For more advanced puzzles though, this may require a little more thorough analysis.
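The two forced-move types can be sketched for a single row or column as follows. This is a minimal illustration, assuming a line is stored as a Python list with `None` for an empty cell; the name `fill_forced` is hypothetical and this is not the solver used on the site.

```python
def fill_forced(line):
    """Apply the two forced-move rules to one row or column until nothing changes.

    `line` is a list holding 0, 1, or None (an empty cell) and has even length.
    """
    line = list(line)
    half = len(line) // 2
    changed = True
    while changed:
        changed = False
        # Move type 1: two equal values among three adjacent cells force the third.
        for i in range(len(line) - 2):
            window = line[i:i + 3]
            if window.count(None) == 1:
                a, b = [v for v in window if v is not None]
                if a == b:
                    line[i + window.index(None)] = 1 - a
                    changed = True
        # Move type 2: once a value fills half the line, the rest take the other value.
        for value in (0, 1):
            if line.count(value) == half and None in line:
                line = [1 - value if v is None else v for v in line]
                changed = True
    return line


print(fill_forced([None, 1, 1, None, None, None]))
print(fill_forced([0, 1, 0, None, 0, None]))
```

Running the same function over every row and every column, repeatedly, is one way to get as far as these two move types allow before any guessing is needed.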

As always, check it out and let me know what you think.

Range Puzzles


I have always liked puzzles. I really enjoy discovering a new puzzle. Sometimes when I have discovered a new puzzle, I enjoy it so much that I can’t seem to do enough of them. In such cases, I quickly run out of these puzzles to do and find myself looking for either a new puzzle or a way to generate them on my own.

Such was the case when I discovered the “Range Puzzles”. The rules are simple:

Every cell is marked either blue or gray
No two gray cells can be next to one another
The grid must be connected (i.e. there is always a path from every cell to every other cell using horizontal and vertical connections).
Some cells have a number inside. This indicates the number of cells that can be viewed (horizontally and vertically in both directions) by this cell, including the cell itself.
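The last rule is the one that takes the most care to check by hand, so here is a small sketch of it. It assumes that gray cells block the line of sight, which is my reading of the rule rather than something stated above, and it uses a hypothetical grid of booleans where `True` marks a gray cell.

```python
def visible_count(gray, row, col):
    """Count the cells that the cell at (row, col) can see, including itself.

    gray[r][c] is True for a gray cell; gray cells are assumed to block the view.
    """
    rows, cols = len(gray), len(gray[0])
    count = 1  # the cell itself
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        # Walk in one direction until the edge of the grid or a gray cell.
        while 0 <= r < rows and 0 <= c < cols and not gray[r][c]:
            count += 1
            r, c = r + dr, c + dc
    return count


gray = [
    [False, True, False],
    [False, False, False],
    [False, False, False],
]
print(visible_count(gray, 1, 1), visible_count(gray, 0, 0))
```

A numbered cell is satisfied when this count equals the number printed in it.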

To try out these puzzles, visit Range Puzzles at LEARNINGlover.com

How To Take Notes in Math Class

I was recently asked by someone how they should be taking notes in math class. I could immediately relate because I once asked this very same question. In both undergrad and grad, I had to ask myself how to take notes because I often would leave class with a bunch of sentence fragments based on what the professor said, but without anything I could use as a study guide. At best, it would be a garbled thing that I could combine with somebody else’s notes and try to make something legible out of it. Generally though, I would just ignore my notes and go to the textbook if it was well written, or the library if the textbook was not well written.

So how did I get past this? Well, after taking the Set Theory course, I started seeing mathematics as more of a construction job, like building a house. Mathematics is based on proofs, which are nothing more than logical reasoning, and logical reasoning is just a series of statements that are either assumptions, definitions, or conclusions drawn from those assumptions based on known facts. The main purpose of classes is to present these “known facts” to the students and help them become more informed mathematicians. So what is a mathematical fact?

There are two important types of facts we generally learn in mathematics. The first gives us a language we can work from – these are our definitions. These are the most important things in a mathematics class because if it is a class on “group theory”, then one of the first definitions will be of a “group,” and not knowing this definition will cause problems when you are trying to prove that something is or is not a group. Similarly, if your class is on solving quadratic equations, then it is very important to be able to define what a “quadratic equation” is. Definitions in mathematics are important because, unlike in an English class, you cannot always derive the meaning of a mathematical term from its usage. Mathematical definitions are very precise and your notes should include this precision.

As a side note, I will state (if not stress) that examples are not definitions. Examples are meant to bring the definition to life, and to help connect the definition to the student, but if you are asked what a group is and respond that “the set of integers mod 7 under the operation of addition is a group”, you’d be incorrect because you gave an example of a group without telling why this is a group. Similarly, if I were to ask you to define a quadratic equation and you said “x^2 + 2x + 1 = 0 is a quadratic equation”, you’d be giving me an example but not a definition. It is important to understand the distinction between examples and definitions because when we understand the definition we can clearly explain why (that is, prove that) the given instance is in fact an example.

Some professors will briefly mention a definition and then focus most of their time on examples in an attempt to make the concepts easier to understand. As a student though, it is important to ask the question “what is the definition of _____” because the exams, or the research that follows, or the applications of this concept in real life are not likely to be that same example. If the professor does not write out the definition of whatever concept you are studying, or you are unclear about this definition, you should ask questions or consult an online resource or a textbook for supplementary reference.

The second type of fact presented in mathematical classes is the theorem (aka lemma or corollary). A theorem is a statement that is provable by mathematical reasoning. In a classroom setting, theorems will generally be presented in an orderly fashion such that if Theorem 1 is presented before Theorem 2, then Theorem 1 does not require Theorem 2 in order to be proven.

Because theorems are provable statements, it may be tempting to jump directly into the proof, and particularly to only take notes on the proof. I cannot caution against this enough. Before writing the proof, you should make sure to begin with a clear statement of the theorem. This should be a declarative statement that is thus provable. Do not confuse the questions a professor may ask before proving a theorem with the theorem itself. An example of a theorem from group theory is “Any group that has a prime number of elements is cyclic (that is, it can be generated by a single element).” Also, do not confuse the nicknames of theorems with the theorem itself. For instance Lagrange’s Theorem says that “the number of elements of any subgroup of a finite group divides the number of elements in the original group.” Remembering this as Lagrange’s Theorem is fine, but it is much more important to remember the declarative statement proved.

I’m sure there are other things people use to take notes in math classes. Feel free to leave a comment sharing some of this advice.

I had been meaning to write a script and blog post on descriptive statistics for some time now, but between work, winter weather, and the extra work that winter weather brings, and now that the winter weather is over, trying to get back into an exercise routine (running up a hill is such a challenging experience, but when I get to the top of that hill I feel like Rocky Balboa on the steps at the entrance of the Philadelphia Museum of Art), I haven’t had the time to devote to this site that I would have liked. Well, that’s not entirely true. I have still been programming in my spare time. I just haven’t been able to share it here. I went to a conference in February and in my down time, I was able to write a script on descriptive statistics that I think gives a nice introduction to the area.

Before I go into descriptive statistics though, let’s talk about statistics, which is concerned with the collection, analysis, interpretation and presentation of data. Statistics can generally be broken down into two categories, descriptive statistics and inferential statistics, depending on what we would like to do with that data. When we are concerned with visualizing and summarizing the given data, descriptive statistics gives methods to operate on this data set. On the other hand, if we wish to draw conclusions about a larger population from our sample, then we would use methods from inferential statistics.

In the script on descriptive statistics I’ve written, I consider four different types of summaries for descriptive statistics:

Measures of Central Tendency
Mean – the arithmetic average of a set of values
Median – the middle number in a sorted set of values
Mode – the most frequently occurring number in a set of values

Dispersion
Maximum – the largest value in the data set
Minimum – the smallest value in the data set
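Standard Deviation – the square root of the variance, which measures the variation in a set of data values in the same units as the data
Variance – the average squared distance of the values from the mean, which measures how far a set of numbers is spread out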
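The measures listed so far can all be computed with Python’s standard `statistics` module. The sketch below is an illustration rather than the script itself; the `describe` function and the sample data are assumptions, and it uses the population versions of variance and standard deviation (dividing by n), where a script could equally use the sample versions.

```python
import statistics


def describe(data):
    """Central tendency and dispersion summaries for a list of numbers."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),  # every value tied for most frequent
        "maximum": max(data),
        "minimum": min(data),
        "variance": statistics.pvariance(data),  # population variance
        "standard deviation": statistics.pstdev(data),  # square root of the variance
    }


summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(name, value)
```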

Shape
Kurtosis – how peaked or flat a data set is
Skewness – how symmetric a data set is
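Both shape measures can be written in terms of the central moments of the data. The sketch below uses one common convention (population moments, with kurtosis reported as "excess" kurtosis so that a normal distribution scores 0); other conventions exist, and I am not claiming this is the one the script uses. The function name `shape` is hypothetical.

```python
def shape(data):
    """Return (skewness, excess kurtosis) using population central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n  # variance
    m3 = sum((v - mean) ** 3 for v in data) / n
    m4 = sum((v - mean) ** 4 for v in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


print(shape([1, 2, 3, 4, 5]))   # symmetric data
print(shape([1, 1, 1, 10]))     # one large value pulls the tail to the right
```

A skewness of zero indicates a symmetric data set, a positive value a longer right tail, and a negative value a longer left tail.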

Plots
Histogram Plots – a bar diagram where the horizontal axis shows different categories of values, and the height of each bar is related to the number of observations in the corresponding category.
Box and Whisker Plots – A box-and-whisker plot for a list of numbers consists of a rectangle whose left edge is at the first quartile of the data and whose right edge is at the third quartile, with a left whisker sticking out to the smallest value, and a right whisker sticking out to the largest value.
Stem and Leaf Plots – A stem and leaf plot illustrates the distribution of a group of numbers by arranging the numbers in categories based on the first digit.
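Of the three plots, the stem and leaf plot is the easiest to produce as plain text. The sketch below assumes non-empty, non-negative whole numbers, taking the last digit as the leaf and everything before it as the stem (for two-digit numbers the stem is the first digit). The function names are hypothetical.

```python
def stem_and_leaf(values):
    """Map each stem to its sorted list of leaves, including empty stems in between."""
    leaves = {}
    for v in sorted(values):
        leaves.setdefault(v // 10, []).append(v % 10)
    stems = range(min(leaves), max(leaves) + 1)
    return {stem: leaves.get(stem, []) for stem in stems}


def format_plot(plot):
    """Render the plot with one line of text per stem."""
    return "\n".join(
        f"{stem} | {' '.join(str(leaf) for leaf in leaf_list)}"
        for stem, leaf_list in plot.items()
    )


print(format_plot(stem_and_leaf([12, 15, 21, 27, 27, 34])))
```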

Slope Formula

I was watching a football game a few days ago, and to prepare for it, we decided to pick up some snacks. As the game progressed, I found myself eating a lot of chips, but at halftime, there was one snack that remained unopened, a snack I had been thinking about all night, a snack that I couldn’t seem to locate until that moment – Reese’s Pieces. We purchased a pretty large bag that I thought would last a while. So at the end of halftime, when the bag was almost empty, someone made sure to warn me that at the rate I was eating these things I’d be sure to be sick the next day.

Just to give you some insight into how my mind works, the fact that she used the term “rate” took me back to classes of Algebra 1 where we were first learning about linear equations, slopes, y-intercepts, point slope form, slope intercept form, and so on. The more I thought about it, the more I thought this would be a good script to add to this site as it would probably provide help to many students who are currently enrolled in those classes as well as a refresher for adults who took those classes years ago and wish to recall the concepts.

So I have added a script which helps go over the slope formula. It randomly generates two points and asks users to select which of a set of choices is the slope of the line through those two points. In case you forget, there is a button where you can be given the slope as well.
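As a reminder, the slope of the line through (x1, y1) and (x2, y2) is the change in y divided by the change in x, (y2 - y1) / (x2 - x1). A minimal sketch follows; the function name is hypothetical, and exact fractions are used so that a slope such as 1/2 is not shown as a rounded decimal.

```python
from fractions import Fraction


def slope(p1, p2):
    """Slope of the line through two points with integer coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        return None  # vertical line: the slope is undefined
    return Fraction(y2 - y1, x2 - x1)


print(slope((1, 2), (3, 8)))
print(slope((0, 0), (4, 2)))
print(slope((2, 1), (2, 5)))
```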

Interactive Midpoint Formula

I hope everyone had a great time over the holidays reconnecting with their family and friends. I definitely enjoyed hearing about the changes in their lives since we last spoke, and meeting new additions to the families. One of the best times I had was with a two year old who I had met earlier in the year. She was much more responsive this time, and as we sat and talked, she was eager to tell me the things she knew, from the alphabet to her numbers, to her shapes. So today’s blog-script is inspired by this conversation, which became a game of point and describe. I would point to an object and she would say “That’s a circle”, or “That’s a red triangle”. I was amazed at both how simple it was and how much I enjoyed this activity.

I enjoyed it so much that I decided that I’d like to have some programs on my site that were more of this genre. The first that I decided to write is a lesson on the midpoint formula. But instead of simply giving the formula and writing a script to walk users through the steps in calculating the midpoint, I thought I’d write a point-and-click approach to it.

The script will randomly generate two points in the XY plane and ask users to calculate the midpoint between these points. Five options are then given and the user is asked to select the radio button next to the correct choice. Once the submit button is pressed, the program will let users know if their choice is correct. Users can have the program generate new points at any time. There is also an option for users to have the midpoint formula displayed.
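For reference, the midpoint of (x1, y1) and (x2, y2) is the average of the coordinates, ((x1 + x2) / 2, (y1 + y2) / 2). The sketch below shows the formula together with one way of drawing two random points; the function names and the coordinate range of -10 to 10 are assumptions for illustration only.

```python
import random


def midpoint(p1, p2):
    """Midpoint of the segment joining two points in the XY plane."""
    (x1, y1), (x2, y2) = p1, p2
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def random_point(low=-10, high=10):
    """A point with integer coordinates, each between low and high inclusive."""
    return (random.randint(low, high), random.randint(low, high))


a, b = random_point(), random_point()
print(a, b, midpoint(a, b))
```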

Because this is my first program of this sort, I am curious to know what users think. When generating choices that are supposed to be incorrect, what’s a good method for doing so? I decided not to keep a timer or a running score for the user, but I did think about it. Ultimately I wanted this to feel less like a “test” and more like a “game” so I decided against this option. But I would like to know what you think – either through your comments here or on twitter @MindAfterMath.

The Bridge Crossing Problem

Most puzzles are fun in their own right. Some puzzles are so fun that they have the added benefit that they are likely to come up in unexpected places, like maybe in a job interview. I was recently reading a paper by Günter Rote entitled “Crossing the Bridge at Night” where Rote analyzes such a puzzle. Upon finishing the paper, I decided to write a script so that users could see the general form of this puzzle.

The problem can be stated as follows: There is a set of people, let’s make the set finite by saying that there are exactly n people, who wish to cross a bridge at night. There are a few restrictions that make crossing this bridge somewhat complicated.

• Each person has a travel time across the bridge.
• No more than two people can cross the bridge at one time.
• If two people are on the bridge together, they must travel at the pace of the slower person.
• There is only one flashlight and no party (of one or two people) can travel across the bridge without the flashlight.
• The flashlight cannot be thrown across the bridge, and nobody can go to the store to purchase another flashlight.

The image above shows the optimal solution when the 4 people have travel times of 1, 2, 5, and 10. The script I have written allows users to work with different numbers of people with random travel times. Give it a try and see if you can spot the patterns in the solution.
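If you want to check your answer, the minimum total time can be computed with the well-known approach for this puzzle: repeatedly move the two slowest people across, each time choosing the cheaper of two ways of doing so. The sketch below is my own illustration of that approach, not necessarily how the script works, and the function name is hypothetical.

```python
def min_crossing_time(times):
    """Minimum total time for everyone to cross, given each person's travel time."""
    t = sorted(times)
    total = 0
    while len(t) > 3:
        fastest, second = t[0], t[1]
        slow, slowest = t[-2], t[-1]
        # Option 1: the fastest person escorts each of the two slowest in turn.
        escort = 2 * fastest + slow + slowest
        # Option 2: the two fastest cross, one returns, the two slowest cross
        # together, and the other fast person brings the flashlight back.
        pair = fastest + 2 * second + slowest
        total += min(escort, pair)
        t = t[:-2]  # the two slowest are now across
    if len(t) == 3:
        total += t[0] + t[1] + t[2]
    elif len(t) == 2:
        total += t[1]
    elif len(t) == 1:
        total += t[0]
    return total


print(min_crossing_time([1, 2, 5, 10]))
```

For travel times of 1, 2, 5, and 10 the second option wins in the first round, and the total comes out to 17.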

Hierarchical Clustering

Hierarchical Clustering algorithms give a nice introduction for computer science students to unsupervised machine learning. I say this because the bottom-up approach to Hierarchical clustering (which I have implemented here) is very similar to Kruskal’s algorithm for finding the minimum spanning tree of a graph.

In Kruskal’s algorithm, we begin by creating a forest, or a set of trees where each node is its own tree. The algorithm then selects the two trees that are closest together (closest being defined as the minimum cost edge between two distinct trees) and merges those trees together. This process of merging the closest two trees is then repeated until there is only one tree remaining, which is a minimum spanning tree of the graph.

Similarly, bottom-up hierarchical clustering of a group of points begins by saying that each point is its own cluster. Then the clusters are compared to one another to check whether two clusters should be merged into one. Generally, there will be some stopping criterion, a tolerance, saying that we do not want to merge two clusters together if their distance is greater than that tolerance. So if the minimum distance between two clusters is less than the tolerance, we will proceed as in Kruskal’s algorithm by merging these two clusters together into one cluster. We repeat this process of merging the closest two clusters together until we find that the minimum distance between two clusters is greater than or equal to the tolerance, in which case we can stop and the result is a partition of our data set into distinct clusters.
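The merging loop can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation on the site: the distance between two clusters is taken to be the distance between their closest pair of points (single linkage, which matches the Kruskal analogy), points are tuples of coordinates, and the names `cluster` and `tolerance` are hypothetical.

```python
import math


def cluster(points, tolerance):
    """Merge the two closest clusters until the closest pair is at least `tolerance` apart."""
    clusters = [[p] for p in points]  # every point starts as its own cluster

    def distance(a, b):
        # Single linkage: the closest pair of points, one from each cluster.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > 1:
        d, i, j = min(
            (distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d >= tolerance:
            break  # stopping criterion reached
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
print(cluster(points, tolerance=2))
```

This version recomputes every pairwise distance on each pass, which is fine for a handful of points but slow for large data sets.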

Hierarchical clustering is comparable to K-Means Clustering. Here are some differences between the two approaches:

1. K-Means Clustering requires an initial number of desired clusters, while Hierarchical clustering does not.
2. A run of K-Means Clustering will always give K clusters, whereas Hierarchical Clustering can give more or fewer, depending on our tolerance.
3. K-Means can undo previous mistakes (assignments of an element to the wrong cluster), while Hierarchical Clustering cannot.

So, here is a link to my page on Hierarchical Clustering. Hope you enjoy.

Introduction to JavaScript Programming

I received a lot of attention from friends interested in programming after my recent blog post entitled “Introduction to Python Programming”. While many found it interesting, the fact that Python is more useful to mathematicians hindered some of my friends’ desire to learn it as their first language.

In our conversations, my recommendation for a first language was JavaScript. This is a powerful language in the sense that just about anybody who is involved with the internet knows it, and it’s likely to boost a person’s resume. It also has many similarities to more powerful languages like C++ and Java, so while not trivial, it could be a good launch pad into more advanced languages. But my favorite reason is that unlike many other programming languages that rely on an MS-DOS-like command-line approach for run time interaction, JavaScript’s basic interaction is with the standard internet browsers we use every day. There isn’t even anything you need to download or install. Just create a basic HTML file in a text editor (like Notepad, WordPad, or Notepad++). This makes it easier to show off your creations, which makes learning more fun.

The script I’ve finished provides examples on writing output, declaring variables, data types, conditionals, loops, and functions. Although I do not go into detail about all the events and objects on an HTML page, I do finish with three examples of more advanced JavaScript programs. Once you’ve selected a program, the code will be revealed in the text area. There is also a button that, when clicked, will execute that script in a new browser tab.

I hope you enjoy, and let me know if you have any suggestions or comments.

With that being said, here is a link to my sample JavaScript code.

Simple Linear Regression

We live in a world that is filled with patterns – patterns all around us just waiting to be discovered. Some of these patterns are not as easily discovered because of the existence of outside noise.

Consider for example an experiment where a set of people were each given the task of drinking a number of beers and having their blood alcohol level taken afterwards. Some noise factors in this could include the height and weight of the individual, the types of drinks, the amount of food eaten, and the time between drinks. Even with this noise, though, we can still see a correlation between the number of drinks and their blood alcohol level. Consider the following graph showing people’s blood-alcohol level after a given number of drinks. The x-axis represents the number of drinks and the y-axis is the corresponding blood alcohol level.

 x    y
 5    0.1
 2    0.03
 9    0.19
 8    0.12
 3    0.04
 7    0.095
 3    0.07
 5    0.06
 3    0.02
 5    0.05
 4    0.07
 6    0.1
 5    0.085
 7    0.09
 1    0.01
 4    0.05

We can definitely see a correlation, although the data doesn’t quite fit on a straight line. This leads us to ask further questions: can we use this data to build a model that estimates a person’s blood-alcohol level, and how strong is this model?

One of the tools we can use to model this problem is linear regression. A linear regression takes a two-dimensional data set, with the assumption that one column (generally represented by the x variable) is independent and the second column (generally represented by the y variable) is dependent on the first column. The assumption is that the relationship between the two columns is linear and can be represented by the linear equation

$$y = \beta_0 + \beta_1 x + \epsilon.$$

The right hand side of the above equation has three terms. The first two contain the parameters of the linear equation, $\beta_0$ and $\beta_1$ (the y-intercept and slope respectively), while the third term, $\epsilon$, represents the error term. The error term represents the difference between this linear equation and the y values in the data provided. We are seeking a line that minimizes the error term. That is, we are seeking to minimize

$$D = \sum_{i=1}^{n} \left[ y_i - (\beta_0 + \beta_1 x_i) \right]^2$$

There are several ways one could approach this problem. In fact, there are several lines that one could use to build a linear model. The first line that one may use to model these points is the one generated by only the mean of the y values of the points, called the horizontal line regression.

For the data set above, the mean of the y values can be calculated as $\bar{y} = 0.0738$, so we could build a linear model based on this mean that would be y = 0.0738. This horizontal line regression model is a horizontal line that predicts the same value (the mean), regardless of the x value. This lack of adjustment means it is generally a poor fit for most data sets. But as we will see later, this horizontal line regression model does serve a purpose in determining how well the model we develop performs.

A second attempt at solving this problem would be to generate the least squares line. This is the line that minimizes the D value listed above. We can see that D is a multi-variable polynomial, and we can find the minimum of such a polynomial using calculus, partial derivatives and Gaussian elimination (I will omit the work here because it distracts from the main point of this blog post; however, Steven J. Miller has a good write-up of this).
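For readers who want a feel for the omitted step, here is a brief sketch of the standard argument, writing $\beta_0$ for the y-intercept and $\beta_1$ for the slope. Setting both partial derivatives of D to zero gives two linear equations in the two unknowns:

```latex
\frac{\partial D}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left[ y_i - (\beta_0 + \beta_1 x_i) \right] = 0
\qquad
\frac{\partial D}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i \left[ y_i - (\beta_0 + \beta_1 x_i) \right] = 0

% Rearranged, these are the two "normal equations":
n \beta_0 + \beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i
\qquad
\beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
```

Dividing the first normal equation by n gives the intercept in terms of the slope and the two means, and substituting that into the second equation and solving for the slope gives a ratio of two sums.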

The calculus leads us to the following equations:

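$$S_{XY} = \sum_{i=1}^{n} x_i y_i - \frac{\left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right)}{n}$$

$$S_{XX} = \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n}$$

$$\beta_1 = \frac{S_{XY}}{S_{XX}}$$

$$\beta_0 = \bar{y} - \beta_1 \bar{x}$$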

To calculate the least squares line for this example, we first need to calculate a few values:
$\sum_{i=1}^{n} x_i y_i = 6.98$
$\sum_{i=1}^{n} x_i^2 = 443$
$\sum_{i=1}^{n} x_i = 77$
$\sum_{i=1}^{n} y_i = 1.18$
$S_{XX} = 72.44$
$S_{XY} = 1.30$

This lets us evaluate that
$\beta_1 = 0.018$
and
$\beta_0 = -0.0127$

So the resulting linear equation for this data is

$$\hat{y} = -0.0127 + 0.018x$$
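The arithmetic above is easy to reproduce. The sketch below recomputes the slope and intercept from the sixteen data points in the table using the sums just listed; the function name `least_squares` is hypothetical, and this is an illustration rather than the code behind the script.

```python
x = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
y = [0.1, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
     0.02, 0.05, 0.07, 0.1, 0.085, 0.09, 0.01, 0.05]


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    slope = sxy / sxx
    intercept = sum(y) / n - slope * sum(x) / n  # mean of y minus slope times mean of x
    return intercept, slope


intercept, slope = least_squares(x, y)
print(round(intercept, 4), round(slope, 3))
```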

Below is a graph of the two attempts at building a linear model for this data.

In the above image, the green line represents the horizontal line regression model and the blue line represents the least-squares line. As stated above, the horizontal line regression model is a horizontal line that does not adjust as the data changes. The least-squares line adjusts both its slope and y-intercept to better fit the data provided. The question becomes how well the least-squares line fits the data.

The Sum of Squares Error (SSE) sums the squared deviation of each point of our data from the least-squares line, where $\hat{y}_i$ is the value the line predicts for $x_i$.

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

A second metric that we are interested in is how well the horizontal line regression linear model estimates our data. This is called the Total Sum of Squares (SST).

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

The horizontal line regression model ignores the independent variable x from our data set, so the least-squares line, which takes this independent variable into account, will fit the data at least as well as the horizontal line regression model. Because of this, the SST sum is a worst-case scenario of how poorly our model can perform.

Knowing now that SST is never less than SSE, the regression sum of squares (SSR) is the difference between the total sum of squares and the sum of squares error.

SSR = SST – SSE

This tells us how much of the total sum of squares is accounted for by the model.

Finally, the coefficient of determination ($r^2$) is defined by

$$r^2 = \frac{SSR}{SST}$$

This expresses SSR as a fraction of SST, that is, the proportion of the variation in the data that is explained by the model.
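To see these quantities on the blood-alcohol data, the sketch below fits the least-squares line and then computes SSE, SST, SSR, and the coefficient of determination directly from their definitions. The function name is hypothetical and this is an illustration, not the script’s own code.

```python
x = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
y = [0.1, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
     0.02, 0.05, 0.07, 0.1, 0.085, 0.09, 0.01, 0.05]


def goodness_of_fit(x, y):
    """Return (SSE, SST, SSR, r squared) for the least-squares line through the data."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
             / sum((a - x_mean) ** 2 for a in x))
    intercept = y_mean - slope * x_mean
    predicted = [intercept + slope * a for a in x]
    sse = sum((b - p) ** 2 for b, p in zip(y, predicted))  # error around the fitted line
    sst = sum((b - y_mean) ** 2 for b in y)                # error around the horizontal line
    ssr = sst - sse                                        # variation explained by the model
    return sse, sst, ssr, ssr / sst


sse, sst, ssr, r_squared = goodness_of_fit(x, y)
print(round(sse, 5), round(sst, 5), round(ssr, 5), round(r_squared, 2))
```

For this data set the least-squares line explains roughly four fifths of the variation in blood alcohol level.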

So, check out my script on simple linear regression and let me know what you think.