# Permutation Problems

I love to play with puzzles. When I was in grade school I would spend hours at a time figuring out ways to solve from things like Tetris, Mindsweeper, Solitare, and Freecell. Later I was introduced to puzzles involving numbers like Sudoku and Nonograms.

These puzzles are often interesting in part because there is generally a very large way that things can be arranged, but only a few of these arrangements are correct. Generally a person solving a puzzle will figure out certain things that must be true or cannot be true, which helps in solving the puzzle and reducing the number of possible cases. Initially, though, we are often left with a situation where we have a new puzzle and our only method is to keep trying every possible solution until we start to notice a pattern (or reach a solution).

For example, consider a Sudoku puzzle. We are given a partially filled in grid and our job is to fill in the remaining cells with the rules that every row, column and marked subgrid must have the numbers 1 – 9 exactly once. One initial attempt at solving such a puzzle could be to attempt to permute the string 1, 2, 3, 4, 5, 6, 7, 8, 9 until we find a solution that fits the first row, then do the same with the second row, and so on and so forth.

One immediate question is how many ways are there to permute the numbers 1, …, 9? We can answer this by realizing that each permutation is a new string. So for each string that we construct, we have 9 choices for the first element in the string. Then once that element has been chosen, we are not allowed for that element to appear anywhere else. So there are only 8 possible choices for what can go in the second string. Continuing this process, we see that the number of possible permutations we can construct from the string 1, …, 9 is

9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1 = 362880

This is a large number of possible strings to generate just to get one row of a Sudoku, so hopefully you’ll be able to notice the pattern before going through this whole set (because once you’ve generated the first row you still have to do the other 8 rows).

Nonetheless because there is often great value that can be gained by knowing how to permute through all possible solutions, I have written three functions that help with this process: Next_Permutation, Previous_Permutation, and Random_Permutation.

Before I give these algorithms, I want to highlight two notations on ordering string. A string (a1, a2, …, an) is said to be in lexicographical order (or alphabetical order) if for each i [in] 1, …, n-1, ai [<=] ai+1. Likewise, a string is said to be in reverse lexicographical order if for each i [in] 1, …, n-1, ai [>=] ai+1.

Next Permutation

If the given string is not in reverse lexicographic order, then there exists two elements j1 and j2 such that j1 [<] j2 and aj1 [<] aj2.
1. The Next_Permutation algorithm first searches for the largest element j1 such that aj1 [<] aj1 + 1. Since we said the string is not in reverse lexicographic order, this j1 must exist.
2. Once this j1 is found, we search for the smallest element aj2 such that aj2 [>] aj1. Again, since we know that this is true for j1 + 1, we know that such a j2 must exist.
3. We swap the elements aj1 and aj2.
4. The elements after j1 + 1 are then placed in lexicographic order (i.e. in order such that ai [>=] ai+1

If the given string is in reverse lexicographic order, then we simply reverse the string.

Previous Permutation

If the given string is not in lexicographic order, then there exists two elements j1 and j2 such that j1 [<] j2 and aj1 [>] aj2.
1. The Previous_Permutation algorithm first searches for the largest element j1 such that aj1 [>] aj1 + 1. Since we said the string is not in reverse lexicographic order, this j1 must exist.
2. Once this j1 is found, we search for the smallest element aj2 such that aj1 [>] aj2. Again, since we know that this is true for j1 + 1, we know that such a j2 must exist.
3. We swap the elements aj1 and aj2.
4. The elements after j1 + 1 are then placed in reverse lexicographic order (i.e. in order such that ai [<=] ai+1

If the given string is in lexicographic order, then we simply reverse the string.

Random Permutation

This generates a random string permutation.

This script can be seen here, set to work on permutations of colors.

Here in DC, we recently had an unexpected snow day. By the word unexpected, I don’t mean that the snow wasn’t forecast – it was definitely forecast. It just never came. However due to the forecast I decided to avoid traffic just in case the predictions were correct. So while staying at home, I began thinking about some things that I’ve been wanting to update on the site and one thing that came up was an update to my Sudoku program. Previously, it contained about 10000 sample puzzles of varying difficulty. However, I told myself that I would return to the idea of generating my own Sudoku puzzles. I decided to tackle that task last week.

The question was how would I do this. The Sudoku solver itself works through the dancing links algorithm which uses backtracking, so this was the approach that figured as most likely to get me a profitable result in generating new puzzles (I have also seen alternative approaches discussed where people start with an initial Sudoku and swap rows and columns to generate a new puzzle). The next question was how to actually implement this method.

Here is an overview of the algorithm. I went from cell to cell (left to right, and top to bottom starting in the top left corner) attempting to place a random value in that cell. If that value can be a part of a valid Sudoku (meaning that there exists a solution with the current cells filled in as is), then we continue and fill in the next cell. Otherwise, we will try to place a different value in the current cell. This process is continued until all cells are filled in.

The next step was to create a puzzle out of a filled in Sudoku. The tricky about this step is that if too many cells are removed then we wind up generating a puzzle that has multiple solutions. If too few cells are removed though, then the puzzle will be too easy to solve. Initially, I went repeatedly removed cells from the locations that were considered the most beneficial. This generally results in a puzzle with about 35-40 values remaining. To remove additional cells, I considered each of the remaining values and questioned whether hiding the cell would result in the puzzle having multiple solutions. If this was the case, then the cell value was not removed. Otherwise it was. As a result I now have a program that generates Sudoku puzzles that generally have around 25 hints.

# New Years is a LEARNINGlover Thing!

Starting this web site has served as the perfect opportunity to unwind. In particular, two blogs I wrote recently have served different purposes. The first was “The Degrees of Consciousness of a Black Nerd“, where I spoke about many of the things I think about being who I am and relating the the (somewhat unique) set of people that I communicate with on a daily basis. The other was what I’ve been working on since mid December. Its the blog entry I wrote on Sudoku and the Sudoku program that I wrote last month using the Dancing Links Algorithm. Since originally writing that, I’ve updated the program with a lot of Sudoku problems, as well as two types of “hints”. One generates the “possibilities matrix” which basically just shows what is possible for each cell. The other scans the possibilities matrix and searches for isolated cells (cells where some number can only go in one row/column/subgrid). Both those additions were extremely fun and provided a nice opportunity to program in my spare time.

So I’ve been thinking about these two things and how much I enjoyed the two, but for different reasons. The Degrees of Consciousness of a Black Nerd brought much attention on facebook where the idea was both accepted and rejected and I was able to explain the ideas further and hear similar stories from others who had similar experiences. Sudoku, on the other hand hasn’t generated as much conversation. A few friends have told me that they liked the program, but I don’t know of too many people outside of myself using it. That’s OK. I didn’t really create the program for publication, moreso because I enjoy Sudoku and took it as a challenge to write a program to solve a puzzle.

That being said, I’ve been thinking about some other things that I would like to do – hopefully they’ll be more than just my opinion, but have some programming, operations research, or mathematical context to them as well. But here’s a list of things that I want to write about in the near future.

• Connections between math and football (or sports in general). Anybody who knows me knows that I’m a huge sports fan. At other sites, I’ve posted stuff on QB rating systems and the flaws/inaccuracies in them. Part of me would like to look further into this stuff and either do a comparison, create a new rating system, or just try to understand (explain) different scouting metrics, particularly for QBs.
• I wrote the Sudoku program, but that’s a well studied problem so it was easy to find research that helped me to understand other stuff. I also studied some problems in Ramsey Theory that can be represented by exact covering algorithms and it would be interesting to try to represent these as exact cover problems and to try to use the Dancing Links algorithm to solve these problems as well, maybe to find a comparative analysis to benchmarks.
• One of the things I’ve picked up on lately is machine learning. While I’ve added flash cards on Bayesian Networks, I would like to add some programs on things like k-means clustering.
• I added my sorting algorithms a few weeks ago, but would like to also add something on data structures (arrays, linked lists, trees, heaps, hash tables, etc). In undergrad this was called the class that weeded people out of computer science majors, so doing something on this type of stuff I think could be helpful to those who want to understand it better.
• I have been in the process of writing a linear programming (Simplex) implementation for a while. I would like to get back to that to allow for a solver that can do at least simple algorithms.
• I’ve written a few drafts that connect math to different areas that I’m interested in (music, sports, philosophy, religion, etc), but I need to find a better way to present the stuff because as they’re currently stated, this stuff can easily be misinterpreted.

The beautiful thing about this site is that I’m not constrained by any advisor or boss or deadlines. Its more of a how am I feeling right now kind of a thing and these are the things I’m feeling right now. So this list is kind of my “New Years Resolutions” for LEARNINGlover.com, and while I reserve the right to change my priorities any time I feel that something else deserves my attention more, these are the things I’m planning to spend time on in the next few weeks.

# My Sudoku Program

Earlier this week, I was able to write a script that solves the popular Sudoku game. It was a fun experience because along the way I learned about a new way to implement the backtracking algorithm on exact cover problems.

Say we have a set of items. They could be practically anything, but for the sake of understanding lets think of something specific, like school supplies. In particular lets say that school is starting back and you have a checklist of things you need to buy:

• a pencil
• a pen
• a notebook
• a spiral tablet
• a ruler
• a calculator
• a lamp
• a bookbag (or knapsack as I just learned people call them)
• and some paper

Suppose also that as you’re shopping for these items you also see that stores sell the items as collections. In order to spend less money, you think it may be best to buy the items in collections instead of individually. Say these collections are:

• a pencil, a notebook and a ruler
• a pen, a calculator and a bookbag
• a spiral tablet a lamp and some paper

The exact cover problem is a very important problem in computer science because many other problems can be described as problems of exact cover. Because the problem is in general pretty difficult to solve most strategies for solving these problems still generally take a long amount of time to solve. A common technique for solving these problems is called backtracking. This is basically a trial and error way of solving the problem. We try to construct a solution to the problem, and each time we realize our (partial) solution will not work, we realize the error, backup by eliminating part of the current solution and try to solve the problem by constructing another solution. This is basically how the backtracking procedure works. The main caveat to it is that we need to keep track of the partial solutions we have already tried so that we do not continuously retry the same solutions over and over again.

In particular Sudoku puzzles can be described as exact cover problems. Doing this involves translating the rules of Sudoku into exact cover statements. There are four basic rules of Sudoku that we need to take into consideration.

• Each cell in the grid must receive a value
• a value can only appear once in each row
• a value can only appear once in each column
• a value can only appear once in each pre-defined 3 by 3 subgrid.

In actuality these statements are joined together and what happens is that each position in the grid we (try to) fill in actually has effects in all four sections. We can set up the Suduko problem as an exact cover problem by noting these effects.

• If a cell (i, j) in the grid receives the value v, then we also need to note that row i,  column j, and the pre-defined subgrid have also received its proper value.

We can keep track of this by defining a table.

• The first 81 (81 = 9*9) columns in the table will answer the question of does row i, column j have anything in that position. This will tell us whether the cell is empty or not, but will not tell us the value in the cell.
• The next 81 columns in the table will answer the question of does row i (of the Sudoku grid) have the value v in it yet.
• The next 81 columns in the table will answer the question of does column j (of the Sudoku grid) have the value v in it yet.
• The final 81 columns in the table will answer the question of does the (3×3) subgrid containing (i, j) have the value v in it yet.

Notice that individually these columns do not give us much information about the grid, but because each value we place into the grid also has a precise location, a row, a column, a value and a subgrid we link together these columns of the new table and are able to understand more about the implications of each Sudoku operation. There are 9 possible rows, 9 possible columns and 9 possible values for each move, creating 729 possible moves. For each possible move, we create a row in the new table which links the columns of this new table together as follows:

• row[i*9+j] = 1
(this says that cell (i, j) is nonempty)
• row[81+i*9+v] = 1
(this says that cell row i has value v)
• row[81*2+j*9+v] = 1
(this says that column j has value v)
• row[81*3+(Math.floor(i/3)*3+Math.floor(j/3))*9+v] = 1
(this says that the 3 x 3 subgrid of (i, j) has value v)

Our goal is to choose a set of moves that fills in the grid. In the exact cover representation of the problem, this means we need to select a set of rows of the new table such that each column of the new table has exactly one 1 in our solution.

In my script I solve this problem using backtracking and the dancing links representation of this exact cover problem. The way dancing links works is that the exact cover version of this problem is translated from being a simple table to a graphical one where each cell becomes a node with pointers to the rows above and below it and the columns to the left and to the right of it. There is also a header row of column headers which contain information about that column, particularly the column number and the number of cells in that column that can cover it (basically the number of 1’s in the table form of the exact cover version of the problem). There is also a head node which points to the first column.

Once the graph version of the matrix is set up, the algorithm recursively tries to select a column and then a row in that column. Once these values have been chosen it tries to solve the new version of the problem with fewer cells remaining. If this is possible then it continues, if not then it goes back and chooses another row or another column and row.

I hope that was an understandable overview of the procedures I used to implement this algorithm. Otherwise I hope you enjoy the script and use it freely to help with your Sudoku   puzzles.