# The PageRank Algorithm

I think one of the best recent examples of the importance of mathematics is the rise of the search engine Google. I remember the world of search engines before Google, which was dominated by names like AltaVista, Yahoo, WebCrawler, Excite, and the like. The standard way these search engines ordered the pages returned for a query was basically to count the number of times the query term appeared on each page in their database. The pages with the most occurrences were considered the most important, those with the second most were considered second most important, and so on.

This sounds like a feasible way of doing things but let me show you an example of how this can be tricked. Suppose I wrote my first web page and it looked like the following:

That’s a basic web page that may not garner much attention, and it wouldn’t rank highly in most search engines because no word appears more than once. Suppose that, this being a math web page, I wanted it to rank higher on the query “math”. Then I could just edit the source code of the page to be as follows:

This second page says little more than the first, but the fact that the word “math” appears 9 additional times would raise its ranking among math pages. This is a very simple example, but it shows that these search engine rankings lacked a useful metric for determining which sites on the web are important.
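
To make the trick concrete, here is a minimal sketch of ranking by keyword count. The two page texts are hypothetical stand-ins for the pages described above, and the function names `keyword_count` and `rank_by_count` are my own; this is an illustration of the idea, not how any particular search engine was implemented.

```python
import re

def keyword_count(text: str, query: str) -> int:
    """Count whole-word, case-insensitive occurrences of query in text."""
    pattern = rf"\b{re.escape(query.lower())}\b"
    return len(re.findall(pattern, text.lower()))

def rank_by_count(pages: dict[str, str], query: str) -> list[str]:
    """Order page names from most to fewest occurrences of the query."""
    return sorted(pages, key=lambda name: keyword_count(pages[name], query),
                  reverse=True)

# Hypothetical page texts: the second repeats "math" 9 additional times.
pages = {
    "plain": "Welcome to my first web page about math.",
    "stuffed": "Welcome to my first web page about math. " + "math " * 9,
}

for name in rank_by_count(pages, "math"):
    print(name, keyword_count(pages[name], "math"))
```

The stuffed page wins the ranking even though it carries no more information than the plain one.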

Google solved this problem of determining the importance of a web page basically by counting the number of links into that page. The theory is that the more important a web page is, the more people will be talking about it and thus linking to it. Also, the more important the people talking about (linking to) a page, the more important that page is. This can be expressed mathematically by the following formula:
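
One commonly cited form of the PageRank formula, which I offer as an assumption about the notation rather than as the definitive statement, is:

$$
PR(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}
$$

Here $N$ is the total number of pages, $M(p_i)$ is the set of pages that link to $p_i$, and $L(p_j)$ is the number of outgoing links on page $p_j$. These symbols are chosen for illustration. Some presentations, including the original description by Brin and Page, write the first term as $(1 - d)$ without dividing by $N$.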

In the above formula, the variable d is called the damping factor. It helps capture some of the random nature of the internet by saying that every site should have at least some minimal worth, because a random surfer could still reach it.
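
As a rough sketch of how the ranks can be computed, the following applies the update repeatedly until the values stop changing. It assumes the normalized form of the formula, with a base term of (1 - d)/N, and it assumes that every linked page also appears as a key in `links`. The function name, the default d = 0.85, and the small example graph are my own choices for illustration, not the script mentioned in this post.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=200):
    """Iteratively estimate PageRank.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform guess
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is shared by everyone.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - d) / n + d * dangling / n for p in pages}
        # Each page passes an equal share of its rank along every link.
        for p in pages:
            for q in links[p]:
                new_rank[q] += d * rank[p] / len(links[p])
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical four-page web: C is linked to by A, B, and D.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this example, page D has no incoming links, so it keeps only the minimal worth that the damping factor guarantees.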

I have written a script to implement the algorithm here.

Other blogs that have covered this topic:
Blue Onion

# User Generated Flash Cards

When I originally started LEARNINGlover.com, I had the idea that I would use my skills in programming and mathematics to help others. While this remains an important goal of mine, I remain curious about how I can use this site in other ways. One question that has been on my mind for a while is how to handle users who want to learn things that the site does not currently offer. I do plan to regularly add sections, data and programs to the site, but what about people who don’t want to wait, or who don’t want to learn the things that I have been studying? My thinking is that I’d like to provide users with tools that can be used to help learn, and in particular to help teach yourself, just about anything, as long as you have a good source. We live in an age where information is everywhere. With just the click of a mouse we have the power to download an online textbook, read journal articles, learn a new language and a host of other things, if we know how to use this information correctly.

The concept of flash cards has been around much longer than I have. That said, I don’t often see flash cards used outside the realm of vocabulary. When I began studying theoretical mathematics, I found the number of vocabulary words introduced in each class intimidating. So by instinct, I just used flash cards to learn these words. The classes, though, also consisted of theorems, lemmas, proofs, corollaries, etc. that do not fit into the vocabulary ‘box’ that I had seen flash cards used for. Nonetheless, I needed to study these concepts if I was going to do well in my classes.

What I found is that in my notes, not only are the notes themselves important, but there is also a question – ‘Why is this line that I just underlined important?’ Or phrased differently, ‘What question was this line written to answer?’ Take for example the following note which may come up in a class on Graph Theory:

The complexity of Kruskal’s algorithm is O(E log V).

This is certainly an important fact, as we are interested in how well algorithms perform and would like to compare these complexities. But once I have underlined it in my notes, I want a way to remember it. I know the question that will come to me during an exam or in real life practice is:

What is the complexity of Kruskal’s algorithm?

So I create a flash card with this question and answer pair.

I don’t have any ‘rules’ for creating these questions, but here are some things that come to mind:

1. In general I try to make my questions open-ended (not yes/no questions).
2. For definitions, I generally ask “What is the definition of <insert word>”.
3. Many times, things such as theorems, lemmas, or corollaries will connect two things together (either in one direction or two), so I ask questions like “What is the relationship between <insert concept 1> and <insert concept 2>?” for these.
4. Other times, theorems simply state facts (like the example above), or link a well known function (complexity) to a particular object (Kruskal’s algorithm). For these I form questions like “What is the value of <insert function> on <insert object>?”

Of course, I don’t live by these rules, so feel free to change or add to them as you like. I have used similar ideas when reading and trying to understand journal papers as well as some philosophy books. I hope this helps you as much as it has helped me.
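
For anyone who wants to keep such cards in a program, the question patterns above could be sketched as small templates. The `FlashCard` class and the helper names below are hypothetical, made up for this illustration, and are not part of any tool on this site.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FlashCard:
    question: str
    answer: str

def definition_card(word: str, definition: str) -> FlashCard:
    """Pattern 2: ask for the definition of a word."""
    return FlashCard(f"What is the definition of {word}?", definition)

def relationship_card(concept_1: str, concept_2: str, relationship: str) -> FlashCard:
    """Pattern 3: ask how two concepts are connected."""
    question = f"What is the relationship between {concept_1} and {concept_2}?"
    return FlashCard(question, relationship)

def value_card(function: str, obj: str, value: str) -> FlashCard:
    """Pattern 4: ask for the value of a function on an object."""
    return FlashCard(f"What is the value of {function} on {obj}?", value)

# The Kruskal example, written with the template and in natural phrasing.
templated = value_card("complexity", "Kruskal's algorithm", "O(E log V)")
natural = FlashCard("What is the complexity of Kruskal's algorithm?", "O(E log V)")
print(templated.question)
print(natural.question)
```

The templated wording is stiffer than a hand-written question, which is one reason to treat the patterns as starting points rather than rules.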