Random Quote: Who de f*** is Stapopski and why are we discussing him? – Anonymous internet user

Random Quote: Even atheists say a little prayer now and then: Dear God, I am an idiot, thank you for protecting my children. – Garrison Keillor


Operating System

Right now my favorite operating system is Manjaro, an Arch-based Linux distribution. It’s open source and it has a rolling release that keeps the packages more up to date than some other popular distros. For my desktop environment I use Xfce.


I program in Python and dabble in Julia.

For fun things you can do with Python and Julia, see these lectures.

Another handy open source programming language is R, an implementation of the S language, which in turn was developed back in the early days at Bell Labs, along with C, the UNIX operating system, information theory, the transistor, the laser, and a few other minor contributions to world science and technology.

I do all my text editing using vim. I used to use Emacs but I was young and needed the money. I have made a FUNDAMENTAL ADVANCE in the field of vim text editing: map “jj” to the escape key. (Postscript: It turns out I was about the 11 millionth person to think of this.)

Machine Learning

I’m interested in machine learning. The machine learning problem is one of associating an output with a given set of inputs. One component of the problem is selection of a hypothesis space, which is a suitable collection of candidate mappings from the input to the output space. The second component is a learning algorithm, which takes training data as an input and returns a mapping from the hypothesis space that represents a proposed functional relationship between inputs and outputs. The task of the system designer is to implement this search through the hypothesis space with the objective of finding a mapping which optimizes the ability of the machine to generalize, i.e., to give the “right” output for an arbitrary set of inputs.
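To make the two components concrete, here is a toy sketch in Python. Everything in it is an assumption chosen for illustration: the hypothesis space is a small grid of threshold classifiers on a single input, the learning algorithm simply picks the candidate with the lowest training error, and the data is synthetic with a made-up "true" threshold and label noise. The names (make_hypothesis_space, learn, sample) are hypothetical. Real problems use far richer hypothesis spaces and smarter searches.

```python
import random

def make_hypothesis_space(thresholds):
    """Candidate mappings: one threshold classifier per value of t."""
    return [(t, lambda x, t=t: 1 if x >= t else 0) for t in thresholds]

def error_rate(h, data):
    """Fraction of (x, y) pairs that the mapping h gets wrong."""
    return sum(h(x) != y for x, y in data) / len(data)

def learn(hypothesis_space, training_data):
    """Learning algorithm: return the (t, h) pair with the lowest training error."""
    return min(hypothesis_space, key=lambda th: error_rate(th[1], training_data))

def sample(n, rng, true_threshold=0.6, noise=0.1):
    """Synthetic data (an assumption): label is 1 above the threshold, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x >= true_threshold else 0
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

rng = random.Random(0)
train = sample(200, rng)    # what the learning algorithm gets to see
test = sample(2000, rng)    # stand-in for "arbitrary" future inputs

space = make_hypothesis_space([i / 20 for i in range(21)])
t_hat, h_hat = learn(space, train)

print(f"chosen threshold: {t_hat:.2f}")
print(f"training error:   {error_rate(h_hat, train):.3f}")
print(f"held-out error:   {error_rate(h_hat, test):.3f}")
```

The held-out error is the number that speaks to generalization: the training error only says how well the chosen mapping fits the examples it was selected on.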

Statistical learning theory provides a principled approach to optimizing generalization ability. The Ayatollah of statistical learning theory is V.N. Vapnik.

Here’s a nice quote from Vapnik:

I believe that something drastic has happened in computer science and machine learning. Until recently, philosophy was based on the very simple idea that the world is simple. In machine learning, for the first time, we have examples where the world is not simple. For example, when we solve the “forest” problem (which is a low-dimensional problem) and use data of size 15,000 we get 85%-87% accuracy. However, when we use 500,000 training examples we achieve 98% of correct answers. This means that a good decision rule is not a simple one, it cannot be described by a very few parameters. This is actually a crucial point in approach to empirical inference.

This point was very well described by Einstein who said “when the solution is simple, God is answering”. That is, if a law is simple we can find it. He also said “when the number of factors coming into play is too large, scientific methods in most cases fail”. In machine learning we are dealing with a large number of factors. So the question is what is the real world? Is it simple or complex? Machine learning shows that there are examples of complex worlds. We should approach complex worlds from a completely different position than simple worlds. For example, in a complex world one should give up explain-ability (the main goal in classical science) to gain a better predict-ability.