Wow, has it really been that long since I last did a blog post? <checks calendar> Yes, yes, it appears it has been that long. Sorry about that – normal excuses apply: work, kids, moving house, the dog’s been ill etc.

What’s more, pretty much all my spare time in the last few months has been spent studying. “*Studying? What for?*”, I hear you ask. Well, I’ve been rather bitten by the whole MOOC explosion that’s been happening recently, which threatens to cause a really big shakeup in higher education. Distance learning via the internet is hardly a new concept, but the resources available from Coursera, Udacity, and EdX (amongst others) feel as though they’ve suddenly leapt forward in terms of quality, content, and accessibility.

So far this year I’ve taken courses including Artificial Intelligence, Cryptography, Logic, Algorithms, Gamification, Theoretical Computer Science, and Machine Learning, taught by world-class teachers from Stanford, Princeton, Harvard, the California Institute of Technology, and Google. Although they use the same material as taught in those institutions, the unregulated nature of the internet means that these courses don’t come with any *official* accreditation from the institution concerned, but that’s not important to me – I’m doing them because they’re so interesting, well-taught, and fun to learn.

Anyway, one of the homeworks in the CalTech Learning from Data course was to come up with an implementation of a Perceptron. If you’ve never studied Machine Learning before, you’d be forgiven for thinking, as I did, that a perceptron was some sort of alien weaponry from an episode of Star Trek. In fact, it’s an algorithm for classifying data into categories – a primitive kind of machine learning and forerunner to more modern techniques such as neural networks. The course specified that you could implement a perceptron using whatever language you liked – most students chose Python, Java, or Octave, a few chose C. Just to add a bit of variety, I thought I’d create one using Javascript. Since this was to be a perceptron based on only two feature classes of a dataset, I could then plot the resulting classification function in two-dimensional space such as, ooh, I don’t know, a polyline on a Bing Map. (A perceptron can operate in any number of dimensions, in which case the decision boundary becomes a hyperplane, but they’re a bit tricky to draw on a map…).
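Before diving into the implementation, it might help to see just how simple the core idea is: a perceptron classifies a point by taking the sign of a weighted sum of its features. Here's a minimal sketch (the `predict` function name is mine, not from the course code; as in the rest of this post, the first feature is a constant 1 paired with a bias weight):

```javascript
// A perceptron classifies a point x by the sign of the dot product w·x.
// By convention here, x[0] = 1 is a constant feature, so w[0] acts as a bias.
function predict(w, x) {
  var sum = 0;
  for (var i = 0; i < w.length; i++) {
    sum += w[i] * x[i];
  }
  return sum > 0 ? 1 : -1;
}

// e.g. with weights that only care about the third feature:
predict([0, 0, 1], [1, 0.5, -0.3]); // → -1
predict([0, 0, 1], [1, 0.5,  0.7]); // →  1
```

All the perceptron *learning* algorithm does is search for a weight vector `w` that makes this prediction agree with the labels of every training point.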

I’m not saying that this is a particularly efficient way to implement a perceptron, but I just thought it would be fun. So here it is.

### Create some training data

A perceptron is a supervised learning algorithm. You provide a training data set which has already been “labelled” – i.e. the individual data points have already been correctly classified. I created an artificial training dataset by first creating an arbitrary straight line drawn between two random points in the range –1 to +1:

```javascript
fx1 = 1 - (Math.random() * 2);
fy1 = 1 - (Math.random() * 2);
fx2 = 1 - (Math.random() * 2);
fy2 = 1 - (Math.random() * 2);

/**
 * Returns the value of x2 for a given x1 such that the point lies on the target function line
 */
function targetline(x) {
  return fy1 + (fy2 - fy1) / (fx2 - fx1) * (x - fx1);
}
```

I then created a set of 30 random points. The points were assigned a label based on whether they lay to the left (-1) or right (1) of the target line defined above:

```javascript
/**
 * Returns the result of the target function, f, for a given x
 */
function f(x) {
  if (x[2] > targetline(x[1])) {
    return 1;
  } else {
    return -1;
  }
}

// Training dataset array
var data = [];
for (i = 0; i < 30; i++) {
  var x = [];
  x[0] = 1.0;                       // Constant
  x[1] = 1.0 - (Math.random() * 2); // Feature between -1 and +1
  x[2] = 1.0 - (Math.random() * 2); // Feature between -1 and +1
  var y = f(x);                     // Correct label for this point
  data.push([x, y]);
}
```

And then, if I plot what I’ve done so far on Bing Maps, I get something a bit like this – points to the left of the target function line are labelled –1 and coloured white; points to the right are labelled +1 and coloured black:

Now, what’s important to understand is that, in a real machine-learning application, you don’t ever know the true target function (the green line) – all your algorithm gets to work with is the labelled points in the training set, like this:

The role of the perceptron is to try to determine a hypothesis function that separates the features in the dataset based on their label (in machine learning-speak, it *shatters* the data), so that all the black points lie on one side and all the white points on the other. This will be an approximation to the true, unknown target function.

The perceptron works by first initialising a vector of weights corresponding to each feature (dimension). The vector contains one more element than the number of dimensions, as there is an additional “zeroth” weight to represent the offset from the origin. For a 2d Bing Map, I initialised a weight vector containing three elements as follows:

```javascript
w = [0, 0, 0];
```

Then, loop through the dataset of training points, multiplying each feature by the corresponding weight, and determining the sign of the output compared to the sign of the correct labelled output for this point. In doing so, keep track of any misclassified points – that is, points that, based on the values in the current weight vector, produce the incorrect output compared to the known correct label:

```javascript
// Find points that are misclassified under the current hypothesis/weights
misclassified = [];

// Identify misclassified points
for (var i = 0; i < data.length; i++) {
  value = 0;
  for (var j = 0; j < w.length; j++) {
    value += w[j] * data[i][0][j];
  }
  // Compare to correct output for this point
  if (sign(value) != sign(data[i][1])) {
    misclassified.push(data[i]);
  }
}
```

If no points are misclassified, then the current weight vector can be used as the final hypothesis function. If any points are misclassified, then an arbitrary misclassified point is chosen, and its features and label are used to update the weight vector. Then the process starts again.
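The update step itself is the standard perceptron learning rule: add the chosen misclassified point's features, scaled by its correct label, to the current weights. A minimal sketch (the `updateWeights` helper name is mine; the full version appears in the web worker script later):

```javascript
// Perceptron update rule: nudge the weights towards classifying
// the chosen misclassified point correctly.
// point is [x, y], where x is the feature vector (x[0] = 1, the constant)
// and y is the correct label, +1 or -1.
function updateWeights(w, point) {
  var x = point[0];
  var y = point[1];
  for (var j = 0; j < w.length; j++) {
    w[j] += y * x[j];
  }
  return w;
}

// Starting from all-zero weights, a point labelled +1 pulls the
// weights directly towards its own feature vector:
updateWeights([0, 0, 0], [[1, 0.5, 0.7], 1]); // → [1, 0.5, 0.7]
```

Intuitively, each update rotates and shifts the decision boundary so that the chosen point is more likely to end up on the correct side, possibly at the expense of other points – which is why the loop has to keep re-checking the whole training set.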

It’s a fairly primitive algorithm, and it can take lots of iterations to determine a hypothesis function that correctly labels all the points in the training set, so I moved the logic for the perceptron calculation into a webworker which could run in the background and only sent the results back to the page when they were finished. This means that the page stays responsive and the UI doesn’t hang while the perceptron runs. When it’s finished you should get a result as follows:

- The green dotted line represents the true, unknown target function
- The black and white dots represent training examples which have been classified based on the target function
- The blue solid line is the hypothesis function suggested by the perceptron which shatters the dataset – it classifies all the training examples correctly (although note that it does not match the true target function exactly, so its performance out-of-sample will not be perfect).

Now, this has been a fairly trivial example, and I’ve only been using a Bing Map with the blank “Mercator” background as a 2d canvas on which to draw points and lines in arbitrary 2-dimensional space, but hopefully you can see the potential applications for this sort of algorithm to separate spatial features into multiple classifications:

Here’s the HTML code:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Bing Maps 2D Perceptron</title>
  <!-- Assumes the Bing Maps v7.0 AJAX control; adjust the reference if using a different version -->
  <script type="text/javascript" src="http://ecn.dev.virtualearth.net/mapcontrol/mapcontrol.ashx?v=7.0"></script>
  <style type="text/css">
    .blacktext div {
      color: black !important; /* Make Pushpin text black */
    }
  </style>
  <script type="text/javascript">
    var MM = Microsoft.Maps;

    // Declare the points through which the target function line will pass
    var fx1, fy1, fx2, fy2 = null;

    /**
     * Returns the value of x2 for a given x1 such that the point lies on the target function line
     */
    function targetline(x) {
      return fy1 + (fy2 - fy1) / (fx2 - fx1) * (x - fx1);
    }

    /**
     * Returns the result of the target function, f, for a given x
     */
    function f(x) {
      if (x[2] > targetline(x[1])) {
        return 1;
      } else {
        return -1;
      }
    }

    /** The final hypothesis
      params:
        x : data point
        w : weight vector
      returns:
        The result of executing the hypothesis, given by the weights, on the given data point
    */
    function h(x, w) {
      var val = 0.0;
      for (i = x.length - 2; i >= 0; i--) {
        val += -w[i] * x[i];
      }
      return val / w[x.length - 1];
    }

    // Map instance
    var map = null;

    // Weight array
    var w = [];

    /**
     * Display the target function in the plane between [-1,-1] and [1,1]
     */
    function drawTargetFunction() {
      var line = new MM.Polyline([
        new MM.Location(targetline(-1), -1),
        new MM.Location(targetline(1), 1)
      ], {
        strokeDashArray: "5,3",
        strokeColor: new Microsoft.Maps.Color(220, 20, 125, 20),
        strokeThickness: 5
      });
      map.entities.push(line);
    }

    /**
     * Display the hypothesis function returned by the perceptron in the plane between [-1,-1] and [1,1]
     */
    function drawHypothesisFunction(w, hypothesis) {
      var line = new MM.Polyline([
        new MM.Location(h([1, -1, 1], w), -1),
        new MM.Location(h([1, 1, 1], w), 1)
      ], {
        strokeDashArray: "3,3",
        strokeColor: new Microsoft.Maps.Color(150, 20, 20, 125),
        strokeThickness: 5
      });
      // Draw intermediate hypotheses in black
      if (hypothesis == 'h') {
        line.setOptions({ strokeColor: new Microsoft.Maps.Color(20, 20, 20, 20) });
      }
      // Draw final hypothesis in blue
      else if (hypothesis == 'g') {
        line.setOptions({
          strokeDashArray: "0",
          strokeColor: new Microsoft.Maps.Color(220, 20, 20, 125)
        });
      }
      map.entities.push(line);
    }

    /**
     * Create an array containing n random data points
     */
    function generateData(n) {
      var data = [];
      for (i = 0; i < n; i++) {
        var x = [];
        x[0] = 1.0;                       // Constant
        x[1] = 1.0 - (Math.random() * 2); // Between -1 and +1
        x[2] = 1.0 - (Math.random() * 2); // Between -1 and +1
        var y = f(x);                     // Correct output for this point
        data.push([x, y]);
      }
      return data;
    }

    /**
     * Plot (first 2-dimensions only) of supplied data point, and style according
     * to correct output, y
     */
    function plotData(data) {
      for (var i = 0; i < data.length; i++) {
        // Pushpin location is (x2, x1), matching the (lat, lon) convention used
        // for the lines above
        var point = new MM.Pushpin(new MM.Location(data[i][0][2], data[i][0][1]));
        if (data[i][1] > 0) {
          point.setOptions({ /* text: "1", */ icon: "blank-button-round-black.png" });
        } else {
          point.setOptions({ /* text: "-1", */ icon: "blank-button-round-white.png", typeName: "blacktext" });
        }
        map.entities.push(point);
      }
    }

    function initialise() {
      // Create the target line function
      fx1 = 1 - (Math.random() * 2);
      fy1 = 1 - (Math.random() * 2);
      fx2 = 1 - (Math.random() * 2);
      fy2 = 1 - (Math.random() * 2);
      drawTargetFunction();
      trainingdata = generateData(30);
      plotData(trainingdata);
      // Initialise weight vector at zeros
      w = [0, 0, 0];
      // Find the right weights by running the perceptron
      w = run_perceptron(trainingdata);
      // Draw the final hypothesis (g) given by the calculated weights
      // drawHypothesisFunction(w);
    }

    function sign(x) {
      if (x > 0) { return 1; } else { return -1; }
    }

    function run_perceptron(data) {
      // Creates a Web Worker
      var worker = new Worker("perceptron_worker.js");

      // Posts a message to the Web Worker
      worker.postMessage({ 'cmd': 'start', 'data': data, 'w': w });

      // Triggered by postMessage in the Web Worker
      worker.onmessage = function (evt) {
        data = evt.data;
        switch (data.result) {
          case 'h':
            drawHypothesisFunction(data.w, 'h');
            document.getElementById('iterations').innerHTML = data.iteration;
            break;
          case 'g':
            drawHypothesisFunction(data.w, 'g');
            document.getElementById('iterations').innerHTML = data.iteration;
            break;
        }
      };

      // If the Web Worker throws an error
      worker.onerror = function (evt) {
        alert('problem with the webworker:' + evt.data);
      };
    }

    function GetMap() {
      map = new Microsoft.Maps.Map(document.getElementById("mapDiv"), {
        credentials: "PUTYOURCREDENTIALSHERE",
        center: new Microsoft.Maps.Location(0, 0),
        mapTypeId: Microsoft.Maps.MapTypeId.mercator,
        labelOverlay: Microsoft.Maps.LabelOverlay.hidden,
        showDashboard: false,
        showScalebar: false,
        zoom: 9,
        enableSearchLogo: false,
        enableClickableLogo: false
      });
      initialise();
    }
  </script>
</head>
<body onload="GetMap();">
  <div id='mapDiv' style="position:relative;width:768px;height:768px;"></div>
  <div id="iterations"></div>
  <div id="infoDiv">
    <div><span style="color:#00DD00;">█ █ █</span><span>Target Function (f)</span></div>
    <div><span style="color:#202020;">█ █ █</span><span>Hypothesis (h)</span></div>
    <div><span style="color:#0000DD;">████</span><span>Final Hypothesis (g)</span></div>
  </div>
</body>
</html>
```

And here’s the script for the webworker, *perceptron_worker.js*:

```javascript
// Simple helper function to determine the +ve/-ve sign of a value
function sign(x) {
  if (x > 0) { return 1; } else { return -1; }
}

// Triggered by postMessage in the page
onmessage = function (evt) {
  switch (evt.data.cmd) {
    // Start the perceptron
    case 'start':
      // Receive the set of training points and initial weight vector
      var data = evt.data.data;
      var w = evt.data.w;

      var iters = 1;
      // Hard-coded maximum number of iterations
      while (iters < 50000) {
        // Array of points that are misclassified under the current hypothesis/weights
        misclassified = [];

        // Identify misclassified points
        for (var i = 0; i < data.length; i++) {
          value = 0;
          for (var j = 0; j < w.length; j++) {
            value += w[j] * data[i][0][j];
          }
          // Compare to correct output for this point
          if (sign(value) != sign(data[i][1])) {
            misclassified.push(data[i]);
          }
        }

        // If all training points are classified correctly, we've finished
        if (misclassified.length == 0) {
          break;
        } else {
          // Choose a random misclassified point
          var mispoint = misclassified[Math.floor(Math.random() * misclassified.length)];
          // Update weight vector based on this point
          for (var j = 0; j < w.length; j++) {
            w[j] += mispoint[1] * mispoint[0][j];
          }
        }

        // Uncomment the following line to plot every suggested hypothesis, h, that gets produced
        // postMessage({ 'result': 'h', 'iteration': iters, 'w': w });
        iters += 1;
      }

      // Send back the final hypothesis, g
      postMessage({ 'result': 'g', 'iteration': iters, 'w': w });
      self.close(); // Terminates the worker.
      break;

    // Stop the perceptron
    case 'stop':
      self.close(); // Terminates the worker.
      break;

    // Shouldn't get this
    default:
      self.postMessage('Unknown command: ' + evt.data.cmd);
  }
};
```