Posts

Showing posts with the label algorithms

The Hitchhiker's Guide to Computer Networks

Image
Introduction One of the few reliable things that didn't crumble under the pressure of quarantining and working from home is our digital infrastructure. It remains largely unaffected as we conduct most of our business activities online. This is no accident. It is the result of 40+ years of continuous research to make that infrastructure faster, cheaper, and more reliable. My goal writing this series is to shed some light on the different hidden aspects of our digital infrastructure. In doing so, I hope that I will help the reader get a behind the scene's view of what we take for granted when we are watching HD videos, buying our groceries online, and chatting online with people on the other side of the planet.  A typical Internet user pays for their Internet and online services (Netflix, Amazon Prime, Spotify, iCloud) with some expectations of the quality of experience they will receive. Users also expect that the way they interact with the digital world will keep improving. Th...

KNN Algorithm and KD-Trees

Image
I am taking a Pattern Recognition course this semester, it's not really one of my favourite topics but I am doing OK. We had an assignment last week to test and compare different classification algorithms. One of the algorithms we were asked to implement was KNN (K Nearest Neighbours) . We were asked to find a way that makes searching for the K Nearest Neighbours faster; that's what this post about. The problem briefly is that: Given two sets of K dimensional points find the N nearest points (using Euclidean distance or any other measurement) to a K dimensional point X. The Naive Solution: Go through the list, find the distance between X and all the points in the two sets, get the smallest N elements in the distances list... Simple enough. As part of our assignment we were given a dataset of 100,000 points which proved this algorithm to be very slow. My Naive WRONG Solution: I thought that it was easy to index the points in a way that makes it efficient to find the ne...