This article describes the algorithm of efficient finding the pair of closest points in the given set of points, including the interactive visualization.
As soon as we know the distance between the two arbitrary points, we do not need to check any pair for that we known that its points are located more far apart than this known distance. The approximate distances can be estimated if we sort our points along x or y axis. The difference between one of the coordinates is always less than the full distance between the points, Using the sorted list rather than arbitrary set can significantly reduce the number of pairs that must be checked, as we can break the search as soon as the difference along the sorted axis exceeds the distance between the points in the closest pair we already know. This optimization is very obvious and sufficient by itself for the single dimensional space.
Deeper optimization that also works with two or three dimensions and allows to find the closest pair in O(n log n) time requires to use recursive division and partitioning that is somewhat similar to Quicksort algorithm.
The last step is the most obvious at the deepest level of recursion, when there is only one point to the right and one point to the left (so just the distance between them needs to be checked). If not, the distance ranges are computed for the left side points. Points that are on the right side yet fall into these ranges are checked against the corresponding points on the left side for the possible closest pair. As the list is sorted by the x coordinate, the set of points between the two vertical lines can be iterated without looking at the rest of the points.
If there is an odd number of points, it is not possible to divide them into the two sets of equal size. In such cases one of the sides can contain one point less; the algorithm steps still can be correctly applied. Subset that contains only one point can return infinity as its value.
When the splitting is applied recursively, the left subset spans from left boundary till the middle line of the current subset and the right subset spans from the middle line till the right boundary of the current subset.
If we need to known the points and not just the distance between them, m_min, l_min and r_min must be structures that remember the points of the pair that they represent.
Our visualization allows to set points with the mouse and then apply algorithm at once or step by step. It shows the current boundaries of the partitioned interval as the vertical black lines and the middle line as the red line with the red triangle above. the boundaries and the middle line that are current (belong to the current level of recursion) are highlighted in bold (thicker). After recursively descending first to the right and then to the left regions, the yellow rectangles around the points on the left side show how far the machine should search. Only the points the belong to the right side and are inside the yellow rectangle must be taken into consideration.
There is a proven theorem that at most six neighbours need to be checked if they do not make the closest pair with the given point,