Closest pair of points on a plane

Overview¶

Given $n$ points on a two-dimensional plane, find a pair of points with the smallest Euclidean distance between them.
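For reference, the problem can always be solved by checking every pair in $O(n^2)$ time. The sketch below is such a baseline, mainly useful for validating a faster solution on small inputs. The names `Point` and `closest_pair_brute` are illustrative assumptions and do not come from the original article.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point {
  double x, y;
};

// Smallest Euclidean distance over all pairs; requires at least two points.
double closest_pair_brute(const std::vector<Point>& pts) {
  assert(pts.size() >= 2);
  double best = std::hypot(pts[0].x - pts[1].x, pts[0].y - pts[1].y);
  for (std::size_t i = 0; i < pts.size(); ++i)
    for (std::size_t j = i + 1; j < pts.size(); ++j)
      best = std::min(best,
                      std::hypot(pts[i].x - pts[j].x, pts[i].y - pts[j].y));
  return best;
}
```

Calling `closest_pair_brute` on a few hundred random points and comparing the result with a faster implementation is a cheap way to catch off-by-one errors in the recursion.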

Here we introduce a divide-and-conquer algorithm that solves this problem in $O(n \log n)$ time. The algorithm was proposed in 1975 by Franco P. Preparata, and Preparata and Michael Ian Shamos proved that it is optimal under the decision tree model.

Algorithm¶

As in any divide-and-conquer algorithm, we split the set of $n$ points into two sets $S_1$ and $S_2$ of equal size and recurse on each of them. But there is a problem: how do we merge? That is, how do we find the closest pair with one point in $S_1$ and the other in $S_2$? For now, assume that the merge step takes $O(n)$ time; the total complexity of the algorithm is then $T(n) = 2T(\frac{n}{2}) + O(n) = O(n \log n)$.
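As a quick sanity check, the recurrence can be unrolled by hand. This is only a sketch, assuming that the merge step costs at most $cn$ for some constant $c$ and that $n$ is a power of two:

```latex
\begin{aligned}
T(n) &= 2T\left(\tfrac{n}{2}\right) + cn \\
     &= 4T\left(\tfrac{n}{4}\right) + 2cn \\
     &= \dots \\
     &= 2^k T\left(\tfrac{n}{2^k}\right) + k \cdot cn \\
     &= n\,T(1) + cn \log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```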

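We first sort all points with $x_i$ as the first key and $y_i$ as the second key, and use the point $p_m$ ($m = \lfloor \frac{n-1}{2} \rfloor$, with points indexed from $0$) as the boundary point. The point set is split into $A_1$ and $A_2$:

$$A_1 = \{p_i \mid i = 0 \ldots m\}$$

$$A_2 = \{p_i \mid i = m + 1 \ldots n - 1\}$$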

We then recurse to find the closest pair within each of these two point sets. Let the resulting distances be $h_1$ and $h_2$, and let $h = \min(h_1, h_2)$.

Now it is time to merge! We are looking for pairs of points, one belonging to $A_1$ and the other to $A_2$, whose distance is less than $h$. Therefore, we put all points whose abscissa differs from $x_m$ by less than $h$ into the set $B$:

$$B = \{p_i \mid \lvert x_i - x_m \rvert < h\}$$

For each point $p_i$ in $B$, our current goal is to find a point that is also in $B$ and whose distance to $p_i$ is less than $h$. To avoid considering each pair twice, we only consider points whose ordinate does not exceed $y_i$. Clearly, for a valid point $p_j$, $y_i - y_j$ must be less than $h$. So we get a set $C(p_i)$:

$$C(p_i) = \{p_j \mid p_j \in B,\ y_i - h < y_j \le y_i\}$$

If we sort the points in $B$ by $y_i$, $C(p_i)$ is easy to obtain: it consists of several consecutive points immediately preceding $p_i$.

From this we have the steps of merging:

Build the set $B$.
Sort the points in $B$ by $y_i$. The usual approach takes $O(n \log n)$, but we can optimize it to $O(n)$ (explained below).
For each $p_i \in B$, consider every $p_j \in C(p_i)$. Compute the distance for each pair $(p_i, p_j)$ and update the answer (the closest pair of the current point set).
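The merge steps above can be sketched as a standalone function. This is a minimal illustration rather than the article's implementation: it assumes the points are already sorted by $y$, and the names `Point`, `scan_strip`, `mid_x` and `h` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point {
  double x, y;
};

// pts must be sorted by y in non-decreasing order.
// Returns min(h, smallest distance between two points with |x - mid_x| < h).
double scan_strip(const std::vector<Point>& pts, double mid_x, double h) {
  assert(h > 0);
  double best = h;
  std::vector<Point> strip;  // the set B, in increasing order of y
  for (const Point& p : pts) {
    if (std::fabs(p.x - mid_x) >= h) continue;  // p does not belong to B
    // Walk back over C(p): earlier strip points q with p.y - q.y < h.
    for (std::size_t j = strip.size(); j > 0 && p.y - strip[j - 1].y < h; --j) {
      const Point& q = strip[j - 1];
      best = std::min(best, std::hypot(p.x - q.x, p.y - q.y));
    }
    strip.push_back(p);
  }
  return best;
}
```

Inside the divide-and-conquer recursion the inner loop runs only a constant number of times per point, because both halves already guarantee a minimum distance of $h$. Called on arbitrary input, as in this sketch, that guarantee does not hold and the scan can degrade to quadratic time.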

Note that we mentioned two sorts above. Because the point coordinates never change, the first sort needs to be done only once, before the divide and conquer begins. For the second sort, we make each recursive call return its point set sorted by $y_i$; the upper level then simply merges the two sorted point sets produced by the lower level.
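The idea of replacing the second sort with a merge can be shown in isolation. The sketch below assumes two halves that are each already sorted by $y$ and combines them with `std::merge` in linear time; `Point` and `merge_by_y` are illustrative names only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Point {
  double x, y;
};

// Merges two ranges that are each sorted by y into one vector sorted by y.
// Runs in O(|left| + |right|) instead of the O(n log n) of a fresh sort.
std::vector<Point> merge_by_y(const std::vector<Point>& left,
                              const std::vector<Point>& right) {
  auto by_y = [](const Point& a, const Point& b) { return a.y < b.y; };
  assert(std::is_sorted(left.begin(), left.end(), by_y));
  assert(std::is_sorted(right.begin(), right.end(), by_y));
  std::vector<Point> out(left.size() + right.size());
  std::merge(left.begin(), left.end(), right.begin(), right.end(),
             out.begin(), by_y);
  return out;
}
```

This is the same mechanism as the merge step of merge sort: the recursion sorts by $y$ as a side effect while it computes the answer.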

It may seem that this algorithm is still not optimal: if $\lvert C(p_i) \rvert$ were of order $O(n)$, the overall time complexity would be wrong. This is not the case, because its maximum size is $7$. We give the proof below:

Proof of complexity¶

We have learned that the ordinates of all points in $C(p_i)$ lie in the range $(y_i - h, y_i]$, and that the abscissas of all points in $C(p_i)$, and of $p_i$ itself, lie in the range $(x_m - h, x_m + h)$. This forms a $2h \times h$ rectangle.

We then split this rectangle into two $h \times h$ squares. Setting $p_i$ aside, the points in one square are $C(p_i) \cap A_1$ and those in the other are $C(p_i) \cap A_2$, and the distance between any two points within the same square is at least $h$ (because they come from the same lower-level recursion).

We split each $h \times h$ square into four small $\frac{h}{2} \times \frac{h}{2}$ squares. Each small square contains at most $1$ point, because the maximum distance between any two points in a small square is the length of its diagonal, $\frac{h}{\sqrt 2}$, which is less than $h$.

Therefore, there are at most $4$ points in each square and at most $8$ points in the rectangle. Removing $p_i$ itself, $\max \lvert C(p_i) \rvert = 7$.
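For completeness, the diagonal bound and the resulting count can be written out explicitly. This is just the Pythagorean theorem applied to a small square of side $\frac{h}{2}$:

```latex
\sqrt{\left(\tfrac{h}{2}\right)^2 + \left(\tfrac{h}{2}\right)^2}
  = \sqrt{\tfrac{h^2}{2}}
  = \tfrac{h}{\sqrt 2} \approx 0.707\,h < h,
\qquad
\underbrace{2}_{\text{squares}} \times \underbrace{4}_{\text{small squares}} \times \underbrace{1}_{\text{point each}} - \underbrace{1}_{p_i} = 7
```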

Implementation¶

We use a structure to store points and define a function object for sorting:

Structure definition

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <vector>
using namespace std;

struct pt {
  int x, y, id;
};

struct cmp_x {
  bool operator()(const pt& a, const pt& b) const {
    return a.x < b.x || (a.x == b.x && a.y < b.y);
  }
};

struct cmp_y {
  bool operator()(const pt& a, const pt& b) const { return a.y < b.y; }
};

int n;
vector<pt> a;

To facilitate recursion, we introduce the upd_ans() helper function to calculate the distance between two points and try to update the answer:

function to update answer

double mindist;
int ansa, ansb;

inline void upd_ans(const pt& a, const pt& b) {
  // Convert to double before multiplying so the squares cannot overflow int.
  double dx = (double)a.x - b.x, dy = (double)a.y - b.y;
  double dist = sqrt(dx * dx + dy * dy);
  if (dist < mindist) mindist = dist, ansa = a.id, ansb = b.id;
}

Here is the recursion itself. Suppose that a[] has been sorted by $x_i$ before the call. If $r - l$ is too small, compute $h$ by brute force and terminate the recursion.

We use std::merge() to perform the merge step of merge sort, and create an auxiliary buffer t[] in which $B$ is stored.

main function

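void rec(int l, int r) {
  // Small range: compare all pairs directly, then leave a[l..r] sorted by y.
  if (r - l <= 3) {
    for (int i = l; i <= r; ++i)
      for (int j = i + 1; j <= r; ++j) upd_ans(a[i], a[j]);
    sort(a.begin() + l, a.begin() + r + 1, cmp_y());
    return;
  }

  int m = (l + r) >> 1;
  int midx = a[m].x;  // x-coordinate of the dividing line
  rec(l, m), rec(m + 1, r);

  // Merge the two halves, each already sorted by y, through the buffer t.
  static vector<pt> t;
  if (t.size() < a.size()) t.resize(a.size());
  merge(a.begin() + l, a.begin() + m + 1, a.begin() + m + 1,
        a.begin() + r + 1, t.begin(), cmp_y());
  copy(t.begin(), t.begin() + (r - l + 1), a.begin() + l);

  // t[0..tsz) now collects the points near the dividing line, ordered by y.
  int tsz = 0;
  for (int i = l; i <= r; ++i)
    if (abs(a[i].x - midx) < mindist) {
      for (int j = tsz - 1; j >= 0 && a[i].y - t[j].y < mindist; --j)
        upd_ans(a[i], t[j]);
      t[tsz++] = a[i];
    }
}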

In the main function, start recursion like this:

Function call interface

sort(a.begin(), a.end(), cmp_x());
mindist = 1E20;
rec(0, n - 1);

Generalization: the smallest perimeter triangle on the plane¶

Interestingly, the above algorithm extends to the following problem: among a given set of points, select three points such that the sum of their pairwise distances is the smallest.

The algorithm remains largely unchanged. At each step it tries to find a triangle with a perimeter smaller than the current answer $d$: put all points whose abscissa differs from $x_m$ by less than $\frac{d}{2}$ into the set $B$, and try to update the answer. (The longest side of a triangle with perimeter $d$ is less than $\frac{d}{2}$.)
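The adaptation can be sketched as follows. This is one possible reading of the description above, not code from the original article: the names `Point`, `rec_triangle` and `min_perimeter_triangle` are hypothetical, coordinates are assumed to be doubles, and the recursion uses half-open ranges `[l, r)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Point {
  double x, y;
};

double perimeter(const Point& a, const Point& b, const Point& c) {
  return std::hypot(a.x - b.x, a.y - b.y) + std::hypot(b.x - c.x, b.y - c.y) +
         std::hypot(c.x - a.x, c.y - a.y);
}

// Handles a[l, r), sorted by x on entry and by y on exit; lowers `best`.
void rec_triangle(std::vector<Point>& a, std::vector<Point>& buf, int l, int r,
                  double& best) {
  auto by_y = [](const Point& p, const Point& q) { return p.y < q.y; };
  if (r - l <= 5) {  // small range: try every triple
    for (int i = l; i < r; ++i)
      for (int j = i + 1; j < r; ++j)
        for (int k = j + 1; k < r; ++k)
          best = std::min(best, perimeter(a[i], a[j], a[k]));
    std::sort(a.begin() + l, a.begin() + r, by_y);
    return;
  }
  int m = (l + r) / 2;
  double mid_x = a[m].x;  // read before the halves are reordered by y
  rec_triangle(a, buf, l, m, best);
  rec_triangle(a, buf, m, r, best);
  std::merge(a.begin() + l, a.begin() + m, a.begin() + m, a.begin() + r,
             buf.begin(), by_y);
  std::copy(buf.begin(), buf.begin() + (r - l), a.begin() + l);

  int sz = 0;  // buf[0, sz) holds the strip |x - mid_x| < best / 2
  for (int i = l; i < r; ++i) {
    if (std::fabs(a[i].x - mid_x) >= best / 2) continue;
    // Both partner points must lie less than best / 2 below a[i].
    for (int j = sz - 1; j >= 0 && a[i].y - buf[j].y < best / 2; --j)
      for (int k = j - 1; k >= 0 && a[i].y - buf[k].y < best / 2; --k)
        best = std::min(best, perimeter(a[i], buf[j], buf[k]));
    buf[sz++] = a[i];
  }
}

// Returns the smallest perimeter, or +infinity for fewer than three points.
double min_perimeter_triangle(std::vector<Point> pts) {
  double best = std::numeric_limits<double>::infinity();
  if (pts.size() < 3) return best;
  std::sort(pts.begin(), pts.end(), [](const Point& p, const Point& q) {
    return p.x < q.x || (p.x == q.x && p.y < q.y);
  });
  std::vector<Point> buf(pts.size());
  rec_triangle(pts, buf, 0, static_cast<int>(pts.size()), best);
  return best;
}
```

The base case stops at five points so that every recursive half contains at least three points and therefore always yields a finite `best` before the strip is scanned.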

Non-divide and conquer algorithm¶

In fact, besides the divide-and-conquer algorithm described above, there is another, non-divide-and-conquer algorithm whose time complexity is also $O(n \log n)$.

We can borrow a common idea for processing sequences: for each element, add to the answer the contribution of that element paired with every element to its left. The same idea can be applied to the closest pair problem on a plane.

Specifically, we sort all points with $x_i$ as the first key and $y_i$ as the second key, and maintain a multiset ordered with $y_i$ as the first key and $x_i$ as the second key. Let $d$ denote the current answer. For each position $i$, we do the following:

Delete from the set all points $p_j$ satisfying $x_i - x_j \ge d$. They can no longer contribute to the answer.
For every point $p_j$ in the set satisfying $\lvert y_i - y_j \rvert < d$, compute its distance to $p_i$ and update the answer.
Insert $p_i$ into the set.

Since each point is inserted and deleted at most once, inserting and deleting points takes $O(n \log n)$ time in total. The complexity proof for the step that updates the answer is similar to the one for the divide-and-conquer algorithm; you may wish to try it yourself.

Template code

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <set>
const int N = 200005;
int n;
double ans = 1e20;
struct point {
  double x, y;
  point(double x = 0, double y = 0) : x(x), y(y) {}
};

struct cmp_x {
  bool operator()(const point &a, const point &b) const {
    return a.x < b.x || (a.x == b.x && a.y < b.y);
  }
};

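struct cmp_y {
  // Order by y, then by x, so that s.find() locates exactly the given point.
  bool operator()(const point &a, const point &b) const {
    return a.y < b.y || (a.y == b.y && a.x < b.x);
  }
};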

inline void upd_ans(const point &a, const point &b) {
  double dist = sqrt(pow((a.x - b.x), 2) + pow((a.y - b.y), 2));
  if (ans > dist) ans = dist;
}

point a[N];
std::multiset<point, cmp_y> s;

int main() {
  scanf("%d", &n);
  for (int i = 0; i < n; i++) scanf("%lf%lf", &a[i].x, &a[i].y);
  std::sort(a, a + n, cmp_x());
  for (int i = 0, l = 0; i < n; i++) {
    while (l < i && a[i].x - a[l].x >= ans) s.erase(s.find(a[l++]));
    for (auto it = s.lower_bound(point(a[i].x, a[i].y - ans));
         it != s.end() && it->y - a[i].y < ans; it++)
      upd_ans(*it, a[i]);
    s.insert(a[i]);
  }
  printf("%.4lf", ans);
  return 0;
}

Practice problems¶

References¶

The divide and conquer algorithm part in this page is mainly translated from the blog post Нахождение пары ближайших точек and its English version Finding the nearest pair of points. The copyright agreement for the Russian version is Public Domain + Leave a Link; the copyright agreement for the English version is CC-BY-SA 4.0.

Zhihu column: computational geometry-nearest point pair problem

Contributor of this article OI-wiki
The article is available under CC BY-SA 4.0 & SATA ; additional terms may apply.