Find the shortest pairwise distance of points in o(n log n)?


The following exercise was handed to the students I supervise:

Given n points in the plane, devise an algorithm that finds a pair of points whose distance is minimal among all pairs of points. The algorithm must run in time o(n²).

There is a (relatively) simple divide-and-conquer algorithm that solves the task in time Θ(n log n).
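
For illustration, a minimal sketch of such a divide-and-conquer algorithm (assuming the points are distinct (x, y) tuples; the helper names here are made up):

import math

def closest_pair_distance(points):
    # Smallest pairwise distance among 2D points; a minimal sketch only.
    if len(points) < 2:
        return math.inf
    px = sorted(points)                        # sorted by x (then y)
    py = sorted(points, key=lambda p: p[1])    # sorted by y
    return _closest(px, py)

def _closest(px, py):
    n = len(px)
    if n <= 3:                                 # brute-force the base case
        return min(math.dist(a, b)
                   for i, a in enumerate(px) for b in px[i + 1:])
    mid = n // 2
    mid_x = px[mid][0]
    left = set(px[:mid])                       # distinctness makes this split exact
    d = min(_closest(px[:mid], [p for p in py if p in left]),
            _closest(px[mid:], [p for p in py if p not in left]))
    # Only points within d of the dividing line matter; in y-order, each one
    # needs to be compared with at most the next 7 strip points.
    strip = [p for p in py if abs(p[0] - mid_x) < d]
    for i, a in enumerate(strip):
        for b in strip[i + 1:i + 8]:
            d = min(d, math.dist(a, b))
    return d

Splitting on the median x-coordinate and filtering the pre-sorted y-list keeps every level of the recursion linear, which is where the Θ(n log n) bound comes from.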

Question 1: Is there an algorithm that solves the given problem exactly in worst-case time o(n log n)?

What made me suspect that this might be possible is a result I remember seeing in some talk (reference appreciated). It stated something along the lines that no more than a constant number c ∈ ℕ of points can be arranged in the plane around a point p inside a circle of radius r ∈ ℝ, with r the minimal distance between any two of the points involved. I think c = 7, the points forming an equilateral hexagon with p at the center (in the extreme case).
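
A sketch of why such a bound holds (assuming, as above, that r is the minimal distance between any two of the points involved): for two points p1, p2 of the set that both lie within distance r of p, write a = dist(p, p1), b = dist(p, p2) and θ = ∠p1 p p2. The law of cosines gives

r² ≤ dist(p1, p2)² = a² + b² − 2ab·cos θ.

If cos θ were greater than 1/2, the right-hand side would be strictly smaller than a² + b² − ab ≤ r² (using a, b ≤ r), a contradiction. So any two such points are at least 60° apart as seen from p, at most 6 of them fit inside the circle, and c ≤ 7 counting p itself.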

In that case, the following algorithm should solve the problem in n steps.

fun mindist []         = INFINITY
|   mindist p::[]      = INFINITY
|   mindist p1::p2::[] = dist(p1, p2)
|   mindist p::r       = let m = mindist(r) in
                           min(m, nextNeighbour(p, r, m))
                         end

Note that this is (claimed to be) linear time, since only a constant number of points in r can be closer to p than m (assuming the statement above); only these points have to be investigated in order to find a new minimum. There is a catch, of course: how do you implement nextNeighbour (maybe with linear-time preprocessing)?

Question 2: Let R be a set of points and p ∉ R. Let m ∈ ℝ with

m ≤ min { dist(p1, p2) | p1, p2 ∈ R }

and

R_{p,m} := { p' ∈ R | dist(p, p') ≤ m }.

Assume R_{p,m} is finite. Is it possible to find p' ∈ R_{p,m} with minimal distance from p in (amortised) time O(1)? (You may assume R to be constructed by adding the investigated points p one by one.)

Raphael
I'd propose to search with "closest pair" as a keyword.
Yoshio Okamoto
This is all standard stuff by now, see first two chapters here: goo.gl/pLiEO
Sariel Har-Peled
PS: If you want expected time, then you can even compute the Delaunay triangulation, which contains the minimum distance.
domotorp
After question 1 you write "not more than a constant number of points can be arranged in the plane around some point p inside a circle of radius r, with r the minimal distance between p and any other point." This is certainly not true: You can take any number of points on the circle of radius r. Your statement is true if r is the minimal distance between any two points, in which case the proof is quite simple.
domotorp
The first question is textbook stuff, as already pointed out: definitely not research level. I do not understand the second question: for any m, the p' you are asking for either doesn't exist or is the closest neighbor to p in R. So how is this different from question 1? What are you amortizing over (i.e., if this is a data structure question, what are the updates and queries)?
Sasho Nikolov

Answers:


It is impossible to solve the problem in less than cn log n time in standard models, e.g. using algebraic decision trees. This follows from the work of Yao and Ben-Or, which shows that in this model it is not possible to decide whether a set of n input numbers are all different or not (see http://people.bath.ac.uk/masnnv/Teaching/AAlg11_8.pdf). In the case of your problem, imagine that all of the points lie on the real line. If two points are the same, then your output would be two points with distance zero, while otherwise it would not be, so a solution to your problem would also solve the DISTINCT NUMBERS problem. If you want to suppose that all your points are different, then just add iε to the inputs x_i of the DISTINCT NUMBERS problem; in this case, if your output is at most nε, then the numbers are not all distinct. (Although in this case you have to use a promise version where the difference of any two distinct numbers is at least 2nε, but I think the same proof works to show that you also need Ω(n log n) in this case.)
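
To make the reduction concrete, a small illustrative sketch (closest_pair stands for any assumed oracle that returns a pair of points, at distinct indices, with minimal distance):

import math

def all_distinct(xs, closest_pair):
    # Decide DISTINCT NUMBERS using a closest-pair oracle for planar points.
    if len(xs) < 2:
        return True
    points = [(float(x), 0.0) for x in xs]   # place the numbers on the real line
    p, q = closest_pair(points)
    return math.dist(p, q) > 0               # minimal distance is 0 iff some value repeats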

domotorp
Indeed, thanks. That answers question 2 (negatively), too.
Raphael
The problem you mention is apparently also known as the Element Distinctness Problem.
Raphael
@Sariel Har-Peled's reference (goo.gl/pLiEO) presents a practical linear-time algorithm for finding the closest pair. That document directly addresses this argument, and explains that it does not apply because the algorithm uses a more powerful computation model.
kevin cline
Yes, but the question specifically asked for worst-case running time. But I agree that all my observations already appear in Sariel's thesis.
domotorp

There is a randomized linear expected time algorithm by Rabin; see e.g. Rabin Flips a Coin on Lipton's blog.

David Eppstein

As far as I understand question 2, Rabin's algorithm provides a kind of answer to that too. At each time step the data structure is a grid whose cell width is less than half the smallest distance between pairs of points seen so far (so that no cell contains more than a single point). To answer the query in question 2, you only need to map p to a cell in the grid and look at a constant number of cells around it. By the analysis of the algorithm, if the input point set is examined in random order, then the amortized update time for the grid is O(1) per new point in expectation.
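
A rough sketch of the kind of grid lookup being described (illustrative only: the names are made up, p is assumed not to be stored in the grid yet, and the rebuild of the grid when the minimum distance shrinks is omitted):

import math
from collections import defaultdict

def cell(q, width):
    # Integer cell coordinates of point q in a grid of the given cell width.
    return (math.floor(q[0] / width), math.floor(q[1] / width))

def build_grid(points, width):
    # Bucket the points by their grid cell.
    grid = defaultdict(list)
    for q in points:
        grid[cell(q, width)].append(q)
    return grid

def nearest_within(grid, width, d, p):
    # Closest stored point within distance d of p, or None. When width is a
    # fixed fraction of d (e.g. about d/2), only a constant number of cells
    # need to be inspected.
    cx, cy = cell(p, width)
    reach = math.ceil(d / width)             # how many cells away a hit can be
    best, best_dist = None, d
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for q in grid.get((cx + dx, cy + dy), []):
                dq = math.dist(p, q)
                if dq <= best_dist:
                    best, best_dist = q, dq
    return best

In the incremental algorithm, whenever a newly inserted point decreases the current minimum distance, the grid is rebuilt with a smaller cell width; the random-order analysis mentioned above is what makes these rebuilds cheap, giving the expected amortized O(1) cost per insertion.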

Sasho Nikolov
BTW, I am referring not to the version of the algorithm as Lipton describes it, but to the one described in the first comment (and in the Kleinberg and Tardos book).
Sasho Nikolov
Only in expectation; see domotorp's answer.
Raphael
I don't see anywhere that you wanted to restrict yourself to deterministic algorithms. Rabin's algorithm is interesting precisely because it goes around the decision tree lower bounds (this is similar to lower bounds for comparison sorting vs. integer sorting algorithms). BTW, there is probably more that Rabin uses to get around the lower bound, namely the hashing trick used to access the grid.
Sasho Nikolov
Re the "more that Rabin uses": also the ability to round real-number inputs to integers. One needs to be very careful with this: if you set up a model of computation in which one can do standard arithmetic operations and rounding on real numbers, all in constant time per operation, then it's possible to solve PSPACE-complete problems in polynomial time in this model. But Rabin only rounds input numbers (to different levels of precision in different iterations), and this circumscribed form of rounding isn't problematic.
David Eppstein
@SashoNikolov I was looking for worst-case time in o(n log n), so also worst-case O(1) in question 2. This has nothing to do with whether the algorithm is deterministic. I am not sure what happens if you combine expected and amortised time.
Raphael