Discretization

Introduction

Discretization is essentially a kind of hashing that preserves order: after the mapping, the data still has its original total or partial order relationship.

Informally, discretization applies when the values are too large, or of a type that cannot serve as an array index, and the answer depends only on the relative order of the elements rather than on their actual magnitudes. In that case each value can be replaced by its rank among the distinct values, and the problem is solved on the ranks instead.
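To make the idea concrete, here is a minimal sketch of one typical use: counting occurrences of values that are far too large to index an array directly. The function name `countByRank` is hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: counts how often each distinct value occurs,
// using the value's 0-based rank as the array index.
std::vector<int> countByRank(const std::vector<long long>& data) {
  // Sorted list of distinct values; a value's position here is its rank.
  std::vector<long long> sorted(data);
  std::sort(sorted.begin(), sorted.end());
  sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

  std::vector<int> count(sorted.size(), 0);
  for (long long x : data) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), x);
    assert(it != sorted.end() && *it == x);  // every x came from data
    ++count[it - sorted.begin()];
  }
  return count;
}
```

For the input {1000000000000, 5, 70000, 5}, the distinct values in increasing order are 5, 70000, 1000000000000, so the counting array needs only three cells, indexed 0 to 2.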

The data being discretized can be large integers, floating-point numbers, strings, and so on.
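Because only comparisons are needed, the same procedure can be written once for any comparable type. The template below is a sketch under that assumption; `ranksOf` is a hypothetical name, and floating-point input is assumed to contain no NaN values, since those cannot be sorted meaningfully.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical generic helper: returns the 0-based rank of every element.
// T must support operator< and operator==.
template <typename T>
std::vector<int> ranksOf(const std::vector<T>& data) {
  std::vector<T> sorted(data);
  std::sort(sorted.begin(), sorted.end());
  sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

  std::vector<int> ranks;
  ranks.reserve(data.size());
  for (const T& x : data) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), x);
    assert(it != sorted.end() && *it == x);  // x is always present
    ranks.push_back(static_cast<int>(it - sorted.begin()));
  }
  return ranks;
}
```

With strings, for example, the input {"pear", "apple", "pear", "fig"} maps to the ranks {2, 0, 2, 1}, because the distinct values in lexicographic order are "apple", "fig", "pear".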

Implementation

In C++, discretization can be done with ready-made STL algorithms:

Discretizing an array

Discretizing an array and then querying it is a common use case:

// a[i] is the initial array, with indices in the range [1, n]
// len is the number of distinct values after discretization
std::sort(a + 1, a + 1 + n);
// std::unique compacts the distinct values to the front of the sorted range,
// so afterwards a[1..len] holds them in increasing order
len = std::unique(a + 1, a + n + 1) - a - 1;

After completing the above discretization, you can use the std::lower_bound function to find the rank (that is, the new index):

std::lower_bound(a + 1, a + len + 1, x) - a;  // index (rank) of x after discretization; x must be one of the original values
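Put together, the two steps above might look like the following sketch. The bound `MAXN` and the function names `build` and `rankOf` are assumptions made for this example and do not come from the article.

```cpp
#include <algorithm>
#include <cassert>

const int MAXN = 100005;  // assumed upper bound on n, chosen for illustration
int a[MAXN];              // values, stored at indices [1, n]
int len;                  // number of distinct values after discretization

// Sorts and deduplicates a[1..n]; afterwards a[1..len] holds the distinct
// values in increasing order.
void build(int n) {
  std::sort(a + 1, a + 1 + n);
  len = static_cast<int>(std::unique(a + 1, a + 1 + n) - a - 1);
}

// Returns the 1-based rank of x, which must be one of the original values.
int rankOf(int x) {
  int* pos = std::lower_bound(a + 1, a + len + 1, x);
  assert(pos != a + len + 1 && *pos == x);
  return static_cast<int>(pos - a);
}
```

Note that sorting destroys the original order of `a`. If the rank of each original position is needed later, keep a copy of the input before calling `build`.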

Similarly, a std::vector can be discretized:

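// std::vector<int> a, b;  // b is a copy of a, made before sorting
std::sort(a.begin(), a.end());
a.erase(std::unique(a.begin(), a.end()), a.end());
// replace each element of b with its 0-based rank among the distinct values
for (std::size_t i = 0; i < b.size(); ++i)
  b[i] = std::lower_bound(a.begin(), a.end(), b[i]) - a.begin();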
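The sorted, deduplicated vector also serves as the inverse mapping: indexing it with a rank gives back the original value. The sketch below bundles both directions; the struct `Discretized` and the functions `discretize` and `originalValue` are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container for the result of discretization.
struct Discretized {
  std::vector<int> values;  // sorted distinct original values
  std::vector<int> ranks;   // ranks[i] is the 0-based rank of the i-th input
};

Discretized discretize(const std::vector<int>& input) {
  Discretized d;
  d.values = input;
  std::sort(d.values.begin(), d.values.end());
  d.values.erase(std::unique(d.values.begin(), d.values.end()),
                 d.values.end());
  d.ranks.reserve(input.size());
  for (int x : input) {
    d.ranks.push_back(static_cast<int>(
        std::lower_bound(d.values.begin(), d.values.end(), x) -
        d.values.begin()));
  }
  return d;
}

// Maps a rank back to the original value it stands for.
int originalValue(const Discretized& d, int rank) {
  assert(0 <= rank && rank < static_cast<int>(d.values.size()));
  return d.values[rank];
}
```

Since the ranks are assigned in increasing order of value, comparing two ranks gives the same result as comparing the two original values, which is the order-preserving property mentioned in the introduction.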

Contributor of this article GavinZhengOI
The article is available under CC BY-SA 4.0 & SATA ; additional terms may apply.
