The Charm of Data Structures 1: Leetcode 1906.
Leetcode 1906. Minimum Absolute Difference Queries: Segment tree or binary search
Leetcode 1906. Minimum Absolute Difference Queries: Segment tree or binary search
This series(The Charm of Data Structures) of articles will introduce the clever application of data structures in algorithm problems, allowing us to appreciate the charm of data structures. This article is the first in this series.
Problem description
The minimum absolute difference of an array a
is defined as the minimum value of |a[i] - a[j]|
, where 0 <= i < j < a.length
and a[i] != a[j]
. If all elements of a
are the same, the minimum absolute difference is -1
.
For example, the minimum absolute difference of the array
[5,2,3,7,2]
is|2 - 3| = 1
. Note that it is not0
becausea[i]
anda[j]
must be different.
You are given an integer array nums
and the array queries
where queries[i] = [li, ri]
. For each query i
, compute the minimum absolute difference of the subarray nums[li...ri]
containing the elements of nums
between the 0-based indices li
and ri
(inclusive).
Return an array ans
where ans[i]
is the answer to the ith
query.
The value of |x|
is defined as:
x
ifx >= 0
.x
ifx < 0
.
Example 1:
Input: nums = [1,3,4,8], queries = [[0,1],[1,2],[2,3],[0,3]]
Output: [2,1,4,1]
Explanation: The queries are processed as follows:
- queries[0] = [0,1]: The subarray is [1,3] and the minimum absolute difference is |1-3| = 2.
- queries[1] = [1,2]: The subarray is [3,4] and the minimum absolute difference is |3-4| = 1.
- queries[2] = [2,3]: The subarray is [4,8] and the minimum absolute difference is |4-8| = 4.
- queries[3] = [0,3]: The subarray is [1,3,4,8] and the minimum absolute difference is |3-4| = 1.
Constraints:
2 <= nums.length <= 10⁵
1 <= nums[i] <= 100
1 <= queries.length <= 2 * 10⁴
0 <= li < ri < nums.length
https://leetcode.com/problems/minimum-absolute-difference-queries/
Thought process
The most intuitive solution for this problem is to, for each query interval [l, r]
, if we can quickly find the Minimum Absolute Difference in it, then we can directly return it.
However, within our limited knowledge, there is no algorithm or data structure that can directly maintain the information of the Minimum Absolute Difference of each sub-interval of the array.
So we have to seek a secondary solution, since we already know the interval, can we directly calculate the Minimum Absolute Difference for each query interval? Unfortunately, this requires calculating between each pair of elements, and the time complexity is O(n²). Adding the outer loop of the query enumeration, the total time complexity is O(n³), which clearly exceeds the time limit.
Furthermore, is it possible to sort the elements in the query range and calculate them in ascending order? This can reduce the time complexity to O(n log n). Adding the outer loop for the query enumeration, the overall time complexity becomes O(n² log n). Although the time complexity has decreased, it may still exceed the time limit.
So we can see that the bottleneck of time complexity lies in sorting the elements in the query interval every time we make a query.
Is there an algorithm or data structure that can quickly return the sorted elements in the query interval during each query? That would be great.
Segment tree and Bitset
We know that for interval query problems, segment tree is an expert in handling them(If you are not familiar with segment trees, you can take a look at Segment Tree 1: Basic Concepts and Operations in Detail), such as interval maximum, sum, etc.
Moreover, this problem has a feature that the value range of array elements is [1, 100]
, which can be used as a breakthrough point.
The node of the segment tree can maintain the information of the element value range of the interval it governs, which can be implemented using C++ bitset.
When querying, simply return the corresponding bitset of the value range of the query interval. For instance, for example 1:
Input: nums = [1,3,4,8], queries = [[0,1],[1,2],[2,3],[0,3]]
When the query interval is [0, 1]
and the values of the elements are 1
and 3
, the biset<101>
returned by query(…) is:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001010
When the query interval is [1, 2]
and the element values are 3
and 4
, the biset<101>
returned by query(…)
is:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011000
According to the bitset result, since it is already sorted naturally, we can use greedy algorithm to solve the Minimum Absolute Difference in one traversal. It is a bit similar to the idea of counting sort.
Since the value range is [1, 100], then the time complexity is constant, therefore the time complexity of one query is also controlled at O(logN).
Algorithm code is as follows:
const int N = 1e5 + 9;
// Each node (both internal and leaf nodes) of the segment tree is represented
// as a bitset, where each node represents the range of values from 1 to 100 that it covers.
bitset<101> sum[4 * N];
void push_up(int rt)
{
sum[rt] = sum[rt * 2] | sum[rt * 2 + 1];
}
void build(vector<int> &A, int rt, int L, int R)
{
if (L == R)
sum[rt].set(A[L]); // sum[rt] is a node that governs A[L], set the bit when initializing.
else
{
int mid = (L + R) / 2;
build(A, rt * 2, L, mid);
build(A, rt * 2 + 1, mid + 1, R);
push_up(rt);
}
}
// rt: current tree node number
// [L, R]: the interval corresponding to node rt
// [ql, qr]: the interval range we want to query,
// in other words [ql, qr] is our target search range
inline bitset<101> query(int rt, int L, int R, int ql, int qr)
{
if (ql > R or qr < L)
return bitset<101>(); // Case 1 (disjoint)
if (ql <= L and R <= qr)
return sum[rt]; // Case 2 (fully contained)
int mid = (L + R) >> 1; // Case 3 (intersection)
return query(2 * rt, L, mid, ql, qr) | query(2 * rt + 1, mid + 1, R, ql, qr);
}
class Solution {
public:
vector<int> minDifference(vector<int>& nums, vector<vector<int>>& queries) {
memset(sum, 0, sizeof(sum));
int n = nums.size();
nums.insert(nums.begin(), 0); // Convert to 1-indexed
build(nums, 1, 1, n);
vector<int> res;
for(auto &q: queries)
{
// The returned mp is a bitset.
auto mp = query(1, 1, n, q[0] + 1, q[1] + 1); // Convert to 1-indexed
// find the first value in [q[0] + 1, q[1] + 1]
int i = 0;
while(i < 101 and mp[i] != 1)
i++;
// find the Minimum Absolute Difference in [q[0] + 1, q[1] + 1]
int mi = 0x3f3f3f3f, prev = i;
for(int j = i+1; j < 101; j++)
if(mp[j] == 1)
mi = min(mi, j - prev), prev = j;
res.emplace_back(mi == 0x3f3f3f3f ? -1 : mi);
}
return res;
}
};
Hash table and binary search
Another solution to this problem is to use a hash table to count the positions of each element that appears, and then for each query interval, starting from the value range, sequentially perform binary search on each element.
For the elements within the query interval, use the greedy algorithm above to calculate the Minimum Absolute Difference.
The code is as follows:
class Solution:
def minDifference(self, nums: List[int], queries: List[List[int]]) -> List[int]:
mp = defaultdict(list)
for i in range(len(nums)):
mp[nums[i]].append(i) # Count the positions of elements that have occurred
res = []
for l, r in queries:
pre, mi = -inf, inf
for k in sorted(mp): # sequentially perform binary search on each element
i = bisect_left(mp[k], l)
if i < len(mp[k]) and mp[k][i] <= r:
if pre != -inf:
mi = min(mi, k - pre) # if it is within the interval
pre = k
if mi < inf:
res.append(mi)
else:
res.append(-1)
return res
Conclusion
Through the gradual progression of our thought process, we discovered that we can lower time complexity by cleverly constructing data structures. This is the charm of data structures.
Additionally, the advanced version of this problem is Codeforces 765F , readers who are interested can give it a try.
Finally, if there are any omissions or errors in this article, please kindly advise me. Thank you.