Lazy Segment Trees (Recursive)

Written by Timon Gehr.

This article builds on the previous article about recursive segment trees.

Here, we will focus on operation conquer = max, but similar ideas can be applied more generally. Recall that we can build a recursive segment tree with range maximum queries as follows:

vector<int> b;
int build(int l, int r, int i) {
    if (l + 1 == r) return b[i] = a[l];
    int m = l + (r-l)/2;
    return b[i] = max(build(l, m, 2*i+1), build(m, r, 2*i+2));
}
b.resize(4 * n);
build(0, n, 0);

This then allows range queries like this:

// compute max(a[l],…,a[r-1])
int query_range(int l, int r, int i, int ql, int qr) {
    if (r <= ql || qr <= l) return -INF;
    if (ql <= l && r <= qr) return b[i];
    int m = l + (r-l)/2;
    return max(query_range(l, m, 2*i+1, ql, qr), query_range(m, r, 2*i+2, ql, qr));
}

int result = query_range(0, n, 0, ql, qr);

And we can update single elements as follows:

// set a[k] to x and update segment tree
int update(int l, int r, int i, int k, int x) {
    if (!(l<=k && k<r)) return b[i];
    if (l + 1 == r) return b[i] = a[k] = x;
    int m = l + (r-l)/2;
    return b[i] = max(update(l, m, 2*i+1, k, x), update(m, r, 2*i+2, k, x));
}

int result = update(0, n, 0, k, x);

We now investigate how to allow updating entire ranges of the segment tree to a new value.

A naive approach would be the following:

// set a[ul],…,a[ur-1] to x and update segment tree
void update_range(int ul, int ur, int x) {
    for (int k = ul; k < ur; k++)
        update(0, n, 0, k, x);
}

update_range(ul, ur, x);

Unfortunately, this is quite slow: we use O((urul)logn)O((\mathit{ur}-\mathit{ul})\cdot \log n) operations.

Note that there are some segment tree nodes that we visit multiple times. For example, the root node 00 is visited once for every update. We can therefore slightly improve upon the approach from above by updating each node at most once.

// set a[ul],…,a[ur-1] to x and update segment tree
int update_range(int l, int r, int i, int ul, int ur, int x) {
    if (r <= ql || qr <= l) return b[i];    // no update required for current node
    if (l + 1 == r) return b[i] = a[l] = x; // leaf update
    int m = l + (r-l)/2;
    return b[i] = max(update_range(l, m, 2*i+1, ul, ur, x), update_range(m, r, 2*i+2, ul, ur, x));
}

update_range(0, n, 0, ul, ur);

To realize a range update in a standard segment tree, in the worst case, Ω(urul+logn)\Omega(\mathit{ur}-\mathit{ul}+\log n) values stored in the segment tree have to change, and the code above indeed has running time O(urul+logn)O(\mathit{ur}-\mathit{ul}+\log n). Therefore, this code is asymptotically optimal for standard segment trees and to get a better running time for range updates, we have to change the data structure itself.

Lazy Propagation

We will use lazy propagation to realize range updates in time O(logn)O(\log n).

Comparing our implementations of query_range and update_range, note that the salient difference is that in query_range we can terminate early once the query range [ql,qr)[\mathit{ql},\mathit{qr}) completely covers the node range [l,r)[\mathit{l},\mathit{r}). We might therefore attempt something along the following lines for range updates:

// set a[ul],…,a[ur-1] to x and update segment tree
int update_range_incomplete(int l, int r, int i, int ul, int ur, int x) {
    if (r <= ul || ur <= l) return b[i];    // no update required for current node
    if (ul <= l && r <= ur) { // node's range completely covered by range to update
        // ... (something still missing here)
        return b[i] = x;
    }
    if (l + 1 == r) return b[i] = a[l] = x; // leaf update
    int m = l + (r-l)/2;
    return b[i] = max(update_range(l, m, 2*i+1, ul, ur, x), update_range(m, r, 2*i+2, ul, ur, x));
}

update_range(0, n, 0, ul, ur, x);

Why is this not enough? Well, now we will never apply the update to nodes below a node that was completely covered by the update range. The main insight behind lazy propagation is that it is enough to simply record such missing updates and then apply them once we actually have to read the value. This means whenever we want to access the children of some node i, we will first call a new function propagate. Note that we have to change all other functions operating on the segment tree such that they call propagate when needed.

int update_children(int l, int r, int i, int ul, int ur, int x) { // (this is just the last two lines of update_range factored out)
    int m = l + (r-l)/2;
    return b[i] = max(update_range(l, m, 2*i+1, ul, ur, x), update_range(m, r, 2*i+2, ul, ur, x));
}

struct ToDo{
    bool needs_update; // whether or not we actually need to apply an update
    int x;             // the new value
};
vector<ToDo> lazy; // stores updates to be applied later for each segment tree node

void propagate(int l, int r, int i){
    if (lazy[i].needs_update) {
        update_children(l, r, i, l, r, lazy[i].x);
        lazy[i].needs_update = false;
    }
}

int update_range(int l, int r, int i, int ul, int ur, int x) {
    if (r <= ul || ur <= l) return b[i];    // no update required for current node
    if (l + 1 == r) return b[i] = a[l] = x; // leaf update
    if (ul <= l && r <= ur) { // node's range completely covered by range to update
        lazy[i] = ToDo{true, x}; // instead of calling update_range for both children, we just procrastinate
        return b[i] = x;
    }
    propagate(l, r, i); // we want to read values in subtree of node i, need to propagate any pending operations
    return update_children(l, r, i, ul, ur, x);
}

(Note that this code has mutually recursive functions, therefore update_children has a forward reference to update_range. To make this compile, we either have to create a forward declaration of update_range at the very top of the code listing [int update_range(int l, int r, int i, int ul, int ur, int x);], or we can just encapsulate our segment tree in a struct, as member functions can forward reference each other in C++.)

To support both range updates and range queries, we also have to fix query_range so it upholds the new datastructure invariants:

// compute max(a[l],…,a[r-1])
int query_range(int l, int r, int i, int ql, int qr) {
    if (r <= ql || qr <= l) return -INF;
    if (ql <= l && r <= qr) return b[i];
    propagate(l, r, i); // we want to read values in subtree of node i, need to propagate any pending operations
    int m = l + (r-l)/2;
    return max(query_range(l, m, 2*i+1, ql, qr), query_range(m, r, 2*i+2, ql, qr));
}

int result = query_range(0, n, 0, ql, qr);

Building the segment tree works just as before, but we also have to initialize the lazy values:

vector<int> b;
int build(int l, int r, int i) {
    if (l + 1 == r) return b[i] = a[l];
    int m = l + (r-l)/2;
    return b[i] = max(build(l, m, 2*i+1), build(m, r, 2*i+2));
}
b.resize(4 * n);
lazy.resize(2 * n); // we just had to add this
build(0, n, 0);

On this lazy segment tree data structure, both range updates and range queries now run in time O(logn)O(\log n), because they perform a constant number of operations at each level of the segment tree.

Additive Updates

Because max(a+c,b+c)=max(a,b)+c\max(a+c, b+c)=\max(a, b)+c, we can support additive updates instead. I.e., instead of replacing every value in our range by xx, we increase every value in the range by xx. Note that now, we do not need to store a value needs_update, because increasing by 00 already covers this case.

vector<int> lazy; // we just store the total amount by which we have to increase

void propagate(int l, int r, int i){
    if (lazy[i] != 0) {
        update_children(l, r, i, l, r, lazy[i]);
        lazy[i] = 0;
    }
}

// increase each value a[ul],…,a[ur-1] by x
int update_range(int l, int r, int i, int ul, int ur, int x) {
    if (r <= ul || ur <= l) return b[i];    // no update required for current node
    if (l + 1 == r) return b[i] = a[l] += x; // leaf update
    if (ul <= l && r <= ur) { // node's range completely covered by range to update
        lazy[i] += x; // procrastinate
        return b[i] += x;
    }
    propagate(l, r, i); // we want to read values in subtree of node i, need to propagate any pending operations
    return update_children(l, r, i, ul, ur, x);
}

Combining Additive and Destructive Updates

To combine additive and destructive updates, we first create a struct representing an arbitrary operation that we might apply to a range:

struct Operation{
    bool needs_reset; // whether or not we reset the value to 0 before applying the additive update
    int x; // additive update

    int apply(int value) {
        return (needs_reset ? 0 : value) + x;
    }

    bool is_identity() { // whether or not this operation does nothing
        return !needs_reset && x == 0;
    }
};

Operation identity = Operation{false, 0}; // operation that does nothing

We can combine arbitrary operations:

 // compute overall operation we obtain by applying first, then second
Operation combine(Operation first, Operation second) {
    if (second.needs_reset) return second;
    return Operation{first.needs_reset, first.x + second.x};
}

We then write our lazy segment tree accordingly:

vector<Operation> lazy; // we store the operation we still have to apply to each node's range

int update_children(int l, int r, int i, int ul, int ur, const Operation &operation) {
    int m = l + (r-l)/2;
    return b[i] = max(update_range(l, m, 2*i+1, ul, ur, operation), update_range(m, r, 2*i+2, ul, ur, operation));
}

void propagate(int l, int r, int i){
    if (!lazy[i].is_identity()) {
        update_children(l, r, i, l, r, lazy[i].x);
        lazy[i] = identity;
    }
}

// apply operation to each value a[ul],…,a[ur]
int update_range(int l, int r, int i, int ul, int ur, const Operation &operation) {
    if (r <= ul || ur <= l) return b[i];    // no update required for current node
    if (l + 1 == r) return b[i] = a[l] = operation.apply(a[l]); // leaf update
    if (ul <= l && r <= ur) { // node's range completely covered by range to update
        lazy[i] = combine(lazy[i], operation); // procrastinate
        return b[i] = operation.apply(b[i]);
    }
    propagate(l, r, i);
    return update_children(l, r, i, ul, ur, operation);
}

As we can vary the type of values we store in the segment tree as well as the type of supported operations, lazy segment trees are a versatile tool, useful for solving a variety of problems.