The Algorithm Design Manual (3rd) Exercise Solution 'E4-20'

Table of Contents

Introduction
Repository
Exercise 4-20
- Question
- Solution

Introduction

This series is my solutions for algorithm exercises in'The Algorithm Design Manual' 3rd edition.
It's my solutions, not model answers, so if you find some mistakes or more sophisticated solutions, please post in comment area below.

Repository

Here

Exercise 4-20

Question

(a) Give an efficient algorithm to find the second-largest key among $n$ keys. You can do better than $2n − 3$ comparisons.
(b) Then, give an efficient algorithm to find the third-largest key among $n$ keys. How many key comparisons does your algorithm do in the worst case? Must your algorithm determine which key is largest and second-largest in the process?

Solution

Subproblem (a)

Algorithm and Analysis

Use max-heap structure and convert key-list to heap.
But, Bubble-Down algorithm have to be tuned in.

Instead of bubbling down until reaching a leaf node, the element bubbles down only 2 heights. By this modigication, unnecessary comparisons are removed, and after bubbling down, the tree still holds 2 largest elements on the top 2 layers of it (Figure 1).

After heap construction, we compare the top element's two children, and the larger one is the second largest key we're searching.

The number of comparisons is:

\begin{equation} \underbrace{2}_\text{\textdagger1}(\underbrace{0*\frac{n}{2}}_\text{\textdagger2}+\underbrace{1*\frac{n}{4}}_\text{\textdagger3}+\underbrace{2*\frac{n}{4}}_\text{\textdagger4})+1=\frac{3}{2}n+1 \end{equation}

†1: Comparisons per Bubble-Down
†2: The number of leaf nodes
†3: The number of nodes of height 1
†4: The rest of nodes

If n is sufficiently large (n>8), ' $\frac{3}{2}n+1$ ' is smaller than ' $2n-3$ '. So this algorithm is better than brute force algorithm in terms of number of comparisons.

Subproblem (b)

Algorithm and Analysis

We adopt the same strategy as subproblem (a) except the difference that instead of limiting bubble down 2 times we limit it 3 times.

After heap construction we compare the top element's children. Then we compare the smaller one and the larger one's two children (Figure 2).

The number og comparison is:

\begin{equation} 2(0*\frac{n}{2}+1*\frac{n}{4}+2*\frac{n}{8}+3*\frac{n}{8})+3=\frac{7}{4}n+3 \end{equation}

It is still smaller than ' $2n-3$ ' for sufficiently large $n$ .

Code for (a), (b)

#include <algorithm>
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>
size_t pq_parent(const size_t i) { return i / 2; }
size_t pq_young_child(const size_t i) { return i * 2; }
size_t pq_top() { return 1; }
size_t pq_len(const size_t x) { return x - 1; }
int pq_compare(const std::vector<int>& q, const size_t i_parent,
               const std::function<int(const int, const int)> comp) {
    auto i_chd{pq_young_child(i_parent)};
    if (pq_len(q.size()) < i_chd + 1) return q[i_chd];
    return comp(q[i_chd], q[i_chd + 1]);
}
void pq_swap(std::vector<int>& q, const size_t i1, const size_t i2) {
    auto t{q[i1]};
    q[i1] = q[i2];
    q[i2] = t;
}
void bubble_down_limited_descents(std::vector<int>& q, const size_t p,
                                  const size_t cnt, const size_t limit) {
    if (cnt == limit) return;
    auto c{pq_young_child(p)};
    auto min_idx{p};
    for (size_t i = 0; i < 2; i++) {
        if (c + i <= pq_len(q.size()) && q[c + i] > q[min_idx]) min_idx = c + i;
    }
    if (min_idx == p) return;
    // swap
    pq_swap(q, p, min_idx);
    // recurse
    bubble_down_limited_descents(q, min_idx, cnt + 1, limit);
}
std::vector<int> make_heap_limited_descents(const std::vector<int>& v,
                                            const size_t limit) {
    std::vector<int> q(v.size() + 1);
    std::copy(v.cbegin(), v.cend(), q.begin() + 1);

    for (size_t i = pq_len(q.size()) / 2; i >= 1; i--) {
        bubble_down_limited_descents(q, i, 0, limit);
    }
    return q;
}

auto f_int_max{[](const int a, const int b) { return std::max(a, b); }};

int get_second_largest(const std::vector<int>& v) {
    auto q{make_heap_limited_descents(v, 2)};
    if (pq_len(q.size()) < 2) throw std::invalid_argument{"Elements shortage"};
    return pq_compare(q, pq_top(), f_int_max);
}

int get_third_largest(const std::vector<int>& v) {
    auto q{make_heap_limited_descents(v, 3)};
    if (pq_len(q.size()) < 3) throw std::invalid_argument{"Elements shortage"};
    auto i_chd{pq_young_child(pq_top())};
    struct S_idx {
        size_t i_lrg;
        size_t i_sml;
    };
    S_idx s_idx{q[i_chd] > q[i_chd + 1] ? S_idx{i_chd, i_chd + 1}
                                        : S_idx{i_chd + 1, i_chd}};

    if (pq_len(q.size()) < pq_young_child(s_idx.i_lrg)) return q[s_idx.i_sml];
    return std::max(q[s_idx.i_sml], pq_compare(q, s_idx.i_lrg, f_int_max));
}

int main(int argc, char const* argv[]) {
    assert(get_second_largest({2, 5}) == 2);
    assert(get_second_largest({6, 2, 5, -3}) == 5);
    assert(get_second_largest(
               {-4, -43, -51, -8, 1, 15, 18, 2, 3, 37, 6, 7, 8, 99}) == 37);
    assert(get_second_largest(
               {-4, -43, -51, -8, 1, 15, 18, 2, 3, -37, 6, 7, 8, 99}) == 18);

    assert(get_third_largest({2, 5, -3}) == -3);
    assert(get_third_largest({6, 2, 5, -3}) == 2);
    assert(get_third_largest(
               {-4, -43, -51, -8, 1, 15, 18, 2, 3, 37, 6, 7, 8, 99}) == 18);
    assert(get_third_largest(
               {-4, -43, -51, -8, 1, 15, 18, 2, 3, -37, 6, 7, 8, 99}) == 15);
    return 0;
}