- Published on
The Algorithm Design Manual (3rd) Exercise Solution 'E4-37'
Table of Contents
Introduction
This series is my solutions for algorithm exercises in'The Algorithm Design Manual' 3rd edition.
It's my solutions, not model answers, so if you find some mistakes or more sophisticated solutions, please post in comment area below.
Repository
Exercise 4-37
Question
Consider the problem of sorting a sequence of n 0’s and 1’s using comparisons. For each comparison of two values x and y, the algorithm learns which of , , or holds.
- (a) Give an algorithm to sort in comparisons in the worst case. Show that your algorithm is optimal.
- (b) Give an algorithm to sort in comparisons in the average case (assuming each of the n inputs is 0 or 1 with equal probability). Show that your algorithm is optimal.
Solution
Subproblem (a)
We partition the sequence for index range , comparing each element with 0. If it is equal to zero, move to left.
During the comparisons, we keep the position where at first '1' appears in the sequence. Let the position be pos-f1.
After the comparisons, swap the element in the last and that in the pos-f1. Swapping requires no comparison.
So, the number of comparisons is .
Subproblem (b)
There are two solutions as long as I came to mind.
Solution1
Outline
In the first traversal, we compare each pair of two adjacent bits in the sequence.
After the traversal, if there are some pairs that the two bits are not equal, we move the larger one to '1's group, and the smaller one to '0's group.
Then we run the same traversal to remaining elements, but this time we treat each pair of the first traversal as one bit, because they have the same bit value.
We continue this until there are no bit or just one bit. If there are one bit (actually, this may mass of bits of the same bit value), we compare that with '0', and according the result we move the bit either to '0's group or '1's.
Finally, we concatenate '0's group and '1's group.
Cost Analysis
Since each of the n inputs is 0 or 1 with equal probability, the number of the pairs that the two bits are not equal is the same as the remaining pairs. And before the next traversal, the remaining pairs are transformed to one bit. So in one traversal, the total number is reduced to .
Thus, the total number of comparisons is represented as the geometric series as following:
Code
#include <algorithm>
#include <cassert>
#include <iostream>
#include <vector>
using namespace std;
void sort_seq(vector<int>& v, int& cnt_comparison) {
cnt_comparison = 0;
int cnt_0{0};
int cnt_1{0};
// repeat while there are at least 2 elements to compare
size_t ir_open{v.size()};
int weight{1};
for (; 1 < ir_open; weight <<= 1) {
size_t new_iro{0};
// traverse and do pairwise comparisons
//// among the comparison, construct the new array for the next
//// traversal
for (size_t i = 0; i < ir_open - 1; i += 2) {
++cnt_comparison;
if (v[i] == v[i + 1]) {
v[new_iro++] = v[i];
} else {
cnt_0 += weight;
cnt_1 += weight;
}
}
if (ir_open & 1) {
++cnt_comparison;
if (!v[ir_open - 1])
cnt_0 += weight;
else
cnt_1 += weight;
}
ir_open = new_iro;
}
if (ir_open == 1) {
++cnt_comparison;
if (!v[0])
cnt_0 += weight;
else
cnt_1 += weight;
}
// reconstruct the array by the count of 0 and 1
fill_n(v.begin(), cnt_0, 0);
fill(v.begin() + cnt_0, v.end(), 1);
}
int main(int argc, char const* argv[]) {
vector<int> v;
int cnt_comparison;
v = {0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0}));
v = {1};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{1}));
v = {0, 1};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 1}));
v = {1, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 1}));
v = {0, 0, 1};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 1}));
v = {0, 1, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 1}));
v = {1, 0, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 1}));
v = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1}));
assert(cnt_comparison == 6);
v = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}));
assert(cnt_comparison == 11);
v.resize(1000);
auto it{v.begin()};
fill(it, it + 500, 0);
it += 500;
fill(it, it + 500, 1);
vector cp_v{v};
random_shuffle(v.begin(), v.end());
sort_seq(v, cnt_comparison);
assert(v == cp_v);
cout << "Comparison No1| Count: " << cnt_comparison
<< ", Propertion:" << (double)cnt_comparison / 1000 << endl;
v.resize(10000);
it = v.begin();
fill(it, it + 5000, 0);
it += 5000;
fill(it, it + 5000, 1);
cp_v = v;
random_shuffle(v.begin(), v.end());
sort_seq(v, cnt_comparison);
assert(v == cp_v);
cout << "Comparison No2| Count: " << cnt_comparison
<< ", Propertion:" << (double)cnt_comparison / 10000 << endl;
return 0;
}
Computed Comparison Counts
Comparison No1| Count: 663, Propertion:0.663
Comparison No2| Count: 6648, Propertion:0.6648
Solution2
Outline
We use "pool" of elements and check whether the pool is empty or not.
If the pool is frequently empty, we can reduce comparison because we simply put the element dealed with that time.
Algorithm
Let P be pool for elements.
Loop until (n-1) elements examined.
During the loop, we trace the count of '0' and '1'.
Take a element from the rest of the elements.
If p is empty, put it in P.
Else
compare the value of the element with any of the elements in the P.
If the former is larger
Increment the count of '1' and increase the count of '0' by the total amount of elements in P.
Else if the latter is larger
Increment the count of '0' and increase the count of '1' by the total amount of elements in P.
Else put it in P.
If P is not empty, compare any of elements in it with 0, and increase the count of '0' or that of '1' according to the result.
Then, reconstruct the sequence by both counts.
The last element is placed between the last '0' and the first '1', so we don't have to compare and clarify which value the last element is.
The number of comparison is:
- The best case:
- The worst case:
Cost Analysis
Let p_m be a probability that the pool is empty just before examining the m-th element.
To think the relation between and , we think the transition of the state of the pool from the time just before examining m-th element to the time just before examining (m+2)-th element.
From Figure 1, we can formulate the relation:
Then, initial states are:
So, even and odd sequence's relations are:
If the pool is not empty, 1 comparison is required for each element. So Let be the total count of comparisons in the average case, then is:
The second term in the equation above is asymtotically constant.
So, .
Code
#include <algorithm>
#include <cassert>
#include <iostream>
#include <vector>
using namespace std;
struct Pool {
int val;
size_t cnt{0};
};
void sort_seq(vector<int>& v, int& cnt_comparison) {
cnt_comparison = 0;
// examine data
size_t cnt_zero{0};
size_t cnt_one{0};
Pool p;
for (size_t i{0}; i < v.size() - 1; ++i) {
if (!p.cnt) {
p.val = v[i];
p.cnt++;
} else {
++cnt_comparison;
if (p.val < v[i]) {
cnt_zero += p.cnt;
++cnt_one;
p.cnt = 0;
} else if (v[i] < p.val) {
++cnt_zero;
cnt_one += p.cnt;
p.cnt = 0;
} else
p.cnt++;
}
}
// clean pool
if (p.cnt) {
++cnt_comparison;
if (!p.val)
cnt_zero += p.cnt;
else
cnt_one += p.cnt;
}
// reconstruct
auto it{v.begin()};
fill(it, it + cnt_zero, 0);
it += cnt_zero;
*it++ = v.back();
fill(it, it + cnt_one, 1);
}
int main(int argc, char const* argv[]) {
vector<int> v;
int cnt_comparison;
v = {0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0}));
v = {1};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{1}));
v = {0, 1};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 1}));
v = {1, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 1}));
v = {0, 0, 1};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 1}));
v = {0, 1, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 1}));
v = {1, 0, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 1}));
v = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1}));
assert(cnt_comparison == 5);
v = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
sort_seq(v, cnt_comparison);
assert(v == (vector<int>{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}));
assert(cnt_comparison == 10);
v.resize(1000);
auto it{v.begin()};
fill(it, it + 500, 0);
it += 500;
fill(it, it + 500, 1);
vector cp_v{v};
random_shuffle(v.begin(), v.end());
sort_seq(v, cnt_comparison);
assert(v == cp_v);
cout << "Comparison No1| Count: " << cnt_comparison
<< ", Propertion:" << (double)cnt_comparison / 1000 << endl;
v.resize(10000);
it = v.begin();
fill(it, it + 5000, 0);
it += 5000;
fill(it, it + 5000, 1);
cp_v = v;
random_shuffle(v.begin(), v.end());
sort_seq(v, cnt_comparison);
assert(v == cp_v);
cout << "Comparison No2| Count: " << cnt_comparison
<< ", Propertion:" << (double)cnt_comparison / 10000 << endl;
return 0;
}
Computed Comparison Counts
Comparison No1| Count: 663, Propertion:0.663
Comparison No2| Count: 6631, Propertion:0.6631