Accumulating Unique Values. This uses the Bounded Linear Search
algorithm to locate duplicate values in a sequence. This is a
powerful technique to eliminate sorting from a wide variety of
summary-type reports. Failure to use this algorithm leads to
excessive processing in many types of applications.
Procedure 14.1. Unique Values of a Sequence,
seq
Initalize
set a ← an empty sequence.
Loop. For each value, v, in
seq.
We'll use the Bounded Linear Search for
v in a.
Initialize
i ← 0.
Append v to the
lista.
Search. while a[i] ≠
v, increment
i.
At this point
a[i] =
v. The question is whether
i = len(a
) or not.
New Value? if i =
len(a ), then
v is unique.
Existing Value? if i ≠
len(a ), then
v is a duplicate of
a[i];
a[-1] can be removed.
Result. Return array a, which has unique
values from seq.
Binary Search. This is not as universally useful as the Bounded Linear
Search (above) because it requires the data be sorted.
Procedure 14.2. Binary Search a sorted Sequence,
seq, for a target value,
tgt
Initialize
l←0.
h←len(seq
).
m←(l+h)÷2.
Loop. While l+1<h and
seq[m] ≠
tgt
If tgt <
seq[m], then
h←m
If tgt >
seq[m], then
l←m
m←(l+h)÷2.
If tgt =
seq[m], then return
m
If tgt ≠
seq[m], then return
-1 as a code for “not
found”.
Quicksort. The super-fast sort routine
As a series of loops it is rather complex. As a recursion it
is quite short. This is the same basic algorithm in the C
libraries.
Quicksort proceeds by partitioning the
list into two regions: one has all of the
high values, the other has all the low values. Each of these regions
is then individually sorted into order using the quicksort
algorithm. This means the each region will be subdivided and
sorted.
For now, we'll sort an array of simple numbers. Later, we can
generalize this to sort generic objects.
Procedure 14.3. Quicksort a List, a between elements
lo and hi
Partition
Initialize. middle ←
(hi+lo)÷2.
ls ← lo
hs ← hi
Swap To Partition. while ls <
hs
If a[ls].key ≤
a[middle].key, increment ls
If a[ls].key >
a[middle].key, swap ls with
middle
If a[hs].key ≥
a[middle].key, decrement hs
If a[hs].key <
a[middle].key, swap hs with
middle
Quicksort Each Partition
QuickSort( a, lo, middle
)
QuickSort( a, middle+1, hi
)
Recursive Search. This is also a binary search: it works using a design called
“divide and conquer”. Rather than search the whole
list, we divide it in half and search just
half the list. This version, however is
defined with a recusive function instead of a loop. This can often
be faster than the looping version shown in exercise 2.
Procedure 14.4. Recursive Search a List, seq for a
target, tgt, in the region between elements
lo and hi
Empty Region? If lo+1 =
hi, then return -1 as a code for
“not found”.
Middle Element. m ←
(lo+hi)÷2
Found? If seq[m] =
tgt, then return
m
Lower Half? If seq[m] <
tgt, then return
recursiveSearch (seq,
tgt, lo,
m )
Upper Half? If seq[m] >
tgt, then return
recursiveSearch (seq,
tgt, m,
hi )
Sieve of Eratosthenes. This is an algorithm which locates prime numbers. A prime
number can only be divided evenly by 1 and itself. We locate
primes by making a table of all numbers, and then crossing out the
numbers which are multiples of other numbers. What is left must be
prime.
Procedure 14.5. Sieve of Eratosthenes - List Version
Initialize
Create a list,
prime of 5000 booleans, all
True, initially.
p ← 2
Generate Primes. while 2 ≤ p < 5000
Find Next Prime. while not
prime[p] and 2 ≤
p < 5000: increment
p.
At this point, p is
prime.
Remove Multiples
k ← p +
p
while k < 5000
prime[k]
← false
k ← k +
p
Next p. increment p
Report
At this point, for all p if
prime[ p ] is true,
p is prime.
while 2 ≤ p < 5000
ifprime[ p ],
print p
Polynomial Arithmetic. We can represent numbers as polynomials. We can represent
polynomials as arrays of their coefficients. This is covered in
detail in [Knuth73], section 2.2.4 algorithms A
and M.
Example: 4x3 + 3x + 1 has the
following coefficients: ( 4, 0, 3, 1 ).
You can apply this to large decimal numbers. In this case,
x is assumed to be a power of 10, and the
coefficients must all be between 0 and x-1. For
example, 1987 is 1x3 +
9x2 + 8x + 7, when x = 10.
Procedure 14.6. Add Polynomials, p,
q
Result Size. rSize ← the larger of
len(p ) and
len(q ).
Pad P? If len(p )
< rSize, create p1
from zeroes on the left + p.
Else, p1 ←
p.
Pad Q? If len(q )
< rSize, create q1
from zeroes on the left + q.
Else, q1 ←
q.
Add. Add matching elements from p1 and
q1 to create result,
r
Result. Return r as the sum of
p and q.
Procedure 14.7. Multiply Polynomials, x,
y
Result Size. rSize ← the sum of
len(x ) +
len(y ).
Initialize the result array, r, to
all zeroes, with a size of rSize.
For all elements of x. while 0 ≤ i <
len(x ) loop
For all elements of y. do while 0 ≤ j <
len(y )
loop
add
x[i] *
y[j] to
r[i+j]
Result. Return r as the product of
x and y.
Random Number Evaluation. Before using a new random number generator, it is wise to
evaluate the degree of randomness in the numbers produced. A
variety of clever algorithms look for certain types of expected
distributions of numbers, pairs, triples, etc. This is one of many
random number tests.
Use random.random to generate an array of
random samples. These numbers will be uniform over the interval
0..1
Procedure 14.8. Distribution test of a sequence of random samples,
u
Initialize
Initialize count to a
list of 10 zeroes.
Examine Samples. for each sample value, v in
u
Coerce Into Range. Multiply v by 10 and truncate to
get a value x in 0..9 range
Count. increment
count[x]
Report. Display counts, % of total, deviation from expected
count of 1/10th of available samples
Dutch National Flag. A challenging problem, one of the hardest in this set. This
is from Edsger Dijkstra's book, A Discipline of
Programming [Dijkstra76].
Imagine a board with a row of holes filled with red, white,
and blue pegs. Develop an algorithm which will swap pegs to make
three bands of red, white, and blue (like the Dutch flag). You must
also satisfy this additional constraint: each peg must be examined
exactly once.
Without the additional constraint, this is a relatively simple
sorting problem. The additional constraint requires that instead of
a simple sort which passes over the data several times, we need a
more clever sort.
Hint: You will need four partitions in the array. Initially,
every peg is in the “Unknown” partition. The other
three partitions (“Red”, “White” and
“Blue”) are empty. As the algorithm proceeds, pegs are
swapped into the Red, White or Blue partition from the Unknown
partition. When you are done, the unknown partition is reduced to
zero elements, and the other three partitions have known numbers of
elements.
Published under the terms of the Open Publication License