Python - Advanced Class Definition Exercises

`Sample` Class with Statistical Methods

We can create a Samples class which holds a collection of sample values. This class can have functions for common statistics on the object's samples. For additional details on these algorithms, see the exercises in Chapter 13, Tuples and the section called “Sequence Processing Functions: map, filter, reduce and zip”.

We'll look at subclassing the built-in list class, by creating a class, Samples, which extends list. You'll need to implement the following methods in your new class.

__init__( *args ): The __init__ method saves a sequence of samples. It could, at this time, also precompute a number of useful values, like the sum, count, min and max of this set of data. When no data is provided, these values would be set to None.
__str__: The __str__ method should return string with a summary of the data. An example is a string like "%d values, min %g, max %g, mean %g" with the number of data elements, the minimum, the maximum and the mean. The superclass, list, __repr__ function will return the raw data.
mean: The mean method returns the sum divided by the count.
mode: The mode returns the most popular of the sample values. Below we've provided an algorithm that can be used to locate the mode of a sequence of samples.
median: The median is the value in the middle of the sequence. First, sort the sequence. If there is an odd number of elements, pick the middle-most element. If there is an even number of elements, average the two elements that are mid-most.
variance: The variance is a more complex calculation than the simple mean. For each sample, compute the difference between the sample and the mean, square this value, and sum these squares. The number of samples minus 1 is the degrees of freedom. The sum, divided by the degrees of freedom is the variance. Note that you need two samples to meaningfully compute a variance.
stdev: The stdev is the square root of the variance.

Note that the list superclass already works correctly with the built-in min and max functions. In this case, this consequence of using inheritance instead of wrapping turns out to be an advantage.

Procedure 22.1. Computing the Mode

Initialize. Create an empty dictionary, freq, for frequency distribution.
For Each Value. For each value, v, in the sequence of sample values.
1. Unknown? If v is not a key in freq, then add v to the dictionary with a value of 0.
2. Increment Frequency. Increment the frequency count associated with v in the dictionary.
Extract. Extract the dictionary items as a sequence of tuples ( value , frequency ).
Sort. Sort the sequence of tuples in ascending order by frequency.
Result. The last value in the sorted sequence is a tuple with the most common sample value and the frequency count.

Shuffling Method for the `Deck` class

Shuffling is a matter of taking existing cards and putting them into other positions. There are a many ways of doing this. We'll need to try both to see which is faster. In essence, we need to create a polymorphic family of classes that we can use to measure performance.

Procedure 22.2. Shuffling Variation 1 - Exchange

For i in range 0 to the number of cards
1. Generate a random number r in the range 0 to the number of cards.
2. Use Multiple Assignement to swap cards at position i and r.

Procedure 22.3. Shuffling Variation 2 - Build

Create an empty result sequence, s.
While there are cards in the source self.cards sequence.
1. Generate a random number r in the range 0 to the number of cards.
2. Append card r to the result sequence; delete object r from the source self.cards sequence. The pop method of a sequence can return a selected element and delete it from a sequence nicely.
Replace self.cards with the result sequence, s.

Procedure 22.4. Shuffling Variation 3 - Sort

Create a function which returns a random value selected from the sequence ( -1, 0, 1 ). This is similar to the built-in cmp function.
Use the sort method of a list with this random comparison-like function.
```
self.cards.sort( lambda a,b: random.choice( (-1,0,1) ) )
```

Procedure 22.5. Shuffling Variation 4 - random.shuffle

The random module has a shuffle method which can be used as follows.
```
random.shuffle( self.cards )
```

Of these four algorithms, which is fastest? The best way to test these is to create four separate subclasses of Deck, each of which provides a different implementation of the shuffle method. A main program can then create an instance of each variation on Deck and do several hundred shuffles.

We can create a timer using the time module. The time.clock function will provide an accurate time stamp. The difference between two calls to time.clock is the elapsed time.

Because all of our variations on Deck are polymorphic, our main program should look something like the following.

d1= DeckExch()
d2= DeckBuild()
d3= DeckSortRandom()
d4= DeckShuffle()
for deck in ( d1, d2, d3, d4 ):
    start= time.clock()
    for i in range(100):
        d.shuffle()
    finish= time.clock()

Encapsulation

The previous exercise built several alternate solutions to a problem. We are free to implement an algorithm with no change to the interface of Deck. This is a important effect of the principal of encapsulation: a class and the clients that use that class are only coupled together by an interface defined by method functions.

There are a variety of possible dependencies between a class and its clients.

Interface Method Functions. A client can depend on method functions specifically designated as an interface to a class. In Python, we can define internal methods by prefixing their names with _. Other names (without the leading _) define the public interface.
All Method Functions. A client can depend on all method functions of a class. This removes the complexity of hidden, internal methods.
Instance Variables. A client can depend on instance variables in addition to method functions. This can remove the need to write method functions that simply return the value of an instance variable.
Global Variables. Both classes share global variables. The Python global statement is one way to accomplish this.
Implementation. A client can depend on the specific algorithm being executed by a class. A client method can have expectations of how a class is implemented.

What are the advantages and disadvantages of each kind of dependency?

Class Responsibilities

Assigning responsibility to class can be challenging. A number of reasons can be used to justify the functions and instance variables that are combined in a single class.

Convenience. A class is defined to do things because — well — it's convenient to write the program that way.
Similar Operations. A class is defined because it does all input, all output, or all calculations.
Similar Time. A class is defined to handle all initialization, all processing or all final cleanup.
Sequence. We identify some operations which are performed in a simple sequence and bundle these into a single class.
Common Data. A class is defined because it has the operations which isolate a data structure or algorithm.

What are the possible differences between theses? What are the advantages and disadvantages of each?

Advanced Class Definition Exercises

Sample Class with Statistical Methods

Shuffling Method for the Deck class

Encapsulation

Class Responsibilities

`Sample` Class with Statistical Methods

Shuffling Method for the `Deck` class