Chapter 18. Generators and the yield Statement

We've made extensive use of the relationship between the for statement and various kinds of iterable containers without looking closely at how this works.

In this chapter, we'll look at the semantics of generators and their close relationship to iterable containers and the for statement. We'll also look at some additional functions that we can use to create and access data structures that support elegant iteration.

Generator Semantics

The easiest way to define an iterator (and the closely related concept of a generator) is to look at the for statement.

Let's look at the following snippet of code.

for i in ( 1, 2, 3, 4, 5 ):
    print i

Under the hood, the for statement engages in the following sequence of interactions with an iterable object like the sequence in the code snippet above.

  1. The for statement requests an iterator from the object; in this case the object is a tuple. The for statement does this by evaluating the iter function on the given expression. The working definition of iterable is that the object responds to the iter function.

  2. The for statement evaluates the iterator's next method and assigns the value to the target variable; in this case, i.

  3. The for statement evaluates the suite of statements; in this case, the suite is just the print statement.

  4. The for statement continues steps 2 and 3 until an exception is raised. If the exception is a StopIteration, this is handled to indicate that the loop has finished normally.
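The four steps above can be sketched as an explicit loop. This is only an illustration of the protocol, not the interpreter's actual implementation. It uses the built-in next function, which calls the iterator's next method (spelled __next__ in Python 3), and the name collected is a hypothetical stand-in for whatever the suite does.

```python
# A hand-written equivalent of: for i in (1, 2, 3, 4, 5): ...
collected = []                      # hypothetical stand-in for the suite's output

iterator = iter((1, 2, 3, 4, 5))    # step 1: request an iterator from the tuple
while True:
    try:
        i = next(iterator)          # step 2: get the next value, assign the target
    except StopIteration:           # step 4: the normal end of the loop
        break
    collected.append(i)             # step 3: the suite of statements

print(collected)
```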

The other side of this relationship is the iterator, which must define a next method; this method either returns the next item from a sequence (or other container) or it raises the StopIteration exception. Also, an iterator must maintain some kind of internal state to know which item in the sequence will be delivered next.
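As a minimal sketch of these requirements, here is a hand-written iterator. The class name Countdown and its attribute current are hypothetical, chosen only for illustration. The method is defined as __next__ (the Python 3 spelling) and aliased to next so the same class also works with the older spelling used in this chapter.

```python
class Countdown(object):
    """Hypothetical iterator that delivers n, n-1, ..., 1."""

    def __init__(self, n):
        self.current = n            # internal state: the next item to deliver

    def __iter__(self):
        return self                 # an iterator answers iter() with itself

    def __next__(self):             # Python 3 spelling
        if self.current <= 0:
            raise StopIteration     # nothing left to deliver
        value = self.current
        self.current -= 1
        return value

    next = __next__                 # Python 2 spelling of the same method


values = [v for v in Countdown(3)]
print(values)
```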

When we describe a container as iterable, we mean that it responds to the iter function by returning an iterator object that can be used by the for statement. All of the sequence containers return iterators; sets and files also return iterators. In the case of a dict, the iterator returns the dict's keys in no particular order.

Defining An Iterator. Generally, we don't create iterators directly, because this can be complex. Most often, we define a generator. A generator is a function that can be used by the for statement as if it were an iterator. A generator looks like a conventional function, with one important difference: a generator includes the yield statement.
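A minimal sketch of a generator might look like the following. The function name countdown is hypothetical; the only thing that makes it a generator rather than an ordinary function is the yield statement in its suite.

```python
def countdown(n):
    """Hypothetical generator that yields n, n-1, ..., 1."""
    while n > 0:
        yield n                     # hand one value to the for statement
        n -= 1                      # execution resumes here on the next request


for value in countdown(3):
    print(value)
```

Note that there is no explicit state-keeping object and no explicit StopIteration; the loop simply ends when the function does.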

The essential relationship between a generator and the for statement is the same as between an iterator and the for statement.

  1. The for statement calls the generator function, which creates a generator object. When the for statement requests the first value, the generator begins execution and executes statements up to the first yield statement.

  2. The for statement assigns the value that was yielded to the target variable.

  3. The for statement evaluates the suite of statements.

  4. The for statement resumes the generator and repeats steps 2 and 3 until the generator executes a return statement or reaches the end of its suite. In a generator, returning raises the StopIteration exception behind the scenes. When a StopIteration is raised, it is handled by the for statement.
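To see the StopIteration that the for statement normally hides, we can drive a generator by hand with the built-in next function. This is a sketch under assumptions: the generator first_two and the variable names are hypothetical, chosen only to show a return statement ending the iteration.

```python
def first_two(items):
    """Hypothetical generator: yield at most two items, then return."""
    count = 0
    for item in items:
        if count == 2:
            return                  # ends the generator; StopIteration follows
        yield item
        count += 1


gen = first_two("abc")              # calling the function creates a generator object
a = next(gen)                       # runs the suite up to the first yield
b = next(gen)                       # resumes after the yield, runs to the next one
try:
    next(gen)                       # the generator executes its return statement
    finished = False
except StopIteration:
    finished = True

print("%s %s %s" % (a, b, finished))
```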

What We Provide. Generator definition is similar to function definition (see Chapter 9, Functions); we provide three pieces of information: the name of the generator, a list of zero or more parameters, and a suite of statements that yields the output values.

We use a generator in a for statement by following the function's name with ()'s. The Python interpreter evaluates the argument values in the ()'s, then applies the generator. This will execute the generator's suite up to the first yield statement, which yields the first value from the generator. When the for statement requests the next value, Python will resume execution at the statement after the yield statement; the generator will work until it yields another value to the for statement.
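The back-and-forth described above can be made visible by recording the order of events. In this sketch, the generator squares and the list trace are hypothetical names; the list is just a log that both the generator and the for statement's suite append to.

```python
trace = []                          # hypothetical log of the order of events


def squares(limit):
    """Hypothetical generator: yield n*n for each n below limit."""
    trace.append("generator starts")
    for n in range(limit):
        yield n * n
        # Execution resumes here when the for statement asks for the next value.
        trace.append("generator resumes after yielding %d" % (n * n))
    trace.append("generator finishes")


for value in squares(2):            # the argument 2 is evaluated, then applied
    trace.append("for suite sees %d" % value)

for line in trace:
    print(line)
```

The log alternates between the two sides: the generator runs only far enough to produce each value, and the suite runs before the generator picks up again.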

This back-and-forth between the for statement and the generator means that the generator's local variables are all preserved by the yield statement. A generator has a peer relationship with the for statement; its local variables are kept when it yields, and disposed of when it returns. This is distinct from ordinary functions, which have a context that is nested within the context that evaluated the function; an ordinary function's local variables are disposed of when it returns.
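A short sketch of this preserved state: the hypothetical generator running_total keeps a local variable, total, across yields. Creating two generator objects from it shows that each one has its own copy of the local variables, even when requests to them are interleaved.

```python
def running_total(values):
    """Hypothetical generator: yield the cumulative sum after each value."""
    total = 0                       # local variable, kept alive across yields
    for v in values:
        total += v
        yield total


first = running_total([1, 2, 3])
second = running_total([10, 20])

# Interleave requests; each generator remembers its own total.
results = [next(first), next(second), next(first), next(second), next(first)]
print(results)
```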

Example: Using a Generator to Consolidate Information. Lexical scanning and parsing are both tasks that compilers do to discover the higher-level constructs that are present in streams of lower-level elements. A lexical scanner discovers punctuation, literal values, variables, keywords, comments, and the like in a file of characters. A parser discovers expressions and statements in a sequence of lexical elements.

Lexical scanning and parsing algorithms consolidate a number of characters into tokens or a number of tokens into a statement. A characteristic of these algorithms is that some state change is required to consolidate the inputs prior to creating each output. A generator meets this need by preserving its state each time an output is yielded.

In both lexical scanning and parsing, the generator function will be looping through a sequence of input values, discovering a high-level element, and then yielding that element. The yield statement delivers each result from a generator function in turn; it also saves all the local variables, and even the location of the yield statement, so that the next request to the generator will resume processing right after the yield.
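As a rough illustration of this idea, here is a very small lexical scanner written as a generator. Everything about it is an assumption made for the example: the function name tokens, the two token kinds ("number" and "symbol"), and the rule that any non-digit, non-space character is a one-character symbol. A real scanner would handle far more cases.

```python
def tokens(text):
    """Hypothetical scanner: group characters into number and symbol tokens."""
    digits = ""                     # state: digits collected for the current number
    for ch in text:
        if ch.isdigit():
            digits += ch            # keep consolidating; no output yet
            continue
        if digits:                  # a non-digit ends the current number
            yield ("number", int(digits))
            digits = ""
        if not ch.isspace():
            yield ("symbol", ch)
    if digits:                      # flush a number that ends the input
        yield ("number", int(digits))


for kind, value in tokens("12 + 345*6"):
    print("%s %s" % (kind, value))
```

The variable digits is the state that must survive between outputs; because the generator's locals are preserved at each yield, no separate object is needed to hold it.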


 
 
Published under the terms of the Open Publication License