Comprehensions and Generators

The Quick and Cool Way to Generate Sequences in Python

Introduction

If you have been learning Python, you most likely think it's a very easy programming language to learn. The main reason is its syntax, which is among the simplest of the popular programming languages, especially compared to languages like Java or C#. You don’t have to declare a data type to create a variable, or specify whether the variable can change.

One cool feature in Python, and the subject of this article, is comprehensions and generators. If you wanted to create a list or a dictionary whose values (and keys) follow a certain pattern, you would most likely resort to writing a for loop. Let’s say you wanted to create a list of the first 10 triangular numbers. Or maybe you want to create a dictionary of U.S. presidents, whose values are the names of the presidents and whose keys are their president numbers in order of inauguration (e.g. George Washington #1, John Adams #2, Thomas Jefferson #3, etc.). We can accomplish both tasks with simple for loops.

# List of triangular numbers created using a for loop.
MAX = 10
triangular_numbers = []
for n in range(1, MAX + 1):
    # Integer division (//) keeps the result an exact int, even for large n.
    triangular_numbers.append(n * (n + 1) // 2)

print(triangular_numbers)

"""
Printed Output
[1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
"""
# Dictionary of U.S. presidents created using a for loop.

PRESIDENT_NAMES = [
    "George Washington",
    "John Adams",
    "Thomas Jefferson",
    "James Madison",
    "James Monroe",
    "John Quincy Adams",
    "Andrew Jackson",
    "Martin van Buren",
    "William Henry Harrison",
    "John Tyler"
]

presidents_dictionary = {}
for n in range(len(PRESIDENT_NAMES)):
    presidents_dictionary[n + 1] = PRESIDENT_NAMES[n]

print(presidents_dictionary)

"""
Printed Output:
{1: 'George Washington', 2: 'John Adams', 3: 'Thomas Jefferson', 
4: 'James Madison', 5: 'James Monroe', 6: 'John Quincy Adams', 
7: 'Andrew Jackson', 8: 'Martin van Buren', 9: 'William Henry Harrison', 
10: 'John Tyler'}
"""

Although this is fine, what if I were to tell you that there is a better way to do this? How, you ask? By writing each loop in one line (or, more precisely, one statement). This is accomplished using comprehensions or generators.

Comprehensions

List Comprehensions

A list comprehension, as just stated, is a for loop that populates a list, written as a single statement. You don't have to call "list.append(value)" on each pass through the loop. You can cut to the chase by writing only the expression that computes each value. List comprehensions are written as shown below:

new_list = [expression(n) for n in sequence if optional_condition == True]

Since lists are enclosed in brackets, you must enclose the list comprehension in brackets. The first part inside the brackets is the expression that computes each value for your list. The second part is a for clause that loops over a sequence. This could be a range, a tuple, or another list. The final part is optional. It tells the program to add a value to the new list only if a certain condition is met for the current item in the loop.
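
To make the three parts concrete, here is a minimal sketch that builds the same list twice, once with a plain for loop and once with a comprehension. The sample data and the names words, loop_result, and comprehension_result are made up for this illustration and are not part of the article's running examples.

```python
# Hypothetical sample data, chosen only for illustration.
words = ("list", "of", "short", "and", "longer", "words")

# Plain for loop: keep words longer than three characters, in upper case.
loop_result = []
for word in words:
    if len(word) > 3:
        loop_result.append(word.upper())

# Same logic as a comprehension: expression, for clause, optional condition.
comprehension_result = [word.upper() for word in words if len(word) > 3]

print(comprehension_result)                 # ['LIST', 'SHORT', 'LONGER', 'WORDS']
print(loop_result == comprehension_result)  # True
```

Reading the comprehension from left to right mirrors the loop from the inside out: the append argument comes first, then the for line, then the if line.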

Now let's convert our plain old for loop for populating the triangular numbers list into a list comprehension. Before looking at the solution, go ahead and give it a try to see if you can do it based on the format shown above.

List Comprehension Example Solution

MAX = 10
triangular_numbers = [n * (n + 1) // 2 for n in range(1, MAX + 1)]
print(triangular_numbers)
# Printed Output: [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]

How nice is that? We cut the number of lines of code in half. If you wanted to keep a triangular number only when the current value of n is even, it can be done as shown below:

MAX = 10
triangular_numbers = [n * (n + 1) // 2 for n in range(1, MAX + 1) if n % 2 == 0]
print(triangular_numbers)
# Printed Output: [3, 10, 21, 36, 55]

As you can see, with that condition in place, the program now only prints the 2nd, 4th, 6th, 8th, and 10th triangular numbers. You will most likely never need this specific example in the real world; it’s simply for demonstration purposes.

Dictionary Comprehensions

A dictionary comprehension works in a similar way, except that since dictionaries are enclosed in curly braces, you have to surround your comprehension in curly braces as well. A dictionary comprehension can be written in multiple ways, but I will only show the one used in our code. If you would like to check out other ways, please see the reference titled “Python Dictionary Comprehension”.

new_dictionary = {key: value for (key, value) in zip(keys, values) if optional_condition == True}

The zip function allows us to iterate through multiple sequences at once, stopping as soon as the shortest sequence runs out. Before looking at the solution, try to see if you can convert the dictionary of presidents from a plain old for loop into a comprehension based on the format shown above.
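
As a small illustration of that stopping behavior, the sketch below zips two lists of different lengths. The lists letters and numbers are hypothetical sample data, not part of the presidents example.

```python
# Hypothetical sample data: three keys but only two values.
letters = ["a", "b", "c"]
numbers = [1, 2]

# zip pairs items by position and stops when the shorter input runs out.
pairs = list(zip(letters, numbers))
print(pairs)  # [('a', 1), ('b', 2)]

# The same pairs feed a dictionary comprehension; "c" never gets a value.
letter_to_number = {letter: number for (letter, number) in zip(letters, numbers)}
print(letter_to_number)  # {'a': 1, 'b': 2}
```

Note that zip drops the unmatched item silently, so it is worth checking that your keys and values have the same length.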

Dictionary Comprehension Example Solution

PRESIDENT_NAMES = [
    "George Washington",
    "John Adams",
    "Thomas Jefferson",
    "James Madison",
    "James Monroe",
    "John Quincy Adams",
    "Andrew Jackson",
    "Martin van Buren",
    "William Henry Harrison",
    "John Tyler"
]

presidents_dictionary = {number: president for (number, president) in zip(range(1, len(PRESIDENT_NAMES) + 1), PRESIDENT_NAMES)}
print(presidents_dictionary)

"""
Printed Output:
{1: 'George Washington', 2: 'John Adams', 3: 'Thomas Jefferson', 
4: 'James Madison', 5: 'James Monroe', 6: 'John Quincy Adams', 
7: 'Andrew Jackson', 8: 'Martin van Buren', 9: 'William Henry Harrison', 
10: 'John Tyler'}
"""

That’s all you would have to do. If you care about the 80 character limit per line, then feel free to break the statement into the appropriate number of lines. But just remember, it’s still one statement of code.
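
One possible way to lay out the statement across several lines is sketched below, using a shortened list of names to keep it small. The second dictionary, presidents_enumerated, is an assumption on my part rather than something the article uses: it swaps zip and range for the built-in enumerate with start=1, which yields the same numbering.

```python
# Shortened list of names, used only to keep the sketch small.
PRESIDENT_NAMES = ["George Washington", "John Adams", "Thomas Jefferson"]

# Still one statement; line breaks are allowed inside braces and parentheses.
presidents_dictionary = {
    number: president
    for (number, president) in zip(
        range(1, len(PRESIDENT_NAMES) + 1), PRESIDENT_NAMES
    )
}

# Alternative (not used in this article): enumerate counts from 1 for us.
presidents_enumerated = {
    number: president
    for (number, president) in enumerate(PRESIDENT_NAMES, start=1)
}

print(presidents_dictionary)
print(presidents_dictionary == presidents_enumerated)  # True
```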

Generator Expressions

Just so you know, a list comprehension builds its entire list in memory before you can use it, which can be costly for large sequences. If you would like to save that space and still write one-line statements, go for generator expressions. A generator expression is written like a list comprehension, but enclosed in parentheses instead of brackets. Rather than building a list, it produces its values one at a time as you loop over it.

MAX = 10
triangular_numbers = (n * (n + 1) // 2 for n in range(1, MAX + 1))
print(triangular_numbers)
[print(n) for n in triangular_numbers]

"""
Printed Output: 
<generator object <genexpr> at 0x7f5c39c15430>
1
3
6
10
15
21
28
36
45
55
"""

Two things can be noted here. First, when you print the variable that holds the generator expression, you don't get the values. You get a description of the generator object, its type and memory address enclosed in angle brackets. So you have to loop over the generator to print the values. This leads to the second thing: you can write that loop in a single line. The line is in fact a list comprehension used only for its print side effect, so it also builds a throwaway list of None values; a plain for loop does the same job without that extra list. There is no separate generator syntax for dictionaries, but a generator expression that yields (key, value) pairs can be passed to dict() to build one.
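
The sketch below illustrates three points about generator expressions under a few assumptions: the sizes reported by sys.getsizeof are specific to your Python version (so only the comparison is printed, not the byte counts), and the names as_list, as_generator, first_pass, second_pass, and triangular_by_index are made up for this example. On the dictionary question, it shows one option: passing a generator expression of (key, value) pairs to dict().

```python
import sys

MAX = 10

# A list comprehension stores every value; a generator expression stores
# only the recipe and produces each value on demand.
as_list = [n * (n + 1) // 2 for n in range(1, 100_001)]
as_generator = (n * (n + 1) // 2 for n in range(1, 100_001))
print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))  # True

# A generator can be consumed only once; a second pass finds nothing left.
triangular_numbers = (n * (n + 1) // 2 for n in range(1, MAX + 1))
first_pass = list(triangular_numbers)
second_pass = list(triangular_numbers)
print(first_pass)   # [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
print(second_pass)  # []

# dict() accepts a generator expression that yields (key, value) pairs.
triangular_by_index = dict((n, n * (n + 1) // 2) for n in range(1, 4))
print(triangular_by_index)  # {1: 1, 2: 3, 3: 6}
```

If you need the values more than once, store them in a list instead, or recreate the generator each time.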

Conclusions

That's all there is to comprehensions and generators. The main takeaway is that you can simplify the creation of lists and dictionaries by using single-line statements instead of multi-line for loops. If you would like to learn more about any of the topics discussed here, check out the references below. And happy coding!

References

  1. Triangular Numbers Sequence

  2. List of U.S. Presidents

  3. Python List Comprehensions vs Generator Expressions

  4. Python Dictionary Comprehension

  5. Python zip() Function