List Comprehensions, Generators, Variadic Arguments, & Decorators

Announcements

  • Assignment 4 due April 1
  • Assignment 5 due April 13

Intermediate Python Concepts

  • Nothing we're going to learn is absolutely necessary to be able to do something in Python
  • However, they do make some things much easier
  • If you spend any amount of time with other people's code, you are going to encounter these
  • Some packages make extensive use of these concepts (e.g. Django and decorators)

A common task in Python: you have a list of things, and you want to go over each one, change it somehow, and collect the changed things in a new list.


        numbers = [1, 3, 5, 7, 9]
        squares = []
        for number in numbers:
            squares.append(number * number)
    

There is a word for this operation; it's called a map.

As in "I want to map a function to square a number over a list of numbers."
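Python also has a built-in called map() that does exactly this. A minimal sketch, where the helper name square is made up for illustration:

```python
def square(number):
    return number * number

numbers = [1, 3, 5, 7, 9]

# map() returns a lazy iterator, so wrap it in list() to see the results.
squares = list(map(square, numbers))
print(squares)  # [1, 9, 25, 49, 81]
```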


    

A common task in Python: you have a list of things and you want to filter some out based on a condition, keeping only the rest.


        numbers = [1, 2, 3, 5, 6, 13, 21, 34]
        even_numbers = []
        for number in numbers:
            if number % 2 == 0:
                even_numbers.append(number)
    

There is a word for this operation; it's called a filter.

As in "I want to filter a list of numbers so that it only contains even numbers."
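There is a matching built-in called filter(). A minimal sketch, where the helper name is_even is made up for illustration:

```python
def is_even(number):
    return number % 2 == 0

numbers = [1, 2, 3, 5, 6, 13, 21, 34]

# filter() keeps only the items for which the function returns True.
# Like map(), it returns a lazy iterator, so we wrap it in list().
even_numbers = list(filter(is_even, numbers))
print(even_numbers)  # [2, 6, 34]
```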


    

Python provides a syntactic tool to express mapping and filtering operations in a more concise way.

It's called a list comprehension and it looks like this:

[expression for item in iterable]

Simplify Mapping Operations


        numbers = [1, 3, 5, 7, 9]
        squares = []
        for number in numbers:
            squares.append(number * number)
    

We can simplify the loop above to this:


        numbers = [1, 3, 5, 7, 9]
        squares = [number * number for number in numbers]
    

Instead of initializing a new list and building it up
over a for-loop, we generate the list all at once.

You can use any iterable in a list comprehension:


        numbers = (1, 3, 5, 7, 9)
        squares = [number * number for number in numbers]
    

        tags = soup.select("h1")
        headers = [tag.text for tag in tags]
    

But as the name implies, the result of the comprehension will be a Python list.
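A quick sketch of that point, using a few different iterables as input (the variable names are only for illustration):

```python
from_tuple = [n * n for n in (1, 3, 5)]
from_range = [n * n for n in range(4)]
from_string = [ch.upper() for ch in "abc"]
from_dict = [key for key in {"x": 1, "y": 2}]  # iterating a dict gives its keys

print(from_tuple)        # [1, 9, 25]
print(from_range)        # [0, 1, 4, 9]
print(from_string)       # ['A', 'B', 'C']
print(from_dict)         # ['x', 'y']
print(type(from_tuple))  # <class 'list'>
```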

List comprehensions have an expanded syntax that lets us do filtering too:

[expression for item in iterable if condition]

Simplify Filter Operations


        numbers = [1, 2, 3, 5, 6, 13, 21, 34]
        even_numbers = []
        for number in numbers:
            if number % 2 == 0:
                even_numbers.append(number)
    

We can simplify the loop above to this:


        numbers = [1, 2, 3, 5, 6, 13, 21, 34]
        even_numbers = [num for num in numbers if num % 2 == 0]
    

We can filter and map in one line.
Let's say we want to filter a list of numbers down to
just the even numbers then square them:


        numbers = [1, 2, 3, 5, 6, 13, 21, 34]
        evens_squared = [num * num for num in numbers if num % 2 == 0]
    

Unpacking tuples works the same way it does in a for loop:


        # Find all pairs that add up to an even number:
        pairs = [(1, 2), (2, 2), (1, 3), (2, 3)]
        even_pairs = [(a, b) for a, b in pairs if (a + b) % 2 == 0]
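One thing to watch in a condition like this is operator precedence: % binds more tightly than +, so the sum needs its own parentheses before we take the remainder. A small sketch of the difference (the variable names without_parens and with_parens are just for illustration):

```python
pairs = [(1, 2), (2, 2), (1, 3), (2, 3)]

# Without parentheses this means a + (b % 2), which is never 0 for these pairs.
without_parens = [(a, b) for a, b in pairs if a + b % 2 == 0]

# With parentheses the sum is computed first, then tested for evenness.
with_parens = [(a, b) for a, b in pairs if (a + b) % 2 == 0]

print(without_parens)  # []
print(with_parens)     # [(2, 2), (1, 3)]
```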
    

What questions do you have?

When you call a function like this:


        def add(a, b):
            return a + b

        add(1, 2)
    

...what actually happens?

  1. You call a function
  2. Memory is allocated on the stack for the function
  3. Function returns result
  4. Memory is reclaimed

Function calls are usually a one-way process

This function stores the result in a list and then returns it:


        def evens_until(start, until):
            result = []
            current = start
            while current <= until:
                if current % 2 == 0:
                    result.append(current)
                current += 1
            return result
    

Using this:


        for number in evens_until(2, 2000):
            print(number)
    

What about this?


        for number in evens_until(2, 2_000_000_000):
            print(number)
    

This used up 15 GB of memory.

This function returns a generator:


        def evens_until(start, until):
            current = start
            while current <= until:
                if current % 2 == 0:
                    yield current
                current += 1
    

        for number in evens_until(2, 2000):
            print(number)
    

A generator is an example of what's called "lazy evaluation".
Unlike in real life, in computation laziness can be a good thing!
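To get a feel for what "lazy" means here, this sketch compares the size of a list of results with the size of a generator object, and then pulls values out one at a time with next(). The exact byte counts depend on your Python version, so only the comparison is shown:

```python
import sys

def evens_until(start, until):
    current = start
    while current <= until:
        if current % 2 == 0:
            yield current
        current += 1

as_list = list(evens_until(2, 200_000))
as_generator = evens_until(2, 200_000)

# The list holds 100,000 references; the generator holds only its paused state.
print(len(as_list))                                          # 100000
print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))  # True

# Nothing has been computed yet. Each next() runs the body up to the next yield.
first = next(as_generator)
second = next(as_generator)
print(first, second)  # 2 4
```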

Generators

  • A generator lets us compute one result in a sequence at a time
  • We can keep yielding control back to the function until we don't need more results...
  • ...or until the function has no more results to give us.
  • If we reach that point, we say the generator is exhausted.
  • We don't have to exhaust a generator. We can stop earlier if we want:
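Here is a small sketch of exhaustion and of stopping early, driving the generator by hand with next(). The generator countdown is a made-up example:

```python
def countdown(start):
    # Hypothetical example generator: yields start, start - 1, ..., 1.
    current = start
    while current > 0:
        yield current
        current -= 1

gen = countdown(3)
values = [next(gen), next(gen), next(gen)]
print(values)  # [3, 2, 1]

# The generator is now exhausted; asking for more raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True

# Stopping early: take one value from a fresh generator and walk away.
first = next(countdown(100))
print(first)  # 100
```

A for loop handles StopIteration for us, which is why we normally never see it.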

Generators let us work with infinity:


        def all_positive_integers():
            current = 1
            while True:
                yield current
                current += 1

        for number in all_positive_integers():
            print(number)
            if number > 1000000:
                break
        print("Ok that's enough.")
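Instead of breaking out of the loop by hand, we can also let the standard library's itertools.islice take just the first few items. A minimal sketch:

```python
from itertools import islice

def all_positive_integers():
    current = 1
    while True:
        yield current
        current += 1

# islice stops asking after 5 items, so the infinite loop never runs away.
first_five = list(islice(all_positive_integers(), 5))
print(first_five)  # [1, 2, 3, 4, 5]
```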
    

        def evens_until(start, until):
            current = start
            while current <= until:
                if current % 2 == 0:
                    yield current
                current += 1

        result = evens_until(2, 10)  # Result is a generator.
        for number in result:
            print(number)       # Prints 2, 4, 6, 8, 10
        for number in result:
            print(number)       # Prints nothing!
    

Limitation 1:

You can only iterate over a generator once.
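If we need a second pass, one option is simply to call the generator function again, which gives us a fresh generator. A sketch (the names first_pass and second_pass are just for illustration):

```python
def evens_until(start, until):
    current = start
    while current <= until:
        if current % 2 == 0:
            yield current
        current += 1

first_pass = evens_until(2, 10)
first_result = list(first_pass)
leftover = list(first_pass)       # the same generator, already exhausted
print(first_result)  # [2, 4, 6, 8, 10]
print(leftover)      # []

# Calling the function again creates a brand-new generator.
second_pass = evens_until(2, 10)
second_result = list(second_pass)
print(second_result)  # [2, 4, 6, 8, 10]
```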


        def evens_until(start, until):
            current = start
            while current <= until:
                if current % 2 == 0:
                    yield current
                current += 1

        result = evens_until(2, 10)
        result[1]  # Raises TypeError: 'generator' object is not subscriptable
    

Limitation 2:

You can't access an element via index.

Work-around:


        def evens_until(start, until):
            current = start
            while current <= until:
                if current % 2 == 0:
                    yield current
                current += 1

        result = evens_until(2, 10)
        result_as_list = list(result)
    

You can convert a generator to a list if a list is what you need.

This will really blow your mind:

A coroutine is like a generator that works in both directions: besides yielding values out to the caller, it can receive values the caller sends in.
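A minimal sketch of that idea using a plain generator and its send() method. The function running_total is a made-up example, and this is only one flavor of coroutine (Python's async/await coroutines are a separate topic):

```python
def running_total():
    total = 0
    while True:
        # yield hands total out AND waits for the caller to send a value in.
        received = yield total
        total += received

accumulator = running_total()
start = next(accumulator)          # run to the first yield; gives 0
after_five = accumulator.send(5)   # resumes with received = 5; gives 5
after_ten = accumulator.send(10)   # resumes with received = 10; gives 15
print(start, after_five, after_ten)  # 0 5 15
```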

What questions do you have?

So far, we have always operated under the assumption that when we define a function we have to know exactly what the arguments can be:

        def scrape(html,
                   name_selector='[itemprop="name"]',
                   ingredient_selector='[itemprop="recipeIngredient"]',
                   step_selector='[itemprop="step"]'):
            [...]
    

But what if we wanted to write a function where we don't know what the arguments would be in advance?

Or maybe we don't even care?

This is called a variadic function, and Python lets us create them.

Imagine I want to write a function that behaves like this:


        add(1, 2)        # Returns 3
        add(1, 2, 3)     # Returns 6
        add(1, 2, 3, 4)  # Returns 10
    

How can I write a function that takes different numbers of arguments?

Here's how:


        def add(*args):
            total = 0
            for number in args:
                total += number
            return total
    

The asterisk indicates that when the function is called, the positional arguments, no matter how many there are, will be stored in a tuple called args.
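A quick way to see what args holds is to return it. This sketch uses a hypothetical helper called show_args, and also shows the reverse direction: an asterisk at the call site spreads an existing sequence out into separate arguments.

```python
def add(*args):
    total = 0
    for number in args:
        total += number
    return total

def show_args(*args):
    # Hypothetical helper that just hands back whatever it received.
    return args

print(show_args(1, 2, 3))  # (1, 2, 3)
print(type(show_args()))   # <class 'tuple'>

# Unpacking at the call site: same as writing add(1, 2, 3, 4).
numbers = [1, 2, 3, 4]
print(add(*numbers))  # 10
```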

We can mix variadic and non-variadic arguments


        def what_things_are_there(first_thing,
                                  second_thing,
                                  *rest_of_things):
            print(f"The first thing is: {first_thing}")
            print(f"The second thing is: {second_thing}")
            print(f"There are {len(rest_of_things)} other things")
    

But the *args part has to come after the regular positional parameters. Only keyword-only parameters and **kwargs can follow it.
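Strictly speaking, named parameters may appear after the *args part, but they can then only be passed by keyword. A small sketch, with a made-up function join_things and a made-up parameter separator:

```python
def join_things(first_thing, *rest_of_things, separator=", "):
    # separator comes after *rest_of_things, so it can only be passed by name.
    return separator.join([first_thing, *rest_of_things])

default_joined = join_things("a", "b", "c")
custom_joined = join_things("a", "b", "c", separator=" | ")
print(default_joined)  # a, b, c
print(custom_joined)   # a | b | c
```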

Keyword-arguments work too:


        # returns False:
        is_there_a_cow(horse=True, chicken=True, emu=True)
        is_there_a_cow(cow=False, chicken=True, emu=True)
        # returns True:
        is_there_a_cow(horse=True, cow=True, emu=True)
    

A double asterisk is used for variadic keyword arguments:


        def is_there_a_cow(**kwargs):
            return "cow" in kwargs and kwargs["cow"] is True
    

kwargs will be a dictionary where the keys are the argument names and the values are the argument values.
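We can look at that dictionary directly. This sketch uses a hypothetical helper called describe, and also shows that a double asterisk at the call site spreads an existing dictionary out into keyword arguments:

```python
def describe(**kwargs):
    # Hypothetical helper that just hands back whatever it received.
    return kwargs

def is_there_a_cow(**kwargs):
    return "cow" in kwargs and kwargs["cow"] is True

print(describe(horse=True, cow=False))  # {'horse': True, 'cow': False}

# Unpacking at the call site: same as is_there_a_cow(horse=True, cow=True).
farm = {"horse": True, "cow": True}
print(is_there_a_cow(**farm))  # True
```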

Decorators

Let's say I want to write a "magic" function that allows me to modify any given function so that we first print out the arguments of the given function before calling it.

The magic function should work for any function it's given.

The magic function should return a new function that acts exactly like the given function except it prints out the arguments first.

I can do this right now with the Python we have learned:


        def magic(some_function):

            def wrap(*args, **kwargs):
                print(f"Arguments are: {args}.")
                print(f"Keyword-arguments are: {kwargs}.")
                return some_function(*args, **kwargs)

            return wrap
    

magic() is an example of a higher-order function.

It takes another function as an argument and returns a function as a result.

In Python (and many other languages), functions are values like any other: they can be assigned to variables, passed as arguments, and returned from other functions.
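A short sketch of what that looks like in practice. All of the names here (square, cube, operations, apply_twice) are made up for illustration:

```python
def square(number):
    return number * number

def cube(number):
    return number * number * number

# Functions can be stored in a dictionary like any other value...
operations = {"square": square, "cube": cube}

# ...assigned to another name...
also_square = square

# ...and passed to other functions as arguments.
def apply_twice(function, value):
    return function(function(value))

print(operations["cube"](2))   # 8
print(also_square(4))          # 16
print(apply_twice(square, 3))  # 81
```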

Let's see this in use:


        from requests import get

        magical_get = magic(get)

        # Will first print out:
        #    Arguments are: ('http://www.google.com',).
        #    Keyword-arguments are: {'auth': 'mypassword'}.
        response = magical_get("http://www.google.com",
                               auth="mypassword")

    

Another example:


        # Remember this one?
        def add(*args):
            total = 0
            for number in args:
                total += number
            return total

        magic_add = magic(add)

        # Will first print out:
        #    Arguments are: (1, 2).
        #    Keyword-arguments are: {}.
        magic_add(1, 2)
        # Will first print out:
        #    Arguments are: (3, 4, 5).
        #    Keyword-arguments are: {}.
        magic_add(3, 4, 5)
    

Python gives us some syntax to allow us to easily "decorate" our functions with magic:


        @magic
        def add(*args):
            total = 0
            for number in args:
                total += number
            return total
    

This is equivalent to:


        def add(*args):
            total = 0
            for number in args:
                total += number
            return total

        add = magic(add)
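One side effect of wrapping: the decorated function reports the wrapper's name (wrap) instead of its own. The standard library's functools.wraps is commonly used to carry the original name and docstring over. A sketch, reusing magic from above with that one extra line:

```python
import functools

def magic(some_function):

    @functools.wraps(some_function)  # copy the name and docstring onto wrap
    def wrap(*args, **kwargs):
        print(f"Arguments are: {args}.")
        print(f"Keyword-arguments are: {kwargs}.")
        return some_function(*args, **kwargs)

    return wrap

@magic
def add(*args):
    """Add up any number of arguments."""
    return sum(args)

result = add(1, 2)   # prints the two argument lines first
print(result)        # 3
print(add.__name__)  # add (without functools.wraps this would be: wrap)
```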
    

We can also decorate the methods of a class:


        from datetime import datetime

        class Cat:

            def __init__(self, name, age):
                self.name = name
                self.age = age

            def birth_year(self):
                now = datetime.now()
                return now.year - self.age
    

Usage:


        percy = Cat("Percy", 4)
        percy.birth_year()  # 2016 (when run in 2020)
    

But we can also do this:


        from datetime import datetime

        class Cat:

            def __init__(self, name, age):
                self.name = name
                self.age = age

            @property
            def birth_year(self):
                now = datetime.now()
                return now.year - self.age
    

Usage:


        percy = Cat("Percy", 4)
        percy.birth_year  # 2016 (when run in 2020)
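A property is recomputed every time it is read, and unless we also define a setter it cannot be assigned to. A small sketch, using a made-up property age_in_months so the output does not depend on today's date:

```python
class Cat:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    @property
    def age_in_months(self):
        # Hypothetical computed attribute, chosen to keep the output fixed.
        return self.age * 12

percy = Cat("Percy", 4)
before = percy.age_in_months
percy.age = 5
after = percy.age_in_months   # recomputed from the new age
print(before, after)  # 48 60

# No setter was defined, so assigning to the property fails.
try:
    percy.age_in_months = 100
    assignment_worked = True
except AttributeError:
    assignment_worked = False
print(assignment_worked)  # False
```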
    

What's the point of decorators?

  • They allow you to easily change the way a function behaves without needing to change the function itself.
  • Can be useful for logging, error handling, debugging
  • They can also be useful for certain domain-specific cases, as we'll see soon
  • You don't need decorators, but they can make life much easier.
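As one illustration of the logging and error-handling use case, here is a sketch of a decorator that reports any exception and then re-raises it. The names log_errors and divide are made up for this example:

```python
import functools

def log_errors(some_function):
    # Hypothetical decorator: report any exception, then let it propagate.
    @functools.wraps(some_function)
    def wrap(*args, **kwargs):
        try:
            return some_function(*args, **kwargs)
        except Exception as error:
            print(f"{some_function.__name__} failed: {error!r}")
            raise
    return wrap

@log_errors
def divide(a, b):
    return a / b

print(divide(6, 3))  # 2.0

caller_saw_error = False
try:
    divide(1, 0)  # prints a "divide failed: ..." line before re-raising
except ZeroDivisionError:
    caller_saw_error = True
print(caller_saw_error)  # True
```

The divide function itself knows nothing about logging; the behavior is added from the outside, and removing the @log_errors line takes it away again.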

What questions do you have?