Readability Counts – Line Breaks – Regular Expressions



readability-counts

25 minute talk on writing readable code

On Github treyhunner / readability-counts

Readability Counts

Trey Hunner / @treyhunner

Hey everybody! 😄

So let's talk about readability.

“Readability is the ease with which a reader can understand a written text.” —Wikipedia

  • alright, so before we talk about readability, let's make sure we're all on the same page about what readability means
  • readability is just the measure of how easily our code can be read
  • I assume you're all here because you care about readability.
  • But why do you care about readability?
  • What makes readability important?

Why does readability matter?

  • Readability is about making our lives easier
  • Code is read more often than it is written
  • Better readability means easier maintainability
  • Readable code makes on-boarding easier
  • every time you fix a bug, change some functionality, or add a feature, you probably need to read some code
  • so you probably read code more often than you write it
  • also code doesn't always completely stagnate after it's written. Sometimes you do need to change code.
  • and in order to change code you sort of need to read it.
  • so readability is a prerequisite for maintainability. You cannot maintain code without reading it.
  • Lastly, not all teams are immortal. You do sometimes need to hire developers and those developers will need to be on-boarded.
  • It's a lot easier to on-board a new team member if they can read your code with ease

We're not discussing

  • Writability
  • Performance
  • Python-speaking robots
  • Before we get started, let's make it clear what we're not talking about
  • We're not going to talk about how easy it is to write code
  • We're also not going to talk about how easy it is for a computer to run your code
  • During this talk, we're assuming that source code is primarily designed for human consumption
  • We're talking specifically about how easy it is for humans to read your code

We will discuss

  • Whitespace, line breaks, and code structure
  • Giving concepts a name
  • Choosing the right construct
  • we'll talk about how you structure your code: which basically boils down to where you put your line breaks
  • We'll also discuss naming unnamed things and naming things descriptively
  • And finally we'll re-consider some of the programming idioms that we use every day
  • We'll be looking at a lot of small code examples. I'll tweet these slides out after the talk, so you can re-read each of these examples later.

Structuring code

  • Line length: number of characters in one line of text
  • Line length is a human limitation, not a technical one
  • Focus on readability when wrapping lines, not line length
  • 🖱 Let's talk about the structure of our code first
  • 🖱 In the modern age, line length is no longer a technical limitation. Screens are really wide now.
  • Line length is about readability
  • Long lines are hard to read.
  • Line length is a little flawed though
  • Line length is only approximately correlated with readability
  • 🖱 When discussing line length, remember that short lines are not the end goal: readability is
  • So when inserting line breaks make sure that you're focusing on readability, not line length.

Line Breaks

Let's take a look at an example

employee_hours = (schedule.earliest_hour for employee in
                  self.public_employees for schedule in
                  employee.schedules)
return min(h for h in employee_hours if h is not None)

  • this code has a line length under 60 characters
  • as you read this code, you're probably trying to figure out what it does
  • but you won't figure out what it does until you've discovered the structure of the code
  • you'll eventually notice that the 1st statement includes a generator expression with 2 loops
  • and that the 2nd statement includes a generator expression with a single loop and a condition
  • this code is hard to read because the line breaks are inserted completely arbitrarily
  • the author simply wrapped their lines whenever they were approaching a certain line length
  • the author was valuing the line length as most important. They completely forgot about readability.

employee_hours = (
    schedule.earliest_hour
    for employee in self.public_employees
    for schedule in employee.schedules
)
return min(
    hour
    for hour in employee_hours
    if hour is not None
)

  • is this code more readable?
  • this is the same code as before
  • but the line breaks have been moved around to split the code up into logical parts
  • these line breaks were not inserted arbitrarily
  • these line breaks were inserted with the express goal of improving readability

Regular Expressions

Let's talk about regular expressions

def is_valid_uuid(uuid):
    """Return True if given variable is a valid UUID."""
    return bool(re.search(r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}'
                          r'-[0-9a-f]{4}-[0-9a-f]{12}$',
                          uuid, re.IGNORECASE))

  • this function returns True if the string it was given is a valid UUID
  • A UUID consists of a bunch of hexadecimal digits with dashes in between
  • this is a function that uses a regular expression to validate the string as a UUID

  • is this code readable?

  • this code is difficult to read because regular expressions are information dense

  • So regular expressions are code. They're like mini programs that are written all on one line, without whitespace or comments

  • you wouldn't write Python code all on one line without spaces or comments. Why do you write your regular expression code that way?

def is_valid_uuid(uuid):
    """Return True if given variable is a valid UUID."""
    UUID_RE = re.compile(r'''
        ^
        [0-9a-f] {8}    # 8 hex digits
        -
        [0-9a-f] {4}    # 4 hex digits
        -
        [0-9a-f] {4}    # 4 hex digits
        -
        [0-9a-f] {4}    # 4 hex digits
        -
        [0-9a-f] {12}   # 12 hex digits
        $
    ''', re.IGNORECASE | re.VERBOSE)
    return bool(UUID_RE.search(uuid))

  • when using regular expressions: always enable VERBOSE mode!
  • With verbose mode, we can wrap a regular expression over multiple lines reducing the information density
  • We've broken those 2 lines into 13
  • We've also added extra whitespace and comments, but those line breaks are a huge help on their own
  • when using regular expressions: always turn on VERBOSE mode!
  • VERBOSE mode allows you to add extra line breaks which improve readability
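
Here is a runnable sketch of the verbose-mode idea. Two things in it are choices made for this sketch and not something the slide shows: the flags are passed to re.compile (a compiled pattern's search method takes positions, not flags), and the pattern is compiled once at module level. The sample strings are made up for illustration.

```python
import re

# Compiled once at import time. VERBOSE lets the pattern span lines and
# carry comments; IGNORECASE accepts upper-case hex digits.
UUID_RE = re.compile(r'''
    ^
    [0-9a-f]{8}     # 8 hex digits
    -
    [0-9a-f]{4}     # 4 hex digits
    -
    [0-9a-f]{4}     # 4 hex digits
    -
    [0-9a-f]{4}     # 4 hex digits
    -
    [0-9a-f]{12}    # 12 hex digits
    $
''', re.IGNORECASE | re.VERBOSE)


def is_valid_uuid(uuid):
    """Return True if the given string is a valid UUID."""
    return bool(UUID_RE.search(uuid))


print(is_valid_uuid('123e4567-e89b-12d3-a456-426614174000'))  # True
print(is_valid_uuid('123E4567-E89B-12D3-A456-426614174000'))  # True
print(is_valid_uuid('not-a-uuid'))                            # False
```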

Function Calls

    • Let's look at another example.
    • Let's say we're creating a Django model
    • and one of our model fields has a whole bunch of arguments being passed to it

default_appointment = models.ForeignKey(
    'AppointmentType', null=True, on_delete=models.SET_NULL,
    related_name='+')
default_appointment = models.ForeignKey('AppointmentType',
                                        null=True,
                                        on_delete=models.SET_NULL,
                                        related_name='+')
default_appointment = models.ForeignKey(
    'AppointmentType',
    null=True,
    on_delete=models.SET_NULL,
    related_name='+')

  • So we're passing a lot of arguments into this ForeignKey field
  • and it's feeling unwieldy
  • (pause)
  • Is this a good way to wrap our code over multiple lines?
  • 🖱 What about this way?
  • Is this better? Or is it worse?
  • 🖱 What about this one? How does it compare?
  • Would anything change if we were using exclusively keyword arguments?

default_appointment = models.ForeignKey(
    othermodel='AppointmentType', null=True,
    on_delete=models.SET_NULL, related_name='+')
default_appointment = models.ForeignKey(othermodel='AppointmentType',
                                        null=True,
                                        on_delete=models.SET_NULL,
                                        related_name='+')
default_appointment = models.ForeignKey(
    othermodel='AppointmentType',
    null=True,
    on_delete=models.SET_NULL,
    related_name='+')

  • Would that affect our choice?
  • (pause)
  • So personally... I usually prefer that last strategy for wrapping my lines, especially with all keyword arguments
  • That first one is difficult to read and the second one can be problematic when you have really long lines like we do right now
  • Let's take a look at that last strategy more closely

default_appointment = models.ForeignKey(
    othermodel='AppointmentType',
    null=True,
    on_delete=models.SET_NULL,
    related_name='+')
          

  • (pause)
  • Would it be better to leave that closing parenthesis on its own line? 🖱

default_appointment = models.ForeignKey(
    othermodel='AppointmentType',
    null=True,
    on_delete=models.SET_NULL,
    related_name='+'
)

  • (pause)
  • What if we added a trailing comma? Would this be an improvement or is this worse? 🖱

default_appointment = models.ForeignKey(
    othermodel='AppointmentType',
    null=True,
    on_delete=models.SET_NULL,
    related_name='+',
)

  • Personally, I prefer this one the most when I'm calling a function or a class on its own
  • But not when I'm defining a function
  • Now... I'm certain that many of you disagree with my preferences here... and that's okay
  • The fact that we disagree means that we must document the way that we wrap functions in our style guide for every. project. we. create.
  • You do have a style guide for every project you work on... right?
  • (pause and lift an eyebrow)
  • Consistency lies at the heart of readability.
  • Make sure you define a style guide with explicit conventions in every single Python project you make.

PEP 8

  • PEP 8 is the Python style guide
  • Read PEP 8 every 6 months
  • PEP 8 is not your project's style guide
  • PEP 8 is just a sane starting point
  • Speaking of style guides, let's talk about PEP 8
  • 🖱 PEP8 is the Python style guide
  • 🖱 If you aren't familiar with PEP8, read it
  • If you think you're familiar with PEP8, re-read it!
  • 🖱 In fact, why not make a habit of re-reading PEP8 every 6 months and pondering the relationship between PEP8 and your code?
  • 🖱 So PEP 8 is the Python style guide, but not your project's style guide
  • You need an opinionated code style guide.
  • 🖱 PEP 8 is not opinionated enough for your project. It's a great starting point though

Recap: code structure

  • Keep your text width narrow
  • Do not rely on automatic line wrapping
  • Insert line breaks with readability in mind
  • Document your code styles and stick to them
  • do poets use a maximum line length to wrap their lines?
  • No! poets break up their lines with purpose.
  • in poetry: inserting a line break is an art form
  • in code: inserting a line break is also an art form
  • so as programmers, we should wrap our lines with great care
  • And remember: all of your projects should have a style guide that goes beyond PEP 8.
  • Your code style conventions should be explicitly documented.

Naming things

  • Important concepts deserve names
  • Naming things is hard because describing things is hard
  • A name should fully & accurately describe its subject
  • Don't be afraid of using long variable names
  • If a concept is important, it needs a name
  • 🖱 Names give you something to communicate about
  • 🖱 unfortunately, naming things is hard
    • naming a thing requires describing it. And describing things isn't always easy.
    • 🖱 Not only that, once you've described a thing, you need to shorten your description into a name. And that's not easy either.
  • 🖱 if you can't think of a good short name, use a long and descriptive one. That's a lot better than a subpar name. Plus you can always shorten it tomorrow.
  • 🖱 so worry about accuracy, not name length
  • (pause)

Better Names

Let's take a look at some code with poor variable names

sc = {}
for i in csv_data:
    sc[i[0]] = i[1]

  • I bet you do not know what sc stands for in this code
  • You might know if you had more context, but if you're new to this code you'll search around for a while to discover that it stands for state_capitals

state_capitals = {}
for i in capitals_csv_data:
    state_capitals[i[0]] = i[1]

  • don't use two-letter variable names in Python, use descriptive names
  • (pause)
  • speaking of descriptive names, what does the variable i represent here?
  • is it a two-tuple?
  • is i[0] capitals or states?
  • or is it something else?
  • (pause)
  • whenever you see an index access this should be a red flag
  • index access can usually be replaced by variables

state_capitals = {}
for s, c, *_ in capitals_csv_data:
    state_capitals[s] = c

  • we can do this with tuple unpacking
  • you can probably tell now that s means state and c means capital
  • so avoid using arbitrary indexes
  • and when possible, use tuple unpacking instead
  • it's often a lot more explicit and a lot more clear
  • (pause)
  • now while you did probably guess s and c in this case, there's no reason not to use real words for these variable names

state_capitals = {}
for state, capital, *_ in capitals_csv_data:
    state_capitals[state] = capital

  • (pause)
  • name every variable with care
  • optimize for maximum accuracy and completeness...
  • not short variable names
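
To try the tuple-unpacking version yourself, here is a small self-contained sketch. The CSV contents and the third column are made up for illustration; the slide doesn't say where capitals_csv_data comes from, so reading it with csv.reader is an assumption.

```python
import csv
import io

# Hypothetical CSV data: state, capital, then an extra column we ignore.
capitals_csv = io.StringIO(
    "Oregon,Salem,1859\n"
    "California,Sacramento,1850\n"
)
capitals_csv_data = csv.reader(capitals_csv)

state_capitals = {}
for state, capital, *_ in capitals_csv_data:
    # *_ soaks up any remaining columns so each row can have extras.
    state_capitals[state] = capital

print(state_capitals)  # {'Oregon': 'Salem', 'California': 'Sacramento'}
```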

Name All The Things

Let's look at an example of code that could use some more variables

def detect_anagrams(word, candidates):
    anagrams = []
    for candidate in candidates:
        if (sorted(word.upper()) == sorted(candidate.upper())
                and word.upper() != candidate.upper()):
            anagrams.append(candidate)
    return anagrams

  • this code returns a list of all anagrams of word that are in the candidates list
  • it's not bad code, but it's also not the most descriptive code
  • that if statement in particular is pretty loaded
  • what if we abstracted out that logic into its own function

def detect_anagrams(word, candidates):
    anagrams = []
    for candidate in candidates:
        if is_anagram(word, candidate):
            anagrams.append(candidate)
    return anagrams

  • With that is_anagram function, I think it's a lot more obvious now that we're checking whether two words are in fact anagrams
  • we've broken the problem down and described the process that we're using
  • and at the same time, we've hidden away the details of our actual algorithm
  • let's take a look at that is_anagram function

def is_anagram(word1, word2):
    return (sorted(word1.upper()) == sorted(word2.upper())
            and word1.upper() != word2.upper())

  • so this is pretty much exactly what our if statement had in it before
  • it could still use some work
  • word1.upper() appears twice and so does word2.upper().
  • so we have some code duplication going on
  • let's fix that

def is_anagram(word1, word2):
    word1, word2 = word1.upper(), word2.upper()
    return sorted(word1) == sorted(word2) and word1 != word2

  • that's a lot better
  • I find the conditional expression on that last line a little easier to read
  • (pause)
  • I think there's still room for improvement though
  • so one strategy I use for checking code clarity is to read my code aloud to test how descriptive it is.
  • here we're:
    • sorting our words
    • checking whether sorted versions are equal
    • and then checking whether the unsorted versions are not equal
  • that description isn't very helpful
  • let's write a comment that describes our intent a little better

def is_anagram(word1, word2):
    word1, word2 = word1.upper(), word2.upper()
    return (
        sorted(word1) == sorted(word2)  # words have same letters
        and word1 != word2  # words are not the same
    )

  • we've added two new comments
  • if we ignore the code and read the comments we'll see that:
    • we're verifying that the words have the same letters
    • and whether they're different words
  • (pause)
  • whenever you find yourself adding a comment to your code,
  • that might be a hint that you need to make another variable name
  • remember that comments describe things...
  • and variables make that description into code
  • so let's turn those comments into some descriptive variable names

def is_anagram(word1, word2):
    word1, word2 = word1.upper(), word2.upper()
    are_different_words = (word1 != word2)
    have_same_letters = (sorted(word1) == sorted(word2))
    return have_same_letters and are_different_words

  • so we've turned those two conditional statements into two new variables that describe what they do
  • that last line says that we're checking whether these words have the same letters but are different words
  • that's exactly what our comments said before this
  • those new variables, are_different_words and have_same_letters, made the logic of our function a little bit more explicit
  • this code is more clear and more readable because we're conveying the intent of our algorithm and not just the details

def detect_anagrams(word, candidates):
    anagrams = []
    for candidate in candidates:
        if is_anagram(word, candidate):
            anagrams.append(candidate)
    return anagrams


def is_anagram(word1, word2):
    word1, word2 = word1.upper(), word2.upper()
    are_different_words = word1 != word2
    have_same_letters = sorted(word1) == sorted(word2)
    return have_same_letters and are_different_words

  • so we ended up adding four extra lines to our code
  • but we broke down our process a bit
  • so that our code is a little more understandable at a glance
  • (pause)
  • Now you may think this is a silly example
  • I mean what we started with wasn't really that complicated
  • but even if we do decide to revert some of the changes we just made, this was a worthwhile mental exercise
  • (pause)
  • the exercise of refactoring your code to be more self-documenting can really reframe the way that you think about your code
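
If you want to run the refactored version, here is a sketch of the two functions together. The docstrings, the explicit return in detect_anagrams, and the sample words are additions for this sketch so the result can be printed.

```python
def detect_anagrams(word, candidates):
    """Return the candidates that are anagrams of word."""
    anagrams = []
    for candidate in candidates:
        if is_anagram(word, candidate):
            anagrams.append(candidate)
    return anagrams


def is_anagram(word1, word2):
    """Return True if the words differ but use the same letters."""
    word1, word2 = word1.upper(), word2.upper()
    are_different_words = word1 != word2
    have_same_letters = sorted(word1) == sorted(word2)
    return have_same_letters and are_different_words


# "Listen" is skipped: ignoring case, it's the same word, not an anagram.
print(detect_anagrams("listen", ["enlists", "inlets", "Listen", "silent"]))
# ['inlets', 'silent']
```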

So Many Functions

Let's take a look at a complex Django model method

def update_appointment_types(self):
    """Delete/make appt. types and set default appt. type"""

    self.appt_types.exclude(specialty=self.specialty).delete()

    new_types = self.specialty.appt_types.exclude(agent=self)
    self.appt_types.bulk_create(
        AppointmentType(agent=self, appointment_type=type_)
        for type_ in new_types
    )

    old_default_id = self.default_appt_id
    self.default_appt_type = self.specialty.default_appt_type
    if self.default_appt_type.id != old_default_id:
        self.save(update_fields=['default_appt_type'])

  • So this function is a method that lives inside a Django model
  • There's a lot going on in this method
  • Instead of reading all this code, I want you to unfocus your eyes and just look at the structure
  • Let me help you out.

(the same method again, blurred so that only the shape of its three blocks is visible)

  • There. Now I can talk without you getting distracted by reading all that code
  • The first thing you'll notice is that this code is broken up into three sections
  • This code was broken into three sections because each of these sections performs a different task

  • Let's add some comments to these sections

def update_appointment_types(self):
    """Delete/make appt. types and set default appt. type"""

    # Delete appointment types for specialty besides current one
    self.appt_types.exclude(specialty=self.specialty).delete()

    # Create new appointment types based on specialty (if needed)
    new_types = self.specialty.appt_types.exclude(agent=self)
    self.appt_types.bulk_create(
        AppointmentType(agent=self, appointment_type=type_)
        for type_ in new_types
    )

    # Set default appointment type based on specialty
    old_default_id = self.default_appt_id
    self.default_appt_type = self.specialty.default_appt_type
    if self.default_appt_type.id != old_default_id:
        self.save(update_fields=['default_appt_type'])

  • In my opinion, adding those comments improved the readability of this code
  • We can understand very quickly what each section is actually doing

  • Depending on why you're reading this code, you might even be able to get away with only reading the comments
  • which is great. We can understand what this code is doing without needing to understand all the details.
  • (pause)
  • but sometimes comments are a hint that we might have forgotten to name some things

  • we have comments that describe these sections of code
  • but we could also name them
  • (pause)
  • let's turn these comments into names by putting each of these code blocks into its own method

def _delete_stale_appointment_types(self):
    """Delete appointment types for specialties besides ours"""
    self.appt_types.exclude(specialty=self.specialty).delete()

def _create_new_appointment_types(self):
    """Create new appointment types based on specialty if needed"""
    new_types = self.specialty.appt_types.exclude(agent=self)
    self.appt_types.bulk_create(
        AppointmentType(agent=self, appointment_type=type_)
        for type_ in new_types
    )

def _update_default_appointment_type(self):
    """Set default appointment type based on specialty"""
    old_default_id = self.default_appt_id
    self.default_appt_type = self.specialty.default_appt_type
    if self.default_appt_type.id != old_default_id:
        self.save(update_fields=['default_appt_type'])

  • So here we've named each of these three sections of code by making them separate functions
  • These functions are methods that live next to our original method. They're helper functions of sorts.
  • We left the comments in as documentation strings for clarity
  • now that we've made separate methods for these, we need to call them in our original method

def update_appointment_types(self):
    """Delete/make appt. types and set default appt. type"""
    self._delete_stale_appointment_types()
    self._create_new_appointment_types()
    self._update_default_appointment_type()

  • now I don't know about you, but I find this a lot easier to digest than the three original, completely undocumented code blocks
  • we've named the three actions we're doing
  • and most of the time, we probably don't really have to worry about the details of those actions: these names are good enough

Recap: naming things

  • Use whole words for variable names
  • Try to use tuple unpacking instead of index lookups
  • Read your code aloud to test whether it's descriptive
  • Try converting comments to better variable names
  • Create names for unnamed concepts/code
  • In general, try to make your code self-documenting
  • Let's do a brief recap
  • Read your code aloud to ensure you're describing the intent of your algorithm in detail
  • Remember that comments are great for describing things
  • But sometimes a comment is just the first step toward a better variable name
  • Make sure you give a name to everything you can
  • and in general, strive for descriptive and self-documenting code

Programming idioms

  • Special purpose constructs can reduce complexity
  • When possible, use constructs with specific intent
  • Let's talk about some of the code constructs that we use
  • There are usually multiple ways to write the same code
  • Because there are often multiple tools that can solve the same problem
  • When given the opportunity, I often prefer to use special-purpose tools over general-purpose tools
  • As long as the tools are easy to understand
  • (pause)
  • Specific problems call for specific solutions

Clean Up

Let's take a look at exception handling

db = DBConnection("mydb")
try:
    records = db.query_all()
finally:
    db.close()

  • Here we're opening a database connection, reading data from it, and closing the connection
  • We need to make sure we close our connection even if an exception occurs so we're using a try-finally block
  • Whenever you have a section of code that's wrapped in a try-finally or has some kind of cleanup step...
  • think about whether you should use a context manager instead

class connect:
    def __init__(self, path):
        self.connection = DBConnection(path)

    def __enter__(self):
        return self.connection

    def __exit__(self, exc_type, exc_value, traceback):
        self.connection.close()

  • It's not that hard to make your own context managers
  • All you need is an object with a dunder enter method and a dunder exit method
  • Oh... and for those who don't know: dunder stands for double underscore because there's two underscores on each side of these names
  • Let's take a look at how we can use this context manager

with connect("mydb") as db:
    db.query_all()

  • So this is somewhat simpler than that try-finally statement we were using before
  • As a reader, we don't have to worry whether we're closing our database connection after we open it
  • That's pretty nice
  • But we don't always have to write our own context managers

from contextlib import closing

with closing(DBConnection("mydb")) as db:
    db.query_all()

  • The closing context manager from the standard library does pretty much the same thing
  • So whenever you need a cleanup step: think about using a context manager
  • You can probably find one that fits your use case, but you can also make your own
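
Here is a runnable sketch of both approaches side by side. DBConnection in this sketch is a hypothetical stand-in with a made-up closed attribute and canned query results, only there so the cleanup can be observed; it isn't a real database library.

```python
from contextlib import closing


class DBConnection:
    """Hypothetical stand-in for a real database connection."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def query_all(self):
        return [("row", 1), ("row", 2)]

    def close(self):
        self.closed = True


class connect:
    """Context manager that closes the connection on exit."""

    def __init__(self, path):
        self.connection = DBConnection(path)

    def __enter__(self):
        return self.connection

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs whether or not the with block raised an exception.
        # Returning None (falsy) lets any exception propagate.
        self.connection.close()


with connect("mydb") as db:
    records = db.query_all()
print(db.closed)  # True

# contextlib.closing calls .close() on the object when the block ends.
with closing(DBConnection("mydb")) as other_db:
    other_records = other_db.query_all()
print(other_db.closed)  # True
```

Note that __exit__ always receives three arguments describing any exception raised in the block, even if you don't use them.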

Lists from lists

Let's talk about for loops

employees = []
for calendar, availabilities in calendar_availabilities:
    if availabilities:
        employees.append(calendar.employee)

This code loops over something

You can tell that even though it's blurred out.

This code actually does a little more than that though...

employees = []
for calendar, availabilities in calendar_availabilities:
    if availabilities:
        employees.append(calendar.employee)

Specifically: the purpose of this code is to

loop over something, check a condition, and create a new list from items that pass that condition

We're using a list append, an if statement, and a for loop to accomplish this task.

There's a better way to write this code.

employees = [
    calendar.employee
    for calendar, availabilities in calendar_availabilities
    if availabilities
]

  • Here we're accomplishing the exact same task...
  • but instead of using a for loop, an if statement, and an append call
  • we're using a list comprehension
  • this code isn't shorter, but it does contain less unnecessary information for our brains to process while reading
  • when we glance at this code we don't think looping
  • instead we think: one list is being transformed into another
  • that's a better description of what the code actually does
  • (pause)
  • when you have a specific problem, use a specific tool
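
To see the comprehension run, here is a sketch with made-up data. The Calendar namedtuple and the names and times in it are hypothetical stand-ins for whatever objects the real code uses.

```python
from collections import namedtuple

# Hypothetical stand-in for the calendar objects in the slide.
Calendar = namedtuple("Calendar", ["employee"])

calendar_availabilities = [
    (Calendar(employee="Ada"), ["09:00", "13:00"]),
    (Calendar(employee="Grace"), []),
    (Calendar(employee="Linus"), ["15:30"]),
]

# One line per logical part: what we collect, what we loop over,
# and which items we keep.
employees = [
    calendar.employee
    for calendar, availabilities in calendar_availabilities
    if availabilities
]

print(employees)  # ['Ada', 'Linus']
```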

Operator Overloading

Let's say we're creating a class to represent items in a shopping cart.

class ShoppingCart:
    def contains(self, product):
        """Return True if cart contains the product."""

    def add(self, product, quantity):
        """Add the quantity of a product to the cart."""

    def remove(self, product):
        """Completely remove a product from the cart."""

    def set(self, product, quantity):
        """Set the quantity of a product in the cart."""

    @property
    def count(self):
        """Return product count in cart, ignoring quantities."""

    @property
    def is_empty(self):
        """Return True if cart is empty."""

  • this shopping cart class is a wrapper around a dictionary
  • notice that this class implements a lot of methods that really just access that dictionary in a fancy way
  • there are methods to check for containment, add and remove quantities of products, and ask questions about the status of our cart: what its product count is and whether or not it's empty
  • (pause)
  • these methods should all seem a little familiar

Before                  After
cart.contains(item)     item in cart
cart.set(item, q)       cart[item] = q
cart.add(item, q)       cart[item] += q
cart.remove(item)       del cart[item]
cart.count              len(cart)
cart.is_empty           not cart

  • all of these methods correspond to operations that work automatically on many native Python objects, like dictionaries and lists
  • when we use custom methods and properties, someone trying to use our class will need to learn how it works first
  • if we use built-in Python operators instead, users won't need to learn as much because they probably already know how lists and dictionaries work

class ShoppingCart:
    def __contains__(self, product):
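        """Return True if cart contains the product."""

    def __getitem__(self, product):
        """Return the quantity of a product in the cart (needed for +=)."""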

    def __setitem__(self, product, quantity):
        """Set the quantity of a product in the cart."""

    def __delitem__(self, product):
        """Completely remove a product from the cart."""

    def __len__(self):
        """Return product count in cart, ignoring quantities."""

    def __bool__(self):
        """Return True if cart is non-empty."""

  • in terms of details...
  • we can make these operators work by using dunder methods that do operator overloading
  • with this change, our shopping cart container will feel more like a native Python object
  • and new users will find it much easier to learn and to use
  • Don't be afraid to reach for operator overloading when it makes sense
  • by the way, the standard library has abstract base classes that make a lot of this even easier
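
Here is a filled-in sketch of that class so you can try the operators. The slides only show docstrings, so the method bodies, the _quantities dictionary, and the choice to treat a missing product as quantity 0 are all assumptions for this sketch.

```python
class ShoppingCart:
    """Cart that maps products to quantities (a sketch, not the talk's code)."""

    def __init__(self):
        self._quantities = {}

    def __contains__(self, product):
        """Return True if cart contains the product."""
        return product in self._quantities

    def __getitem__(self, product):
        """Return the quantity of a product (0 if absent, so += works)."""
        return self._quantities.get(product, 0)

    def __setitem__(self, product, quantity):
        """Set the quantity of a product in the cart."""
        self._quantities[product] = quantity

    def __delitem__(self, product):
        """Completely remove a product from the cart."""
        del self._quantities[product]

    def __iter__(self):
        """Iterate over the products in the cart."""
        return iter(self._quantities)

    def __len__(self):
        """Return product count in cart, ignoring quantities."""
        return len(self._quantities)


cart = ShoppingCart()
print(not cart)  # True: with no __bool__, truthiness falls back to __len__
cart["apple"] = 2
cart["apple"] += 3  # calls __getitem__ and then __setitem__
cart["pear"] += 1
print("apple" in cart, cart["apple"], len(cart))  # True 5 2
del cart["pear"]
print(len(cart))  # 1
```

The += line needs both __getitem__ and __setitem__, which is easy to miss when you're converting add-style methods into operators.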

Abstract Base Classes

  • collections.UserList: make custom list
  • collections.UserDict: make custom dictionary
  • collections.UserString: make custom string

If you're ever planning to make your own type of container, check out the helpers in the collections module first.
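
As a sketch of how much one of those helpers gives you, here is a cart built on collections.UserDict. Treating a missing product as quantity 0 through __missing__ is an assumption made for this sketch, not something from the slides.

```python
from collections import UserDict


class ShoppingCart(UserDict):
    """Cart built on UserDict; only the custom behavior is written here."""

    def __missing__(self, product):
        # Assumption for this sketch: absent products have quantity 0,
        # which lets `cart[product] += n` work for new products.
        return 0


cart = ShoppingCart()
cart["apple"] += 2
cart["pear"] = 1
del cart["pear"]
print("apple" in cart, "pear" in cart, len(cart), not cart)
# True False 1 False
```

Containment, item assignment, deletion, length, and truthiness all come from UserDict here, so there's nothing left to write for them.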

Shared data

Let's talk about functions.

from email.parser import BytesParser
from imaplib import IMAP4_SSL

def get_connection(host, username, password):
    """Initialize IMAP server and log in"""
    server = IMAP4_SSL(host)
    server.login(username, password)
    server.select("inbox")
    return server

def close_connection(server):
    server.close()
    server.logout()

def get_message_uids(server):
    """Return unique identifiers for each message"""
    return server.uid("search", None, "ALL")[1][0].split()

def get_message(server, uid):
    """Get email message identified by given UID"""
    result, data = server.uid("fetch", uid, "(RFC822)")
    (_, message_text), _ = data  # imaplib returns the raw message as bytes
    message = BytesParser().parsebytes(message_text)
    return message

This code connects to an IMAP server and reads email.

Notice that one of these functions returns a server object and the other three functions each accept a server object.

This should be a hint that something weird is going on.

If you ever find that you're repeatedly passing the same data to multiple functions, think of making a class.

class IMAPChecker:
    def __init__(self, host):
        """Initialize IMAP email server with given host"""
        self.server = IMAP4_SSL(host)

    def authenticate(self, username, password):
        """Authenticate with email server"""
        self.server.login(username, password)
        self.server.select("inbox")

    def quit(self):
        self.server.close()
        self.server.logout()

    def get_message_uids(self):
        """Return unique identifiers for each message"""
        return self.server.uid("search", None, "ALL")[1][0].split()

    def get_message(self, uid):
        """Get email message identified by given UID"""
        result, data = self.server.uid("fetch", uid, "(RFC822)")
        (_, message_text), _ = data
        message = BytesParser().parsebytes(message_text)  # email.parser.BytesParser; imaplib returns bytes
        return message

This is exactly what classes were designed for.

Classes bundle functionality and data together.
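
To see what that buys us at the call site, here is a sketch that can run offline. `FakeIMAPServer` is a hypothetical stand-in for `IMAP4_SSL` that only mimics the shape of the responses used here, and this version of `IMAPChecker` accepts a server object instead of a host name; both are assumptions made so the example needs no network.

```python
from email.parser import BytesParser


class FakeIMAPServer:
    """Hypothetical offline stand-in for imaplib.IMAP4_SSL (not a real API)."""

    def __init__(self, messages):
        self.messages = messages  # maps UID (bytes) -> raw message (bytes)
        self.calls = []  # records which connection methods were called

    def login(self, username, password):
        self.calls.append("login")

    def select(self, mailbox):
        self.calls.append("select")

    def close(self):
        self.calls.append("close")

    def logout(self):
        self.calls.append("logout")

    def uid(self, command, *args):
        # Mimics the (status, data) shape of the responses used below.
        if command == "search":
            return "OK", [b" ".join(self.messages)]
        if command == "fetch":
            uid = args[0]
            return "OK", [(uid + b" (RFC822)", self.messages[uid]), b")"]
        raise ValueError(f"unsupported command: {command}")


class IMAPChecker:
    def __init__(self, server):
        # Assumption: the server is passed in (instead of a host name)
        # so that this sketch can run without a network connection.
        self.server = server

    def authenticate(self, username, password):
        self.server.login(username, password)
        self.server.select("inbox")

    def quit(self):
        self.server.close()
        self.server.logout()

    def get_message_uids(self):
        return self.server.uid("search", None, "ALL")[1][0].split()

    def get_message(self, uid):
        result, data = self.server.uid("fetch", uid, "(RFC822)")
        (_, message_text), _ = data
        return BytesParser().parsebytes(message_text)


# The server is handed over once; no later call has to pass it again.
checker = IMAPChecker(FakeIMAPServer({
    b"1": b"Subject: Hello\r\n\r\nFirst message\r\n",
    b"2": b"Subject: Lunch?\r\n\r\nSecond message\r\n",
}))
checker.authenticate("trey", "not-a-real-password")
subjects = [
    checker.get_message(uid)["Subject"]
    for uid in checker.get_message_uids()
]
checker.quit()
```

None of the calls after construction mention the server: the object carries it, which is the repeated argument the function-based version had to pass around.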

Recap: programming idioms

  • Think about using a context manager when you notice duplicate code wrapped around other code
  • When making one list from another, use a comprehension
  • Don't be afraid to use dunder methods
  • Classes are good for bundling up code and data
  • okay let's do a recap
  • When you find yourself wrapping code in redundant try-finally or try-except blocks, think about whether you could use a context manager instead
  • when making one list from another, use a list comprehension
  • when your object looks like a container and acts like a container, use operator overloading and make it into a container
  • don't be afraid of dunder methods
  • (pause)
  • If you have a specific problem, use a specific solution

Readability checklist

  • Can I modify line breaks to improve clarity?
  • Can I create a variable name for unnamed code?
  • Can I add a comment to improve clarity?
  • Can I turn a comment into a better variable name?
  • Can I use a more specific programming construct?
  • Have I stated detailed preferences in a style guide?
  • When you're writing code, stop to pause every once in a while and actively consider the readability of your code.
  • You can use this checklist as a starting point for your own reflections on code readability
  • As you use this checklist on your own code, start to build up that code style guide we talked about earlier
  • remember that every project should have a detailed code style guide
  • The more decisions you can offload to the style guide, the more brain power you'll have left over to spend on less trivial things

More talks to watch

@treyhunner http://WeeklyPython.Chat

  • And finally:
  • Here's a list of videos I recommend watching when you get home
  • Any questions?