A common question I’ve seen from beginning Python programmers is, “How do I make my code more Pythonic?” The problem with a word like “Pythonic” is that its meaning is nebulous: it means different things to different people.

The meaning isn’t static, either. Whether or not code is Pythonic can depend on which version of Python you’re using, and best practices for writing Pythonic code may change over time.

In this article, I’ll share my perspective on what makes code Pythonic by looking at a few concrete examples. I’ll also leave you with some hand-picked resources that will help you build a mental model for deciding when code is Pythonic or not.

But first, let’s agree on at least some kind of definition for the word Pythonic.

What does “Pythonic” mean?

The Python language is over 30 years old. In that time, Python programmers have collectively gained an enormous amount of experience using the language for a wide range of purposes. Over time, that collective experience has been shared and distilled into best practices — commonly referred to as the Pythonic way.

The Zen of Python, written by Tim Peters and accessible from any Python installation by typing import this into the REPL, traditionally exemplifies the Pythonic mindset:

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

The beauty of the Zen of Python is also the most annoying feature for Python beginners. The Zen elegantly captures the spirit of what it means to be Pythonic without giving any explicit advice. For example, consider the first principle: “Beautiful is better than ugly.” OK, sure! But how do I take my ugly code and make it beautiful? What even is beautiful code in the first place?

The ambiguity of the Zen of Python, however frustrating, is what makes it as relevant now as when Tim Peters wrote it in 1999. It serves as a set of guiding principles that equip you with a sense for distinguishing Pythonic code from un-Pythonic code and provides a foundation for a mental framework for making your own decisions.

So, where does this leave us concerning an actual definition of the word “Pythonic”? The best definition that I’ve found is from a 2014 Stack Overflow answer to the question “What does Pythonic mean?” that describes Pythonic code as:

[C]ode that doesn’t just get the syntax right but that follows the conventions of the Python community and uses the language in the way it is intended to be used.

There are two key takeaways here:

  1. The adjective Pythonic has more to do with style than syntax, although Pythonic idioms often have implications beyond purely stylistic choices, including better performance.
  2. What passes as Pythonic is driven by the Python community.

So now that we have at least some understanding of what Python programmers mean when they refer to code as Pythonic, let’s look at three specific ways you can write more Pythonic code right now.

Tip #1: Get Familiar With PEP8

PEP8 is Python’s official style guide. PEP stands for Python Enhancement Proposal. PEPs are documents that propose new Python features and serve as official documentation for the feature while the Python community debates its acceptance or rejection. Following PEP8 won’t quite get your code to Pythonic perfection, but it does go a long way towards making your code look familiar to many Python programmers.

PEP8 deals with things like how to handle whitespace in your code, such as using four spaces for indentation instead of a tab character, or what the maximum line length should be, which, according to PEP8, is 79 characters — although this is probably the most widely ignored PEP8 recommendation.

If you’re new to Python programming, one of the first things I recommend internalizing from PEP8 is the recommendations for naming conventions. For example, you should write function and variable names in the lowercase_with_underscores style:

# Correct
seconds_per_hour = 3600

# Incorrect
secondsperhour = 3600
secondsPerHour = 3600

Class names should use the CapitalizedWords style:

# Correct
class SomeThing:
    pass

# Incorrect
class something:
    pass

class some_thing:
    pass

Write constants in the UPPER_CASE_WITH_UNDERSCORES style:

# Correct
PLANCK_CONSTANT = 6.62607015e-34

# Incorrect
planck_constant = 6.62607015e-34
planckConstant = 6.62607015e-34
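
Putting the three conventions together, a small module might look something like the sketch below. The LapTimer class and its method are invented purely for illustration:

```python
# Constant: UPPER_CASE_WITH_UNDERSCORES
SECONDS_PER_HOUR = 3600


# Class name: CapitalizedWords
class LapTimer:
    def __init__(self):
        # Attribute name: lowercase_with_underscores
        self.elapsed_seconds = 0

    # Method name: lowercase_with_underscores
    def add_hours(self, hours):
        self.elapsed_seconds += hours * SECONDS_PER_HOUR


# Variable name: lowercase_with_underscores
lap_timer = LapTimer()
lap_timer.add_hours(2)
print(lap_timer.elapsed_seconds)  # 7200
```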

The whitespace recommendations laid out in PEP8 cover how to use spaces around operators and around function parameter names and arguments, as well as how to break long lines. While years of practice reading and writing PEP8-compliant Python code will help you internalize these recommendations, it’s still a lot to remember.

Don’t worry if you can’t memorize all of PEP8’s conventions. You don’t need to! Tools like flake8 can help you find and fix PEP8 issues in your code. You can install flake8 with pip:

# Linux/macOS
$ python3 -m pip install flake8

# Windows
$ python -m pip install flake8

flake8 can be used as a command-line application to scan a Python file for style violations. For example, let’s say I have a text file called myscript.py containing the following code:

def add( x, y ):
    return x+y

num1=1
num2=2
print( add(num1,num2) )

Running flake8 against this code tells you what violations there are and exactly where they’re located:

$ flake8 myscript.py
myscript.py:1:9: E201 whitespace after '('
myscript.py:1:14: E202 whitespace before ')'
myscript.py:4:1: E305 expected 2 blank lines after class or function definition, found 1
myscript.py:4:5: E225 missing whitespace around operator
myscript.py:5:5: E225 missing whitespace around operator
myscript.py:6:7: E201 whitespace after '('
myscript.py:6:16: E231 missing whitespace after ','
myscript.py:6:22: E202 whitespace before ')'

Each line of output from flake8 tells you which file the problem is in, which line it’s on, the column where the error starts, an error code (use these codes to configure flake8 to ignore specific errors if you wish), and a description of the error.

You can even set up editors like VS Code to lint your code with flake8 while you write it, so your code is continuously checked for PEP8 violations. When flake8 finds an issue, a squiggly red line appears underneath the offending portion of your code, and you can see which errors have been detected in the Problems tab of the bottom panel.

[Figure: Using flake8 in Visual Studio Code]

flake8 is an excellent tool for finding PEP8 errors in your code, but you still have to manually fix all of those errors. This can be a lot of work. Fortunately, there’s a way to automate the whole process.

The black auto-formatter for Python is a tool for automatically formatting your code to conform to PEP8. Of course, PEP8 recommendations leave a lot of wiggle room for stylistic choices and black makes a lot of decisions for you. You may or may not agree with these decisions. black is minimally configurable, so you may want to play around with it first before committing to using it.

You can install black with pip:

# Linux/macOS
$ python3 -m pip install black

# Windows
$ python -m pip install black

Once installed, you can use the black --check command in your shell to see if black would make any changes to a file:

$ black --check myscript.py
would reformat myscript.py

Oh no! 💥 💔 💥
1 file would be reformatted.

You can use the --diff flag to see a diff of what changes black would make:

$ black --diff myscript.py
--- myscript.py	2022-03-15 21:27:20.674809 +0000
+++ myscript.py	2022-03-15 21:28:27.357107 +0000
@@ -1,6 +1,7 @@
-def add( x, y ):
-    return x+y
+def add(x, y):
+    return x + y

-num1=1
-num2=2
-print( add(num1,num2) )
+
+num1 = 1
+num2 = 2
+print(add(num1, num2))
would reformat myscript.py

All done! ✨ 🍰 ✨
1 file would be reformatted.

To automatically format your file, pass the file name to the black command:

$ black myscript.py
reformatted myscript.py

All done! ✨ 🍰 ✨
1 file reformatted.

# Show the formatted file
$ cat myscript.py
def add(x, y):
    return x + y


num1 = 1
num2 = 2
print(add(num1, num2))

To check that your file is PEP8 compliant now, run flake8 against it again and see if you get any errors:

# No output from flake8 so everything is good!
$ flake8 myscript.py

One thing to keep in mind when using black is that, by default, black sets the maximum line length to 88 columns. This diverges from PEP8’s recommendation of 79-column lines, so you may see flake8 report line length errors even when using black. You can configure black to use 79 columns or configure flake8 to accept longer lines. Many Python devs use 88 columns instead of 79, and some set black and flake8 to allow even longer lines.
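
As a rough sketch of the second option, a flake8 configuration file (for example, a file named .flake8 in your project’s root directory) could raise the limit to match black’s default. The extend-ignore line is an extra that isn’t discussed above: black’s documentation suggests ignoring E203 because the way black formats slices can trigger it. If you’d rather go the other way, black accepts a --line-length option.

```ini
[flake8]
max-line-length = 88
extend-ignore = E203
```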

It’s important to remember that PEP8 is just a set of recommendations, although these recommendations are taken seriously by many Python programmers. But there’s nothing in Python that enforces the PEP8 style guide. If there’s something in PEP8 that you strongly disagree with, then, by all means, ignore it! If you do want to adhere strictly to PEP8, however, tools like flake8 and black can make your life a lot easier.

Tip #2: Avoid C-Style Loops

In languages like C or C++, keeping track of an index variable while looping over an array is common. For example, when asked to print the elements of a list, it’s not uncommon for new Python programmers coming from C or C++ to write something like the following:

>>> names = ["JL", "Raffi", "Agnes", "Rios", "Elnor"]

>>> # Using a `while` loop
>>> i = 0
>>> while i < len(names):
...     print(names[i])
...     i += 1
JL
Raffi
Agnes
Rios
Elnor

>>> # Using a `for` loop
>>> for i in range(len(names)):
...     print(names[i])
JL
Raffi
Agnes
Rios
Elnor

Instead of iterating over an index, however, you can iterate over items in a list directly:

>>> for name in names:
...     print(name)
JL
Raffi
Agnes
Rios
Elnor
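
Sometimes you really do need the index, or you need to walk two lists at once. In those cases, the built-in enumerate() and zip() functions still let you skip the manual counter. Here’s a minimal sketch; the ranks list is made up for illustration:

```python
names = ["JL", "Raffi", "Agnes", "Rios", "Elnor"]
ranks = ["Admiral", "Commander", "Doctor", "Captain", "Cadet"]  # hypothetical data

# enumerate() yields (position, item) pairs; start=1 begins counting at 1
for position, name in enumerate(names, start=1):
    print(position, name)

# zip() walks both lists in lockstep and stops at the end of the shorter one
for rank, name in zip(ranks, names):
    print(rank, name)
```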

However, avoiding C-style loops goes a lot deeper than just directly iterating over items in a list. Leveraging Python idioms such as list comprehensions, built-in functions like min(), max(), and sum(), and object methods can help take your Python code to the next level.

Prefer List Comprehensions Over Simple for Loops

A common programming task is to process the elements from one array and store the results in a new array. For example, suppose you have a list of numbers and want to transform it into a list of the squares of those numbers. You know that you should avoid C-style loops, so you may end up writing something like this:

>>> nums = [1, 2, 3, 4, 5]

>>> squares = []
>>> for num in nums:
...     squares.append(num ** 2)
...
>>> squares
[1, 4, 9, 16, 25]

A more Pythonic way to do this is to use a list comprehension:

>>> squares = [num ** 2 for num in nums]  # <-- List comprehension
>>> squares
[1, 4, 9, 16, 25]

List comprehensions can be difficult to grok at first. However, if you’re familiar with set-builder notation for writing sets in mathematics, then list comprehensions may already look familiar.

Here’s how I usually think about list comprehensions:

  1. Start by creating an empty list literal:
    []
  2. The first thing that goes in the list comprehension is whatever you would typically put inside of the .append() method if you were building the list using a for loop:
    [num ** 2]
  3. Finally, put the for loop’s header at the end of the list:
    [num ** 2 for num in nums]
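
The same steps extend to loops that filter their input: an if test from the loop body goes at the very end of the comprehension. Here’s a small sketch that squares only the even numbers:

```python
nums = [1, 2, 3, 4, 5]

# Loop version: only square the even numbers
even_squares_loop = []
for num in nums:
    if num % 2 == 0:
        even_squares_loop.append(num ** 2)

# Comprehension version: the expression, then the for header, then the if test
even_squares = [num ** 2 for num in nums if num % 2 == 0]

print(even_squares)  # [4, 16]
```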

List comprehensions are an important concept to master when writing Pythonic code. But they can be overused. They're also not the only kind of comprehension in Python. In the following sections, you'll learn about other comprehensions, such as generator expressions and dictionary comprehensions, and see an example of when it makes sense to avoid a list comprehension.

Use Built-in Functions Like min(), max(), and sum()

Another common programming task is finding the minimum or maximum value in an array of numbers. Using a for loop, you can find the minimum number in a list as follows:

>>> nums = [10, 21, 7, -2, -5, 13]

>>> min_value = nums[0]
>>> for num in nums[1:]:
...     if num < min_value:
...         min_value = num
...
>>> min_value
-5

A more Pythonic way to do this is to use the min() built-in function:

>>> min(nums)
-5

In a similar vein, there’s no need to write a loop to find the maximum value in a list. You can use the max() built-in function:

>>> max(nums)
21
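
Both min() and max() also accept a couple of optional keyword arguments that can replace even more hand-written loops. The sketch below uses a made-up list of words to show key, which controls what gets compared, and default, which is returned when the iterable is empty:

```python
words = ["kiwi", "banana", "fig", "cherry"]  # hypothetical data

# key= tells min()/max() what to compare; here, the length of each string
shortest = min(words, key=len)
longest = max(words, key=len)  # on a tie, the first maximal item wins
print(shortest, longest)  # fig banana

# default= is returned instead of raising ValueError for an empty iterable
print(max([], default=None))  # None
```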

To find the sum of the numbers in a list, you could write a for loop. But a more Pythonic approach is to use the sum() function:

>>> # Not Pythonic: Use a `for` loop
>>> sum_of_nums = 0

>>> for num in nums:
...     sum_of_nums += num
...
>>> sum_of_nums
44

>>> # Pythonic: Use `sum()`
>>> sum(nums)
44

Another Pythonic use of sum() is to count the number of elements of a list for which some condition holds. For example, here’s a for loop that counts the number of strings in a list that start with the letter A:

>>> capitals = ["Atlanta", "Houston", "Denver", "Augusta"]

>>> count_a_capitals = 0
>>> for capital in capitals:
...     if capital.startswith("A"):
...         count_a_capitals += 1
...
>>> count_a_capitals
2

Combining sum() with a list comprehension reduces the for loop to a single line of code:

>>> sum([capital.startswith("A") for capital in capitals])
2

As lovely as that is, you can make it even more Pythonic by replacing the list comprehension with a generator expression by removing the brackets around the list:

>>> sum(capital.startswith("A") for capital in capitals)
2

How exactly does this work? Both the list comprehension and the generator expression return an iterable containing True and False values corresponding to whether or not the string in the capitals list starts with the letter "A":

>>> [capital.startswith("A") for capital in capitals]
[True, False, False, True]

In Python, True and False are integers in disguise. True is equal to 1 and False is equal to 0:

>>> isinstance(True, int)
True

>>> True == 1
True

>>> isinstance(False, int)
True

>>> False == 0
True

When you pass the list comprehension or generator expression to sum(), the True and False values get treated like 1 and 0, respectively. Since there are two True values and two False values, the total sum is equal to 2.

Using sum() to count how many list elements satisfy a condition highlights an important point about the concept of Pythonic code. Personally, I find this use of sum() to be very Pythonic. After all, it leverages several Python language features to create what is, in my opinion, concise-yet-readable code. However, not every Python developer may agree with me.

One could argue that this example violates one of the principles of the Zen of Python: “Explicit is better than implicit.” After all, it’s not obvious that True and False are integers and that sum() should even work with a list of True and False values. Understanding this use of sum() requires a deep understanding of Python’s built-in types.
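
If the implicit conversion bothers you or your teammates, one more explicit alternative is to add up a literal 1 for each matching item, so the reader never has to know that True counts as 1. This is a sketch of one option, not the only way to write it:

```python
capitals = ["Atlanta", "Houston", "Denver", "Augusta"]

# Implicit: relies on True == 1 and False == 0
implicit_count = sum(capital.startswith("A") for capital in capitals)

# Explicit: the if clause filters, and each match contributes a literal 1
explicit_count = sum(1 for capital in capitals if capital.startswith("A"))

print(implicit_count, explicit_count)  # 2 2
```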

🐍
To learn more about True and False as integers, as well as other surprising facts about numbers in Python, check out my article 3 Things You Might Not Know About Numbers in Python.

There is no set of rigid rules that tell you whether or not code is Pythonic. There’s always a gray area. Use your best judgment when confronted with a code example that feels like it might be in this gray area. Always err on the side of readability, and don’t be afraid to reach out to coworkers or use social media to get help.

Tip #3: Use the Right Data Structure

A big part of writing clean, Pythonic code boils down to picking the proper data structure for the task at hand. Python is well-known as a “batteries included” language. Several of the batteries included with Python are efficient and ready-to-use data structures.

Use Dictionaries For Fast Lookup

Suppose you have a CSV file called clients.csv containing client data for a business that looks something like this:

first_name,last_name,email,phone
Manuel,Wilson,[email protected],757-942-0588
Stephanie,Gonzales,[email protected],385-474-4769
Cory,Ali,[email protected],810-361-3885
Adam,Soto,[email protected],724-603-5463

Let’s say you’re tasked with writing a program that takes an email address as input and outputs the phone number of the client with that email if such a client exists. How would you go about doing it?

You can read each row of this file as a dictionary using the DictReader object from the csv module:

>>> import csv

>>> with open("clients.csv", "r") as csvfile:
...     clients = list(csv.DictReader(csvfile))
...
>>> clients
[{'first_name': 'Manuel', 'last_name': 'Wilson', 'email': '[email protected]', 'phone': '757-942-0588'},
{'first_name': 'Stephanie', 'last_name': 'Gonzales', 'email': '[email protected]', 'phone': '385-474-4769'},
{'first_name': 'Cory', 'last_name': 'Ali', 'email': '[email protected]', 'phone': '810-361-3885'},
{'first_name': 'Adam', 'last_name': 'Soto', 'email': '[email protected]', 'phone': '724-603-5463'}]

clients is a list of dictionaries, so to find the client with a given email, say [email protected], you’ll need to loop over the list and compare each client’s email with the target email until the right client is found:

>>> target = "[email protected]"
>>> phone = None

>>> for client in clients:
...     if client["email"] == target:
...         phone = client["phone"]
...         break
...
>>> print(phone)
385-474-4769

The problem with this code is that looping over the list of clients is inefficient. If there are a large number of clients in the CSV file, your program could be spending a significant amount of time scanning the list for a client with a matching email. If you need to do this check often, this could result in a whole bunch of wasted time.

A more Pythonic approach is to forget about storing the clients in a list and use a dictionary to map each client’s email address to their phone number. A great way to do this is with a dictionary comprehension:

>>> with open("clients.csv", "r") as csvfile:
...     # Use a `dict` comprehension instead of a `list`
...     clients = {row["email"]: row["phone"] for row in csv.DictReader(csvfile)}
... 
>>> clients
{'[email protected]': '757-942-0588', '[email protected]': '385-474-4769',
'[email protected]': '810-361-3885', '[email protected]': '724-603-5463'}

Dictionary comprehensions are a lot like list comprehensions:

  1. Start by creating an empty dictionary:
    {}
  2. Then put in a key-value pair separated by a colon:
    {row["email"]: row["phone"]}
  3. Finally, write a for expression that loops over each row in the CSV file:
    {row["email"]: row["phone"] for row in csv.DictReader(csvfile)}

Translated into a for loop, this dictionary comprehension would look something like this:

>>> clients = {}
>>> with open("clients.csv", "r") as csvfile:
...     for row in csv.DictReader(csvfile):
...         clients[row["email"]] = row["phone"]

With the clients dictionary made, you can find a client’s phone number using their email address without having to write any more loops:

>>> target = "[email protected]"
>>> clients[target]
'385-474-4769'

This code is not only shorter than looping over a list, it’s also much more efficient. Dictionaries are hash tables, so Python can jump straight to the correct value in roughly constant time, no matter how many clients there are. There’s a problem, however. If no client in clients has a matching email, then a KeyError will be raised:

>>> clients["[email protected]"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '[email protected]'

One way to handle this is to catch the KeyError and fall back to a default value if no client is found:

>>> target = "[email protected]"
>>> try:
...     phone = clients[target]
... except KeyError:
...     phone = None
...
>>> print(phone)
None

There’s a more Pythonic way to do this, though, using the dictionary’s .get() method. .get() returns a key’s corresponding value if the key exists and None otherwise:

>>> clients.get("[email protected]")
'385-474-4769'
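
.get() also accepts an optional second argument that is returned in place of None when the key is missing. Here’s a small self-contained sketch. The example.com addresses and the "no phone on file" message are placeholders I made up, and io.StringIO stands in for the clients.csv file:

```python
import csv
import io

# Stand-in for clients.csv, using made-up example.com addresses
csvfile = io.StringIO(
    "first_name,last_name,email,phone\n"
    "Manuel,Wilson,manuel@example.com,757-942-0588\n"
    "Stephanie,Gonzales,stephanie@example.com,385-474-4769\n"
)

clients = {row["email"]: row["phone"] for row in csv.DictReader(csvfile)}

# Key exists: the stored value comes back
print(clients.get("stephanie@example.com"))

# Key missing: None by default, or the fallback passed as the second argument
print(clients.get("nobody@example.com"))
print(clients.get("nobody@example.com", "no phone on file"))
```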

Let’s compare the two solutions side-by-side:

import csv

target = "[email protected]"
phone = None

# Un-Pythonic: loop over a list
with open("clients.csv", "r") as csvfile:
    clients = list(csv.DictReader(csvfile))

for client in clients:
    if client["email"] == target:
        phone = client["phone"]
        break

print(phone)

# Pythonic: lookup in a dictionary
with open("clients.csv", "r") as csvfile:
    clients = {row["email"]: row["phone"] for row in csv.DictReader(csvfile)}

phone = clients.get(target)
print(phone)

The Pythonic solution is more concise and efficient without sacrificing readability.

Take Advantage of Set Operations

Sets are an undervalued data structure in Python. As a result, even intermediate Python developers tend to ignore sets and miss out on opportunities to use them to their advantage.

Perhaps the most well-known use-case for sets in Python is to remove duplicates from a list:

>>> nums = [1, 3, 2, 3, 1, 2, 3, 1, 2]
>>> unique_nums = list(set(nums))
>>> unique_nums
[1, 2, 3]
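
One caveat: sets are unordered, so list(set(nums)) isn’t guaranteed to keep the items in their original order. If order matters, a common alternative (not a set operation, but worth knowing) is dict.fromkeys(), which relies on dictionaries preserving insertion order in Python 3.7 and later:

```python
nums = [3, 1, 3, 2, 1, 2]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving first-seen order
unique_in_order = list(dict.fromkeys(nums))
print(unique_in_order)  # [3, 1, 2]
```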

But there’s so much more that you can do with sets. One technique I use often in my code is filtering values from an iterable efficiently with sets. This works best when you also need unique values.

Here’s a contrived but not unrealistic example. Suppose a shop owner has a CSV file of clients containing their email addresses. We’ll reuse the clients.csv file from the previous section. The shop owner has another CSV file of orders from the last month that also contains email addresses. Maybe this CSV file is called orders.csv and looks something like this:

date,email,items_ordered
2022/03/01,[email protected],2
2022/03/04,[email protected],3
2022/03/07,[email protected],1

The shop owner would like to email a discount coupon to every client who didn’t order anything in the past month. One way to do this would be to read the emails from the clients.csv and orders.csv files and use a list comprehension to filter the clients’ emails:

>>> import csv

>>> # Create a list of all client emails
>>> with open("clients.csv", "r") as clients_csv:
...     client_emails = [row["email"] for row in csv.DictReader(clients_csv)]
...

>>> # Create a list of emails from orders
>>> with open("orders.csv") as orders_csv:
...     order_emails = [row["email"] for row in csv.DictReader(orders_csv)]
...

>>> # Use a list comprehension to filter the client emails
>>> coupon_emails = [email for email in client_emails if email not in order_emails]
>>> coupon_emails
['[email protected]', '[email protected]']

The above code works fine and certainly looks Pythonic. But suppose the shop owner has millions of clients and orders each month. (They’re apparently very successful!) Filtering the emails to determine which customers to send coupons to requires looping over the entire client_emails list, and for each client email, the not in check may scan the entire order_emails list as well. And what if there are duplicate rows in the clients.csv and orders.csv files? Accidents happen, you know.

A more Pythonic approach would be to read in the client and order emails into sets and use the set difference operator to filter the set of client emails:

>>> import csv

>>> # Create a set of all client emails using a set comprehension
>>> with open("clients.csv", "r") as clients_csv:
...     client_emails = {row["email"] for row in csv.DictReader(clients_csv)}
...

>>> # Create a set of emails from orders using a set comprehension
>>> with open("orders.csv", "r") as orders_csv:
...     order_emails = {row["email"] for row in csv.DictReader(orders_csv)}
...

>>> # Filter the client emails using set difference
>>> coupon_emails = client_emails - order_emails
>>> coupon_emails
{'[email protected]', '[email protected]'}

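This approach is much more efficient than the previous one. Checking whether an item is in a list means scanning the list, so the list comprehension rescans order_emails for every client email. Checking whether an item is in a set takes roughly constant time on average, so the set difference needs only a single pass over the client emails. It also has the advantage of naturally removing any duplicate emails from both CSV files.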
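
Difference is only one of several set operations. The sketch below uses small made-up sets of email addresses (the example.com addresses are placeholders, not the ones in the CSV files) to show difference alongside intersection, union, and symmetric difference:

```python
client_emails = {"ann@example.com", "bob@example.com", "cy@example.com"}
order_emails = {"bob@example.com", "dee@example.com"}

# Difference: clients who did not place an order
print(client_emails - order_emails)

# Intersection: clients who did place an order
print(client_emails & order_emails)

# Union: every email seen in either set
print(client_emails | order_emails)

# Symmetric difference: emails that appear in exactly one of the two sets
print(client_emails ^ order_emails)
```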

Three Books For Learning To Write Pythonic Code

You can’t learn to write clean Pythonic code overnight. You need to study lots of code examples, practice writing your own code, and consult with other Python developers. To help you on your journey, I’ve compiled a list of three books that I’ve found immensely helpful for grokking the Pythonic way.

All three books mentioned below are written for intermediate or advanced Python programmers. If you're just getting started with Python, and especially if you're new to programming in general, consider my book Python Basics: A Practical Introduction to Python 3.

Disclaimer: The following sections contain affiliate links. If you decide to purchase one of the books through my link, I will receive a small commission at no cost to you.

Python Tricks by Dan Bader

Dan Bader's short and sweet book Python Tricks: A Buffet of Awesome Python Features is an excellent starting place for beginner-to-intermediate Python programmers to learn more about writing Pythonic code.

Python Tricks will teach you patterns for writing clean, idiomatic Python, best practices for writing functions, how to use Python's object-oriented programming features effectively, and a whole lot more.

Effective Python by Brett Slatkin

Brett Slatkin's Effective Python was the first book I read after learning the Python syntax that opened my eyes to the power of idiomatic Pythonic code.

As the book's subtitle states, Effective Python covers 90 specific ways to write better Python. The first chapter alone, titled Pythonic Thinking, is a goldmine of tips and tricks that even beginner Python programmers will find helpful, although beginners may find the rest of the book difficult to follow.

Fluent Python by Luciano Ramalho

If I could only own one book about Python, Luciano Ramalho's Fluent Python would be the one.

Full of practical examples supported by clear exposition, Fluent Python is an excellent guide for anyone looking to learn how to write Pythonic code. However, keep in mind that Fluent Python is not for beginning Python programmers. As stated in the book's preface:

If you are just learning Python, this book is going to be hard to follow. Not only that, if you read it too early in your Python journey, it may give you the impression that every Python script should leverage special methods and metaprogramming tricks. Premature abstraction is as bad as premature optimization.

Experienced Python programmers will benefit greatly from the book, however.

Ramalho recently updated his book for modern Python. Currently, the second edition is only available for pre-order. I strongly recommend pre-ordering the second edition as the first edition is now outdated.

Next Steps

This article covered a lot of ground. You learned:

  • How the PEP8 style guide can help you write standardized Python code
  • How to avoid C-style loops via direct iteration and leveraging some of Python's built-in functions
  • Why choosing the right data structure enables you to write shorter code that is also more efficient

These tips will help you write more Pythonic code, but they're just a start. Mastering Python takes years. During your years of working towards Python mastery, accepted norms for Pythonic code may change, so it's crucial to stay up-to-date with the current best practices. The r/learnpython subreddit can be a good place to ask questions and get help. I'm also always happy to answer questions on Twitter.

But the first step is to get your hands dirty and practice what you've learned. As the Zen of Python says: "Now is better than never."