Python differs from programming languages such as C# or Java, which force the programmer to name classes according to the names of the files in which the code for those classes resides.
Every minute spent organizing your activities saves you an hour.
- Benjamin Franklin
Python is the most flexible programming language I've ever encountered. And when you're dealing with something "too flexible," the likelihood of making wrong decisions increases.
- Want to keep all your project classes in a single file main.py? Yes, it's possible.
- Need to read an environment variable? Go ahead and read it where it's needed.
- Need to modify a function's behavior? Why not use a decorator!?
But if you know exactly what you're doing, the consequences of Python's flexibility aren't necessarily bad.
Here I'm going to present some Python code organization tips that have served me well while working at different companies and interacting with many people.
Python Project Structure
First, let's look at the project directory structure, file naming, and module organization.
I recommend keeping all module files in the src directory, and tests in a tests subdirectory of that directory:
<project>
├── src
│   ├── <module>/*
│   │   ├── __init__.py
│   │   └── many_files.py
│   │
│   └── tests/*
│       └── many_tests.py
│
├── .gitignore
├── pyproject.toml
└── README.md
This <module> is the project's main module. If you're unsure which module is your main one, think about what project users will install with the pip install command and what you think the import statement for your module should look like.
Often, the name of the main module is the same as the name of the entire project. But this is not a hard and fast rule.
Arguments for the src directory
I've seen many projects organized differently.
For example, a project may not have a src directory, with all modules simply located in its root directory:
non_recommended_project
├── <module_a>/*
│   ├── __init__.py
│   └── many_files.py
│
├── .gitignore
│
├── tests/*
│   └── many_tests.py
│
├── pyproject.toml
│
├── <module_b>/*
│   ├── __init__.py
│   └── many_files.py
│
└── README.md
Such a project looks disorganized: the IDE simply sorts its folders and files alphabetically, so code directories end up mixed with configuration files.
The main reason to use the src folder is that it collects the active project code in one directory, while the settings, CI/CD parameters, and project metadata live outside of it.
The only downside to this approach is that, without additional effort, a statement such as import module_a will not work in your code. We'll discuss how to solve this problem below.
File naming
Rule #1: There are no files here.
First, there is no such thing as a "file" in Python, and I've found this to be a major source of confusion for beginners.
If you are in a directory that contains a file __init__.py, then that is a directory that contains modules, not files.
Think of each module as a namespace.
I say "namespace" because it's impossible to say with certainty whether a module contains many functions and classes, or only constants. It could contain almost anything, or just a few entities of a couple types.
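To make the namespace idea more concrete, here is a minimal sketch. It builds a module object by hand with the standard library's types.ModuleType, which is roughly what Python does for you when it imports a .py file. The module name storages_demo and everything placed inside it are made up for illustration.

```python
import types

# Build a module object by hand; normally Python creates one when it imports a .py file.
storages_demo = types.ModuleType("storages_demo")  # hypothetical name


def get_storage():
    """Stand-in function; what it returns is not important here."""
    return "debug"


class Place:
    """Stand-in class."""


# A module is just a namespace: any kind of object can live in it as an attribute.
storages_demo.DEFAULT_BUCKET = "bucket_name"  # a constant
storages_demo.get_storage = get_storage       # a function
storages_demo.Place = Place                   # a class

# List only the names we added, skipping dunder attributes such as __name__.
public_names = sorted(name for name in vars(storages_demo) if not name.startswith("_"))
print(public_names)
```

The printed list mixes a constant, a class, and a function, which is why "namespace" describes a module better than "file".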
Rule #2: If necessary, keep entities in one place
It's perfectly normal to have multiple classes in a single module. This is how you should organize your code (but only if the classes are related to the module, of course).
Only separate classes into separate modules if the module becomes too large, or if its different parts address different problems.
It's often argued that this is an example of poor practice. Those who believe this are influenced by experiences gained from using other programming languages, which force them to adopt different solutions (for example, Java and C#).
Rule #3: Give modules names that are plural nouns
When naming modules, follow the general rule that they should be plural nouns. They should also reflect the specifics of the project's subject area.
There is an exception to this rule, however. Modules can be named core, main.py, or something similar, indicating that they represent a single entity. When choosing module names, use common sense, and when in doubt, adhere to the above rule.
A real-world example of module naming
Here is my project, Google Maps Crawler, created as an example.
This project aims to collect data from Google Maps using Selenium and present it in a form convenient for further processing (if you're interested, you can read about it here).
Here is the current state of the project tree (exceptions to rule #3 are highlighted):
gmaps_crawler
├── src
│   └── gmaps_crawler
│       ├── __init__.py
│       ├── config.py (singular form)
│       ├── drivers.py
│       ├── entities.py
│       ├── exceptions.py
│       ├── facades.py
│       ├── main.py (singular form)
│       └── storages.py
│
├── .gitignore
├── pyproject.toml
└── README.md
It seems quite natural to import classes and functions like this:
from gmaps_crawler.storages import get_storage
from gmaps_crawler.entities import Place
from gmaps_crawler.exceptions import CantEmitPlace
It is clear from the name that the exceptions module can contain one or many exception classes.
Naming modules with plural nouns has the following pleasant features:
- The modules are not too "small" (in the sense that one module is supposed to contain several classes).
- They can be broken down into smaller modules at any time if necessary.
- "Multiple" names give the programmer a strong sense of knowing what might be inside the corresponding modules.
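As a sketch of what a plural-named module might hold, here is a possible exceptions.py. Only the name CantEmitPlace comes from the imports shown earlier; the base class, the second exception, and the describe helper are hypothetical and are not taken from the real project.

```python
# exceptions.py (sketch): several related exception classes share one module.


class CrawlerError(Exception):
    """Hypothetical base class for all errors raised by the crawler."""


class CantEmitPlace(CrawlerError):
    """Name taken from the imports shown earlier; the body is an assumption."""


class CantOpenPage(CrawlerError):
    """Hypothetical second exception, added only to show the module growing."""


def describe(error: Exception) -> str:
    """Callers can handle the whole family through the shared base class."""
    if isinstance(error, CrawlerError):
        return f"crawler problem: {type(error).__name__}"
    return "unrelated problem"


print(describe(CantEmitPlace()))
print(describe(ValueError()))
```

If the module ever grows too large, the classes can be moved into smaller modules without changing their names.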
Naming classes, functions, and variables
Some programmers find naming entities difficult. But if you establish naming rules in advance, the task becomes less challenging.
Function and method names must be verbs
Functions and methods represent actions, or something that performs actions.
A function or method is not just something that "exists." It is something that "does."
Actions are clearly defined by verbs.
Here are some successful examples from a real project I worked on before:
def get_orders():
    ...
def acknowledge_event():
    ...
def get_delivery_information():
    ...
def publish():
    ...
Here are some unsuccessful examples:
def email_send():
    ...
def api_call():
    ...
def specific_stuff():
    ...
It's not very clear here whether the functions return an object that allows you to call the API, or whether they themselves perform some actions, for example, sending an email.
I can imagine a scenario where a poorly named function could be used like this:
email_send.title = "title"
email_send.dispatch()
There are some exceptions to the rule discussed:
- Creating a function main() that will be called in the application's main entry point is a good reason to break this rule.
- Using @property to treat a class method as an attribute is also acceptable.
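Here is a small sketch of the @property exception next to a regular verb-named method. The Order class and its fields are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical entity used only for this illustration."""

    unit_price: float
    quantity: int

    @property
    def total(self) -> float:
        # Noun name: callers read it like an attribute, without parentheses.
        return self.unit_price * self.quantity

    def cancel(self) -> None:
        # Verb name: a regular method that performs an action.
        self.quantity = 0


order = Order(unit_price=2.5, quantity=4)
print(order.total)  # attribute-style access, no call
order.cancel()      # an action, so it is called with parentheses
print(order.total)
```

The property keeps a noun name because the caller treats it as data, while cancel stays a verb because it changes the object.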
Variable and constant names must be nouns
Variable and constant names should always be nouns and never verbs (this allows them to be clearly separated from functions).
Here are some examples of successful names:
plane = Plane()
customer_id = 5
KEY_COMPARISON = "abc"
Here are some unfortunate names:
fly = Plane()
get_customer_id = 5
COMPARE_KEY = "abc"
And if a variable or constant is a list or collection, a name represented by a plural noun will do:
planes: list[Plane] = [Plane()]  # Even if it contains only one element
customer_ids: set[int] = {5, 12, 22}
KEY_MAP: dict[str, str] = {"123": "abc"}  # Dictionary names remain singular nouns
Class names should be self-explanatory, but using suffixes is okay.
Prefer self-explanatory class names. Suffixes such as Service, Strategy, and Middleware are also acceptable, but only as a last resort, when necessary to clearly describe the class's purpose.
Always give classes singular names, not plural ones. Plural names are reminiscent of names for collections of elements (for example, if I see the name orders, I assume it's a list or an iterable). Therefore, when choosing a class name, remind yourself that once an instance of a class is created, we have a single object at our disposal.
Classes represent certain entities
Classes representing something from the business environment should be named according to the names of the entities they represent (and the names should be nouns!). For example, Order, Sale, Store, Restaurant, and so on.
Example of using suffixes
Let's imagine we need to create a class responsible for sending emails. If we called it simply Email, its purpose would be unclear.
Some may decide that it represents an entity:
email = Email()  # Hypothetical usage example
email.title = "Title"
email.body = create_body()
email.send_to = "guilatrova.dev"
send_email(email)
Such a class should be called EmailSender or EmailService.
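A minimal sketch of that split might look like this. The fields mirror the usage example above, while the in-memory outbox and the recipient address are assumptions made so that the code runs without a mail server.

```python
from dataclasses import dataclass


@dataclass
class Email:
    """Entity: holds the data of one message and does nothing else."""

    title: str
    body: str
    send_to: str


class EmailSender:
    """The suffix makes it clear that this class performs the sending."""

    def __init__(self) -> None:
        # Hypothetical in-memory outbox, used here instead of a real mail server.
        self.outbox: list[Email] = []

    def send(self, email: Email) -> None:
        self.outbox.append(email)


sender = EmailSender()
sender.send(Email(title="Title", body="Hello", send_to="someone@example.com"))
print(len(sender.outbox))
```

With both names in place, Email reads as a thing and EmailSender reads as something that acts on it.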
Entity Naming Conventions
Follow these conventions for naming entities:
Type | Public | Internal |
Packages (directories) | lower_with_under | — |
Modules (files) | lower_with_under.py | — |
Classes | CapWords | — |
Functions and methods | lower_with_under() | _lower_with_under() |
Constants | ALL_CAPS_UNDER | _ALL_CAPS_UNDER |
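The conventions from the table can be seen together in one short sketch. The module name orders.py and every identifier in it are hypothetical.

```python
# orders.py (hypothetical module name: lower_with_under, plural noun)

MAX_ITEMS = 100            # public constant: ALL_CAPS_UNDER
_DEFAULT_CURRENCY = "USD"  # internal constant: _ALL_CAPS_UNDER


class OrderValidator:  # class: CapWords
    def validate(self, item_count: int) -> bool:  # public method: lower_with_under()
        return self._is_within_limit(item_count)

    def _is_within_limit(self, item_count: int) -> bool:  # internal: _lower_with_under()
        return 0 < item_count <= MAX_ITEMS


def format_price(amount: float) -> str:  # public function: lower_with_under()
    return f"{amount:.2f} {_DEFAULT_CURRENCY}"


print(OrderValidator().validate(5))
print(format_price(3))
```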
A digression on "private" methods
If a method name looks like __method(self) (any method whose name begins with two underscores), Python mangles the name, so code outside the class cannot call the method by that name in the usual way. Some people learn about this and decide it is a fine way to make methods private.
For those like me coming to Python from C#, it might seem strange that a method named according to the conventions above cannot truly be protected.
But Guido van Rossum has a good reason for this: "We are all consenting adults here."
This means that if you know you shouldn't call a method, then you shouldn't do it unless you're absolutely sure what you're doing.
After all, if you really decide to call a private method, you will find an unusual way to do it anyway (in C#, this is done through Reflection).
So give your private methods/functions names that start with a single underscore to indicate that they are for internal use only, and be okay with it.
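The difference between the two underscore styles can be seen in a short sketch. The Report class is hypothetical. The renaming of __render to _Report__render is Python's standard name mangling.

```python
class Report:
    """Hypothetical class used only to show how underscores behave."""

    def _build(self) -> str:
        # Single underscore: a convention meaning "internal"; Python does not enforce it.
        return "built"

    def __render(self) -> str:
        # Double underscore: Python rewrites the name to _Report__render (name mangling).
        return "rendered"

    def publish(self) -> str:
        # Inside the class, both names work as written.
        return f"{self._build()} and {self.__render()}"


report = Report()
print(report.publish())
print(report._build())              # allowed, but the underscore asks you not to
print(hasattr(report, "__render"))  # the plain name is not found from outside
print(report._Report__render())     # the mangled name still works if you insist
```

Neither style makes a method truly private, so the single underscore is enough to signal intent.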
When to create a function and when to create a class?
I have been asked the question in the title of this section several times.
If you follow the guidelines above, your modules will be clear, and clear modules are an effective way to organize functions:
from gmaps_crawler import storages
storages.get_storage()  # Looks like a class, but no instance is created, and the name is a plural noun
storages.save_to_storage()  # This is what a function stored in the module might be called
Sometimes a module may contain a subset of related functions. In such cases, it makes sense to separate these functions into a class.
An example of grouping a subset of functions
Let's assume the storages module we've already encountered contains 4 functions:
def format_for_debug(some_data):
    ...
def save_debug(some_data):
    """Prints the data to the screen"""
    formatted_data = format_for_debug(some_data)
    print(formatted_data)
def create_s3(bucket):
    """Creates the s3 bucket if it does not exist"""
    ...
def save_s3(some_data):
    s3 = create_s3("bucket_name")
    ...
S3 is Amazon's (AWS) cloud storage solution, suitable for storing any data. It's like Google Drive for software.
After analyzing this code, we can say the following:
- The developer can save data in debug mode (save_debug), where it is simply displayed on the screen, or in S3 (save_s3), where it is stored in the cloud.
- The function save_debug uses the function format_for_debug.
- The function save_s3 uses the function create_s3.
Because the functions fall into two related pairs, they can be grouped into two classes:
class DebugStorage:
    def format_for_debug(self, some_data):
        ...
    def save_debug(self, some_data):
        """Prints the data to the screen"""
        formatted_data = self.format_for_debug(some_data)
        print(formatted_data)
class S3Storage:
    def create_s3(self, bucket):
        """Creates the s3 bucket if it does not exist"""
        ...
    def save_s3(self, some_data):
        s3 = self.create_s3("bucket_name")
        ...
Here's a rule of thumb to help decide between functions and classes:
- Always start with functions.
- Move to classes when you feel like you can group different subsets of functionality.
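As a small illustration of the second step, here is how calling code might use one of the grouped classes. The body of format_for_debug is made up, since the original leaves it unspecified.

```python
class DebugStorage:
    def format_for_debug(self, some_data):
        # Hypothetical formatting; the original leaves this body unspecified.
        return f"[debug] {some_data!r}"

    def save_debug(self, some_data):
        """Prints the data to the screen."""
        formatted_data = self.format_for_debug(some_data)
        print(formatted_data)


# The caller works with one object that owns both related functions.
storage = DebugStorage()
storage.save_debug({"name": "Some place"})
```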
Creating modules and entry points to the application
Every application has an entry point.
That is, there is a single module (in other words, a file) that launches the application. This could be a single script or a larger module.
Whenever you create an entry point to your application, be sure to add a check that the code is being executed directly and not imported:
def execute_main():
    ...
if __name__ == "__main__":  # Add this condition
    execute_main()
By doing this, you ensure that importing this code won't accidentally execute it. It will only execute if it's explicitly executed.
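The effect of the check can be demonstrated with a small experiment. This is only a sketch: it writes a throwaway module to a temporary directory and uses the standard library's runpy.run_path to execute it under two different values of __name__. The file name entry_point.py and the calls list are made up for the demonstration.

```python
import runpy
import tempfile
from pathlib import Path

# Source of a throwaway entry-point module, using the same guard as above.
SOURCE = '''
calls = []

def execute_main():
    calls.append("executed")

if __name__ == "__main__":
    execute_main()
'''


def run_as(run_name):
    """Execute SOURCE with the given __name__ and return what execute_main recorded."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "entry_point.py"
        path.write_text(SOURCE)
        module_globals = runpy.run_path(str(path), run_name=run_name)
    return module_globals["calls"]


imported_calls = run_as("entry_point")  # __name__ as it would be on `import entry_point`
script_calls = run_as("__main__")       # __name__ as it would be on `python entry_point.py`
print(imported_calls, script_calls)
```

Only the second run triggers execute_main, because only then does __name__ equal "__main__".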
File __main__.py
You may have noticed that some Python packages can be called using the -m switch:
python -m pytest
python -m tryceratops
python -m faust
python -m flake8
python -m black
The system treats such packages almost like regular command-line utilities, since they can also be launched like this:
pytest
tryceratops
faust
flake8
black
To let your project be run with python -m, you need to add a file __main__.py to the main module:
<project>
├── src
│   ├── example_module (main module)
│   │   ├── __init__.py
│   │   ├── __main__.py (add this file here)
│   │   └── many_files.py
│   │
│   └── tests/*
│       └── many_tests.py
│
├── .gitignore
├── pyproject.toml
└── README.md
And don't forget that the file __main__.py also needs the __name__ == "__main__" check.
Once you install your module, you can run it with a command like python -m example_module.
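A __main__.py for such a package could be as small as the sketch below. The build_greeting function is hypothetical. In a real project the application logic would more likely be imported from a sibling module of example_module.

```python
# src/example_module/__main__.py (sketch)
import sys


def build_greeting(args: list[str]) -> str:
    """Hypothetical application logic, kept in this file so the sketch is self-contained."""
    target = args[0] if args else "world"
    return f"Hello, {target}!"


def main() -> None:
    # sys.argv[0] is the program path; the user's arguments start at index 1.
    print(build_greeting(sys.argv[1:]))


if __name__ == "__main__":  # true when started with `python -m example_module`
    main()
```

Keeping main() separate from the check also lets other code import and call it without side effects.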