Skip to main content

On This Page

Python Modules and Imports - Best Practices and Pitfalls

15 min read
Share

Python Modules and Imports - Best Practices and Pitfalls

1. TL;DR - The Import Essentials

What you need to know:

  • Use absolute imports for clarity and maintainability
  • Avoid circular imports by restructuring code or using local imports
  • Import at module level, not inside functions (except for specific cases)
  • Use if TYPE_CHECKING: for type-only imports to avoid runtime overhead
  • Never use import * in production code

Common pitfalls:

  • Circular dependencies causing ImportError or AttributeError
  • Importing mutable objects that get modified unexpectedly
  • Performance issues from importing heavy modules unnecessarily
  • Name collisions from wildcard imports

Quick performance check:

# Bad - imports inside loop
for i in range(1000):
    import numpy as np  # Reimports 1000 times (cached but still overhead)
    
# Good - import once
import numpy as np
for i in range(1000):
    np.array([1, 2, 3])

2. Understanding Python’s Import System

How Imports Actually Work

When you write import module, Python:

  1. Checks sys.modules cache (already imported?)
  2. Searches for the module in sys.path
  3. Executes the module code (only once per interpreter session)
  4. Binds the name in the current namespace
# Example: Understanding sys.modules cache
import sys

# First import - module code executes
import math
print('math' in sys.modules)  # True

# Second import - uses cached version
import math  # No re-execution, just name binding

The sys.path Search Order

import sys
print('\n'.join(sys.path))

Typical output:

/current/working/directory     # Current script's directory
/usr/lib/python3.12/site-packages  # Third-party packages
/usr/lib/python3.12            # Standard library
...

Critical pitfall: Files in your current directory shadow standard library modules!

# If you create a file named "random.py" in your project:
# my_project/
#   random.py  # Your file - BAD NAME!
#   main.py

# In main.py:
import random  # Imports YOUR random.py, not stdlib!

3. Import Styles - When to Use Each

# project/
#   src/
#     __init__.py
#     utils/
#       __init__.py
#       helpers.py
#     services/
#       __init__.py
#       user_service.py

# In user_service.py - GOOD
from src.utils.helpers import validate_email
from src.config import DATABASE_URL

# Clear, explicit, works everywhere

Relative Imports (Use Sparingly)

# In user_service.py - relative imports
from ..utils.helpers import validate_email  # Up one level
from .auth_service import authenticate      # Same level

# Only works inside packages, not for top-level scripts

When to use relative imports:

  • Within tightly coupled package internals
  • When you might rename/move the entire package
  • In reusable libraries where parent package name is unknown

When NOT to use:

  • In top-level scripts (raises ImportError)
  • When absolute path is clearer
  • Cross-package imports

Import Variants

# 1. Import module
import math
result = math.sqrt(16)  # Must use qualified name

# 2. Import specific names
from math import sqrt, pi
result = sqrt(16)  # Direct access

# 3. Import with alias
import numpy as np  # Convention for large/common modules
from utils.helpers import validate_email as validate

# 4. Import everything (DON'T DO THIS)
from math import *  # Now sqrt, pi, sin, etc. pollute namespace

4. Understanding __init__.py

What is __init__.py?

__init__.py is a special file that marks a directory as a Python package. When Python sees this file in a directory, it treats that directory as an importable package rather than just a folder.

# Without __init__.py - NOT a package
myproject/
  utils/          # Just a folder
    helpers.py    # Can't import as: from myproject.utils import helpers

# With __init__.py - IS a package
myproject/
  utils/
    __init__.py   # Makes it a package!
    helpers.py    # Can import as: from myproject.utils import helpers

Note: Python 3.3+ introduced “namespace packages” which don’t require __init__.py, but explicit is better than implicit.

Why Use __init__.py?

1. Package Recognition Makes Python recognize the directory as a package for imports.

2. Package Initialization Runs once when the package is first imported - useful for setup code.

3. Control Public API Define what gets imported with from package import *.

4. Simplify Imports Expose commonly used functions/classes at package level.

5. Namespace Management Organize related modules into logical groups.

How to Use __init__.py

Empty __init__.py (Minimal Approach)

# mypackage/__init__.py
# Empty file - just marks directory as package

# Usage:
from mypackage.module import function

Expose Public API

# mypackage/
#   __init__.py
#   core.py
#   utils.py
#   _internal.py

# mypackage/__init__.py
"""
MyPackage - A utility library for data processing.
"""

# Import from submodules
from .core import process_data, DataProcessor
from .utils import validate_input, format_output

# Define public API
__all__ = ['process_data', 'DataProcessor', 'validate_input', 'format_output']

# Package metadata
__version__ = '1.0.0'
__author__ = 'Your Name'

# Users can now do:
# from mypackage import process_data  # Instead of from mypackage.core import process_data

Package Initialization Code

# database/__init__.py
"""Database package - handles all database operations."""

import logging

# Setup package-level logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Initialize connection pool (runs once on first import)
_connection_pool = None

def get_connection_pool():
    """Lazy initialization of connection pool."""
    global _connection_pool
    if _connection_pool is None:
        logger.info("Initializing database connection pool")
        _connection_pool = create_pool()
    return _connection_pool

# Expose public API
from .models import User, Post
from .queries import get_user, create_user

__all__ = ['User', 'Post', 'get_user', 'create_user', 'get_connection_pool']

Conditional Imports and Compatibility

# mypackage/__init__.py
"""Package with optional dependencies."""

# Core functionality (always available)
from .core import basic_function

__all__ = ['basic_function']

# Optional features
try:
    from .advanced import advanced_feature
    __all__.append('advanced_feature')
    HAS_ADVANCED = True
except ImportError:
    HAS_ADVANCED = False

# Version-specific imports
import sys
if sys.version_info >= (3, 10):
    from .modern import new_feature
    __all__.append('new_feature')

__version__ = '2.1.0'

Subpackage Organization

# myapp/
#   __init__.py
#   models/
#     __init__.py
#     user.py
#     post.py
#   services/
#     __init__.py
#     user_service.py
#   utils/
#     __init__.py
#     validators.py

# myapp/__init__.py
"""Main application package."""
from .models import User, Post
from .services import UserService

__all__ = ['User', 'Post', 'UserService']
__version__ = '1.0.0'

# myapp/models/__init__.py
"""Data models subpackage."""
from .user import User
from .post import Post

__all__ = ['User', 'Post']

# myapp/services/__init__.py
"""Business logic services."""
from .user_service import UserService

__all__ = ['UserService']

# Now users can do:
from myapp import User, UserService
# Instead of:
from myapp.models.user import User
from myapp.services.user_service import UserService

Common Patterns

Pattern 1: Re-export Everything from Submodules

# mypackage/__init__.py
"""Convenience imports - expose all submodule contents."""

from .module_a import *
from .module_b import *
from .module_c import *

# Combine __all__ from all submodules
from .module_a import __all__ as all_a
from .module_b import __all__ as all_b
from .module_c import __all__ as all_c

__all__ = all_a + all_b + all_c

Pattern 2: Lazy Loading Heavy Modules

# mypackage/__init__.py
"""Lazy loading for heavy dependencies."""

def __getattr__(name):
    """Lazy import heavy modules only when accessed."""
    if name == 'heavy_module':
        from . import heavy_module
        return heavy_module
    elif name == 'MLModel':
        from .ml import MLModel
        return MLModel
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

__all__ = ['heavy_module', 'MLModel']

# Usage:
import mypackage
# heavy_module not loaded yet...
mypackage.heavy_module.do_something()  # NOW it's loaded

Pattern 3: Plugin Discovery

# plugins/__init__.py
"""Auto-discover and register all plugins."""

import os
import importlib
from pathlib import Path

# Find all plugin modules
plugin_dir = Path(__file__).parent
plugin_modules = []

for file in plugin_dir.glob('plugin_*.py'):
    module_name = file.stem
    module = importlib.import_module(f'.{module_name}', package=__name__)
    if hasattr(module, 'register'):
        module.register()
    plugin_modules.append(module)

__all__ = [m.__name__.split('.')[-1] for m in plugin_modules]

Pattern 4: Deprecation Warnings

# mypackage/__init__.py
"""Package with deprecated functions."""

import warnings

from .core import new_function

# Deprecated function
def old_function(*args, **kwargs):
    warnings.warn(
        "old_function is deprecated, use new_function instead",
        DeprecationWarning,
        stacklevel=2
    )
    return new_function(*args, **kwargs)

__all__ = ['new_function', 'old_function']

Python 3.3+ Namespace Packages

Since Python 3.3, you can create packages WITHOUT __init__.py:

# PEP 420 - Namespace packages
# project1/company/
#   module_a.py  # No __init__.py!

# project2/company/
#   module_b.py  # No __init__.py!

# Both directories combine into one namespace:
from company import module_a  # From project1
from company import module_b  # From project2

5. Best Practices

Practice #1: Import at Module Level

# GOOD - imports at top
import json
import requests
from typing import Dict, List

def fetch_user_data(user_id: int) -> Dict:
    response = requests.get(f'/api/users/{user_id}')
    return json.loads(response.text)

# BAD - importing inside function
def fetch_user_data(user_id: int):
    import json  # Why? Unless you have a good reason...
    import requests
    response = requests.get(f'/api/users/{user_id}')
    return json.loads(response.text)

Exceptions where local imports are acceptable:

# 1. Breaking circular dependencies
def create_user():
    from .models import User  # Avoid circular import
    return User()

# 2. Optional dependencies
def export_to_excel(data):
    try:
        import openpyxl  # Only imported if this function is called
    except ImportError:
        raise RuntimeError("openpyxl required for Excel export")
    # ... use openpyxl

# 3. Heavy imports in rarely used code paths
def debug_visualize(data):
    import matplotlib.pyplot as plt  # Heavy import, only for debugging
    plt.plot(data)
    plt.show()

Practice #2: Use TYPE_CHECKING for Type Hints

Avoid circular imports and runtime overhead:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only imported by type checkers (mypy, pyright)
    # Not imported at runtime!
    from .models import User
    from .database import Database

def process_user(user: 'User', db: 'Database') -> None:
    # Use string literals for forward references
    pass

Why this matters:

# Without TYPE_CHECKING - circular import!
# user_service.py
from models import User  # models imports user_service

# models.py
from user_service import validate_user  # user_service imports models

# WITH TYPE_CHECKING - works!
# user_service.py
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from models import User

def validate_user(user: 'User') -> bool:
    pass

Practice #3: Organize Imports (PEP 8)

# Standard library imports
import os
import sys
from pathlib import Path

# Third-party imports
import numpy as np
import pandas as pd
from fastapi import FastAPI

# Local application imports
from src.config import settings
from src.utils import helpers
from .models import User

Use isort to automate this:

pip install isort
isort your_file.py

Practice #4: Avoid Wildcard Imports

# BAD - wildcard import
from utils import *

def format_date(date):
    pass

# Later, someone adds format_date to utils.py
# Your function is now shadowed! Silent bug!

# GOOD - explicit imports
from utils import validate_email, sanitize_input

def format_date(date):
    pass  # No collision possible

The only acceptable use of import *:

# In package __init__.py to expose public API
# utils/__init__.py
from .validators import *
from .formatters import *

__all__ = ['validate_email', 'format_phone', 'sanitize_input']

Practice #5: Use __all__ to Define Public API

# mypackage/__init__.py
from .core import process_data
from .utils import helper_function
from ._internal import _private_function

# Define what's public
__all__ = ['process_data', 'helper_function']

# Now: from mypackage import *
# Only imports process_data and helper_function
# _private_function is not imported

6. Common Pitfalls and Solutions

Pitfall #1: Circular Imports

The Problem:

# models.py
from services import UserService

class User:
    def save(self):
        UserService.save(self)

# services.py
from models import User

class UserService:
    @staticmethod
    def save(user: User):
        # Save to database
        pass

# Result: ImportError: cannot import name 'UserService' from partially initialized module

Solution 1: Restructure Code

# models.py - no imports needed
class User:
    def save(self):
        pass  # Keep models dumb

# services.py
from models import User

class UserService:
    @staticmethod
    def save(user: User):
        # All business logic here
        pass

# main.py
from models import User
from services import UserService

user = User()
UserService.save(user)

Solution 2: Local Import

# models.py
class User:
    def save(self):
        from services import UserService  # Import only when needed
        UserService.save(self)

# services.py
from models import User  # This import works

class UserService:
    @staticmethod
    def save(user: User):
        pass

Solution 3: Type-only Import

# models.py
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from services import UserService

class User:
    def save(self, service: 'UserService') -> None:
        service.save(self)

Pitfall #2: Importing Mutable Objects

The Problem:

# config.py
DATABASE_CONFIG = {
    'host': 'localhost',
    'port': 5432
}

# module_a.py
from config import DATABASE_CONFIG
DATABASE_CONFIG['host'] = 'production.db.com'  # Modifies original!

# module_b.py
from config import DATABASE_CONFIG
print(DATABASE_CONFIG['host'])  # 'production.db.com' - unexpected!

Solution: Use Immutable Config

# config.py
from typing import Final
from dataclasses import dataclass

@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    port: int

DATABASE_CONFIG: Final = DatabaseConfig(
    host='localhost',
    port=5432
)

# module_a.py
from config import DATABASE_CONFIG
DATABASE_CONFIG.host = 'production'  # FrozenInstanceError!

Pitfall #3: Late Import Side Effects

# analytics.py
print("Analytics module loaded!")  # Runs on import
_connection = setup_database()     # Runs on import!

# main.py
import analytics  # Prints message and connects to DB immediately

# Even if you never use analytics!

Solution: Lazy Initialization

# analytics.py
_connection = None

def get_connection():
    global _connection
    if _connection is None:
        _connection = setup_database()
    return _connection

# main.py
import analytics
# No side effects yet...

# Only connects when needed
conn = analytics.get_connection()

Pitfall #4: Import Order Dependencies

# Bad: Order-dependent side effects
# app.py
from database import db  # Calls db.init_app()
from models import User  # Assumes db is initialized

# If imports are reordered:
from models import User  # AttributeError: db not initialized!
from database import db

Solution: Explicit Initialization

# app.py
from database import db
from models import User

def create_app():
    db.init_app()  # Explicit initialization
    return app

Pitfall #5: Namespace Pollution

# utils.py
from math import *  # Imports 50+ names
from os import *    # Another 200+ names

def sqrt(x):
    # Intended to wrap math.sqrt, but which sqrt?
    pass

# Name collision! Debugging nightmare!

Solution: Explicit Imports

# utils.py
from math import sqrt as math_sqrt
from os import path

def sqrt(x):
    """Custom wrapper for math.sqrt"""
    return math_sqrt(abs(x))  # Clear which sqrt we're using

7. Performance Considerations

Import Cost Measurement

import time

# Measure import time
start = time.perf_counter()
import pandas as pd
end = time.perf_counter()
print(f"pandas import took {end - start:.3f} seconds")

# Typical results:
# pandas: 0.3-0.5 seconds (heavy)
# numpy: 0.05-0.1 seconds (medium)
# json: 0.001 seconds (negligible)

Lazy Imports for CLI Tools

# cli.py - before optimization
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

def main():
    if sys.argv[1] == 'simple':
        print("Hello")  # Don't need those heavy imports!

# cli.py - after optimization
def main():
    if sys.argv[1] == 'simple':
        print("Hello")
    elif sys.argv[1] == 'analyze':
        import pandas as pd  # Only import when needed
        # analyze data
    elif sys.argv[1] == 'train':
        import tensorflow as tf
        # train model

Import Caching

# Imports are cached in sys.modules
import sys
import time

# First import
start = time.perf_counter()
import numpy as np
print(f"First import: {time.perf_counter() - start:.6f}s")

# Remove from cache
del sys.modules['numpy']

# Second import (cache miss)
start = time.perf_counter()
import numpy as np
print(f"Second import (no cache): {time.perf_counter() - start:.6f}s")

# Third import (cache hit)
start = time.perf_counter()
import numpy as np
print(f"Third import (cached): {time.perf_counter() - start:.6f}s")

# Output:
# First import: 0.082453s
# Second import (no cache): 0.081234s
# Third import (cached): 0.000003s

8. Advanced Patterns

Dynamic Imports with importlib

import importlib

# Load module by string name
module_name = 'json'
json_module = importlib.import_module(module_name)
data = json_module.loads('{"key": "value"}')

# Plugin system
plugins = ['plugin_a', 'plugin_b', 'plugin_c']
for plugin_name in plugins:
    try:
        plugin = importlib.import_module(f'plugins.{plugin_name}')
        plugin.register()
    except ImportError:
        print(f"Plugin {plugin_name} not found")

Conditional Imports

# Support multiple backends
try:
    import orjson as json  # Faster alternative
    print("Using orjson")
except ImportError:
    import json  # Fallback to stdlib
    print("Using standard json")

# data = json.loads(...)  # Works with either backend

Import Hooks (Advanced)

import sys
from importlib.abc import MetaPathFinder, Loader
from importlib.machinery import ModuleSpec

class CustomImporter(MetaPathFinder, Loader):
    def find_spec(self, fullname, path, target=None):
        if fullname.startswith('auto_'):
            return ModuleSpec(fullname, self)
        return None
    
    def create_module(self, spec):
        return None  # Use default module creation
    
    def exec_module(self, module):
        # Auto-generate module content
        module.message = f"Auto-generated module: {module.__name__}"
        module.greet = lambda: print(module.message)

# Register custom importer
sys.meta_path.insert(0, CustomImporter())

# Now you can import modules that don't exist!
import auto_hello
auto_hello.greet()  # "Auto-generated module: auto_hello"

9. Project Structure Best Practices

Flat is Better Than Nested

# BAD - over-nested
project/
  src/
    app/
      core/
        services/
          user/
            implementations/
              user_service.py  # import src.app.core.services.user.implementations.user_service

# GOOD - flatter structure
project/
  src/
    services/
      user_service.py  # import src.services.user_service
    models/
      user.py
    utils/
      validators.py

Package Structure

# mypackage/
#   __init__.py
#   core.py
#   utils.py
#   _internal.py

# mypackage/__init__.py - expose public API
from .core import main_function, MainClass
from .utils import helper_function

__all__ = ['main_function', 'MainClass', 'helper_function']
__version__ = '1.0.0'

# Users can now do:
# from mypackage import main_function
# Instead of:
# from mypackage.core import main_function

Namespace Packages (Advanced)

# Split package across multiple directories
# project1/mycompany/
#   __init__.py  # Namespace package (can be empty or omitted)
#   module_a.py

# project2/mycompany/
#   __init__.py
#   module_b.py

# Python 3.3+ supports implicit namespace packages
# Now you can:
from mycompany import module_a  # From project1
from mycompany import module_b  # From project2

Conclusion

Performance checklist:

  • Heavy imports inside hot code paths?
  • Could I lazy-load rarely used modules?
  • Are CLI tools importing unnecessary heavy dependencies?

Code quality:

  • Run isort to organize imports
  • Run pylint or ruff to catch import issues
  • Check for unused imports with autoflake
# Auto-format imports
isort .

# Check for issues
pylint yourmodule.py
ruff check .

# Remove unused imports
autoflake --remove-all-unused-imports --in-place yourfile.py
  1. Be explicit - absolute imports over relative, specific names over wildcards
  2. Avoid circular dependencies - restructure code or use local/type-only imports
  3. Import at module level - except for breaking circles or lazy loading
  4. Use TYPE_CHECKING - for type hints without runtime overhead
  5. Define public APIs - with __all__ and __init__.py
  6. Test your imports - watch for side effects and ordering issues

Continue reading

Next article

Angular v21: Zoneless by Default and the Death of Zone.js

Related Content