password-security-python/AGENTS.md
2025-11-13 23:56:05 +00:00

Repository Guidelines for AI Agents

Project Overview

This is a Python password hashing utility supporting multiple cryptographic algorithms (PBKDF2, Argon2, bcrypt). The project provides both library functions and two CLI implementations with different UX patterns.

Key Architecture:

  • Algorithm Registry Pattern: Pluggable hash algorithms implementing a common Algorithm protocol
  • Auto-registration: Algorithms register themselves on module import
  • Two CLIs: salt.py (shortcut syntax) and salt2.py (explicit commands)
  • Backward compatible: Default is PBKDF2 for compatibility with legacy code

Module Structure:

salt.py               # Core library + CLI with shortcut syntax
salt2.py              # Alternative CLI with explicit subcommands
algorithms.py         # Algorithm registry and Protocol definition
pbkdf2_algorithm.py   # PBKDF2-HMAC-SHA256 implementation
argon2_algorithm.py   # Argon2id implementation
bcrypt_algorithm.py   # bcrypt implementation
tests/
  test_hashing.py     # Tests for core hash/verify functions
  test_cli.py         # Tests for salt.py CLI behavior
  test_cli2.py        # Tests for salt2.py CLI behavior
  test_algorithms.py  # Tests for algorithm implementations
  test_integration.py # End-to-end tests for CLI tools

Essential Commands

Running the Tools

# Hash a password with default algorithm (PBKDF2)
python3 salt.py "MyPassword"                    # Shortcut syntax
python3 salt.py hash "MyPassword"               # Explicit command
python3 salt2.py generate "MyPassword"          # Alternative CLI

# Hash with specific algorithm
python3 salt.py hash --algorithm argon2 "MyPassword"
python3 salt.py --algorithm bcrypt "MyPassword" # Shortcut with algorithm
python3 salt2.py generate --algorithm argon2 "MyPassword"

# Verify a password
python3 salt.py verify "MyPassword" <salt_b64> <hash_b64>
python3 salt.py verify --algorithm argon2 "MyPassword" "" <hash_b64>
python3 salt2.py verify --algorithm bcrypt "MyPassword" "" <hash_b64>

# List available algorithms
python3 salt.py list-algorithms
python3 salt2.py list-algorithms

Testing

# Run all tests (28 tests total as of now)
python3 -m pytest

# Verbose output
python3 -m pytest -v

# Run specific test files
python3 -m pytest tests/test_hashing.py
python3 -m pytest tests/test_cli.py
python3 -m pytest tests/test_algorithms.py
python3 -m pytest tests/test_integration.py

# Run specific test patterns
python3 -m pytest -k verify
python3 -m pytest -k algorithm

# Run with verbose failure details
python3 -m pytest --maxfail=1 -vv

# Run tests with coverage (if pytest-cov installed)
python3 -m pytest --cov=. --cov-report=term-missing

Development Workflow

# Install dependencies
pip install -r requirements.txt

# Quick smoke test of CLI
python3 salt.py "test" && echo "salt.py works"
python3 salt2.py generate "test" && echo "salt2.py works"

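# Test algorithm round-trip (assumes output lines of the form "Salt: <b64>" and "Hash: <b64>")
output=$(python3 salt.py "secret")
salt=$(printf '%s\n' "$output" | awk '/Salt:/ {print $2}')
hash=$(printf '%s\n' "$output" | awk '/Hash:/ {print $2}')
python3 salt.py verify "secret" "$salt" "$hash" && echo "round-trip works"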

Code Organization and Patterns

Algorithm Registry Pattern

How it works:

  1. algorithms.py defines the Algorithm Protocol with hash() and verify() methods and an identifier attribute
  2. Each algorithm module creates a class implementing the protocol
  3. At the bottom of its module, each algorithm imports register_algorithm and registers an instance of itself
  4. algorithms.py imports all algorithm modules at its bottom to trigger registration
  5. Core functions use get_algorithm(name) to retrieve registered implementations
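
The steps above can be sketched in a single file. This is a minimal illustration, not the repository's code: the _REGISTRY dict, the ValueError raised for unknown names, the exact method signatures, and the toy ReverseAlgorithm class are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Protocol


class Algorithm(Protocol):
    """Structural interface each hash algorithm is expected to satisfy."""

    identifier: str

    def hash(self, password: str) -> tuple[str, str]: ...

    def verify(self, password: str, salt_b64: str, hash_b64: str) -> bool: ...


# Hypothetical module-level storage; the real registry may be organised differently.
_REGISTRY: dict[str, Algorithm] = {}


def register_algorithm(algorithm: Algorithm) -> None:
    """Store an algorithm instance under its identifier."""
    _REGISTRY[algorithm.identifier] = algorithm


def get_algorithm(name: str) -> Algorithm:
    """Look up a registered algorithm (assumed here to raise ValueError if unknown)."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown algorithm: {name!r}") from None


def list_algorithms() -> list[str]:
    """Return the registered identifiers in sorted order."""
    return sorted(_REGISTRY)


class ReverseAlgorithm:
    """Toy stand-in (NOT secure) that exists only to exercise the registry."""

    identifier = "reverse"

    def hash(self, password: str) -> tuple[str, str]:
        return "", password[::-1]

    def verify(self, password: str, salt_b64: str, hash_b64: str) -> bool:
        return password[::-1] == hash_b64


# In the real layout this call sits at the bottom of the algorithm's own module.
register_algorithm(ReverseAlgorithm())
```

Because Algorithm is a Protocol, ReverseAlgorithm does not need to inherit from it; matching attributes and methods are enough for a type checker.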

Adding a new algorithm:

  1. Create <name>_algorithm.py implementing Algorithm protocol
  2. Add register_algorithm(<Name>Algorithm()) at bottom of file
  3. Import module at end of algorithms.py: import <name>_algorithm # noqa: E402, F401
  4. Add tests to tests/test_algorithms.py
  5. Update CLI choices in _build_parser() if using explicit choices (currently uses dynamic list)
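
As an illustration of steps 1 and 2, the following sketch shows what a hypothetical scrypt_algorithm.py could look like. scrypt is not part of this repository; the class name, the cost parameters, and the keyword-only salt_bytes argument are assumptions chosen for the example. It follows the PBKDF2 convention of returning a separate salt.

```python
from __future__ import annotations

import base64
import binascii
import hashlib
import hmac
import os


class ScryptAlgorithm:
    """Hypothetical scrypt-based algorithm that returns a separate salt."""

    identifier = "scrypt"

    # Illustrative cost parameters, not taken from the repository.
    _N, _R, _P, _DKLEN = 2**14, 8, 1, 32

    def _derive(self, password: str, salt: bytes) -> bytes:
        return hashlib.scrypt(
            password.encode("utf-8"),
            salt=salt,
            n=self._N,
            r=self._R,
            p=self._P,
            dklen=self._DKLEN,
        )

    def hash(self, password: str, *, salt_bytes: int = 16) -> tuple[str, str]:
        salt = os.urandom(salt_bytes)
        digest = self._derive(password, salt)
        return (
            base64.b64encode(salt).decode("ascii"),
            base64.b64encode(digest).decode("ascii"),
        )

    def verify(self, password: str, salt_b64: str, hash_b64: str) -> bool:
        try:
            salt = base64.b64decode(salt_b64, validate=True)
            expected = base64.b64decode(hash_b64, validate=True)
        except (binascii.Error, ValueError):
            return False  # malformed input counts as a mismatch
        return hmac.compare_digest(self._derive(password, salt), expected)


# In the repository the module would end with the registration step (step 2):
# from algorithms import register_algorithm
# register_algorithm(ScryptAlgorithm())
```

The registration lines are left as comments so the sketch runs on its own; inside the project they would be live code at the bottom of the module.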

Import Pattern for Registration

Critical: The algorithm modules use a specific import pattern to avoid circular dependencies:

# At BOTTOM of algorithm module (e.g., pbkdf2_algorithm.py)
from algorithms import register_algorithm

# Auto-register when module is imported
register_algorithm(PBKDF2Algorithm())

In algorithms.py:

# At BOTTOM of file
# Import to auto-register algorithms
import pbkdf2_algorithm  # noqa: E402, F401
import argon2_algorithm  # noqa: E402, F401
import bcrypt_algorithm  # noqa: E402, F401

The # noqa: E402, F401 comments suppress linter warnings about:

  • E402: module level import not at top of file (intentional for registration)
  • F401: imported but unused (side-effect import for registration)

Salt Handling Across Algorithms

IMPORTANT difference between algorithms:

  • PBKDF2: Generates separate 16-byte salt with os.urandom(), returns (salt_b64, hash_b64) where both are populated
  • Argon2: Embeds salt in hash string (e.g., $argon2id$v=19$m=65536,t=3,p=4$...), returns ("", hash_b64) with empty salt
  • bcrypt: Embeds salt in hash string (e.g., $2b$12$...), returns ("", hash_b64) with empty salt

When verifying Argon2/bcrypt passwords, the salt_b64 parameter is ignored since the salt is embedded in hash_b64.
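
The PBKDF2 shape of the return value can be sketched with the standard library alone. This is an approximation of what pbkdf2_algorithm.py is described as doing, not a copy of it; the function names pbkdf2_hash and has_embedded_salt are hypothetical.

```python
import base64
import hashlib
import os


def pbkdf2_hash(password: str, *, iterations: int = 200_000, salt_bytes: int = 16) -> tuple[str, str]:
    """Return (salt_b64, hash_b64) with a freshly generated, separately stored salt."""
    salt = os.urandom(salt_bytes)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return (
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def has_embedded_salt(salt_b64: str) -> bool:
    """Hypothetical helper: an empty salt field signals an Argon2/bcrypt-style result."""
    return salt_b64 == ""


# A low iteration count is used here only to keep the demonstration fast.
salt_b64, hash_b64 = pbkdf2_hash("secret", iterations=1_000)
```

Code that stores results from several algorithms can branch on the empty salt field, as has_embedded_salt does, instead of hard-coding algorithm names.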

CLI Argument Normalization (salt.py only)

salt.py has special shortcut syntax via _normalize_args():

# These are equivalent:
python3 salt.py "mypassword"
python3 salt.py hash "mypassword"

# Algorithm shortcut also works:
python3 salt.py --algorithm argon2 "mypassword"
# Becomes: python3 salt.py hash --algorithm argon2 "mypassword"

Normalization rules:

  1. If first arg is hash, verify, or list-algorithms → pass through unchanged
  2. If first arg is -h or --help → pass through unchanged
  3. If first arg is --algorithm or -a → prepend hash
  4. Otherwise (plain password) → prepend hash
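
The four rules collapse into a single check on the first argument. The sketch below is a reconstruction from the rules, not the actual _normalize_args(); the constant names and the handling of an empty argument list are assumptions.

```python
from __future__ import annotations

COMMANDS = {"hash", "verify", "list-algorithms"}
HELP_FLAGS = {"-h", "--help"}


def normalize_args(argv: list[str]) -> list[str]:
    """Prepend "hash" unless argv already starts with a command or a help flag."""
    if not argv:
        return argv  # assumption: empty input is left for the parser to reject
    first = argv[0]
    if first in COMMANDS or first in HELP_FLAGS:
        return argv  # rules 1 and 2: pass through unchanged
    # Rules 3 and 4: both "--algorithm/-a ..." and a plain password get "hash" prepended.
    return ["hash", *argv]
```

One consequence of rule 1 under this reading: a password that is literally equal to a command name, such as verify, is parsed as the command unless hash is given explicitly.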

Tested in: tests/test_cli.py::test_main_supports_hash_shortcut() and test_main_hash_algorithm_shortcut()

Core Functions

hash_password(password, *, algorithm="pbkdf2", iterations=None, salt_bytes=16) -> tuple[str, str]

  • Located in salt.py lines 19-30
  • Returns (salt_b64, hash_b64) tuple
  • For Argon2/bcrypt: salt_b64 will be empty string
  • Delegates to algorithm implementation via get_algorithm(algorithm).hash(...)

verify_password(password, salt_b64, hash_b64, *, algorithm="pbkdf2", iterations=None) -> bool

  • Located in salt.py lines 33-45
  • Returns True on match, False on mismatch or invalid input
  • Does not raise on malformed input: binascii.Error and ValueError are caught and reported as False
  • Uses hmac.compare_digest() for timing-safe comparison (PBKDF2)
  • Delegates to algorithm implementation via get_algorithm(algorithm).verify(...)
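
The verification behaviour described in these bullets can be approximated for the PBKDF2 case as follows. This is a sketch under the stated behaviour, not the repository's implementation; the function name verify_pbkdf2 is hypothetical.

```python
import base64
import binascii
import hashlib
import hmac


def verify_pbkdf2(password: str, salt_b64: str, hash_b64: str, *, iterations: int = 200_000) -> bool:
    """Return True only if the password reproduces the stored PBKDF2 digest."""
    try:
        salt = base64.b64decode(salt_b64, validate=True)
        expected = base64.b64decode(hash_b64, validate=True)
    except (binascii.Error, ValueError):
        return False  # malformed input is reported as a plain mismatch
    actual = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Constant-time comparison avoids leaking how many leading bytes matched.
    return hmac.compare_digest(actual, expected)
```

The iteration count must match the one used when hashing; a different count yields a different digest and therefore False.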

Naming Conventions and Style

Python Style

  • Target version: Python 3.11+ (the code uses | union types; structural pattern matching is available if needed)
  • Actual version in use: Python 3.12.3 (per pytest output)
  • Indentation: 4 spaces (never tabs)
  • Type hints: Required for all public functions, use from __future__ import annotations for forward refs
  • Function/variable names: lowercase_with_underscores
  • Class names: CapWords (e.g., PBKDF2Algorithm, Argon2Algorithm)
  • Private/internal functions: Prefix with _ (e.g., _build_parser(), _normalize_args())

Naming Patterns Observed

  • Module names: <algorithm>_algorithm.py (e.g., pbkdf2_algorithm.py)
  • Class names: <Algorithm>Algorithm (e.g., PBKDF2Algorithm, BcryptAlgorithm)
  • Test files: test_<feature>.py (e.g., test_hashing.py, test_algorithms.py)
  • Test functions: test_<behavior>_<context>() with docstrings explaining intent

Constants

DEFAULT_SALT_BYTES = 16
DEFAULT_ITERATIONS = int(os.environ.get("PBKDF2_ITERATIONS", "200000"))

Import Order

  1. from __future__ import annotations (always first if present)
  2. Standard library imports
  3. Third-party imports (argon2, bcrypt)
  4. Local imports (at bottom for registration side-effects)

Testing Guidelines

Test Structure

Coverage goal: >90% for hashing utilities

Test organization:

  • test_hashing.py: Core hash_password() and verify_password() functions
  • test_cli.py: CLI behavior for salt.py (including shortcut syntax)
  • test_cli2.py: CLI behavior for salt2.py
  • test_algorithms.py: Individual algorithm implementations
  • test_integration.py: End-to-end integration tests for CLI commands

Testing Patterns

Round-trip tests (most important):

def test_<algo>_algorithm_hash_round_trip():
    """Verify <algo> algorithm can hash and verify passwords."""
    algo = get_algorithm("<algo>")
    salt, hashed = algo.hash("test password")
    assert algo.verify("test password", salt, hashed)
    assert not algo.verify("wrong password", salt, hashed)

Property tests:

  • Base64 validation for outputs (see test_hash_password_returns_base64)
  • Invalid input handling (see test_verify_password_handles_invalid_base64)
  • Algorithm-specific behavior (see test_pbkdf2_algorithm_respects_iterations)

CLI tests:

def test_main_<command>_<scenario>():
    """Test description."""
    assert main([<args>]) == <expected_exit_code>

Exit codes: 0 = success, 1 = verification failure

Test Execution Notes

  • 28 tests total (as of current state)
  • Test execution time: ~7 seconds (mostly from Argon2/bcrypt hashing)
  • All tests must pass before committing
  • Use pytest -v for verbose output
  • Use pytest -k <pattern> to run subset

Adding Tests for New Algorithms

When adding a new algorithm, add these tests to tests/test_algorithms.py:

def test_<algo>_algorithm_hash_round_trip():
    """Verify <algo> algorithm can hash and verify passwords."""
    algo = get_algorithm("<algo>")
    salt, hashed = algo.hash("test password")
    assert algo.verify("test password", salt, hashed)
    assert not algo.verify("wrong password", salt, hashed)

def test_<algo>_algorithm_identifier():
    """Verify <algo> algorithm has correct identifier."""
    algo = get_algorithm("<algo>")
    assert algo.identifier == "<algo>"

And to tests/test_cli.py or tests/test_cli2.py:

def test_main_hash_with_algorithm_<algo>():
    """Test hash command with --algorithm <algo>."""
    assert main(["hash", "--algorithm", "<algo>", "test-password"]) == 0

Important Gotchas and Non-Obvious Patterns

1. Algorithm Registration Order Matters

The imports at the bottom of algorithms.py trigger registration. The registry is populated only once those imports have run, so removing them leaves the registry empty, and moving them above the definition of register_algorithm breaks registration. This is handled correctly in the current code, but be careful when refactoring.

2. CLI Output to Stdout (Tests Must Capture)

The CLI functions print to stdout. Most tests call main([...]) and check only the exit code, without capturing output. To test output content, use pytest's capsys fixture. The subprocess-based integration tests also capture stdout.

def test_output_format(capsys):
    main(["hash", "test"])
    captured = capsys.readouterr()
    assert "Salt:" in captured.out
    assert "Hash:" in captured.out

3. Empty Salt for Argon2/bcrypt

When working with Argon2 or bcrypt hashes:

  • The first element of the tuple is always "" (empty string)
  • The second element contains the full hash with embedded salt
  • When verifying, pass "" (or any other string) for the salt; the value is ignored
  • Tests should handle this: salt, hashed = algo.hash("pw") where salt == ""

4. Environment Variable for Iterations

PBKDF2_ITERATIONS environment variable affects default behavior. Tests should be aware this can change behavior:

DEFAULT_ITERATIONS = int(os.environ.get("PBKDF2_ITERATIONS", "200000"))
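
Assuming the constant is a module-level assignment as shown, it is evaluated once at import time, so changing the environment variable after the module has been imported does not change the default. The sketch below isolates the parsing step so a test can pass its own mapping; read_default_iterations and FALLBACK_ITERATIONS are hypothetical names, not part of the repository.

```python
import os
from collections.abc import Mapping

FALLBACK_ITERATIONS = 200_000


def read_default_iterations(environ: Mapping[str, str] = os.environ) -> int:
    """Parse PBKDF2_ITERATIONS from the given mapping, falling back to 200000."""
    return int(environ.get("PBKDF2_ITERATIONS", str(FALLBACK_ITERATIONS)))
```

A non-numeric value makes int() raise ValueError. Tests that need a fixed cost can sidestep the variable entirely by passing iterations explicitly to hash_password() and verify_password().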

5. German UI Text

CLI output and help text are in German:

  • "✓ Passwort korrekt" / "✗ Passwort falsch" (English: "password correct" / "password incorrect")
  • Help text uses German descriptions
  • When adding CLI features, maintain German for user-facing text

6. Exception Handling in Verify

verify_password() and the algorithm implementations catch exceptions and return False rather than raising. This is intentional: callers see a single failure signal, so malformed input and a wrong password are indistinguishable and no detail leaks through exception types.

7. Dynamic Algorithm List

The CLI parser uses list_algorithms() to populate choices dynamically:

hash_parser.add_argument(
    "--algorithm",
    "-a",
    choices=list_algorithms(),  # Dynamic based on registered algorithms
    default="pbkdf2",
    help="Hash-Algorithmus (Standard: pbkdf2)",
)

This means adding a new algorithm automatically adds it to CLI choices.
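
A self-contained sketch of the same idea follows. The list_algorithms() stub and the reduced build_parser() are stand-ins for illustration; the real parser has more subcommands and German help text.

```python
import argparse


def list_algorithms() -> list[str]:
    """Stand-in for the registry lookup; the real list comes from registered modules."""
    return ["argon2", "bcrypt", "pbkdf2"]


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with only the hash subcommand."""
    parser = argparse.ArgumentParser(prog="salt.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    hash_parser = subparsers.add_parser("hash")
    # choices is evaluated when the parser is built, so it reflects the registry at that moment.
    hash_parser.add_argument("--algorithm", "-a", choices=list_algorithms(), default="pbkdf2")
    hash_parser.add_argument("password")
    return parser


args = build_parser().parse_args(["hash", "-a", "argon2", "MyPassword"])
```

An algorithm name outside the list makes argparse print a usage error and exit with status 2, so unknown names never reach get_algorithm().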

8. Relative Imports vs Absolute Imports

The code uses absolute imports (not relative):

from algorithms import get_algorithm  # Not: from .algorithms import get_algorithm

This works because modules are in the root directory, not a package. Keep this pattern consistent.

Security Practices

Cryptographic Randomness

  • Always use os.urandom() for salt generation (never random module)

Timing Attack Protection

  • Use hmac.compare_digest() for hash comparison (PBKDF2)
  • Argon2 and bcrypt libraries handle timing safety internally

Input Validation

  • Use base64.b64decode(data, validate=True) to validate base64 format
  • Catch and handle binascii.Error and ValueError
  • Return False rather than raising exceptions in verify functions
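
The reason for validate=True is easiest to see on an input with a stray character. The snippet below uses only the standard library; the variable names are illustrative.

```python
import base64
import binascii

tainted = "YWJj!"  # "YWJj" is valid base64 for b"abc"; "!" is outside the alphabet

# Default mode silently discards characters outside the base64 alphabet.
lenient = base64.b64decode(tainted)

# Strict mode rejects the same input instead of guessing.
try:
    base64.b64decode(tainted, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True
```

Without strict validation, a corrupted salt or hash could decode to unintended bytes rather than being rejected up front.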

Iteration Count

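  • Default 200,000 for PBKDF2-HMAC-SHA256. This is below the 600,000 iterations the OWASP Password Storage Cheat Sheet has recommended for PBKDF2-HMAC-SHA256 since 2023, so raise it where latency allows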
  • Configurable via PBKDF2_ITERATIONS environment variable

No Logging of Secrets

  • Never log passwords, salts, or hashes
  • Print statements only in CLI code for user output

Configuration and Environment

Environment Variables

PBKDF2_ITERATIONS

  • Default: "200000" (as string, converted to int)
  • Used by: pbkdf2_algorithm.py
  • Example: export PBKDF2_ITERATIONS=300000

Dependencies (requirements.txt)

pytest>=7.4              # Testing framework
argon2-cffi>=23.1.0      # Argon2id implementation
bcrypt>=4.1.0            # bcrypt implementation

Python Version Requirements

  • Minimum: Python 3.11 (the | union type syntax itself requires Python 3.10 or later)
  • Tested on: Python 3.12.3

Commit Guidelines

Commit Message Format

Use conventional commit format with imperative mood:

<type>: <short description>

Types in use: feat, fix, test, refactor, docs.

Before Committing

  1. Run full test suite: python3 -m pytest -v
  2. Ensure all tests pass (28/28)
  3. Verify no regressions
  4. Add tests for new functionality

Working with Plans and Documentation

Plan Execution Pattern

The repository includes a detailed plan at docs/plans/2025-11-13-multi-algorithm-support.md. When implementing from plans, follow the TDD approach outlined in the tasks.

Multiple Documentation Files

  • AGENTS.md: This file - agent/developer guidelines
  • CLAUDE.md: German translation, more detailed code examples
  • README.md: User-facing documentation in German
  • docs/plans/*.md: Implementation plans with TDD steps

Keep these in sync when making significant changes.