# Repository Guidelines for AI Agents ## Project Overview This is a Python password hashing utility supporting multiple cryptographic algorithms (PBKDF2, Argon2, bcrypt). The project provides both library functions and two CLI implementations with different UX patterns. **Key Architecture:** - **Algorithm Registry Pattern**: Pluggable hash algorithms implementing a common `Algorithm` protocol - **Auto-registration**: Algorithms register themselves on module import - **Two CLIs**: `salt.py` (shortcut syntax) and `salt2.py` (explicit commands) - **Backward compatible**: Default is PBKDF2 for compatibility with legacy code **Module Structure:** ``` salt.py # Core library + CLI with shortcut syntax salt2.py # Alternative CLI with explicit subcommands algorithms.py # Algorithm registry and Protocol definition pbkdf2_algorithm.py # PBKDF2-HMAC-SHA256 implementation argon2_algorithm.py # Argon2id implementation bcrypt_algorithm.py # bcrypt implementation tests/ test_hashing.py # Tests for core hash/verify functions test_cli.py # Tests for salt.py CLI behavior test_algorithms.py # Tests for algorithm implementations ``` ## Essential Commands ### Running the Tools ```bash # Hash a password with default algorithm (PBKDF2) python3 salt.py "MyPassword" # Shortcut syntax python3 salt.py hash "MyPassword" # Explicit command python3 salt2.py generate "MyPassword" # Alternative CLI # Hash with specific algorithm python3 salt.py hash --algorithm argon2 "MyPassword" python3 salt.py --algorithm bcrypt "MyPassword" # Shortcut with algorithm python3 salt2.py generate "MyPassword" # Note: salt2.py doesn't support --algorithm yet # Verify a password python3 salt.py verify "MyPassword" python3 salt.py verify --algorithm argon2 "MyPassword" "" # List available algorithms python3 salt.py list-algorithms # Not implemented yet, would be useful addition ``` ### Testing ```bash # Run all tests (20 tests total as of now) python3 -m pytest # Verbose output python3 -m pytest -v # Run specific test files python3 -m pytest tests/test_hashing.py python3 -m pytest tests/test_cli.py python3 -m pytest tests/test_algorithms.py # Run specific test patterns python3 -m pytest -k verify python3 -m pytest -k algorithm # Run with verbose failure details python3 -m pytest --maxfail=1 -vv # Run tests with coverage (if pytest-cov installed) python3 -m pytest --cov=. --cov-report=term-missing ``` ### Development Workflow ```bash # Install dependencies pip install -r requirements.txt # Quick smoke test of CLI python3 salt.py "test" && echo "salt.py works" python3 salt2.py generate "test" && echo "salt2.py works" # Test algorithm round-trip salt_hash=$(python3 salt.py "secret" | tail -1 | cut -d' ' -f2) # Extract salt and hash from output for verify command ``` ## Code Organization and Patterns ### Algorithm Registry Pattern **How it works:** 1. `algorithms.py` defines `Algorithm` Protocol with `hash()`, `verify()`, and `identifier` attribute 2. Each algorithm module creates a class implementing the protocol 3. At module bottom, algorithm imports `register_algorithm` and registers itself 4. `algorithms.py` imports all algorithm modules at bottom to trigger registration 5. Core functions use `get_algorithm(name)` to retrieve registered implementations **Adding a new algorithm:** 1. Create `_algorithm.py` implementing `Algorithm` protocol 2. Add `register_algorithm(Algorithm())` at bottom of file 3. Import module at end of `algorithms.py`: `import _algorithm # noqa: E402, F401` 4. Add tests to `tests/test_algorithms.py` 5. Update CLI choices in `_build_parser()` if using explicit choices (currently uses dynamic list) ### Import Pattern for Registration **Critical:** The algorithm modules use a specific import pattern to avoid circular dependencies: ```python # At BOTTOM of algorithm module (e.g., pbkdf2_algorithm.py) from algorithms import register_algorithm # Auto-register when module is imported register_algorithm(PBKDF2Algorithm()) ``` **In algorithms.py:** ```python # At BOTTOM of file # Import to auto-register algorithms import pbkdf2_algorithm # noqa: E402, F401 import argon2_algorithm # noqa: E402, F401 import bcrypt_algorithm # noqa: E402, F401 ``` The `# noqa: E402, F401` comments suppress linter warnings about: - E402: module level import not at top of file (intentional for registration) - F401: imported but unused (side-effect import for registration) ### Salt Handling Across Algorithms **IMPORTANT difference between algorithms:** - **PBKDF2**: Generates separate 16-byte salt with `os.urandom()`, returns `(salt_b64, hash_b64)` where both are populated - **Argon2**: Embeds salt in hash string (e.g., `$argon2id$v=19$m=65536,t=3,p=4$...`), returns `("", hash_b64)` with empty salt - **bcrypt**: Embeds salt in hash string (e.g., `$2b$12$...`), returns `("", hash_b64)` with empty salt When verifying Argon2/bcrypt passwords, the `salt_b64` parameter is ignored since the salt is embedded in `hash_b64`. ### CLI Argument Normalization (salt.py only) **`salt.py` has special shortcut syntax via `_normalize_args()`:** ```python # These are equivalent: python3 salt.py "mypassword" python3 salt.py hash "mypassword" # Algorithm shortcut also works: python3 salt.py --algorithm argon2 "mypassword" # Becomes: python3 salt.py hash --algorithm argon2 "mypassword" ``` **Normalization rules:** 1. If first arg is `hash` or `verify` → pass through unchanged 2. If first arg is `-h` or `--help` → pass through unchanged 3. If first arg is `--algorithm` or `-a` → prepend `hash` 4. Otherwise (plain password) → prepend `hash` **Tested in:** `tests/test_cli.py::test_main_supports_hash_shortcut()` and `test_main_hash_algorithm_shortcut()` ### Core Functions **`hash_password(password, *, algorithm="pbkdf2", iterations=None, salt_bytes=16) -> tuple[str, str]`** - Located in `salt.py` lines 19-30 - Returns `(salt_b64, hash_b64)` tuple - For Argon2/bcrypt: salt_b64 will be empty string - Delegates to algorithm implementation via `get_algorithm(algorithm).hash(...)` **`verify_password(password, salt_b64, hash_b64, *, algorithm="pbkdf2", iterations=None) -> bool`** - Located in `salt.py` lines 33-45 - Returns `True` on match, `False` on mismatch or invalid input - Never raises exceptions (catches `binascii.Error`, `ValueError`) - Uses `hmac.compare_digest()` for timing-safe comparison (PBKDF2) - Delegates to algorithm implementation via `get_algorithm(algorithm).verify(...)` ## Naming Conventions and Style ### Python Style - **Target version:** Python 3.11+ (using `|` union types, structural pattern matching potential) - **Actual version in use:** Python 3.12.3 (per pytest output) - **Indentation:** 4 spaces (never tabs) - **Type hints:** Required for all public functions, use `from __future__ import annotations` for forward refs - **Function/variable names:** `lowercase_with_underscores` - **Class names:** `CapWords` (e.g., `PBKDF2Algorithm`, `Argon2Algorithm`) - **Private/internal functions:** Prefix with `_` (e.g., `_build_parser()`, `_normalize_args()`) ### Naming Patterns Observed **Module names:** `_algorithm.py` (e.g., `pbkdf2_algorithm.py`) **Class names:** `Algorithm` (e.g., `PBKDF2Algorithm`, `BcryptAlgorithm`) **Test files:** `test_.py` (e.g., `test_hashing.py`, `test_algorithms.py`) **Test functions:** `test__()` with docstrings explaining intent ### Constants ```python DEFAULT_SALT_BYTES = 16 DEFAULT_ITERATIONS = int(os.environ.get("PBKDF2_ITERATIONS", "200000")) ``` ### Import Order 1. `from __future__ import annotations` (always first if present) 2. Standard library imports 3. Third-party imports (argon2, bcrypt) 4. Local imports (at bottom for registration side-effects) ## Testing Guidelines ### Test Structure **Coverage goal:** >90% for hashing utilities **Test organization:** - `test_hashing.py`: Core `hash_password()` and `verify_password()` functions - `test_cli.py`: CLI behavior for `salt.py` (including shortcut syntax) - `test_algorithms.py`: Individual algorithm implementations ### Testing Patterns **Round-trip tests (most important):** ```python def test__algorithm_hash_round_trip(): """Verify algorithm can hash and verify passwords.""" algo = get_algorithm("") salt, hashed = algo.hash("test password") assert algo.verify("test password", salt, hashed) assert not algo.verify("wrong password", salt, hashed) ``` **Property tests:** - Base64 validation for outputs (see `test_hash_password_returns_base64`) - Invalid input handling (see `test_verify_password_handles_invalid_base64`) - Algorithm-specific behavior (see `test_pbkdf2_algorithm_respects_iterations`) **CLI tests:** ```python def test_main__(): """Test description.""" assert main([]) == ``` Exit codes: 0 = success, 1 = verification failure ### Test Execution Notes - 20 tests total (as of current state) - Test execution time: ~5 seconds (mostly from Argon2/bcrypt hashing) - All tests must pass before committing - Use `pytest -v` for verbose output - Use `pytest -k ` to run subset ### Adding Tests for New Algorithms When adding a new algorithm, add these tests to `tests/test_algorithms.py`: ```python def test__algorithm_hash_round_trip(): """Verify algorithm can hash and verify passwords.""" algo = get_algorithm("") salt, hashed = algo.hash("test password") assert algo.verify("test password", salt, hashed) assert not algo.verify("wrong password", salt, hashed) def test__algorithm_identifier(): """Verify algorithm has correct identifier.""" algo = get_algorithm("") assert algo.identifier == "" ``` And to `tests/test_cli.py`: ```python def test_main_hash_with_algorithm_(): """Test hash command with --algorithm .""" assert main(["hash", "--algorithm", "", "test-password"]) == 0 ``` ## Important Gotchas and Non-Obvious Patterns ### 1. Algorithm Registration Order Matters The imports at the bottom of `algorithms.py` trigger registration. If you import `algorithms` module before the algorithm modules are imported, the registry will be empty. This is handled correctly in the current code, but be careful when refactoring. ### 2. CLI Output to Stdout (Tests Must Capture) The CLI functions print to stdout. Tests use `main([...])` to check exit codes but don't capture output. If you need to test output content, use `capsys` fixture: ```python def test_output_format(capsys): main(["hash", "test"]) captured = capsys.readouterr() assert "Salt:" in captured.out assert "Hash:" in captured.out ``` ### 3. Empty Salt for Argon2/bcrypt When working with Argon2 or bcrypt hashes: - The first element of the tuple is always `""` (empty string) - The second element contains the full hash with embedded salt - When verifying, pass `""` for salt or any string (it's ignored) - Tests should handle this: `salt, hashed = algo.hash("pw")` where `salt == ""` ### 4. Environment Variable for Iterations `PBKDF2_ITERATIONS` environment variable affects default behavior. Tests should be aware this can change behavior: ```python DEFAULT_ITERATIONS = int(os.environ.get("PBKDF2_ITERATIONS", "200000")) ``` ### 5. German UI Text CLI output and help text are in German: - `"✓ Passwort korrekt"` / `"✗ Passwort falsch"` - Help text uses German descriptions - Comments in code are mixed German/English - When adding CLI features, maintain German for user-facing text ### 6. Exception Handling in Verify `verify_password()` catches exceptions and returns `False` rather than raising: ```python try: salt = base64.b64decode(salt_b64, validate=True) stored_hash = base64.b64decode(hash_b64, validate=True) except (binascii.Error, ValueError): return False ``` This is intentional for security (avoid leaking information via exception types). ### 7. Dynamic Algorithm List The CLI parser uses `list_algorithms()` to populate choices dynamically: ```python hash_parser.add_argument( "--algorithm", "-a", choices=list_algorithms(), # Dynamic based on registered algorithms default="pbkdf2", help="Hash-Algorithmus (Standard: pbkdf2)", ) ``` This means adding a new algorithm automatically adds it to CLI choices. ### 8. salt2.py Missing Algorithm Support **Current limitation:** `salt2.py` does NOT yet support the `--algorithm` flag. It only uses the default (PBKDF2). To add support: 1. Update `_build_parser()` to add `--algorithm` argument to both subparsers 2. Update `_command_generate()` to pass `algorithm=args.algorithm` 3. Update `_command_verify()` to pass `algorithm=args.algorithm` 4. Add tests to `tests/test_cli.py` or create `tests/test_cli2.py` ### 9. Relative Imports vs Absolute Imports The code uses **absolute imports** (not relative): ```python from algorithms import get_algorithm # Not: from .algorithms import get_algorithm ``` This works because modules are in the root directory, not a package. Keep this pattern consistent. ## Security Practices ### Cryptographic Randomness - Always use `os.urandom()` for salt generation (never `random` module) - Never use predictable values or timestamps ### Timing Attack Protection - Use `hmac.compare_digest()` for hash comparison (PBKDF2) - Argon2 and bcrypt libraries handle timing safety internally ### Input Validation - Use `base64.b64decode(data, validate=True)` to validate base64 format - Catch and handle `binascii.Error` and `ValueError` - Return `False` rather than raising exceptions in verify functions ### Iteration Count - Default 200,000 for PBKDF2 (OWASP recommended as of 2023) - Configurable via `PBKDF2_ITERATIONS` environment variable - Always parameterize in functions (don't hardcode) ### No Logging of Secrets - Never log passwords, salts, or hashes - Print statements only in CLI code for user output - Tests don't need to capture sensitive values ## Configuration and Environment ### Environment Variables **`PBKDF2_ITERATIONS`** - Default: `"200000"` (as string, converted to int) - Used by: `pbkdf2_algorithm.py` and `salt.py` - Example: `export PBKDF2_ITERATIONS=300000` ### Dependencies (requirements.txt) ``` pytest>=7.4 # Testing framework argon2-cffi>=23.1.0 # Argon2id implementation bcrypt>=4.1.0 # bcrypt implementation ``` **Standard library only:** `hashlib`, `hmac`, `os`, `base64`, `binascii`, `argparse`, `sys` ### Python Version Requirements - **Minimum:** Python 3.11 (for `|` union type syntax) - **Tested on:** Python 3.12.3 - **Type hints:** Using modern syntax (`int | None`, not `Optional[int]`) ## Commit Guidelines ### Commit Message Format Use conventional commit format with imperative mood: ``` : ``` **Types in use:** - `feat:` - New features (e.g., "feat: add argon2 algorithm") - `fix:` - Bug fixes - `test:` - Test additions/changes - `refactor:` - Code restructuring without behavior change - `docs:` - Documentation updates ### Examples from Git History ``` docs: add README.md for repository docs: translate CLAUDE.md to German first commit ``` ### Before Committing 1. Run full test suite: `python3 -m pytest -v` 2. Ensure all tests pass (20/20) 3. Verify no regressions in existing functionality 4. Check that new code follows style conventions 5. Add tests for new functionality ### Pull Request Guidelines - Keep PRs focused on single feature/fix - Include motivation in description - Reference issues with `Fixes #ID` - For security changes: include CLI output or reproduction - All tests must pass - Maintain >90% coverage ## Planned Enhancements and Known Gaps Based on existing documentation and code: ### Missing Features (mentioned in plans but not implemented) 1. **`list-algorithms` command** - Mentioned in plan (Task 9) but not in current code 2. **salt2.py algorithm support** - `salt2.py` doesn't support `--algorithm` flag yet 3. **Integration tests** - `tests/test_integration.py` mentioned in plan but not created 4. **Backward compatibility test** - Specific test mentioned in Task 12 plan ### Documentation Gaps 1. README.md is in German but doesn't mention multi-algorithm support yet 2. CLAUDE.md is German translation but may be outdated vs English version 3. No API docs for algorithm developers ### Potential Improvements 1. Add `list-algorithms` subcommand to CLIs 2. Complete algorithm support in `salt2.py` 3. Add integration tests with subprocess 4. Document algorithm addition process more clearly 5. Consider packaging as proper Python package (`salt/` directory structure) ## Working with Plans and Documentation ### Plan Execution Pattern The repository includes a detailed plan at `docs/plans/2025-11-13-multi-algorithm-support.md`: - 12 tasks with step-by-step TDD approach - Each task has verification steps - Plan includes commit message examples - Some tasks completed, others not yet **If implementing from plans:** 1. Read the full task before starting 2. Follow TDD: failing test first, then implementation 3. Run tests after each step 4. Use suggested commit messages 5. Mark off checklist items as you complete them ### Multiple Documentation Files - **AGENTS.md**: This file - agent/developer guidelines - **CLAUDE.md**: German translation, more detailed code examples - **README.md**: User-facing documentation in German - **docs/plans/*.md**: Implementation plans with TDD steps Keep these in sync when making significant changes. ## Quick Reference ### File a Bug or Add a Feature 1. Create test demonstrating issue/feature 2. Implement fix/feature 3. Run `python3 -m pytest -v` 4. Commit with conventional message 5. Update docs if needed ### Debug a Test Failure ```bash # Run specific test with verbose output python3 -m pytest tests/test_file.py::test_name -vv # Run with pdb on failure python3 -m pytest --pdb tests/test_file.py::test_name # See print statements python3 -m pytest -s tests/test_file.py ``` ### Verify All Systems ```bash # Comprehensive check python3 -m pytest -v && \ python3 salt.py "test" && \ python3 salt.py hash --algorithm argon2 "test" && \ python3 salt.py hash --algorithm bcrypt "test" && \ python3 salt2.py generate "test" && \ echo "All systems operational" ``` --- **Document Version:** 2025-11-13 **Last Updated:** Initial comprehensive analysis based on repository state **Test Count:** 20 tests (all passing) **Python Version:** 3.12.3