429 lines
15 KiB
Markdown
429 lines
15 KiB
Markdown
# Repository Guidelines for AI Agents
|
|
|
|
## Project Overview
|
|
|
|
This is a Python password hashing utility supporting multiple cryptographic algorithms (PBK-DF2, Argon2, bcrypt). The project provides both library functions and two CLI implementations with different UX patterns.
|
|
|
|
**Key Architecture:**
|
|
- **Algorithm Registry Pattern**: Pluggable hash algorithms implementing a common `Algorithm` protocol
|
|
- **Auto-registration**: Algorithms register themselves on module import
|
|
- **Two CLIs**: `salt.py` (shortcut syntax) and `salt2.py` (explicit commands)
|
|
- **Backward compatible**: Default is PBKDF2 for compatibility with legacy code
|
|
|
|
**Module Structure:**
|
|
```
|
|
salt.py # Core library + CLI with shortcut syntax
|
|
salt2.py # Alternative CLI with explicit subcommands
|
|
algorithms.py # Algorithm registry and Protocol definition
|
|
pbkdf2_algorithm.py # PBKDF2-HMAC-SHA256 implementation
|
|
argon2_algorithm.py # Argon2id implementation
|
|
bcrypt_algorithm.py # bcrypt implementation
|
|
tests/
|
|
test_hashing.py # Tests for core hash/verify functions
|
|
test_cli.py # Tests for salt.py CLI behavior
|
|
test_cli2.py # Tests for salt2.py CLI behavior
|
|
test_algorithms.py # Tests for algorithm implementations
|
|
test_integration.py # End-to-end tests for CLI tools
|
|
```
|
|
|
|
## Essential Commands
|
|
|
|
### Running the Tools
|
|
|
|
```bash
|
|
# Hash a password with default algorithm (PBKDF2)
|
|
python3 salt.py "MyPassword" # Shortcut syntax
|
|
python3 salt.py hash "MyPassword" # Explicit command
|
|
python3 salt2.py generate "MyPassword" # Alternative CLI
|
|
|
|
# Hash with specific algorithm
|
|
python3 salt.py hash --algorithm argon2 "MyPassword"
|
|
python3 salt.py --algorithm bcrypt "MyPassword" # Shortcut with algorithm
|
|
python3 salt2.py generate --algorithm argon2 "MyPassword"
|
|
|
|
# Verify a password
|
|
python3 salt.py verify "MyPassword" <salt_b64> <hash_b64>
|
|
python3 salt.py verify --algorithm argon2 "MyPassword" "" <hash_b64>
|
|
python3 salt2.py verify --algorithm bcrypt "MyPassword" "" <hash_b64>
|
|
|
|
# List available algorithms
|
|
python3 salt.py list-algorithms
|
|
python3 salt2.py list-algorithms
|
|
```
|
|
|
|
### Testing
|
|
|
|
```bash
|
|
# Run all tests (28 tests total as of now)
|
|
python3 -m pytest
|
|
|
|
# Verbose output
|
|
python3 -m pytest -v
|
|
|
|
# Run specific test files
|
|
python3 -m pytest tests/test_hashing.py
|
|
python3 -m pytest tests/test_cli.py
|
|
python3 -m pytest tests/test_algorithms.py
|
|
python3 -m pytest tests/test_integration.py
|
|
|
|
# Run specific test patterns
|
|
python3 -m pytest -k verify
|
|
python3 -m pytest -k algorithm
|
|
|
|
# Run with verbose failure details
|
|
python3 -m pytest --maxfail=1 -vv
|
|
|
|
# Run tests with coverage (if pytest-cov installed)
|
|
python3 -m pytest --cov=. --cov-report=term-missing
|
|
```
|
|
|
|
### Development Workflow
|
|
|
|
```bash
|
|
# Install dependencies
|
|
pip install -r requirements.txt
|
|
|
|
# Quick smoke test of CLI
|
|
python3 salt.py "test" && echo "salt.py works"
|
|
python3 salt2.py generate "test" && echo "salt2.py works"
|
|
|
|
# Test algorithm round-trip
|
|
salt_hash=$(python3 salt.py "secret" | tail -1 | cut -d' ' -f2)
|
|
# Extract salt and hash from output for verify command
|
|
```
|
|
|
|
## Code Organization and Patterns
|
|
|
|
### Algorithm Registry Pattern
|
|
|
|
**How it works:**
|
|
1. `algorithms.py` defines `Algorithm` Protocol with `hash()`, `verify()`, and `identifier` attribute
|
|
2. Each algorithm module creates a class implementing the protocol
|
|
3. At module bottom, algorithm imports `register_algorithm` and registers itself
|
|
4. `algorithms.py` imports all algorithm modules at bottom to trigger registration
|
|
5. Core functions use `get_algorithm(name)` to retrieve registered implementations
|
|
|
|
**Adding a new algorithm:**
|
|
1. Create `<name>_algorithm.py` implementing `Algorithm` protocol
|
|
2. Add `register_algorithm(<Name>Algorithm())` at bottom of file
|
|
3. Import module at end of `algorithms.py`: `import <name>_algorithm # noqa: E402, F401`
|
|
4. Add tests to `tests/test_algorithms.py`
|
|
5. Update CLI choices in `_build_parser()` if using explicit choices (currently uses dynamic list)
|
|
|
|
### Import Pattern for Registration
|
|
|
|
**Critical:** The algorithm modules use a specific import pattern to avoid circular dependencies:
|
|
|
|
```python
|
|
# At BOTTOM of algorithm module (e.g., pbkdf2_algorithm.py)
|
|
from algorithms import register_algorithm
|
|
|
|
# Auto-register when module is imported
|
|
register_algorithm(PBKDF2Algorithm())
|
|
```
|
|
|
|
**In algorithms.py:**
|
|
```python
|
|
# At BOTTOM of file
|
|
# Import to auto-register algorithms
|
|
import pbkdf2_algorithm # noqa: E402, F401
|
|
import argon2_algorithm # noqa: E402, F401
|
|
import bcrypt_algorithm # noqa: E402, F401
|
|
```
|
|
|
|
The `# noqa: E402, F401` comments suppress linter warnings about:
|
|
- E402: module level import not at top of file (intentional for registration)
|
|
- F401: imported but unused (side-effect import for registration)
|
|
|
|
### Salt Handling Across Algorithms
|
|
|
|
**IMPORTANT difference between algorithms:**
|
|
|
|
- **PBKDF2**: Generates separate 16-byte salt with `os.urandom()`, returns `(salt_b64, hash_b64)` where both are populated
|
|
- **Argon2**: Embeds salt in hash string (e.g., `$argon2id$v=19$m=65536,t=3,p=4$...`), returns `("", hash_b64)` with empty salt
|
|
- **bcrypt**: Embeds salt in hash string (e.g., `$2b$12$...`), returns `("", hash_b64)` with empty salt
|
|
|
|
When verifying Argon2/bcrypt passwords, the `salt_b64` parameter is ignored since the salt is embedded in `hash_b64`.
|
|
|
|
### CLI Argument Normalization (salt.py only)
|
|
|
|
**`salt.py` has special shortcut syntax via `_normalize_args()`:**
|
|
|
|
```python
|
|
# These are equivalent:
|
|
python3 salt.py "mypassword"
|
|
python3 salt.py hash "mypassword"
|
|
|
|
# Algorithm shortcut also works:
|
|
python3 salt.py --algorithm argon2 "mypassword"
|
|
# Becomes: python3 salt.py hash --algorithm argon2 "mypassword"
|
|
```
|
|
|
|
**Normalization rules:**
|
|
1. If first arg is `hash`, `verify`, or `list-algorithms` → pass through unchanged
|
|
2. If first arg is `-h` or `--help` → pass through unchanged
|
|
3. If first arg is `--algorithm` or `-a` → prepend `hash`
|
|
4. Otherwise (plain password) → prepend `hash`
|
|
|
|
**Tested in:** `tests/test_cli.py::test_main_supports_hash_shortcut()` and `test_main_hash_algorithm_shortcut()`
|
|
|
|
### Core Functions
|
|
|
|
**`hash_password(password, *, algorithm="pbkdf2", iterations=None, salt_bytes=16) -> tuple[str, str]`**
|
|
- Located in `salt.py` lines 19-30
|
|
- Returns `(salt_b64, hash_b64)` tuple
|
|
- For Argon2/bcrypt: salt_b64 will be empty string
|
|
- Delegates to algorithm implementation via `get_algorithm(algorithm).hash(...)`
|
|
|
|
**`verify_password(password, salt_b64, hash_b64, *, algorithm="pbkdf2", iterations=None) -> bool`**
|
|
- Located in `salt.py` lines 33-45
|
|
- Returns `True` on match, `False` on mismatch or invalid input
|
|
- Never raises exceptions (catches `binascii.Error`, `ValueError`)
|
|
- Uses `hmac.compare_digest()` for timing-safe comparison (PBKDF2)
|
|
- Delegates to algorithm implementation via `get_algorithm(algorithm).verify(...)`
|
|
|
|
## Naming Conventions and Style
|
|
|
|
### Python Style
|
|
- **Target version:** Python 3.11+ (using `|` union types, structural pattern matching potential)
|
|
- **Actual version in use:** Python 3.12.3 (per pytest output)
|
|
- **Indentation:** 4 spaces (never tabs)
|
|
- **Type hints:** Required for all public functions, use `from __future__ import annotations` for forward refs
|
|
- **Function/variable names:** `lowercase_with_underscores`
|
|
- **Class names:** `CapWords` (e.g., `PBKDF2Algorithm`, `Argon2Algorithm`)
|
|
- **Private/internal functions:** Prefix with `_` (e.g., `_build_parser()`, `_normalize_args()`)
|
|
|
|
### Naming Patterns Observed
|
|
|
|
**Module names:** `<algorithm>_algorithm.py` (e.g., `pbkdf2_algorithm.py`)
|
|
**Class names:** `<Algorithm>Algorithm` (e.g., `PBKDF2Algorithm`, `BcryptAlgorithm`)
|
|
**Test files:** `test_<feature>.py` (e.g., `test_hashing.py`, `test_algorithms.py`)
|
|
**Test functions:** `test_<behavior>_<context>()` with docstrings explaining intent
|
|
|
|
### Constants
|
|
```python
|
|
DEFAULT_SALT_BYTES = 16
|
|
DEFAULT_ITERATIONS = int(os.environ.get("PBKDF2_ITERATIONS", "200000"))
|
|
```
|
|
|
|
### Import Order
|
|
1. `from __future__ import annotations` (always first if present)
|
|
2. Standard library imports
|
|
3. Third-party imports (argon2, bcrypt)
|
|
4. Local imports (at bottom for registration side-effects)
|
|
|
|
## Testing Guidelines
|
|
|
|
### Test Structure
|
|
|
|
**Coverage goal:** >90% for hashing utilities
|
|
|
|
**Test organization:**
|
|
- `test_hashing.py`: Core `hash_password()` and `verify_password()` functions
|
|
- `test_cli.py`: CLI behavior for `salt.py` (including shortcut syntax)
|
|
- `test_cli2.py`: CLI behavior for `salt2.py`
|
|
- `test_algorithms.py`: Individual algorithm implementations
|
|
- `test_integration.py`: End-to-end integration tests for CLI commands
|
|
|
|
### Testing Patterns
|
|
|
|
**Round-trip tests (most important):**
|
|
```python
|
|
def test_<algo>_algorithm_hash_round_trip():
|
|
"""Verify <algo> algorithm can hash and verify passwords."""
|
|
algo = get_algorithm("<algo>")
|
|
salt, hashed = algo.hash("test password")
|
|
assert algo.verify("test password", salt, hashed)
|
|
assert not algo.verify("wrong password", salt, hashed)
|
|
```
|
|
|
|
**Property tests:**
|
|
- Base64 validation for outputs (see `test_hash_password_returns_base64`)
|
|
- Invalid input handling (see `test_verify_password_handles_invalid_base64`)
|
|
- Algorithm-specific behavior (see `test_pbkdf2_algorithm_respects_iterations`)
|
|
|
|
**CLI tests:**
|
|
```python
|
|
def test_main_<command>_<scenario>():
|
|
"""Test description."""
|
|
assert main([<args>]) == <expected_exit_code>
|
|
```
|
|
|
|
Exit codes: 0 = success, 1 = verification failure
|
|
|
|
### Test Execution Notes
|
|
|
|
- 28 tests total (as of current state)
|
|
- Test execution time: ~7 seconds (mostly from Argon2/bcrypt hashing)
|
|
- All tests must pass before committing
|
|
- Use `pytest -v` for verbose output
|
|
- Use `pytest -k <pattern>` to run subset
|
|
|
|
### Adding Tests for New Algorithms
|
|
|
|
When adding a new algorithm, add these tests to `tests/test_algorithms.py`:
|
|
|
|
```python
|
|
def test_<algo>_algorithm_hash_round_trip():
|
|
"""Verify <algo> algorithm can hash and verify passwords."""
|
|
algo = get_algorithm("<algo>")
|
|
salt, hashed = algo.hash("test password")
|
|
assert algo.verify("test password", salt, hashed)
|
|
assert not algo.verify("wrong password", salt, hashed)
|
|
|
|
def test_<algo>_algorithm_identifier():
|
|
"""Verify <algo> algorithm has correct identifier."""
|
|
algo = get_algorithm("<algo>")
|
|
assert algo.identifier == "<algo>"
|
|
```
|
|
|
|
And to `tests/test_cli.py` or `tests/test_cli2.py`:
|
|
```python
|
|
def test_main_hash_with_algorithm_<algo>():
|
|
"""Test hash command with --algorithm <algo>."""
|
|
assert main(["hash", "--algorithm", "<algo>", "test-password"]) == 0
|
|
```
|
|
|
|
## Important Gotchas and Non-Obvious Patterns
|
|
|
|
### 1. Algorithm Registration Order Matters
|
|
|
|
The imports at the bottom of `algorithms.py` trigger registration. If you import `algorithms` module before the algorithm modules are imported, the registry will be empty. This is handled correctly in the current code, but be careful when refactoring.
|
|
|
|
### 2. CLI Output to Stdout (Tests Must Capture)
|
|
|
|
The CLI functions print to stdout. Tests use `main([...])` to check exit codes but don't capture output. If you need to test output content, use `capsys` fixture. Subprocess-based integration tests also capture stdout.
|
|
|
|
```python
|
|
def test_output_format(capsys):
|
|
main(["hash", "test"])
|
|
captured = capsys.readouterr()
|
|
assert "Salt:" in captured.out
|
|
assert "Hash:" in captured.out
|
|
```
|
|
|
|
### 3. Empty Salt for Argon2/bcrypt
|
|
|
|
When working with Argon2 or bcrypt hashes:
|
|
- The first element of the tuple is always `""` (empty string)
|
|
- The second element contains the full hash with embedded salt
|
|
- When verifying, pass `""` for salt or any string (it's ignored)
|
|
- Tests should handle this: `salt, hashed = algo.hash("pw")` where `salt == ""`
|
|
|
|
### 4. Environment Variable for Iterations
|
|
|
|
`PBKDF2_ITERATIONS` environment variable affects default behavior. Tests should be aware this can change behavior:
|
|
```python
|
|
DEFAULT_ITERATIONS = int(os.environ.get("PBKDF2_ITERATIONS", "200000"))
|
|
```
|
|
|
|
### 5. German UI Text
|
|
|
|
CLI output and help text are in German:
|
|
- `"✓ Passwort korrekt"` / `"✗ Passwort falsch"`
|
|
- Help text uses German descriptions
|
|
- When adding CLI features, maintain German for user-facing text
|
|
|
|
### 6. Exception Handling in Verify
|
|
|
|
`verify_password()` and algorithm implementations catch exceptions and return `False` rather than raising. This is intentional for security to avoid leaking information via exception types.
|
|
|
|
### 7. Dynamic Algorithm List
|
|
|
|
The CLI parser uses `list_algorithms()` to populate choices dynamically:
|
|
```python
|
|
hash_parser.add_argument(
|
|
"--algorithm",
|
|
"-a",
|
|
choices=list_algorithms(), # Dynamic based on registered algorithms
|
|
default="pbkdf2",
|
|
help="Hash-Algorithmus (Standard: pbkdf2)",
|
|
)
|
|
```
|
|
|
|
This means adding a new algorithm automatically adds it to CLI choices.
|
|
|
|
### 8. Relative Imports vs Absolute Imports
|
|
|
|
The code uses **absolute imports** (not relative):
|
|
```python
|
|
from algorithms import get_algorithm # Not: from .algorithms import get_algorithm
|
|
```
|
|
|
|
This works because modules are in the root directory, not a package. Keep this pattern consistent.
|
|
|
|
## Security Practices
|
|
|
|
### Cryptographic Randomness
|
|
- Always use `os.urandom()` for salt generation (never `random` module)
|
|
|
|
### Timing Attack Protection
|
|
- Use `hmac.compare_digest()` for hash comparison (PBKDF2)
|
|
- Argon2 and bcrypt libraries handle timing safety internally
|
|
|
|
### Input Validation
|
|
- Use `base64.b64decode(data, validate=True)` to validate base64 format
|
|
- Catch and handle `binascii.Error` and `ValueError`
|
|
- Return `False` rather than raising exceptions in verify functions
|
|
|
|
### Iteration Count
|
|
- Default 200,000 for PBKDF2 (OWASP recommended as of 2023)
|
|
- Configurable via `PBKDF2_ITERATIONS` environment variable
|
|
|
|
### No Logging of Secrets
|
|
- Never log passwords, salts, or hashes
|
|
- Print statements only in CLI code for user output
|
|
|
|
## Configuration and Environment
|
|
|
|
### Environment Variables
|
|
|
|
**`PBKDF2_ITERATIONS`**
|
|
- Default: `"200000"` (as string, converted to int)
|
|
- Used by: `pbkdf2_algorithm.py`
|
|
- Example: `export PBKDF2_ITERATIONS=300000`
|
|
|
|
### Dependencies (requirements.txt)
|
|
|
|
```
|
|
pytest>=7.4 # Testing framework
|
|
argon2-cffi>=23.1.0 # Argon2id implementation
|
|
bcrypt>=4.1.0 # bcrypt implementation
|
|
```
|
|
|
|
### Python Version Requirements
|
|
|
|
- **Minimum:** Python 3.11 (for `|` union type syntax)
|
|
- **Tested on:** Python 3.12.3
|
|
|
|
## Commit Guidelines
|
|
|
|
### Commit Message Format
|
|
|
|
Use conventional commit format with imperative mood:
|
|
```
|
|
<type>: <short description>
|
|
```
|
|
**Types in use:** `feat`, `fix`, `test`, `refactor`, `docs`.
|
|
|
|
### Before Committing
|
|
|
|
1. Run full test suite: `python3 -m pytest -v`
|
|
2. Ensure all tests pass (28/28)
|
|
3. Verify no regressions
|
|
4. Add tests for new functionality
|
|
|
|
## Working with Plans and Documentation
|
|
|
|
### Plan Execution Pattern
|
|
|
|
The repository includes a detailed plan at `docs/plans/2025-11-13-multi-algorithm-support.md`. When implementing from plans, follow the TDD approach outlined in the tasks.
|
|
|
|
### Multiple Documentation Files
|
|
- **AGENTS.md**: This file - agent/developer guidelines
|
|
- **CLAUDE.md**: German translation, more detailed code examples
|
|
- **README.md**: User-facing documentation in German
|
|
- **docs/plans/*.md**: Implementation plans with TDD steps
|
|
|
|
Keep these in sync when making significant changes.
|