# Scratchpad

## Goal

- Capture a compact, project-wide list of optimization candidates after a broad scan of the current FFX codebase, tooling, and requirements.

## Settled

- The biggest near-term wins are in startup cost, repeated subprocess work, repeated database query patterns, and general repo hygiene.
- This list is intentionally optimization-oriented rather than bug-oriented. Some items below also improve correctness or maintainability, but they were selected because they can reduce runtime cost, operator friction, or iteration overhead.

## Focused Snapshot

- Highest-leverage application optimizations:
  - Lazy-load CLI command dependencies so lightweight commands do not import most of the app.
  - Collapse repeated `ffprobe` calls into a single probe result per source file.
  - Replace `query.count()` plus `first()` patterns with single-query ORM accessors.
  - Cache or precompile filename pattern regexes instead of scanning every pattern for every file.
  - Guard logger handler installation to avoid duplicated handlers and noisy repeated setup.
- Highest-leverage repo and workflow optimizations:
  - Stop tracking nested `__pycache__` output and other generated artifacts.
  - Consolidate setup and upgrade tooling to reduce overlapping shell-script responsibilities.
  - Trim or reorganize the oversized test/combinator surface so it is easier to run, debug, and extend.

## Optimization Candidates

1. CLI startup and import cost
   - [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py) imports a large portion of the application at module import time, even for cheap commands such as `version`, `help`, `setup_dependencies`, and `upgrade`.
   - Optimization:
     - Move heavy imports into the commands that actually need them.
     - Keep the CLI root importable with only core stdlib and Click dependencies.
   - Expected value:
     - Faster startup for scripting and tooling commands.
     - Less coupling between maintenance commands and the runtime stack.

2. Repeated database queries via `count()` plus `first()`
   - Controllers such as [`src/ffx/show_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/show_controller.py), [`src/ffx/pattern_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_controller.py), and [`src/ffx/database.py`](/home/osgw/.local/src/codex/ffx/src/ffx/database.py) often do `q.count()` and then `q.first()`.
   - Optimization:
     - Replace with `first()`, `one_or_none()`, or existence checks that do not issue two queries.
     - Standardize this across all controllers.
   - Expected value:
     - Lower SQLite query volume.
     - Simpler controller code.

3. Filename pattern matching scales linearly across all patterns
   - [`src/ffx/pattern_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_controller.py) loads every pattern and runs `re.search` against each filename on every lookup.
   - Optimization:
     - Cache compiled regexes in process memory.
     - Stop after the first intentional match instead of silently returning the last match.
     - Consider explicit pattern priority if overlapping rules are valid.
   - Expected value:
     - Faster per-file setup when many patterns exist.
     - More predictable matching behavior.

4. Media probing does two separate `ffprobe` subprocesses per file
   - [`src/ffx/file_properties.py`](/home/osgw/.local/src/codex/ffx/src/ffx/file_properties.py) calls `ffprobe` once for format data and once for stream data.
   - Optimization:
     - Use one probe call that requests both format and streams.
     - Cache that result inside `FileProperties`.
   - Expected value:
     - Less subprocess overhead.
     - Faster inspect and convert flows.

5. Crop detection is always a full extra ffmpeg scan
   - [`src/ffx/file_properties.py`](/home/osgw/.local/src/codex/ffx/src/ffx/file_properties.py) runs a dedicated `ffmpeg -vf cropdetect` pass for each file when crop detection is requested.
   - Optimization:
     - Cache crop results for repeated runs on the same source.
     - Consider exposing shorter sampling windows or probe presets for large files.
   - Expected value:
     - Lower latency on repeated experimentation.

6. Process wrapper lacks stronger execution controls
   - [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py) uses `Popen(...).communicate()` without timeout handling, structured error mapping, or direct missing-command handling.
   - Optimization:
     - Add timeout support and clearer `FileNotFoundError` handling.
     - Consider `subprocess.run(..., check=False, text=True)` where streaming is not required.
     - Centralize return/error formatting.
   - Expected value:
     - Better failure diagnosis.
     - Cleaner process management semantics.

7. Logger handlers can be added repeatedly
   - [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py) adds file and console handlers each invocation.
   - Several helper classes install `NullHandler` instances ad hoc, for example [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py), [`src/ffx/tmdb_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/tmdb_controller.py), [`src/ffx/media_descriptor.py`](/home/osgw/.local/src/codex/ffx/src/ffx/media_descriptor.py), and [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py).
   - Optimization:
     - Guard handler installation so each logger is configured once.
     - Prefer module-level logger setup patterns over per-instance handler mutation.
   - Expected value:
     - Less duplicate logging.
     - Lower confusion in long-running or repeatedly invoked contexts.

8. Repo-local hygiene for generated Python artifacts
   - The repo currently contains nested compiled artifacts under `src/ffx/__pycache__/...`.
   - `.gitignore` only ignores `__pycache__` at the repo root, not recursive `__pycache__/`.
   - Optimization:
     - Ignore `__pycache__/` recursively and clean tracked generated files.
     - Consider ignoring local virtualenv or other generated tool directories if they may appear in-repo later.
   - Expected value:
     - Cleaner diffs and scans.
     - Lower repo noise.

9. Tooling overlap and naming drift
   - There are now multiple prep-related scripts: [`tools/prepare.sh`](/home/osgw/.local/src/codex/ffx/tools/prepare.sh), [`tools/setup.sh`](/home/osgw/.local/src/codex/ffx/tools/setup.sh), and the legacy-like [`tools/ffx_update.sh`](/home/osgw/.local/src/codex/ffx/tools/ffx_update.sh).
   - Optimization:
     - Decide which scripts remain canonical.
     - Replace or remove legacy wrappers once equivalent CLI commands exist.
     - Keep CLI maintenance commands and shell wrappers aligned.
   - Expected value:
     - Less operator confusion.
     - Fewer duplicated procedures to maintain.

10. Placeholder UI surfaces should either ship or disappear
    - [`src/ffx/help_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/help_screen.py) and [`src/ffx/settings_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/settings_screen.py) are placeholders.
    - Optimization:
      - Either remove them from the active UI surface or complete them.
      - Avoid paying ongoing maintenance cost for unfinished navigation targets.
    - Expected value:
      - Leaner interface.
      - Lower UX ambiguity.

11. Large Textual screens repeat configuration and controller loading
    - Screens such as [`src/ffx/media_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/media_details_screen.py), [`src/ffx/pattern_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_details_screen.py), and [`src/ffx/show_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/show_details_screen.py) repeat setup patterns and local metadata filtering extraction.
    - Optimization:
      - Extract a shared screen base or helper for common config/controller/bootstrap logic.
      - Reduce repeated table refresh and repeated DB fetch code where possible.
    - Expected value:
      - Lower maintenance overhead.
      - Easier UI iteration.

12. Several helper functions are unfinished or dead-weight
    - [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py) contains `permutateList(...): pass`.
    - There are many combinator and conversion placeholders across tests and migrations.
    - Optimization:
      - Remove dead code, finish it, or isolate it behind a clearly dormant area.
      - Avoid carrying stubbed utility surface that looks reusable but is not.
    - Expected value:
      - Smaller mental model.
      - Less time spent re-evaluating inactive paths.

13. Test suite shape is expensive to understand and likely expensive to run
    - The project has a large matrix of combinator files under [`src/ffx/test`](/home/osgw/.local/src/codex/ffx/src/ffx/test), several placeholder `pass` implementations, and at least one suspicious filename with an embedded space: [`src/ffx/test/disposition_combinator_2_3 .py`](/home/osgw/.local/src/codex/ffx/src/ffx/test/disposition_combinator_2_3 .py).
    - Optimization:
      - Consolidate combinator families.
      - Add a lighter smoke-test path.
      - Normalize file naming and test discovery conventions.
    - Expected value:
      - Faster contributor onboarding.
      - Easier CI adoption later.

14. Process resource limiting semantics could be clearer
    - [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py) prepends `nice` and `cpulimit` directly when values are set.
    - Optimization:
      - Validate and document effective behavior for combined `nice` + `cpulimit`.
      - Consider explicit no-limit vs configured-limit states in the CLI and requirements.
    - Expected value:
      - Fewer surprises in production-like runs.
      - Easier support for user-reported performance behavior.

15. Import-time dependency coupling makes maintenance commands brittle
    - Even after recent CLI maintenance additions, the top-level CLI module still imports most application modules before Click dispatch.
    - Optimization:
      - Push imports for ORM, Textual, TMDB, ffmpeg helpers, and descriptors behind the commands that actually need them.
    - Expected value:
      - Maintenance commands such as setup and upgrade stay usable when optional runtime dependencies are broken.
      - Better separation between media runtime code and maintenance tooling.

16. Regex and string utility cleanup
    - [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py) still emits a `SyntaxWarning` for `RICH_COLOR_PATTERN`.
    - Optimization:
      - Convert regex literals to raw strings where appropriate.
      - Review filename and TMDB substitution helpers for repeated string churn.
    - Expected value:
      - Cleaner runtime output.
      - Less warning noise during dry-run maintenance commands.

17. Database startup always runs schema creation and version checks
    - [`src/ffx/database.py`](/home/osgw/.local/src/codex/ffx/src/ffx/database.py) runs `Base.metadata.create_all(...)` and version checks every time a DB-backed context is created.
    - Optimization:
      - Measure startup cost and consider separating bootstrapping from ordinary command execution.
      - Keep schema migration/version enforcement explicit.
    - Expected value:
      - Faster command startup.
      - Clearer operational boundaries.

## Open

- Should optimization work focus first on operator-perceived latency, internal maintainability, or correctness-risk cleanup that also has performance upside?
- Is the long-term supported model still “local Linux workstation plus Textual UI,” or should optimization decisions bias toward a more scriptable/headless CLI?

## Gaps Right Now

- No explicit prioritization owner or milestone for the optimization backlog.
- No benchmark or timing harness exists for startup, probe, DB, or conversion orchestration overhead.
- Repo hygiene is still mixed with generated artifacts and some clearly unfinished files.

## Next

1. Triage the list into quick wins, medium refactors, and long-horizon cleanup.
2. Tackle the cheapest high-impact items first:
   - recursive `__pycache__/` ignore and cleanup,
   - regex raw-string warning cleanup,
   - `count()` plus `first()` query cleanup,
   - single-call `ffprobe` refactor.
3. Decide whether maintenance/tooling command imports should be split from media-runtime imports before adding more CLI maintenance surface.

## Delete When

- Delete this scratchpad once the optimization backlog is either converted into issues/work items or distilled into durable project guidance.
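
Sketch for the single-call `ffprobe` refactor (candidate 4), kept here only until it moves into an issue. This is a rough illustration, not the current `FileProperties` code: `build_probe_command`, `parse_probe_output`, and `CachedProbe` are hypothetical names, and the injectable `run` argument is an assumption added so the caching can be exercised without a real media file. The `ffprobe` options used (`-v error`, `-print_format json`, `-show_format`, `-show_streams`) are standard ones.

```python
import json
import subprocess


def build_probe_command(path: str) -> list[str]:
    """Build one ffprobe invocation that returns format and stream data together."""
    return [
        "ffprobe",
        "-v", "error",            # only report errors on stderr
        "-print_format", "json",  # machine-readable output on stdout
        "-show_format",           # container-level metadata
        "-show_streams",          # per-stream metadata
        path,
    ]


def parse_probe_output(raw: str) -> tuple[dict, list[dict]]:
    """Split the combined JSON document into (format, streams)."""
    data = json.loads(raw)
    return data.get("format", {}), data.get("streams", [])


class CachedProbe:
    """Hypothetical holder that probes a source file at most once."""

    def __init__(self, path: str, run=subprocess.run):
        self._path = path
        self._run = run      # injectable for tests; defaults to subprocess.run
        self._result = None  # filled on first access

    def result(self) -> tuple[dict, list[dict]]:
        if self._result is None:
            completed = self._run(
                build_probe_command(self._path),
                capture_output=True,
                text=True,
                check=True,
            )
            self._result = parse_probe_output(completed.stdout)
        return self._result
```

With this shape, callers that need format data and callers that need stream data both read from `CachedProbe.result()`, so a file is probed by a single subprocess regardless of how many properties are requested.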