This commit is contained in:
Javanaut
2026-04-09 01:13:06 +02:00
parent 5871ae30ad
commit f9c8b8ac5e

@@ -1,62 +1,209 @@
<!--
# Scratchpad
Temporary information holder for the next iteration. Developers may create or
delete this file at any time. Anything durable should move into code, tests, or
canonical docs, then this file should disappear.
## Goal
Use this section for the current slice of work. It should explain what the
scratchpad is helping us move forward right now.
- Capture a compact, project-wide list of optimization candidates after a broad scan of the current FFX codebase, tooling, and requirements.
## Settled
Use this for decisions that are stable enough to guide the next steps, but are
still temporary enough to live in the scratchpad for now.
- The biggest near-term wins are in startup cost, repeated subprocess work, repeated database query patterns, and general repo hygiene.
- This list is intentionally optimization-oriented rather than bug-oriented. Some items below also improve correctness or maintainability, but they were selected because they can reduce runtime cost, operator friction, or iteration overhead.
## Focused Snapshot
Use an extra section like this only when one slice needs its own compact
summary. This is useful when a specific API, boundary, or model was recently
recreated and should be captured clearly.
- Highest-leverage application optimizations:
- Lazy-load CLI command dependencies so lightweight commands do not import most of the app.
- Collapse repeated `ffprobe` calls into a single probe result per source file.
- Replace `query.count()` plus `first()` patterns with single-query ORM accessors.
- Cache or precompile filename pattern regexes instead of scanning every pattern for every file.
- Guard logger handler installation to avoid duplicated handlers and noisy repeated setup.
- Highest-leverage repo and workflow optimizations:
- Stop tracking nested `__pycache__` output and other generated artifacts.
- Consolidate setup and upgrade tooling to reduce overlapping shell-script responsibilities.
- Trim or reorganize the oversized test/combinator surface so it is easier to run, debug, and extend.
## Optimization Candidates
1. CLI startup and import cost
- [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py) imports a large portion of the application at module import time, even for cheap commands such as `version`, `help`, `setup_dependencies`, and `upgrade`.
- Optimization:
- Move heavy imports into the commands that actually need them.
- Keep the CLI root importable with only core stdlib and Click dependencies.
- Expected value:
- Faster startup for scripting and tooling commands.
- Less coupling between maintenance commands and the runtime stack.
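A minimal, dependency-free sketch of the deferred-import idea in candidate 1. The command names and the dispatcher are hypothetical stand-ins; in the real CLI the function bodies would be Click command callbacks, and `sqlite3` here only plays the role of a heavy dependency.

```python
def version_command() -> str:
    """Cheap command: needs no heavy modules at all."""
    return "ffx (version lookup omitted in this sketch)"


def inspect_command(path: str) -> str:
    """Heavy command: defers its imports until it is actually invoked."""
    # Stand-in for the real heavy dependencies (ORM, Textual, TMDB helpers).
    import sqlite3  # imported only when this command runs

    return f"would inspect {path} using sqlite {sqlite3.sqlite_version}"


# Hypothetical dispatcher; Click would normally fill this role.
COMMANDS = {"version": version_command, "inspect": inspect_command}


def main(argv: list[str]) -> str:
    name, *args = argv
    return COMMANDS[name](*args)
```

With this shape, `main(["version"])` never pays the import cost of whatever `inspect_command` needs.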
2. Repeated database queries via `count()` plus `first()`
- Controllers such as [`src/ffx/show_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/show_controller.py), [`src/ffx/pattern_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_controller.py), and [`src/ffx/database.py`](/home/osgw/.local/src/codex/ffx/src/ffx/database.py) often do `q.count()` and then `q.first()`.
- Optimization:
- Replace with `first()`, `one_or_none()`, or existence checks that do not issue two queries.
- Standardize this across all controllers.
- Expected value:
- Lower SQLite query volume.
- Simpler controller code.
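The difference in candidate 2 can be sketched with the standard-library `sqlite3` module rather than the ORM. The `shows` table and both function names are assumptions for illustration only. The ORM equivalent would be to rely on `first()` or `one_or_none()` returning `None` when no row matches.

```python
import sqlite3


def find_show_two_queries(conn, name):
    # Pattern to avoid: one query to count, a second one to fetch.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM shows WHERE name = ?", (name,)
    ).fetchone()
    if count == 0:
        return None
    return conn.execute(
        "SELECT id, name FROM shows WHERE name = ? LIMIT 1", (name,)
    ).fetchone()


def find_show_single_query(conn, name):
    # Single query: fetchone() already returns None when nothing matches.
    return conn.execute(
        "SELECT id, name FROM shows WHERE name = ? LIMIT 1", (name,)
    ).fetchone()


# Hypothetical in-memory table, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO shows (name) VALUES ('Example Show')")
```

Both functions return the same result for hits and misses; the second issues half the statements.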
3. Filename pattern matching scales linearly across all patterns
- [`src/ffx/pattern_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_controller.py) loads every pattern and runs `re.search` against each filename on every lookup.
- Optimization:
- Cache compiled regexes in process memory.
- Stop after the first intentional match instead of silently returning the last match.
- Consider explicit pattern priority if overlapping rules are valid.
- Expected value:
- Faster per-file setup when many patterns exist.
- More predictable matching behavior.
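One possible shape for candidate 3, assuming patterns can be represented as `(priority, expression)` pairs. The priority field and both function names are hypothetical and not taken from `pattern_controller.py`.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_pattern(expression: str) -> re.Pattern:
    """Compile each distinct expression once per process."""
    return re.compile(expression)


def first_match(filename, patterns):
    """Return (expression, match) for the best-priority hit, else None.

    patterns: iterable of (priority, expression); lower priority wins.
    """
    for _priority, expression in sorted(patterns):
        match = compile_pattern(expression).search(filename)
        if match:
            return expression, match  # stop at the first hit
    return None
```

Sorting by priority makes the winner explicit when rules overlap, instead of depending on load order.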
4. Media probing does two separate `ffprobe` subprocesses per file
- [`src/ffx/file_properties.py`](/home/osgw/.local/src/codex/ffx/src/ffx/file_properties.py) calls `ffprobe` once for format data and once for stream data.
- Optimization:
- Use one probe call that requests both format and streams.
- Cache that result inside `FileProperties`.
- Expected value:
- Less subprocess overhead.
- Faster inspect and convert flows.
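A sketch of the single-call probe from candidate 4. The function names are hypothetical, and the way `FileProperties` would store the result is not shown. The `ffprobe` flags used are `-show_format`, `-show_streams`, and `-of json`.

```python
import json
import subprocess


def build_probe_command(path: str) -> list[str]:
    """One ffprobe invocation that requests format and stream data together."""
    return [
        "ffprobe", "-v", "error",
        "-show_format", "-show_streams",
        "-of", "json",
        path,
    ]


def parse_probe_output(raw: str) -> tuple[dict, list]:
    """Split the combined JSON document into (format, streams)."""
    data = json.loads(raw)
    return data.get("format", {}), data.get("streams", [])


def probe(path: str) -> tuple[dict, list]:
    # Requires ffprobe on PATH; raises CalledProcessError on a non-zero exit.
    completed = subprocess.run(
        build_probe_command(path), capture_output=True, text=True, check=True
    )
    return parse_probe_output(completed.stdout)
```

Keeping the parser separate from the subprocess call also makes it testable without media files.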
5. Crop detection is always a full extra ffmpeg scan
- [`src/ffx/file_properties.py`](/home/osgw/.local/src/codex/ffx/src/ffx/file_properties.py) runs a dedicated `ffmpeg -vf cropdetect` pass for each file when crop detection is requested.
- Optimization:
- Cache crop results for repeated runs on the same source.
- Consider exposing shorter sampling windows or probe presets for large files.
- Expected value:
- Lower latency on repeated experimentation.
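A rough in-process sketch of the crop cache from candidate 5, assuming a source file can be identified by absolute path, modification time, and size. `cached_crop` and the `detect` callable are hypothetical. Persisting the cache across runs is left out.

```python
import os
from typing import Callable

# Maps (absolute path, mtime in ns, size in bytes) to a crop result.
_crop_cache: dict[tuple[str, int, int], str] = {}


def source_key(path: str) -> tuple[str, int, int]:
    """Key that changes whenever the file is replaced or modified."""
    stat = os.stat(path)
    return (os.path.abspath(path), stat.st_mtime_ns, stat.st_size)


def cached_crop(path: str, detect: Callable[[str], str]) -> str:
    """Run the expensive detect(path) only once per unchanged source file."""
    key = source_key(path)
    if key not in _crop_cache:
        _crop_cache[key] = detect(path)
    return _crop_cache[key]
```

Here `detect` would wrap the existing `cropdetect` pass; the cache only decides whether that pass needs to run again.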
6. Process wrapper lacks stronger execution controls
- [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py) uses `Popen(...).communicate()` without timeout handling, structured error mapping, or direct missing-command handling.
- Optimization:
- Add timeout support and clearer `FileNotFoundError` handling.
- Consider `subprocess.run(..., check=False, text=True)` where streaming is not required.
- Centralize return/error formatting.
- Expected value:
- Better failure diagnosis.
- Cleaner process management semantics.
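A possible shape for the wrapper in candidate 6, for cases where output streaming is not required. `run_command` is a hypothetical name, and the return codes 127 and 124 are an arbitrary choice borrowed from shell conventions, not something the current code defines.

```python
import subprocess


def run_command(args, timeout=None):
    """Return (returncode, stdout, stderr) without raising for common failures."""
    try:
        completed = subprocess.run(
            args, capture_output=True, text=True, check=False, timeout=timeout
        )
    except FileNotFoundError:
        # The executable itself is missing.
        return 127, "", f"command not found: {args[0]}"
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return 124, "", f"timed out after {timeout}s: {args[0]}"
    return completed.returncode, completed.stdout, completed.stderr
```

Callers then get one uniform tuple to format or log, whichever failure occurred.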
7. Logger handlers can be added repeatedly
- [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py) adds file and console handlers each invocation.
- Several helper classes install `NullHandler` instances ad hoc, for example [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py), [`src/ffx/tmdb_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/tmdb_controller.py), [`src/ffx/media_descriptor.py`](/home/osgw/.local/src/codex/ffx/src/ffx/media_descriptor.py), and [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py).
- Optimization:
- Guard handler installation so each logger is configured once.
- Prefer module-level logger setup patterns over per-instance handler mutation.
- Expected value:
- Less duplicate logging.
- Lower confusion in long-running or repeatedly invoked contexts.
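A minimal sketch of guarded handler setup for candidate 7. The logger name and format string are assumptions, and only a console handler is shown.

```python
import logging


def configure_logger(name: str = "ffx") -> logging.Logger:
    """Attach a console handler once, however often this is called."""
    logger = logging.getLogger(name)
    # FileHandler subclasses StreamHandler, so a real setup with both kinds
    # would need a more specific check than this one.
    already_set = any(
        isinstance(handler, logging.StreamHandler) for handler in logger.handlers
    )
    if not already_set:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

Helper classes could then call `logging.getLogger(__name__)` at module level and leave handler installation to this one place.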
8. Repo-local hygiene for generated Python artifacts
- The repo currently contains nested compiled artifacts under `src/ffx/__pycache__/...`.
- `.gitignore` only ignores `__pycache__` at the repo root, not recursive `__pycache__/`.
- Optimization:
- Ignore `__pycache__/` recursively and clean tracked generated files.
- Consider ignoring local virtualenv or other generated tool directories if they may appear in-repo later.
- Expected value:
- Cleaner diffs and scans.
- Lower repo noise.
9. Tooling overlap and naming drift
- Several overlapping setup scripts now exist: [`tools/prepare.sh`](/home/osgw/.local/src/codex/ffx/tools/prepare.sh), [`tools/setup.sh`](/home/osgw/.local/src/codex/ffx/tools/setup.sh), and the apparently legacy [`tools/ffx_update.sh`](/home/osgw/.local/src/codex/ffx/tools/ffx_update.sh).
- Optimization:
- Decide which scripts remain canonical.
- Replace or remove legacy wrappers once equivalent CLI commands exist.
- Keep CLI maintenance commands and shell wrappers aligned.
- Expected value:
- Less operator confusion.
- Fewer duplicated procedures to maintain.
10. Placeholder UI surfaces should either ship or disappear
- [`src/ffx/help_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/help_screen.py) and [`src/ffx/settings_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/settings_screen.py) are placeholders.
- Optimization:
- Either remove them from the active UI surface or complete them.
- Avoid paying ongoing maintenance cost for unfinished navigation targets.
- Expected value:
- Leaner interface.
- Lower UX ambiguity.
11. Large Textual screens repeat configuration and controller loading
- Screens such as [`src/ffx/media_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/media_details_screen.py), [`src/ffx/pattern_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_details_screen.py), and [`src/ffx/show_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/show_details_screen.py) repeat the same setup patterns, and each extracts metadata filters locally.
- Optimization:
- Extract a shared screen base or helper for common config/controller/bootstrap logic.
- Reduce repeated table refresh and repeated DB fetch code where possible.
- Expected value:
- Lower maintenance overhead.
- Easier UI iteration.
12. Several helper functions are unfinished or dead weight
- [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py) contains `permutateList(...): pass`.
- There are many combinator and conversion placeholders across tests and migrations.
- Optimization:
- Remove dead code, finish it, or isolate it behind a clearly dormant area.
- Avoid carrying stubbed utility surface that looks reusable but is not.
- Expected value:
- Smaller mental model.
- Less time spent re-evaluating inactive paths.
13. Test suite shape is expensive to understand and likely expensive to run
- The project has a large matrix of combinator files under [`src/ffx/test`](/home/osgw/.local/src/codex/ffx/src/ffx/test), several placeholder `pass` implementations, and at least one suspicious filename with an embedded space: [`src/ffx/test/disposition_combinator_2_3 .py`](/home/osgw/.local/src/codex/ffx/src/ffx/test/disposition_combinator_2_3 .py).
- Optimization:
- Consolidate combinator families.
- Add a lighter smoke-test path.
- Normalize file naming and test discovery conventions.
- Expected value:
- Faster contributor onboarding.
- Easier CI adoption later.
14. Process resource limiting semantics could be clearer
- [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py) prepends `nice` and `cpulimit` directly when values are set.
- Optimization:
- Validate and document effective behavior for combined `nice` + `cpulimit`.
- Consider explicit no-limit vs configured-limit states in the CLI and requirements.
- Expected value:
- Fewer surprises in production-like runs.
- Easier support for user-reported performance behavior.
15. Import-time dependency coupling makes maintenance commands brittle
- Even after recent CLI maintenance additions, the top-level CLI module still imports most application modules before Click dispatch.
- Optimization:
- Push imports for ORM, Textual, TMDB, ffmpeg helpers, and descriptors behind the commands that actually need them.
- Expected value:
- Maintenance commands such as setup and upgrade stay usable when optional runtime dependencies are broken.
- Better separation between media runtime code and maintenance tooling.
16. Regex and string utility cleanup
- [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py) still emits a `SyntaxWarning` for `RICH_COLOR_PATTERN`.
- Optimization:
- Convert regex literals to raw strings where appropriate.
- Review filename and TMDB substitution helpers for repeated string churn.
- Expected value:
- Cleaner runtime output.
- Less warning noise during dry-run maintenance commands.
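The raw-string fix in candidate 16 can be illustrated as follows. The pattern below is a hypothetical stand-in for Rich-style markup tags; the real `RICH_COLOR_PATTERN` in `helper.py` may look different.

```python
import re

# In a normal string literal, "\[" is an invalid escape sequence and newer
# Python versions warn about it. The r"" prefix passes the backslash through
# to the regex engine unchanged.
RICH_TAG_PATTERN = re.compile(r"\[/?[a-z_ ]+\]")


def strip_rich_tags(text: str) -> str:
    """Remove tags such as "[red]" and "[/red]" from a string."""
    return RICH_TAG_PATTERN.sub("", text)
```

For example, `strip_rich_tags("[red]error[/red] done")` yields `"error done"`.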
17. Database startup always runs schema creation and version checks
- [`src/ffx/database.py`](/home/osgw/.local/src/codex/ffx/src/ffx/database.py) runs `Base.metadata.create_all(...)` and version checks every time a DB-backed context is created.
- Optimization:
- Measure startup cost and consider separating bootstrapping from ordinary command execution.
- Keep schema migration/version enforcement explicit.
- Expected value:
- Faster command startup.
- Clearer operational boundaries.
## Open
Use this for unresolved questions, design choices, and risks that still need a
decision.
- Should optimization work focus first on operator-perceived latency, internal maintainability, or correctness-risk cleanup that also has performance upside?
- Is the long-term supported model still “local Linux workstation plus Textual UI,” or should optimization decisions bias toward a more scriptable/headless CLI?
## Sketches
Use this for rough candidate structures, names, or shapes. Keep it explicit
that these are sketches, not committed architecture.
## Gaps Right Now
Use this for concrete missing pieces in the current repo state. This section
should describe what is absent or incomplete, not broad future ambitions.
- No explicit prioritization owner or milestone for the optimization backlog.
- No benchmark or timing harness exists for startup, probe, DB, or conversion orchestration overhead.
- Repo hygiene is still mixed with generated artifacts and some clearly unfinished files.
## Next
Use this for the immediate sequence of work. It should be short, ordered, and
biased toward the next deliverable rather than a long roadmap.
1. Triage the list into quick wins, medium refactors, and long-horizon cleanup.
2. Tackle the cheapest high-impact items first:
- recursive `__pycache__/` ignore and cleanup,
- regex raw-string warning cleanup,
- `count()` plus `first()` query cleanup,
- single-call `ffprobe` refactor.
3. Decide whether maintenance/tooling command imports should be split from media-runtime imports before adding more CLI maintenance surface.
## Delete When
Use this to define when the scratchpad should disappear. That keeps it clearly
temporary and helps prevent it from turning into shadow documentation.
## Suggested Style
- Prefer short bullets over long prose.
- Keep facts, questions, and rough sketches in separate sections.
- Add custom sections only when they help the next iteration move faster.
- Move durable outcomes out of the scratchpad once they stop being temporary.
-->
- Delete this scratchpad once the optimization backlog is either converted into issues/work items or distilled into durable project guidance.