Javanaut/ffx

Javanaut 60ae58500a Tidy up logging and rework tests from scratch

2026-04-09 12:46:24 +02:00

Scratchpad

Goal

Capture a compact, project-wide list of optimization candidates after a broad scan of the current FFX codebase, tooling, and requirements.

Settled

The biggest near-term wins are in startup cost, repeated subprocess work, repeated database query patterns, and general repo hygiene.
This list is intentionally optimization-oriented rather than bug-oriented. Some items below also improve correctness or maintainability, but they were selected because they can reduce runtime cost, operator friction, or iteration overhead.
A first modern integration slice now exists under tests/integration/subtrack_mapping. The remaining test-suite cleanup is mostly a matter of migrating and shrinking the legacy harness surface under tests/legacy.
FFX logger setup now reuses named handlers, and fallback logger access no longer mutates handlers in ordinary constructors and helpers.

Focused Snapshot

Highest-leverage application optimizations:
- Lazy-load CLI command dependencies so lightweight commands do not import most of the app.
- Collapse repeated ffprobe calls into a single probe result per source file.
- Replace query.count() plus first() patterns with single-query ORM accessors.
- Cache or precompile filename pattern regexes instead of scanning every pattern for every file.
Highest-leverage repo and workflow optimizations:
- Consolidate setup and upgrade tooling to reduce overlapping shell-script responsibilities.
- Continue migrating the oversized legacy test/combinator surface into focused modern tests so it is easier to run, debug, and extend.

Optimization Candidates

CLI startup and import cost

src/ffx/cli.py imports a large portion of the application at module import time, even for cheap commands such as version, help, setup_dependencies, and upgrade.
Optimization:
- Move heavy imports into the commands that actually need them.
- Keep the CLI root importable with only core stdlib and Click dependencies.
Expected value:
- Faster startup for scripting and tooling commands.
- Less coupling between maintenance commands and the runtime stack.
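
A minimal sketch of the deferred-import idea, under stated assumptions: argparse is used only to keep the example dependency-free (with Click, the same deferral would put the imports inside the command function bodies), and the command names, handlers, and return values are placeholders rather than the real src/ffx/cli.py surface.

```python
import argparse


def cmd_version(_args):
    # Cheap command: needs nothing beyond what is already imported.
    return "ffx version (placeholder)"


def cmd_inspect(args):
    # Heavy dependencies are imported at call time, so `version` or
    # `help` never pay for them. `json` is only a stand-in for the
    # real ORM / Textual / TMDB imports (an assumption of this sketch).
    import json

    return json.dumps({"file": args.path})


def build_parser():
    parser = argparse.ArgumentParser(prog="ffx")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("version").set_defaults(func=cmd_version)
    inspect = sub.add_parser("inspect")
    inspect.add_argument("path")
    inspect.set_defaults(func=cmd_inspect)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling main(["version"]) never executes the import inside cmd_inspect, which is the property the CLI root would need for maintenance commands.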

Repeated database queries via count() plus first()

Controllers such as src/ffx/show_controller.py, src/ffx/pattern_controller.py, and src/ffx/database.py often do q.count() and then q.first().
Optimization:
- Replace with first(), one_or_none(), or existence checks that do not issue two queries.
- Standardize this across all controllers.
Expected value:
- Lower SQLite query volume.
- Simpler controller code.
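
The query-volume difference can be illustrated at the SQL level with the stdlib sqlite3 module. This is a sketch, not the project's ORM code: the shows table and both helper names are hypothetical, and the two functions only mirror the count()-then-first() shape and the single first() shape respectively.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO shows (name) VALUES ('Example Show')")
conn.commit()

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement run


def find_show_two_queries(name):
    # Mirrors count() followed by first(): two statements per lookup.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM shows WHERE name = ?", (name,)
    ).fetchone()
    if count == 0:
        return None
    return conn.execute(
        "SELECT id, name FROM shows WHERE name = ? LIMIT 1", (name,)
    ).fetchone()


def find_show_single_query(name):
    # Mirrors first() alone: one statement, None when nothing matches.
    return conn.execute(
        "SELECT id, name FROM shows WHERE name = ? LIMIT 1", (name,)
    ).fetchone()
```

Inspecting the statements list after each call shows the lookup cost directly, which may also be a cheap way to audit controllers before and after the cleanup.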

Filename pattern matching scales linearly across all patterns

src/ffx/pattern_controller.py loads every pattern and runs re.search against each filename on every lookup.
Optimization:
- Cache compiled regexes in process memory.
- Stop after the first intentional match instead of silently returning the last match.
- Consider explicit pattern priority if overlapping rules are valid.
Expected value:
- Faster per-file setup when many patterns exist.
- More predictable matching behavior.
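
One way the caching and first-match behavior could look, as a sketch only: the PATTERNS rows, the priority column, and both function names are assumptions for illustration and do not reflect the actual schema in src/ffx/pattern_controller.py.

```python
import re
from functools import lru_cache

# Hypothetical pattern rows: (priority, regex source, label).
PATTERNS = [
    (20, r"^Example\.", "Example (catch-all)"),
    (10, r"^Example\.Show\.S(\d+)E(\d+)", "Example Show"),
]


@lru_cache(maxsize=None)
def compiled_patterns():
    """Compile every pattern once per process, ordered by priority."""
    ordered = sorted(PATTERNS, key=lambda row: row[0])
    return tuple((re.compile(src), label) for _prio, src, label in ordered)


def match_filename(filename):
    """Return (label, match) for the first matching pattern, else None."""
    for regex, label in compiled_patterns():
        found = regex.search(filename)
        if found:
            return label, found  # stop at the first intentional match
    return None
```

If patterns can be edited while the process is running, the cache would need an explicit compiled_patterns.cache_clear() after each write.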

Media probing does two separate ffprobe subprocesses per file

src/ffx/file_properties.py calls ffprobe once for format data and once for stream data.
Optimization:
- Use one probe call that requests both format and streams.
- Cache that result inside FileProperties.
Expected value:
- Less subprocess overhead.
- Faster inspect and convert flows.
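
A sketch of the single-probe shape. ProbedFile is a hypothetical stand-in for FileProperties, and the injectable probe argument is an assumption added so the class can be exercised without an ffprobe binary; the ffprobe flags themselves (-print_format json, -show_format, -show_streams) are standard.

```python
import json
import subprocess
from functools import cached_property


def run_ffprobe(path):
    """Run ffprobe once, requesting container and stream data as JSON."""
    command = [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


class ProbedFile:
    """Hypothetical stand-in for FileProperties, not the project's class."""

    def __init__(self, path, probe=run_ffprobe):
        self.path = path
        self._probe = probe  # injectable so tests can supply a fake

    @cached_property
    def probe_data(self):
        # Computed on first access, then stored on the instance.
        return self._probe(self.path)

    @property
    def format_info(self):
        return self.probe_data.get("format", {})

    @property
    def streams(self):
        return self.probe_data.get("streams", [])
```

Both accessors read from the same cached dictionary, so asking for format and stream data costs one subprocess per file instead of two.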

Crop detection is always a full extra ffmpeg scan

src/ffx/file_properties.py runs a dedicated ffmpeg -vf cropdetect pass for each file when crop detection is requested.
Optimization:
- Cache crop results for repeated runs on the same source.
- Consider exposing shorter sampling windows or probe presets for large files.
Expected value:
- Lower latency on repeated experimentation.
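
A possible shape for the crop cache, sketched under assumptions: the key uses path, size, and modification time so that a changed source invalidates its entry, and detect is a hypothetical callable standing in for the existing cropdetect pass. The dictionary lives in process memory only; surviving across separate invocations would require persisting it, which this sketch does not attempt.

```python
import os

_crop_cache = {}


def source_key(path):
    """Identify a source by absolute path, size, and mtime."""
    info = os.stat(path)
    return (os.path.abspath(path), info.st_size, info.st_mtime_ns)


def cached_crop(path, detect):
    """Return the crop for `path`, calling detect(path) only on a miss."""
    key = source_key(path)
    if key not in _crop_cache:
        _crop_cache[key] = detect(path)
    return _crop_cache[key]
```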

Process wrapper lacks stronger execution controls

src/ffx/process.py uses Popen(...).communicate() without timeout handling, structured error mapping, or direct missing-command handling.
Optimization:
- Add timeout support and clearer FileNotFoundError handling.
- Consider subprocess.run(..., check=False, text=True) where streaming is not required.
- Centralize return/error formatting.
Expected value:
- Better failure diagnosis.
- Cleaner process management semantics.
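
The controls listed above could be combined roughly as follows. This is a sketch for the non-streaming case only; ProcessResult, its status strings, and run_command are hypothetical names, not the current src/ffx/process.py interface.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessResult:
    """Hypothetical uniform result shape for every outcome."""
    status: str                 # "ok", "failed", "timeout", or "missing"
    returncode: Optional[int]
    stdout: str
    stderr: str


def run_command(command, timeout=None):
    """Run `command` and map every outcome to a ProcessResult."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            check=False,
            timeout=timeout,
        )
    except FileNotFoundError:
        message = f"command not found: {command[0]}"
        return ProcessResult("missing", None, "", message)
    except subprocess.TimeoutExpired as exc:
        message = f"timed out after {exc.timeout}s"
        return ProcessResult("timeout", None, "", message)
    status = "ok" if completed.returncode == 0 else "failed"
    return ProcessResult(
        status, completed.returncode, completed.stdout, completed.stderr
    )
```

Callers then branch on status instead of catching exceptions, which keeps return and error formatting in one place.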

Tooling overlap and naming drift

There are still overlapping prep and setup entrypoints across tools/prepare.sh, tools/setup.sh, and newer CLI maintenance commands.
Optimization:
- Decide which scripts remain canonical.
- Replace or remove legacy wrappers once equivalent CLI commands exist.
- Keep CLI maintenance commands and shell wrappers aligned.
Expected value:
- Less operator confusion.
- Fewer duplicated procedures to maintain.

Placeholder UI surfaces should either ship or disappear

src/ffx/help_screen.py and src/ffx/settings_screen.py are placeholders.
Optimization:
- Either remove them from the active UI surface or complete them.
- Avoid paying ongoing maintenance cost for unfinished navigation targets.
Expected value:
- Leaner interface.
- Lower UX ambiguity.

Large Textual screens repeat configuration and controller loading

Screens such as src/ffx/media_details_screen.py, src/ffx/pattern_details_screen.py, and src/ffx/show_details_screen.py each repeat the same setup patterns and their own local metadata-filtering logic.
Optimization:
- Extract a shared screen base or helper for common config/controller/bootstrap logic.
- Reduce repeated table refresh and repeated DB fetch code where possible.
Expected value:
- Lower maintenance overhead.
- Easier UI iteration.

Several helper functions are unfinished or dead-weight

src/ffx/helper.py contains permutateList(...): pass.
There are many combinator and conversion placeholders across tests and migrations.
Optimization:
- Remove dead code, finish it, or isolate it behind a clearly dormant area.
- Avoid carrying stubbed utility surface that looks reusable but is not.
Expected value:
- Smaller mental model.
- Less time spent re-evaluating inactive paths.

Test suite shape is expensive to understand and likely expensive to run

The project still carries a large legacy matrix of combinator files under tests/legacy, several placeholder pass implementations, and at least one suspicious filename with an embedded space: tests/legacy/disposition_combinator_2_3 .py.
A first focused replacement slice now exists in tests/integration/subtrack_mapping/test_cli_bundle.py, so the remaining work is migration and consolidation rather than creating the modern test shape from scratch.
Optimization:
- Continue replacing broad combinator matrices with focused parametrized integration and unit tests.
- Retire the bespoke legacy discovery and runner path once equivalent coverage exists.
- Normalize file naming and test discovery conventions.
Expected value:
- Faster contributor onboarding.
- Easier CI adoption later.

Process resource limiting semantics could be clearer

src/ffx/process.py prepends nice and cpulimit directly when values are set.
Optimization:
- Validate and document effective behavior for combined nice + cpulimit.
- Consider explicit no-limit vs configured-limit states in the CLI and requirements.
Expected value:
- Fewer surprises in production-like runs.
- Easier support for user-reported performance behavior.
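
A sketch of explicit no-limit versus configured-limit states. The function name, the validation bounds for cpu_percent, and the nesting order (nice wrapping cpulimit wrapping the command) are assumptions; in particular, the exact cpulimit argument form differs between cpulimit variants and should be checked against the installed tool before relying on this shape.

```python
def build_command(base, niceness=None, cpu_percent=None):
    """Prefix `base` with nice and/or cpulimit; None means "no limit"."""
    if niceness is not None and not -20 <= niceness <= 19:
        raise ValueError("niceness must be between -20 and 19")
    if cpu_percent is not None and cpu_percent <= 0:
        raise ValueError("cpu_percent must be positive")

    command = list(base)
    if cpu_percent is not None:
        # Assumed form: cpulimit launches the command it is given.
        command = ["cpulimit", "-l", str(cpu_percent)] + command
    if niceness is not None:
        command = ["nice", "-n", str(niceness)] + command
    return command
```

Keeping None distinct from any numeric value avoids the ambiguity of treating 0 as "unset", which is one way the CLI and requirements could state the effective behavior.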

Import-time dependency coupling makes maintenance commands brittle

Even after recent CLI maintenance additions, the top-level CLI module still imports most application modules before Click dispatch.
Optimization:
- Push imports for ORM, Textual, TMDB, ffmpeg helpers, and descriptors behind the commands that actually need them.
Expected value:
- Maintenance commands such as setup and upgrade stay usable when optional runtime dependencies are broken.
- Better separation between media runtime code and maintenance tooling.

Regex and string utility cleanup

src/ffx/helper.py still emits a SyntaxWarning for RICH_COLOR_PATTERN.
Optimization:
- Convert regex literals to raw strings where appropriate.
- Review filename and TMDB substitution helpers for repeated string churn.
Expected value:
- Cleaner runtime output.
- Less warning noise during dry-run maintenance commands.
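
The raw-string fix can be demonstrated without touching the real module. In this sketch the pattern text is hypothetical (the actual RICH_COLOR_PATTERN is not shown here), and escape_warnings is an illustrative helper that compiles a snippet of source and reports the escape-sequence warnings the compiler raises for it.

```python
import re
import warnings


def escape_warnings(source):
    """Compile `source` and return the warnings the compiler emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<sketch>", "exec")
    kinds = (SyntaxWarning, DeprecationWarning)
    return [w for w in caught if issubclass(w.category, kinds)]


# Source text with \[ inside an ordinary string literal.
plain_source = 'PATTERN = "\\[/?[a-z_]+\\]"'
# The same text inside a raw string literal.
raw_source = 'PATTERN = r"\\[/?[a-z_]+\\]"'

# Hypothetical markup pattern written the warning-free way.
RICH_TAG = re.compile(r"\[/?[a-z_]+\]")
```

Older interpreters report the invalid escape as a DeprecationWarning and newer ones as a SyntaxWarning, which is why the helper accepts both categories.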

Database startup always runs schema creation and version checks

src/ffx/database.py runs Base.metadata.create_all(...) and version checks every time a DB-backed context is created.
Optimization:
- Measure startup cost and consider separating bootstrapping from ordinary command execution.
- Keep schema migration/version enforcement explicit.
Expected value:
- Faster command startup.
- Clearer operational boundaries.
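
Measuring the startup cost could start with something as small as the following sketch. The timed and report helpers and the label names are hypothetical; the intent is to wrap the create_all call and the version check separately and compare their totals before deciding whether to split bootstrapping out.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Accumulate wall-clock seconds spent inside the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed


def report():
    """Return 'label: seconds' lines, slowest first."""
    ordered = sorted(timings.items(), key=lambda item: item[1], reverse=True)
    return [f"{label}: {seconds:.4f}s" for label, seconds in ordered]
```

For example, wrapping the schema step in `with timed("create_all"):` and the version check in `with timed("version_check"):` would give two comparable numbers per command run.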

Open

Should optimization work focus first on operator-perceived latency, internal maintainability, or correctness-risk cleanup that also has performance upside?
Is the long-term supported model still “local Linux workstation plus Textual UI,” or should optimization decisions bias toward a more scriptable/headless CLI?

Gaps Right Now

No explicit prioritization owner or milestone for the optimization backlog.
No benchmark or timing harness exists for startup, probe, DB, or conversion orchestration overhead.
Repo hygiene is still mixed with generated artifacts and some clearly unfinished files.

Next

Triage the list into quick wins, medium refactors, and long-horizon cleanup.
Tackle the cheapest high-impact items first:
- regex raw-string warning cleanup,
- count() plus first() query cleanup,
- single-call ffprobe refactor.
Decide whether maintenance/tooling command imports should be split from media-runtime imports before adding more CLI maintenance surface.

Delete When

Delete this scratchpad once the optimization backlog is either converted into issues/work items or distilled into durable project guidance.
