Scratchpad
Goal
- Capture a compact, project-wide list of optimization candidates after a broad scan of the current FFX codebase, tooling, and requirements.
Settled
- The biggest near-term wins are in startup cost, repeated subprocess work, repeated database query patterns, and general repo hygiene.
- This list is intentionally optimization-oriented rather than bug-oriented. Some items below also improve correctness or maintainability, but they were selected because they can reduce runtime cost, operator friction, or iteration overhead.
Focused Snapshot
Highest-leverage application optimizations:
- Lazy-load CLI command dependencies so lightweight commands do not import most of the app.
- Collapse repeated ffprobe calls into a single probe result per source file.
- Replace query.count() plus first() patterns with single-query ORM accessors.
- Cache or precompile filename pattern regexes instead of scanning every pattern for every file.
- Guard logger handler installation to avoid duplicated handlers and noisy repeated setup.
Highest-leverage repo and workflow optimizations:
- Stop tracking nested __pycache__ output and other generated artifacts.
- Consolidate setup and upgrade tooling to reduce overlapping shell-script responsibilities.
- Trim or reorganize the oversized test/combinator surface so it is easier to run, debug, and extend.
Optimization Candidates
- CLI startup and import cost
- src/ffx/ffx.py imports a large portion of the application at module import time, even for cheap commands such as version, help, setup_dependencies, and upgrade.
- Optimization:
- Move heavy imports into the commands that actually need them.
- Keep the CLI root importable with only core stdlib and Click dependencies.
- Expected value:
- Faster startup for scripting and tooling commands.
- Less coupling between maintenance commands and the runtime stack.
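A minimal sketch of the deferred-import idea, using only the standard library so it runs anywhere. The command names and the HEAVY_DEPENDENCIES mapping are hypothetical, and stdlib modules stand in for the heavy application modules. In a Click app the same effect usually comes from placing the import inside the command function body rather than at the top of the module.

```python
import importlib

# Hypothetical mapping from command name to the module it needs.
# Stdlib modules stand in for heavy application modules here.
HEAVY_DEPENDENCIES = {
    "convert": "sqlite3",  # stand-in for ORM and ffmpeg helpers
    "inspect": "json",     # stand-in for probe helpers
}


def run_command(name: str) -> str:
    """Dispatch a command, importing its heavy dependency only on demand."""
    if name == "version":
        # Cheap command: no application imports at all.
        return "ffx (version lookup elided)"
    module_name = HEAVY_DEPENDENCIES.get(name)
    if module_name is None:
        raise ValueError(f"unknown command: {name}")
    # Deferred import: the cost is paid only by commands that need it.
    module = importlib.import_module(module_name)
    return f"{name} loaded {module.__name__}"
```

Running the interpreter with -X importtime is one way to see which imports dominate startup before and after such a change.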
- Repeated database queries via count() plus first()
- Controllers such as src/ffx/show_controller.py, src/ffx/pattern_controller.py, and src/ffx/database.py often do q.count() and then q.first().
- Optimization:
- Replace with first(), one_or_none(), or existence checks that do not issue two queries.
- Standardize this across all controllers.
- Expected value:
- Lower SQLite query volume.
- Simpler controller code.
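The difference in query volume can be illustrated without the ORM. The sketch below uses the stdlib sqlite3 module and a hypothetical shows table, not the project's actual models, and records every executed statement to show that the count-then-fetch pattern issues two queries where a single fetch suffices. With SQLAlchemy the single-fetch side corresponds to calling first() or one_or_none() directly and checking for None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO shows (name) VALUES ('example')")
conn.commit()

statements = []
conn.set_trace_callback(statements.append)  # record every executed statement


def find_show_two_queries(name):
    """Pattern to avoid: a COUNT query followed by a separate fetch."""
    count = conn.execute(
        "SELECT COUNT(*) FROM shows WHERE name = ?", (name,)
    ).fetchone()[0]
    if count == 0:
        return None
    return conn.execute(
        "SELECT id, name FROM shows WHERE name = ? LIMIT 1", (name,)
    ).fetchone()


def find_show_one_query(name):
    """Single fetch: a missing row simply comes back as None."""
    return conn.execute(
        "SELECT id, name FROM shows WHERE name = ? LIMIT 1", (name,)
    ).fetchone()
```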
- Filename pattern matching scales linearly across all patterns
- src/ffx/pattern_controller.py loads every pattern and runs re.search against each filename on every lookup.
- Optimization:
- Cache compiled regexes in process memory.
- Stop after the first intentional match instead of silently returning the last match.
- Consider explicit pattern priority if overlapping rules are valid.
- Expected value:
- Faster per-file setup when many patterns exist.
- More predictable matching behavior.
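One possible shape for cached, first-match-wins lookup is sketched below. The function names and the (pattern_id, regex) pair layout are assumptions for illustration, not the controller's real interface. Note that the re module already keeps a small internal cache of compiled patterns, so the explicit cache mainly removes the dependence on that cache's size and makes the intent visible.

```python
import re
from functools import lru_cache
from typing import Optional, Sequence, Tuple


@lru_cache(maxsize=None)
def compiled(pattern: str) -> "re.Pattern[str]":
    """Compile each distinct pattern string once per process."""
    return re.compile(pattern)


def match_filename(
    filename: str, patterns: Sequence[Tuple[int, str]]
) -> Optional[int]:
    """Return the id of the first pattern, in the given order, that matches.

    `patterns` holds (pattern_id, regex) pairs, already sorted by whatever
    priority the project decides on.
    """
    for pattern_id, regex in patterns:
        if compiled(regex).search(filename):
            return pattern_id  # stop at the first match
    return None
```

Because the caller controls the order of `patterns`, an explicit priority column could be applied by sorting before the call.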
- Media probing spawns two separate ffprobe subprocesses per file
- src/ffx/file_properties.py calls ffprobe once for format data and once for stream data.
- Optimization:
- Use one probe call that requests both format and streams.
- Cache that result inside FileProperties.
- Expected value:
- Less subprocess overhead.
- Faster inspect and convert flows.
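A single combined probe could look roughly like the sketch below. The helper names are hypothetical and do not reflect how FileProperties is actually structured; the ffprobe options used (-show_format, -show_streams, -print_format json) are standard ones.

```python
import json
import subprocess
from functools import lru_cache
from typing import Any, Dict, List


def build_probe_command(path: str) -> List[str]:
    """One ffprobe invocation that requests both format and stream data."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def parse_probe_output(raw: str) -> Dict[str, Any]:
    """Split the combined JSON document into its two sections."""
    data = json.loads(raw)
    return {"format": data.get("format", {}), "streams": data.get("streams", [])}


@lru_cache(maxsize=128)
def probe(path: str) -> Dict[str, Any]:
    """Run ffprobe once per path and reuse the parsed result afterwards."""
    completed = subprocess.run(
        build_probe_command(path), capture_output=True, text=True, check=True
    )
    return parse_probe_output(completed.stdout)
```

The cached value is a mutable dict shared between callers, so callers should treat it as read-only or copy it before modifying.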
- Crop detection is always a full extra ffmpeg scan
- src/ffx/file_properties.py runs a dedicated ffmpeg -vf cropdetect pass for each file when crop detection is requested.
- Optimization:
- Cache crop results for repeated runs on the same source.
- Consider exposing shorter sampling windows or probe presets for large files.
- Expected value:
- Lower latency on repeated experimentation.
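As a rough sketch of result caching, the snippet below keys a crop result on the source's absolute path, size, and modification time, so an edited file is detected again. All names are hypothetical, the detector is passed in as a callable rather than assuming how the project invokes ffmpeg, and the cache lives only in process memory; reuse across separate runs would need a persistent store.

```python
import os
from typing import Callable, Dict, Tuple

CropKey = Tuple[str, int, int]
_crop_cache: Dict[CropKey, str] = {}


def crop_cache_key(path: str) -> CropKey:
    """Identify a source by absolute path, size, and modification time."""
    stat = os.stat(path)
    return (os.path.abspath(path), stat.st_size, stat.st_mtime_ns)


def cached_crop(path: str, detect: Callable[[str], str]) -> str:
    """Return a cached crop string, calling `detect` only on a cache miss."""
    key = crop_cache_key(path)
    if key not in _crop_cache:
        _crop_cache[key] = detect(path)  # the expensive ffmpeg pass goes here
    return _crop_cache[key]
```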
- Process wrapper lacks stronger execution controls
- src/ffx/process.py uses Popen(...).communicate() without timeout handling, structured error mapping, or direct missing-command handling.
- Optimization:
- Add timeout support and clearer FileNotFoundError handling.
- Consider subprocess.run(..., check=False, text=True) where streaming is not required.
- Centralize return/error formatting.
- Expected value:
- Better failure diagnosis.
- Cleaner process management semantics.
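A non-streaming wrapper along these lines might look as follows. ProcessResult and run_process are hypothetical names, and the error labels are placeholders for whatever mapping the project settles on.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ProcessResult:
    returncode: Optional[int]    # None when the command never ran to completion
    stdout: str
    stderr: str
    error: Optional[str] = None  # "missing-command", "timeout", or None


def run_process(args: Sequence[str], timeout: Optional[float] = None) -> ProcessResult:
    """Run a command without streaming and map common failures to one shape."""
    try:
        done = subprocess.run(
            list(args), capture_output=True, text=True, check=False, timeout=timeout
        )
    except FileNotFoundError:
        return ProcessResult(None, "", f"command not found: {args[0]}", "missing-command")
    except subprocess.TimeoutExpired as exc:
        return ProcessResult(None, "", f"timed out after {exc.timeout}s", "timeout")
    return ProcessResult(done.returncode, done.stdout, done.stderr)
```

Returning one result type for success, non-zero exit, missing binary, and timeout keeps the formatting of failures in a single place.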
- Logger handlers can be added repeatedly
- src/ffx/ffx.py adds file and console handlers on each invocation.
- Several helper classes install NullHandler instances ad hoc, for example src/ffx/process.py, src/ffx/tmdb_controller.py, src/ffx/media_descriptor.py, and src/ffx/helper.py.
- Optimization:
- Guard handler installation so each logger is configured once.
- Prefer module-level logger setup patterns over per-instance handler mutation.
- Expected value:
- Less duplicate logging.
- Lower confusion in long-running or repeatedly invoked contexts.
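A guarded setup can be as small as the sketch below. The logger name "ffx" and the format string are assumptions; the guard itself relies only on the standard logging API.

```python
import logging


def configure_logger(name: str = "ffx") -> logging.Logger:
    """Attach a console handler once, however often this is called."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # guard: skip setup if handlers already exist
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


# Library modules only need a module-level logger; records propagate to the
# handlers configured on the parent "ffx" logger.
log = logging.getLogger("ffx.process")
```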
- Repo-local hygiene for generated Python artifacts
- The repo currently contains nested compiled artifacts under src/ffx/__pycache__/....
- .gitignore only ignores __pycache__ at the repo root, not recursive __pycache__/.
- Optimization:
- Ignore __pycache__/ recursively and clean tracked generated files.
- Consider ignoring local virtualenv or other generated tool directories if they may appear in-repo later.
- Expected value:
- Cleaner diffs and scans.
- Lower repo noise.
- Tooling overlap and naming drift
- There are now multiple prep-related scripts: tools/prepare.sh, tools/setup.sh, and the legacy-like tools/ffx_update.sh.
- Optimization:
- Decide which scripts remain canonical.
- Replace or remove legacy wrappers once equivalent CLI commands exist.
- Keep CLI maintenance commands and shell wrappers aligned.
- Expected value:
- Less operator confusion.
- Fewer duplicated procedures to maintain.
- Placeholder UI surfaces should either ship or disappear
- src/ffx/help_screen.py and src/ffx/settings_screen.py are placeholders.
- Optimization:
- Either remove them from the active UI surface or complete them.
- Avoid paying ongoing maintenance cost for unfinished navigation targets.
- Expected value:
- Leaner interface.
- Lower UX ambiguity.
- Large Textual screens repeat configuration and controller loading
- Screens such as src/ffx/media_details_screen.py, src/ffx/pattern_details_screen.py, and src/ffx/show_details_screen.py repeat setup patterns and local metadata filtering extraction.
- Optimization:
- Extract a shared screen base or helper for common config/controller/bootstrap logic.
- Reduce repeated table refresh and repeated DB fetch code where possible.
- Expected value:
- Lower maintenance overhead.
- Easier UI iteration.
- Several helper functions are unfinished or dead-weight
- src/ffx/helper.py contains permutateList(...): pass.
- There are many combinator and conversion placeholders across tests and migrations.
- Optimization:
- Remove dead code, finish it, or isolate it behind a clearly dormant area.
- Avoid carrying stubbed utility surface that looks reusable but is not.
- Expected value:
- Smaller mental model.
- Less time spent re-evaluating inactive paths.
- Test suite shape is expensive to understand and likely expensive to run
- The project has a large matrix of combinator files under src/ffx/test, several placeholder pass implementations, and at least one suspicious filename with an embedded space: [src/ffx/test/disposition_combinator_2_3 .py](/home/osgw/.local/src/codex/ffx/src/ffx/test/disposition_combinator_2_3 .py).
- Optimization:
- Consolidate combinator families.
- Add a lighter smoke-test path.
- Normalize file naming and test discovery conventions.
- Expected value:
- Faster contributor onboarding.
- Easier CI adoption later.
- Process resource limiting semantics could be clearer
- src/ffx/process.py prepends nice and cpulimit directly when values are set.
- Optimization:
- Validate and document effective behavior for combined nice + cpulimit.
- Consider explicit no-limit vs configured-limit states in the CLI and requirements.
- Expected value:
- Fewer surprises in production-like runs.
- Easier support for user-reported performance behavior.
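One way to make the no-limit and configured-limit states explicit is to model each limit as an optional value, as in the sketch below. The function name, the validation ranges, and the wrapping order are assumptions, and the cpulimit invocation form shown here should be verified against the installed cpulimit version before relying on it.

```python
from typing import List, Optional, Sequence


def with_resource_limits(
    command: Sequence[str],
    niceness: Optional[int] = None,
    cpu_percent: Optional[int] = None,
) -> List[str]:
    """Prefix a command with nice and/or cpulimit; None means no limit."""
    if niceness is not None and not -20 <= niceness <= 19:
        raise ValueError("niceness must be between -20 and 19")
    if cpu_percent is not None and cpu_percent <= 0:
        raise ValueError("cpu_percent must be positive")
    prefixed = list(command)
    if niceness is not None:
        prefixed = ["nice", "-n", str(niceness)] + prefixed
    if cpu_percent is not None:
        # Assumed invocation form; check the local cpulimit man page.
        prefixed = ["cpulimit", "-l", str(cpu_percent), "--"] + prefixed
    return prefixed
```

Keeping the prefix construction in one pure function also makes the combined behavior easy to document and to cover with tests.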
- Import-time dependency coupling makes maintenance commands brittle
- Even after recent CLI maintenance additions, the top-level CLI module still imports most application modules before Click dispatch.
- Optimization:
- Push imports for ORM, Textual, TMDB, ffmpeg helpers, and descriptors behind the commands that actually need them.
- Expected value:
- Maintenance commands such as setup and upgrade stay usable when optional runtime dependencies are broken.
- Better separation between media runtime code and maintenance tooling.
- Regex and string utility cleanup
- src/ffx/helper.py still emits a SyntaxWarning for RICH_COLOR_PATTERN.
- Optimization:
- Convert regex literals to raw strings where appropriate.
- Review filename and TMDB substitution helpers for repeated string churn.
- Expected value:
- Cleaner runtime output.
- Less warning noise during dry-run maintenance commands.
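The warning typically comes from backslash sequences such as \[ or \w inside an ordinary string literal, which Python reports as invalid escapes (a DeprecationWarning since 3.6, a SyntaxWarning since 3.12). The pattern below is a hypothetical stand-in, since the real RICH_COLOR_PATTERN is not shown here; it only illustrates the raw-string form.

```python
import re

# Hypothetical pattern for illustration only.
# Non-raw form: "\[" and "\w" are invalid string escapes and trigger the warning.
#   COLOR_TAG = re.compile("\[(\w+)\]")
# Raw form: backslashes reach the regex engine unchanged, with no warning.
COLOR_TAG = re.compile(r"\[(\w+)\]")

match = COLOR_TAG.search("[red]warning text")
color = match.group(1) if match else None
```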
- Database startup always runs schema creation and version checks
- src/ffx/database.py runs Base.metadata.create_all(...) and version checks every time a DB-backed context is created.
- Optimization:
- Measure startup cost and consider separating bootstrapping from ordinary command execution.
- Keep schema migration/version enforcement explicit.
- Expected value:
- Faster command startup.
- Clearer operational boundaries.
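Measuring the bootstrap cost does not need much machinery. The sketch below is a generic timing helper, not existing project code; the "db-bootstrap" label is hypothetical and the sleep call stands in for the real schema creation and version checks.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

timings: Dict[str, float] = {}


@contextmanager
def timed(label: str) -> Iterator[None]:
    """Accumulate wall-clock seconds spent inside the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


with timed("db-bootstrap"):
    time.sleep(0.01)  # stand-in for create_all(...) plus version checks
```

Wrapping the same helper around probe and startup paths would give comparable numbers before any refactoring is attempted.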
Open
- Should optimization work focus first on operator-perceived latency, internal maintainability, or correctness-risk cleanup that also has performance upside?
- Is the long-term supported model still “local Linux workstation plus Textual UI,” or should optimization decisions bias toward a more scriptable/headless CLI?
Gaps Right Now
- No explicit prioritization owner or milestone for the optimization backlog.
- No benchmark or timing harness exists for startup, probe, DB, or conversion orchestration overhead.
- Repo hygiene is still mixed with generated artifacts and some clearly unfinished files.
Next
- Triage the list into quick wins, medium refactors, and long-horizon cleanup.
- Tackle the cheapest high-impact items first:
- recursive __pycache__/ ignore and cleanup,
- regex raw-string warning cleanup,
- count() plus first() query cleanup,
- single-call ffprobe refactor.
- Decide whether maintenance/tooling command imports should be split from media-runtime imports before adding more CLI maintenance surface.
Delete When
- Delete this scratchpad once the optimization backlog is either converted into issues/work items or distilled into durable project guidance.