ffn2
225
SCRATCHPAD.md
@@ -1,62 +1,209 @@
<!--

# Scratchpad

Temporary information holder for the next iteration. Developers may create or
delete this file at any time. Anything durable should move into code, tests, or
canonical docs, then this file should disappear.

## Goal

Use this section for the current slice of work. It should explain what the
scratchpad is helping us move forward right now.

- Capture a compact, project-wide list of optimization candidates after a broad scan of the current FFX codebase, tooling, and requirements.

## Settled

Use this for decisions that are stable enough to guide the next steps, but are
still temporary enough to live in the scratchpad for now.

- The biggest near-term wins are in startup cost, repeated subprocess work, repeated database query patterns, and general repo hygiene.
- This list is intentionally optimization-oriented rather than bug-oriented. Some items below also improve correctness or maintainability, but they were selected because they can reduce runtime cost, operator friction, or iteration overhead.

## Focused Snapshot

Use an extra section like this only when one slice needs its own compact
summary. This is useful when a specific API, boundary, or model was recently
recreated and should be captured clearly.

- Highest-leverage application optimizations:
  - Lazy-load CLI command dependencies so lightweight commands do not import most of the app.
  - Collapse repeated `ffprobe` calls into a single probe result per source file.
  - Replace `query.count()` plus `first()` patterns with single-query ORM accessors.
  - Cache or precompile filename pattern regexes instead of scanning every pattern for every file.
  - Guard logger handler installation to avoid duplicated handlers and noisy repeated setup.

- Highest-leverage repo and workflow optimizations:
  - Stop tracking nested `__pycache__` output and other generated artifacts.
  - Consolidate setup and upgrade tooling to reduce overlapping shell-script responsibilities.
  - Trim or reorganize the oversized test/combinator surface so it is easier to run, debug, and extend.

## Optimization Candidates

1. CLI startup and import cost
   - [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py) imports a large portion of the application at module import time, even for cheap commands such as `version`, `help`, `setup_dependencies`, and `upgrade`.
   - Optimization:
     - Move heavy imports into the commands that actually need them.
     - Keep the CLI root importable with only core stdlib and Click dependencies.
   - Expected value:
     - Faster startup for scripting and tooling commands.
     - Less coupling between maintenance commands and the runtime stack.

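Sketch for item 1, stdlib only and not the actual FFX CLI. The `lazy_command` helper and the `COMMANDS` table are hypothetical, and `json` stands in for a heavy application module. With Click, the same effect is usually reached by moving the `import` statement into the body of the command function that needs it.

```python
import importlib


def lazy_command(module_name, func_name):
    """Return a callable that imports module_name only when it is invoked."""
    def runner(*args, **kwargs):
        module = importlib.import_module(module_name)  # deferred import
        return getattr(module, func_name)(*args, **kwargs)
    return runner


# Hypothetical command table: cheap commands carry no heavy imports,
# expensive ones resolve their dependency on first use.
COMMANDS = {
    "version": lambda: "ffx (version placeholder)",
    "dump": lazy_command("json", "dumps"),  # `json` stands in for a heavy module
}

if __name__ == "__main__":
    print(COMMANDS["version"]())
    print(COMMANDS["dump"]({"probe": "ok"}))
```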
2. Repeated database queries via `count()` plus `first()`
   - Controllers such as [`src/ffx/show_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/show_controller.py), [`src/ffx/pattern_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_controller.py), and [`src/ffx/database.py`](/home/osgw/.local/src/codex/ffx/src/ffx/database.py) often do `q.count()` and then `q.first()`.
   - Optimization:
     - Replace with `first()`, `one_or_none()`, or existence checks that do not issue two queries.
     - Standardize this across all controllers.
   - Expected value:
     - Lower SQLite query volume.
     - Simpler controller code.

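Sketch for item 2, using the stdlib `sqlite3` module instead of the project's ORM so it runs without dependencies. The `shows` table and both function names are assumptions for illustration. The point carries over to the ORM: a fetch that returns `None` for "no row" already answers the existence question, so the separate count is redundant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO shows (name) VALUES ('Example Show')")
conn.commit()

statements = []  # records every SQL statement SQLite executes from here on
conn.set_trace_callback(statements.append)


def find_show_two_queries(show_id):
    """Pattern to avoid: a COUNT query followed by a fetch query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM shows WHERE id = ?", (show_id,)
    ).fetchone()
    if row[0] == 0:
        return None
    return conn.execute(
        "SELECT id, name FROM shows WHERE id = ? LIMIT 1", (show_id,)
    ).fetchone()


def find_show_one_query(show_id):
    """Single fetch: a None result already signals 'not found'."""
    return conn.execute(
        "SELECT id, name FROM shows WHERE id = ? LIMIT 1", (show_id,)
    ).fetchone()
```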
3. Filename pattern matching scales linearly across all patterns
   - [`src/ffx/pattern_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_controller.py) loads every pattern and runs `re.search` against each filename on every lookup.
   - Optimization:
     - Cache compiled regexes in process memory.
     - Stop after the first intentional match instead of silently returning the last match.
     - Consider explicit pattern priority if overlapping rules are valid.
   - Expected value:
     - Faster per-file setup when many patterns exist.
     - More predictable matching behavior.

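Sketch for item 3: compile each pattern once per process and return on the first match. The function names are hypothetical, and the sketch assumes the caller passes patterns already ordered by priority, which is one possible answer to the overlapping-rules question rather than a settled design.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def compiled(pattern):
    """Compile each distinct pattern string once per process."""
    return re.compile(pattern)


def first_match(filename, patterns):
    """Return (pattern, match) for the first pattern that matches, else None.

    `patterns` is assumed to be ordered by priority, highest first.
    """
    for pattern in patterns:
        match = compiled(pattern).search(filename)
        if match:
            return pattern, match
    return None
```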
4. Media probing does two separate `ffprobe` subprocesses per file
   - [`src/ffx/file_properties.py`](/home/osgw/.local/src/codex/ffx/src/ffx/file_properties.py) calls `ffprobe` once for format data and once for stream data.
   - Optimization:
     - Use one probe call that requests both format and streams.
     - Cache that result inside `FileProperties`.
   - Expected value:
     - Less subprocess overhead.
     - Faster inspect and convert flows.

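Sketch for item 4: one `ffprobe` invocation that requests format and stream data together as JSON. The `probe` function and its injectable `run` parameter are assumptions made so the sketch can be exercised without `ffprobe` installed; how the result would be stored on `FileProperties` is left open.

```python
import json
import subprocess


def probe(path, run=subprocess.run):
    """Return (format, streams) for `path` from a single ffprobe call."""
    command = [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]
    completed = run(command, capture_output=True, text=True, check=True)
    data = json.loads(completed.stdout)
    # Missing sections fall back to empty containers instead of raising.
    return data.get("format", {}), data.get("streams", [])
```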
5. Crop detection is always a full extra ffmpeg scan
   - [`src/ffx/file_properties.py`](/home/osgw/.local/src/codex/ffx/src/ffx/file_properties.py) runs a dedicated `ffmpeg -vf cropdetect` pass for each file when crop detection is requested.
   - Optimization:
     - Cache crop results for repeated runs on the same source.
     - Consider exposing shorter sampling windows or probe presets for large files.
   - Expected value:
     - Lower latency on repeated experimentation.

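Sketch for item 5: a cache keyed on path, size, and modification time, so a changed source invalidates its entry. All names are hypothetical, and `detect` stands for whatever function runs the cropdetect pass. The dictionary only lives for one process; repeated CLI runs would need the same key written to a persistent store, which is not shown.

```python
import os

_crop_cache = {}  # in-process only; persistence is a separate decision


def source_key(path):
    """Identify a source by absolute path, size, and modification time."""
    info = os.stat(path)
    return (os.path.abspath(path), info.st_size, info.st_mtime_ns)


def cached_crop(path, detect):
    """Return the crop result for `path`, calling `detect(path)` only on a miss."""
    key = source_key(path)
    if key not in _crop_cache:
        _crop_cache[key] = detect(path)
    return _crop_cache[key]
```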
6. Process wrapper lacks stronger execution controls
   - [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py) uses `Popen(...).communicate()` without timeout handling, structured error mapping, or direct missing-command handling.
   - Optimization:
     - Add timeout support and clearer `FileNotFoundError` handling.
     - Consider `subprocess.run(..., check=False, text=True)` where streaming is not required.
     - Centralize return/error formatting.
   - Expected value:
     - Better failure diagnosis.
     - Cleaner process management semantics.

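Sketch for item 6: a non-streaming wrapper around `subprocess.run` with a timeout and explicit handling of a missing executable. The `run_command` name and the 127 and 124 return codes are assumptions borrowed from common shell conventions, not existing FFX behavior.

```python
import subprocess
import sys


def run_command(command, timeout=None):
    """Run `command` and return (returncode, stdout, stderr).

    Hypothetical conventions: 127 for a missing executable, 124 for a timeout.
    """
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, check=False, timeout=timeout
        )
    except FileNotFoundError:
        return 127, "", f"command not found: {command[0]}"
    except subprocess.TimeoutExpired:
        return 124, "", f"timed out after {timeout}s: {command[0]}"
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # The current interpreter is used so the example does not depend on ffmpeg.
    print(run_command([sys.executable, "-c", "print('ok')"], timeout=10))
```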
7. Logger handlers can be added repeatedly
   - [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py) adds file and console handlers each invocation.
   - Several helper classes install `NullHandler` instances ad hoc, for example [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py), [`src/ffx/tmdb_controller.py`](/home/osgw/.local/src/codex/ffx/src/ffx/tmdb_controller.py), [`src/ffx/media_descriptor.py`](/home/osgw/.local/src/codex/ffx/src/ffx/media_descriptor.py), and [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py).
   - Optimization:
     - Guard handler installation so each logger is configured once.
     - Prefer module-level logger setup patterns over per-instance handler mutation.
   - Expected value:
     - Less duplicate logging.
     - Lower confusion in long-running or repeatedly invoked contexts.

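Sketch for item 7: configure a named logger once and return early on later calls. The `configure_logger` function and the `_ffx_configured` marker attribute are hypothetical; checking `logger.handlers` directly would be an alternative guard.

```python
import logging


def configure_logger(name, log_path=None):
    """Attach console (and optional file) handlers once per logger name."""
    logger = logging.getLogger(name)
    if getattr(logger, "_ffx_configured", False):  # hypothetical marker
        return logger
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.StreamHandler())
    if log_path is not None:
        logger.addHandler(logging.FileHandler(log_path))
    logger._ffx_configured = True
    return logger
```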
8. Repo-local hygiene for generated Python artifacts
   - The repo currently contains nested compiled artifacts under `src/ffx/__pycache__/...`.
   - `.gitignore` only ignores `__pycache__` at the repo root, not recursive `__pycache__/`.
   - Optimization:
     - Ignore `__pycache__/` recursively and clean tracked generated files.
     - Consider ignoring local virtualenv or other generated tool directories if they may appear in-repo later.
   - Expected value:
     - Cleaner diffs and scans.
     - Lower repo noise.

9. Tooling overlap and naming drift
   - There are now multiple prep-related scripts: [`tools/prepare.sh`](/home/osgw/.local/src/codex/ffx/tools/prepare.sh), [`tools/setup.sh`](/home/osgw/.local/src/codex/ffx/tools/setup.sh), and the legacy-like [`tools/ffx_update.sh`](/home/osgw/.local/src/codex/ffx/tools/ffx_update.sh).
   - Optimization:
     - Decide which scripts remain canonical.
     - Replace or remove legacy wrappers once equivalent CLI commands exist.
     - Keep CLI maintenance commands and shell wrappers aligned.
   - Expected value:
     - Less operator confusion.
     - Fewer duplicated procedures to maintain.

10. Placeholder UI surfaces should either ship or disappear
    - [`src/ffx/help_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/help_screen.py) and [`src/ffx/settings_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/settings_screen.py) are placeholders.
    - Optimization:
      - Either remove them from the active UI surface or complete them.
      - Avoid paying ongoing maintenance cost for unfinished navigation targets.
    - Expected value:
      - Leaner interface.
      - Lower UX ambiguity.

11. Large Textual screens repeat configuration and controller loading
    - Screens such as [`src/ffx/media_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/media_details_screen.py), [`src/ffx/pattern_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/pattern_details_screen.py), and [`src/ffx/show_details_screen.py`](/home/osgw/.local/src/codex/ffx/src/ffx/show_details_screen.py) repeat the same setup patterns and the same local extraction of metadata filtering settings.
    - Optimization:
      - Extract a shared screen base or helper for common config/controller/bootstrap logic.
      - Reduce repeated table refresh and repeated DB fetch code where possible.
    - Expected value:
      - Lower maintenance overhead.
      - Easier UI iteration.

12. Several helper functions are unfinished or dead-weight
    - [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py) contains `permutateList(...): pass`.
    - There are many combinator and conversion placeholders across tests and migrations.
    - Optimization:
      - Remove dead code, finish it, or isolate it behind a clearly dormant area.
      - Avoid carrying stubbed utility surface that looks reusable but is not.
    - Expected value:
      - Smaller mental model.
      - Less time spent re-evaluating inactive paths.

13. Test suite shape is expensive to understand and likely expensive to run
    - The project has a large matrix of combinator files under [`src/ffx/test`](/home/osgw/.local/src/codex/ffx/src/ffx/test), several placeholder `pass` implementations, and at least one suspicious filename with an embedded space: [`src/ffx/test/disposition_combinator_2_3 .py`](/home/osgw/.local/src/codex/ffx/src/ffx/test/disposition_combinator_2_3 .py).
    - Optimization:
      - Consolidate combinator families.
      - Add a lighter smoke-test path.
      - Normalize file naming and test discovery conventions.
    - Expected value:
      - Faster contributor onboarding.
      - Easier CI adoption later.

14. Process resource limiting semantics could be clearer
    - [`src/ffx/process.py`](/home/osgw/.local/src/codex/ffx/src/ffx/process.py) prepends `nice` and `cpulimit` directly when values are set.
    - Optimization:
      - Validate and document effective behavior for combined `nice` + `cpulimit`.
      - Consider explicit no-limit vs configured-limit states in the CLI and requirements.
    - Expected value:
      - Fewer surprises in production-like runs.
      - Easier support for user-reported performance behavior.

15. Import-time dependency coupling makes maintenance commands brittle
    - Even after recent CLI maintenance additions, the top-level CLI module still imports most application modules before Click dispatch.
    - Optimization:
      - Push imports for ORM, Textual, TMDB, ffmpeg helpers, and descriptors behind the commands that actually need them.
    - Expected value:
      - Maintenance commands such as setup and upgrade stay usable when optional runtime dependencies are broken.
      - Better separation between media runtime code and maintenance tooling.

16. Regex and string utility cleanup
    - [`src/ffx/helper.py`](/home/osgw/.local/src/codex/ffx/src/ffx/helper.py) still emits a `SyntaxWarning` for `RICH_COLOR_PATTERN`.
    - Optimization:
      - Convert regex literals to raw strings where appropriate.
      - Review filename and TMDB substitution helpers for repeated string churn.
    - Expected value:
      - Cleaner runtime output.
      - Less warning noise during dry-run maintenance commands.

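Sketch for item 16: a regex written as a raw string, so backslash escapes reach the `re` module unchanged and Python does not warn about invalid escape sequences. The real `RICH_COLOR_PATTERN` is not shown in this scratchpad; `COLOR_TAG_PATTERN` and `strip_color_tags` below are hypothetical stand-ins.

```python
import re

# Hypothetical stand-in for RICH_COLOR_PATTERN; the real pattern is not shown here.
# In a non-raw literal, escapes such as \[ and \w are what trigger the warning.
COLOR_TAG_PATTERN = re.compile(r"\[(/?\w+)\]")


def strip_color_tags(text):
    """Remove [tag] and [/tag] style markup from `text`."""
    return COLOR_TAG_PATTERN.sub("", text)
```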
17. Database startup always runs schema creation and version checks
    - [`src/ffx/database.py`](/home/osgw/.local/src/codex/ffx/src/ffx/database.py) runs `Base.metadata.create_all(...)` and version checks every time a DB-backed context is created.
    - Optimization:
      - Measure startup cost and consider separating bootstrapping from ordinary command execution.
      - Keep schema migration/version enforcement explicit.
    - Expected value:
      - Faster command startup.
      - Clearer operational boundaries.

## Open

Use this for unresolved questions, design choices, and risks that still need a
decision.

## Sketches

Use this for rough candidate structures, names, or shapes. Keep it explicit
that these are sketches, not committed architecture.

- Should optimization work focus first on operator-perceived latency, internal maintainability, or correctness-risk cleanup that also has performance upside?
- Is the long-term supported model still “local Linux workstation plus Textual UI,” or should optimization decisions bias toward a more scriptable/headless CLI?

## Gaps Right Now

Use this for concrete missing pieces in the current repo state. This section
should describe what is absent or incomplete, not broad future ambitions.

- No explicit prioritization owner or milestone for the optimization backlog.
- No benchmark or timing harness exists for startup, probe, DB, or conversion orchestration overhead.
- Repo hygiene is still mixed with generated artifacts and some clearly unfinished files.

## Next

Use this for the immediate sequence of work. It should be short, ordered, and
biased toward the next deliverable rather than a long roadmap.

1. Triage the list into quick wins, medium refactors, and long-horizon cleanup.
2. Tackle the cheapest high-impact items first:
   - recursive `__pycache__/` ignore and cleanup,
   - regex raw-string warning cleanup,
   - `count()` plus `first()` query cleanup,
   - single-call `ffprobe` refactor.
3. Decide whether maintenance/tooling command imports should be split from media-runtime imports before adding more CLI maintenance surface.

## Delete When

Use this to define when the scratchpad should disappear. That keeps it clearly
temporary and helps prevent it from turning into shadow documentation.

## Suggested Style

- Prefer short bullets over long prose.
- Keep facts, questions, and rough sketches in separate sections.
- Add custom sections only when they help the next iteration move faster.
- Move durable outcomes out of the scratchpad once they stop being temporary.

-->

- Delete this scratchpad once the optimization backlog is either converted into issues/work items or distilled into durable project guidance.