Tidy up logging and rework tests from scratch
This commit is contained in:
@@ -32,7 +32,7 @@
|
||||
## High-Level Building Blocks
|
||||
|
||||
- Frontend, CLI, API, or worker:
|
||||
- A Click-based CLI in [`src/ffx/ffx.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx.py).
|
||||
- A Click-based CLI in [`src/ffx/cli.py`](/home/osgw/.local/src/codex/ffx/src/ffx/cli.py), exposed as the `ffx` command and via `python -m ffx`.
|
||||
- A Textual terminal UI rooted in [`src/ffx/ffx_app.py`](/home/osgw/.local/src/codex/ffx/src/ffx/ffx_app.py) with screens for shows, patterns, file inspection, tracks, tags, and shifted seasons.
|
||||
- Core business logic:
|
||||
- Descriptor objects model media files, shows, and tracks.
|
||||
@@ -50,7 +50,7 @@
|
||||
- Key entities or records:
|
||||
- `Show`: canonical TV show metadata plus digit-formatting rules for generated filenames.
|
||||
- `Pattern`: regex rule tying filenames to one show and one target media schema.
|
||||
- `Track` and `TrackTag`: persisted target stream layout, codec, dispositions, audio layout, and stream-level tags.
|
||||
- `Track` and `TrackTag`: persisted target stream records, codec, dispositions, audio layout, and stream-level tags. Detailed source-to-target mapping rules live in `requirements/subtrack_mapping.md`.
|
||||
- `MediaTag`: persisted container-level metadata for a pattern.
|
||||
- `ShiftedSeason`: mapping from source numbering ranges to adjusted season and episode numbers.
|
||||
- `Property`: internal key-value storage currently used for database versioning.
|
||||
@@ -63,7 +63,6 @@
|
||||
- Only supported media-file extensions are accepted for conversion.
|
||||
- Stored database version must match the runtime-required version.
|
||||
- A normalized descriptor may have at most one default and one forced stream per relevant track type.
|
||||
- Stored target tracks must refer to valid source tracks of matching types.
|
||||
- Shifted-season ranges are intended not to overlap for the same show and season.
|
||||
- TMDB lookups require a show ID and season and episode numbers.
|
||||
- Error-handling approach:
|
||||
|
||||
74
requirements/subtrack_mapping.md
Normal file
74
requirements/subtrack_mapping.md
Normal file
@@ -0,0 +1,74 @@
|
||||
# Subtrack Mapping
|
||||
|
||||
This file defines the behavioral contract for mapping input subtracks to output
|
||||
subtracks during conversion.
|
||||
|
||||
Primary source: actual tool code in `src/ffx/`.
|
||||
Secondary source: `tests/legacy/`, used only to clarify intent and reveal gaps.
|
||||
|
||||
## Scope
|
||||
|
||||
- Ensuring each target subtrack is created from the corresponding source-subtrack information, including stream-level metadata.
|
||||
- Mapping input streams to output streams during conversion.
|
||||
- Using persisted pattern-track definitions from the database as the target schema.
|
||||
- Allowing omission and reordering of retained tracks.
|
||||
- Keeping stream-level metadata attached to the correct source-derived logical track after remapping.
|
||||
- Normalizing target output into ordered track groups: video, audio, subtitle, then special types such as fonts or images.
|
||||
|
||||
## Terms
|
||||
|
||||
- `source_index`: identity of the originating input stream from ffprobe or an imported source descriptor.
|
||||
- `index`: final output-track order across all retained tracks.
|
||||
- `sub_index`: per-type position within the retained tracks of one type, for example audio stream `0` or subtitle stream `1`.
|
||||
- `target schema`: stored or constructed output-track definition that decides which tracks are kept, omitted, reordered, and rewritten.
|
||||
- `separate source file`: additional file bound to one target track slot whose media payload replaces the regular source payload for that slot.
|
||||
|
||||
## Rules
|
||||
|
||||
- `SUBTRACK_MAPPING-0001`: The system shall represent source-stream identity separately from output order. `source_index`, `index`, and `sub_index` are distinct concepts and shall not be collapsed into one field.
|
||||
- `SUBTRACK_MAPPING-0002`: The system shall derive `source_index` for probed tracks from the original ffprobe stream index and preserve that identity through conversion planning.
|
||||
- `SUBTRACK_MAPPING-0003`: Pattern-backed track definitions stored in the database shall persist both target output order and originating source-stream identity.
|
||||
- `SUBTRACK_MAPPING-0004`: When a filename matches a pattern, the pattern target schema shall be the source of truth for which source tracks are retained, which are omitted, and in what order retained tracks appear in the output.
|
||||
- `SUBTRACK_MAPPING-0005`: A target track may refer only to an existing source track of the same type. Conversion shall fail fast when a target track refers to a nonexistent source stream or a source stream of a different type.
|
||||
- `SUBTRACK_MAPPING-0006`: The ffmpeg mapping phase shall be generated from target output order while resolving each retained output track back to its originating source stream via `source_index`.
|
||||
- `SUBTRACK_MAPPING-0007`: Reordering and omission shall preserve logical track identity. Stream-level metadata, titles, languages, and disposition decisions shall stay attached to the correct source-derived logical track after mapping.
|
||||
- `SUBTRACK_MAPPING-0008`: The system shall support one-off CLI stream-order overrides without requiring prior database edits.
|
||||
- `SUBTRACK_MAPPING-0009`: Operator-facing inspection and editing surfaces shall expose enough source-versus-target information to let a user reason about subtrack mapping decisions.
|
||||
- `SUBTRACK_MAPPING-0010`: Test coverage for subtrack mapping shall assert source-derived identity, omission, and output order explicitly. Final track counts or final type sequences alone are insufficient proof of correct mapping.
|
||||
- `SUBTRACK_MAPPING-0011`: Retained target tracks shall appear in ordered groups: video track or tracks first, then audio tracks, then subtitle tracks, then special types such as fonts or images. Within each group, the target schema shall define the order.
|
||||
- `SUBTRACK_MAPPING-0012`: Track omission is valid when required by output compatibility, when needed to normalize source tracks into the required target group order and schema, or when explicitly requested by database rules or CLI options.
|
||||
- `SUBTRACK_MAPPING-0013`: If source tracks do not already comply with the required target group order, conversion shall reorder retained tracks to match the target ordering contract without losing source-track identity or stream-level metadata lineage.
|
||||
|
||||
## Separate Additional Source Files
|
||||
|
||||
- `SUBTRACK_MAPPING-0014`: A separate source file may substitute the media payload of one target subtrack without changing that target track's intended output position.
|
||||
- `SUBTRACK_MAPPING-0015`: When a separate source file is used, the target track shall remain bound to the corresponding logical source track for mapping, validation, and metadata lineage.
|
||||
- `SUBTRACK_MAPPING-0016`: Metadata for a substituted target track shall be merged from the regular source track and the separate source file when available.
|
||||
- `SUBTRACK_MAPPING-0017`: If the separate source file provides a metadata field that is also present on the regular source track, the separate source file value shall win in the target output.
|
||||
- `SUBTRACK_MAPPING-0018`: If a metadata field is absent from the separate source file, the system shall fall back to the corresponding metadata from the regular source track or target schema rewrite rules.
|
||||
|
||||
## Acceptance
|
||||
|
||||
- Given a source media descriptor and a pattern-backed target schema, the planned output tracks can be listed in final output order and each retained track can still be traced to one originating source stream.
|
||||
- Planned output order follows grouped target order: video, audio, subtitle, then special types.
|
||||
- Tracks not referenced by the target schema are omitted from output mapping.
|
||||
- Tracks may also be omitted when they are incompatible with the chosen output format or explicitly excluded by database or CLI rules.
|
||||
- Two retained target tracks never originate from the same source stream unless duplication is implemented explicitly as a separate feature.
|
||||
- If target-track metadata is rewritten after reordering, it is written onto the correct source-derived logical track rather than the track that merely occupies the same final output position.
|
||||
- Invalid target-to-source references fail deterministically before the conversion job is launched.
|
||||
- If a separate source file substitutes one target track, that track keeps its target slot and ordering while metadata is merged with separate-file values taking precedence when both sides provide the same field.
|
||||
- A test proving subtrack mapping must assert at least one of: exact `source_index` to output-order mapping, omission of named source tracks, or preservation of per-track metadata after reorder.
|
||||
|
||||
## Test Notes
|
||||
|
||||
- `tests/legacy/scenario.py` names pattern behavior as `Filter/Reorder Tracks`.
|
||||
- `tests/legacy/scenario_4.py` is the strongest end-to-end signal because it runs DB-backed conversion and reapplies source indices before assertion.
|
||||
- `tests/legacy/track_tag_combinator_2_0.py` and `tests/legacy/track_tag_combinator_3_4.py` sort result tracks by `source_index` before checking tags, which matches the intended identity model.
|
||||
- Legacy permutation combinators define permutations but their assertion functions are stubs.
|
||||
- Some legacy scenarios produce `AP` and `SP` selectors but do not execute them.
|
||||
|
||||
## Risks
|
||||
|
||||
- `src/ffx/media_descriptor.py` contains an explicit `rearrangeTrackDescriptors()` path whose current implementation appears defective and under-tested.
|
||||
- Separate-source-file metadata precedence is only partly expressed in current implementation paths and should be covered directly in the rewritten test suite.
|
||||
- Production code expresses the mapping contract more clearly than the legacy harness, so a rewrite should add direct logic-level tests for mapping and reorder planning.
|
||||
130
requirements/tests.md
Normal file
130
requirements/tests.md
Normal file
@@ -0,0 +1,130 @@
|
||||
# Test Rewrite
|
||||
|
||||
This file captures the structure executed by `tests/legacy_runner.py` today and
|
||||
defines the target shape for a complete rewrite.
|
||||
|
||||
Detailed product rules for source-to-target subtrack mapping live in
|
||||
`requirements/subtrack_mapping.md`. This file describes only how tests cover
|
||||
that area.
|
||||
|
||||
## Current Harness
|
||||
|
||||
- Entrypoint: `python tests/legacy_runner.py run`
|
||||
- Runner style: custom Click CLI, not `pytest` or `unittest`
|
||||
- Commands:
|
||||
- `run`: discover scenario files, instantiate each scenario, run yielded jobs
|
||||
- `dupe`: helper command that creates duplicate media fixtures; not part of the test run
|
||||
- Filters: `--scenario`, `--variant`, `--limit`
|
||||
- Shared context:
|
||||
- builds one mutable dict for the whole run
|
||||
- installs loggers and writes `ffx_test_report.log`
|
||||
- creates `ConfigurationController` eagerly
|
||||
- tracks only passed and failed counters
|
||||
- Discovery:
|
||||
- scenario files: `tests/legacy/scenario_*.py`
|
||||
- combinators: `glob + importlib + inspect` by filename convention
|
||||
- ordering: implicit glob order, no explicit sorting
|
||||
- Skip behavior:
|
||||
- Scenario 4 is skipped when `TMDB_API_KEY` is missing
|
||||
- only `TMDB_API_KEY_NOT_PRESENT_EXCEPTION` is caught at scenario construction time
|
||||
|
||||
## Current Scenarios
|
||||
|
||||
- `1`: `tests/legacy/scenario_1.py`
|
||||
- focus: basename generation without pattern lookup or TMDB
|
||||
- inputs per job: `1`
|
||||
- jobs: `140`
|
||||
- expected failures: `0`
|
||||
- execution: build one synthetic source file, run `python -m ffx convert`, assert filename selectors only
|
||||
- selectors executed: `B`, `L`, `I`
|
||||
- selectors defined but not executed: `S`, `R`
|
||||
- `2`: `tests/legacy/scenario_2.py`
|
||||
- focus: conversion matrix over media layouts, dispositions, tags, and permutations
|
||||
- inputs per job: `1`
|
||||
- jobs: `8193`
|
||||
- expected failures: `3267`
|
||||
- execution: build one synthetic source file, run `python -m ffx convert`, probe result with `FileProperties`, assert track layout and selected audio and subtitle metadata
|
||||
- selectors executed: `M`, `AD`, `AT`, `SD`, `ST`
|
||||
- selectors defined but not executed: `MT`, `AP`, `SP`, `J`
|
||||
- `4`: `tests/legacy/scenario_4.py`
|
||||
- focus: pattern-driven batch conversion with SQLite state and live TMDB naming
|
||||
- inputs per job: `6`
|
||||
- jobs: `768`
|
||||
- expected failures: `336`
|
||||
- execution: build six synthetic preset files, recreate temp SQLite DB, insert show and pattern, run one batch convert command, query TMDB during assertions
|
||||
- selectors executed: `M`, `AD`, `AT`, `SD`, `ST`
|
||||
- selectors defined but not executed: `MT`, `AP`, `SP`, `J`
|
||||
- notes:
|
||||
- uses `MediaCombinator6` only
|
||||
- issues live HTTP requests through `TmdbController` with no request cache
|
||||
|
||||
## Current Combinator Families
|
||||
|
||||
- scenario files discovered: `3`
|
||||
- basename combinators discovered: `2`
|
||||
- media combinators discovered: `8`
|
||||
- media tag combinators discovered: `3`
|
||||
- disposition combinator 2 variants: `4`
|
||||
- disposition combinator 3 variants: `5`
|
||||
- track tag combinator 2 variants: `4`
|
||||
- track tag combinator 3 variants: `5`
|
||||
- indicator variants: `7`
|
||||
- label variants: `2`
|
||||
- show variants: `3`
|
||||
- release variants: `3`
|
||||
- permutation 2 variants: `2`
|
||||
- permutation 3 variants: `3`
|
||||
|
||||
## Current Totals
|
||||
|
||||
- full run without TMDB: `8333`
|
||||
- full run with TMDB: `9101`
|
||||
- Scenario 4 generated source files: `4608`
|
||||
- Scenario 4 live TMDB episode queries: `4608`
|
||||
|
||||
## Current Behavior Areas
|
||||
|
||||
- output basename rules for label, season and episode indicator, show name, and release suffix combinations
|
||||
- track layout normalization across the eight media combinator shapes from `VA` through `VAASSS`
|
||||
- two-track and three-track disposition edge cases, including intentional failure cases
|
||||
- two-track and three-track track-tag preservation checks, including checks that sort results by source identity
|
||||
- container-level media tag handling
|
||||
- pattern-backed conversion against a temporary SQLite database
|
||||
- TMDB-assisted episode naming for batch conversion
|
||||
|
||||
## Structural Findings
|
||||
|
||||
- The suite is process-heavy: most jobs run `ffmpeg` to generate a fixture and then spawn the FFX CLI as a subprocess.
|
||||
- The suite is integration-first and has almost no isolated unit-level coverage for pure logic.
|
||||
- The base `Combinator` class is a placeholder and is not the real abstraction boundary used by the suite.
|
||||
- Many combinator methods are placeholders: there are `25` `pass` statements across the current test modules.
|
||||
- Several assertion families are never executed because scenario selector dispatch is incomplete.
|
||||
- Scenario comments mention a Scenario 3, but no `scenario_3.py` exists.
|
||||
- `tests/legacy/_basename_combinator_1.py` is effectively orphaned because discovery only matches `basename_combinator_*.py`.
|
||||
- `tests/legacy/disposition_combinator_2_3 .py` contains an embedded space in the filename and is still part of discovery.
|
||||
- Expected failures are validated only as subprocess return-code matches, not as specific error types or messages.
|
||||
- The current suite depends on `ffmpeg`, `ffprobe`, SQLite, the local Python environment, and for Scenario 4 a live TMDB API key plus network access.
|
||||
|
||||
## Rewrite Target
|
||||
|
||||
- Replace the custom Click harness with a standard test runner, preferably `pytest`.
|
||||
- Split the suite into explicit layers: unit, integration, and optional external-system tests.
|
||||
- Keep unit tests as the default path and make them runnable without `ffmpeg`, `ffprobe`, TMDB, or a user config directory.
|
||||
- Model discovery explicitly in code instead of relying on glob-plus-reflection naming conventions.
|
||||
- Convert the current Cartesian-product combinators into readable parametrized cases grouped by behavior area.
|
||||
- Preserve the current behavior areas, but represent them with targeted cases instead of thousands of opaque variant IDs.
|
||||
- Make every assertion family explicit and executable; there must be no selector that is produced but never consumed.
|
||||
- Replace live TMDB access with fixtures or mocks in normal runs; any live-contract test must be opt-in.
|
||||
- Replace ad hoc subprocess return-code checks with assertions on typed exceptions, stderr content, or structured outputs.
|
||||
- Provide small reusable media fixtures or fixture builders so only a narrow integration slice needs `ffmpeg`-generated media.
|
||||
- Make database tests self-contained and fast through temporary databases and direct controller-level assertions.
|
||||
- Make ordering, naming, and selection deterministic so a contributor can predict exactly what will run.
|
||||
- Expose a small smoke suite for quick local runs and CI, plus a separately marked slower integration suite.
|
||||
- Prefer domain-oriented test modules over combinator-family modules: basename, pattern matching, metadata rewrite, track ordering, TMDB naming, CLI smoke, and failure handling.
|
||||
|
||||
## Rewrite Acceptance
|
||||
|
||||
- A default local test run finishes quickly and without network access.
|
||||
- A contributor can identify which behavior a failing test covers without decoding variant strings like `VAASSS-A:D10-S:T001`.
|
||||
- All current intended failure behaviors remain covered, but each one is asserted directly and readably.
|
||||
- The rewritten suite can be adopted by CI without requiring live TMDB credentials.
|
||||
Reference in New Issue
Block a user