Opt pattern matching
68
requirements/pattern_management.md
Normal file
@@ -0,0 +1,68 @@
# Pattern Management

This file defines the behavioral contract for managing shows, patterns, and
pattern-backed filename matching.

Primary source: actual tool code in `src/ffx/`.
Secondary source: operator intent captured in task discussion.

## Scope

- The show, pattern, and track hierarchy stored in SQLite.
- The role of a pattern as a reusable normalization definition for related media files.
- Filename-driven assignment of a scanned media file to one show through one matching pattern.
- Duplicate-match handling when more than one pattern matches the same filename.

## Terms

- `show`: logical series identity such as one TV show entry in the database.
- `pattern`: regex-backed normalization definition attached to one show.
- `track`: one persisted target-track definition attached to one pattern.
- `scanned media file`: one source file currently being inspected or converted.
- `duplicate pattern match`: a filename state where more than one stored pattern matches the same scanned media file.
- `pattern-backed target schema`: the combination of one pattern's stored media tags and stored track definitions.

## Rules

- `PATTERN_MANAGEMENT-0001`: The domain model shall treat a show as the parent entity for patterns that describe distinct release families or normalization schemas for that show. A show may temporarily exist without patterns during editing or initial TUI creation.
- `PATTERN_MANAGEMENT-0002`: Each persisted pattern shall belong to exactly one show.
- `PATTERN_MANAGEMENT-0003`: The domain model shall treat a pattern as the reusable normalization definition for a series of media files expected to share the same internal track layout and materially similar stream and container metadata.
- `PATTERN_MANAGEMENT-0004`: Each persisted track definition shall belong to exactly one pattern.
- `PATTERN_MANAGEMENT-0005`: A pattern may also carry pattern-level media tags. The pattern's media tags plus its track definitions together form the pattern-backed target schema.
- `PATTERN_MANAGEMENT-0006`: A scanned media file shall resolve to at most one pattern and therefore at most one show.
- `PATTERN_MANAGEMENT-0007`: If no pattern matches a filename, the file shall remain unmatched rather than being assigned implicitly.
- `PATTERN_MANAGEMENT-0008`: If more than one pattern matches the same filename, the system shall raise a duplicate pattern match error instead of silently selecting one.
- `PATTERN_MANAGEMENT-0009`: Duplicate-match detection shall apply regardless of whether the competing patterns belong to the same show or to different shows.
- `PATTERN_MANAGEMENT-0010`: Exact duplicate pattern definitions for the same show should not create multiple persisted pattern rows.
- `PATTERN_MANAGEMENT-0011`: A persisted pattern shall define one or more tracks. Creating or retaining a zero-track pattern in the database is invalid managed state and shall be prohibited.
- `PATTERN_MANAGEMENT-0012`: A show may exist without patterns as an intermediate editing state, for example when a user creates the show first in the TUI and adds patterns later.
- `PATTERN_MANAGEMENT-0013`: Operator-facing pattern management should expose the owning show, regex pattern, stored track set, and stored media-tag set so a user can reason about matching and normalization behavior.
- `PATTERN_MANAGEMENT-0014`: Matching semantics shall be deterministic and documented. Implicit "last matching pattern wins" behavior is not acceptable released behavior.
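
The matching rules (`PATTERN_MANAGEMENT-0006` through `-0009` and `-0014`) can be illustrated with a minimal sketch. This is not the code in `src/ffx/pattern_controller.py`: `StoredPattern`, `DuplicatePatternMatchError`, and `match_filename` are hypothetical names, and the use of `re.search` and an id-ordered scan are assumptions made for illustration. The sketch returns `None` for an unmatched filename, which may differ from what the real tool returns.

```python
import re
from dataclasses import dataclass


class DuplicatePatternMatchError(Exception):
    """Raised when more than one stored pattern matches a filename."""


@dataclass(frozen=True)
class StoredPattern:
    # Hypothetical stand-in for one persisted pattern row.
    pattern_id: int
    show_id: int
    regex: str


def match_filename(filename, patterns):
    """Return the single matching pattern, None if unmatched, or raise on duplicates."""
    # Scan in id order so the result never depends on storage or iteration order.
    ordered = sorted(patterns, key=lambda p: p.pattern_id)
    # Collect every match instead of stopping at the first (or last) one.
    matches = [p for p in ordered if re.search(p.regex, filename)]
    if len(matches) > 1:
        # Applies whether the competing patterns share a show or not.
        ids = ", ".join(str(p.pattern_id) for p in matches)
        raise DuplicatePatternMatchError(
            f"{filename!r} matches more than one pattern (ids: {ids})"
        )
    return matches[0] if matches else None
```

Collecting all matches before deciding is what rules out an implicit "last matching pattern wins" outcome: the caller either gets one pattern, nothing, or an explicit error.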

## Acceptance

- A filename that matches exactly one pattern yields one matched pattern and one show identity.
- A filename that matches no pattern yields no matched pattern and an unmatched state.
- A filename that matches more than one pattern yields an explicit duplicate-match error.
- A pattern-backed target schema can be reconstructed from one pattern's stored media tags and stored track definitions.
- A show may be stored before any patterns are attached to it.
- A pattern cannot be stored or retained as a valid managed pattern unless at least one track is defined for it.
- Pattern-backed conversion never proceeds with two competing matching patterns for the same input filename.
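
The track-related acceptance criteria can be sketched in memory as follows. Everything here is illustrative: `PatternDraft`, `InvalidPatternError`, `validate_for_storage`, `delete_track`, and `target_schema` are hypothetical names, and tracks are reduced to plain strings rather than the tool's real track definitions.

```python
from dataclasses import dataclass, field


class InvalidPatternError(ValueError):
    """Raised when an operation would leave a pattern without tracks."""


@dataclass
class PatternDraft:
    # Hypothetical in-memory stand-in for one pattern, its tracks, and its media tags.
    show_id: int
    regex: str
    tracks: list = field(default_factory=list)
    media_tags: dict = field(default_factory=dict)


def validate_for_storage(pattern):
    """Reject a pattern that has no track definitions."""
    if not pattern.tracks:
        raise InvalidPatternError("a persisted pattern must define at least one track")
    return pattern


def delete_track(pattern, index):
    """Remove one track, but refuse to remove the last remaining one."""
    if len(pattern.tracks) <= 1:
        raise InvalidPatternError("cannot delete the last track of a pattern")
    return pattern.tracks.pop(index)


def target_schema(pattern):
    """Combine stored media tags and stored tracks into one target schema."""
    validate_for_storage(pattern)
    # Copies keep callers from mutating the pattern through the schema.
    return {"media_tags": dict(pattern.media_tags), "tracks": list(pattern.tracks)}
```

Guarding both the create path and the delete path matters because either one alone would still allow a zero-track pattern to appear.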

## Current Code Fit

- `src/ffx/model/show.py` implements a one-to-many `Show -> Pattern` relationship.
- `src/ffx/model/pattern.py` implements `Pattern.show_id`, a one-to-many `Pattern -> Track` relationship, a one-to-many `Pattern -> MediaTag` relationship, and a unique `(show_id, pattern)` constraint for freshly created databases.
- `src/ffx/model/track.py` implements `Track.pattern_id`, so each persisted track belongs to one pattern.
- `src/ffx/model/pattern.py` reconstructs a pattern-backed target schema through `Pattern.getMediaDescriptor(...)`, combining stored media tags and stored tracks.
- `src/ffx/file_properties.py` assumes a scanned file resolves to at most one pattern, because it stores only one `self.__pattern` and derives one `show_id` from it.
- `src/ffx/pattern_controller.py` prevents exact duplicate `(show_id, pattern)` definitions during create and update flows, and it refreshes cached compiled regexes when stored pattern expressions change.
- `src/ffx/pattern_controller.py` now enforces duplicate-match safety. `matchFilename(...)` scans deterministically, returns the match when exactly one pattern matches, returns `{}` when no pattern matches, and raises an explicit duplicate-pattern-match error when more than one pattern matches the same filename.
- The current persistence layer already aligns with the intended empty-show workflow because a show can exist without patterns.
- New pattern creation and schema replacement flows now require at least one track, and `TrackController.deleteTrack(...)` prevents deleting the last persisted track from a pattern.
- Trackless legacy rows can still exist in preexisting databases, but matching now rejects them explicitly instead of letting them participate silently.
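
The show, pattern, and track hierarchy with its unique `(show_id, pattern)` constraint can be sketched with the standard-library `sqlite3` module. This is an assumption-laden illustration, not the schema used by `src/ffx/model/`: the table names `shows`, `patterns`, and `tracks`, the `name` and `kind` columns, and the helpers `open_demo_db` and `add_pattern` are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one show has many patterns, one pattern has many tracks.
SCHEMA = """
CREATE TABLE shows (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE patterns (
    id INTEGER PRIMARY KEY,
    show_id INTEGER NOT NULL REFERENCES shows(id),
    pattern TEXT NOT NULL,
    UNIQUE (show_id, pattern)
);
CREATE TABLE tracks (
    id INTEGER PRIMARY KEY,
    pattern_id INTEGER NOT NULL REFERENCES patterns(id),
    kind TEXT NOT NULL
);
"""


def open_demo_db():
    """Create an in-memory database with foreign-key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_pattern(conn, show_id, regex, track_kinds):
    """Insert a pattern and its tracks in one transaction; return the pattern id."""
    if not track_kinds:
        raise ValueError("a pattern needs at least one track")
    # The connection context manager commits on success and rolls back on error,
    # so a failed insert never leaves a pattern row without its tracks.
    with conn:
        cur = conn.execute(
            "INSERT INTO patterns (show_id, pattern) VALUES (?, ?)",
            (show_id, regex),
        )
        pattern_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO tracks (pattern_id, kind) VALUES (?, ?)",
            [(pattern_id, kind) for kind in track_kinds],
        )
    return pattern_id
```

In this sketch a second `add_pattern` call with the same `(show_id, regex)` pair fails with `sqlite3.IntegrityError`, while a show row with no patterns remains perfectly valid.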

## Risks

- The intended "release family" meaning of a pattern is a domain assumption, not something the code verifies automatically across all files matching that pattern.
- Preexisting databases created before the newer validation rules may still contain invalid rows, so upgrade and cleanup paths should continue to treat explicit validation failures as recoverable operator signals.
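
One way an operator might locate trackless legacy rows before they surface as validation failures is a query like the one below. It is only a sketch: the `patterns` and `tracks` table names and their columns are assumptions rather than the tool's actual schema, and `find_trackless_patterns` and `make_demo_db` are hypothetical helpers.

```python
import sqlite3


def find_trackless_patterns(conn):
    """Return (id, show_id, pattern) for every pattern row that has no tracks."""
    # LEFT JOIN keeps patterns without tracks; those rows have a NULL track id.
    return conn.execute(
        """
        SELECT p.id, p.show_id, p.pattern
        FROM patterns AS p
        LEFT JOIN tracks AS t ON t.pattern_id = p.id
        WHERE t.id IS NULL
        ORDER BY p.id
        """
    ).fetchall()


def make_demo_db():
    """Build a tiny in-memory database containing one valid and one trackless pattern."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patterns (id INTEGER PRIMARY KEY, show_id INTEGER, pattern TEXT);
        CREATE TABLE tracks (id INTEGER PRIMARY KEY, pattern_id INTEGER, kind TEXT);
        INSERT INTO patterns (id, show_id, pattern) VALUES (1, 1, 'valid'), (2, 1, 'legacy');
        INSERT INTO tracks (pattern_id, kind) VALUES (1, 'video');
        """
    )
    return conn
```

Reporting such rows rather than deleting them keeps the cleanup decision with the operator.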

@@ -47,6 +47,7 @@
- per-pattern stream definitions,
- shifted-season mappings,
- internal database version properties.
- Detailed show, pattern, and duplicate-match management rules live in `requirements/pattern_management.md`.
- The system shall inspect source media using `ffprobe` and derive a structured description of container metadata and streams.
- The system shall optionally open a Textual UI to browse shows, inspect files, and create, edit, or delete shows, patterns, stream definitions, tags, and shifted-season rules.
- The system shall match filenames against stored regex patterns to decide whether an input file should inherit a target stream and metadata schema.