Alpha stage commit
220
MVP_ROADMAP.md
Normal file
@@ -0,0 +1,220 @@
# MVP roadmap

Last updated: 2026-07-01

See also `docs/backlog.md` for the prioritized engineering backlog, caveats, and open optimization list.

## Objective

Build an internal management workbench that turns public mobility data into a normalized, auditable, coverage-scored dataset for a future traveller-facing web/native app.

The workbench stays distinct from the public app. Its users are data engineers, analysts, and operations staff who need to ingest, inspect, link, correct, route against, and publish mobility data.

## Current prototype: implemented

The repository has moved beyond the original SQLite/Berlin prototype. The current development path is Germany-scale and PostGIS-first, while SQLite remains useful as a legacy/test fallback.

Implemented:

```text
source registry and source catalog
local source cache
job queue with job events and worker process
PostgreSQL/PostGIS runtime support with SQLite fallback
GTFS static importer for large national feeds
OSM PBF import path for Germany-scale extracts
OSM address index and address-aware journey endpoints
canonical stop/station linking from GTFS and OSM
automatic GTFS <-> OSM route matching
manual route and canonical-stop rule persistence
visual route-layer builder from OSM routes and GTFS shapes
walk/drive routing layer from OSM-derived routing graph
progressive journey-search API and UI polling
map right-click "from here" / "to here"
management UI with map, sources, stats, jobs, matches, search, and journeys
separate GTFS Harmonization and Mapping Data source modules in the UI
generic job-details overlay with phase timeline, event log, and queue snapshot
QA dashboard skeleton for source/import/link/route/publication health
GTFS harmonization concept and service-boundary decision
CLI commands
tests and syntax checks for changed modules
```

Recent fixes:

```text
PostgreSQL startup avoids unnecessary DDL when PostGIS columns/indexes already exist.
Queue route-layer rebuild can be claimed by a real worker instead of staying queued behind a stale worker pid.
Timetable routing no longer requires visual route-pattern trip links.
Walk-leg route geometry has a short-lived in-process cache.
Address search is bbox-aware without being bbox-limited.
Job rows expose a details overlay that polls job events only while open.
Journey routing consumes the active harmonized GTFS snapshot instead of a raw feed picker.
```
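
As an illustration of the "short-lived in-process cache" mentioned for walk-leg geometry, the sketch below shows one common shape for such a cache: entries expire after a fixed time-to-live. The class name `TTLCache`, its methods, and the 30-second TTL are assumptions made for this example, not the repository's actual implementation.

```python
import time
from typing import Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Hypothetical in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[Hashable, Tuple[float, object]] = {}

    def get(self, key: Hashable) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller recomputes the geometry.
            del self._entries[key]
            return None
        return value

    def put(self, key: Hashable, value: object) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


# Usage with a fake clock; the key is a hypothetical (from, to) coordinate pair.
now = [0.0]
cache = TTLCache(ttl_seconds=30.0, clock=lambda: now[0])
leg_key = ((52.50, 13.40), (52.51, 13.41))
cache.put(leg_key, "LINESTRING(...)")
hit = cache.get(leg_key)    # still fresh
now[0] = 31.0
miss = cache.get(leg_key)   # expired after 30 seconds
```

Such a cache is process-local by design, which is consistent with the known limit listed later that route-search caches are not persisted.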

## Current prototype: known limits

The app can import and inspect Germany-scale OSM and GTFS, but the routing and route-layer rebuild paths are still prototype-grade.

Important limits:

```text
journey search is not yet based on RAPTOR or CSA (Connection Scan Algorithm)
address endpoints can multiply transit searches through several nearby access/egress stops
progressive transfer stages still recompute too much
route-layer rebuild is coarse-grained and rewrites derived link tables
visual route-pattern links are not yet incrementally updated
canonical stop extraction is CPU/memory heavy on national feeds
route geometry cannot yet classify temporary GTFS detours as separate variants
local-transport-only routing is not a first-class query mode
route-search caches are process-local and not persisted
Alembic migrations are still missing
```

## MVP 1: stable Germany data workbench

### Backend

- Add proper Alembic migrations for PostgreSQL and keep SQLite test support.
- Add source-run history and dataset-version comparison.
- Make route-layer rebuild incremental: update only affected matches/patterns/stops.
- Keep old route-layer tables readable while a rebuild prepares replacement rows.
- Add source health checks: download success, hash change, feed freshness, calendar validity.
- Expand the QA dashboard into drill-down review queues for source health, GTFS validation, canonical stop conflicts, route conflicts, and publication blockers.
- Add GTFS validation summary reports: service dates, route direction coverage, stop coordinate outliers, bad stop_times, missing shapes.
- Add database maintenance jobs: analyze, vacuum, stale job recovery, orphan cleanup.
- Add durable cache tables for journey stages, nearest stops, address access candidates, and common station-to-station searches.
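
The source health checks listed above (download success, hash change, feed freshness, calendar validity) could be evaluated roughly as in the following sketch. The `SourceHealth` record, the `check_source_health` function, its parameters, and the 14-day freshness threshold are all assumptions for illustration; the roadmap does not define them.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceHealth:
    """Hypothetical result record for one source run."""
    download_ok: bool
    sha256: Optional[str]
    hash_changed: bool
    feed_fresh: bool
    calendar_valid: bool


def check_source_health(payload: Optional[bytes],
                        previous_sha256: Optional[str],
                        last_changed: date,
                        service_start: date,
                        service_end: date,
                        today: date,
                        max_age_days: int = 14) -> SourceHealth:
    # Download success: here simply "we received a non-empty body".
    download_ok = bool(payload)
    digest = hashlib.sha256(payload).hexdigest() if download_ok else None
    return SourceHealth(
        download_ok=download_ok,
        sha256=digest,
        # Hash change: content differs from the previously stored digest.
        hash_changed=download_ok and digest != previous_sha256,
        # Freshness: the feed content changed recently enough (assumed threshold).
        feed_fresh=(today - last_changed).days <= max_age_days,
        # Calendar validity: today falls inside the feed's service period.
        calendar_valid=service_start <= today <= service_end,
    )


old_digest = hashlib.sha256(b"feed-v1").hexdigest()
health = check_source_health(
    payload=b"feed-v2",
    previous_sha256=old_digest,
    last_changed=date(2026, 6, 25),
    service_start=date(2026, 1, 1),
    service_end=date(2026, 12, 12),
    today=date(2026, 7, 1),
)
```

Keeping the result as a flat record would make it easy to store per source run and to surface in a QA review queue.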

### Routing

- Replace the demo round-expansion router with a GTFS-appropriate algorithm such as RAPTOR or CSA.
- Precompute transfer graph edges: station-internal transfers, nearby walking transfers, and access/egress stop candidates.
- Add routing profiles:

```text
fastest public transport
fewest transfers
local transport only / Deutschlandticket-like
walk only
drive
car comparison
```

- Treat access/egress walking as access legs, not as public-transport transfers.
- Add bounded hub-aware long-distance routing for city-to-city requests: local access to likely hubs, long-distance/regional trunk, local egress.
- Add arrive-by search and better stop conditions for "good enough" results.
- Add route diagnostics that explain why a route was found or pruned.
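
To make the RAPTOR/CSA item concrete, here is a minimal sketch of the earliest-arrival variant of the Connection Scan Algorithm. It is deliberately simplified: it assumes zero minimum change time, ignores footpaths and service calendars, and uses a hypothetical `Connection` record rather than any structure from this repository.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Connection:
    """One vehicle hop between two consecutive stops of a trip (hypothetical shape)."""
    dep_stop: str
    arr_stop: str
    dep_time: int  # seconds after midnight
    arr_time: int
    trip_id: str


def earliest_arrival(connections: Iterable[Connection], source: str,
                     target: str, departure_time: int) -> Optional[int]:
    """Scan connections in departure order and relax arrival times."""
    inf = float("inf")
    earliest: Dict[str, float] = {source: departure_time}
    reached_trips: Set[str] = set()
    for c in sorted(connections, key=lambda conn: conn.dep_time):
        # Stop once no later connection can improve the target arrival.
        if earliest.get(target, inf) <= c.dep_time:
            break
        # Usable if we are already on this trip or can board at dep_stop.
        if c.trip_id in reached_trips or earliest.get(c.dep_stop, inf) <= c.dep_time:
            reached_trips.add(c.trip_id)
            if c.arr_time < earliest.get(c.arr_stop, inf):
                earliest[c.arr_stop] = c.arr_time
    result = earliest.get(target)
    return int(result) if result is not None else None


timetable = [
    Connection("A", "B", 100, 200, "t1"),
    Connection("B", "C", 210, 300, "t2"),
    Connection("A", "C", 150, 400, "t3"),
]
via_transfer = earliest_arrival(timetable, "A", "C", 90)
direct_only = earliest_arrival(timetable, "A", "C", 120)
unreachable = earliest_arrival(timetable, "A", "C", 500)
```

In a production router the connection list would be sorted once per timetable snapshot rather than per query, and precomputed transfer edges would be relaxed whenever a stop's arrival time improves.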

### Frontend

- Add source detail page.
- Add dataset detail page.
- Add match-review queue with filters by mode, operator, country, confidence, and source scope.
- Add route detail inspection: GTFS geometry, OSM geometry, candidate matches, stops, evidence, and route-pattern provenance.
- Add canonical stop/station detail overlay.
- Add persistent rule editor.
- Add routing controls for profile, transfer buffer, avoid/prefer modes, arrive-by, via, and local-only.
- Show partial/progressive route results with clear stage labels.

### Data outputs

- GeoJSON exports for small regions.
- GeoParquet exports for analysis.
- PMTiles/vector-tile export for map display.
- Coverage CSV/API for downstream services.

## MVP 2: Europe-scale coverage map

- Use Geofabrik country/Europe extracts and reproducible OSM PBF jobs.
- Store OSM transport features, addresses, and routing graph in PostGIS.
- Generate ranked/generalized transport route layers by zoom level.
- Serve tiles with Martin or export PMTiles.
- Add coverage statuses:

```text
existing_in_osm
static_timetable_covered
live_data_covered
fare_data_covered
booking_covered
missing_static
stale_feed
restricted_license
low_confidence_match
detour_or_temporary_variant
```

- Add coverage metrics:

```text
operator coverage
route coverage
route-km coverage
stop coverage
live-data coverage
feed freshness
license confidence
booking coverage
route-layer provenance coverage
```
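
One way the coverage statuses and a metric such as route-km coverage could fit together is sketched below. The enum values are taken from the status list above; the `route_km_coverage` function and its definition (share of total route length carrying a given status) are assumptions, since the roadmap does not define how each metric is computed.

```python
from enum import Enum
from typing import Iterable, Set, Tuple


class CoverageStatus(str, Enum):
    EXISTING_IN_OSM = "existing_in_osm"
    STATIC_TIMETABLE_COVERED = "static_timetable_covered"
    LIVE_DATA_COVERED = "live_data_covered"
    FARE_DATA_COVERED = "fare_data_covered"
    BOOKING_COVERED = "booking_covered"
    MISSING_STATIC = "missing_static"
    STALE_FEED = "stale_feed"
    RESTRICTED_LICENSE = "restricted_license"
    LOW_CONFIDENCE_MATCH = "low_confidence_match"
    DETOUR_OR_TEMPORARY_VARIANT = "detour_or_temporary_variant"


def route_km_coverage(routes: Iterable[Tuple[float, Set[CoverageStatus]]],
                      status: CoverageStatus) -> float:
    """Share of total route length (km) that carries the given status.

    Assumed definition for illustration: covered km divided by total km,
    and 0.0 when there are no routes at all.
    """
    total_km = 0.0
    covered_km = 0.0
    for length_km, statuses in routes:
        total_km += length_km
        if status in statuses:
            covered_km += length_km
    return covered_km / total_km if total_km else 0.0


# Illustrative values only: one 10 km route with a timetable, one 30 km route without.
sample_routes = [
    (10.0, {CoverageStatus.EXISTING_IN_OSM, CoverageStatus.STATIC_TIMETABLE_COVERED}),
    (30.0, {CoverageStatus.EXISTING_IN_OSM, CoverageStatus.MISSING_STATIC}),
]
static_share = route_km_coverage(sample_routes, CoverageStatus.STATIC_TIMETABLE_COVERED)
```

Modelling statuses as a set per route, rather than a single value, allows a route to be both `existing_in_osm` and `stale_feed` at once, which the status list seems to allow.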

## MVP 3: more source formats

Add importers:

```text
NeTEx
TransXChange
SIRI discovery/live endpoints
GTFS-Realtime
GBFS for shared mobility, optional
operator CSV/API adapters
```

Target data model:

```text
canonical operators
canonical stops/stations/terminals
canonical routes
route variants
trip patterns
calendar/service validity
transfers
access/egress legs
coverage observations
source evidence
manual rules
```
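
A small part of the target data model could be expressed as plain records, as in the sketch below. Every class and field name here is hypothetical and chosen only to show how canonical entities, route variants, and source evidence might relate; it is not a schema taken from the repository.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class SourceEvidence:
    source_id: str     # hypothetical: registry key of the contributing source
    source_ref: str    # hypothetical: ID inside that source, e.g. a GTFS stop_id
    confidence: float  # hypothetical score between 0 and 1


@dataclass
class CanonicalStop:
    stop_id: str
    name: str
    lat: float
    lon: float
    evidence: List[SourceEvidence] = field(default_factory=list)


@dataclass
class RouteVariant:
    variant_id: str
    stop_ids: List[str]                # ordered canonical stop IDs
    valid_from: Optional[date] = None  # calendar/service validity
    valid_to: Optional[date] = None
    temporary: bool = False            # e.g. a detour variant


@dataclass
class CanonicalRoute:
    route_id: str
    operator_id: str
    mode: str
    variants: List[RouteVariant] = field(default_factory=list)
    evidence: List[SourceEvidence] = field(default_factory=list)


# Illustrative values only.
stop = CanonicalStop(
    "stop:1", "Example Hbf", 52.0, 13.0,
    evidence=[SourceEvidence("gtfs-de", "de:0001", 0.95),
              SourceEvidence("osm", "node/1", 0.80)],
)
route = CanonicalRoute(
    "route:1", "operator:1", "rail",
    variants=[RouteVariant("route:1:a", ["stop:1", "stop:2"])],
)
```

Attaching evidence to each canonical entity keeps the link back to GTFS, OSM, or other importers explicit, which supports the auditability goal stated in the objective.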

## MVP 4: production journey-planning dataset

- Build a canonical stop/station graph with transfer rules and transfer-time profiles.
- Generate timetable-routing input for RAPTOR/CSA.
- Add first/last-mile routing from OSM walk/drive graph.
- Add emissions factors per mode/operator/country.
- Add fare/ticket placeholders and booking/deep-link metadata.
- Add confidence and provenance to every derived route/journey.

## MVP 5: booking-readiness layer

- Track booking availability separately from timetable coverage.
- Add deep-link metadata per operator/route.
- Add partner API adapters later.
- Distinguish clearly:

```text
travel-plausible itinerary
bookable itinerary
single-interface multi-booking
protected through-ticket
```
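
The four itinerary categories above could be derived from per-leg booking metadata, as in this sketch. The `Leg` fields and the classification rules are assumptions made for the example; the roadmap names the categories but does not define how an itinerary is assigned to one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class BookingReadiness(Enum):
    TRAVEL_PLAUSIBLE = "travel-plausible itinerary"
    BOOKABLE = "bookable itinerary"
    SINGLE_INTERFACE_MULTI_BOOKING = "single-interface multi-booking"
    PROTECTED_THROUGH_TICKET = "protected through-ticket"


@dataclass(frozen=True)
class Leg:
    """Hypothetical per-leg booking metadata."""
    bookable: bool
    booking_interface: Optional[str] = None  # where the leg can be booked
    through_ticket_id: Optional[str] = None  # shared ID if one ticket covers legs


def classify(legs: Sequence[Leg]) -> BookingReadiness:
    # Assumed rule 1: any unbookable leg means the trip is only plausible.
    if not legs or not all(leg.bookable for leg in legs):
        return BookingReadiness.TRAVEL_PLAUSIBLE
    # Assumed rule 2: one shared through-ticket covers every leg.
    tickets = {leg.through_ticket_id for leg in legs}
    if len(tickets) == 1 and None not in tickets:
        return BookingReadiness.PROTECTED_THROUGH_TICKET
    # Assumed rule 3: separate bookings, but all through the same interface.
    interfaces = {leg.booking_interface for leg in legs}
    if len(interfaces) == 1 and None not in interfaces:
        return BookingReadiness.SINGLE_INTERFACE_MULTI_BOOKING
    return BookingReadiness.BOOKABLE


mixed = classify([Leg(True, "operator-a"), Leg(False)])
separate = classify([Leg(True, "operator-a"), Leg(True, "operator-b")])
one_shop = classify([Leg(True, "portal"), Leg(True, "portal")])
through = classify([Leg(True, "portal", "ticket-1"), Leg(True, "portal", "ticket-1")])
```

Keeping this classification separate from timetable coverage matches the first item in this section: an itinerary can be fully covered by timetable data and still be only travel-plausible.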

## Recommended next implementation sprint

1. Finish route-layer rebuild resilience: incremental updates, shadow tables, and detour/provenance classification.
2. Replace or heavily optimize journey routing: precomputed transfers, hub-aware long-distance routing, local-only profile, and bounded search.
3. Add durable PostgreSQL-backed journey caches for address access, stop pairs, and repeated stage searches.
4. Add Alembic migrations and remove runtime DDL from normal request/worker startup.
5. Add route/journey diagnostics so slow or failed requests explain what was searched and pruned.
6. Add vector-tile output for route layers and large map rendering.
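
The shadow-table idea in item 1 can be sketched as follows: replacement rows are built in a side table while the live table stays readable, and the swap happens in one short transaction. The example uses SQLite from the Python standard library as a stand-in; the table names and columns are hypothetical, and a PostgreSQL version would need its own locking and index handling.

```python
import sqlite3
from typing import Iterable, Tuple


def rebuild_with_shadow_table(conn: sqlite3.Connection,
                              rows: Iterable[Tuple[str, str]]) -> None:
    """Build replacement rows in a shadow table, then swap it in atomically."""
    # Phase 1: prepare the shadow table; readers still see route_layer.
    conn.execute("DROP TABLE IF EXISTS route_layer_shadow")
    conn.execute(
        "CREATE TABLE route_layer_shadow ("
        "route_id TEXT PRIMARY KEY, pattern TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO route_layer_shadow VALUES (?, ?)", rows)

    # Phase 2: swap inside one explicit transaction.
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE route_layer RENAME TO route_layer_old")
        conn.execute("ALTER TABLE route_layer_shadow RENAME TO route_layer")
        conn.execute("DROP TABLE route_layer_old")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT above fully controls the swap transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE route_layer (route_id TEXT PRIMARY KEY, pattern TEXT NOT NULL)"
)
conn.execute("INSERT INTO route_layer VALUES ('r1', 'old-pattern')")
rebuild_with_shadow_table(conn, [("r1", "new-pattern"), ("r2", "new-pattern")])
live_rows = conn.execute(
    "SELECT route_id, pattern FROM route_layer ORDER BY route_id"
).fetchall()
```

If the swap fails, the rollback leaves the previous `route_layer` in place, so a broken rebuild does not take the map layer offline.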