# MVP roadmap

Last updated: 2026-07-01

See also `docs/backlog.md` for the prioritized engineering backlog, caveats, and open optimization list.

## Objective

Build an internal management workbench that turns public mobility data into a normalized, auditable, coverage-scored dataset for a future traveller-facing web/native app.

The workbench stays distinct from the public app. Its users are data engineers, analysts, and operations staff who need to ingest, inspect, link, correct, route against, and publish mobility data.

## Current prototype: implemented

The repository has moved beyond the original SQLite/Berlin prototype. The current development path is Germany-scale and PostGIS-first, while SQLite remains useful as a legacy/test fallback.

Implemented:

```text
source registry and source catalog
local source cache
job queue with job events and worker process
PostgreSQL/PostGIS runtime support with SQLite fallback
GTFS static importer for large national feeds
OSM PBF import path for Germany-scale extracts
OSM address index and address-aware journey endpoints
canonical stop/station linking from GTFS and OSM
automatic GTFS <-> OSM route matching
manual route and canonical-stop rule persistence
visual route-layer builder from OSM routes and GTFS shapes
walk/drive routing layer from OSM-derived routing graph
progressive journey-search API and UI polling
map right-click "from here" / "to here"
management UI with map, sources, stats, jobs, matches, search, and journeys
separate GTFS Harmonization and Mapping Data source modules in the UI
generic job-details overlay with phase timeline, event log, and queue snapshot
QA dashboard skeleton for source/import/link/route/publication health
GTFS harmonization concept and service-boundary decision
CLI commands
tests and syntax checks for changed modules
```

Recent fixes:

```text
PostgreSQL startup avoids unnecessary DDL when PostGIS columns/indexes already exist.
Queue route-layer rebuild can be claimed by a real worker instead of staying queued behind a stale worker pid.
Timetable routing no longer requires visual route-pattern trip links.
Walk-leg route geometry has a short-lived in-process cache.
Address search is bbox-aware without being bbox-limited.
Job rows expose a details overlay that polls job events only while open.
Journey routing consumes the active harmonized GTFS snapshot instead of a raw feed picker.
```

## Current prototype: known limits

The app can import and inspect Germany-scale OSM and GTFS, but the routing and route-layer rebuild paths are still prototype-grade.

Important limits:

```text
journey search is not yet RAPTOR/CSA or connection-scan based
address endpoints can multiply transit searches through several nearby access/egress stops
progressive transfer stages still recompute too much
route-layer rebuild is coarse-grained and rewrites derived link tables
visual route-pattern links are not yet incrementally updated
canonical stop extraction is CPU/memory heavy on national feeds
route geometry cannot yet classify temporary GTFS detours as separate variants
local-transport-only routing is not a first-class query mode
route-search caches are process-local and not persisted
Alembic migrations are still missing
```

## MVP 1: stable Germany data workbench

### Backend

- Add proper Alembic migrations for PostgreSQL and keep SQLite test support.
- Add source-run history and dataset-version comparison.
- Make route-layer rebuild incremental: update only affected matches/patterns/stops.
- Keep old route-layer tables readable while a rebuild prepares replacement rows.
- Add source health checks: download success, hash change, feed freshness, calendar validity.
- Expand the QA dashboard into drill-down review queues for source health, GTFS validation, canonical stop conflicts, route conflicts, and publication blockers.
- Add GTFS validation summary reports: service dates, route direction coverage, stop coordinate outliers, bad stop_times, missing shapes.
- Add database maintenance jobs: analyze, vacuum, stale job recovery, orphan cleanup.
- Add durable cache tables for journey stages, nearest stops, address access candidates, and common station-to-station searches.

### Routing

- Replace the demo round-expansion router with a GTFS-appropriate algorithm such as RAPTOR or CSA.
- Precompute transfer graph edges: station-internal transfers, nearby walking transfers, and access/egress stop candidates.
- Add routing profiles:

```text
fastest public transport
fewest transfers
local transport only / Deutschlandticket-like
walk only
drive car comparison
```

- Treat access/egress walking as access legs, not as public-transport transfers.
- Add bounded hub-aware long-distance routing for city-to-city requests: local access to likely hubs, long-distance/regional trunk, local egress.
- Add arrive-by search and better stop conditions for "good enough" results.
- Add route diagnostics that explain why a route was found or pruned.

### Frontend

- Add source detail page.
- Add dataset detail page.
- Add match-review queue with filters by mode, operator, country, confidence, and source scope.
- Add route detail inspection: GTFS geometry, OSM geometry, candidate matches, stops, evidence, and route-pattern provenance.
- Add canonical stop/station detail overlay.
- Add persistent rule editor.
- Add routing controls for profile, transfer buffer, avoid/prefer modes, arrive-by, via, and local-only.
- Show partial/progressive route results with clear stage labels.

### Data outputs

- GeoJSON exports for small regions.
- GeoParquet exports for analysis.
- PMTiles/vector-tile export for map display.
- Coverage CSV/API for downstream services.

## MVP 2: Europe-scale coverage map

- Use Geofabrik country/Europe extracts and reproducible OSM PBF jobs.
- Store OSM transport features, addresses, and routing graph in PostGIS.
- Generate ranked/generalized transport route layers by zoom level.
- Serve tiles with Martin or export PMTiles.
- Add coverage statuses:

```text
existing_in_osm
static_timetable_covered
live_data_covered
fare_data_covered
booking_covered
missing_static
stale_feed
restricted_license
low_confidence_match
detour_or_temporary_variant
```

- Add coverage metrics:

```text
operator coverage
route coverage
route-km coverage
stop coverage
live-data coverage
feed freshness
license confidence
booking coverage
route-layer provenance coverage
```

## MVP 3: more source formats

Add importers:

```text
NeTEx
TransXChange
SIRI discovery/live endpoints
GTFS-Realtime
GBFS for shared mobility, optional
operator CSV/API adapters
```

Target data model:

```text
canonical operators
canonical stops/stations/terminals
canonical routes
route variants
trip patterns
calendar/service validity
transfers
access/egress legs
coverage observations
source evidence
manual rules
```

## MVP 4: production journey-planning dataset

- Build a canonical stop/station graph with transfer rules and transfer-time profiles.
- Generate timetable-routing input for RAPTOR/CSA.
- Add first/last-mile routing from OSM walk/drive graph.
- Add emissions factors per mode/operator/country.
- Add fare/ticket placeholders and booking/deep-link metadata.
- Add confidence and provenance to every derived route/journey.

## MVP 5: booking-readiness layer

- Track booking availability separately from timetable coverage.
- Add deep-link metadata per operator/route.
- Add partner API adapters later.
- Distinguish clearly:

```text
travel-plausible itinerary
bookable itinerary
single-interface multi-booking
protected through-ticket
```

## Recommended next implementation sprint

1. Finish route-layer rebuild resilience: incremental updates, shadow tables, and detour/provenance classification.
2. Replace or heavily optimize journey routing: precomputed transfers, hub-aware long-distance routing, local-only profile, and bounded search.
3. Add durable PostgreSQL-backed journey caches for address access, stop pairs, and repeated stage searches.
4. Add Alembic migrations and remove runtime DDL from normal request/worker startup.
5. Add route/journey diagnostics so slow or failed requests explain what was searched and pruned.
6. Add vector-tile output for route layers and large map rendering.
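
For orientation on sprint item 2, the following is a minimal sketch of an earliest-arrival connection scan (CSA), one of the two algorithm families named under Routing. It is not the repository's router. The `Connection` type, the `earliest_arrival` function, the `min_transfer` parameter, and the demo data are all hypothetical names chosen for illustration, and the sketch assumes times in seconds after midnight on a single service day, with no footpaths or access/egress legs.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Connection:
    """One vehicle hop between two consecutive stops (hypothetical type)."""
    dep_stop: str
    arr_stop: str
    dep_time: int  # seconds after midnight (assumption)
    arr_time: int
    trip_id: str


def earliest_arrival(connections, source, target, start_time, min_transfer=0):
    """Scan connections once in departure order; return arrival time or None."""
    arrival = {source: start_time}  # best known arrival time per stop
    reached_trips = set()           # trips the traveller is already riding
    for c in sorted(connections, key=lambda conn: conn.dep_time):
        # Nothing departing at or after the best target arrival can improve it.
        if c.dep_time >= arrival.get(target, inf):
            break
        ready = arrival.get(c.dep_stop, inf)
        if c.dep_stop != source:
            ready += min_transfer   # changing vehicles needs a buffer
        if c.trip_id in reached_trips or ready <= c.dep_time:
            reached_trips.add(c.trip_id)
            if c.arr_time < arrival.get(c.arr_stop, inf):
                arrival[c.arr_stop] = c.arr_time
    return arrival.get(target)


# Illustrative data only: t1 runs A -> B -> C, t2 is a tight transfer at B,
# t3 is a slower direct A -> C.
demo = [
    Connection("A", "B", 100, 200, "t1"),
    Connection("B", "C", 200, 300, "t1"),
    Connection("B", "D", 250, 320, "t2"),
    Connection("A", "C", 150, 400, "t3"),
]

print(earliest_arrival(demo, "A", "C", start_time=90, min_transfer=60))
```

Because the scan is a single pass over a sorted array, precomputed transfers and access/egress candidates could be added as extra relaxation steps without changing the loop structure. Whether CSA or RAPTOR fits the harmonized GTFS snapshot better is left open by this roadmap.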