# MVP roadmap

Last updated: 2026-07-01

See also `docs/backlog.md` for the prioritized engineering backlog, caveats, and open optimization list.

## Objective

Build an internal management workbench that turns public mobility data into a normalized, auditable, coverage-scored dataset for a future traveller-facing web/native app.

The workbench stays distinct from the public app. Its users are data engineers, analysts, and operations staff who need to ingest, inspect, link, correct, route against, and publish mobility data.

## Current prototype: implemented

The repository has moved beyond the original SQLite/Berlin prototype. The current development path is Germany-scale and PostGIS-first, while SQLite remains useful as a legacy/test fallback.

Implemented:

```text
source registry and source catalog
local source cache
job queue with job events and worker process
PostgreSQL/PostGIS runtime support with SQLite fallback
GTFS static importer for large national feeds
OSM PBF import path for Germany-scale extracts
OSM address index and address-aware journey endpoints
canonical stop/station linking from GTFS and OSM
automatic GTFS <-> OSM route matching
manual route and canonical-stop rule persistence
visual route-layer builder from OSM routes and GTFS shapes
walk/drive routing layer from OSM-derived routing graph
progressive journey-search API and UI polling
map right-click "from here" / "to here"
management UI with map, sources, stats, jobs, matches, search, and journeys
separate GTFS Harmonization and Mapping Data source modules in the UI
generic job-details overlay with phase timeline, event log, and queue snapshot
QA dashboard skeleton for source/import/link/route/publication health
GTFS harmonization concept and service-boundary decision
CLI commands
tests and syntax checks for changed modules
```

Recent fixes:

```text
PostgreSQL startup avoids unnecessary DDL when PostGIS columns/indexes already exist.
A queued route-layer rebuild can be claimed by a real worker instead of staying queued behind a stale worker PID.
Timetable routing no longer requires visual route-pattern trip links.
Walk-leg route geometry has a short-lived in-process cache.
Address search is bbox-aware without being bbox-limited.
Job rows expose a details overlay that polls job events only while open.
Journey routing consumes the active harmonized GTFS snapshot instead of a raw feed picker.
```

## Current prototype: known limits

The app can import and inspect Germany-scale OSM and GTFS, but the routing and route-layer rebuild paths are still prototype-grade.

Important limits:

```text
journey search is not yet based on RAPTOR or CSA (Connection Scan Algorithm)
address endpoints can multiply transit searches through several nearby access/egress stops
progressive transfer stages still recompute too much
route-layer rebuild is coarse-grained and rewrites derived link tables
visual route-pattern links are not yet incrementally updated
canonical stop extraction is CPU/memory heavy on national feeds
route geometry cannot yet classify temporary GTFS detours as separate variants
local-transport-only routing is not a first-class query mode
route-search caches are process-local and not persisted
Alembic migrations are still missing
```

## MVP 1: stable Germany data workbench

### Backend

- Add proper Alembic migrations for PostgreSQL and keep SQLite test support.
- Add source-run history and dataset-version comparison.
- Make route-layer rebuild incremental: update only affected matches/patterns/stops.
- Keep old route-layer tables readable while a rebuild prepares replacement rows.
- Add source health checks: download success, hash change, feed freshness, calendar validity.
- Expand the QA dashboard into drill-down review queues for source health, GTFS validation, canonical stop conflicts, route conflicts, and publication blockers.
- Add GTFS validation summary reports: service dates, route direction coverage, stop coordinate outliers, bad stop_times, missing shapes.
- Add database maintenance jobs: analyze, vacuum, stale job recovery, orphan cleanup.
- Add durable cache tables for journey stages, nearest stops, address access candidates, and common station-to-station searches.

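The four source health signals listed above (download success, hash change, feed freshness, calendar validity) could be computed per source run roughly as follows. This is a minimal sketch, not the workbench's implementation: `SourceRun`, `check_source_health`, and the seven-day freshness window are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone
import hashlib


@dataclass
class SourceRun:
    """One download attempt for a source (hypothetical shape)."""
    downloaded_ok: bool
    payload: bytes          # raw bytes of the downloaded feed
    fetched_at: datetime    # timezone-aware download time
    calendar_end: date      # last service date found in the feed


def check_source_health(run, previous_sha256, now, max_age=timedelta(days=7)):
    """Return the four health signals for one run as a dict of booleans."""
    sha256 = hashlib.sha256(run.payload).hexdigest()
    return {
        "download_success": run.downloaded_ok,
        # The feed content differs from the previously stored version.
        "hash_changed": sha256 != previous_sha256,
        # The feed was fetched recently enough (assumed 7-day window).
        "feed_fresh": now - run.fetched_at <= max_age,
        # The feed still covers today's service date.
        "calendar_valid": run.calendar_end >= now.date(),
    }


if __name__ == "__main__":
    now = datetime(2026, 7, 1, tzinfo=timezone.utc)
    run = SourceRun(True, b"feed-v2", now - timedelta(days=2), date(2026, 12, 12))
    old_hash = hashlib.sha256(b"feed-v1").hexdigest()
    print(check_source_health(run, old_hash, now))
```

Keeping the four signals separate, rather than collapsing them into one status, would let a QA review queue filter on each of them independently.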
### Routing

- Replace the demo round-expansion router with a GTFS-appropriate algorithm such as RAPTOR or CSA.
- Precompute transfer graph edges: station-internal transfers, nearby walking transfers, and access/egress stop candidates.
- Add routing profiles:

```text
fastest public transport
fewest transfers
local transport only / Deutschlandticket-like
walk only
drive
car comparison
```

- Treat access/egress walking as access legs, not as public-transport transfers.
- Add bounded hub-aware long-distance routing for city-to-city requests: local access to likely hubs, long-distance/regional trunk, local egress.
- Add arrive-by search and better stop conditions for "good enough" results.
- Add route diagnostics that explain why a route was found or pruned.

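To make the RAPTOR/CSA item above concrete, here is a minimal sketch of the earliest-arrival Connection Scan Algorithm (CSA). It is an illustration under simplifying assumptions, not the planned router: footpaths, transfer buffers, service calendars, and journey reconstruction are all left out, and the `Connection` tuple and `earliest_arrival` function are hypothetical names.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """One vehicle hop between two consecutive stops of a trip."""
    dep_stop: str
    arr_stop: str
    dep_time: int  # seconds after midnight
    arr_time: int
    trip_id: str


def earliest_arrival(connections, source, target, start_time):
    """Scan connections once in departure order; return arrival time or None."""
    inf = float("inf")
    earliest = {source: start_time}  # best known arrival time per stop
    boarded = set()                  # trips the traveller is already riding
    for c in sorted(connections, key=lambda conn: conn.dep_time):
        if c.dep_time >= earliest.get(target, inf):
            break  # later departures cannot improve the target arrival
        reachable = (
            c.trip_id in boarded
            or earliest.get(c.dep_stop, inf) <= c.dep_time
        )
        if reachable:
            boarded.add(c.trip_id)
            if c.arr_time < earliest.get(c.arr_stop, inf):
                earliest[c.arr_stop] = c.arr_time
    return earliest.get(target)


if __name__ == "__main__":
    timetable = [
        Connection("A", "B", 100, 200, "t1"),
        Connection("B", "C", 200, 300, "t1"),
        Connection("B", "C", 150, 250, "t2"),  # leaves B before t1 arrives
        Connection("C", "D", 310, 400, "t3"),
    ]
    print(earliest_arrival(timetable, "A", "D", 90))
```

The scan touches each connection at most once, which is why a flat, departure-sorted connection array is a natural precomputed input for this family of algorithms.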
### Frontend

- Add source detail page.
- Add dataset detail page.
- Add match-review queue with filters by mode, operator, country, confidence, and source scope.
- Add route detail inspection: GTFS geometry, OSM geometry, candidate matches, stops, evidence, and route-pattern provenance.
- Add canonical stop/station detail overlay.
- Add persistent rule editor.
- Add routing controls for profile, transfer buffer, avoid/prefer modes, arrive-by, via, and local-only.
- Show partial/progressive route results with clear stage labels.

### Data outputs

- GeoJSON exports for small regions.
- GeoParquet exports for analysis.
- PMTiles/vector-tile export for map display.
- Coverage CSV/API for downstream services.

## MVP 2: Europe-scale coverage map

- Use Geofabrik country/Europe extracts and reproducible OSM PBF jobs.
- Store OSM transport features, addresses, and routing graph in PostGIS.
- Generate ranked/generalized transport route layers by zoom level.
- Serve tiles with Martin or export PMTiles.
- Add coverage statuses:

```text
existing_in_osm
static_timetable_covered
live_data_covered
fare_data_covered
booking_covered
missing_static
stale_feed
restricted_license
low_confidence_match
detour_or_temporary_variant
```

- Add coverage metrics:

```text
operator coverage
route coverage
route-km coverage
stop coverage
live-data coverage
feed freshness
license confidence
booking coverage
route-layer provenance coverage
```

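The coverage statuses above map naturally onto an enumeration, and a metric such as route-km coverage can then be derived from per-route status sets. The sketch below assumes one possible definition (covered route length divided by total route length); the roadmap does not fix the formula, and `CoverageStatus` and `route_km_coverage` are hypothetical names.

```python
from enum import Enum


class CoverageStatus(str, Enum):
    """Status values copied from the roadmap list."""
    EXISTING_IN_OSM = "existing_in_osm"
    STATIC_TIMETABLE_COVERED = "static_timetable_covered"
    LIVE_DATA_COVERED = "live_data_covered"
    FARE_DATA_COVERED = "fare_data_covered"
    BOOKING_COVERED = "booking_covered"
    MISSING_STATIC = "missing_static"
    STALE_FEED = "stale_feed"
    RESTRICTED_LICENSE = "restricted_license"
    LOW_CONFIDENCE_MATCH = "low_confidence_match"
    DETOUR_OR_TEMPORARY_VARIANT = "detour_or_temporary_variant"


def route_km_coverage(routes, status):
    """Share of total route length whose status set contains `status`.

    `routes` is an iterable of (length_km, set_of_statuses) pairs.
    Assumed definition: covered km / total km, 0.0 for an empty input.
    """
    routes = list(routes)
    total_km = sum(km for km, _ in routes)
    if total_km == 0:
        return 0.0
    covered_km = sum(km for km, statuses in routes if status in statuses)
    return covered_km / total_km


if __name__ == "__main__":
    routes = [
        (10.0, {CoverageStatus.EXISTING_IN_OSM, CoverageStatus.STATIC_TIMETABLE_COVERED}),
        (30.0, {CoverageStatus.EXISTING_IN_OSM}),
    ]
    print(route_km_coverage(routes, CoverageStatus.STATIC_TIMETABLE_COVERED))
```

Modelling statuses as a set per route, rather than a single value, allows a route to be both `existing_in_osm` and `stale_feed` at the same time.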
## MVP 3: more source formats

Add importers:

```text
NeTEx
TransXChange
SIRI discovery/live endpoints
GTFS-Realtime
GBFS for shared mobility, optional
operator CSV/API adapters
```

Target data model:

```text
canonical operators
canonical stops/stations/terminals
canonical routes
route variants
trip patterns
calendar/service validity
transfers
access/egress legs
coverage observations
source evidence
manual rules
```

## MVP 4: production journey-planning dataset

- Build a canonical stop/station graph with transfer rules and transfer-time profiles.
- Generate timetable-routing input for RAPTOR/CSA.
- Add first/last-mile routing from OSM walk/drive graph.
- Add emissions factors per mode/operator/country.
- Add fare/ticket placeholders and booking/deep-link metadata.
- Add confidence and provenance to every derived route/journey.

## MVP 5: booking-readiness layer

- Track booking availability separately from timetable coverage.
- Add deep-link metadata per operator/route.
- Add partner API adapters later.
- Distinguish clearly:

```text
travel-plausible itinerary
bookable itinerary
single-interface multi-booking
protected through-ticket
```

## Recommended next implementation sprint

1. Finish route-layer rebuild resilience: incremental updates, shadow tables, and detour/provenance classification.
2. Replace or heavily optimize journey routing: precomputed transfers, hub-aware long-distance routing, local-only profile, and bounded search.
3. Add durable PostgreSQL-backed journey caches for address access, stop pairs, and repeated stage searches.
4. Add Alembic migrations and remove runtime DDL from normal request/worker startup.
5. Add route/journey diagnostics so slow or failed requests explain what was searched and pruned.
6. Add vector-tile output for route layers and large map rendering.
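The shadow-table idea in item 1 can be sketched as: build the replacement rows in a side table while readers keep using the live one, then swap names inside a single transaction. The example uses the standard-library `sqlite3` module because SQLite is the test fallback; the table names `route_links` and `route_links_next` and the column layout are hypothetical. PostgreSQL also supports transactional DDL, so a similar rename swap should be possible there, though locking behaviour would need to be checked separately.

```python
import sqlite3


def rebuild_with_shadow_table(conn, new_rows):
    """Build replacement rows in a shadow table, then swap it in atomically.

    `conn` must be opened with isolation_level=None so that BEGIN/COMMIT
    are controlled explicitly by this function.
    """
    # Phase 1: prepare the shadow table; the live table stays readable.
    conn.execute("DROP TABLE IF EXISTS route_links_next")
    conn.execute(
        "CREATE TABLE route_links_next (gtfs_route_id TEXT, osm_relation_id INTEGER)"
    )
    conn.executemany("INSERT INTO route_links_next VALUES (?, ?)", new_rows)

    # Phase 2: swap names in one transaction so readers never see a gap.
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE route_links RENAME TO route_links_old")
        conn.execute("ALTER TABLE route_links_next RENAME TO route_links")
        conn.execute("DROP TABLE route_links_old")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # the old table is left in place on failure
        raise


if __name__ == "__main__":
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.execute("CREATE TABLE route_links (gtfs_route_id TEXT, osm_relation_id INTEGER)")
    db.execute("INSERT INTO route_links VALUES ('r1', 1)")
    rebuild_with_shadow_table(db, [("r1", 10), ("r2", 20)])
    print(db.execute("SELECT * FROM route_links ORDER BY gtfs_route_id").fetchall())
```

A full swap like this is the coarse-grained variant; an incremental rebuild would instead update only the affected rows, but the same two-phase pattern keeps readers unblocked in both cases.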