fix(api)!: prevent OSM node ID truncation in match/route annotations#7518
Draft
timpara wants to merge 1 commit intoProject-OSRM:masterfrom
Draft
fix(api)!: prevent OSM node ID truncation in match/route annotations#7518timpara wants to merge 1 commit intoProject-OSRM:masterfrom
timpara wants to merge 1 commit intoProject-OSRM:masterfrom
Conversation
OSM node IDs above 2^34 were silently truncated when written into the PackedOSMIDs storage (PackedVector<OSMNodeID, 34, ...>) used for the .osrm.nbg_nodes file. The truncated values then flowed unchanged through GetOSMNodeIDOfNode into the annotations.nodes field of /match and /route responses, as well as vector tiles. Additionally, the FlatBuffers schema for Annotation.nodes used [uint], truncating any ID above 2^32 a second time on the FB wire path; the JSON path was already 64-bit clean once storage stops truncating. Changes: - Widen PackedOSMIDs from 34 to 40 bits (max ~1.1e12, decades of headroom). - Add checkPackedOSMNodeIdFits() guard at every osm_node_ids.push_back site so future overflow throws util::exception during osrm-extract rather than silently masking values. - Change Annotation.nodes FlatBuffers field from [uint] to [ulong] and fix the corresponding std::vector<uint32_t> in route_api.hpp to std::vector<uint64_t>; regenerate fbresult_generated.js accordingly. - Add unit tests for round-trip of IDs above 2^34 and for the overflow guard. - Document the 64-bit semantics of annotations.nodes in docs/http.md. This is a breaking change for two reasons: 1. The on-disk packed-vector layout for osm_node_ids changes; existing .osrm datasets must be re-extracted with the new osrm-extract. 2. The FlatBuffers Annotation.nodes wire type changes from uint32 to uint64; FB clients reading annotations.nodes must update accordingly. Fixes Project-OSRM#7069
Collaborator
|
Thanks for the contribution. Please consider adding a test that covers the changed behavior. |
Collaborator
|
This needs a formatting fix to run CI checks: |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes incorrect OSM node IDs in /route and /match annotations when IDs exceed previous packing/serialization limits, by widening internal packed storage and the FlatBuffers representation to preserve 64-bit IDs end-to-end.
Changes:
- Increase
PackedOSMIDsstorage from 34 to 40 bits and add a hard overflow guard (checkPackedOSMNodeIdFits) at write sites to fail during extraction rather than truncating. - Change FlatBuffers
Annotation.nodesfrom[uint]to[ulong]and update the RouteAPI FlatBuffers emitter and the generated JS bindings accordingly. - Add unit tests for round-tripping IDs above
2^34and for the overflow guard; update HTTP docs to reflect 64-bit node IDs (partially).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
unit_tests/util/packed_vector.cpp |
Adds regression + guard tests for PackedOSMIDs behavior with large node IDs. |
src/extractor/extractor.cpp |
Adds overflow guard call before pushing node IDs into packed storage during extraction. |
include/extractor/packed_osm_ids.hpp |
Widens packed bit-width and introduces MAX_PACKED_OSM_NODE_ID + checkPackedOSMNodeIdFits(). |
include/extractor/files.hpp |
Adds overflow guard when reading raw NBG nodes and pushing node IDs into packed storage. |
include/engine/api/route_api.hpp |
Switches FlatBuffers annotation node vector to uint64_t to avoid truncation. |
include/engine/api/flatbuffers/route.fbs |
Changes FlatBuffers schema for Annotation.nodes to [ulong]. |
features/support/fbresult_generated.js |
Regenerates JS FlatBuffers accessors/builders for 64-bit nodes (BigInt/BigUint64Array). |
docs/http.md |
Updates docs to mention 64-bit node IDs (but currently only in one place). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // 40 bits supports IDs up to ~1.1 * 10^12, providing decades of headroom. | ||
| // See https://github.com/Project-OSRM/osrm-backend/issues/7069 | ||
| inline constexpr std::size_t PACKED_OSM_ID_BITS = 40; | ||
|
|
| - `code` if the request was successful `Ok` otherwise see the service dependent and general status codes. | ||
| - `waypoints` array of `Waypoint` objects sorted by distance to the input coordinate. Each object has at least the following additional properties: | ||
| - `nodes`: Array of OpenStreetMap node ids. | ||
| - `nodes`: Array of OpenStreetMap node ids. Each id is a 64-bit unsigned integer (encoded as a JSON number for the `json` format, and as `ulong` for the `flatbuffers` format). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue
Fixes #7069. OSM node IDs above 2^34 come back wrong in /match and /route annotations.
Root cause
Two truncation bugs on the node-ID path:
PackedOSMIDspacks OSM IDs into 34 bits, so any ID at or above 2^34 is silently masked. TheBOOST_ASSERTguard is compiled out in release.Annotation.nodeswas[uint]androute_api.hppusedstd::vector<uint32_t>, truncating IDs above 2^32 again on the FB path. JSON output was already 64-bit.Changes
PackedOSMIDsfrom 34 to 40 bits (max ~1.1 * 10^12).checkPackedOSMNodeIdFits()and call it at every push site so overflow throws during osrm-extract instead of masking.Annotation.nodesto[ulong], usestd::vector<uint64_t>inroute_api.hpp, regeneratefbresult_generated.js.docs/http.md.Breaking changes
.osrm.nbg_nodeslayout changes; existing datasets must be re-extracted. Recommend bumpingpackage.jsonminor before release so the fingerprint rejects stale data.Annotation.nodeschanged from uint32 to uint64. FB clients must update accessors.Notes
./scripts/format.shif needed.OSMNodeID::value_type(uint64_t) and benefits transparently.