Skip to content

fix(api)!: prevent OSM node ID truncation in match/route annotations#7518

Draft
timpara wants to merge 1 commit intoProject-OSRM:masterfrom
timpara:fix/osm-node-id-overflow-7069
Draft

fix(api)!: prevent OSM node ID truncation in match/route annotations#7518
timpara wants to merge 1 commit intoProject-OSRM:masterfrom
timpara:fix/osm-node-id-overflow-7069

Conversation

@timpara
Copy link
Copy Markdown

@timpara timpara commented May 1, 2026

Issue

Fixes #7069. OSM node IDs above 2^34 come back wrong in /match and /route annotations.

Root cause

Two truncation bugs on the node-ID path:

  1. Storage: PackedOSMIDs packs OSM IDs into 34 bits, so any ID at or above 2^34 is silently masked. The BOOST_ASSERT guard is compiled out in release.
  2. FlatBuffers: Annotation.nodes was [uint] and route_api.hpp used std::vector<uint32_t>, truncating IDs above 2^32 again on the FB path. JSON output was already 64-bit.

Changes

  • Widen PackedOSMIDs from 34 to 40 bits (max ~1.1 * 10^12).
  • Add checkPackedOSMNodeIdFits() and call it at every push site so overflow throws during osrm-extract instead of masking.
  • Change FB Annotation.nodes to [ulong], use std::vector<uint64_t> in route_api.hpp, regenerate fbresult_generated.js.
  • Add unit tests for round-trip above 2^34 and the overflow guard.
  • Update docs/http.md.

Breaking changes

  1. On-disk: .osrm.nbg_nodes layout changes; existing datasets must be re-extracted. Recommend bumping package.json minor before release so the fingerprint rejects stale data.
  2. FB wire: Annotation.nodes changed from uint32 to uint64. FB clients must update accessors.

Notes

  • clang-format was not available in my environment; please run ./scripts/format.sh if needed.
  • No cucumber feature added: the framework auto-assigns small IDs and reworking it for huge IDs is out of scope. Unit tests cover the storage regression; the FB change is exercised by the existing FB cucumber matrix.
  • Vector tile output already uses OSMNodeID::value_type (uint64_t) and benefits transparently.

OSM node IDs above 2^34 were silently truncated when written into the
PackedOSMIDs storage (PackedVector<OSMNodeID, 34, ...>) used for the
.osrm.nbg_nodes file. The truncated values then flowed unchanged through
GetOSMNodeIDOfNode into the annotations.nodes field of /match and /route
responses, as well as vector tiles.

Additionally, the FlatBuffers schema for Annotation.nodes used [uint],
truncating any ID above 2^32 a second time on the FB wire path; the JSON
path was already 64-bit clean once storage stops truncating.

Changes:
- Widen PackedOSMIDs from 34 to 40 bits (max ~1.1e12, decades of headroom).
- Add checkPackedOSMNodeIdFits() guard at every osm_node_ids.push_back
  site so future overflow throws util::exception during osrm-extract
  rather than silently masking values.
- Change Annotation.nodes FlatBuffers field from [uint] to [ulong] and
  fix the corresponding std::vector<uint32_t> in route_api.hpp to
  std::vector<uint64_t>; regenerate fbresult_generated.js accordingly.
- Add unit tests for round-trip of IDs above 2^34 and for the overflow
  guard.
- Document the 64-bit semantics of annotations.nodes in docs/http.md.

This is a breaking change for two reasons:
1. The on-disk packed-vector layout for osm_node_ids changes; existing
   .osrm datasets must be re-extracted with the new osrm-extract.
2. The FlatBuffers Annotation.nodes wire type changes from uint32 to
   uint64; FB clients reading annotations.nodes must update accordingly.

Fixes Project-OSRM#7069
@timpara timpara marked this pull request as draft May 1, 2026 09:59
@DennisOSRM
Copy link
Copy Markdown
Collaborator

Thanks for the contribution. Please consider adding a test that covers the changed behavior.

@DennisOSRM
Copy link
Copy Markdown
Collaborator

This needs a formatting fix to run CI checks:

diff --git a/unit_tests/util/packed_vector.cpp b/unit_tests/util/packed_vector.cpp
index 0762c9b..99425bc 100644
--- a/unit_tests/util/packed_vector.cpp
+++ b/unit_tests/util/packed_vector.cpp
@@ -215,13 +215,13 @@ BOOST_AUTO_TEST_CASE(packed_osm_ids_above_2pow34_roundtrip)
     const std::vector<std::uint64_t> inputs = {
         0ull,
         1ull,
-        (1ull << 33),                                          // 2^33
-        (1ull << 34),                                          // 2^34, the old truncation boundary
-        (1ull << 34) + 1,                                      // just above 2^34
-        20'000'000'000ull,                                     // realistic projected OSM node ID
-        (1ull << 35),                                          // 2^35
-        osrm::extractor::MAX_PACKED_OSM_NODE_ID - 1,           // just below the new limit
-        osrm::extractor::MAX_PACKED_OSM_NODE_ID,               // exactly at the new limit
+        (1ull << 33),                                // 2^33
+        (1ull << 34),                                // 2^34, the old truncation boundary
+        (1ull << 34) + 1,                            // just above 2^34
+        20'000'000'000ull,                           // realistic projected OSM node ID
+        (1ull << 35),                                // 2^35
+        osrm::extractor::MAX_PACKED_OSM_NODE_ID - 1, // just below the new limit
+        osrm::extractor::MAX_PACKED_OSM_NODE_ID,     // exactly at the new limit
     };
 
     for (auto raw : inputs)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes incorrect OSM node IDs in /route and /match annotations when IDs exceed previous packing/serialization limits, by widening internal packed storage and the FlatBuffers representation to preserve 64-bit IDs end-to-end.

Changes:

  • Increase PackedOSMIDs storage from 34 to 40 bits and add a hard overflow guard (checkPackedOSMNodeIdFits) at write sites to fail during extraction rather than truncating.
  • Change FlatBuffers Annotation.nodes from [uint] to [ulong] and update the RouteAPI FlatBuffers emitter and the generated JS bindings accordingly.
  • Add unit tests for round-tripping IDs above 2^34 and for the overflow guard; update HTTP docs to reflect 64-bit node IDs (partially).

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
unit_tests/util/packed_vector.cpp Adds regression + guard tests for PackedOSMIDs behavior with large node IDs.
src/extractor/extractor.cpp Adds overflow guard call before pushing node IDs into packed storage during extraction.
include/extractor/packed_osm_ids.hpp Widens packed bit-width and introduces MAX_PACKED_OSM_NODE_ID + checkPackedOSMNodeIdFits().
include/extractor/files.hpp Adds overflow guard when reading raw NBG nodes and pushing node IDs into packed storage.
include/engine/api/route_api.hpp Switches FlatBuffers annotation node vector to uint64_t to avoid truncation.
include/engine/api/flatbuffers/route.fbs Changes FlatBuffers schema for Annotation.nodes to [ulong].
features/support/fbresult_generated.js Regenerates JS FlatBuffers accessors/builders for 64-bit nodes (BigInt/BigUint64Array).
docs/http.md Updates docs to mention 64-bit node IDs (but currently only in one place).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// 40 bits supports IDs up to ~1.1 * 10^12, providing decades of headroom.
// See https://github.com/Project-OSRM/osrm-backend/issues/7069
inline constexpr std::size_t PACKED_OSM_ID_BITS = 40;

Comment thread docs/http.md
- `code` if the request was successful `Ok` otherwise see the service dependent and general status codes.
- `waypoints` array of `Waypoint` objects sorted by distance to the input coordinate. Each object has at least the following additional properties:
- `nodes`: Array of OpenStreetMap node ids.
- `nodes`: Array of OpenStreetMap node ids. Each id is a 64-bit unsigned integer (encoded as a JSON number for the `json` format, and as `ulong` for the `flatbuffers` format).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Node IDs > 2^34 are reported by route matching incorrect.

3 participants