Fix #375: chunked vector loading uses globally-unique primary keys by idevasena · Pull Request #387 · mlcommons/storage

idevasena · 2026-05-26T03:30:12Z

Fix #375: chunked vector loading uses globally-unique primary keys

Closes #375.

Problem

In a 1M-vector dry-run on a single Gen5 NVMe, vdb_benchmark reported mean recall@10 = 0.0090. The mlps_1m_1shards_1536dim_uniform_flat_gt ground-truth collection held only 10,000 vectors (1% of the source collection), so almost every PK returned by the ANN search was missing from the GT set, and the per-query recall ratio, set_intersection / k, collapsed.
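
As a rough illustration of why the metric collapses, here is a minimal sketch of a recall@k ratio of the kind described above. The helper name `recall_at_k` and the ID lists are made up for illustration; the benchmark's actual implementation is not shown in this PR description.

```python
def recall_at_k(ann_ids, gt_ids, k):
    """Size of the overlap between the top-k ANN IDs and top-k GT IDs, divided by k."""
    return len(set(ann_ids[:k]) & set(gt_ids[:k])) / k


# Illustrative IDs only: the two result lists share a single primary key (12).
ann_hits = [12, 4051, 1234, 2500, 3770, 5123, 6400, 7777, 9012, 9999]
gt_hits = [12, 88, 431, 1002, 2777, 3914, 5050, 6123, 8400, 9998]

print(recall_at_k(ann_hits, gt_hits, k=10))  # 0.1
```

When the ground-truth collection is built from only a small slice of the source, most queries end up with little or no overlap, which drags the mean toward zero.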

Root cause

load_vdb.insert_data() built each batch's primary keys as

ids = list(range(batch_start, batch_end))

where batch_start / batch_end were the chunk-local indices. When num_vectors > chunk_size, the caller in main() invokes insert_data once per generated chunk and passes only that chunk's vectors. With chunk_size = 10_000, every chunk therefore inserted IDs 0..9_999, i.e. all 100 chunks collided on the same 10 000 primary keys.
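
The collision can be reproduced without Milvus. The sketch below only models the ID arithmetic described above; `chunk_local_ids` is a hypothetical stand-in for the pre-fix loop, not the code in load_vdb.py.

```python
def chunk_local_ids(num_vectors, chunk_size, batch_size):
    """Yield the primary keys a chunk-local ID scheme produces, chunk by chunk."""
    for chunk_start in range(0, num_vectors, chunk_size):
        chunk_len = min(chunk_size, num_vectors - chunk_start)
        # Each call only sees one chunk, so batch indices restart at 0 every time.
        for batch_start in range(0, chunk_len, batch_size):
            batch_end = min(batch_start + batch_size, chunk_len)
            yield from range(batch_start, batch_end)


ids = list(chunk_local_ids(num_vectors=1_000_000, chunk_size=10_000, batch_size=5_000))
print(len(ids), len(set(ids)), max(ids))  # 1000000 10000 9999
```

One million rows are produced, but only 10,000 distinct keys, and the largest key is chunk_size - 1.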

- The main collection's num_entities still reports 1 000 000 because Milvus counts physical rows, not distinct PKs, which masked the bug during loading.
- enhanced_bench.create_flat_collection() copies the source via query_iterator(), which deduplicates by PK, so the FLAT collection only ever sees the 10 000 unique IDs.
- A second, smaller bug in enhanced_bench.py hardcoded the final copy-progress line to (100.0%), hiding the discrepancy in the logs (Copied 10000/1000000 vectors (100.0%) in the original report).
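
To make the gap between physical rows and distinct PKs concrete, here is a small sketch that models PK deduplication with a plain Python set and prints the progress line with a computed percentage. It is an approximation of the effect only; it does not call Milvus or query_iterator().

```python
chunk_size, num_chunks = 10_000, 100

# Every chunk re-uses PKs 0..chunk_size-1, as in the bug described above.
rows = [pk for _ in range(num_chunks) for pk in range(chunk_size)]

physical_rows = len(rows)      # what a row count reports
distinct_pks = len(set(rows))  # what survives a copy that deduplicates by PK

pct = 100.0 * distinct_pks / physical_rows
print(f"Copied {distinct_pks}/{physical_rows} vectors ({pct:.1f}%)")
# Copied 10000/1000000 vectors (1.0%)
```

With the percentage computed rather than hardcoded, the log line itself would have exposed the 1% coverage.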

Fix

| File | Change |
| --- | --- |
| `vdb_benchmark/vdbbench/load_vdb.py` | `insert_data()` takes a new `start_id` (default `0`, preserves legacy single-chunk behavior). IDs are now `range(start_id + batch_start, start_id + batch_end)`. |
| `vdb_benchmark/vdbbench/load_vdb.py` | `main()` threads a running `global_id_offset` through the chunked-generation loop and passes it as `start_id` on every `insert_data` call. The `else` (single-chunk) branch passes `start_id=0` explicitly for clarity. |
| `vdb_benchmark/vdbbench/enhanced_bench.py` | Replace hardcoded `(100.0%)` with the real percentage in `create_flat_collection()`. |
| `vdb_benchmark/vdbbench/enhanced_bench.py` | New coverage guard: if the FLAT collection holds <99% of `source_coll.num_entities`, abort with a clear pointer to issue #375 instead of silently producing meaningless recall numbers. |
| `vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py` | New regression suite (10 tests) covering the `start_id` offset, the three-chunk scenario from the bug report, uneven final chunks, batch sizes larger than the chunk, and the coverage-threshold parametrization. |
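
The changes in the table can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in the diff: `batch_id_ranges`, `load_in_chunks`, and `check_flat_coverage` are hypothetical names, and only `start_id`, `global_id_offset`, and the 99% threshold come from the description.

```python
def batch_id_ranges(chunk_len, batch_size, start_id=0):
    """Per-batch ID ranges for one chunk, shifted by start_id (0 keeps legacy IDs)."""
    ranges = []
    for batch_start in range(0, chunk_len, batch_size):
        batch_end = min(batch_start + batch_size, chunk_len)
        ranges.append(range(start_id + batch_start, start_id + batch_end))
    return ranges


def load_in_chunks(num_vectors, chunk_size, batch_size):
    """Collect every ID that a chunked load would insert, using a running offset."""
    all_ids = []
    global_id_offset = 0
    while global_id_offset < num_vectors:
        chunk_len = min(chunk_size, num_vectors - global_id_offset)
        for id_range in batch_id_ranges(chunk_len, batch_size, start_id=global_id_offset):
            all_ids.extend(id_range)
        global_id_offset += chunk_len  # the next chunk starts where this one ended
    return all_ids


def check_flat_coverage(flat_count, source_count, threshold=0.99):
    """Raise if the FLAT copy holds less than `threshold` of the source rows."""
    if flat_count < threshold * source_count:
        raise RuntimeError(
            f"FLAT collection has {flat_count}/{source_count} vectors; "
            "possible duplicate primary keys (see issue #375)"
        )


# Uneven final chunk and a batch size that does not divide the chunk evenly.
ids = load_in_chunks(num_vectors=25_000, chunk_size=10_000, batch_size=4_000)
print(len(ids), len(set(ids)), min(ids), max(ids))  # 25000 25000 0 24999

check_flat_coverage(995_000, 1_000_000)  # within the threshold, no error
```

Keeping `start_id` at a default of `0` means a caller that inserts a single chunk gets the same IDs as before, which matches the legacy behavior the description says is preserved.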

Testing

smrc@dskbd029:~/Storage_Repo_Tests/storage_vdb375$ uv sync --extra vectordb --extra test
Resolved 98 packages in 1ms
Checked 98 packages in 1ms
smrc@dskbd029:~/Storage_Repo_Tests/storage_vdb375$ uv run pytest vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py \
              vdb_benchmark/tests/tests/test_load_vdb.py -v
================================================================ test session starts =================================================================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0 -- /home/smrc/Storage_Repo_Tests/storage_vdb375/.venv/bin/python
cachedir: .pytest_cache
rootdir: /home/smrc/Storage_Repo_Tests/storage_vdb375
configfile: pyproject.toml
plugins: hydra-core-1.3.2, mock-3.15.1, cov-7.1.0
collected 25 items                                                                                                                                   

vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_default_start_id_preserves_legacy_behavior 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_default_start_id_preserves_legacy_behavior (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_start_id_offsets_all_batches 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_start_id_offsets_all_batches (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_three_chunks_produce_globally_unique_ids 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_three_chunks_produce_globally_unique_ids (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_uneven_final_chunk 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_uneven_final_chunk (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_batch_size_larger_than_chunk 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestInsertDataIdOffset::test_batch_size_larger_than_chunk (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[1000000-1000000-True] 
        SETUP    F reset_milvus_connections
        SETUP    F flat_count[1000000]
        SETUP    F source_count[1000000]
        SETUP    F should_pass[True]
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[1000000-1000000-True] (fixtures used: flat_count, reset_milvus_connections, should_pass, source_count)PASSED
        TEARDOWN F should_pass[True]
        TEARDOWN F source_count[1000000]
        TEARDOWN F flat_count[1000000]
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[995000-1000000-True] 
        SETUP    F reset_milvus_connections
        SETUP    F flat_count[995000]
        SETUP    F source_count[1000000]
        SETUP    F should_pass[True]
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[995000-1000000-True] (fixtures used: flat_count, reset_milvus_connections, should_pass, source_count)PASSED
        TEARDOWN F should_pass[True]
        TEARDOWN F source_count[1000000]
        TEARDOWN F flat_count[995000]
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[10000-1000000-False] 
        SETUP    F reset_milvus_connections
        SETUP    F flat_count[10000]
        SETUP    F source_count[1000000]
        SETUP    F should_pass[False]
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[10000-1000000-False] (fixtures used: flat_count, reset_milvus_connections, should_pass, source_count)PASSED
        TEARDOWN F should_pass[False]
        TEARDOWN F source_count[1000000]
        TEARDOWN F flat_count[10000]
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[100000-1000000-False] 
        SETUP    F reset_milvus_connections
        SETUP    F flat_count[100000]
        SETUP    F source_count[1000000]
        SETUP    F should_pass[False]
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[100000-1000000-False] (fixtures used: flat_count, reset_milvus_connections, should_pass, source_count)PASSED
        TEARDOWN F should_pass[False]
        TEARDOWN F source_count[1000000]
        TEARDOWN F flat_count[100000]
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[0-1000000-False] 
        SETUP    F reset_milvus_connections
        SETUP    F flat_count[0]
        SETUP    F source_count[1000000]
        SETUP    F should_pass[False]
        vdb_benchmark/tests/tests/test_issue_375_chunked_insert_ids.py::TestFlatGtCoverageGuard::test_coverage_threshold[0-1000000-False] (fixtures used: flat_count, reset_milvus_connections, should_pass, source_count)PASSED
        TEARDOWN F should_pass[False]
        TEARDOWN F source_count[1000000]
        TEARDOWN F flat_count[0]
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_uniform_vector_generation 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_uniform_vector_generation (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_normal_vector_generation 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_normal_vector_generation (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_normalized_vector_generation 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_normalized_vector_generation (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_chunked_vector_generation 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_chunked_vector_generation (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_vector_generation_with_ids 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_vector_generation_with_ids (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_vector_generation_progress_tracking 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorGeneration::test_vector_generation_progress_tracking (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_batch_insertion 
        SETUP    F reset_milvus_connections
        SETUP    F mock_collection
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_batch_insertion (fixtures used: mock_collection, reset_milvus_connections)PASSED
        TEARDOWN F mock_collection
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_insertion_with_error_handling 
        SETUP    F reset_milvus_connections
        SETUP    F mock_collection
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_insertion_with_error_handling (fixtures used: mock_collection, reset_milvus_connections)PASSED
        TEARDOWN F mock_collection
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_parallel_insertion 
        SETUP    F reset_milvus_connections
        SETUP    F mock_collection
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_parallel_insertion (fixtures used: mock_collection, reset_milvus_connections)PASSED
        TEARDOWN F mock_collection
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_insertion_with_metadata 
        SETUP    F reset_milvus_connections
        SETUP    F mock_collection
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_insertion_with_metadata (fixtures used: mock_collection, reset_milvus_connections)PASSED
        TEARDOWN F mock_collection
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_insertion_rate_monitoring 
        SETUP    F reset_milvus_connections
        SETUP    F mock_collection
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_insertion_rate_monitoring (fixtures used: mock_collection, reset_milvus_connections)PASSED
        TEARDOWN F mock_collection
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_load_checkpoint_resume 
SETUP    S test_data_dir
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestVectorLoading::test_load_checkpoint_resume (fixtures used: reset_milvus_connections, test_data_dir)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestLoadOptimization::test_dynamic_batch_sizing 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestLoadOptimization::test_dynamic_batch_sizing (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestLoadOptimization::test_memory_aware_loading 
        SETUP    F reset_milvus_connections
        vdb_benchmark/tests/tests/test_load_vdb.py::TestLoadOptimization::test_memory_aware_loading (fixtures used: reset_milvus_connections)PASSED
        TEARDOWN F reset_milvus_connections
vdb_benchmark/tests/tests/test_load_vdb.py::TestLoadOptimization::test_flush_optimization 
        SETUP    F reset_milvus_connections
        SETUP    F mock_collection
        vdb_benchmark/tests/tests/test_load_vdb.py::TestLoadOptimization::test_flush_optimization (fixtures used: mock_collection, reset_milvus_connections)PASSED
        TEARDOWN F mock_collection
        TEARDOWN F reset_milvus_connections
TEARDOWN S test_data_dir

================================================================= 25 passed in 0.13s =================================================================
smrc@dskbd029:~/Storage_Repo_Tests/storage_vdb375$ uv run python vdb_benchmark/vdbbench/load_vdb.py \
    --host 127.0.0.1 --collection test375_smoke \
    --num-vectors 50000 --chunk-size 10000 --dimension 1536 \
    --batch-size 5000 --distribution uniform
2026-05-26 00:56:51,123 - INFO - Connected to Milvus server at 127.0.0.1:19530
2026-05-26 00:56:51,123 - WARNING - FLOAT16 data type not available in this version of pymilvus. Using FLOAT_VECTOR instead.
2026-05-26 00:56:51,142 - INFO - Created collection 'test375_smoke' with 1536 dimensions and 1 shards
2026-05-26 00:56:51,142 - INFO - Creating index with parameters: {'index_type': 'DISKANN', 'metric_type': 'COSINE', 'params': {'MaxDegree': 16, 'SearchListSize': 200}}
2026-05-26 00:56:51,653 - INFO - Index creation command completed in 0.51 seconds
2026-05-26 00:56:51,653 - INFO - Generating 50000 vectors with 1536 dimensions using uniform distribution
2026-05-26 00:56:51,653 - INFO - Large vector count detected. Generating in chunks of 10,000 vectors
2026-05-26 00:56:51,653 - INFO - Generating chunk 1: 10,000 vectors
2026-05-26 00:56:51,877 - INFO - Generated chunk 0 (10,000 vectors) in 0.22 seconds. Progress: 0/50,000 vectors (0.0%)
2026-05-26 00:56:51,877 - INFO - Inserting chunk 1 (10,000 vectors) into collection 'test375_smoke' starting at id=0
2026-05-26 00:56:53,038 - INFO - Inserted batch 1/2: 50.00% complete, rate: 4308.30 vectors/sec, id_range=[0, 4999]
2026-05-26 00:56:54,045 - INFO - Inserted batch 2/2: 100.00% complete, rate: 4614.54 vectors/sec, id_range=[5000, 9999]
2026-05-26 00:56:54,045 - INFO - Inserted 10000 vectors in 2.17 seconds
2026-05-26 00:56:54,045 - INFO - Generating chunk 2: 10,000 vectors
2026-05-26 00:56:54,251 - INFO - Generated chunk 1 (10,000 vectors) in 0.21 seconds. Progress: 10,000/50,000 vectors (20.0%)
2026-05-26 00:56:54,251 - INFO - Inserting chunk 2 (10,000 vectors) into collection 'test375_smoke' starting at id=10000
2026-05-26 00:56:55,305 - INFO - Inserted batch 1/2: 50.00% complete, rate: 4744.59 vectors/sec, id_range=[10000, 14999]
2026-05-26 00:56:56,378 - INFO - Inserted batch 2/2: 100.00% complete, rate: 4702.16 vectors/sec, id_range=[15000, 19999]
2026-05-26 00:56:56,378 - INFO - Inserted 10000 vectors in 2.13 seconds
2026-05-26 00:56:56,378 - INFO - Generating chunk 3: 10,000 vectors
2026-05-26 00:56:56,583 - INFO - Generated chunk 2 (10,000 vectors) in 0.20 seconds. Progress: 20,000/50,000 vectors (40.0%)
2026-05-26 00:56:56,583 - INFO - Inserting chunk 3 (10,000 vectors) into collection 'test375_smoke' starting at id=20000
2026-05-26 00:56:57,526 - INFO - Inserted batch 1/2: 50.00% complete, rate: 5304.51 vectors/sec, id_range=[20000, 24999]
2026-05-26 00:56:58,458 - INFO - Inserted batch 2/2: 100.00% complete, rate: 5334.07 vectors/sec, id_range=[25000, 29999]
2026-05-26 00:56:58,458 - INFO - Inserted 10000 vectors in 1.87 seconds
2026-05-26 00:56:58,458 - INFO - Generating chunk 4: 10,000 vectors
2026-05-26 00:56:58,658 - INFO - Generated chunk 3 (10,000 vectors) in 0.20 seconds. Progress: 30,000/50,000 vectors (60.0%)
2026-05-26 00:56:58,659 - INFO - Inserting chunk 4 (10,000 vectors) into collection 'test375_smoke' starting at id=30000
2026-05-26 00:56:59,579 - INFO - Inserted batch 1/2: 50.00% complete, rate: 5432.54 vectors/sec, id_range=[30000, 34999]
2026-05-26 00:57:00,597 - INFO - Inserted batch 2/2: 100.00% complete, rate: 5159.16 vectors/sec, id_range=[35000, 39999]
2026-05-26 00:57:00,597 - INFO - Inserted 10000 vectors in 1.94 seconds
2026-05-26 00:57:00,597 - INFO - Generating chunk 5: 10,000 vectors
2026-05-26 00:57:00,802 - INFO - Generated chunk 4 (10,000 vectors) in 0.20 seconds. Progress: 40,000/50,000 vectors (80.0%)
2026-05-26 00:57:00,802 - INFO - Inserting chunk 5 (10,000 vectors) into collection 'test375_smoke' starting at id=40000
2026-05-26 00:57:01,893 - INFO - Inserted batch 1/2: 50.00% complete, rate: 4585.04 vectors/sec, id_range=[40000, 44999]
2026-05-26 00:57:02,787 - INFO - Inserted batch 2/2: 100.00% complete, rate: 5036.72 vectors/sec, id_range=[45000, 49999]
2026-05-26 00:57:02,787 - INFO - Inserted 10000 vectors in 1.99 seconds
2026-05-26 00:57:02,787 - INFO - Generated all 50,000 vectors in 11.13 seconds
2026-05-26 00:57:03,311 - INFO - Flush completed in 0.52 seconds
2026-05-26 00:57:03,311 - INFO - Starting to monitor index building progress (checking every 5 seconds)
2026-05-26 00:57:03,314 - INFO - Starting to monitor progress for collection: test375_smoke
2026-05-26 00:57:03,315 - INFO - Initial state: 0 of 50,000 rows indexed
2026-05-26 00:57:03,315 - INFO - Initial pending rows: 50,000
2026-05-26 00:57:08,317 - INFO - Progress: 0.00% complete... (0/50,000 rows) | Pending rows: 50,000
2026-05-26 00:57:13,323 - INFO - Progress: 0.00% complete... (0/50,000 rows) | Pending rows: 50,000
2026-05-26 00:57:18,329 - INFO - Progress: 0.00% complete... (0/50,000 rows) | Pending rows: 50,000
2026-05-26 00:57:23,333 - INFO - No pending rows detected. Assuming indexing phase is complete.
2026-05-26 00:57:23,334 - INFO - No pending rows for 0.0 seconds (waiting for 10 seconds to confirm)
2026-05-26 00:57:28,338 - INFO - No pending rows for 5.0 seconds (waiting for 10 seconds to confirm)
2026-05-26 00:57:33,341 - INFO - No pending rows for 10.0 seconds (waiting for 10 seconds to confirm)
2026-05-26 00:57:33,341 - INFO - No pending rows detected for 0 minutes. Process is considered complete.
2026-05-26 00:57:33,341 - INFO - Process fully complete! Total time: 0:00:30
2026-05-26 00:57:33,341 - INFO - Benchmark completed successfully!
smrc@dskbd029:~/Storage_Repo_Tests/storage_vdb375$ uv run python - <<'PY'
from pymilvus import connections, Collection
connections.connect("default", host="127.0.0.1", port="19530")
c = Collection("test375_smoke"); c.flush(); c.load()

# (a) Total physical rows
print("num_entities:", c.num_entities)            # expect 50000

# (b) PK range is contiguous and reaches the top — the real test.
# Pre-fix this maxed out at chunk_size-1 (9999).
tail = c.query(expr="id >= 49990", output_fields=["id"], limit=20)
ids = sorted(r["id"] for r in tail)
print("max id seen:", max(ids))                   # expect 49999, NOT 9999
print("tail ids:", ids)

# (c) Spot-check no duplicate at a chunk boundary
boundary = c.query(expr="id in [9999, 10000, 19999, 20000]",
                   output_fields=["id"], limit=10)
print("boundary ids found:", sorted(r['id'] for r in boundary))  # expect all four
PY
num_entities: 50000
max id seen: 49999
tail ids: [49990, 49991, 49992, 49993, 49994, 49995, 49996, 49997, 49998, 49999]
boundary ids found: [9999, 10000, 19999, 20000]
smrc@dskbd029:~/Storage_Repo_Tests/storage_vdb375$ uv run python vdb_benchmark/vdbbench/enhanced_bench.py \
    --host 127.0.0.1 --collection test375_smoke \
    --auto-create-flat --runtime 10 --queries 1000 \
    --recall-k 10 --search-limit 10 --batch-size 10 --processes 2

============================================================
ENHANCED VDB BENCH — runtime/query-count mode
============================================================
Results will be saved to: vdbbench_results/20260526_010852

============================================================
Database Verification and Collection Loading
============================================================
Connecting to Milvus server at 127.0.0.1:19530...
Collection test375_smoke already loaded.

Collection: test375_smoke  vectors=50000  dim=1536  index=DISKANN  metric=COSINE
Detected source vector field: 'vector'

============================================================
RECALL SETUP (outside benchmark timing)
============================================================
Ground truth is pre-computed using a FLAT (brute-force) index.
Using metric type: COSINE

Generating 1000 query vectors (dim=1536, seed=42)...
Generated 1000 query vectors.

Setting up FLAT collection: test375_smoke_flat_gt
Creating FLAT collection 'test375_smoke_flat_gt' from source 'test375_smoke'...
Source schema: pk_field='id' (INT64), vec_field='vector', vectors=50000
Copying 50000 vectors to FLAT collection (batch_size=5000)...
  Copied 50000/50000 vectors (100.0%)
Building FLAT index...
FLAT collection 'test375_smoke_flat_gt' ready with 50000 vectors.
Pre-computing ground truth for 1000 queries using FLAT index (top_k=10)...
Ground truth pre-computation complete: 1000 queries in 2.07s
Ground truth ready: 1000 queries pre-computed.

Collecting initial disk statistics...

============================================================
Benchmark Execution
============================================================
Starting benchmark: 2 processes × 500 queries/process
Recall: 1000 pre-generated queries, recall@10
NOTE: batch_end timing is placed BEFORE recall capture — performance unaffected.
NOTE: recall hits written to per-worker recall_hits_p<N>.jsonl files.
Staggering process startup by 0.500s
Starting process 0...
Process 0 initialized
Process 0 - Loading collection
Process 0: Writing results to vdbbench_results/20260526_010852/milvus_benchmark_p0.csv
Process 0: Starting benchmark ...
Process 0: Completed 100 queries in 0.31 seconds.
Starting process 1...
Process 1 initialized
Process 1 - Loading collection
Process 1: Writing results to vdbbench_results/20260526_010852/milvus_benchmark_p1.csv
Process 1: Starting benchmark ...
Process 0: Completed 200 queries in 0.58 seconds.
Process 1: Completed 100 queries in 0.25 seconds.
Process 0: Completed 300 queries in 0.87 seconds.
Process 1: Completed 200 queries in 0.51 seconds.
Process 0: Completed 400 queries in 1.13 seconds.
Process 1: Completed 300 queries in 0.77 seconds.
Process 0: Completed 500 queries in 1.41 seconds.
Process 0: Finished. Executed 500 queries in 1.44 seconds
Process 1: Completed 400 queries in 1.03 seconds.
Process 1: Completed 500 queries in 1.28 seconds.
Process 1: Finished. Executed 500 queries in 1.31 seconds
Reading final disk statistics...

Calculating recall from per-worker JSONL files...
  Loaded ANN hits for 500 unique query indices from 2 worker(s).
Calculating benchmark statistics...

============================================================
BENCHMARK SUMMARY
============================================================
Total Queries: 1000
Total Batches: 100
Total Runtime: 1.83s

QUERY STATISTICS
------------------------------------------------------------
Mean Latency:      2.70 ms
Median Latency:    2.56 ms
P95 Latency:       3.78 ms
P99 Latency:       4.01 ms
P99.9 Latency:     4.15 ms
P99.99 Latency:    4.15 ms
Throughput:        547.92 queries/second

BATCH STATISTICS
------------------------------------------------------------
Mean Batch Time:   26.97 ms
Median Batch Time: 25.62 ms
P95 Batch Time:    37.83 ms
P99 Batch Time:    40.07 ms
P99.9 Batch Time:  41.35 ms
P99.99 Batch Time: 41.48 ms
Max Batch Time:    41.49 ms
Batch Throughput:  37.07 batches/second

RECALL STATISTICS (recall@10)
------------------------------------------------------------
Mean Recall:       0.7292
Median Recall:     0.7000
Min Recall:        0.2000
Max Recall:        1.0000
P95 Recall:        0.9000
P99 Recall:        1.0000
Queries Evaluated: 500

DISK I/O DURING BENCHMARK
------------------------------------------------------------
Total Read:        679.21 MB  (372.15 MB/s,  47601 IOPS)
Total Write:       776.00 KB  (0.42 MB/s,  5 IOPS)

Per-Device Breakdown:
  sda:
    Read:  188.00 KB  (0.10 MB/s, 2 IOPS)
    Write: 252.00 KB  (0.13 MB/s, 0 IOPS)
  sda3:
    Read:  188.00 KB  (0.10 MB/s, 2 IOPS)
    Write: 252.00 KB  (0.13 MB/s, 0 IOPS)
  dm-0:
    Read:  188.00 KB  (0.10 MB/s, 2 IOPS)
    Write: 252.00 KB  (0.13 MB/s, 1 IOPS)
  nvme3n1:
    Read:  678.66 MB  (371.85 MB/s, 47596 IOPS)
    Write: 20.00 KB  (0.01 MB/s, 3 IOPS)

Detailed results: vdbbench_results/20260526_010852
Recall details:   vdbbench_results/20260526_010852/recall_stats.json
============================================================
smrc@dskbd029:~/Storage_Repo_Tests/storage_vdb375$ uv run python vdb_benchmark/vdbbench/list_collections.py --host 127.0.0.1 \
  | grep -i flat_gt        # expect ~50000 entities, not ~10000
2026-05-26 01:11:00,324 - INFO - Connected to Milvus server at 127.0.0.1:19530
2026-05-26 01:11:00,326 - INFO - Found 2 collections
2026-05-26 01:11:00,326 - INFO - Getting information for collection: test375_smoke
2026-05-26 01:11:00,740 - INFO - Getting information for collection: test375_smoke_flat_gt
2026-05-26 01:11:01,166 - INFO - Disconnected from Milvus server
| test375_smoke_flat_gt |          50000 |        1536 | FLAT          | COSINE         |            1 |

github-actions · 2026-05-26T03:30:20Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Fix mlcommons#375: chunked vector loading uses globally-unique primar…

06a198d

…y keys

idevasena requested a review from a team May 26, 2026 03:30

FileSystemGuy approved these changes May 26, 2026

idevasena wants to merge 1 commit into mlcommons:main from idevasena:fix/issue-375-duplicate-pks-chunked-insert
