perf: pooled dada under-utilizes cores (serial bud/shuffle/p_update) — profile & parallelize #33

@cjfields

Observation

In the benchmark, the pooled dada step uses far fewer effective cores than the
per-sample modes, most starkly on MiSeq:

| platform | mode | dada cores (of 24) |
|---|---|---|
| MiSeq | pooled | 9.9–12.9 (dada_rev / dada_fwd) |
| MiSeq | pseudo | 19.9–20.8 |
| MiSeq | nopool | 19.0–19.9 |
| PacBio | pooled | 18.8 |
| PacBio | pseudo / nopool | 21.0–21.7 |

So pooled is the only mode that under-fills cores, and short reads make it worse.

Why (grounded in the code)

run_dada (src/dada.rs:510) is explicitly Amdahl-structured; its own comment reads:
"Only b_compare_parallel is multithreaded; shuffle/bud/p_update are serial."
Each loop iteration buds one cluster (serial b_bud), aligns all raws against
centers (parallel b_compare_parallel), then runs b_p_update + shuffle
(serial). Effective cores ≈ (serial_time + parallel_time × threads) / total_time
(wall-clock times), so low utilization = high serial fraction.
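
To make the Amdahl relationship concrete, here is a minimal sketch of the model, not code from src/dada.rs. It assumes the parallel phase keeps every thread busy and that exactly one core is busy during serial phases; the function names are hypothetical.

```rust
/// Amdahl-style estimate of effective cores (CPU time / wall time).
/// `serial` and `parallel` are wall-clock seconds spent in each kind of phase.
/// Assumption: the parallel phase saturates all `threads`; serial phases use one core.
fn effective_cores(serial: f64, parallel: f64, threads: f64) -> f64 {
    let wall = serial + parallel;
    if wall <= 0.0 {
        return 0.0;
    }
    (serial + parallel * threads) / wall
}

/// Inverse of the model: the serial share of wall time implied by an
/// observed effective-core count.
fn implied_serial_fraction(observed_cores: f64, threads: f64) -> f64 {
    // observed = s + (1 - s) * threads  =>  s = (threads - observed) / (threads - 1)
    ((threads - observed_cores) / (threads - 1.0)).clamp(0.0, 1.0)
}
```

Under these idealized assumptions, 9.9 effective cores out of 24 (MiSeq pooled) would imply roughly 61% of wall time in serial phases, versus about 23% at 18.8 cores (PacBio pooled). Real scaling of the parallel phase is imperfect, so treat these as upper-bound estimates of the serial share until the phase timers are read.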

Hypothesis for the platform split: the serial phases scale with nraw × nclust,
while the parallel phase scales with alignment cost. Short MiSeq reads (240 bp)
make each alignment cheap, so the serial per-round bookkeeping dominates; long
PacBio reads (1500 bp) make the parallel compare expensive, so it dominates and
cores stay high. Pooling has no across-sample concurrency axis (unlike
pseudo/nopool with --sample-jobs), so it's fully exposed to this.

First step — it's already instrumented

run_dada keeps phase timers t_compare / t_shuffle / t_bud / t_pupdate.
Surface them (verbose or aux) for a MiSeq pooled run to quantify the serial
breakdown and identify which serial phase to attack.
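
One way the breakdown could be surfaced is sketched below. The four field names come from the timers mentioned above, but the `PhaseTimers` struct, its methods, and the report format are hypothetical; how run_dada actually stores its timers is not shown in this issue.

```rust
use std::time::Duration;

/// Hypothetical container mirroring the four phase timers named in this issue.
#[derive(Default, Debug, Clone, Copy)]
struct PhaseTimers {
    t_compare: Duration, // parallel phase
    t_shuffle: Duration, // serial
    t_bud: Duration,     // serial
    t_pupdate: Duration, // serial
}

impl PhaseTimers {
    fn total(&self) -> Duration {
        self.t_compare + self.t_shuffle + self.t_bud + self.t_pupdate
    }

    /// Fraction of the measured time spent in the three serial phases.
    fn serial_fraction(&self) -> f64 {
        let total = self.total().as_secs_f64();
        if total == 0.0 {
            return 0.0;
        }
        (self.t_shuffle + self.t_bud + self.t_pupdate).as_secs_f64() / total
    }

    /// One line per phase, largest first, with seconds and share of total.
    fn report(&self) -> String {
        // Avoid dividing by zero when nothing has been timed yet.
        let total = self.total().as_secs_f64().max(f64::MIN_POSITIVE);
        let mut phases = [
            ("compare (parallel)", self.t_compare),
            ("shuffle (serial)", self.t_shuffle),
            ("bud (serial)", self.t_bud),
            ("p_update (serial)", self.t_pupdate),
        ];
        phases.sort_by(|a, b| b.1.cmp(&a.1));
        phases
            .iter()
            .map(|(name, d)| {
                let secs = d.as_secs_f64();
                format!("{:<20} {:>8.3}s {:>5.1}%\n", name, secs, 100.0 * secs / total)
            })
            .collect()
    }
}
```

Each phase would accumulate with something like `let t0 = Instant::now(); /* phase */ timers.t_bud += t0.elapsed();`, and the report could be printed once at the end of the run when verbose output is on. Sorting largest-first puts the phase worth attacking at the top.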

Then

If b_p_update / b_shuffle / b_bud dominate, evaluate parallelizing them —
they're per-raw loops (p-value updates, max-pval search for budding) amenable to
a rayon reduction. A win here would also help the tail of pseudo/nopool, not
just pooled.
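
The shape of such a reduction is sketched below for an argmax over a per-raw score. To stay dependency-free it uses `std::thread::scope` over chunks instead of rayon, but the structure is the same one a rayon reduction would express: reduce each chunk locally, then combine the partial results. The function names and the flat `&[f64]` input are assumptions, not the layout used in src/dada.rs, and the sketch assumes the scores contain no NaN. Whichever extremum the real budding search uses, only the comparison changes.

```rust
use std::thread;

/// Serial reference: index and value of the largest score, lowest index on ties.
fn serial_argmax(xs: &[f64]) -> Option<(usize, f64)> {
    let mut best: Option<(usize, f64)> = None;
    for (i, &v) in xs.iter().enumerate() {
        if best.map_or(true, |(_, b)| v > b) {
            best = Some((i, v));
        }
    }
    best
}

/// Chunked parallel argmax: each worker scans one chunk, then the per-chunk
/// winners are combined in chunk order so ties still resolve to the lowest index.
fn par_argmax(scores: &[f64], n_threads: usize) -> Option<(usize, f64)> {
    if scores.is_empty() {
        return None;
    }
    let n = n_threads.max(1);
    let chunk_len = (scores.len() + n - 1) / n; // ceiling division, always >= 1
    let partials: Vec<Option<(usize, f64)>> = thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = scores
            .chunks(chunk_len)
            .enumerate()
            .map(|(ci, chunk)| {
                s.spawn(move || {
                    // Convert the chunk-local index back to a global index.
                    serial_argmax(chunk).map(|(i, v)| (ci * chunk_len + i, v))
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    });
    partials.into_iter().flatten().fold(None, |best, cand| match best {
        Some(b) if b.1 >= cand.1 => Some(b),
        _ => Some(cand),
    })
}
```

Because the combine step preserves the serial tie-breaking rule, the parallel version returns the same index as a left-to-right scan, which matters if budding order has to stay deterministic across thread counts. Spawning threads every round has a fixed cost, so for small nraw the serial scan may still be faster; the phase timers should decide whether this is worth doing.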

Priority

Lower than the memory tickets: pooled is the least-recommended mode (pseudo/nopool
are preferred and already utilize cores well). File for visibility; chase if the
profile shows a cheap, high-leverage serial phase.

Noted in docs/results.md (MiSeq per-step section).
