Add concurrent normalization by martijnbastiaan · Pull Request #3196 · clash-lang/clash-compiler

martijnbastiaan · 2026-04-14T10:12:30Z

This adds concurrent normalization by wrapping all state (fields) in MVars. Earlier patches used STM, but this didn't yield a performance benefit due to excessive contention/rollbacks. Explicit locks have all the usual drawbacks of potential deadlocks, which is why TracedMVar has been added. This can be enabled by CLASH_DEBUG_MVAR. Earlier tests on the bittide project yielded a 300% performance increase.

The feature is disabled by default, but can be enabled using -fclash-concurrent-normalization. It is enabled in the clash-testsuite.

Still TODO:

Write a changelog entry (see changelog/README.md)
Check copyright notices are up to date in edited files
Rebase on master
Actually listen to -fclash-concurrent-normalization
Check on bittide-hardware
Investigate Hit specialization limit 20 on function `Clash.Explicit.Testbench.outputVerifierWith[454934]'.. This can be triggered semi-reliably by executing cabal run clash-testsuite -- -j8 -p clash --hide-successes -p XilinxDDR.

martijnbastiaan · 2026-04-14T11:55:29Z

@rowanG077 I let Claude have a look at the issue. The reasoning it came up with sounds reasonable to me, but I haven't checked it in detail so please be careful :). The test suite passes locally. I'll test it on bittide-hardware too

Problem

The commit "Add concurrent normalization" introduced a race condition in the specialization cache. The specialisationCache and specialisationHistory were converted from pure state (sequential access via lens (%=)) to MVar-based shared state, but the cache-check → history-increment → cache-insert sequence was not atomic.

With concurrent normalization, multiple threads normalizing different entities (e.g., testBenchGS, testBenchGA, testBenchUS, testBenchUA) would all try to specialize the same function (ignoreFor) simultaneously. Each thread would:

Check the cache → empty
Increment the specialization history counter
Do the specialization work
Insert into the cache

Since steps 1-4 weren't atomic, N threads would each increment the counter for the same logical specialization, causing the counter to reach N× the expected value and exceeding the specialization limit of 20.

Fix

Two changes in Specialize.hs

Changed the cache type from Map key Id to Map key (MVar Id) in Types.hs, using a "future" pattern (same as normalizeTopLvlBndr already uses for the normalized cache).
Atomic cache reservation: When checking the cache, we atomically either find an existing entry (and wait on it if it's still being computed) or insert an empty MVar as a placeholder. This ensures only one thread does each specialization, and the history counter is only incremented once per unique specialization.

rowanG077 · 2026-04-14T12:00:22Z

Ah that's close to what my hunch was. Interesting that it can do that. I will check in the code whether it's actually true. If true I'm surprised how good it is.

martijnbastiaan · 2026-04-14T12:03:37Z

Modern Claude with VSCode / terminal integration is very close to magic.

martijnbastiaan · 2026-05-01T07:15:48Z

On bittide this does speed up Clash, but it trades it for high memory usage (5.5 GB -> 21 GB). Reducing the thread count from -N8 to -N2 reduces memory usage to 17 GB.

rowanG077 · 2026-05-01T07:21:52Z

And this was not that much of an improvement right and requiring manual OPAQUE on large components?

martijnbastiaan · 2026-05-01T07:26:06Z

With OPAQUEs it is a 33% improvement. Nothing to sneeze at for 18m runs 🫠..

martijnbastiaan · 2026-05-01T07:36:32Z

It seems to be fundamental to how Clash normalization works: due to its lifting of binders and specialization caches Terms just stay around for a looong time. I don't think there is much incremental we can do, though I'm open to ideas.

rowanG077 · 2026-05-01T07:58:23Z

Oh that is MUCH more significant than I thought

rowanG077 · 2026-05-05T16:37:42Z

I fixed a deadlock that occurs when a thread is waiting on a specialization result of another thread that never completed.

I also added listening to -fclash-concurrent-normalization

The actual implementation looks fine to me now.

martijnbastiaan · 2026-05-05T17:34:03Z

Cool, I'll have a look soon.

In retrospect I should have made TracedMVar for qualified use, instead of trying to be consistent with the existing MVar module. What do you think?

rowanG077 · 2026-05-06T07:19:50Z

In retrospect I should have made TracedMVar for qualified use, instead of trying to be consistent with the existing MVar module. What do you think?

I don't have really strong feelings either way. I do like that the API matches the standard MVar which signals to developers that this is just an MVar under the hood. So if you ask me I would keep it like it is currently.

Co-authored-by: Vanessa McHale <vamchale@gmail.com> Co-authored-by: Alexander McKenna <alex@qbaylogic.com> Co-authored-by: Rowan Goemans <rowan@qbaylogic.com>

When the thread responsible for creating a specialization failed after installing the cache placeholder, waiting threads could block forever on the empty result MVar.

martijnbastiaan force-pushed the martijn/concurrent-normalization branch from 197045b to ccf882c Compare April 14, 2026 11:37

rowanG077 force-pushed the martijn/concurrent-normalization branch 2 times, most recently from 99c384a to 80adea0 Compare April 14, 2026 17:23

rowanG077 self-requested a review April 14, 2026 17:32

rowanG077 marked this pull request as ready for review April 14, 2026 18:06

rowanG077 force-pushed the martijn/concurrent-normalization branch from 6ce3ecd to 4ffba9b Compare May 5, 2026 16:19

martijnbastiaan and others added 5 commits May 6, 2026 09:44

Add concurrent normalization

ffa6bf1

Co-authored-by: Vanessa McHale <vamchale@gmail.com> Co-authored-by: Alexander McKenna <alex@qbaylogic.com> Co-authored-by: Rowan Goemans <rowan@qbaylogic.com>

Vibe coded fix for specialization issue

534d07a

Actually listen to -fclash-concurrent-normalization

f6e68ab

Fix concurrent specialization future deadlock

46ee384

When the thread responsible for creating a specialization failed after installing the cache placeholder, waiting threads could block forever on the empty result MVar.

Update changelog

90321b2

rowanG077 force-pushed the martijn/concurrent-normalization branch from b93c821 to 90321b2 Compare May 6, 2026 07:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add concurrent normalization#3196

Add concurrent normalization#3196
martijnbastiaan wants to merge 5 commits into
masterfrom
martijn/concurrent-normalization

martijnbastiaan commented Apr 14, 2026 •

edited

Loading

Uh oh!

martijnbastiaan commented Apr 14, 2026

Uh oh!

rowanG077 commented Apr 14, 2026 •

edited

Loading

Uh oh!

martijnbastiaan commented Apr 14, 2026

Uh oh!

martijnbastiaan commented May 1, 2026

Uh oh!

rowanG077 commented May 1, 2026

Uh oh!

martijnbastiaan commented May 1, 2026

Uh oh!

martijnbastiaan commented May 1, 2026

Uh oh!

rowanG077 commented May 1, 2026

Uh oh!

rowanG077 commented May 5, 2026 •

edited

Loading

Uh oh!

martijnbastiaan commented May 5, 2026

Uh oh!

rowanG077 commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

martijnbastiaan commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Still TODO:

Uh oh!

martijnbastiaan commented Apr 14, 2026

Problem

Fix

Uh oh!

rowanG077 commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

martijnbastiaan commented Apr 14, 2026

Uh oh!

martijnbastiaan commented May 1, 2026

Uh oh!

rowanG077 commented May 1, 2026

Uh oh!

martijnbastiaan commented May 1, 2026

Uh oh!

martijnbastiaan commented May 1, 2026

Uh oh!

rowanG077 commented May 1, 2026

Uh oh!

rowanG077 commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

martijnbastiaan commented May 5, 2026

Uh oh!

rowanG077 commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

martijnbastiaan commented Apr 14, 2026 •

edited

Loading

rowanG077 commented Apr 14, 2026 •

edited

Loading

rowanG077 commented May 5, 2026 •

edited

Loading