Skip to content

Adds HMAC#28

Merged
kleinreact merged 7 commits into
mainfrom
add-hmac
Jul 8, 2025
Merged

Adds HMAC#28
kleinreact merged 7 commits into
mainfrom
add-hmac

Conversation

@enathang
Copy link
Copy Markdown
Contributor

Adds HMAC to Clash.Crypto module.

There are two improvements I'd like to make that are not in this PR:

  1. Have hedgehog tests try different SHA versions automatically (similar to how
    SHA does it). I've tested it with sha256 and sha512, and the tests are currently
    set to sha256 (the one used for ECDSA).
  2. The circuit does not hash the key if the key is too large (rather, the circuit
    throws an error). It would be better if the circuit hashed the key if too large.
    It would make the circuit easier to work with. However, this is not a functionality
    blocker as the upstream circuit can always pre-emptively hash the key (as well as the
    key never needs to be hashed for ECDSA).

Due to time restrictions, I'm planning on addressing these improvements in a separate PR.

Comment thread src/Clash/Crypto/Hash/HMAC.hs Outdated
Copy link
Copy Markdown
Member

@kleinreact kleinreact left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank your for the addition.

I have a few change requests (see below).

Comment thread clash-crypto.cabal Outdated
Comment thread clash-crypto.cabal Outdated
Comment thread clash-crypto.cabal Outdated
Comment thread tests/sim/Main.hs Outdated
Comment thread tests/sim/Test/Clash/Crypto/Hash/HMAC.hs Outdated
Comment thread src/Clash/Crypto/Hash/HMAC.hs Outdated
Comment thread src/Clash/Crypto/Hash/HMAC.hs Outdated
Comment thread src/Clash/Crypto/Hash/HMAC.hs Outdated
Comment thread tests/sim/Test/Clash/Crypto/Hash/HMAC.hs Outdated
Comment thread tests/sim/Test/Clash/Crypto/Hash/HMAC.hs Outdated
@enathang enathang requested a review from kleinreact June 30, 2025 09:30
Copy link
Copy Markdown
Member

@kleinreact kleinreact left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some more feedback on several particularities of the code.

I also like to see HITL tests as part of the PR, as they are easy to add in this case and they give insights about he resource usage on the device.

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated
-- If isKey goes from low->high, we know the msg is finished
isEndMsg = not wasKey && isKey

outputS = mealyB step (True, 0 :: Index (BlockSize alg + 1)) (isKeyS, inputS)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the initial state be (False, 0)?

My assumption is that the key indicator can be either low or high initially, or theoretically being in the middle of sending a message. The component simply ignores those inputs then, because the first input arrives as soon as the the indicator switches to high, which also can be immediately at startup.

It may be more elegant to pass isRising of the indicator to the Mealy machine instead, which allows for easier separation of concerns here.

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated

newState
| -- If it's a new key, reset the padding state
isEndMsg = (isKey, 0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check skips the first byte of the next key+msg pair, if the first byte is received together with the isEndMsg toggle.

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated
step (wasKey, n) (isKey, input) = (newState, (isEndMsg, output))
where
output
| not isKey && n < keySize = Just padByte
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is dangerous to add the padding although the environment does not provide any further inputs. This means that the input rate may need to change while the padding is added. This not only may cause some extra burden to be handled by the environment, but also allows to deduce the length of the key via observing a potential change in the input rate.

We only should add padding bytes on inputs being received. This way, the environment of the component is in full control over the step rate utilized internally.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding this comment. What do you mean by "environment" and "input rate"?

The padding is only triggered when isKey goes low. Given the initState is accepting a key (isKey=True), it accepts a key until the key is triggered "done"(by raising isKey=False). Once that happens, the paddingCircuit will add any remaining padding required. This includes if the key has been passed 0 bytes, because I consider that to be a valid use case.

But it'll be easier to respond once I better understand your comment :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or are you saying that one should not pad dynamically, but either 1) expect the user to pad ahead of time or 2) pad it all at once, once the end of the key is signalled?

Part of my confusion also stems from your previous comment "it's fine to make the upstream circuit wait 2*key length cycles" or something similar while we pad it, rather than exposing an external flag saying the key has finished padding. Is that comment unrelated to this comment?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reactive system verification, the "environment" is considered to be in control of the inputs being passed to a component. The term just abstracts over all possible input providers, e.g., another component, a peripheral interface, a simulation stimuli or a sensor input. It is useful to stick with that terminology, as it avoids unnecessary ambiguities during communication.

The input rate is the frequency at which the input arrives. Full utilization would mean that everything is received as a consecutive block, e.g., Just _ :- Just _ :- Just :- _. However, inputs could also arrive at a slower rate, e.g., Just _ :- Nothing :- Nothing :- Just _ :- Nothing :- Nothing :- Just _ :- Nothing :- _.

Or are you saying that one should not pad dynamically, but either 1) expect the user to pad ahead of time or 2) pad it all at once, once the end of the key is signalled?

What I like to say is that the environment shall be in control of the points in time at which the padding is added. It can also add the padding ahead of time, but the actual data is irrelevant at this point. The circuit can overwrite any padded bytes with zeros, but it cannot control the rate of the arrival. There must be at least the required amount of cycles at the full utilization rate to fill in the padding, where the environment needs to wait between sending the key and starting the message (if we like to avoid buffering any data). However, there is no issue, if it takes longer than that.

Therefore, giving control to the environment on how long it takes to add the padding is the most flexible and secure solution.

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated
deriving (Generic, Eq, NFDataX)

-- Stores the internal computed/assembled values for hmac state
data HmacInfo alg n m =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The m parameter is not needed here. We can remove it and reduce the SuitableDivisorForHMAC constraint of the NFDataX instance to just (KnownNat n, 1 <= n).

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated
Comment on lines +242 to +245
deriving instance ( KnownNat (MessageDigestSize alg)
, KnownNat (BlockSize alg)
, SuitableDivisorForHMAC n alg m
) => Eq (HmacInfo alg n m)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This instance is not needed. Also note that comparing large vectors, like here, can be quite expensive in terms of resource usage on the device.

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated
OuterHash -> digestIn
_ -> Nothing

outputS = mealyB step initState (isEndS, inputS, digestS)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should try to avoid the generation of circular signals, which need to be handled explicitly as such in a Mealy or Moore machine. Debugging the behavior of these machines quickly gets quite tricky then.

You should take care of that in your initial architecture design considerations.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on what you mean by circular signals in this specific case? I don't see one.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One example would be

[hmac hashOutput] --
[hmacController digestS] --
[hmacController step digestIn] ->
[hmacController step hashInput] --
[hmac hashInput] ->
[hmac hashInputBugFix] ->
[hmac hashOutput]

where -- indicates a wire and -> a transformation.

I don't claim that this loop can be completely avoided in general, but your initial design requires that particular signal to be followed through multiple components in order to understand the state machine of the controller. You will fail to formally verify such a design.

Instead, you need to be able to consider any component in isolation, where inputs have some well defined behavior resulting in some well defined outputs. Separation of concerns is very important for later verification and requires to avoid component behavior being intertwined.

In this particular case, I solved this via introducing a staging signal that splits up the temporal stages of the behavior such that those stages can be considered in isolation. I also tried to isolate concerns via distributing behavior to multiple components that having a very dedicated purpose each. This leads to a larger composition of the individual pieces, sure, but due to the staging you don't need to look at all of them at the same time.

Comment thread src/Clash/Crypto/MAC/HMAC.hs Outdated
Comment on lines +202 to +207
InnerHash
| n < keySize -> Just (hmacInfo.key !! n `xor` innerPadByte)
| otherwise -> inputIn
OuterHash
| n < keySize -> Just (hmacInfo.key !! n `xor` outerPadByte)
| n < keySize + msgSize -> Just (hmacInfo.msgHash !! (n-keySize))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dynamically indexing the vector with the internal counter will be quite inefficient. It is better to shift or rotate the vector instead.

Comment thread tests/sim/Test/Clash/Crypto/MAC/HMAC.hs Outdated
hmacTestInput :: [(Maybe (BitVector 8), Bool)]
hmacTestInput =
-- Skip over reset
List.replicate 3 (Nothing, True)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also test for isKey being low initially.

Comment thread tests/sim/Test/Clash/Crypto/MAC/HMAC.hs Outdated
Comment on lines +69 to +74
<> keyInput
-- Note: The circuit needs at most `2*(BlockSize alg / n)+c` cycles
-- to pad the key, where `n` in this case is 8 and `c` is a small constant
-- potentially introduced by registers etc. `c=3` should be sufficient.
<> List.replicate 300 (Nothing, False)
<> msgInput
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also test for non-contiguous inputs at this point.

Comment thread tests/sim/Test/Clash/Crypto/MAC/HMAC.hs Outdated
$ maybe 0 fst
$ List.uncons
$ catMaybes
$ sampleN @System numTestCycles
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not use a constant for the number of test cycles, but calculate the required numbers of cycles to be simulated from the inputs instead (you can over-approximate it). Otherwise, the test will suddenly fail, when changing the range parameters of the generators. A property based test always should work, no matter on which ranges it is executed.

Copy link
Copy Markdown
Member

@kleinreact kleinreact left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requested changes and #13 have already been addressed by the last two commits. I simply forgot to publish the review before.

@enathang the feedback still is worth to be considered for future PRs.

@kleinreact kleinreact merged commit 5e95fb5 into main Jul 8, 2025
@kleinreact kleinreact deleted the add-hmac branch July 8, 2025 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants