Adds HMAC#28
Conversation
kleinreact
left a comment
There was a problem hiding this comment.
Thank your for the addition.
I have a few change requests (see below).
kleinreact
left a comment
There was a problem hiding this comment.
I have some more feedback on several particularities of the code.
I also like to see HITL tests as part of the PR, as they are easy to add in this case and they give insights about he resource usage on the device.
| -- If isKey goes from low->high, we know the msg is finished | ||
| isEndMsg = not wasKey && isKey | ||
|
|
||
| outputS = mealyB step (True, 0 :: Index (BlockSize alg + 1)) (isKeyS, inputS) |
There was a problem hiding this comment.
Shouldn't the initial state be (False, 0)?
My assumption is that the key indicator can be either low or high initially, or theoretically being in the middle of sending a message. The component simply ignores those inputs then, because the first input arrives as soon as the the indicator switches to high, which also can be immediately at startup.
It may be more elegant to pass isRising of the indicator to the Mealy machine instead, which allows for easier separation of concerns here.
|
|
||
| newState | ||
| | -- If it's a new key, reset the padding state | ||
| isEndMsg = (isKey, 0) |
There was a problem hiding this comment.
This check skips the first byte of the next key+msg pair, if the first byte is received together with the isEndMsg toggle.
| step (wasKey, n) (isKey, input) = (newState, (isEndMsg, output)) | ||
| where | ||
| output | ||
| | not isKey && n < keySize = Just padByte |
There was a problem hiding this comment.
It is dangerous to add the padding although the environment does not provide any further inputs. This means that the input rate may need to change while the padding is added. This not only may cause some extra burden to be handled by the environment, but also allows to deduce the length of the key via observing a potential change in the input rate.
We only should add padding bytes on inputs being received. This way, the environment of the component is in full control over the step rate utilized internally.
There was a problem hiding this comment.
I'm having trouble understanding this comment. What do you mean by "environment" and "input rate"?
The padding is only triggered when isKey goes low. Given the initState is accepting a key (isKey=True), it accepts a key until the key is triggered "done"(by raising isKey=False). Once that happens, the paddingCircuit will add any remaining padding required. This includes if the key has been passed 0 bytes, because I consider that to be a valid use case.
But it'll be easier to respond once I better understand your comment :)
There was a problem hiding this comment.
Or are you saying that one should not pad dynamically, but either 1) expect the user to pad ahead of time or 2) pad it all at once, once the end of the key is signalled?
Part of my confusion also stems from your previous comment "it's fine to make the upstream circuit wait 2*key length cycles" or something similar while we pad it, rather than exposing an external flag saying the key has finished padding. Is that comment unrelated to this comment?
There was a problem hiding this comment.
In reactive system verification, the "environment" is considered to be in control of the inputs being passed to a component. The term just abstracts over all possible input providers, e.g., another component, a peripheral interface, a simulation stimuli or a sensor input. It is useful to stick with that terminology, as it avoids unnecessary ambiguities during communication.
The input rate is the frequency at which the input arrives. Full utilization would mean that everything is received as a consecutive block, e.g., Just _ :- Just _ :- Just :- _. However, inputs could also arrive at a slower rate, e.g., Just _ :- Nothing :- Nothing :- Just _ :- Nothing :- Nothing :- Just _ :- Nothing :- _.
Or are you saying that one should not pad dynamically, but either 1) expect the user to pad ahead of time or 2) pad it all at once, once the end of the key is signalled?
What I like to say is that the environment shall be in control of the points in time at which the padding is added. It can also add the padding ahead of time, but the actual data is irrelevant at this point. The circuit can overwrite any padded bytes with zeros, but it cannot control the rate of the arrival. There must be at least the required amount of cycles at the full utilization rate to fill in the padding, where the environment needs to wait between sending the key and starting the message (if we like to avoid buffering any data). However, there is no issue, if it takes longer than that.
Therefore, giving control to the environment on how long it takes to add the padding is the most flexible and secure solution.
| deriving (Generic, Eq, NFDataX) | ||
|
|
||
| -- Stores the internal computed/assembled values for hmac state | ||
| data HmacInfo alg n m = |
There was a problem hiding this comment.
The m parameter is not needed here. We can remove it and reduce the SuitableDivisorForHMAC constraint of the NFDataX instance to just (KnownNat n, 1 <= n).
| deriving instance ( KnownNat (MessageDigestSize alg) | ||
| , KnownNat (BlockSize alg) | ||
| , SuitableDivisorForHMAC n alg m | ||
| ) => Eq (HmacInfo alg n m) |
There was a problem hiding this comment.
This instance is not needed. Also note that comparing large vectors, like here, can be quite expensive in terms of resource usage on the device.
| OuterHash -> digestIn | ||
| _ -> Nothing | ||
|
|
||
| outputS = mealyB step initState (isEndS, inputS, digestS) |
There was a problem hiding this comment.
You should try to avoid the generation of circular signals, which need to be handled explicitly as such in a Mealy or Moore machine. Debugging the behavior of these machines quickly gets quite tricky then.
You should take care of that in your initial architecture design considerations.
There was a problem hiding this comment.
Could you elaborate on what you mean by circular signals in this specific case? I don't see one.
There was a problem hiding this comment.
One example would be
[hmac hashOutput] --
[hmacController digestS] --
[hmacController step digestIn] ->
[hmacController step hashInput] --
[hmac hashInput] ->
[hmac hashInputBugFix] ->
[hmac hashOutput]
where -- indicates a wire and -> a transformation.
I don't claim that this loop can be completely avoided in general, but your initial design requires that particular signal to be followed through multiple components in order to understand the state machine of the controller. You will fail to formally verify such a design.
Instead, you need to be able to consider any component in isolation, where inputs have some well defined behavior resulting in some well defined outputs. Separation of concerns is very important for later verification and requires to avoid component behavior being intertwined.
In this particular case, I solved this via introducing a staging signal that splits up the temporal stages of the behavior such that those stages can be considered in isolation. I also tried to isolate concerns via distributing behavior to multiple components that having a very dedicated purpose each. This leads to a larger composition of the individual pieces, sure, but due to the staging you don't need to look at all of them at the same time.
| InnerHash | ||
| | n < keySize -> Just (hmacInfo.key !! n `xor` innerPadByte) | ||
| | otherwise -> inputIn | ||
| OuterHash | ||
| | n < keySize -> Just (hmacInfo.key !! n `xor` outerPadByte) | ||
| | n < keySize + msgSize -> Just (hmacInfo.msgHash !! (n-keySize)) |
There was a problem hiding this comment.
Dynamically indexing the vector with the internal counter will be quite inefficient. It is better to shift or rotate the vector instead.
| hmacTestInput :: [(Maybe (BitVector 8), Bool)] | ||
| hmacTestInput = | ||
| -- Skip over reset | ||
| List.replicate 3 (Nothing, True) |
There was a problem hiding this comment.
Please also test for isKey being low initially.
| <> keyInput | ||
| -- Note: The circuit needs at most `2*(BlockSize alg / n)+c` cycles | ||
| -- to pad the key, where `n` in this case is 8 and `c` is a small constant | ||
| -- potentially introduced by registers etc. `c=3` should be sufficient. | ||
| <> List.replicate 300 (Nothing, False) | ||
| <> msgInput |
There was a problem hiding this comment.
We should also test for non-contiguous inputs at this point.
| $ maybe 0 fst | ||
| $ List.uncons | ||
| $ catMaybes | ||
| $ sampleN @System numTestCycles |
There was a problem hiding this comment.
You should not use a constant for the number of test cycles, but calculate the required numbers of cycles to be simulated from the inputs instead (you can over-approximate it). Otherwise, the test will suddenly fail, when changing the range parameters of the generators. A property based test always should work, no matter on which ranges it is executed.
Adds HMAC to Clash.Crypto module.
There are two improvements I'd like to make that are not in this PR:
SHA does it). I've tested it with sha256 and sha512, and the tests are currently
set to sha256 (the one used for ECDSA).
throws an error). It would be better if the circuit hashed the key if too large.
It would make the circuit easier to work with. However, this is not a functionality
blocker as the upstream circuit can always pre-emptively hash the key (as well as the
key never needs to be hashed for ECDSA).
Due to time restrictions, I'm planning on addressing these improvements in a separate PR.