Bump crucible and propolis revs to latest #10447
Update Crucible from `7103cd3a` to `bd9a0e2a`, picking up the following PRs:

- Use an explicit rev for oxidecomputer git deps (oxidecomputer/crucible#1936)
- Add Clone and Deserialize to VolumeInfo et al (oxidecomputer/crucible#1935)
- Update omicron/oximeter (oxidecomputer/crucible#1933)
- [meta] update to drift 0.1.4 (oxidecomputer/crucible#1932)
- Don't log if there is nothing to log (oxidecomputer/crucible#1930)
- Add VolumeInfo (oxidecomputer/crucible#1928)
- Remove bonus Volume layer (oxidecomputer/crucible#1927)
- Add session and client id to panic messages (oxidecomputer/crucible#1926)
- [crucible-agent-types] migrate to RFD 619 pattern (oxidecomputer/crucible#1899)
- Background read-only region creation (oxidecomputer/crucible#1919)
- [crucible-downstairs-repair] switch to RFD 619 pattern (oxidecomputer/crucible#1901)
- [crucible-pantry] switch to RFD 619 pattern (oxidecomputer/crucible#1900)
- Use separate in-memory types (oxidecomputer/crucible#1913)
- Remove old field from dtrace action script (oxidecomputer/crucible#1917)
- Retry data writes that return an IO error (oxidecomputer/crucible#1915)
- Bump dropshot to 0.17.0 (oxidecomputer/crucible#1909)
- Reject snapshot requests when read-only (oxidecomputer/crucible#1914)
- update ringbuf method, fix clippy lint (oxidecomputer/crucible#1904)
- bump vergen-v9 version too (oxidecomputer/crucible#1903)
- update dropshot to 0.16.7, dropshot-api-manager to 0.5.2 (oxidecomputer/crucible#1851)
- perf-vol.d updates (oxidecomputer/crucible#1898)
- upgrade progenitor to 0.13, reqwest to 0.13 (oxidecomputer/crucible#1854)
- Remove cargo nextest from github workflow, out of space (oxidecomputer/crucible#1846)
- Add a test for VCR serialize/deserialize (oxidecomputer/crucible#1843)

Update Propolis from `bc489ddf` to `58ab73bd`, picking up the following PRs:

- Bump crucible to latest, update Omicron, use explicit revs (oxidecomputer/propolis#1141)
- Add project and silo ids to VM attestation (oxidecomputer/propolis#1114)
- Update escargot (oxidecomputer/propolis#1139)
- Prefix shebang and mark D scripts as executable (oxidecomputer/propolis#1140)
- Fix error in propolis-server README (oxidecomputer/propolis#1138)
- [meta] update to drift 0.1.4 (oxidecomputer/propolis#1137)
- Fix Intel CPUID leaf 4 cache topology for SMT (oxidecomputer/propolis#1002)
- support NVMe Deallocate (oxidecomputer/propolis#1105)
- viona: do not lose used/avail indices (oxidecomputer/propolis#1135)
- viona: multiqueue device should stay multiqueue across migration (oxidecomputer/propolis#1121)
- Bump crucible rev to latest (oxidecomputer/propolis#1132)
- expand zerocopy IntoBytes/FromBytes use in guest memory accesses (oxidecomputer/propolis#1130)
- dropshot-api-manager 0.7.1 (oxidecomputer/propolis#1129)
- improve slog component setting (oxidecomputer/propolis#1124)
- wait for viona Poller to run before declaring device running (oxidecomputer/propolis#1118)
- virtio: tolerate importing queues with adjusted size (oxidecomputer/propolis#1117)
- Run viona unit tests in CI (oxidecomputer/propolis#1120)
- feature gate Crucible-specific boot digest code (oxidecomputer/propolis#1119)

Also:

- ran `cargo update -p vergen`
- removed the `reqwest012` dependency
- removed `reqwest012_client` from Nexus
- ran `cargo hakari generate` and `cargo hakari manage-deps`
- replaced use of `ProgenitorOperationRetry` with `retry_operation_while_indefinitely`
- the region replacement drive saga now consumes the new `VolumeInfo` from Propolis and uses it to decide when a replacement is done
I've tagged a few folks for review based on the PRs picked up, namely the Propolis ones (related to viona and attestation).
iximeow
left a comment
for the propolis side of things: the viona stuff is all around migration ... though propolis#1118 probably fixes an actual bug if you manage to get an instance started and then stopped in a ~100ms window. good to get plumbed along, but really should have no effect on shipped software atm.
propolis#1105 will change how NVMe devices appear to guests (they'll support dataset management! yay!) so that's expected and part of why I'd wanted to get Propolis bumped in Omicron too.
propolis#1114 changes the guest/hypervisor API for attestations. this will break existing attestation users until they bump the rev of https://github.com/oxidecomputer/vm-attest to fit. @flihp @jordanhendricks my understanding is anyone using that (including us :D) expects no particular release-to-release stability yet and we're free to break things. so R20 is gonna do that!
on the whole: shipit (modulo clippy)
    result.into_inner().active
    let health =
        propolis_client_volume_health(&result.into_inner().volume_info);
I'm not sure what level we are at right here, but I wonder if we handle a volume with multiple sub-volumes, where each sub-volume can have its own health state that is independent from the other parts of that same higher-level volume.
I'm thinking about the case where we have two sub-volumes, each with a bad downstairs. If we fix one of those sub-volumes, it will be healthy, but the overall volume will not be.
Not sure if that plays into any of this here, so ignore if it is not applicable.
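To make the concern concrete, here is a minimal sketch of how per-sub-volume health could roll up into an overall volume health. Everything in it is an assumption for illustration: `SubVolumeHealth`, its `Healthy` variant, and `overall_health` are hypothetical names, not the real Crucible or Nexus types, and treating an empty volume as healthy is an arbitrary choice.

```rust
/// Hypothetical per-sub-volume health, ordered from best to worst.
/// This is an illustration only, not the real Crucible type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum SubVolumeHealth {
    Healthy,
    ReducedRedundancy,
}

/// The overall volume is only as healthy as its worst sub-volume.
/// An empty list is treated as healthy (an assumption of this sketch).
fn overall_health(sub_volumes: &[SubVolumeHealth]) -> SubVolumeHealth {
    sub_volumes
        .iter()
        .copied()
        // The derived `Ord` follows declaration order, so `max` is the worst state.
        .max()
        .unwrap_or(SubVolumeHealth::Healthy)
}
```

Under this roll-up, repairing one of two bad sub-volumes still leaves the overall volume at `ReducedRedundancy`; only repairing both brings it back to `Healthy`.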
    /// Not all three downstairs are present for one or more region sets.
    ReducedRedundancy,
`ReducedRedundancy` is when we can't talk to one of the downstairs.
`DownstairsDegraded` is when we are talking to the downstairs, but it has not yet made it through LiveRepair and needs to?
If we are not talking with a downstairs, and we have decided too much IO has happened and have marked it faulted, is that Reduced or Degraded?
    // from their import of the `crucible-client-types` crate, meaning two versions
    // could exist that Nexus could read. Do the simplest thing: write two versions
    // of the function that reads each type returns a `VolumeHealth`. These
    // functions currently are the same, but in the future may temporarily look
Could these differ if we change VolumeInfo, and we have propolis at a newer version than the pantry, say during a live update?
I'm not excited that both of these live in nexus though, so maybe I'm not clear on when we would have these two go out of sync.
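For readers following along, the pattern the code comment describes might look roughly like the sketch below: two structurally identical types coming from different client crates, each with its own conversion into a shared `VolumeHealth`. The module names, the `downstairs_missing` field, the `Healthy` variant, and `pantry_client_volume_health` are all hypothetical stand-ins; only `propolis_client_volume_health`, `VolumeInfo`, `VolumeHealth`, and `ReducedRedundancy` appear in the diff under review, and their real definitions may differ.

```rust
/// Stand-in for the `VolumeInfo` seen through the Propolis client (hypothetical fields).
mod propolis_side {
    #[derive(Debug, Clone)]
    pub struct VolumeInfo {
        pub downstairs_missing: u32,
    }
}

/// Stand-in for the `VolumeInfo` seen through the Pantry client (hypothetical fields).
mod pantry_side {
    #[derive(Debug, Clone)]
    pub struct VolumeInfo {
        pub downstairs_missing: u32,
    }
}

/// Shared result type; the `Healthy` variant is an assumption of this sketch.
#[derive(Debug, PartialEq, Eq)]
enum VolumeHealth {
    Healthy,
    ReducedRedundancy,
}

/// Reads the Propolis-side type.
fn propolis_client_volume_health(info: &propolis_side::VolumeInfo) -> VolumeHealth {
    if info.downstairs_missing == 0 {
        VolumeHealth::Healthy
    } else {
        VolumeHealth::ReducedRedundancy
    }
}

/// Reads the Pantry-side type. Identical today, but free to diverge if one
/// client crate picks up a newer `VolumeInfo` before the other.
fn pantry_client_volume_health(info: &pantry_side::VolumeInfo) -> VolumeHealth {
    if info.downstairs_missing == 0 {
        VolumeHealth::Healthy
    } else {
        VolumeHealth::ReducedRedundancy
    }
}
```

Because the two input types are distinct to the compiler even when their fields match, a single function cannot accept both without a shared trait or a conversion, which is presumably why the simplest option is two near-duplicate functions.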
rust-cache#341