feat(wuji): Genesis-native deploy + sim2real toolchain #235

Open
KraHsu wants to merge 11 commits into main from feat/wuji-deploy-genesis

@KraHsu commented Jun 16, 2026

Owner

What

Brings the WUJI dexterous-hand in-hand reorientation policy from sim to the real hand: a Genesis-native deploy stack plus the full sim2real toolchain, mirroring wuji-mjlab's pixi run -e deploy … commands one-for-one.

The first two of the 11 commits:

  1. 7cce6b7 — base Genesis-native deploy port (real2sim + ONNX hand control; numpy core, ZMQ-decoupled).
  2. 06f566f — sim2real toolchain + assets in the central zoo (this round).

Sim2real chain (all faithful ports, GeneLab-native)

| Step | mjlab | GeneLab |
| --- | --- | --- |
| bridge check / home | hand_utils.py check/home | deploy/scripts/hand_utils.py: read-only sanity + 3s ease-in-out ramp (baked into WujiHandDriver.home(duration_s)) |
| vision | cube_world_observer.py --preview | full Hikvision MVS port: multi-face ArUco board, SO3 Kalman + position low-pass + corner EMA, world auto-sampling, fast ROI, --preview |
| calibration | tools/calib_check.py | deploy/scripts/calib_check.py: Genesis digital-twin viewer (live encoders + observed cube) |
| control | play_real.py | deploy/scripts/play_real.py: goal modes (external/fixed/random) + success monitor + resample-on-success |
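
For readers unfamiliar with the homing ramp, here is a minimal sketch of a 3 s ease-in-out joint ramp. It is not the code in `WujiHandDriver.home`: the smoothstep curve, the 100 Hz rate, and the names `ease_in_out` / `home_ramp` are assumptions chosen for illustration; only the 3 s duration and the 20-joint hand come from this PR.

```python
import numpy as np


def ease_in_out(s: float) -> float:
    """Smoothstep: 0 at s=0, 1 at s=1, zero slope at both ends (assumed curve)."""
    s = min(max(s, 0.0), 1.0)
    return s * s * (3.0 - 2.0 * s)


def home_ramp(q_start, q_home, duration_s=3.0, rate_hz=100.0):
    """Yield joint targets that blend q_start into q_home over duration_s."""
    q_start = np.asarray(q_start, dtype=float)
    q_home = np.asarray(q_home, dtype=float)
    n_steps = max(int(round(duration_s * rate_hz)), 1)
    for k in range(1, n_steps + 1):
        w = ease_in_out(k / n_steps)  # blend weight rises smoothly from 0 to 1
        yield (1.0 - w) * q_start + w * q_home


# Example: ramp a 20-joint hand from 0.5 rad on every joint to all zeros.
targets = list(home_ramp(np.full(20, 0.5), np.zeros(20)))
```

The zero slope at both ends is what keeps the hand from jerking at the start and end of the move; a linear ramp would command a velocity step at both points.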

Notable changes

  • Core: InteractiveScene.refresh_visualizer() — FK-only viewer refresh (no physics integration; the mj_forward analogue) so kinematic deploy viewers don't get gravity-pulled.
  • Asset zoo: wuji_hand, wuji_hand_reorient, wuji_cube specs hoisted to module level in genelab.asset_zoo.wuji_hand → discoverable via genelab asset list/download; examples import them as the single source of truth.
  • Deps: deploy-hand extra pins wujihandpy==1.5.1; vision needs the system Hikvision MVS SDK (see deploy/README.md).
  • Quaternion/obs conventions, joint order, and the 6D goal-error encoding all match the GeneLab training side (obs_dim=207, 3-step history).
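
One convention worth spelling out is quaternion component order: the commit log below mentions a scipy-xyzw to mujoco-wxyz conversion in the ZMQ bridge. A minimal sketch of that reordering, with hypothetical function names (the bridge's real helpers are not shown in this PR):

```python
import numpy as np


def xyzw_to_wxyz(q):
    """Scalar-last (SciPy default) -> scalar-first (MuJoCo) component order."""
    return np.roll(np.asarray(q, dtype=float), 1, axis=-1)


def wxyz_to_xyzw(q):
    """Scalar-first (MuJoCo) -> scalar-last (SciPy default) component order."""
    return np.roll(np.asarray(q, dtype=float), -1, axis=-1)


identity_xyzw = np.array([0.0, 0.0, 0.0, 1.0])
identity_wxyz = xyzw_to_wxyz(identity_xyzw)
```

Both helpers operate on the last axis, so they also accept batches of shape `(N, 4)`.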

Testing

  • 144 deploy/asset/scene unit tests pass (frame math, obs builder, action processor, ONNX wrapper, ramp, asset discovery).
  • End-to-end verified: retrained a 207-dim reorient PPO (reward ~1050, success-threshold 0.2, success≈release), exported to ONNX, and ran play_real against the real ONNX — obs/action dims match, closed loop clean.
  • Hardware glue (MVS capture, real hand) is not exercised in CI.

🤖 Generated with Claude Code

KraHsu and others added 11 commits June 13, 2026 11:53
Port the wuji-mjlab/deploy/reorient pipeline onto the GeneLab (Genesis)
stack under examples/wuji/deploy. Two deliverables: reproduce the real
cube's pose inside the Genesis sim (real2sim), and run an exported ONNX
policy to control the (real or mock) Wuji hand.

Pure-numpy core is simulator- and hardware-agnostic, so all frame /
obs / action / policy logic runs and is unit-tested headlessly (31 tests):
- frame_transform, real2sim: camera->wrist-tag lift + tag->sim-world lift
- zmq_bridge: cube/goal pub-sub, scipy-xyzw<->mujoco-wxyz, last-valid cache
- obs: DeployObsBuilder (207-dim policy obs + 3-step history)
- action: ActionProcessor (offset + clamp + EMA + warmup)
- onnx_policy: ONNXPolicy (GeneLab exporter metadata format)
- hand_driver: HandDriverBase / MockHandDriver / WujiHandDriver (lazy)
- controller: DeployController (closed-loop step)
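
To make the "offset + clamp + EMA + warmup" chain concrete, here is a rough sketch of one way those four stages could compose. This is not the actual `ActionProcessor`: the class name `ActionSketch`, the stage order, the linear warmup gain, and every default value are assumptions for illustration.

```python
import numpy as np


class ActionSketch:
    """Hypothetical offset -> clamp -> EMA -> warmup chain for joint targets."""

    def __init__(self, default_pos, lower, upper,
                 scale=0.1, ema_alpha=0.5, warmup_steps=10):
        self.default_pos = np.asarray(default_pos, dtype=float)
        self.lower = np.asarray(lower, dtype=float)
        self.upper = np.asarray(upper, dtype=float)
        self.scale = scale
        self.ema_alpha = ema_alpha
        self.warmup_steps = warmup_steps
        self._prev = self.default_pos.copy()
        self._step = 0

    def __call__(self, action):
        # Offset: the policy output is a scaled delta around the default pose.
        raw = self.default_pos + self.scale * np.asarray(action, dtype=float)
        # Clamp to joint limits.
        raw = np.clip(raw, self.lower, self.upper)
        # EMA: low-pass the target to suppress step-to-step jitter.
        smoothed = self.ema_alpha * raw + (1.0 - self.ema_alpha) * self._prev
        # Warmup: fade the command in from the default pose (assumed linear gain).
        gain = 1.0 if self.warmup_steps <= 0 else min(self._step / self.warmup_steps, 1.0)
        target = self.default_pos + gain * (smoothed - self.default_pos)
        self._prev = smoothed
        self._step += 1
        return target


proc = ActionSketch(np.zeros(20), -np.ones(20), np.ones(20))
first_target = proc(np.ones(20))  # gain is 0 on the first call
```

With these assumptions the first command equals the default pose regardless of the policy output, which is the property a warmup stage is usually there to provide.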

Two correctness facts pinned by tests:
- the deploy obs needs no forward kinematics (observer reports the cube
  already in the tag frame); FK is only for the viewer
- the deploy 6D goal-error matches the GeneLab training encoding
  (matrix_to_rotation_6d = first two rows), verified numerically against
  the actual training math — differs from the wuji-mjlab deploy convention
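
The "first two rows" encoding can be sketched in a few lines of numpy. The relative-rotation convention in `goal_rot_err_6d` below (`R_goal @ R_cube.T`) is an assumption made for illustration; the PR states only the row-based 6D layout, not how the error rotation is formed.

```python
import numpy as np


def matrix_to_rotation_6d(R):
    """Flatten the first two ROWS of a (..., 3, 3) rotation matrix into (..., 6)."""
    R = np.asarray(R, dtype=float)
    return R[..., :2, :].reshape(*R.shape[:-2], 6)


def goal_rot_err_6d(R_cube, R_goal):
    """6D encoding of the goal-vs-cube rotation error (assumed: R_goal @ R_cube^T)."""
    R_err = np.asarray(R_goal) @ np.swapaxes(np.asarray(R_cube), -1, -2)
    return matrix_to_rotation_6d(R_err)


# 90-degree rotation about Z: rows and columns differ, so the layout matters.
Rz90 = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
rows_6d = matrix_to_rotation_6d(Rz90)   # [0, -1, 0, 1, 0, 0]
cols_6d = Rz90[:, :2].T.reshape(6)      # [0, 1, 0, -1, 0, 0]
```

The two vectors agree only for symmetric rotation matrices, so a policy trained on one layout sees scrambled goal errors if deployment uses the other. That is why pinning the layout with a test is worthwhile.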

Genesis/hardware glue (scripts/play_real, toreal_viewer,
cube_world_observer) is wired over the tested core but not run in CI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Completes the Genesis-native wuji-hand deploy stack into a full sim2real
pipeline (check → home → vision → calib_check → play_real), all faithful
ports of the wuji-mjlab `pixi run -e deploy ...` commands.

- hand_utils.py: `check` (read-only bridge sanity) + `home` (3s ease-in-out
  ramp, now baked into `WujiHandDriver.home(duration_s)`); `deploy-hand`
  extra pins `wujihandpy==1.5.1`.
- cube_world_observer.py: full Hikvision MVS port (multi-face ArUco board,
  SO3 Kalman + position low-pass + corner EMA, world auto-sampling, fast ROI,
  --preview) replacing the simplified stub; new camera_config.py / cube_geom.py
  / config/observer.yaml. Publishes the identical ZMQ schema via CubePublisher.
- calib_check.py: Genesis digital-twin calibration viewer (live hand encoders
  + observed cube). Adds core `InteractiveScene.refresh_visualizer()` (FK-only
  viewer refresh, no physics — the mj_forward analogue) and `_env.set_hand_joints`
  / single-env override so the cube isn't pulled by gravity.
- play_real.py: goal modes (external/fixed/random) + success monitor
  (geodesic < threshold, held) + resample-on-success; action params verified
  to match the training action term.
- asset zoo: all three wuji AssetSpecs (wuji_hand, wuji_hand_reorient,
  wuji_cube) hoisted to module level in genelab.asset_zoo.wuji_hand so
  `genelab asset list`/`download` discover them; examples import them as the
  single source of truth.
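
The "geodesic < threshold, held" success check can be sketched as follows. This is not the `play_real.py` implementation: the `SuccessMonitor` name, the consecutive-step hold counter, and `hold_steps=5` are assumptions, as is treating the 0.2 threshold as radians.

```python
import numpy as np


def rot_z(angle_rad):
    """Rotation matrix about +Z (helper for the example)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def geodesic_angle(R_a, R_b):
    """Angle (rad) of the relative rotation between two rotation matrices."""
    R_rel = R_a.T @ R_b
    cos_theta = (np.trace(R_rel) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


class SuccessMonitor:
    """Report success once the error stays under threshold for hold_steps in a row."""

    def __init__(self, threshold=0.2, hold_steps=5):
        self.threshold = threshold
        self.hold_steps = hold_steps
        self._held = 0

    def update(self, R_cube, R_goal):
        if geodesic_angle(R_cube, R_goal) < self.threshold:
            self._held += 1
        else:
            self._held = 0  # any excursion restarts the hold
        if self._held >= self.hold_steps:
            self._held = 0  # re-arm; the caller would resample the goal here
            return True
        return False
```

Requiring the error to hold, rather than firing on a single frame, keeps one noisy pose estimate from triggering a goal resample.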

Tested: 144 deploy/asset/scene tests pass; play_real verified end-to-end with
a real 207-dim ONNX exported from a freshly retrained reorient PPO (reward
~1050, success-threshold 0.2).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

play_real now opens a Genesis digital-twin viewer by default (`--viewer`,
`--no-viewer` for headless), mirroring the live hand (encoders) + observed cube
+ goal while the policy drives the real/mock hand — matching wuji-mjlab's
play_real MuJoCo mirror.

- `_SimMirror` reuses the calib_check pieces (build_reorient_env, set_hand_joints,
  set_cube_pose, refresh_visualizer) with deferred imports so `--no-viewer` stays
  numpy-only / headless.
- New `_env.set_goal_marker` poses the play-mode goal_marker at the current goal.
- `DeployController.step()` now returns `joint_pos` so the mirror reuses the read
  (no extra hardware poll).
- README: document --viewer/--no-viewer; fix `genelab export --out` flag.

Tested: controller + deploy tests pass; headless `--no-viewer` smoke clean with
the real 207-dim ONNX.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… bug)

The real hand twitched without manipulating the cube (0% success) while the same
policy scored 1.0 in `genelab eval`. Root cause: a joint-ordering mismatch.

Genesis orders the hand articulation JOINT-major (finger1..5_joint1, then
_joint2, ...), so the trained policy's obs and action are joint-major. But the
encoder / wujihandpy order (JOINT_NAMES_20) is finger-major. Deploy assumed they
were identical ("no remap"), so DeployObsBuilder fed joint_pos/joint_vel in the
wrong order and the action was written to the wrong joints → scrambled.

Fix:
- config: add POLICY_JOINT_NAMES (joint-major) + ENC_TO_POLICY permutation +
  default_joint_pos_policy(); correct the (wrong) "no remap" docstring.
- DeployController: optional enc_to_policy remap — encoder->policy on read
  (joint_pos/joint_vel), policy->encoder on write (target). Runs the obs/action
  in policy order with a policy-order default. Identity when unset (tests/back-compat).
- play_real: pass default_joint_pos_policy() + ENC_TO_POLICY.
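
The remap can be illustrated with a small, self-contained sketch. The joint names follow the `finger1..5_joint1` pattern quoted above with four joints per finger (20 total); the index convention (`ENC_TO_POLICY[i]` is the encoder index feeding policy slot `i`) and the helper names are assumptions, not necessarily what the config module uses.

```python
import numpy as np

# Encoder / wujihandpy order: finger-major (all joints of finger 1, then finger 2, ...).
ENC_NAMES = [f"finger{f}_joint{j}" for f in range(1, 6) for j in range(1, 5)]
# Policy / Genesis order: joint-major (joint 1 of every finger, then joint 2, ...).
POLICY_NAMES = [f"finger{f}_joint{j}" for j in range(1, 5) for f in range(1, 6)]

# ENC_TO_POLICY[i] = encoder index of the joint that sits in policy slot i.
ENC_TO_POLICY = np.array([ENC_NAMES.index(name) for name in POLICY_NAMES])


def enc_to_policy(x_enc):
    """Reorder an encoder-ordered vector (joint_pos, joint_vel) into policy order."""
    return np.asarray(x_enc)[ENC_TO_POLICY]


def policy_to_enc(x_policy):
    """Scatter a policy-ordered vector (action target) back into encoder order."""
    x_policy = np.asarray(x_policy)
    out = np.empty_like(x_policy)
    out[ENC_TO_POLICY] = x_policy
    return out


enc_reading = np.arange(20)  # stand-in for a finger-major encoder read
policy_view = enc_to_policy(enc_reading)
```

The write path has to apply the inverse permutation rather than the same gather; reusing the read-side gather on the action would scramble the targets a second time.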

Verified: a term-by-term parity harness (DeployObsBuilder vs the env's actual
policy obs) now matches to 5e-7 across all five terms (was joint_pos Δ=1.40);
new tests/test_examples_wuji_deploy_joint_order.py pins POLICY_JOINT_NAMES
against the built env so it can't drift. Also ruled out goal_rot_err_6d frame
(tag_w is identity) — only joint order was wrong. 37 deploy tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

sim2sim diagnosis: the GeneLab reorient policy scores 1.0 in Genesis but only
0.61 in mjlab's MuJoCo (vs mjlab's own policy = 1.0), failing by TIMEOUT (38%),
not drop — it holds the grasp but reorients too slowly under different contact
dynamics. The hand MJCF declares ZERO joint frictionloss, so the policy never
learns to overcome the real hand's static friction and under-drives on hardware.

Add `genelab.mdp.dr.dof_frictionloss`: sets a joint dry-friction baseline via
Genesis `set_dofs_frictionloss`. NOTE Genesis frictionloss is a non-batched
(global, shared-across-envs) dof property — unlike kp/kv it can't be per-env — so
this is a fixed baseline, not per-env DR. Wired into the reorient training DR
(`friction=0.01`, startup, training-only). Experiment to test the stiction
hypothesis; validate by retrain + sim2sim_mjlab.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…0.52)

The joint-frictionloss experiment failed: adding a fixed global stiction baseline
dropped mjlab sim2sim from 0.61 to 0.52 (timeouts 38%->46%). Genesis frictionloss
is a global (non-per-env) dof property, so a fixed value is not real DR — it just
shifts the Genesis overfit point further from MuJoCo. The transfer gap is the
Genesis<->MuJoCo contact dynamics, which Genesis cannot per-env randomize.

Removed the event from the reorient recipe (back to the 0.61 recipe); kept the
mdp.dr.dof_frictionloss primitive + its docstring documenting the global limitation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rient sim2real

Correction to the earlier "Genesis can't do per-env dof DR" conclusion: it CAN —
it's gated by RigidOptions.batch_dofs_info / batch_links_info (default False), which
GeneLab never exposed or enabled. With them off, Genesis stores dof model params
(kp/kv/frictionloss/damping/armature) shared across the batch, so per-env writes
silently no-op — which meant `randomize_joint_stiffness_damping` (pd_gains DR) was
DEAD for the implicit-PD reorient hand the whole time.

- configs.SimulationCfg: expose `batch_dofs_info` / `batch_links_info` -> RigidOptions.
- reorient training: enable both (training-only; eval/play uses nominal params).
- mdp.dr.dof_frictionloss: rewrite to real PER-ENV sampling (was a global baseline
  that hurt sim2sim; per-env is the proper DR). friction_range=(0.0, 0.02).

Verified on the built env: frictionloss AND kp now vary per env (pd_gains DR active).
This is a much stronger DR recipe than the 0.61 run (dead pd_gains + no stiction).
Retrain + sim2sim_mjlab to measure.
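
As a rough illustration of what per-env sampling means here (as opposed to the earlier single global baseline), the sketch below draws an independent frictionloss value per environment and per dof. It is numpy-only and does not call Genesis; `sample_dof_frictionloss` is a hypothetical name, and drawing a separate value for every dof (rather than one scalar per env) is an assumption.

```python
import numpy as np


def sample_dof_frictionloss(n_envs, n_dofs, friction_range=(0.0, 0.02), rng=None):
    """Draw an (n_envs, n_dofs) array of dry-friction values, uniform in the range."""
    rng = np.random.default_rng() if rng is None else rng
    low, high = friction_range
    return rng.uniform(low, high, size=(n_envs, n_dofs))


# Example: 4096 parallel envs, 20 hand dofs, range from this commit.
values = sample_dof_frictionloss(4096, 20, rng=np.random.default_rng(0))
```

Per this commit, an array like this only takes effect per env when `batch_dofs_info` is enabled; with the flag off the dof parameters are shared across the batch and the per-env write is silently dropped.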

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…orient sim2real)

Building on the per-env DR win (sim2sim 0.61->0.77 after enabling batch_dofs_info):
- mdp.dr.dof_armature: per-env multiplicative armature DR (scale 0.75-1.3, mjlab
  parity), now possible with batch_dofs_info. Reads nominal via get_dofs_armature.
- reorient DR: add dof_armature; widen dof_frictionloss range 0.02 -> 0.03.

Note: per-env CONTACT-compliance DR (solref/solimp) is confirmed impossible in this
Genesis version (geom sol_params are global / batched=False, no batch_geoms_info), so contact
is the structural ceiling; these dof/link-side per-env DR additions are the achievable
lever. Retrain + sim2sim to measure (expect a modest bump over 0.77).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…andomization

Found via the real-hand 10-deg down-tilt symptom: mjlab randomizes the hand mount
PITCH every episode (reset_root_state pose_range pitch (-0.4, 0.1) rad ~ -23..+6 deg),
so its policy is robust to a tilted mount; GeneLab had NO root DR (fixed-base, pitch
locked at 0), so the real 10-deg tilt is out-of-distribution.

Genesis refuses per-env orientation on a fixed-base link ("Impossible to set
env-specific quat for fixed links with at least one geometry"), so we can't tilt the
hand per env. Instead tilt GRAVITY per env — same gravity-in-palm physics, hand stays
fixed-base, and the wrist-tag world frame (tag_w == identity) is preserved so the deploy
obs pipeline is unchanged.

- mdp.dr.gravity_tilt: per-env gravity-direction DR (random polar angle 0..max_tilt in a
  random azimuth, via rigid_solver.set_gravity(..., envs_idx=...) — already a per-env
  solver field, no batch flag needed).
- reorient training: reset-mode gravity_tilt, max_tilt_rad=0.4 (covers the real ~10 deg
  any-azimuth). Verified per-env tilt varies 0..~21 deg.
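
The sampling described above (random polar angle in a random azimuth) can be sketched in plain numpy. This does not call the Genesis solver; `sample_tilted_gravity` is a hypothetical name, and the uniform distribution over the polar angle and the 9.81 m/s^2 magnitude are assumptions.

```python
import numpy as np


def sample_tilted_gravity(n_envs, max_tilt_rad=0.4, g=9.81, rng=None):
    """Return (n_envs, 3) gravity vectors tilted up to max_tilt_rad away from -Z."""
    rng = np.random.default_rng() if rng is None else rng
    polar = rng.uniform(0.0, max_tilt_rad, size=n_envs)    # tilt away from straight down
    azimuth = rng.uniform(0.0, 2.0 * np.pi, size=n_envs)   # direction of the tilt
    direction = np.stack(
        [
            np.sin(polar) * np.cos(azimuth),
            np.sin(polar) * np.sin(azimuth),
            -np.cos(polar),
        ],
        axis=-1,
    )
    return g * direction  # magnitude stays g; only the direction changes


gravity = sample_tilted_gravity(1024, rng=np.random.default_rng(0))
```

With `max_tilt_rad=0.4` the cone half-angle is about 22.9 degrees, which covers the roughly 10 degree tilt seen on the real mount in any azimuth.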

Note: sim2sim_mjlab evaluates a LEVEL hand, so its score may not move much (or may dip
slightly from the broader DR); the real payoff is robustness on the tilted hardware. Retrain + real test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… robustness

sim2sim_mjlab evaluates a level hand, so it can't show the payoff of the gravity-tilt
DR (which targets a tilted mount). Add --gravity-tilt <deg>: tilts the eval scene's
gravity about +X (same gravity-in-palm effect as pitching the hand) and persists across
resets via model.opt.gravity.
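
For reference, rotating the nominal gravity vector about +X by a given angle looks like the sketch below. The function name, the sign of the rotation, and the 9.81 m/s^2 magnitude are assumptions; the actual flag handling in sim2sim_mjlab is not reproduced here.

```python
import numpy as np


def gravity_tilted_about_x(tilt_deg, g=9.81):
    """Rotate the nominal gravity (0, 0, -g) about +X by tilt_deg degrees."""
    t = np.deg2rad(tilt_deg)
    rot_x = np.array(
        [
            [1.0, 0.0, 0.0],
            [0.0, np.cos(t), -np.sin(t)],
            [0.0, np.sin(t), np.cos(t)],
        ]
    )
    return rot_x @ np.array([0.0, 0.0, -g])


g_level = gravity_tilted_about_x(0.0)   # (0, 0, -9.81)
g_tilt = gravity_tilted_about_x(10.0)   # (0, g*sin(10 deg), -g*cos(10 deg))
```

Writing the resulting vector into `model.opt.gravity`, as the commit describes, changes the model rather than the per-episode state, which is why the tilt survives resets.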

Baseline measured: the 0.89 level-trained policy (DRv2) drops to 0.79 at 10deg tilt
(timeout 11%->19%) — reproduces the real-hand tilt symptom in sim. The gravity-tilt-DR
policy (DRv3) should recover most of that.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ning stalled)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>