feat(e2e): complete E2E v2 suite — 66 specs, orchestrator, bug fixes#2353
Draft
YellowSnnowmann wants to merge 24 commits into
Add three new E2E specs covering the complete tool-call pipeline:
- chat-tool-call-flow: single web_fetch round, timeline entry, IN_FLIGHT drain
- chat-multi-tool-round: sequential file_read + grep, 3-turn LLM loop
- chat-tool-error-recovery: mid-stream error surfacing, composer re-enable, recovery send
…versation history
Add three new E2E specs covering real user workflows:
- user-journey-full-task: login → chat → web_fetch tool call → result → navigate away + back
- user-journey-settings-round-trip: every major settings panel loads without blank screens
- chat-conversation-history: multi-turn memory verified via message context inspection and disk persistence
Add two new E2E specs covering navigation quality:
- navigation-smoothness: 8-route cycle run twice (normal + rapid), blank-screen char-count guard
- navigation-settings-panels: all 8 settings sub-panels visited individually (N2.1-N2.9)
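The blank-screen char-count guard mentioned above could look roughly like the sketch below. The function name and the threshold of 20 characters are assumptions for illustration; the source does not state the value the spec actually uses.

```typescript
// Hypothetical threshold: the real spec's minimum character count is not given in the source.
const MIN_VISIBLE_CHARS = 20;

/**
 * Returns true when a route rendered too little visible text to count as a real page.
 * `bodyText` is expected to be the text content of the page root.
 */
function isBlankScreen(bodyText: string, minChars: number = MIN_VISIBLE_CHARS): boolean {
  // Strip all whitespace so padding and newlines do not inflate the count.
  const visible = bodyText.replace(/\s+/g, "");
  return visible.length < minChars;
}
```

A spec would call this after each route change and fail the step when it returns true.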
Wire all 8 new specs into the sequential flow runner under three sections:
- Chat & agent harness: chat-tool-call, chat-multi-tool, chat-error-recovery
- User journeys: journey-full-task, journey-settings, chat-history
- Navigation & core UI: navigation-smoothness, navigation-settings
… case
`test_reset` now sets `onboarding_completed=false` (in addition to `chat_onboarding_completed=false`) to faithfully mirror a fresh install.
Also fixes `ConversationStore::get_messages` returning an I/O error for threads whose JSONL file hasn't been written yet; it now returns `[]` instead. Adds a regression test for the empty-thread case.
…ck all specs
test_reset (fixed above) now sets onboarding_completed=false.
App.tsx's onboarding gate reads this flag: when false it redirects
every session to /onboarding, causing every spec that depends on /home
to fail. Call config_set_onboarding_completed({value:true}) immediately
after a successful wipe so the gate routes to /home as expected.
Adds retry logic for the auth bypass if the home page isn't reached on the first attempt.
…adSlice promise
AddAccountModal: add data-testid on the modal root and each provider button so accounts-provider-modal.spec.ts can target them precisely.
Accounts page: add data-testid on page root and add-button rail icon.
threadSlice: fire-and-forget generateThreadTitleIfNeeded via .catch() rather than try/catch to avoid an uncaught rejection on async dispatch.
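The threadSlice change rests on a general property of promises: a try/catch around a call that is not awaited cannot see a later rejection, whereas a `.catch()` attached to the returned promise can. A minimal sketch follows, with `failingTitleTask`, `fireAndForget`, and `handledErrors` as hypothetical stand-ins for the real dispatch code:

```typescript
// Hypothetical stand-in for generateThreadTitleIfNeeded: always rejects.
async function failingTitleTask(): Promise<string> {
  throw new Error("title generation failed");
}

// Collects errors that were handled, so the behaviour can be observed.
const handledErrors: string[] = [];

/**
 * Starts a task without awaiting it. The rejection is handled by the attached
 * .catch(); a surrounding try/catch would only catch synchronous throws.
 */
function fireAndForget(task: () => Promise<unknown>): void {
  task().catch((err: unknown) => {
    handledErrors.push(err instanceof Error ? err.message : String(err));
  });
}
```

The handler runs asynchronously, so `handledErrors` is still empty immediately after `fireAndForget(failingTitleTask)` returns; this is also why an assertion on the side effect needs to wait rather than check synchronously.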
…t API shape
shared-flows: add openAddAccountModal, waitForAccountsPage, clickAddAccountProvider, waitForAddAccountModalClosed, navigateToSkills, and waitForHomePage for the new accounts-provider-modal and journey specs.
mock-api socket/core + websocket: update socket handler namespace to match current RPC event shape (openhuman.* prefix, correct field names).
mock-api state: seed composio/webhook state keys for provider specs.
root package.json: add test:e2e and test:e2e:flows convenience aliases.
…il/--skip-preflight flags
Replaces the old per-spec runner with a single master orchestrator that:
- Groups all 66 specs into 11 suites (auth, navigation, chat, skills, notifications, webhooks, providers, payments, settings, system, journeys)
- Accepts --suite=<name> to run a single suite; --bail stops on first suite failure
- Accepts --skip-preflight to bypass environment checks
- Removes OPENHUMAN_SERVICE_MOCK=1 from the service-connectivity invocation: the old sidecar service model was removed in PR tinyhumansai#1061, so the spec now auto-skips via its own guard rather than running against a dead mock
- Captures per-spec exit codes and prints a summary table at the end
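The orchestrator itself is a shell script, so the following is only a TypeScript sketch of the flag semantics described above, not the script's actual logic. `parseRunnerArgs` and `RunOptions` are hypothetical names, and the choice that a later `--suite=` overrides an earlier one is an assumption.

```typescript
// Suite names as listed in the commit message.
const SUITES = [
  "auth", "navigation", "chat", "skills", "notifications", "webhooks",
  "providers", "payments", "settings", "system", "journeys",
] as const;
type Suite = (typeof SUITES)[number];

interface RunOptions {
  suites: Suite[];        // suites to run, in order
  bail: boolean;          // stop on first suite failure
  skipPreflight: boolean; // bypass environment checks
}

function parseRunnerArgs(argv: string[]): RunOptions {
  // Default: run every suite, no bail, preflight enabled.
  const options: RunOptions = { suites: [...SUITES], bail: false, skipPreflight: false };
  for (const arg of argv) {
    if (arg === "--bail") {
      options.bail = true;
    } else if (arg === "--skip-preflight") {
      options.skipPreflight = true;
    } else if (arg.startsWith("--suite=")) {
      const name = arg.slice("--suite=".length);
      if ((SUITES as readonly string[]).indexOf(name) === -1) {
        throw new Error(`Unknown suite: ${name}`);
      }
      options.suites = [name as Suite]; // assumption: last --suite wins
    } else {
      throw new Error(`Unknown flag: ${arg}`);
    }
  }
  return options;
}
```

Rejecting unknown flags and suite names up front keeps a typo such as `--suite=chats` from silently running nothing.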
navigation-settings-panels + user-journey-settings-round-trip: routes corrected to match Settings.tsx:
- /settings/account → /settings
- /settings/channels → /settings/connections
- /settings/data → /settings/memory-data
- /settings/ai-skills → /settings/intelligence
- /settings/advanced → /settings/developer-options
- /settings/dev → /settings/appearance
- /settings/features → /settings/tools
insights-dashboard: IntelligenceMemoryTab was removed; replace assertions on #actionable-search / #actionable-source with [data-testid="memory-workspace"] and [data-testid="memory-actions"] from the current MemoryWorkspace component.
screen-intelligence: panel title renamed from 'Screen Intelligence' to 'Screen Awareness' (i18n key settings.features.screenAwareness).
onboarding-modes: resetApp now restores onboarding_completed=true; spec must explicitly set it back to false to test the onboarding flow.
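The route corrections above amount to a lookup table. A sketch of how a spec helper might encode them is shown below; the route pairs come from the commit message, while `SETTINGS_ROUTE_RENAMES` and `currentSettingsRoute` are hypothetical names.

```typescript
// Old spec route -> route currently defined in Settings.tsx (per the commit message).
const SETTINGS_ROUTE_RENAMES: Record<string, string> = {
  "/settings/account": "/settings",
  "/settings/channels": "/settings/connections",
  "/settings/data": "/settings/memory-data",
  "/settings/ai-skills": "/settings/intelligence",
  "/settings/advanced": "/settings/developer-options",
  "/settings/dev": "/settings/appearance",
  "/settings/features": "/settings/tools",
};

/** Maps a stale route to its current equivalent; unknown routes pass through unchanged. */
function currentSettingsRoute(route: string): string {
  return SETTINGS_ROUTE_RENAMES[route] ?? route;
}
```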
… patterns
Replace hardcoded browser.pause() calls with waitUntil() in
auth-access-control. Add explicit auth setup and mock server lifecycle
to logout-relogin-onboarding, notifications, slack-flow, whatsapp-flow.
composio-triggers-flow: tighten RPC result unwrapping to handle both
{result:{result:...}} and {result:...} response shapes.
tool-filesystem-flow: resolve relative paths inside tmp workspace;
guard path-sensitive assertions against sandbox restrictions.
rewards specs: correct progress assertion thresholds after points model
update. navigation + tauri-commands: add missing mock server lifecycle
hooks. settings-account-preferences: fix selector after label rename.
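The composio-triggers-flow unwrapping can be sketched as a small pure function that accepts both response shapes. `unwrapRpcResult` and `isRecord` are hypothetical names, and the sketch assumes a payload never legitimately carries its own top-level `result` field, since that case would be indistinguishable from the doubly wrapped shape.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

/**
 * Accepts either {result: {result: payload}} or {result: payload}
 * and returns the payload in both cases.
 */
function unwrapRpcResult(response: unknown): unknown {
  if (!isRecord(response) || !("result" in response)) {
    throw new Error("RPC response has no result field");
  }
  const outer = response.result;
  // Doubly wrapped shape: peel one more layer.
  if (isRecord(outer) && "result" in outer) {
    return outer.result;
  }
  return outer;
}
```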
… React state
The previous approach (native HTMLTextAreaElement prototype setter + synthetic input/change events via browser.execute) does not update React's controlled inputValue state in the CEF renderer: the events fire, but React's synthetic onChange handler never sees a value change, leaving the composer empty and the send button permanently disabled.
Fix: focus the textarea via JS (avoids a coordinate-based click that gets intercepted by AppUpdatePrompt at z-[9998]), select-all existing content, then send the text as real OS-level keyboard events via browser.keys(). These go through CDP Input.dispatchKeyEvent → Chromium input pipeline → React's onChange → inputValue state update → send button enabled.
…+ add auth
The synthetic KeyboardEvent dispatched via browser.execute() does not
reliably reach window capture-phase listeners in the Appium Chromium
(CDP) driver. Replace dispatchKey with browser.action('key') which maps
to CDP Input.dispatchKeyEvent — a real key event in Chromium's input
pipeline that hotkeyManager's capture listener sees correctly.
Falls back to synthetic dispatch if the Actions API throws.
Also adds startMockServer + resetApp to before/after hooks: CommandProvider
(which mounts the mod+K listener) lives inside the auth-gated provider
chain and does not mount without a valid session token.
…rkflow
Upload WDIO spec result artifacts on failure so CI logs are accessible without re-running. Add a job summary step that surfaces pass/fail counts directly in the GitHub Actions job summary view.
accounts-provider-modal.spec.ts: asserts all 6 exposed account provider tiles appear in the picker, hidden providers (google-meet, zoom) are absent, and each provider can be registered via picker interaction.
rpc-preflight.ts: validates RPC methods against the live core before the suite runs to catch ghost RPC calls (like removed skills runtime methods) early rather than mid-suite.
e2e-preflight.sh: environment sanity checks (bundle, Appium, ports).
docs/e2e-status.md + e2e-audit-2026-05.md: living tracking docs for the 66-spec suite status and root-cause audit findings.
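The core of the RPC preflight check is a set difference between the methods the specs call and the methods the live core exposes. A minimal sketch, where `findGhostMethods` is a hypothetical name and how the live method list is fetched is left out because the source does not describe it:

```typescript
/**
 * Returns every method the specs require that the live core does not expose.
 * An empty array means the preflight passes.
 */
function findGhostMethods(
  required: readonly string[],
  available: readonly string[],
): string[] {
  const live = new Set(available);
  return required.filter((method) => !live.has(method));
}
```

Usage might look like this, with made-up method names used purely for illustration:

```typescript
const ghosts = findGhostMethods(
  ["openhuman.example_a", "openhuman.example_removed"], // hypothetical names
  ["openhuman.example_a", "openhuman.example_b"],
);
if (ghosts.length > 0) {
  console.error(`Preflight failed, unknown RPC methods: ${ghosts.join(", ")}`);
}
```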
…allback
composerSendDecision.ts blocks every send with 'socket_disconnected' when
the Socket.IO connection to the in-process Rust core is not yet up. In
practice this produces the visible error toast
"Realtime socket is not connected — responses cannot be delivered
without a client ID."
and causes ALL chat-harness specs to fail.
Changes:
- chat-harness.ts: add waitForSocketConnected(timeoutMs=30_000) that polls
window.__OPENHUMAN_STORE__ until socket.byUser[*].status === 'connected'.
- chat-harness.ts: fix clickSend() fallback — extend primary clear-wait
from 1 s to 5 s (addMessageLocal does a Rust RPC before setInputValue('')
so the composer can take 100–500 ms to clear) and replace the coordinate-
based composer.click() fallback with a JS el.focus() call to avoid the
AppUpdatePrompt overlay (z-[9998]) intercepting the click.
- All 10 chat + user-journey specs: import waitForSocketConnected and call
it with a warn-if-false guard before the first clickSend().
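The polling helper described above can be sketched independently of WebdriverIO. In this sketch `readSocket` is a hypothetical callback standing in for however the real helper reads `window.__OPENHUMAN_STORE__`, and the 250 ms poll interval is an assumption; only the `socket.byUser[*].status === 'connected'` condition and the 30 s default come from the commit message.

```typescript
interface SocketSlice {
  byUser: Record<string, { status: string }>;
}

/** True when at least one user entry reports a connected socket. */
function anySocketConnected(socket: SocketSlice | undefined): boolean {
  const byUser = socket?.byUser ?? {};
  return Object.keys(byUser).some((id) => byUser[id]?.status === "connected");
}

/**
 * Polls until a socket is connected or the timeout elapses.
 * Resolves to false on timeout so callers can warn instead of throwing.
 */
async function waitForConnected(
  readSocket: () => SocketSlice | undefined,
  timeoutMs = 30_000,
  intervalMs = 250, // assumed poll interval
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (anySocketConnected(readSocket())) return true;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return anySocketConnected(readSocket());
}
```

Returning a boolean rather than throwing matches the warn-if-false guard the specs use before the first `clickSend()`.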
socketService.getSocketUserId() was changed (3aa8477) to use auth.userId from the core state snapshot, but selectSocketUserId still parsed the JWT token. The two derivations produced different keys (e.g. "user-123" vs the JWT sub claim), so selectSocketStatus returned "disconnected" even when the socket was connected, blocking all chat sends with "socket_disconnected".
Use the same auth.userId source in both paths.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
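The failure mode is a keyed lookup in which writer and reader derive the key differently. The sketch below uses a simplified, assumed state shape (`AppState`) and a hypothetical helper `statusForKey`; the selector names mirror the commit message, but their real implementations are not shown in the source.

```typescript
type SocketStatus = "connected" | "disconnected";

// Simplified, assumed state shape for illustration.
interface AppState {
  auth: { userId: string | null };
  socket: { byUser: Record<string, { status: SocketStatus }> };
}

/** Looks up socket status under a given key; a missing key reads as disconnected. */
function statusForKey(state: AppState, key: string | null): SocketStatus {
  if (key === null) return "disconnected";
  return state.socket.byUser[key]?.status ?? "disconnected";
}

// After the fix: the selector uses the same source the socket service writes under.
const selectSocketUserId = (state: AppState): string | null => state.auth.userId;
const selectSocketStatus = (state: AppState): SocketStatus =>
  statusForKey(state, selectSocketUserId(state));

const exampleState: AppState = {
  auth: { userId: "user-123" },
  socket: { byUser: { "user-123": { status: "connected" } } },
};
```

With `exampleState`, `selectSocketStatus` finds the entry written under `"user-123"`, while a lookup under any other key (such as a value parsed from the JWT) falls through to `"disconnected"` even though the socket is up.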
…ltiple specs
- Consolidated import statements in reset-app.ts and rpc-preflight.ts for better readability.
- Enhanced formatting of timeout configurations in auth-access-control.spec.ts for consistency.
- Streamlined object definitions in various specs to improve clarity and maintainability.
- Updated console log statements to ensure consistent formatting across navigation and chat specs.
- Minor adjustments to ensure better alignment with coding standards and improve overall code quality.
…osio-triggers-flow specs
…fix threadSlice async assertion
Socket selector tests were still keying state by JWT-parsed tgUserId, but selectSocketUserId now reads auth.userId directly. Thread title assertion raced against a fire-and-forget dispatch; use vi.waitFor().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, and lint/format fixes
Align all E2E specs with updated helper APIs (shared-flows, app-helpers), fix unused variable lint errors in settings-data-management and settings-feature-preferences, and apply Prettier formatting across remaining spec files. Update e2e-run-all-flows and e2e-run-session scripts for the revised spec set.
# Conflicts:
#	app/test/e2e/helpers/shared-flows.ts
#	src/openhuman/test_support/rpc.rs
Summary

Rewrites and massively expands the E2E test suite from a handful of specs to a full 66-spec suite organized into 11 suites, with a new orchestrator, hardened test helpers, production bug fixes discovered during E2E work, and CI improvements.

New specs (11 new spec files)

- chat-tool-call-flow
- chat-multi-tool-round
- chat-tool-error-recovery
- user-journey-full-task
- user-journey-settings-round-trip
- chat-conversation-history
- navigation-smoothness
- navigation-settings-panels
- accounts-provider-modal
- screen-intelligence

Orchestrator rewrite (`e2e-run-all-flows.sh`)

- `--suite=<name>` to run a single suite; `--bail` stops on first failure
- `--skip-preflight` to bypass environment checks

Production bug fixes

- `selectSocketUserId` / socket status mismatch: `selectSocketUserId` parsed the JWT while `socketService` used `auth.userId`, producing different keys → `selectSocketStatus` always returned `"disconnected"` → all chat sends blocked. Aligned both to use `auth.userId`.
- `ConversationStore::get_messages` I/O error on empty threads: returned an I/O error for threads whose JSONL file hadn't been written yet. Now returns `[]`. Regression test added.
- `test_reset` missing `onboarding_completed` flag: didn't clear `onboarding_completed`, causing post-reset sessions to skip the onboarding gate. Fixed, and the flag is restored after the wipe.
- `threadSlice` uncaught rejection: `generateThreadTitleIfNeeded` async dispatch was wrapped in try/catch instead of `.catch()`, causing an uncaught promise rejection.

Test infrastructure improvements

- `waitForSocketConnected()` polls the store until the socket is connected before any send; `typeIntoComposer` uses native OS keyboard events via the WebDriver Actions API (CDP `Input.dispatchKeyEvent`) instead of synthetic React events; `clickSend()` has an extended clear-wait + JS focus fallback to avoid `AppUpdatePrompt` overlay interception
- Mock API: socket handlers updated to the `openhuman.*` RPC event shape; seeded composio/webhook state keys
- shared-flows helpers: `openAddAccountModal`, `waitForAccountsPage`, `clickAddAccountProvider`, `navigateToSkills`, `waitForHomePage`
- RPC preflight (`rpc-preflight.ts`): validates RPC methods against live core before suite runs
- Environment preflight (`e2e-preflight.sh`): bundle, Appium, port sanity checks
- `data-testid` selectors added to `AddAccountModal` and `Accounts` page

Existing spec fixes

- Settings routes corrected (`/settings/account` → `/settings`, `/settings/channels` → `/settings/connections`, etc.)
- `insights-dashboard`: replaced removed `IntelligenceMemoryTab` selectors with current `MemoryWorkspace` `data-testid`s
- `screen-intelligence`: panel title updated to 'Screen Awareness'
- Hardcoded pauses replaced (`waitUntil` instead of `browser.pause`)

CI

- `test:e2e` and `test:e2e:flows` convenience scripts in root `package.json`

Docs

- `docs/e2e-status.md`: living tracker for 66-spec suite status
- `docs/e2e-audit-2026-05.md`: root-cause audit findings from May 2026

Test plan

- `./app/scripts/e2e-run-all-flows.sh`
- `--suite=chat`, `--suite=navigation`, `--suite=auth`