feat(OBSERV-02): KPI analytics dashboard — active users, conversations & messages#1722

Draft
florian-muller wants to merge 13 commits into swift from add-kpi-dashboard

florian-muller (Contributor) commented on Jun 11, 2026

Summary

  • KPI middleware (api.request_latency_ms): already shipped — drives active users metrics
  • Session instrumentation: emit session.created_total at session creation with dims team_id, scope_type (personal/team), agent_instance_id, and user attribution via actor
  • Agent turn attribution: fix agent.turn_completed and agent.turn_error_total actor from system to human so turns carry user_id for analytics
  • New preset endpoints:
    • GET /kpi/presets/sessions_over_time — new conversations per time bucket
    • GET /kpi/presets/messages_over_time — agent turns per time bucket
  • Analytics dashboard (/admin/analytics):
    • Users section: unique users stat card (from dedicated preset) + active users over time chart
    • Conversations & Messages section: two rows with alternating card/chart layout
      • Conversations: line chart → total stat card (summed client-side)
      • Messages: total stat card → line chart
    • KpiSection + KpiRow reusable layout molecules (compactFirst / compactLast props)
    • Time range selector and refresh button apply to all charts simultaneously

Test plan

  • Start control-plane and hit GET /control-plane/v1/kpi/presets/sessions_over_time — returns {"rows": [...], "interval": "1d", ...}
  • Hit GET /control-plane/v1/kpi/presets/messages_over_time — same shape
  • Create a session via the frontend — verify a session.created_total KPI event appears in OpenSearch with dims.user_id set
  • Run an agent turn — verify agent.turn_completed event has dims.actor_type = "human" and dims.user_id set (was "system" before)
  • Load /admin/analytics — three rows render: unique users, conversations, messages
  • Change the time range: every chart and every stat card updates
  • Stat cards are fixed-width (200 px); charts take remaining space
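
The first two test-plan steps can be scripted. The following is a minimal sketch, not part of this PR: the helper names `preset_url` and `looks_like_time_series` and the host value are hypothetical, while the base path and the `rows`/`interval` keys are the ones quoted in the test plan. It assumes the endpoints accept ISO 8601 `since`/`until` query parameters, as the preset router commit describes.

```python
from datetime import datetime
from urllib.parse import urlencode

# Path prefix as written in the test plan above.
BASE_PATH = "/control-plane/v1/kpi/presets"


def preset_url(host: str, preset: str, since: datetime, until: datetime) -> str:
    """Build a preset URL with ISO 8601 since/until query parameters."""
    query = urlencode({"since": since.isoformat(), "until": until.isoformat()})
    return f"{host}{BASE_PATH}/{preset}?{query}"


def looks_like_time_series(payload: dict) -> bool:
    """Check only the two keys the test plan names: rows and interval."""
    return isinstance(payload.get("rows"), list) and isinstance(
        payload.get("interval"), str
    )
```

The shape check is deliberately loose: the test plan shows further keys elided with `...`, so only `rows` and `interval` are asserted.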

Replace per-backend ad-hoc KPI wiring with a shared KpiObservabilityConfig
in fred-core, covering log, Prometheus, and OpenSearch sinks independently.
Add build_kpi_writer() factory so all three backends use identical sink
stacking logic. Prod defaults enable Prometheus + OpenSearch; local dev
enables log only.
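
A minimal sketch of the sink-stacking idea in this commit message. The field layout mirrors `observability.kpi.{log,prometheus,opensearch}`, but the `SinkToggle` and `KpiWriter` classes, the `factories` argument, and the signature of `build_kpi_writer()` are assumptions made for illustration and may differ from the fred-core code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Sink = Callable[[dict], None]  # hypothetical: a sink is anything that accepts an event


@dataclass
class SinkToggle:
    enabled: bool = False


@dataclass
class KpiObservabilityConfig:
    # One independent toggle per sink, all off unless stated otherwise.
    log: SinkToggle = field(default_factory=SinkToggle)
    prometheus: SinkToggle = field(default_factory=SinkToggle)
    opensearch: SinkToggle = field(default_factory=SinkToggle)


class KpiWriter:
    """Fans each event out to every stacked sink, in order."""

    def __init__(self, sinks: List[Sink]) -> None:
        self.sinks = sinks

    def emit(self, event: dict) -> None:
        for sink in self.sinks:
            sink(event)


def build_kpi_writer(
    config: KpiObservabilityConfig, factories: Dict[str, Callable[[], Sink]]
) -> KpiWriter:
    """Instantiate only the enabled sinks, so every backend stacks them identically."""
    order = ("log", "prometheus", "opensearch")
    sinks = [factories[name]() for name in order if getattr(config, name).enabled]
    return KpiWriter(sinks)


# Defaults described above: prod uses Prometheus + OpenSearch, local dev uses log only.
PROD_DEFAULTS = KpiObservabilityConfig(
    prometheus=SinkToggle(True), opensearch=SinkToggle(True)
)
LOCAL_DEFAULTS = KpiObservabilityConfig(log=SinkToggle(True))
```

Building sinks lazily through factories means a disabled OpenSearch sink is never constructed, which is also what keeps offline test fixtures from opening sockets.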
florian-muller self-assigned this on Jun 11, 2026
…ntator

Add a shared FastAPI/Starlette middleware in fred-core that emits
api.request_latency_ms for every request. Mounted in control-plane,
knowledge-flow, and agent pods. Supports eager and lazy (lifespan-safe)
KPIWriter resolution; skips health/readiness probes; tags unauthenticated
calls as actor_subtype=anonymous. Removes prometheus_fastapi_instrumentator
from knowledge-flow-backend.

Also includes the KPI-ANALYTICS-RFC describing the full analytics architecture.
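
The behaviour described above can be sketched as a plain ASGI middleware with no framework dependency. This is an illustration under stated assumptions, not the fred-core implementation: the class name `KpiLatencyMiddleware`, the probe paths in `SKIP_PATHS`, the writer being a simple callable, and the convention that an upstream auth layer stores `user_id` in `scope["state"]` are all hypothetical.

```python
import time
from typing import Callable, Optional

SKIP_PATHS = {"/healthz", "/ready"}  # assumed probe paths

Writer = Callable[[dict], None]  # hypothetical writer interface


class KpiLatencyMiddleware:
    def __init__(
        self,
        app,
        writer: Optional[Writer] = None,
        writer_factory: Optional[Callable[[], Writer]] = None,
    ) -> None:
        self.app = app
        self._writer = writer  # eager: writer already built
        self._writer_factory = writer_factory  # lazy: built on first request

    def _resolve_writer(self) -> Optional[Writer]:
        # Lazy resolution lets the writer be created during lifespan startup,
        # after the middleware itself has been mounted.
        if self._writer is None and self._writer_factory is not None:
            self._writer = self._writer_factory()
        return self._writer

    async def __call__(self, scope, receive, send) -> None:
        if scope["type"] != "http" or scope["path"] in SKIP_PATHS:
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            # Assumption: an auth layer put user_id into scope["state"].
            user_id = scope.get("state", {}).get("user_id")
            if user_id:
                dims = {"actor_type": "human", "user_id": user_id}
            else:
                dims = {"actor_subtype": "anonymous"}
            writer = self._resolve_writer()
            if writer is not None:
                writer(
                    {"name": "api.request_latency_ms", "value": elapsed_ms, "dims": dims}
                )
```

Emitting in a `finally` block means a request that raises still produces a latency event.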
florian-muller changed the title from "feat(observability): unify KPI sink config across all backends" to "feat(observability): add KPIMiddleware and unify KPI sink config across all backends" on Jun 11, 2026
- Fix ruff import-order in fred-core kpi/__init__.py and fred-runtime agent_app.py
- Fix ruff format in knowledge-flow tests/core/test_main_worker.py
- Add get_kpi_writer() and start_metrics_exporter() stubs to _FakeContainer
  in control-plane test_main.py to match the new main.py composition root
- Update fred-runtime test configs to use the new KpiObservabilityConfig schema
  (observability.kpi.{log,prometheus,opensearch}) instead of the removed
  observability.metrics string and app.metrics_port fields; port 9900 is now
  passed directly under prometheus config
- Disable opensearch KPI sink in all offline test fixtures (conftest.py) to
  prevent OpenSearchKPIStore from attempting TCP connections under --disable-socket
- Add ObservabilityConfig with opensearch KPI disabled to knowledge-flow test
  conftest so ingestion service tests no longer trigger socket blocks
…in test configs

- Fix ruff import-order in control-plane app/context.py and knowledge-flow tests/conftest.py
- Add offline KPI config (opensearch disabled) to test_history.py and
  test_openai_compat_router.py local config builders — these were missing the
  observability.kpi block, causing OpenSearchKPIStore to attempt TCP connections
  under --disable-socket
- Add /kpi/presets router to control-plane-backend with a preset system
- Add active_users_over_time preset: cardinality of user_id per time bucket,
  filtered to human actors on api.request_latency_ms
- Auto-select bucket interval (1s/1m/1h/1d) from since/until span,
  mirroring getPrecisionForRange() thresholds from frontend timeAxis.ts
- Use extended_bounds to fill empty buckets across the full range
- since/until are Python datetime params (ISO 8601); they default to the last 30 days
- Add kpi/utils.py with resolve_interval() shared helper
- Add AnalyticsPage with active-users-over-time chart wired to the
  /kpi/presets/active_users_over_time endpoint
- Add TimeRangeSelector molecule (Grafana-style): preset list + custom
  since/until panel side-by-side in a dropdown
- Add DateTimeInput atom with 24h locale forcing and styled segments
- Add i18n keys (en/fr) under rework.analytics.*
- Extend MaterialIconType with refresh, schedule, edit_calendar,
  expand_less, expand_more
- Update OpenAPI types and RTK hook alias to active_users_over_time
- Sync document.documentElement.lang with i18n language for native
  datetime-local 24h formatting in Chrome
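
The interval auto-selection and the `active_users_over_time` aggregation from the preset commit above can be sketched as follows. The threshold values are illustrative assumptions and are not copied from `timeAxis.ts`; the field names `@timestamp` and `name` and the function `active_users_query` are likewise hypothetical, while `dims.user_id` and `dims.actor_type` are the dimension names used in the test plan.

```python
from datetime import datetime, timedelta

# Assumed thresholds: (largest span, bucket interval). Not the real timeAxis.ts values.
_THRESHOLDS = [
    (timedelta(minutes=10), "1s"),
    (timedelta(hours=6), "1m"),
    (timedelta(days=7), "1h"),
]


def resolve_interval(since: datetime, until: datetime) -> str:
    """Pick the finest bucket interval whose threshold covers the span."""
    span = until - since
    for limit, interval in _THRESHOLDS:
        if span <= limit:
            return interval
    return "1d"


def active_users_query(since: datetime, until: datetime) -> dict:
    """Build a search body: distinct user_id per time bucket, human actors only."""
    bounds = {"min": since.isoformat(), "max": until.isoformat()}
    return {
        "size": 0,  # aggregations only, no hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": "api.request_latency_ms"}},
                    {"term": {"dims.actor_type": "human"}},
                    {"range": {"@timestamp": {"gte": bounds["min"], "lte": bounds["max"]}}},
                ]
            }
        },
        "aggs": {
            "buckets": {
                "date_histogram": {
                    "field": "@timestamp",
                    "fixed_interval": resolve_interval(since, until),
                    "min_doc_count": 0,  # needed for empty buckets to be returned
                    "extended_bounds": bounds,  # pad buckets out to the full range
                },
                "aggs": {"users": {"cardinality": {"field": "dims.user_id"}}},
            }
        },
    }
```

With these assumed thresholds, the default 30-day range resolves to `1d`, which matches the `"interval": "1d"` shown in the test plan.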
…eChart molecule

Introduce a common TimeSeriesResponse/TimeSeriesPoint contract for time-bucketed
KPI presets. Replace the hand-rolled BarChart with a reusable TimeSeriesLineChart
molecule (Recharts) that resolves design-system tokens at runtime. Move the refresh
button to the page header next to the time range selector. Add a KPI README for
future preset authors.
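
A minimal sketch of what the shared contract might look like on the Python side. Only the type names and the top-level `rows` and `interval` keys come from this PR; the per-point field names `time` and `value`, and the `from_json` and `total` helpers, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimeSeriesPoint:
    time: str  # bucket start as ISO 8601 (assumed field name)
    value: float  # metric value for the bucket (assumed field name)


@dataclass(frozen=True)
class TimeSeriesResponse:
    rows: List[TimeSeriesPoint]
    interval: str  # bucket width, e.g. "1d"

    @classmethod
    def from_json(cls, payload: dict) -> "TimeSeriesResponse":
        """Parse a preset response into typed points."""
        rows = [TimeSeriesPoint(r["time"], float(r["value"])) for r in payload["rows"]]
        return cls(rows=rows, interval=payload["interval"])

    def total(self) -> float:
        """Sum all buckets, as the dashboard does client-side for its totals."""
        return sum(point.value for point in self.rows)
```

Summing buckets is valid for counts such as sessions or messages. It would overcount unique users, since the same user can appear in several buckets, which is presumably why that figure comes from a dedicated preset.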
florian-muller changed the title from "feat(observability): add KPIMiddleware and unify KPI sink config across all backends" to "feat(OBSERV-02): Recharts TimeSeriesLineChart molecule + common KPI preset contract" on Jun 12, 2026
Add ScalarResponse common type and unique_users_total preset (cardinality
aggregation over the full time range). Display the result in a new
KpiStatCard molecule — label top-left, number centred — placed to the
left of the active users line chart in a responsive flex row that wraps
on narrow viewports.
… dashboard

- Emit session.created_total at session creation (dims: team_id, scope_type, agent_instance_id, user_id)
- Fix agent.turn_completed actor from system to human so turns carry user attribution
- Add sessions_over_time and messages_over_time preset endpoints
- Add KpiSection/KpiRow layout molecule with compactFirst/compactLast support
- Wire two new charts and totals (summed client-side) into the analytics page
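
The attribution change above can be illustrated with a small event builder. This is a sketch under assumptions: the `Actor` class, the `build_event` and `session_created` helpers, the event dictionary layout, and the rule that derives `scope_type` from the presence of `team_id` are all hypothetical. The metric name and dimension names are the ones listed in the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Actor:
    type: str  # "human" or "system"
    user_id: Optional[str] = None


def build_event(name: str, actor: Actor, **dims) -> dict:
    """Merge explicit dims with actor attribution, dropping unset dims."""
    merged = {key: value for key, value in dims.items() if value is not None}
    merged["actor_type"] = actor.type
    if actor.user_id is not None:
        merged["user_id"] = actor.user_id
    return {"name": name, "value": 1, "dims": merged}


def session_created(
    actor: Actor, team_id: Optional[str], agent_instance_id: str
) -> dict:
    """Build the session.created_total event for a newly created session."""
    # Assumption: a session with a team_id is team-scoped, otherwise personal.
    scope_type = "team" if team_id else "personal"
    return build_event(
        "session.created_total",
        actor,
        team_id=team_id,
        scope_type=scope_type,
        agent_instance_id=agent_instance_id,
    )
```

Because `user_id` comes from the actor, an event emitted with a system actor carries no `user_id`, which is why the turn events had to switch to a human actor before they could feed per-user analytics.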
florian-muller changed the title from "feat(OBSERV-02): Recharts TimeSeriesLineChart molecule + common KPI preset contract" to "feat(OBSERV-02): KPI analytics dashboard — active users, conversations & messages" on Jun 12, 2026