Date: 2026-02-24
Author: Claude Code
Allow admin and IR users to upload institutional data files directly from the dashboard without needing direct database or server access. Two upload paths: course enrollment CSVs (end-to-end to Postgres) and PDP cohort/AR files (to Supabase Storage + GitHub Actions ML pipeline trigger).
In scope:
- Course enrollment CSV → `course_enrollments` Postgres table (upsert)
- PDP Cohort CSV / PDP AR (.xlsx) → Supabase Storage + GitHub Actions `repository_dispatch`
- Preview step (first 10 rows + column validation) before commit
- Role guard: `admin` and `ir` only
Out of scope:
- Upload history log (future issue)
- Column remapping UI (columns must match known schema)
- ML experiment tracking / MLflow (future issue)
- Auto-triggering ML pipeline without a server (GitHub Actions is the trigger mechanism)
New page: codebenders-dashboard/app/admin/upload/page.tsx
Role guard: Add to `ROUTE_PERMISSIONS` in `lib/roles.ts`:

```ts
{ prefix: "/admin", roles: ["admin", "ir"] },
{ prefix: "/api/admin", roles: ["admin", "ir"] },
```

Middleware already enforces this pattern via the `x-user-role` header; no other auth code needed.
Nav link: Add "Upload Data" to nav-header.tsx, visible only to admin/ir roles.
New API routes:
- `POST /api/admin/upload/preview`: parse first 10 rows, return sample + validation summary
- `POST /api/admin/upload/commit`: full ingest (course → Postgres; PDP/AR → Storage + Actions)
State 1 (file selection):
- Dropdown: file type (`Course Enrollment CSV` | `PDP Cohort CSV` | `PDP AR File (.xlsx)`)
- Drag-and-drop zone (click to pick; `.csv` for course/cohort, `.csv` + `.xlsx` for AR)
- "Preview" button → calls `/api/admin/upload/preview`

State 2 (preview):
- Shows: detected file type, estimated row count, first 10 rows in a table
- Validation banner: lists missing required columns or warnings
- "Confirm & Upload" → calls `/api/admin/upload/commit`
- "Back" link to return to State 1
State 3 (result):
- Course enrollments: `{ inserted, skipped, errors[] }` summary card
- PDP/AR: "File accepted — ML pipeline queued in GitHub Actions" + link to the Actions run
- "Upload another file" resets to State 1
`POST /api/admin/upload/preview`
Input: multipart/form-data with `file` and `fileType` fields
Logic:
- Parse the first 50 rows with `csv-parse` (CSV) or `xlsx` (Excel)
- Validate that the required columns exist for the given `fileType`
- Return `{ columns, sampleRows (first 10), rowCount (estimated), warnings[] }`
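The column check above can be sketched as a small pure function. This is illustrative only: the `FileType` values, `REQUIRED_COLUMNS` map, and `validateColumns` name are assumptions, not identifiers from the codebase, though the required-column lists match the file format specs later in this document.

```typescript
// Hypothetical fileType discriminator values; the real form field values
// are not specified in this plan.
type FileType = "course" | "pdp-cohort" | "pdp-ar";

// Required columns per file type, taken from the format specs in this plan.
const REQUIRED_COLUMNS: Record<FileType, string[]> = {
  course: [
    "student_guid",
    "course_prefix",
    "course_number",
    "academic_year",
    "academic_term",
  ],
  "pdp-cohort": ["Institution_ID", "Cohort", "Student_GUID", "Cohort_Term"],
  "pdp-ar": ["Institution_ID", "Cohort", "Student_GUID"],
};

// Returns one warning per missing required column; empty array means valid.
function validateColumns(fileType: FileType, columns: string[]): string[] {
  return REQUIRED_COLUMNS[fileType]
    .filter((c) => !columns.includes(c))
    .map((c) => `Missing required column: ${c}`);
}
```

The preview route would call this against the header row after parsing, and surface the returned warnings in the State 2 validation banner.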
`POST /api/admin/upload/commit`
Input: same multipart form
Course enrollment path:
- Stream-parse the full CSV with the `csv-parse` async iterator
- Batch-upsert 500 rows at a time into `course_enrollments` via `pg`
- Conflict target: `(student_guid, course_prefix, course_number, academic_term)`
- Return `{ inserted, skipped, errors[] }`
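The batching and upsert steps above can be sketched as two helpers. This is a sketch under assumptions: the helper names (`chunk`, `buildUpsert`) are hypothetical, and the column list is trimmed to the five required columns for brevity; the real ingest would cover all `course_enrollments` columns present in the file.

```typescript
// Trimmed column list; the conflict target matches the plan above.
const COLS = [
  "student_guid",
  "course_prefix",
  "course_number",
  "academic_year",
  "academic_term",
];
const CONFLICT_KEYS = [
  "student_guid",
  "course_prefix",
  "course_number",
  "academic_term",
];

// Split parsed rows into batches of (up to) 500 for the upsert loop.
function chunk<T>(rows: T[], size = 500): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}

// Build one parameterized multi-row INSERT ... ON CONFLICT DO UPDATE
// statement for a batch of the given size ($1..$n placeholders for pg).
function buildUpsert(batchSize: number): string {
  const placeholders = Array.from({ length: batchSize }, (_, r) =>
    `(${COLS.map((_, c) => `$${r * COLS.length + c + 1}`).join(", ")})`
  ).join(", ");
  const updates = COLS.filter((c) => !CONFLICT_KEYS.includes(c))
    .map((c) => `${c} = EXCLUDED.${c}`)
    .join(", ");
  return (
    `INSERT INTO course_enrollments (${COLS.join(", ")}) VALUES ${placeholders} ` +
    `ON CONFLICT (${CONFLICT_KEYS.join(", ")}) DO UPDATE SET ${updates}`
  );
}
```

The commit route would buffer rows from the `csv-parse` async iterator and, per batch, run roughly `await pool.query(buildUpsert(batch.length), batch.flatMap(row => COLS.map(c => row[c])))`.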
PDP/AR path:
- Upload the file to the Supabase Storage bucket `pdp-uploads` via `@supabase/supabase-js`
- Call the GitHub API `POST /repos/{owner}/{repo}/dispatches` with `{ "event_type": "ml-pipeline", "client_payload": { "file_path": "<storage-path>" } }`
- Return `{ status: "processing", actionsUrl: "https://github.com/{owner}/{repo}/actions" }`
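The dispatch call above can be sketched as follows. The payload shape comes from this plan; the function names and the fetch wiring are illustrative, and the call assumes a `GITHUB_TOKEN` with permission to trigger workflows on the target repo.

```typescript
// Build the repository_dispatch request body per the spec above.
function buildDispatchBody(filePath: string): string {
  return JSON.stringify({
    event_type: "ml-pipeline",
    client_payload: { file_path: filePath },
  });
}

// Illustrative call site; GitHub returns 204 No Content on success.
async function triggerPipeline(
  owner: string,
  repo: string,
  filePath: string,
  token: string
) {
  const res = await fetch(
    `https://api.github.com/repos/${owner}/${repo}/dispatches`,
    {
      method: "POST",
      headers: {
        Accept: "application/vnd.github+json",
        Authorization: `Bearer ${token}`,
      },
      body: buildDispatchBody(filePath),
    }
  );
  if (res.status !== 204) throw new Error(`dispatch failed: ${res.status}`);
  return {
    status: "processing",
    actionsUrl: `https://github.com/${owner}/${repo}/actions`,
  };
}
```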
Role enforcement: Read x-user-role header (set by middleware); return 403 if not admin/ir.
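The per-route check can be sketched as below. The `isAuthorized` helper is hypothetical; the `x-user-role` header and the admin/ir allow-list come from this plan.

```typescript
// Roles permitted to hit the upload endpoints, per the plan.
const ALLOWED_ROLES = ["admin", "ir"];

// The header is null when middleware did not set it (unauthenticated).
function isAuthorized(role: string | null): boolean {
  return role !== null && ALLOWED_ROLES.includes(role);
}

// In a Next.js route handler this would look roughly like:
// export async function POST(req: Request) {
//   if (!isAuthorized(req.headers.get("x-user-role"))) {
//     return new Response("Forbidden", { status: 403 });
//   }
//   // ... parse and ingest ...
// }
```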
File: .github/workflows/ml-pipeline.yml
Trigger: repository_dispatch with event_type: ml-pipeline
Steps:
- Checkout repo
- Set up Python with `venv`
- Install dependencies (`pip install -r requirements.txt`)
- Download the uploaded file from Supabase Storage using the `SUPABASE_SERVICE_KEY` secret
- Run `venv/bin/python ai_model/complete_ml_pipeline.py --input <downloaded-file-path>`
- Upload `ML_PIPELINE_REPORT.txt` as a GitHub Actions artifact (retained 90 days)
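The steps above could look roughly like the workflow below. This is a sketch, not the final file: the action versions, Python version, and the `curl` download against Supabase Storage's authenticated object endpoint are assumptions.

```yaml
name: ML Pipeline
on:
  repository_dispatch:
    types: [ml-pipeline]
jobs:
  run-pipeline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: |
          python -m venv venv
          venv/bin/pip install -r requirements.txt
      - name: Download uploaded file from Supabase Storage
        env:
          SUPABASE_URL: ${{ secrets.SUPABASE_URL }}
          SUPABASE_SERVICE_KEY: ${{ secrets.SUPABASE_SERVICE_KEY }}
        run: |
          curl -f -o input-file \
            -H "Authorization: Bearer $SUPABASE_SERVICE_KEY" \
            "$SUPABASE_URL/storage/v1/object/pdp-uploads/${{ github.event.client_payload.file_path }}"
      - name: Run pipeline
        run: venv/bin/python ai_model/complete_ml_pipeline.py --input input-file
      - uses: actions/upload-artifact@v4
        with:
          name: ml-pipeline-report
          path: ML_PIPELINE_REPORT.txt
          retention-days: 90
```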
Required secrets: SUPABASE_URL, SUPABASE_SERVICE_KEY, GITHUB_TOKEN (auto-provided)
Course enrollment CSV:
- Must include: `student_guid`, `course_prefix`, `course_number`, `academic_year`, `academic_term`
- Optional (all other `course_enrollments` columns): filled as NULL if absent

PDP Cohort CSV:
- Must include: `Institution_ID`, `Cohort`, `Student_GUID`, `Cohort_Term`

PDP AR file (.xlsx):
- Must include: `Institution_ID`, `Cohort`, `Student_GUID` (first sheet parsed)
| Package | Purpose |
|---|---|
| `csv-parse` | Streaming CSV parsing (async iterator mode) |
| `xlsx` | Excel (.xlsx) parsing |
| File | Purpose |
|---|---|
| `codebenders-dashboard/app/admin/upload/page.tsx` | Upload UI page |
| `codebenders-dashboard/app/api/admin/upload/preview/route.ts` | Preview API route |
| `codebenders-dashboard/app/api/admin/upload/commit/route.ts` | Commit API route |
| `.github/workflows/ml-pipeline.yml` | GitHub Actions ML pipeline trigger |
Storage bucket: Create pdp-uploads bucket (private, authenticated access only).
No new database migrations required — course_enrollments table already exists.
Bucket policy: Only service role key can read/write. Signed URLs used for pipeline download.
- ML pipeline trigger via GitHub Actions means a ~30-60s delay before the pipeline starts
- Vercel free tier has a 4.5 MB request body limit — large files should use Supabase Storage direct upload in a future iteration
- No upload history log in this version (deferred)
- Column remapping is out of scope — files must match the known schema