Commit 1472a63

Merge pull request #69 from devcolor/rebranding/bishop-state
epic: Bishop State rebranding — full platform delivery (v0.9–v0.10)
2 parents 7087437 + 47253f6 commit 1472a63

74 files changed

Lines changed: 116820 additions & 570670 deletions

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+# Architectural Patterns
+
+This document describes architectural patterns used consistently across this codebase.
+
+## API Design Patterns
+
+### Route Structure
+All Next.js API routes export explicit HTTP method handlers. See:
+- `codebenders-dashboard/app/api/analyze/route.ts:82-87`
+- `codebenders-dashboard/app/api/dashboard/kpis/route.ts:22`
+- `codebenders-dashboard/app/api/execute-sql/route.ts:33`
+
+### Error Response Standardization
+Consistent error structure: `{ error: string, details?: string }` with appropriate HTTP status codes (400 for bad requests, 404 for not found, 500 for server errors). See:
+- `codebenders-dashboard/app/api/dashboard/kpis/route.ts:54-59`
+- `codebenders-dashboard/app/api/execute-sql/route.ts:72-78`
+- `codebenders-dashboard/app/api/analyze/route.ts:212-218`
+
+### Console Logging with Prefixes
+Debug logs use module prefixes for traceability (e.g., `[analyze]`, `[v0]`). See:
+- `codebenders-dashboard/app/api/analyze/route.ts:83-211`
+- `codebenders-dashboard/lib/query-executor.ts:11-72`
+
+## Database Access Patterns
+
+### Connection Pooling (TypeScript)
+Lazy-initialized singleton pg Pool prevents connection exhaustion:
+- `codebenders-dashboard/lib/db.ts` - `getPool()` singleton
+
+Key config: `max: 10`, pool error handler registered on init
+
+### Connection Pooling (Python)
+psycopg2 with connection pooling via SQLAlchemy:
+- `operations/db_utils.py` - `get_connection()`, `get_sqlalchemy_engine()`
+- `operations/db_config.py` - Centralized DB_CONFIG
+
+### Parameterized Queries
+All dynamic queries use `$1`, `$2` placeholders (Postgres style) with params arrays to prevent SQL injection:
+- `codebenders-dashboard/app/api/dashboard/readiness/route.ts:47-96`
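
The placeholder-plus-params idea above can be sketched in a few lines. This is an illustration only, not the code in `readiness/route.ts`: it uses Python's built-in `sqlite3` because it runs without a server, and the `students` table, its columns, and the `find_students` helper are all hypothetical. SQLite writes placeholders as `?` where `pg` uses `$1`, `$2` and psycopg2 uses `%s`, but in each case the values travel separately from the SQL text.

```python
import sqlite3


def find_students(conn: sqlite3.Connection, cohort: str, min_score: float) -> list:
    """Return (name, score) rows for one cohort at or above a score threshold.

    Hypothetical helper: the table and columns exist only for this sketch.
    """
    # The SQL text is fixed; user-supplied values are passed as parameters.
    sql = (
        "SELECT name, score FROM students "
        "WHERE cohort = ? AND score >= ? ORDER BY name"
    )
    return conn.execute(sql, (cohort, min_score)).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, cohort TEXT, score REAL)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?, ?)",
        [("Ada", "2023", 0.9), ("Ben", "2023", 0.4), ("Cy", "2024", 0.8)],
    )
    return conn
```

With this shape, an input such as `2023' OR '1'='1` is compared as a literal cohort string, so it matches no rows instead of rewriting the query.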
+
+### Bulk Data Insertion
+Chunked DataFrame insertion (1000 records/batch) with progress tracking:
+- `operations/db_utils.py:49-117` - `save_dataframe_to_db()`
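
A minimal sketch of the chunking idea, assuming plain lists of dicts rather than a DataFrame. It is not the body of `save_dataframe_to_db()`; `iter_chunks`, `insert_in_chunks`, and the `write_batch` callback are hypothetical names standing in for the real database write.

```python
from typing import Callable, Iterator, List


def iter_chunks(records: List[dict], chunk_size: int = 1000) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most chunk_size records."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


def insert_in_chunks(
    records: List[dict],
    write_batch: Callable[[List[dict]], None],
    chunk_size: int = 1000,
) -> int:
    """Send records to write_batch one chunk at a time and print progress.

    write_batch is a hypothetical callback; in a real pipeline it would
    perform the INSERT for one batch.
    """
    total = len(records)
    written = 0
    for batch in iter_chunks(records, chunk_size):
        write_batch(batch)
        written += len(batch)
        print(f"Inserted {written}/{total} records")
    return written
```

Keeping batches bounded limits the size of each statement and gives a natural point to report progress; 2,500 records at the default size produce batches of 1000, 1000, and 500.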
+
+## React/Next.js Patterns
+
+### Independent State Variables
+Multiple `useState` hooks for different data domains instead of single state object:
+- `codebenders-dashboard/app/page.tsx:51-58`
+- `codebenders-dashboard/app/query/page.tsx:27-32`
+
+### Parallel Data Fetching
+`Promise.all()` for concurrent API calls:
+- `codebenders-dashboard/app/page.tsx:67-81`
+
+### Component Loading States
+Three-state rendering pattern: loading skeleton → error message → content:
+- `codebenders-dashboard/components/kpi-card.tsx:19-59`
+- `codebenders-dashboard/components/risk-alert-chart.tsx:47-84`
+
+## ML Pipeline Patterns
+
+### Feature Engineering Pipeline
+Sequential stages: data loading → feature engineering → preprocessing → training → evaluation → storage:
+- `ai_model/complete_ml_pipeline.py:91-179` - Target variable calculation
+- `ai_model/complete_ml_pipeline.py:189-223` - Feature set definitions
+
+### Preprocessing with Label Encoding
+Centralized preprocessing: median imputation for numeric, "Unknown" for categorical, LabelEncoder for object types:
+- `ai_model/complete_ml_pipeline.py:232-256` - `preprocess_features()`
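
The three preprocessing rules can be sketched with the standard library alone. This is not `preprocess_features()`, which per the description works on DataFrames with scikit-learn's `LabelEncoder`; the function names below are hypothetical, and the `0.0` fallback for an all-missing column is an assumption made for the sketch.

```python
from statistics import median
from typing import List, Optional


def impute_numeric(values: List[Optional[float]]) -> List[float]:
    """Replace missing numeric values with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    # Assumption for this sketch: an all-missing column falls back to 0.0.
    fill = median(observed) if observed else 0.0
    return [fill if v is None else v for v in values]


def encode_categorical(values: List[Optional[str]]) -> List[int]:
    """Fill missing categories with "Unknown", then map labels to integers.

    Codes follow sorted label order, which is also how scikit-learn's
    LabelEncoder orders its classes.
    """
    filled = ["Unknown" if v is None else v for v in values]
    codes = {label: i for i, label in enumerate(sorted(set(filled)))}
    return [codes[v] for v in filled]
```

For example, `impute_numeric([1.0, None, 3.0, 10.0])` fills the gap with `3.0`, and `encode_categorical(["Pell", None, "None", "Pell"])` yields `[1, 2, 0, 1]` because the sorted labels are `None`, `Pell`, `Unknown`.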
+
+### Model Performance Tracking
+Metrics saved to `ml_model_performance` table after each training run:
+- `operations/db_utils.py:159-210` - `save_model_performance()`
+
+## Component Patterns
+
+### TypeScript Props Interfaces
+All components define explicit prop interfaces with optional loading/error fields:
+- `codebenders-dashboard/components/kpi-card.tsx:5-16`
+- `codebenders-dashboard/components/risk-alert-chart.tsx:13-17`
+
+### Chart Color Mapping
+Centralized color dictionaries mapping semantic values to hex colors:
+- `codebenders-dashboard/components/risk-alert-chart.tsx:19-24`
+- `codebenders-dashboard/components/retention-risk-chart.tsx:19-24`
+
+Colors: LOW=#22c55e (green), MODERATE=#eab308 (yellow), HIGH=#f97316 (orange), URGENT=#ef4444 (red)
+
+### Multi-Format Export
+Single component handles CSV, JSON, Markdown exports via `downloadFile()` utility:
+- `codebenders-dashboard/components/export-button.tsx:25-209`
+
+## Configuration Patterns
+
+### Environment Variable Hierarchy
+ENV vars with fallback defaults for development:
+- `codebenders-dashboard/app/api/dashboard/readiness/route.ts:4-10`
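
As a rough Python analogue of the fallback pattern (the referenced route is TypeScript, and its variable names are not shown here), the sketch below reads the variable names from `.docker.env.example` in this commit. `POSTGRES_HOST`, the `load_db_config` helper, and the choice to leave the password without a default are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def load_db_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read connection settings from the environment with development defaults.

    Passing env explicitly keeps the function testable; by default it
    reads os.environ.
    """
    env = os.environ if env is None else env
    return {
        # POSTGRES_HOST is an assumed name; it does not appear in the example file.
        "host": env.get("POSTGRES_HOST", "localhost"),
        "port": int(env.get("POSTGRES_PORT", "5432")),
        "user": env.get("POSTGRES_USER", "postgres"),
        "database": env.get("POSTGRES_DB", "bishop_state"),
        # No default password: an empty value makes a missing secret obvious.
        "password": env.get("POSTGRES_PASSWORD", ""),
    }
```

Calling `load_db_config({})` returns the development defaults, while any variable that is set overrides only its own field.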
+
+### Schema Configuration Constants
+Database schema metadata as constants for multi-institution support:
+- `codebenders-dashboard/app/api/analyze/route.ts:22-79` - SCHEMA_INFO
+- `codebenders-dashboard/lib/prompt-analyzer.ts:4-28` - SCHEMA_CONFIG
+
+## Error Handling Patterns
+
+### Try-Catch with Typed Errors
+All API routes wrap operations in try-catch, return structured errors with stack traces logged:
+- `codebenders-dashboard/app/api/analyze/route.ts:209-220`
+- `codebenders-dashboard/app/api/execute-sql/route.ts:65-79`
+
+### Python Status Reporting
+Visual status indicators with `print()` statements:
+- `operations/db_utils.py` - Uses `` for success, `` for failure
+- Section headers with `"=" * 80`
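
A small sketch of this reporting style, under stated assumptions: `[OK]` and `[FAIL]` are placeholder markers chosen for the example, not the symbols `db_utils.py` prints, and both helper names are hypothetical. Only the `"=" * 80` rule comes from the description above.

```python
def format_status(ok: bool, message: str) -> str:
    """Prefix a message with a success or failure marker.

    The markers are placeholders for this sketch.
    """
    marker = "[OK]" if ok else "[FAIL]"
    return f"{marker} {message}"


def print_section(title: str, width: int = 80) -> None:
    """Print a title framed by rules of '=' characters."""
    rule = "=" * width
    print(rule)
    print(title)
    print(rule)


if __name__ == "__main__":
    print_section("DATABASE CONNECTION TEST")
    print(format_status(True, "Connected to database"))
    print(format_status(False, "Could not save predictions"))
```

Separating formatting from printing keeps the marker logic easy to test and lets the same strings go to a logger later if needed.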
+
+## Data Transformation Patterns
+
+### JSON Field Parsing
+Parse JSON columns from DB, aggregate values, handle parse errors gracefully:
+- `codebenders-dashboard/app/api/dashboard/readiness/route.ts:119-157`
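
The parse, aggregate, and tolerate-errors steps might look like the following in Python. The referenced route is TypeScript and its column names are not shown, so the assumption here is that each row holds a JSON array of strings; `aggregate_json_lists` is a hypothetical name.

```python
import json
from collections import Counter
from typing import Iterable, Optional


def aggregate_json_lists(raw_values: Iterable[Optional[str]]) -> Counter:
    """Count items across JSON-array columns, skipping rows that fail to parse."""
    counts: Counter = Counter()
    for raw in raw_values:
        if not raw:
            continue  # NULL or empty column
        try:
            parsed = json.loads(raw)
        except (json.JSONDecodeError, TypeError):
            continue  # malformed row: skip it rather than fail the request
        if isinstance(parsed, list):
            counts.update(str(item) for item in parsed)
    return counts
```

Rows that are null, malformed, or not arrays contribute nothing, so one bad record cannot break the aggregate for the whole response.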
+
+### Query Plan to SQL Conversion
+Semantic query plans translated to SQL using schema-aware column mapping:
+- `codebenders-dashboard/lib/prompt-analyzer.ts:30-174`

.docker.env.example

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+# PostgreSQL Configuration for Docker Compose
+# Copy this file to .docker.env and update with your values
+
+# Database user
+POSTGRES_USER=postgres
+
+# Database password
+POSTGRES_PASSWORD=devcolor2025
+
+# Database name
+POSTGRES_DB=bishop_state
+
+# Port mapping (host:container)
+POSTGRES_PORT=5432
+
+# pgAdmin configuration (optional)
+PGADMIN_EMAIL=admin@bishopstate.edu
+PGADMIN_PASSWORD=devcolor2025
+PGADMIN_PORT=8080

.gitignore

Lines changed: 14 additions & 0 deletions
@@ -53,6 +53,11 @@ lerna-debug.log*
 package-lock.json
 yarn.lock
 pnpm-lock.yaml
+
+# Exceptions: dashboard source files that conflict with Python ignores above
+!codebenders-dashboard/lib/
+!codebenders-dashboard/lib/**
+
 dist/
 dist-ssr/
 *.local
@@ -150,15 +155,24 @@ supabase/.env
 # Docker (if used for local dev)
 docker-compose.override.yml
 
+# Git worktrees
+.worktrees/
+
 # Misc
 .cache/
 *.seed
 *.pid
 *.pid.lock
 .terraform/
 
+# Exceptions: dashboard source files that conflict with Python ignores above
+!codebenders-dashboard/lib/
+!codebenders-dashboard/lib/**
+
 # Operations scripts (local utilities)
 operations/fix_institution_id.py
 operations/list_tables.py
 operations/convert_institution_id_to_string.py
 operations/verify_institution_id.py
+.vercel
+.env.deploy

CHANGELOG.md

Lines changed: 96 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,96 @@
+# Changelog
+
+All notable changes to the Bishop State Student Success Dashboard are documented here.
+
+Each version corresponds to a git tag. To compare any two versions:
+```
+git log v0.1.0-kctcs..v0.2.0-bishop-state-data --oneline
+```
+
+---
+
+## [v0.6.0-readiness-engine] — 2026-02-20
+
+Student readiness scoring, PDP-aligned methodology documentation, and optional LiteLLM narrative enrichment.
+
+### Added
+- Rule-based readiness score engine (`ai_model/generate_readiness_scores.py`) — deterministic, FERPA-safe, fully auditable
+- Supabase migration creating `llm_recommendations` and `readiness_generation_runs` tables
+- `docs/READINESS_METHODOLOGY.md` — full scoring formula, PDP alignment table, FERPA compliance notes, and 5 research citations (CCRC, CAPR, Bird et al. 2021)
+- Model 9 (Readiness Score) section in `ML_MODELS_GUIDE.md`
+- PDP credit momentum (12-credit Year 1 milestone) and math placement components to rule engine
+- Optional `--enrich-with-llm` flag using LiteLLM for provider-agnostic narrative enrichment (OpenAI, Ollama, Anthropic, Azure)
+- `/methodology` page in Next.js dashboard with scoring breakdown tables, PDP alignment, and references
+- Methodology nav button in dashboard header
+
+### Fixed
+- Stale chart subtitles showing concatenated strings instead of student counts (PostgreSQL `COUNT(*)` bigint → string coercion)
+- Data source footnote referencing wrong table name and student count
+
+---
+
+## [v0.5.0-docs-cleanup] — 2026-02-19
+
+Final documentation sweep and removal of all remaining KCTCS references.
+
+### Added
+- Docker setup migrated from MariaDB to Postgres
+
+### Changed
+- All documentation rebranded from KCTCS/MariaDB to Bishop State/Postgres
+
+### Removed
+- Old KCTCS data files and merge script
+
+### Fixed
+- Final sweep removing remaining KCTCS references across codebase
+
+---
+
+## [v0.4.0-frontend-rebrand] — 2026-02-18
+
+Frontend UI, dashboard API routes, and query API fully migrated to Bishop State Community College and Postgres.
+
+### Changed
+- Frontend UI rebranded from KCTCS to Bishop State Community College (institution name, colors, copy)
+- Dashboard API routes migrated from `mysql2`/KCTCS to `pg`/Bishop State
+- Query API routes migrated from `mysql2`/KCTCS to `pg`/Bishop State
+- Shared Postgres connection pool module added for Next.js
+
+### Fixed
+- Cohort filter format in non-LLM prompt-analyzer fallback path
+
+---
+
+## [v0.3.0-postgres-migration] — 2026-02-17
+
+Python ML pipeline database layer fully migrated from MariaDB to Postgres.
+
+### Changed
+- Python DB layer migrated from `pymysql`/MariaDB to `psycopg2`/Postgres
+- ML pipeline updated for Bishop State data and Supabase Postgres
+
+---
+
+## [v0.2.0-bishop-state-data] — 2026-02-17
+
+Bishop State data introduced and local Supabase development environment initialized.
+
+### Added
+- Local Supabase setup for Postgres development
+- Synthetic Bishop State student data generation script
+- Bishop State data merge script and merged dataset
+- Updated ML models and documentation for Bishop State
+
+---
+
+## [v0.1.0-kctcs] — 2025-10-29
+
+Baseline snapshot of the KCTCS/MariaDB codebase immediately before the Bishop State rebranding effort began.
+
+### Context
+This tag marks the final state of the project when it was built for Kentucky Community and Technical College System (KCTCS) using MariaDB. All subsequent versions represent the migration to Bishop State Community College and Postgres.
+
+---
+
+*Generated from git tags. See [GitHub Releases](https://github.com/devcolor/codebenders-datathon/releases) or use `git log <tag1>..<tag2>` for full commit details between versions.*

CLAUDE.md

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Bishop State Student Success Prediction - Full-stack ML + web application predicting student outcomes for Bishop State Community College. Uses 5 ML models to generate retention predictions, early warnings, time-to-credential estimates, credential type forecasts, and GPA predictions for ~4K students.
+
+## Tech Stack
+
+| Layer | Technologies |
+|-------|-------------|
+| ML Pipeline | Python 3.8+, XGBoost, scikit-learn, pandas |
+| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS |
+| Charts | Recharts |
+| UI Components | shadcn/ui (Radix UI) |
+| Database | Postgres (Supabase), pg driver |
+| AI Features | OpenAI (natural language query analysis) |
+| Infrastructure | Docker Compose, Vercel |
+
+## Key Directories
+
+| Directory | Purpose |
+|-----------|---------|
+| `ai_model/` | Python ML pipeline - 5 models (XGBoost + Random Forest) |
+| `codebenders-dashboard/` | Next.js web application |
+| `codebenders-dashboard/app/` | App Router pages and API routes |
+| `codebenders-dashboard/components/` | React components (shadcn/ui based) |
+| `codebenders-dashboard/lib/` | Utilities: prompt-analyzer.ts, query-executor.ts |
+| `operations/` | Database utilities and configuration |
+| `data/` | CSV data files (~20K students, ~500K courses) |
+
+## Essential Commands
+
+### ML Pipeline
+```bash
+pip install -r requirements.txt # Install Python dependencies
+cd ai_model && python complete_ml_pipeline.py # Run full pipeline
+python -m operations.test_db_connection # Test DB connection
+```
+
+### Dashboard
+```bash
+cd codebenders-dashboard
+npm install # Install dependencies
+npm run dev # Dev server (localhost:3000)
+npm run build # Production build
+npm run lint # Lint check
+```
+
+### Docker
+```bash
+docker-compose up -d # Start Postgres + pgAdmin
+docker-compose down -v # Stop and remove volumes
+```
+
+## Database Schema
+
+Three main tables in the `bishop_state` Postgres database:
+- `student_predictions` - Student-level predictions (~4K records)
+- `course_predictions` - Course-level predictions (~100K records)
+- `ml_model_performance` - Model metrics and training history
+
+## Key Entry Points
+
+| File | Purpose |
+|------|---------|
+| `ai_model/complete_ml_pipeline.py:1` | Main ML entry point |
+| `codebenders-dashboard/app/page.tsx:1` | Dashboard home page |
+| `codebenders-dashboard/app/query/page.tsx:1` | Query interface page |
+| `codebenders-dashboard/lib/prompt-analyzer.ts:30` | LLM-powered SQL generation |
+| `operations/db_config.py:8` | Database configuration |
+
+## Python Environment
+
+- **Always** use the project virtualenv at `venv/` when running Python commands.
+- Activate with `source venv/bin/activate` or use `venv/bin/python` directly.
+- Install dependencies into the venv, not globally.
+
+## Git Commit Rules
+
+- **Never** add `Co-Authored-By` lines to commit messages.
+
+## Additional Documentation
+
+Check these files for detailed information on specific topics:
+
+| Topic | File |
+|-------|------|
+| Architectural patterns | `.claude/docs/architectural_patterns.md` |
+| Project overview | `README.md` |
+| Quick start guide | `QUICKSTART.md` |
+| Data field descriptions | `DATA_DICTIONARY.md` |
+| ML model details | `ML_MODELS_GUIDE.md` |
+| Dashboard features | `codebenders-dashboard/DASHBOARD_README.md` |
+| Database utilities | `operations/README.md` |
+| Docker setup | `DOCKER_SETUP.md` |

DASHBOARD_VISUALIZATIONS.md

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 # Dashboard Visualizations Guide
-## KCTCS Student Success Analytics & Predictive Models
+## Bishop State Community College Student Success Analytics & Predictive Models
 
-**Dataset**: `kctcs_student_level_with_predictions.csv`
+**Dataset**: `bishop_state_student_level_with_predictions.csv`
 **Students**: 32,800
 **Date**: October 28, 2025
 **Purpose**: Comprehensive visualization guide for retention, graduation, and student success metrics
@@ -680,7 +680,7 @@ Average(risk_score) by Program_of_Study_Year_1
 
 - **Model Performance Details**: See `ML_MODELS_GUIDE.md`
 - **Data Dictionary**: See `DATA_DICTIONARY.md`
-- **Raw Data**: `kctcs_student_level_with_predictions.csv`
+- **Raw Data**: `bishop_state_student_level_with_predictions.csv`
 
 ---
 