Commit 390dfa5

chore: update ML pipeline report from hosted Supabase run
1 parent 5853c76 commit 390dfa5

1 file changed

Lines changed: 61 additions & 32 deletions

ML_PIPELINE_REPORT.txt

@@ -1,68 +1,86 @@

-KCTCS ML PIPELINE - SUMMARY REPORT
+BISHOP STATE ML PIPELINE - SUMMARY REPORT
 ================================================================================
-Generated: 2025-10-28 17:29:21
+Generated: 2026-02-21 12:59:23

 DATASET OVERVIEW
 --------------------------------------------------------------------------------
-Total Students: 32,800
-Total Course Records: 145,918
+Total Students: 4,000
+Total Course Records: 4,000

 MODEL PERFORMANCE SUMMARY
 --------------------------------------------------------------------------------

 1. RETENTION PREDICTION MODEL
 Algorithm: XGBoost Classifier
-Features Used: 31
+Features Used: 23
 Test Set Performance:
-- Accuracy: 0.5224
-- AUC-ROC: 0.5355
+- Accuracy: 0.7238
+- AUC-ROC: 0.6134

 Risk Distribution:
-Critical Risk 242 ( 0.7%)
-High Risk 15,755 ( 48.0%)
-Moderate Risk 15,202 ( 46.3%)
-Low Risk 1,601 ( 4.9%)
+Critical Risk 0 ( 0.0%)
+High Risk 82 ( 2.1%)
+Moderate Risk 2,195 ( 54.9%)
+Low Risk 1,723 ( 43.1%)

 2. EARLY WARNING SYSTEM
 Algorithm: Composite Risk Score (Retention + Performance Metrics)
 Approach: Aligned with retention predictions to eliminate contradictions
 Alert Distribution:
-URGENT 487 ( 1.5%)
-HIGH 8,344 ( 25.4%)
-MODERATE 19,823 ( 60.4%)
-LOW 4,146 ( 12.6%)
+URGENT 0 ( 0.0%)
+HIGH 21 ( 0.5%)
+MODERATE 2,210 ( 55.2%)
+LOW 1,769 ( 44.2%)

 3. TIME TO CREDENTIAL PREDICTION
 Algorithm: XGBoost Regressor
-Mean Predicted Time: 4.29 years
-Median Predicted Time: 4.39 years
+Mean Predicted Time: 2.97 years
+Median Predicted Time: 2.96 years

 4. CREDENTIAL TYPE PREDICTION
 Algorithm: Random Forest Classifier
 Predicted Distribution:
-No Credential 32,735 ( 99.8%)
-Associate 59 ( 0.2%)
-Bachelor 6 ( 0.0%)
+No Credential 4,000 (100.0%)

-5. COURSE SUCCESS (GPA) PREDICTION
-Algorithm: Random Forest Regressor
-Mean Predicted GPA: 2.06
+5. GATEWAY MATH SUCCESS PREDICTION (NEW!)
+Algorithm: XGBoost Classifier
+Students with Gateway Math Data: 4,000
+Average Pass Probability: 0.0%

-Performance vs. Expected:
-As Expected 32,800 (100.0%)
+Gateway Math Risk Distribution:
+High Risk 4,000 (100.0%)
+
+6. GATEWAY ENGLISH SUCCESS PREDICTION (NEW!)
+Algorithm: XGBoost Classifier
+Students with Gateway English Data: 4,000
+Average Pass Probability: 0.0%
+
+Gateway English Risk Distribution:
+High Risk 4,000 (100.0%)
+
+7. FIRST-SEMESTER LOW GPA (<2.0) PREDICTION (NEW!)
+Algorithm: XGBoost Classifier
+Average Low GPA Probability: 13.1%
+Students Predicted Low GPA: 231
+
+Academic Risk Level Distribution:
+Low Risk 3,078 ( 77.0%)
+Moderate Risk 597 ( 14.9%)
+High Risk 258 ( 6.5%)
+Critical Risk 67 ( 1.7%)

 OUTPUT: DATABASE TABLES
 --------------------------------------------------------------------------------
 1. student_predictions (Table)
 - Student-level data with all predictions
-- 32,800 students
-- 156 columns
+- 4,000 students
+- 164 columns

 2. course_predictions (Table)
 - Course-level data with predictions
-- 145,918 records
-- 151 columns
+- 4,000 records
+- 159 columns

 3. ml_model_performance (Table)
 - Model performance metrics
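
The count-and-percentage rows in the updated report can be cross-checked against the stated total of 4,000 students. The following is a minimal Python sketch, not part of the pipeline: the function name `check_distribution` and the tolerance are assumptions for illustration, and the report does not state its rounding rule, so the check only requires each reported percentage to lie within half of the last printed digit of `100 * count / total`. The counts and percentages are copied from the added lines of the diff.

```python
TOTAL_STUDENTS = 4000  # "Total Students" in the updated report

# (count, reported percent) pairs copied from the updated report.
distributions = {
    "retention_risk": {
        "Critical Risk": (0, 0.0),
        "High Risk": (82, 2.1),
        "Moderate Risk": (2195, 54.9),
        "Low Risk": (1723, 43.1),
    },
    "early_warning": {
        "URGENT": (0, 0.0),
        "HIGH": (21, 0.5),
        "MODERATE": (2210, 55.2),
        "LOW": (1769, 44.2),
    },
    "academic_risk": {
        "Low Risk": (3078, 77.0),
        "Moderate Risk": (597, 14.9),
        "High Risk": (258, 6.5),
        "Critical Risk": (67, 1.7),
    },
}


def check_distribution(rows, total, tolerance=0.05 + 1e-9):
    """Return a list of problems; an empty list means the rows are consistent.

    The tolerance (an assumption) is half of the last printed digit plus a
    small allowance for floating-point error.
    """
    problems = []
    count_sum = sum(count for count, _ in rows.values())
    if count_sum != total:
        problems.append(f"counts sum to {count_sum}, expected {total}")
    for label, (count, reported) in rows.items():
        exact = 100 * count / total
        if abs(exact - reported) > tolerance:
            problems.append(f"{label}: reported {reported}%, computed {exact:.3f}%")
    return problems


results = {
    name: check_distribution(rows, TOTAL_STUDENTS)
    for name, rows in distributions.items()
}
for name, problems in results.items():
    print(name, "OK" if not problems else problems)
```

Checking against a tolerance rather than reproducing the rounding avoids guessing how the report handles values that fall exactly halfway between two printed digits, such as 2,210 of 4,000 (55.25%).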
@@ -90,9 +108,20 @@ Credential Type:
 - predicted_credential_label (text label)
 - prob_no_credential, prob_certificate, prob_associate, prob_bachelor

-Course Success:
-- predicted_gpa (0-4 scale)
-- gpa_performance (Above/Below/As Expected)
+Gateway Math Success:
+- gateway_math_probability (0-1 scale)
+- gateway_math_prediction (0=Won't Pass, 1=Will Pass)
+- gateway_math_risk (High Risk/Moderate Risk/Likely Pass/Very Likely Pass)
+
+Gateway English Success:
+- gateway_english_probability (0-1 scale)
+- gateway_english_prediction (0=Won't Pass, 1=Will Pass)
+- gateway_english_risk (High Risk/Moderate Risk/Likely Pass/Very Likely Pass)
+
+First-Semester GPA < 2.0 Risk:
+- low_gpa_probability (0-1 scale)
+- low_gpa_prediction (0=Adequate GPA, 1=Low GPA)
+- academic_risk_level (Low Risk/Moderate Risk/High Risk/Critical Risk)

 ================================================================================
 PIPELINE COMPLETE!
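
The gateway columns documented above pair a 0-1 probability with a binary prediction and a four-level risk label. The report does not say how the probability is converted into the other two columns, so the following Python sketch is only one plausible reading: the label names come from the column documentation, while the cut points in `ASSUMED_CUTS`, the value of `ASSUMED_PASS_THRESHOLD`, and both function names are hypothetical placeholders, not values taken from the pipeline.

```python
import bisect

# Labels are taken from the column documentation, ordered from lowest to
# highest pass probability.
RISK_LABELS = ["High Risk", "Moderate Risk", "Likely Pass", "Very Likely Pass"]

# ASSUMPTION: evenly spaced cut points; the real thresholds are not in the report.
ASSUMED_CUTS = [0.25, 0.50, 0.75]

# ASSUMPTION: a conventional 0.5 decision threshold for the binary prediction.
ASSUMED_PASS_THRESHOLD = 0.5


def gateway_risk(probability):
    """Map a pass probability in [0, 1] to one of the four risk labels."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    # bisect_right counts how many cut points are <= probability, which is
    # the index of the bucket the probability falls into.
    return RISK_LABELS[bisect.bisect_right(ASSUMED_CUTS, probability)]


def gateway_prediction(probability):
    """Return 1 (Will Pass) or 0 (Won't Pass) under the assumed threshold."""
    return int(probability >= ASSUMED_PASS_THRESHOLD)


for p in (0.0, 0.3, 0.6, 0.9):
    print(p, gateway_prediction(p), gateway_risk(p))
```

Under these assumed cut points, a probability of 0.0 maps to a prediction of 0 and the label "High Risk", which is the combination the updated report shows for every student in both gateway models.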
