Our Methodology
How we generate college football predictions
Overview
Our prediction system uses machine learning models trained on comprehensive college football data spanning multiple seasons. We employ a consensus approach: predictions from several models are combined, which tends to be more accurate and reliable than relying on any single model.
Data Sources
Team Performance Metrics
- Team Efficiency Ratings - Advanced offensive and defensive metrics
- Power Rankings - Statistical team strength indices
- Historical Performance - Season and multi-year trend analysis
- Talent Metrics - Roster composition and quality indicators
Game-Level Features
- Home field advantage adjustments
- Rest days and scheduling factors
- Historical matchup data
- Conference strength metrics
Model Architecture
Consensus Approach
We use a consensus of multiple machine learning models, each trained on different feature sets:
- Current Model - Uses recent season data with emphasis on current year performance
- Historical Model - Trained on multi-season data for robust long-term patterns
- Ensemble Model - Predicts offensive and defensive performance separately, then combines the two
- Full Feature Model - 700+ features including advanced metrics and recruiting data
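As a minimal sketch of how a consensus could be formed, the snippet below takes an unweighted average of each model's predicted score. The equal weighting, the function name `consensus_prediction`, and the sample numbers are assumptions made for illustration; the actual combination rule may differ.

```python
from statistics import mean


def consensus_prediction(model_predictions):
    """Average per-model (home, away) score predictions into one forecast.

    Assumption: every model gets equal weight.
    """
    if not model_predictions:
        raise ValueError("need at least one model prediction")
    home = mean(p[0] for p in model_predictions.values())
    away = mean(p[1] for p in model_predictions.values())
    return home, away


# Hypothetical outputs from the four models for a single game.
predictions = {
    "current": (31.0, 24.0),
    "historical": (28.0, 27.0),
    "ensemble": (30.0, 23.0),
    "full_feature": (27.0, 26.0),
}
home_score, away_score = consensus_prediction(predictions)
print(home_score, away_score)
```

Averaging dampens the effect of any one model being badly wrong on a particular game, which is the usual motivation for a consensus.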
Training Process
- Ridge regression with optimal regularization (α = 0.1)
- Trained on 12,000+ historical games
- Cross-validated to prevent overfitting
- Calibrated to match actual scoring distributions
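To illustrate the idea behind the training steps above, here is a toy, single-feature ridge regression with k-fold cross-validation, written in plain Python. Only α = 0.1 comes from the text; the single "rating differential" feature, the fold count, the function names, and the data are made up, and the production models use far more features.

```python
from statistics import mean

ALPHA = 0.1  # regularization strength stated in the text


def fit_ridge_1d(xs, ys, alpha=ALPHA):
    """Fit y = slope * x + intercept, penalizing slope**2 (intercept unpenalized)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, y_bar - slope * x_bar


def cross_validated_mae(xs, ys, k=3, alpha=ALPHA):
    """Mean absolute error averaged over k held-out folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th game is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_ridge_1d(train_x, train_y, alpha)
        errors = [abs(ys[i] - (slope * xs[i] + intercept)) for i in sorted(test_idx)]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


# Made-up data: rating differential versus points scored.
ratings = [-10.0, -6.0, -3.0, 0.0, 2.0, 5.0, 8.0, 11.0, 14.0]
points = [17.0, 20.0, 24.0, 27.0, 28.0, 31.0, 35.0, 38.0, 41.0]

slope, intercept = fit_ridge_1d(ratings, points)
cv_mae = cross_validated_mae(ratings, points)
print(f"slope={slope:.3f} intercept={intercept:.2f} cv_mae={cv_mae:.2f}")
```

Because the error is measured only on games the model did not see during fitting, a model that merely memorizes its training data scores poorly, which is how cross-validation helps flag overfitting.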
Calibration & Quality Control
Score Calibration
We apply calibration factors to ensure our predictions match real-world scoring distributions:
- Training data average: 55.2 total points per game
- Predictions are scaled to match this distribution
- Guards against a systematic bias toward the OVER or the UNDER on game totals
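One simple way to perform this kind of scaling is sketched below: compute a single multiplicative factor so that the average predicted total equals the 55.2-point training average. The multiplicative form, the function name `calibrate_scores`, and the raw predictions are assumptions for illustration.

```python
from statistics import mean

TRAINING_AVG_TOTAL = 55.2  # average total points per game from the text


def calibrate_scores(raw_predictions, target_avg_total=TRAINING_AVG_TOTAL):
    """Scale (home, away) predictions so their mean total matches the target."""
    raw_avg_total = mean(home + away for home, away in raw_predictions)
    if raw_avg_total <= 0:
        raise ValueError("average predicted total must be positive")
    factor = target_avg_total / raw_avg_total
    return [(home * factor, away * factor) for home, away in raw_predictions]


# Hypothetical uncalibrated predictions that run low (average total of 48).
raw = [(30.0, 20.0), (24.0, 21.0), (35.0, 14.0)]
calibrated = calibrate_scores(raw)
for home, away in calibrated:
    print(f"{home:.1f} - {away:.1f} (total {home + away:.1f})")
```

Note that a multiplicative factor also stretches the predicted margin by the same ratio; an additive shift split between the two teams would leave margins unchanged, so the choice depends on which quantity should be preserved.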
Team Name Mapping
A critical quality-control step ensures that data from different sources is aligned correctly:
- Canonical team names mapped across all data sources
- Verified mappings for all teams before generating predictions
- Prevents feature misalignment that could cause incorrect predictions
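A sketch of such a check might look like the following. The alias table, the normalization rule, and the function names are hypothetical; the point is that an unmapped name is reported up front instead of being silently joined to the wrong features.

```python
def normalize_key(name):
    """Lowercase, drop periods, and collapse whitespace for lookup."""
    return " ".join(name.lower().replace(".", "").split())


# Hypothetical aliases as they might appear in different data sources.
ALIASES = {
    normalize_key(alias): canonical
    for alias, canonical in {
        "Ohio State": "Ohio State",
        "Ohio St.": "Ohio State",
        "Miami (FL)": "Miami",
        "Miami": "Miami",
    }.items()
}


def canonical_name(raw_name):
    """Return the canonical team name, or raise if the name is unmapped."""
    key = normalize_key(raw_name)
    if key not in ALIASES:
        raise KeyError(f"unmapped team name: {raw_name!r}")
    return ALIASES[key]


def unmapped_names(raw_names):
    """List every name with no mapping, so it can be fixed before predicting."""
    return sorted({n for n in raw_names if normalize_key(n) not in ALIASES})


schedule_names = ["Ohio St.", "Miami (FL)", "Mystery Tech"]
missing = unmapped_names(schedule_names)
print(missing)
```

Running `unmapped_names` over every source before generating predictions turns a subtle data bug into an explicit list of names to fix.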
Performance Metrics
Model Accuracy
- R² Score: 0.607 (explains 60.7% of variance in game scores)
- Mean Absolute Error: ~10 points per score prediction
- ATS Target: 55%+ win rate against the spread (breakeven is 52.4% at standard -110 odds)
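The figures above can be reproduced with a few lines of arithmetic. The sketch below shows where the 52.4% breakeven comes from, assuming every bet is placed at standard -110 odds (risk 110 to win 100), and how R² and mean absolute error are computed. The function names and the sample scores are illustrative only.

```python
from statistics import mean


def breakeven_win_rate(american_odds=-110):
    """Win rate needed to break even at the given negative American odds."""
    if american_odds >= 0:
        raise ValueError("this sketch only handles negative (favorite) odds")
    risk = -american_odds  # amount staked to win 100
    return risk / (risk + 100)


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Share of variance in the actual values explained by the predictions."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up team scores and predictions, for illustration only.
actual = [28, 17, 35, 21]
predicted = [24, 20, 31, 27]

breakeven = breakeven_win_rate()
mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"breakeven={breakeven:.1%} mae={mae} r2={r2:.3f}")
```

At -110, a bettor must win 110 / 210 of the time just to cover the bookmaker's margin, which is why the target sits a few points above 50%.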
Edge Detection
We define the "edge" as the difference, in points, between our prediction and the Vegas line:
- Small Edge: < 3 points - Lower confidence
- Medium Edge: 3 to under 7 points - Moderate disagreement with Vegas
- Large Edge: 7+ points - Significant value opportunity
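The tiers above translate directly into a small classifier. This sketch assumes the edge is taken as an absolute difference, that exactly 3 points counts as medium, and that exactly 7 points counts as large; the function names and sample lines are hypothetical.

```python
def edge_points(model_value, vegas_value):
    """Absolute disagreement, in points, between our number and the line."""
    return abs(model_value - vegas_value)


def classify_edge(model_value, vegas_value):
    """Bucket the edge into the small / medium / large tiers."""
    edge = edge_points(model_value, vegas_value)
    if edge < 3:
        return "small"
    if edge < 7:  # assumption: exactly 7 points falls in the large tier
        return "medium"
    return "large"


# Hypothetical (model, Vegas) pairs: two spreads and one total.
examples = [(-6.5, -4.0), (52.0, 47.5), (-10.0, -3.0)]
for model_value, vegas_value in examples:
    print(model_value, vegas_value, classify_edge(model_value, vegas_value))
```

The same function works for spreads and totals, since both are expressed in points.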
Limitations & Disclaimers
What We Can't Predict
- Injuries to key players (unless reflected in pre-game lines)
- Weather conditions (extreme weather can significantly impact scoring)
- Motivational factors (rivalry games, playoff implications)
- Coaching changes or in-season staff turnover