diff --git a/README.md b/README.md
index 3953d2d..fdb3c2e 100644
--- a/README.md
+++ b/README.md
@@ -1,17 +1,19 @@
-# Boba | Discover 2024 MiLB Prospects with 97% Accuracy
+# Boba | Discover 2025 MiLB Prospects with 97% Accuracy
 
-Boba is a Next.js application that helps users discover Minor League Baseball (MiLB) prospects using a machine learning model with 97% predictive accuracy (R²). It provides a user-friendly interface to explore the most recent 2024 MiLB prospect pool, offering detailed prospect profiles and rankings. The app leverages Next.js, Firebase, and Tailwind CSS for a modern and efficient experience.
+Boba is a Next.js application that helps users discover Minor League Baseball (MiLB) prospects using a machine learning model with 97% predictive accuracy (R²). It provides a user-friendly interface to explore the most recent 2025 MiLB prospect pool, offering detailed prospect profiles and rankings. The app leverages Next.js, Firebase, and Tailwind CSS for a modern and efficient experience.
+
+![bWAR Histogram](ml/bwar_histogram.jpg)
 
 ## Contributors
 
-| Developer | Affiliation | Contact
-| ------------------------------------------------ | ------------------------------ | ------------------------------ |
-| [Ebod Shojaei](https://github.com/ebodshojaei/) | BSc. University of British Columbia | ebod.shojaei@alumni.ubc.ca |
-| [Rebecca Jeon](https://github.com/rebecca-jeon/) | BSc. University of Victoria | beccajeon12@gmail.com |
+| Developer | Affiliation | Contact |
+| :----------------------------------------------- | :---------------------------------- | :--------------------------- |
+| [Ebod Shojaei](https://github.com/ebodshojaei/) | BSc. University of British Columbia | |
+| [Rebecca Jeon](https://github.com/rebecca-jeon/) | BSc. University of Victoria | |
 
 ## Features
 
-* View a comprehensive list of all 2024 MiLB prospects.
+* View a comprehensive list of all 2025 MiLB prospects.
 * Explore individual MiLB prospect profiles.
 * Sort MiLB prospects by ranking.
 
@@ -22,13 +24,13 @@ Boba is a Next.js application that helps users discover Minor League Baseball (M
 
 ## Routes
 
-* `/`: Home page showcasing a list of 2024 MiLB prospects.
+* `/`: Home page showcasing a list of 2025 MiLB prospects.
 * `/about`: About page with details on the MLB testing data and the "bWAR" metric.
 * `/contact`: Contact page for inquiries, feedback, or bug reports.
 
 ## Data and Methodology
 
-Data is sourced from the [MLB Stats API](https://statsapi.mlb.com/). Our proprietary machine learning model (H2O.ai Stacked Ensemble) was trained on over 6,000 players and rigorously tested on over 1,500 players from the 2015 to 2024 seasons. This model predicts WAR (Wins Above Replacement), which we name "bWAR" (Boba Wins Above Replacement), for over 600 available 2024 prospects.
+Data is sourced from the [MLB Stats API](https://statsapi.mlb.com/). Our proprietary machine learning model (H2O.ai Stacked Ensemble) was trained on over 6,000 players and rigorously tested on over 1,500 players from the 2015 to 2024 seasons. This model predicts WAR (Wins Above Replacement), which we name "bWAR" (Boba Wins Above Replacement), for over 600 available 2025 prospects.
 
 Our "WAR Machine" achieves 97% accuracy (R²) in predicting MiLB prospect WAR based on testing. Error rates were calculated using a modified Symmetric Mean Absolute Percentage Error (sMAPE) for two values, indicating both magnitude and direction within a range of -100% to 100% (0% sMAPE is perfect accuracy). Change in bWAR was calculated for players with available 2023 data to indicate growth or decline.
 
@@ -47,47 +49,46 @@ The model training process included:
 
 A subset of the following pitching statistics from the MLB Stats API were used as features for model training:
 
-| Field | Description |
-|---------------------------|-------------------------------------------------|
-| `stat.gamesPlayed` | Number of games pitched. |
-| `stat.gamesStarted` | Number of games started. |
-| `stat.gamesFinished` | Number of games finished. |
-| `stat.completeGames` | Number of complete games. |
-| `stat.shutouts` | Number of shutouts. |
-| `stat.wins` | Number of wins. |
-| `stat.losses` | Number of losses. |
-| `stat.saveOpportunities` | Number of save opportunities. |
-| `stat.saves` | Number of saves. |
-| `stat.blownSaves` | Number of blown saves. |
-| `stat.holds` | Number of holds. |
-| `stat.inningsPitched` | Innings pitched (can be a formatted string). |
-| `stat.runs` | Runs allowed. |
-| `stat.earnedRuns` | Earned runs allowed. |
-| `stat.battersFaced` | Number of batters faced. |
-| `stat.atBats` | At-bats against the pitcher. |
-| `stat.hits` | Hits allowed. |
-| `stat.doubles` | Doubles allowed. |
-| `stat.triples` | Triples allowed. |
-| `stat.homeRuns` | Home runs allowed. |
-| `stat.baseOnBalls` | Walks issued. |
-| `stat.intentionalWalks` | Intentional walks issued. |
-| `stat.strikeOuts` | Strikeouts. |
-| `stat.hitByPitch` | Batters hit by pitch. |
-| `stat.balks` | Balks committed. |
-| `stat.wildPitches` | Wild pitches. |
-| `stat.groundOuts` | Groundouts induced. |
-| `stat.airOuts` | Flyouts induced. |
-| `stat.stolenBases` | Stolen bases allowed. |
-| `stat.caughtStealing` | Runners caught stealing. |
-| `stat.sacBunts` | Sacrifice bunts allowed. |
-| `stat.sacFlies` | Sacrifice flies allowed. |
-| `stat.catchersInterference`| Catcher's interference while pitching. |
-| `stat.pickoffs` | Pickoffs. |
-| `stat.inheritedRunners` | Inherited runners. |
-| `stat.inheritedRunnersScored`| Inherited runners scored. |
-| `stat.numberOfPitches` | Pitches thrown. |
-| `stat.strikes` | Strikes thrown. |
-
+| Field | Description |
+| :---------------------------- | :------------------------------------------- |
+| `stat.gamesPlayed` | Number of games pitched. |
+| `stat.gamesStarted` | Number of games started. |
+| `stat.gamesFinished` | Number of games finished. |
+| `stat.completeGames` | Number of complete games. |
+| `stat.shutouts` | Number of shutouts. |
+| `stat.wins` | Number of wins. |
+| `stat.losses` | Number of losses. |
+| `stat.saveOpportunities` | Number of save opportunities. |
+| `stat.saves` | Number of saves. |
+| `stat.blownSaves` | Number of blown saves. |
+| `stat.holds` | Number of holds. |
+| `stat.inningsPitched` | Innings pitched (can be a formatted string). |
+| `stat.runs` | Runs allowed. |
+| `stat.earnedRuns` | Earned runs allowed. |
+| `stat.battersFaced` | Number of batters faced. |
+| `stat.atBats` | At-bats against the pitcher. |
+| `stat.hits` | Hits allowed. |
+| `stat.doubles` | Doubles allowed. |
+| `stat.triples` | Triples allowed. |
+| `stat.homeRuns` | Home runs allowed. |
+| `stat.baseOnBalls` | Walks issued. |
+| `stat.intentionalWalks` | Intentional walks issued. |
+| `stat.strikeOuts` | Strikeouts. |
+| `stat.hitByPitch` | Batters hit by pitch. |
+| `stat.balks` | Balks committed. |
+| `stat.wildPitches` | Wild pitches. |
+| `stat.groundOuts` | Groundouts induced. |
+| `stat.airOuts` | Flyouts induced. |
+| `stat.stolenBases` | Stolen bases allowed. |
+| `stat.caughtStealing` | Runners caught stealing. |
+| `stat.sacBunts` | Sacrifice bunts allowed. |
+| `stat.sacFlies` | Sacrifice flies allowed. |
+| `stat.catchersInterference` | Catcher's interference while pitching. |
+| `stat.pickoffs` | Pickoffs. |
+| `stat.inheritedRunners` | Inherited runners. |
+| `stat.inheritedRunnersScored` | Inherited runners scored. |
+| `stat.numberOfPitches` | Pitches thrown. |
+| `stat.strikes` | Strikes thrown. |
 
 ## Acknowledgements
 
diff --git a/client/README.md b/client/README.md
index d74c2f5..8acea7a 100644
--- a/client/README.md
+++ b/client/README.md
@@ -1,6 +1,6 @@
 # Boba | Client App
 
-Boba is a Next.js application designed to help users discover Minor League Baseball (MiLB) prospects with 97% accuracy (R²). It provides a user-friendly interface to explore the 2024 MiLB prospect pool. The app leverages Next.js, Firebase, and Tailwind CSS for a modern and efficient experience.
+Boba is a Next.js application designed to help users discover Minor League Baseball (MiLB) prospects with 97% accuracy (R²). It provides a user-friendly interface to explore the 2025 MiLB prospect pool. The app leverages Next.js, Firebase, and Tailwind CSS for a modern and efficient experience.
 
 ## Contributors
 
@@ -11,7 +11,7 @@ Boba is a Next.js application designed to help users discover Minor League Baseb
 
 ## Features
 
-* View a comprehensive list of all 2024 MiLB prospects.
+* View a comprehensive list of all 2025 MiLB prospects.
 * Explore individual MiLB prospect profiles by clicking on their names.
 * Sort MiLB prospects based on their ranking.
 
@@ -19,13 +19,13 @@ Data is persistently stored in a Firebase database. Images are efficiently handl
 
 ## Routes
 
-* `/`: Home page showcasing a list of all 2024 MiLB prospects.
+* `/`: Home page showcasing a list of all 2025 MiLB prospects.
 * `/about`: About page providing detailed information about the MLB testing data and the "bWAR" metric.
 * `/contact`: Contact page featuring a form for users to submit inquiries, feedback, or bug reports.
 
 ## Data
 
-All data was sourced from the [MLB Stats API](https://statsapi.mlb.com/). Our proprietary machine learning model (H2O.ai Stacked Ensemble) was trained on over 6,000 players and rigorously tested on over 1,500 players spanning the 2015 to 2024 seasons. This model is used to predict all available prospects (over 600) for 2024. Trained on comprehensive player statistics, the model predicts WAR (Wins Above Replacement) for each player, which we've aptly named "bWAR" (Boba Wins Above Replacement).
+All data was sourced from the [MLB Stats API](https://statsapi.mlb.com/). Our proprietary machine learning model (H2O.ai Stacked Ensemble) was trained on over 6,000 players and rigorously tested on over 1,500 players spanning the 2015 to 2024 seasons. This model is used to predict all available prospects (over 600) for 2025. Trained on comprehensive player statistics, the model predicts WAR (Wins Above Replacement) for each player, which we've aptly named "bWAR" (Boba Wins Above Replacement).
 
 Our WAR Machine demonstrates an impressive 97% accuracy (R²) in predicting MiLB prospects, based on our testing results. Error rates were meticulously calculated using a modified version of Symmetric Mean Absolute Percentage Error (sMAPE) for two values, indicating both magnitude and direction within a range of -100% to 100% (0% sMAPE is perfect accuracy). If 2023 data was available for an MiLB player, we calculated the change in bWAR to indicate the player's growth or decline.
 
diff --git a/ml/README.md b/ml/README.md
index 865eea4..4bf32e3 100644
--- a/ml/README.md
+++ b/ml/README.md
@@ -127,7 +127,7 @@ Model performance is evaluated on the held-out test set using the following metr
 
 ## References
 
-[1] Baumer BS, Matthews GJ. A statistician reads the sports page: There is no avoiding WAR. Chance. 2014 Jul 3;27(3):41-4. [https://doi.org/10.1080/09332480.2014.965630](https://doi.org/10.1080/09332480.2014.965630)
-[2] Todd Rob. MLB-StatsAPI – GitHub Repository. [https://github.com/toddrob99/MLB-StatsAPI](https://github.com/toddrob99/MLB-StatsAPI)
-[3] Baseball-Reference. Baseball-Reference.com WAR Explained. [https://www.baseball-reference.com/about/war_explained.shtml](https://www.baseball-reference.com/about/war_explained.shtml)
-[4] Piper Slowinski. Calculating WAR for Position Players. FanGraphs. 2012 Apr 2. [https://library.fangraphs.com/war/war-position-players/](https://library.fangraphs.com/war/war-position-players/)
+1. Baumer BS, Matthews GJ. A statistician reads the sports page: There is no avoiding WAR. Chance. 2014 Jul 3;27(3):41-4. [https://doi.org/10.1080/09332480.2014.965630](https://doi.org/10.1080/09332480.2014.965630)
+2. Todd Rob. MLB-StatsAPI – GitHub Repository. [https://github.com/toddrob99/MLB-StatsAPI](https://github.com/toddrob99/MLB-StatsAPI)
+3. Baseball-Reference. Baseball-Reference.com WAR Explained. [https://www.baseball-reference.com/about/war_explained.shtml](https://www.baseball-reference.com/about/war_explained.shtml)
+4. Piper Slowinski. Calculating WAR for Position Players. FanGraphs. 2012 Apr 2. [https://library.fangraphs.com/war/war-position-players/](https://library.fangraphs.com/war/war-position-players/)