Pitching Analytics for MLB Betting

Table of Contents
- One pitcher carries the game in a way no other sport allows
- Why the starter dominates the matchup more than the franchise
- ERA is the headline number, and the headline number is misleading
- WHIP and what it actually tells you about a starter
- FIP and xERA: the metrics that strip luck from the result
- K/9 and why strikeout rate predicts more than wins
- BB/9 and HR/9: the cheap mistakes that decide tight games
- Times through the order: the penalty everybody underestimates
- Building a matchup model that beats the bookmaker’s defaults
- Where a UK bettor finds the data without paying a subscription
One pitcher carries the game in a way no other sport allows
The average MLB game in 2025 lasted 2 hours 38 minutes, the third consecutive season under 2:40 and the first such run since 1983–85. That stat looks like trivia until you realise what it means for betting. Pitchers throw fewer pitches per game on average, starters reach the seventh inning less often, and the share of innings handled by the bullpen has crept up year on year. The pitcher’s role is changing under your feet, and the betting market is still pricing the starter as if it’s 2018.
This article is my attempt to walk through the metrics that actually predict pitching outcomes: not the ones broadcast graphics throw at you, but the ones I rely on when modelling matchups for stakes. ERA gets ten seconds; FIP gets a section. WHIP gets unpacked properly. The strikeout rate gets treated as the single most important predictor because, on the data I’ve tracked, it is.
The framing I’d offer up front. In MLB, one pitcher throws roughly 60–70% of his team’s innings in any given game he starts. That share of in-game influence is higher than in any other major team sport. A quarterback in American football touches every offensive play; a starter in baseball touches every defensive play of his innings. The difference is that the quarterback has ten teammates also executing on each play, while the pitcher is the single player on the mound when every pitch is thrown. Pitching is the most concentrated source of in-game variance in any major sport.
The takeaway from that concentration is uncomfortable for retail punters: you can know nothing about either team and still profit on a single game if you’ve correctly assessed the starting pitcher matchup. The franchise barely matters. The bullpen matters in late innings. The starter matters across the first 18 to 21 outs.
What follows is the seven-metric set I use. ERA, WHIP, FIP, xERA, K/9, BB/9, and HR/9, plus the times-through-the-order penalty. Each section explains the metric, the threshold value that signals a real edge, and how the bookmaker tends to misprice it. The data sources section at the end is where to find these numbers without paying a subscription — important if your monthly betting budget under the UK affordability framework is in the £50–£200 range and a £400 stats subscription isn’t going to fit.
Why the starter dominates the matchup more than the franchise
Rob Manfred, the MLB Commissioner, put the international growth case plainly: “We do continue to attract the best players in the world to play in Major League Baseball. We had unbelievable audiences in Korea and Japan last year.” That’s the headline. The under-the-headline truth is that the “best players in the world” includes a cohort of pitchers whose individual performance can drag a mediocre team to a series win or hand a great team an embarrassing loss. The talent is uneven across the league — and within a team, often uneven across the rotation.
Concretely. A team with a four-man rotation hands every start, and the majority of its innings, to four pitchers. If three of them are quality arms and the fourth is replacement-level, that team’s games behind the fourth starter carry an entirely different price than its games behind the first. The bookmaker prices this, but inconsistently. Its model usually treats all four rotation slots the same way for moneyline purposes, with adjustments only for the day’s specific starter.
The variance source is uneven inning impact. The first inning is heavily influenced by the starter facing the top of the lineup with no game-state context. The middle innings are still mostly the starter, but he’s now facing hitters for the second and third time and the order’s adjustments start to bite. The late innings shift to the bullpen, where the variance is structurally different — short outings, high-leverage situations, fatigue effects.
The pitch clock has accelerated this shift. With the average game down to 2:38 in 2025 and pitchers throwing fewer pitches per outing, starters have been reaching the seventh inning less often. The bullpen has expanded its share of innings. For betting purposes, this means the starter’s grip on the F5 markets has tightened (he’s almost always still pitching through the fifth), while his grip on the full-game outcome has loosened slightly (he’s less likely to finish the seventh).
One pattern I’ve tracked across the last three seasons: starters’ impact on the moneyline is roughly twice their impact on the totals market when measured against bookmaker mispricing. The market under-prices great starters as moneyline favourites; it more accurately prices great starters as Under tickets on totals. Why? Because Under tickets correlate with multiple metrics (starter + opponent’s offensive depth + bullpen depth + ballpark factors), while moneyline tickets concentrate the bet on the single biggest variable. The bookmaker’s model handles distributed variance better than concentrated variance.
ERA is the headline number, and the headline number is misleading
ERA — earned run average — is the metric every broadcast splashes across the screen, and the one I weight least heavily in any modelling work. It measures the average number of earned runs a pitcher allows per nine innings of work. A 3.50 ERA means the starter is averaging 3.5 earned runs across nine innings; a 4.50 ERA means 4.5 earned runs.
The math is straightforward. The problem is what’s baked in. ERA includes the defence behind the pitcher (a great shortstop turns ground balls into outs that an average shortstop misses; the average shortstop’s misses become hits, runs, and earned-run charges against the pitcher). ERA includes sequencing luck (a pitcher who gives up three singles in an inning gives up runs; the same three singles spread across three innings might give up zero runs). ERA includes bullpen luck (a starter who exits with runners on base relies on the reliever to strand them; if the reliever fails, the inherited runners become earned runs on the starter’s line).
What you actually get from ERA is a lagging indicator. It tells you what happened, weighted by stuff that didn’t have anything to do with the pitcher’s actual performance. Across a full season of 30+ starts, ERA does roughly track quality — but with noise of ±0.40 runs per nine innings, which is enormous for betting purposes. Across a 10-start sample (the slice the bookmaker’s model often weights heavily), ERA has noise of ±0.80 or more.
The number where ERA becomes useful is at the extremes. A starter running an ERA below 2.50 across 100+ innings is almost certainly genuinely good; the defence and sequencing variance can’t account for that gap. A starter running an ERA above 5.00 across 100+ innings is almost certainly genuinely poor. The middle band — 3.00 to 4.50 — is where ERA is essentially uninformative without context.
What I use ERA for in practice is anchoring. When the broadcast announcer says “the starter has a 3.30 ERA”, I take that as roughly true that he’s a quality starter. I don’t bet on the basis of it. The metrics in the next sections — WHIP, FIP, K/9 — are where I actually form a view on the price.
WHIP and what it actually tells you about a starter
WHIP, walks plus hits per inning pitched, is the metric I would teach a new MLB punter before any other. It measures how many baserunners the starter is allowing per inning of work. Walks and hits are summed; the total is divided by innings pitched. A WHIP of 1.00 means the pitcher is averaging exactly one baserunner per inning; 1.30 means 1.3 baserunners.
What makes WHIP more useful than ERA is what it excludes. WHIP doesn’t care about sequencing. It doesn’t care about the bullpen or what happens to inherited runners. Defence still leaks in through hits, because a ball in play that a great fielder converts into an out is a hit behind a poor one, but far less than it does with ERA. WHIP counts only the baserunners the pitcher allows through hits and walks, which makes it a cleaner read on his actual ability than ERA can be.
The benchmark thresholds I work from: a WHIP below 1.00 is elite (very few starters sustain this across a full season). A WHIP of 1.00 to 1.15 is excellent — top-tier starter quality. 1.15 to 1.30 is solid — quality back-of-rotation. 1.30 to 1.45 is league-average. Above 1.45 is below-average, and above 1.60 is start-replacing-this-pitcher territory.
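The calculation and the bands above can be sketched in a few lines of Python. This is a minimal illustration, not a finished tool: the function names are hypothetical, `innings` is assumed to be true innings (6.333, not the 6.1 box-score shorthand), and how a value sitting exactly on a boundary is classified is my own assumption.

```python
def whip(walks: int, hits: int, innings: float) -> float:
    """Walks plus hits per inning pitched."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (walks + hits) / innings


# (upper bound, label) pairs copied from the thresholds in the text.
WHIP_BANDS = [
    (1.00, "elite"),
    (1.15, "excellent"),
    (1.30, "solid"),
    (1.45, "league-average"),
    (1.60, "below-average"),
]


def whip_band(value: float) -> str:
    """Return the first band whose upper bound the value falls under."""
    for upper, label in WHIP_BANDS:
        if value < upper:
            return label
    return "replacement territory"


# Example: 40 walks and 150 hits over 180 innings.
example = whip(40, 150, 180)
print(round(example, 2), whip_band(example))
```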
The reason WHIP predicts more than ERA on a bet-by-bet basis is the variance smoothing. ERA’s outcome (runs) is sparse — a starter might give up 0, 1, 2, or 6 earned runs in a given game, with the distribution wildly skewed. Baserunners (the input to WHIP) are denser — even on a great outing, a starter typically allows 4 or 5 baserunners. The denser the input metric, the smaller the noise.
Where WHIP gets interesting for betting is the gap between WHIP and ERA on a specific starter. A pitcher with a 1.05 WHIP but a 4.20 ERA has been unlucky — his baserunner control is excellent, but those baserunners have been scoring at an above-average rate due to sequencing or defence. Bookmaker models that anchor on ERA will price this pitcher too long on the moneyline. Conversely, a pitcher with a 1.45 WHIP but a 3.10 ERA has been lucky — his baserunners haven’t been scoring much, and the run prevention won’t sustain.
I trade this WHIP-ERA divergence regularly. When a starter’s ERA sits more than 0.8 runs away (in either direction) from what his WHIP would normally support, the bookmaker has materially mispriced the next start. The signal isn’t perfect: pitchers can sustain low strand rates for partial seasons, and great pitchers really do strand runners better than the league average. But on a hundred-bet sample, the divergence has predicted my closing line value better than any other single input.
FIP and xERA: the metrics that strip luck from the result
FIP and xERA exist for the same reason: ERA is too noisy to be useful at the individual-start level, and someone smart wanted a number that captured pitcher skill without the noise. They go about it differently.
FIP — fielding-independent pitching — strips out everything except the things only the pitcher controls: strikeouts, walks, hit-by-pitches, and home runs allowed. The formula weights each of these against league averages and produces a number scaled to look like ERA. A 3.10 FIP can be read as “this is what the pitcher’s ERA would be if his defence and sequencing luck were exactly league-average”. A 4.20 FIP says the same in the opposite direction.
The intuition is that strikeouts are 100% the pitcher’s doing (no fielder is involved). Walks and HBPs are 100% the pitcher’s doing (he chose where to throw the ball). Home runs are mostly the pitcher’s doing (the hitter had to do something with the pitch, but the ball left the park because the pitch was hittable). Everything else — ground balls, fly balls, line drives — depends on defence and on the random distribution of where the ball goes. FIP ignores the random parts.
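For readers who want to compute FIP by hand, the commonly published form of the formula weights home runs at 13, walks plus hit-by-pitches at 3 and strikeouts at minus 2, divides by innings, and adds a seasonal constant that puts the result on the ERA scale. The sketch below assumes a constant of 3.10 purely as a placeholder; the real value changes each season and should be looked up. Function names are hypothetical.

```python
# Placeholder only: the constant is recalculated every season so that
# league-wide FIP matches league-wide ERA. Look up the current value.
FIP_CONSTANT = 3.10


def era(earned_runs: int, innings: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


def fip(hr: int, bb: int, hbp: int, k: int, innings: float,
        constant: float = FIP_CONSTANT) -> float:
    """Fielding-independent pitching on the ERA scale."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Example line: 180 innings, 70 earned runs, 20 HR, 45 BB, 5 HBP, 200 K.
pitcher_era = era(70, 180)
pitcher_fip = fip(hr=20, bb=45, hbp=5, k=200, innings=180)
print(f"ERA {pitcher_era:.2f}, FIP {pitcher_fip:.2f}, "
      f"gap {pitcher_era - pitcher_fip:+.2f}")
```

A positive gap (ERA above FIP) is the "unlucky so far" direction; a negative gap is the reverse.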
xERA — expected ERA — does the same thing but with more sophistication. It uses Statcast data on exit velocity, launch angle, and barrel rate to estimate what runs the pitcher “should” have allowed based on the quality of contact he gave up. A pitcher who gives up lots of weak contact has a low xERA even if his ERA is inflated by bloop hits and bad defence. A pitcher who gives up lots of hard contact has a high xERA even if the line drives have been caught.
For betting purposes, FIP is the workhorse and xERA is the validator. I use FIP for matchup modelling because it’s faster to compute, easier to find in free data sources, and more stable across sample sizes under 100 innings. I cross-check with xERA when the FIP-ERA gap looks suspicious, because xERA’s Statcast inputs can confirm or contradict the FIP read.
The signal I treat as gold is “FIP and xERA agree and disagree with ERA”. When both fielding-independent metrics say the starter is better than his ERA suggests, the bookmaker’s price (which usually anchors on ERA) is too long. When both say he’s worse, the bookmaker’s price is too short. When the two metrics disagree with each other, I treat the matchup as cloudy and don’t bet.
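That decision rule is simple enough to write down. The sketch below is one possible encoding, not a calibrated filter: the `min_gap` of 0.40 runs is a hypothetical cut-off for when a metric counts as disagreeing with ERA, and the labels are my own.

```python
def luck_signal(era: float, fip: float, xera: float,
                min_gap: float = 0.40) -> str:
    """Classify a starter by how FIP and xERA compare with his ERA.

    min_gap is an assumed threshold (in runs per nine) for a metric to
    count as materially different from ERA.
    """
    fip_gap = era - fip    # positive: FIP says he is better than his ERA
    xera_gap = era - xera  # positive: xERA says he is better than his ERA

    if fip_gap >= min_gap and xera_gap >= min_gap:
        return "back"  # both say better than ERA: price likely too long
    if fip_gap <= -min_gap and xera_gap <= -min_gap:
        return "fade"  # both say worse than ERA: price likely too short
    return "pass"      # metrics split, or too close to ERA to matter


print(luck_signal(era=4.20, fip=3.30, xera=3.40))  # back
print(luck_signal(era=3.10, fip=4.00, xera=4.10))  # fade
print(luck_signal(era=4.20, fip=3.30, xera=4.60))  # pass
```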
One practical note: FIP is not predictive of next-season FIP at the level retail betting needs. It’s a descriptive metric of what happened. Next season’s projection uses different tools (Steamer, ZiPS, Marcel) that combine multiple seasons of data. For single-game betting, FIP-to-date is the right input. For futures markets (covered in the futures cluster), you want a projection system, not a descriptive metric.
K/9 and why strikeout rate predicts more than wins
If I had to pick one metric to model an MLB matchup with — one number, no others — it would be K/9. Strikeouts per nine innings. Strikeout rate is the single most stable, most predictive, most repeatable pitching metric in the entire data ecosystem.
Why? Strikeouts are pure pitcher-versus-batter. No fielders. No park effects worth speaking of. No sequencing, no luck. The ball doesn’t get put in play; the at-bat resolves on the pitcher’s stuff and the batter’s swing decisions. Year to year, K/9 is the most correlated pitching metric — meaning a pitcher’s K/9 in season X is the best single predictor of his K/9 in season X+1. ERA, WHIP, FIP all correlate less.
The benchmark thresholds: a K/9 above 11.0 is elite. 9.0 to 11.0 is excellent. 7.5 to 9.0 is solid. 6.0 to 7.5 is below average for a modern starter. Below 6.0 is concerning unless the pitcher is a ground-ball specialist with strong control. The league-average K/9 for starters is around 8.5 in 2026, up from 7.0 a decade ago — strikeout rates have risen steadily as bullpens get better and starters lean harder on max-effort outings.
What K/9 predicts for betting purposes goes beyond pitcher quality. It predicts strikeout props directly, a topic treated in more depth in the cluster on applying K/9 to strikeout props. It predicts F5 totals (high-K starters suppress run scoring in their innings). It correlates with moneyline expected value when paired with opponent contact rates. And it correlates inversely with the variance of the start: high-K pitchers are more reliable inning-to-inning because their out-getting doesn’t depend on fielders.
One nuance worth carrying: K/9 against specific opponent lineups varies significantly. A pitcher with a season K/9 of 10.0 facing a strikeout-prone lineup (team K-rate above 25%) might project to 12.0 in that matchup. The same pitcher facing a contact-heavy lineup (team K-rate below 18%) might project to 8.0. The bookmaker prices the prop line accordingly, but the model often lags actual recent opponent K-rates by several weeks.
The single most consistent retail edge I’ve found in MLB betting is taking pitcher strikeout Over lines when the pitcher’s K/9 is elite and the opponent’s K-rate is above league average. The bookmaker prices the prop on the pitcher’s season K/9 multiplied by expected innings, with a soft adjustment for opponent. The adjustment is consistently too small. The Over line clears more often than the implied probability says.
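A rough way to put numbers on that prop is to scale the pitcher’s season K/9 by the opponent’s strikeout rate relative to the league, then convert to expected strikeouts over the expected innings. The linear scaling and the 22% league K-rate below are assumptions made for illustration, not the bookmaker’s method or a fitted model, and the function names are hypothetical.

```python
LEAGUE_K_RATE = 0.22  # assumed league-average team strikeout rate


def matchup_k9(season_k9: float, opp_k_rate: float,
               league_k_rate: float = LEAGUE_K_RATE) -> float:
    """Scale season K/9 by how strikeout-prone the opponent is."""
    return season_k9 * opp_k_rate / league_k_rate


def projected_strikeouts(season_k9: float, expected_innings: float,
                         opp_k_rate: float,
                         league_k_rate: float = LEAGUE_K_RATE) -> float:
    """Expected strikeouts for one start against a given lineup."""
    adjusted = matchup_k9(season_k9, opp_k_rate, league_k_rate)
    return adjusted * expected_innings / 9


# A 10.0 K/9 starter expected to go 6 innings.
for opp_rate in (0.18, 0.22, 0.25):
    ks = projected_strikeouts(10.0, 6.0, opp_rate)
    print(f"opponent K-rate {opp_rate:.0%}: {ks:.1f} projected strikeouts")
```

Comparing the projection with the posted line shows how much opponent adjustment the price already contains.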
BB/9 and HR/9: the cheap mistakes that decide tight games
BB/9 is walks per nine innings; HR/9 is home runs allowed per nine innings. Both are simple metrics that the bookmaker’s model handles competently for moneyline pricing but mishandles in specific contexts that retail edge survives in.
BB/9 thresholds: below 2.0 is elite (Greg Maddux territory). 2.0 to 2.5 is excellent. 2.5 to 3.0 is solid. 3.0 to 3.5 is league-average. Above 3.5 is concerning. Above 4.5 the pitcher is hurting his own team. The metric is highly stable year-to-year; pitchers who walk batters tend to keep walking batters because control is a learned skill that doesn’t fluctuate much in season.
Why BB/9 matters for betting is that walks correlate strongly with big innings. A starter who walks two batters in an inning is one bad swing away from a three-run inning. The home runs that ruin starts disproportionately come with one or two baserunners on. BB/9 above 3.5 amplifies the variance of every other metric.
HR/9 thresholds: below 0.8 is excellent. 0.8 to 1.2 is solid. 1.2 to 1.5 is concerning. Above 1.5 is structurally problematic. The metric is more park-dependent than the others — Coors Field starters run HR/9 figures that look terrible but reflect the venue, not the pitcher. HR/9 normalised for park (HR/9- in some data sources) is the cleaner read.
The interaction between BB/9 and HR/9 is the part that bookmakers’ models handle worst. A starter with a 3.8 BB/9 and a 1.3 HR/9 isn’t just twice as bad as a starter with a 3.8 BB/9 and a 0.6 HR/9. The combination is worse than the sum of its parts because a walk on its own only sometimes comes around to score, but a walk followed by a home run always does, turning a solo shot into a two- or three-run shot. The convex relationship is what models tend to flatten.
For UK betting purposes, the practical advice is to flag starters whose BB/9 and HR/9 are both above league average. The bookmaker’s moneyline will price them roughly fairly. The Over on the totals line is where the underpricing tends to show up. The hidden risk of multi-run innings is hard for a model to fully express in the single number that is the totals line.
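The flag itself is mechanical. The sketch below computes both rates and marks a starter when each clears a cut-off; the defaults of 3.5 for BB/9 and 1.2 for HR/9 are taken from where the "concerning" bands start in the thresholds above, and are assumptions standing in for whatever league averages you look up. Names are hypothetical.

```python
def per_nine(count: int, innings: float) -> float:
    """Convert a raw count into a per-nine-innings rate."""
    return 9 * count / innings


def big_inning_risk(walks: int, home_runs: int, innings: float,
                    bb9_cutoff: float = 3.5,
                    hr9_cutoff: float = 1.2) -> bool:
    """True when both the walk rate and the home-run rate exceed their cut-offs."""
    bb9 = per_nine(walks, innings)
    hr9 = per_nine(home_runs, innings)
    return bb9 > bb9_cutoff and hr9 > hr9_cutoff


# The two starters from the text: both 3.8 BB/9, one 1.3 HR/9, one 0.6 HR/9.
print(big_inning_risk(walks=76, home_runs=26, innings=180))  # True
print(big_inning_risk(walks=76, home_runs=12, innings=180))  # False
```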
Times through the order: the penalty everybody underestimates
Here’s the metric that the broadcast graphic never shows and the bookmaker’s model handles badly: the times-through-the-order penalty.
The penalty refers to the documented fact that starting pitchers perform progressively worse each time they cycle through the opposing lineup. First time through (batters 1–9), a typical starter has an OPS-against around .680. Second time through (batters 10–18), that climbs to .720. Third time through (batters 19–27), it jumps to .770 or higher. Fourth time through (rare in 2026 with the pitch clock and modern usage patterns), the OPS-against spikes above .800.
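As a quick reference, the pass a starter is on follows directly from how many batters he has faced. The lookup below uses the OPS-against figures quoted above as rough floor values ("or higher" in the text) rather than measured data for any particular pitcher; the names are hypothetical.

```python
# OPS-against by pass through the order, using the figures quoted in the text.
OPS_BY_PASS = {1: 0.680, 2: 0.720, 3: 0.770, 4: 0.800}


def pass_number(batter_number: int) -> int:
    """Which pass through the order a batter belongs to (batters count from 1).

    Anything beyond the fourth pass is capped at 4.
    """
    if batter_number < 1:
        raise ValueError("batter_number starts at 1")
    return min((batter_number - 1) // 9 + 1, 4)


def expected_ops_against(batter_number: int) -> float:
    """Typical OPS-against for the pass this batter falls in."""
    return OPS_BY_PASS[pass_number(batter_number)]


for n in (1, 9, 10, 18, 19, 27, 28):
    print(n, pass_number(n), expected_ops_against(n))
```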
What drives the penalty is information. The hitters have seen the pitcher’s stuff. They’ve calibrated to his timing. They know what’s coming with two strikes versus what’s coming early in the count. The pitcher’s fastball doesn’t get slower the third time around — but the batters’ bat speed effectively gets faster because they know when to start their swing.
For betting purposes, the third-time-through threshold is the trigger. A starter heading into his third pass through the order — typically in the fifth and sixth innings — is materially weaker than the same starter in his first pass. Most managers in 2026 manage to this aggressively: starters get pulled in the fifth or sixth if a high-leverage situation arises against the top of the order for the third time.
The bookmaker’s model knows the penalty exists. What it doesn’t price well is the interaction between the penalty and the specific lineup. A starter heading into his third pass against a deep, patient lineup faces real third-time penalty risk. The same starter against a thin lineup with high strikeout rates faces less risk because the lineup hasn’t learned as much from the first two passes.
The application for UK punters is the live totals market. If a starter enters the fifth inning having limited the opponent to one or two hits, the bookmaker’s live total has dropped (because the under is now favoured). But the third-time penalty is about to bite, the bullpen will need to enter soon, and the variance is about to spike. The live Over price in this spot is structurally generous. I’ve consistently profited on live Over tickets in exactly this window, and the principle is just timing the penalty correctly.
Building a matchup model that beats the bookmaker’s defaults
SportsLine, a prediction service I’ve followed casually for several years, ran its MLB picks engine to a 35–29 record on moneyline picks across the 2025 season, with each World Series scenario simulated 10,000 times. That sample is too small to claim systematic edge: 64 coin-flip picks would land at 50%, give or take roughly six percentage points of noise, and 35–29 is about 55%. But the methodology is the right one. Simulate the matchup ten thousand times. Look at the distribution of outcomes. Bet when your distribution materially differs from the bookmaker’s implied distribution.
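The sample-size point about a 35–29 record can be checked with a short binomial calculation. The sketch below treats every pick as an independent even-money coin flip, which is a simplifying assumption (real moneyline picks are not all priced at 50%); the function name is hypothetical.

```python
import math


def record_vs_coin_flip(wins: int, losses: int, p_null: float = 0.5):
    """Return (win rate, standard error under p_null, z-score)."""
    n = wins + losses
    win_rate = wins / n
    std_err = math.sqrt(p_null * (1 - p_null) / n)
    z = (win_rate - p_null) / std_err
    return win_rate, std_err, z


rate, se, z = record_vs_coin_flip(35, 29)
print(f"win rate {rate:.1%}, one standard error {se:.1%}, z = {z:.2f}")
```

A z-score under 1 means the record sits well inside what pure chance produces over 64 picks.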
You don’t need a simulation engine to do this manually. The framework, which I run on paper for two or three matchups per night:
Step one. Estimate each starter’s expected innings pitched. A typical 2026 starter goes 5.5 to 6 innings. A high-K starter might reach 6.5. A struggling starter might be pulled at 4. Match this to your matchup model — a starter facing a high-OBP lineup will likely come out sooner.
Step two. Estimate each starter’s expected runs allowed across those innings. Use FIP scaled to expected innings, with a 10–15% adjustment for opponent quality. A 3.20 FIP starter going 6 innings projects to roughly 2.1 expected runs; bump that to 2.4 against a top-tier lineup.
Step three. Estimate the bullpen’s contribution. Default to the team’s bullpen ERA scaled to expected bullpen innings. A team with a 4.10 bullpen ERA covering 3 innings projects to roughly 1.4 expected runs.
Step four. Sum the team-against-team expected runs. Team A starter expected runs allowed + Team A bullpen expected runs allowed = Team B’s projected runs scored. Same in reverse. Now you have two numbers: each team’s expected runs in the game.
Step five. Convert expected runs to win probability. Bill James’s Pythagorean expectation says win percentage approximately equals runs scored squared divided by (runs scored squared plus runs allowed squared). If you project Team A scoring 4.2 and allowing 3.6, their projected win rate is 4.2²/(4.2²+3.6²) = 17.64/30.60 ≈ 57.6%.
Step six. Compare your win probability to the bookmaker’s implied probability. The bookmaker’s moneyline of 1.75 implies 57.1%. Your model says 57.6%. The gap is too small to bet. The bookmaker’s moneyline of 2.10 implies 47.6%. Your model says 57.6%. The gap is 10 percentage points — a strong positive EV bet.
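The six steps fit in a short script. The sketch below reproduces the worked numbers from the steps: the 1.125 opponent adjustment is the midpoint of the 10–15% range, the exponent of 2 is the classic Pythagorean form, and the implied probability is the raw inverse of the decimal price with no allowance for the bookmaker’s margin. All function names are hypothetical.

```python
def expected_runs(rate_per_nine: float, innings: float,
                  opponent_adj: float = 1.0) -> float:
    """Scale a per-nine rate (FIP or bullpen ERA) to the innings covered."""
    return rate_per_nine * innings / 9 * opponent_adj


def pythagorean_win_prob(runs_for: float, runs_against: float,
                         exponent: float = 2.0) -> float:
    """Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rf = runs_for ** exponent
    return rf / (rf + runs_against ** exponent)


def implied_prob(decimal_odds: float) -> float:
    """Raw implied probability of a decimal price (margin not removed)."""
    return 1 / decimal_odds


def edge(model_prob: float, decimal_odds: float) -> float:
    """Model probability minus the bookmaker's implied probability."""
    return model_prob - implied_prob(decimal_odds)


# Steps two and three: starter and bullpen expected runs allowed.
starter_runs = expected_runs(3.20, 6.0, opponent_adj=1.125)
bullpen_runs = expected_runs(4.10, 3.0)
print(f"starter {starter_runs:.1f}, bullpen {bullpen_runs:.1f}")

# Steps five and six: win probability and the gap against two prices.
p_win = pythagorean_win_prob(4.2, 3.6)
for price in (1.75, 2.10):
    print(f"price {price}: model {p_win:.1%}, "
          f"implied {implied_prob(price):.1%}, edge {edge(p_win, price):+.1%}")
```

Where to set the minimum edge before staking is a judgement call; the steps above treat half a point as noise and ten points as a bet.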
The framework above is rough and crude. A professional model uses much more sophisticated inputs and a much more careful adjustment process. But the rough framework is also tractable for a UK punter doing their analysis at 8pm on a Tuesday evening, and it produces directional signals that beat the median retail bettor consistently. The polished work happens at the margins.
Where a UK bettor finds the data without paying a subscription
Most British MLB punters I talk to have either zero data sources or one expensive one. Both are wrong. The data ecosystem for MLB pitching analytics is generous if you know where to look, and the price is mostly free if you’re willing to do the lookup work yourself.
MLB.com itself is the starting point and the most underappreciated source. The Stats section covers ERA, WHIP, K/9, BB/9, HR/9 for every active pitcher, splittable by season and recent games. The site’s traffic underscores how active the data ecosystem is — MLB.TV grew viewership by 27% year-on-year in 2025, with 7.5 billion minutes of streaming consumed by early June 2025. The audience is large, the data is publicly accessible, and the depth is good enough for most retail betting analysis.
Baseball Reference is the historical standard. Career splits, year-by-year breakdowns, pitch-type usage. Free for browsing; paid tier for advanced filtering. The free tier is sufficient for any UK punter doing single-game analysis. The paid tier is worth it only if you’re running deep historical research.
FanGraphs covers the advanced metrics — FIP, xFIP, xERA, SIERA, BB/9, K/9, swing rates, and the times-through-the-order splits. Free for browsing; paid Membership tier unlocks projections, custom queries, and deeper splits. For UK retail punters, the free tier covers everything needed. The Membership is for serious modellers.
Baseball Savant is the MLB-operated Statcast hub. Exit velocity, launch angle, barrel rate, all the Statcast inputs that drive xERA. Free. The interface is dense but rewards an hour of exploration.
Rotowire is the working-pitcher news source. Late lineup changes, injury news, starter scratches. The free RSS feeds cover most of what you need.
The pricing for these is, honestly, friendlier than most UK punters realise. The combined cost of “browse free across MLB.com, Baseball Reference, FanGraphs, and Baseball Savant” is zero. The marginal upgrade to paid tiers across those four sources is roughly £20 per month combined — perfectly affordable for any UK punter staking £150 per month under the affordability framework.
The hardest data to find for free is in-season opponent K-rates against specific pitch types, which is where Statcast paid analytics tiers become tempting. If you’re refining a model that already shows positive CLV, the paid tier is justified. If you’re still building one, free sources will get you 90% of the way.
Which single pitching metric is most predictive of MLB betting outcomes?
K/9 — strikeouts per nine innings — is the most stable and predictive pitching metric on a single-game basis. It depends entirely on the pitcher’s stuff and the batter’s swing decisions, with no defence or sequencing luck in the way. Across multiple seasons, K/9 is the metric most correlated year-to-year with itself, meaning a starter’s prior K/9 is the best leading indicator of his next K/9. For bookmaker pricing, the K/9 against opponent K-rate matchup is the angle that retail edge most reliably lives in.
Where can UK bettors find reliable pitching analytics?
MLB.com, Baseball Reference, FanGraphs, and Baseball Savant cover essentially all the metrics needed for retail-level MLB betting analysis, and the free tiers across all four are usually sufficient. The combined cost is zero if you do the manual lookup, or roughly £20 per month if you upgrade to the paid Membership tiers. Rotowire covers in-the-hour news on lineups and pitcher scratches.
How does the pitch clock affect pitching analytics for betting?
The pitch clock has shortened average game times to 2 hours 38 minutes — three consecutive seasons under 2:40 for the first time since 1983–85. The downstream effect on pitching analytics is that starters reach the seventh inning less frequently, bullpens cover slightly more innings, and the third-time-through-the-order penalty matters more because managers pull starters more aggressively. K/9 has crept up because pitchers throw fewer pitches per outing on average, which lets them throw harder. Bookmaker models have been slow to fully adjust totals lines for these shifts.
Created by the “mlb Online Betting” editorial team.
