Predicting Teams’ Points against Opponents (Part 2)

This post is a sequel to Predicting Teams’ Points against Opponents.

To summarize, in the previous post, we found that taking an average of teams’ offensive strength and opponents’ defensive strength performed best at predicting points scored against opponents.

For example, if Team A (who scored 110 points on average against its opponents) faced Team B (who allowed 90 points on average to its opponents), then we’d predict that Team A would score 100 points against Team B.
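As a minimal sketch of the midpoint calculation in the example above (the function name `predict_points` is made up for illustration and is not from the original analysis):

```python
def predict_points(avg_scored: float, opp_avg_allowed: float) -> float:
    """Midpoint of a team's average points scored and its opponent's average points allowed."""
    return (avg_scored + opp_avg_allowed) / 2


# Team A scores 110 on average; Team B allows 90 on average.
prediction = predict_points(110, 90)
print(prediction)  # 100.0
```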

We also found that both cumulative averages and 10-game rolling averages outperformed 5-game rolling averages, most likely because averaging over more data points made the predictions more accurate.
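To make the difference between the two kinds of averages concrete, here is a rough sketch. The helper names and the sample scores are hypothetical, and letting the rolling average use fewer games when a full window is not yet available is my assumption here, not necessarily how the previous post handled early-season games.

```python
from typing import List, Optional


def cumulative_mean(prior_points: List[float]) -> Optional[float]:
    """Average over every earlier game; None if there are no earlier games."""
    if not prior_points:
        return None
    return sum(prior_points) / len(prior_points)


def rolling_mean(prior_points: List[float], window: int) -> Optional[float]:
    """Average over the last `window` earlier games (fewer if not enough exist)."""
    if window <= 0:
        raise ValueError("window must be a positive number of games")
    recent = prior_points[-window:]
    if not recent:
        return None
    return sum(recent) / len(recent)


# Hypothetical points scored in a team's previous six games, oldest first.
prior = [100, 104, 96, 110, 120, 112]
print(cumulative_mean(prior))  # 107.0
print(rolling_mean(prior, 5))  # 108.4
```

Both helpers only look at games played before the one being predicted, so the game's own score never leaks into its prediction.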

In this post, we will tweak our methods in two different ways to better predict teams’ points scored against their opponents.


1. Using Home/Away-Specific Averages

First, in the previous post, we used general rolling and cumulative averages, not site-specific averages. I figured using site-specific averages might give us more accurate predictions.

Specifically, if Team A were playing at home against Team B (the away team), I’d take Team A’s average points scored in its previous home games and Team B’s average points allowed in its previous away games. I would then predict the midpoint of those two figures.
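Here is a small sketch of the site-specific idea. The game log, its field names (`team`, `site`, `scored`, `allowed`), and the helper `site_average` are all invented for illustration; the real data set surely looks different.

```python
from typing import Dict, List

# Hypothetical game log: one record per team per game already played.
games: List[Dict] = [
    {"team": "A", "site": "home", "scored": 112, "allowed": 100},
    {"team": "A", "site": "home", "scored": 116, "allowed": 104},
    {"team": "A", "site": "away", "scored": 104, "allowed": 108},
    {"team": "B", "site": "away", "scored": 95, "allowed": 94},
    {"team": "B", "site": "away", "scored": 99, "allowed": 98},
    {"team": "B", "site": "home", "scored": 101, "allowed": 88},
]


def site_average(games: List[Dict], team: str, site: str, field: str) -> float:
    """Average of `field` over a team's previous games at the given site."""
    values = [g[field] for g in games if g["team"] == team and g["site"] == site]
    if not values:
        raise ValueError(f"no previous {site} games for team {team}")
    return sum(values) / len(values)


# Team A hosts Team B: use A's home scoring and B's away defense only.
a_home_scored = site_average(games, "A", "home", "scored")    # 114.0
b_away_allowed = site_average(games, "B", "away", "allowed")  # 96.0
site_prediction = (a_home_scored + b_away_allowed) / 2
print(site_prediction)  # 105.0
```

Note that each site-specific average draws on only part of a team's schedule, so it is built from fewer games than the general average.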


2. Using Home/Away-Adjusted General Averages

Another way we can tweak our original prediction methods is by manually adjusting our home/away projections.

From my earlier analysis, I know that teams playing at home tend to score about 1 extra point (on average) against their opponents, even after adjusting for game outcomes. (Yes, the effect seems very marginal.) By the same logic, teams playing away tend to allow about 1 extra point (on average) to their opponents.

Knowing this, we can manually adjust our predictions. Let’s suppose again that Team A (who scored 110 points on average) is to play against Team B (who allowed 90 points on average). If Team A is playing at home, we’d calculate that Team A would score 111 points (110 + 1 extra point scored for playing at home). On the other hand, we’d calculate that Team B would allow 91 points (90 + 1 extra point allowed for playing away). We would then find the midpoint of these two numbers to predict what Team A would score against Team B. In this case, that comes out to (111 + 91) / 2, which is 101.

We have manually adjusted our general average projections by accounting for home/away differences.
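The adjustment above can be sketched as follows. The names `HOME_EDGE` and `predict_points_site_adjusted` are hypothetical. The post only walks through the case where the scoring team is at home; treating the away case as the mirror image (subtracting the same 1 point from both averages) is my assumption for this sketch.

```python
HOME_EDGE = 1.0  # roughly 1 extra point for the home side, per the earlier analysis


def predict_points_site_adjusted(
    avg_scored: float,
    opp_avg_allowed: float,
    team_is_home: bool,
    home_edge: float = HOME_EDGE,
) -> float:
    """Shift both general averages by the home/away edge, then take the midpoint."""
    # Home case is from the post; the away case (negative shift) is an assumption.
    shift = home_edge if team_is_home else -home_edge
    adjusted_scored = avg_scored + shift        # e.g. 110 + 1 = 111 at home
    adjusted_allowed = opp_avg_allowed + shift  # e.g. 90 + 1 = 91 for an away opponent
    return (adjusted_scored + adjusted_allowed) / 2


print(predict_points_site_adjusted(110, 90, team_is_home=True))   # 101.0
print(predict_points_site_adjusted(110, 90, team_is_home=False))  # 99.0
```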


Performance Chart for Different Projection Methods


| Metric Type | Metric | Correlation Coefficient | Mean Absolute Error (MAE) |
| --- | --- | --- | --- |
| General | $\frac{\text{rqP\_rollmean5\_gen} + \text{o\_rqPA\_rollmean5\_gen}}{2}$ | 0.424 | 8.65 |
| General | $\frac{\text{rqP\_rollmean10\_gen} + \text{o\_rqPA\_rollmean10\_gen}}{2}$ | 0.461 | 8.487 |
| General | $\frac{\text{rqP\_cummean\_gen} + \text{o\_rqPA\_cummean\_gen}}{2}$ | 0.474 | 8.496 |
| Site-Specific | $\frac{\text{rqP\_rollmean5\_site} + \text{o\_rqPA\_rollmean5\_site}}{2}$ | 0.438 | 8.591 |
| Site-Specific | $\frac{\text{rqP\_rollmean10\_site} + \text{o\_rqPA\_rollmean10\_site}}{2}$ | 0.47 | 8.471 |
| Site-Specific | $\frac{\text{rqP\_cummean\_site} + \text{o\_rqPA\_cummean\_site}}{2}$ | 0.481 | 8.469 |
| General (Site-Adjusted) | $\frac{\text{rqP\_rollmean5\_gen\_siteadj} + \text{o\_rqPA\_rollmean5\_gen\_siteadj}}{2}$ | 0.442 | 8.567 |
| General (Site-Adjusted) | $\frac{\text{rqP\_rollmean10\_gen\_siteadj} + \text{o\_rqPA\_rollmean10\_gen\_siteadj}}{2}$ | 0.479 | 8.404 |
| General (Site-Adjusted) | $\frac{\text{rqP\_cummean\_gen\_siteadj} + \text{o\_rqPA\_cummean\_gen\_siteadj}}{2}$ | 0.492 | 8.415 |

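We can see from the performance chart that the site-adjusted general projection methods performed better than the simple general projection methods without any adjustments. For every averaging window, they also posted the highest correlation coefficients and the lowest mean absolute errors of the three method types.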

We can also see from the performance chart that site-specific projection methods performed better than simple general projection methods, as shown by their lower mean absolute errors and higher correlation coefficients.
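For readers who want to compute the two columns of the chart themselves, here is a rough sketch. The post does not say which correlation coefficient was used, so Pearson's is an assumption on my part, and the scores below are made-up numbers rather than real game data.

```python
import math
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute gap between actual and predicted points."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumed; the post does not name the type)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Made-up actual and predicted points for four games.
actual = [100, 110, 95, 105]
predicted = [102, 106, 97, 103]

mae = mean_absolute_error(actual, predicted)
corr = pearson_correlation(actual, predicted)
print(mae)  # 2.5
print(round(corr, 3))
```

A better projection method shows up as a lower MAE and a higher correlation coefficient, which is how the rows of the chart are compared.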

In the next post, I will utilize multivariate regression modeling techniques to create statistical models to achieve even better predictions.

About the Author: Howard Song

I’m a data practitioner by day, a web developer by night, a semi-competent swimmer, an active basketball player, a collector of cool ideas, an aspiring entrepreneur, a college dropout but a lifelong learner, and a self-professed nice guy. I love all things basketball, data, programming, and entrepreneurship.