Bayesian probabilistic forecasts using categorical information | Part 1

In this blog, I will make Bayesian forecasts of Ozone concentrations.

My previous blog on Bayesian analysis: Bayesian Feminist.


I have data from a single air pollution monitoring station for the year 2019. The two columns we'll use are:

MDA: the daily maximum 8-hour average of Ozone concentrations, in μg/m³. We have one data point per day.

season: a categorical variable calculated from the date itself. Dec/Jan/Feb is winter, Mar/Apr/May is summer, and so on.

Objective: To forecast Ozone concentrations, using Bayes’ theorem.

Note: We could do the same with a simple linear regression as well. The point of this blog is to build intuition for the Bayesian approach.


POSTERIOR = LIKELIHOOD * PRIOR / EVIDENCE

This is Bayes' theorem. Let's work through it one term at a time.
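For this problem it helps to write the theorem in symbols. This is only a sketch of the notation: I cut the Ozone range into steps $b$ (the 1 μg/m³ steps used later in this blog) and treat "winter" as the information. The symbol $b$ is just a label introduced here.

```latex
P(\text{MDA} \in b \mid \text{winter})
  = \frac{P(\text{winter} \mid \text{MDA} \in b)\; P(\text{MDA} \in b)}{P(\text{winter})}
```

Reading left to right: the posterior for a step, the likelihood of winter given that step, the prior for that step, and the evidence.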

What is Prior?

The prior is our opinion, before seeing any further information, on how Ozone concentrations are distributed. We could go for an uninformative prior, i.e. a uniform distribution. But since we have a year of data, we'll use the distribution of that data as our prior opinion.

Figure: kernel density plot of the 2019 MDA values, used as the prior. The smoothed curve extends below MDA = 0, which is not meaningful, but we'll go along with it for now.

Based on this prior, if I ask you to forecast the Ozone concentration for some day, what value would you forecast?

I heard you say the expected value. It is around 68 μg/m³.

This is what you should guess, given that you have no information about the day you are forecasting the Ozone concentration for.
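Here is a minimal sketch of how the expected value of a kernel density prior can be computed. Everything in it is an assumption for illustration: the data are synthetic (not the station record), the Gaussian kernel and the `bandwidth` value are my choices, and the function names are hypothetical.

```python
import math
import random

def gaussian_kde_pdf(x, data, bandwidth):
    """Density at x of a Gaussian kernel density estimate built on `data`."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data) / norm

def expected_value(data, bandwidth, lo, hi, step=0.5):
    """Approximate E[X] under the KDE with a midpoint sum over [lo, hi]."""
    total = 0.0
    x = lo + step / 2
    while x < hi:
        total += x * gaussian_kde_pdf(x, data, bandwidth) * step
        x += step
    return total

# Synthetic stand-in for the 2019 MDA series (NOT the real station data).
random.seed(0)
mda = [random.gauss(68, 25) for _ in range(350)]

bandwidth = 8.0  # hypothetical smoothing width
lo = min(mda) - 5 * bandwidth
hi = max(mda) + 5 * bandwidth

prior_mean = expected_value(mda, bandwidth, lo, hi)
sample_mean = sum(mda) / len(mda)
print(round(prior_mean, 2), round(sample_mean, 2))
```

Because a Gaussian kernel is symmetric around each data point, the mean of the smoothed curve matches the plain sample mean, so the two printed numbers should agree closely.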


What is Likelihood?

Now, I provide you that information.

I ask you to forecast the Ozone concentration for some day in winter.

That “winter” is the information. The likelihood function helps you update your opinion by taking in information.

Generating this likelihood function, which does this noble job, is not so simple. So, focus a bit here.

Likelihood = P(Winter|Ozone data)

I do computational statistics here. So, I won't go into equations to derive the likelihood function. Instead, I'll generate it by computation (iteratively calculating the likelihood for each chunk of data).

To put it simply, let me ask this question: what is the probability of the season being winter, if the Ozone concentration is between 50 and 51 μg/m³?

There are two days in the year when the Ozone value is between 50 and 51. One is in winter, the other in the monsoon. So the probability of the season being winter, if the Ozone concentration is between 50 and 51, is 50%!

I will now iterate this. For every step of 1 μg/m³ (0–1, 1–2, 2–3, etc.), I'll calculate the likelihood, i.e. the probability of the season being winter given that the Ozone value falls in that step.
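The counting described above can be sketched in a few lines. The records below are made up to mirror the 50–51 example, and `likelihood_by_bin` is a hypothetical helper name, not something from a library.

```python
import math
from collections import Counter

def likelihood_by_bin(mda_values, seasons, target="winter"):
    """Return {step_start: P(target season | MDA in [step_start, step_start + 1))}."""
    total = Counter()
    hits = Counter()
    for value, season in zip(mda_values, seasons):
        b = math.floor(value)  # 50.7 falls in the 50-51 step
        total[b] += 1
        if season == target:
            hits[b] += 1
    return {b: hits[b] / total[b] for b in total}

# Toy records (made up) that mirror the 50-51 example in the text.
mda_values = [50.2, 50.9, 47.3, 47.8, 47.1]
seasons = ["winter", "monsoon", "winter", "winter", "summer"]

lik = likelihood_by_bin(mda_values, seasons)
print(lik[50])  # one winter day out of the two days in this step
```

Steps that contain no days at all get no likelihood here. With only one year of data that happens often, so in practice one might need wider steps or some smoothing; this sketch simply skips them.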

For every step, I'll also calculate the prior probability. It is nothing but the area under the prior curve for that step.

Figure: the pink band marks the area under the prior curve for the 50–51 step.
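If the prior curve is a Gaussian kernel density estimate, the area over one step has a closed form: it is the average, over all data points, of the normal probability that falls inside the step. A sketch under that assumption follows (the data and `bandwidth` are made up, and the function names are hypothetical).

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kde_bin_mass(lo, hi, data, bandwidth):
    """Area under a Gaussian KDE between lo and hi (no numerical integration needed)."""
    return sum(
        normal_cdf((hi - d) / bandwidth) - normal_cdf((lo - d) / bandwidth)
        for d in data
    ) / len(data)

# Made-up observations and a hypothetical bandwidth, for illustration only.
data = [42.0, 50.3, 50.8, 61.5, 75.0, 88.2]
bandwidth = 6.0

prior_50_51 = kde_bin_mass(50, 51, data, bandwidth)    # the "pink band"
total_mass = kde_bin_mass(-1000, 1000, data, bandwidth)  # should be close to 1
print(round(prior_50_51, 4), round(total_mass, 4))
```

Summing `kde_bin_mass` over all steps gives back (almost exactly) 1, which is a handy sanity check that the prior is a proper probability distribution.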

What is Evidence?

This is simple. I have data for 350 days, of which 89 are in winter.

Evidence is nothing but the probability of the information you have. Here, that is the probability of winter (since the winter season is the only bit of information you have).

P(winter) = 89/350
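As a number, and as a cross-check against the other two terms, the evidence can be written as follows (with $b$ again standing for a 1 μg/m³ step, a label I'm introducing for this sketch):

```latex
P(\text{winter}) = \frac{89}{350} \approx 0.254,
\qquad
P(\text{winter}) = \sum_{b} P(\text{winter} \mid \text{MDA} \in b)\; P(\text{MDA} \in b)
```

The second expression is the law of total probability. It matches 89/350 exactly when the prior for each step is the raw fraction of days in that step. With a smoothed kernel density prior it should only match approximately, so the posterior may not sum to exactly 1.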


POSTERIOR = LIKELIHOOD * PRIOR / EVIDENCE

Now we have everything we need to calculate the posterior: the updated opinion based on the information (winter) you have.

For every step 0–1; 1–2 etc., we’ll calculate the posterior.

The posterior opinion has shifted to the left of the prior. Once I know that I'm guessing for a winter date, I no longer use my prior opinion curve; I guess lower values. The expected value of the posterior opinion curve is 59 μg/m³, down from around 68 μg/m³ for the prior.
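Putting the three terms together, the step-by-step posterior can be sketched as below. This is an illustration under stated assumptions, not the exact code behind the numbers above: the data are synthetic (winter is simply drawn from a lower mean), the prior is a Gaussian KDE with a hypothetical `bandwidth`, and all function names are mine.

```python
import math
import random
from collections import Counter

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kde_bin_mass(b, data, bandwidth):
    """Area under a Gaussian KDE over the step [b, b + 1)."""
    return sum(
        normal_cdf((b + 1 - d) / bandwidth) - normal_cdf((b - d) / bandwidth)
        for d in data
    ) / len(data)

def posterior_by_bin(mda, seasons, target, bandwidth):
    """Posterior = likelihood * prior / evidence, for every step that holds data."""
    total, hits = Counter(), Counter()
    for value, season in zip(mda, seasons):
        b = math.floor(value)
        total[b] += 1
        if season == target:
            hits[b] += 1
    evidence = sum(hits.values()) / len(mda)     # P(target), e.g. 89/350
    posterior = {}
    for b in total:
        likelihood = hits[b] / total[b]          # P(target | step)
        prior = kde_bin_mass(b, mda, bandwidth)  # P(step)
        posterior[b] = likelihood * prior / evidence
    return posterior

def expected_value(posterior):
    """Mean of the step midpoints, weighted by renormalised posterior mass."""
    mass = sum(posterior.values())
    return sum((b + 0.5) * p for b, p in posterior.items()) / mass

# Synthetic stand-in data (NOT the real station record): winter runs lower.
random.seed(1)
seasons = ["winter"] * 89 + ["other"] * 261
mda = [random.gauss(59 if s == "winter" else 71, 15) for s in seasons]

winter_posterior = posterior_by_bin(mda, seasons, "winter", bandwidth=5.0)
winter_mean = expected_value(winter_posterior)
overall_mean = sum(mda) / len(mda)
print(round(winter_mean, 1), round(overall_mean, 1))
```

On this synthetic data the winter posterior mean comes out below the overall mean, which is the same leftward shift described above. Calling `posterior_by_bin` with a different `target` gives the posterior for another season.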

We can further calculate posteriors given other seasons.


Point forecasts for the year 2020 based on this model are basically one value for the entire season.

It's bad, obviously. We used only one piece of information, the season, to make this forecast. But this is only part 1.


In this part, we saw how one piece of information shifted our prior opinion to the left. This is what information is! I read a technical definition of ‘information’ recently and I loved it.

But the game does not stop there. What if I give you more information? Say, that you are forecasting for a weekend? Then the posterior above becomes the new prior, and you calculate a new posterior using the new information in hand.
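The "posterior becomes prior" idea can be sketched as repeated calls to one update function. A loud caveat: chaining updates like this assumes the two pieces of information (season and weekend) are independent of each other once the Ozone step is known, which may not hold for real data. The numbers and the three coarse steps below are made up purely for illustration.

```python
def update(prior, likelihood):
    """One Bayesian update over steps: multiply likelihood by prior, then normalise."""
    unnormalised = {b: likelihood.get(b, 0.0) * p for b, p in prior.items()}
    evidence = sum(unnormalised.values())  # total probability of the information
    return {b: u / evidence for b, u in unnormalised.items()}

# Made-up numbers over three coarse Ozone steps, for illustration only.
prior = {"low": 0.3, "mid": 0.4, "high": 0.3}
p_winter_given_step = {"low": 0.5, "mid": 0.25, "high": 0.1}
p_weekend_given_step = {"low": 0.2, "mid": 0.3, "high": 0.35}

after_winter = update(prior, p_winter_given_step)            # first update
after_weekend = update(after_winter, p_weekend_given_step)   # posterior reused as prior

# Under the independence assumption, the order of the updates does not matter.
other_order = update(update(prior, p_weekend_given_step), p_winter_given_step)
print(after_weekend)
print(other_order)
```

Here the evidence is computed as the sum of likelihood times prior rather than from day counts; both routes describe the same quantity.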

We’ll add more information in the next part of this blog.

Opinion updating is an unending business.


Finally, the following should further help you appreciate how Bayes' theorem works.

When asked to forecast for a winter day, you could just look at the previous year's distribution of winter values, right? And forecast using that, instead of getting into the Bayesian business.

Well, yes. But Bayesian analysis is basically doing the same work for you. The posterior curve you calculated is basically the distribution of the winter data. You can thus approach this problem either as a Frequentist or as a Bayesian. The point is that, if done well, both approaches will help you solve the problem.
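This equivalence is easy to check on a toy example. In the sketch below (made-up data, hypothetical helper name), the prior for each step is the raw fraction of days in that step rather than a smoothed curve. Under that assumption the Bayesian route and the "just look at winter" route give identical numbers; with a kernel density prior they would agree only approximately.

```python
import math
from collections import Counter

def bin_frequencies(values):
    """Empirical probability of each 1-unit step."""
    counts = Counter(math.floor(v) for v in values)
    return {b: c / len(values) for b, c in counts.items()}

# Made-up toy data, for illustration only.
mda = [50.2, 50.9, 47.3, 47.8, 47.1, 62.4, 62.9, 70.0]
seasons = ["winter", "monsoon", "winter", "winter",
           "summer", "monsoon", "winter", "summer"]

prior = bin_frequencies(mda)
winter_values = [v for v, s in zip(mda, seasons) if s == "winter"]
evidence = len(winter_values) / len(mda)

# Bayesian route: likelihood * prior / evidence for every step.
posterior = {}
for b, p_step in prior.items():
    in_step = [s for v, s in zip(mda, seasons) if math.floor(v) == b]
    likelihood = in_step.count("winter") / len(in_step)
    posterior[b] = likelihood * p_step / evidence

# Frequentist route: the distribution of the winter days alone.
direct = bin_frequencies(winter_values)

for b in sorted(posterior):
    print(b, round(posterior[b], 3), round(direct.get(b, 0.0), 3))
```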
