Learn SAP from the Experts | The SAP PRESS Blog

Handling Time Series Models with Missing Data in SAP Analytics Cloud

Written by SAP PRESS | Jul 8, 2024 1:00:00 PM

In SAP Analytics Cloud, you can work with time series models that are missing data. This post defines the common missing-data patterns and recommends how to handle each of them.

 

Let’s start with a few definitions:

  • Zero: The period exists, and there is a value equal to zero that is present for this period (e.g., March 2023 exists in the data, and the corresponding value is 0).
  • Blank: The period has no data point (e.g., March 2023 doesn’t appear in the data at all, so there is no corresponding value). A blank can also stand for a zero that was never recorded. Data points can be blank for various reasons, as we’ll detail in the table below.

  • Discontinued time series: This time series has values for a continuous range of periods and then contains only blank periods (e.g., filled periods from January 2015 to December 2020, blank periods since January 2021). See the figure below for an example of a discontinued time series. A typical case is the sales history of a product that is no longer for sale after a given point in time.

  • Emerging time series: This time series has values filled for a given period, usually up to the present, while it had blank periods before (e.g., filled periods for January 2020 to February 2022, “blank” periods before January 2020). See the next figure for an example of an emerging time series. For instance, this could correspond to a new product that started selling recently on the market.

  • Intermittent time series: This time series has a substantial number of zeros, meaning a mix of time periods filled with nonzero values (> 0) and time periods whose value is zero. This figure shows an example of an intermittent time series.
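
As a rough illustration of these definitions, the sketch below labels a monthly series held in a plain Python dictionary. It is not part of SAP Analytics Cloud: the function names, the `edge_gap` threshold (how many blank months at the start or end count as emerging or discontinued), and the `zero_share` threshold are all assumptions chosen for the example.

```python
def month_range(start, end):
    """Yield every (year, month) from start to end, both inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def classify_series(observed, window_start, window_end,
                    edge_gap=3, zero_share=0.5):
    """Label a monthly series using the definitions above.

    observed maps (year, month) -> value; a month absent from the
    dictionary is a blank. edge_gap and zero_share are illustrative
    thresholds, not SAP defaults.
    """
    months = list(month_range(window_start, window_end))
    filled = [m for m in months if m in observed]
    if not filled:
        return "empty"
    # Blank months before the first and after the last observation.
    leading_blanks = months.index(filled[0])
    trailing_blanks = len(months) - 1 - months.index(filled[-1])
    zeros = sum(1 for m in filled if observed[m] == 0)
    if trailing_blanks >= edge_gap:
        return "discontinued"
    if leading_blanks >= edge_gap:
        return "emerging"
    if zeros / len(filled) >= zero_share:
        return "intermittent"
    return "regular"


# Filled from January to June 2020, blank for the rest of the year.
sales = {(2020, m): 10.0 for m in range(1, 7)}
print(classify_series(sales, (2020, 1), (2020, 12)))  # discontinued
```

The order of the checks is a design choice: a series with a long trailing gap is reported as discontinued even if it also contains many zeros.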

 

We’ll now make a few recommendations to handle time series with missing data.

 

The first thing you need to do is check the nature of the different time series. You should also measure the percentage of missing values across the data range you use for training the time series forecasting models. This gives you a good indication of what to expect from each time series and which remediation measures you can take. The lower the percentage of missing values in the time series, the better.
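
Outside of SAP Analytics Cloud, the percentage of missing values can be estimated from an export of the training data. The following is a minimal sketch, assuming periods are represented as "YYYY-MM" strings; `missing_percentage` is a hypothetical helper, not an SAP function.

```python
def missing_percentage(observed_periods, expected_periods):
    """Return the share of expected periods with no data point, in percent."""
    expected = set(expected_periods)
    if not expected:
        raise ValueError("expected_periods must not be empty")
    missing = expected - set(observed_periods)
    return 100.0 * len(missing) / len(expected)


# Training range: all twelve months of 2023.
expected = [f"2023-{month:02d}" for month in range(1, 13)]
# March and August are blank in this hypothetical series.
observed = [p for p in expected if p not in ("2023-03", "2023-08")]

print(round(missing_percentage(observed, expected), 1))  # 16.7
```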

 

You have different ways to check the percentage of filled data:

  • If you’re using an entity-based time series forecasting model, you can check the Record Count per entity in the Overview tab of your time series forecasting model (see figure below). If you have a single time series forecasting model, you can see the Record Count in the Predictive Models

  • You can create ad hoc stories based on your training data to report on the number of filled data periods at different time granularities.
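
The same two checks can be approximated on an extract of the training data. The sketch below counts records per entity and filled months per entity and year; the row layout (entity, "YYYY-MM" period, value) and the function names are assumptions for illustration and do not reproduce how SAP Analytics Cloud computes its Record Count.

```python
from collections import Counter

# Hypothetical training rows: (entity, period, value).
rows = [
    ("Product A", "2022-11", 40.0),
    ("Product A", "2022-12", 55.0),
    ("Product A", "2023-01", 0.0),
    ("Product B", "2023-01", 12.0),
]


def record_count_per_entity(rows):
    """Count the data points available for each entity."""
    return Counter(entity for entity, _period, _value in rows)


def filled_periods_per_year(rows):
    """Count distinct filled months for each (entity, year) pair."""
    distinct = {(entity, period) for entity, period, _value in rows}
    return Counter((entity, period[:4]) for entity, period in distinct)


print(record_count_per_entity(rows)["Product A"])            # 3
print(filled_periods_per_year(rows)[("Product A", "2022")])  # 2
```

A zero counts as a filled period here, which matches the distinction between zeros and blanks made at the start of this post.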

Here are a few remediation measures you can take based on your initial data exploration and analysis:

  • You need to gauge the percentage of zeros in an intermittent time series. An intermittent time series with too many zero values is typically not a great fit for predictive forecasting, as the time series forecasting model will tend to average the zeros and the filled values, which results in flat, averaged predictive forecasts.
  • You should discard discontinued time series once you’ve confirmed that the missing observations really do mean the time series has been discontinued, as such series are most probably no longer relevant to your business case.
  • You should not replace blanks with zeros at the beginning of emerging time series, as it’s normal that such periods are missing.
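
One way to respect the last recommendation when preparing data is to fill blanks with zeros only between the first and the last observed period, so that the leading blanks of an emerging series stay blank. This is a sketch under that assumption; `fill_interior_blanks` is a hypothetical helper, and whether a blank really means zero is something you need to confirm for your own data.

```python
def fill_interior_blanks(observed):
    """Return a copy of a monthly series with interior blanks set to 0.

    observed maps (year, month) -> value. Months before the first
    observation and after the last one are deliberately left blank.
    """
    if not observed:
        return {}

    def to_index(year_month):
        year, month = year_month
        return year * 12 + (month - 1)

    first = min(to_index(m) for m in observed)
    last = max(to_index(m) for m in observed)
    filled = {}
    for index in range(first, last + 1):
        year_month = (index // 12, index % 12 + 1)
        # Keep the observed value, or insert a zero for a blank month.
        filled[year_month] = observed.get(year_month, 0)
    return filled


# An emerging series that starts in November 2022 with two blank months inside.
series = {(2022, 11): 5, (2023, 2): 7}
print(fill_interior_blanks(series))
# {(2022, 11): 5, (2022, 12): 0, (2023, 1): 0, (2023, 2): 7}
```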

When you replace blanks with zeros, it’s important to evaluate the resulting percentage of zeros in your time series. A high percentage of zeros works against you, because time series forecasting models aren’t good at predicting from a mix of zeros and filled values.
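
A quick check of the share of zeros, before and after replacing blanks, can look like the following sketch. The helper name is hypothetical, and no cutoff is given in this post for what counts as too many zeros, so the acceptable level remains a judgment call.

```python
def zero_percentage(values):
    """Return the share of values equal to zero, in percent."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    zeros = sum(1 for value in values if value == 0)
    return 100.0 * zeros / len(values)


before = [5, 7]        # only the observed values
after = [5, 0, 0, 7]   # the same series once two blanks are replaced with zeros

print(zero_percentage(before))  # 0.0
print(zero_percentage(after))   # 50.0
```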

 


 

Editor’s note: This post has been adapted from a section of the book SAP Analytics Cloud: Predictive Analytics by Antoine Chabert and David Serre.