Learn SAP from the Experts | The SAP PRESS Blog

76 SAP HANA PAL Algorithms to Utilize

Written by SAP PRESS | Feb 8, 2021 2:00:00 PM

SAP HANA Predictive Analysis Library (SAP HANA PAL) is part of the SAP HANA Application Function Library (AFL), which is a collection of complex application functions.

 

These functions are grouped into multiple libraries depending on their application area, like the Business Function Library (BFL), SAP HANA Predictive Analysis Library (SAP HANA PAL), SAP HANA Automated Predictive Library (SAP HANA APL), and SAP HANA Extended Machine Learning Library (SAP HANA EML).

 

These application functions are written in C++ to achieve high performance and run on the SAP HANA database layer instead of on the application server so that the huge computational power of the SAP HANA database can be utilized. AFL is not part of the SAP HANA appliance; these libraries must be installed separately. The figure below provides a high-level view of the architecture of SAP HANA PAL.

 

 

Machine Learning

Machine learning algorithms require complex computations on data to arrive at a specific model. These computations require several sophisticated numerical techniques like matrix manipulation, numerical optimization, and evaluation of complex functions.

 

Most machine learning tools provide these algorithms as functions or procedures so that data scientists and machine learning experts don’t have to implement them by hand. Instead, the desired algorithm is called from a program or editor and configured with specific values for input data, output data, and model parameters.
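As a minimal sketch of that calling pattern, the Python snippet below wraps a toy k-means in a single function: the caller passes input data and model parameters and receives the output. The function name `kmeans`, its parameters, and the choice to seed the centers with the first points are illustrative assumptions, not the API or implementation of any particular tool.

```python
from math import dist


def kmeans(points, n_clusters, max_iter=100):
    """Toy k-means: returns (centers, labels) for a list of numeric tuples."""
    # Simplification (assumption): seed the centers with the first n_clusters points.
    centers = [tuple(p) for p in points[:n_clusters]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(n_clusters), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # no center moved, so the assignment is stable
        centers = new_centers
    return centers, labels


# Input data and model parameters go in, the fitted result comes out.
data = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]
centers, labels = kmeans(data, n_clusters=2, max_iter=10)
print(labels)
```

The caller never touches the assignment or update steps; only the data, `n_clusters`, and `max_iter` are supplied, which is the division of labor the paragraph above describes.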

 

Machine Learning with SAP HANA

In a similar way, SAP HANA also delivers machine learning algorithms for data scientists and machine learning experts as procedures within SAP HANA PAL. These procedures can be called from SAP HANA’s SQLScript procedures to perform specific machine learning-related tasks. Prior to SAP HANA 2.0 SPS 02, these algorithms were provided as SAP HANA PAL functions.
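To make the configuration step more concrete, here is a hedged Python sketch that prepares parameter rows of the kind a PAL procedure reads. It assumes the four-column parameter table layout (name, integer value, double value, string value) described in SAP's PAL documentation; that layout, the table name `#PAL_PARAMETER_TBL`, and the k-means parameter names used here do not come from this post, so check them against the reference for your release. The helper functions are hypothetical, and the `CALL` statement itself is left out because its argument list depends on the procedure.

```python
def to_parameter_rows(params):
    """Map {name: value} to (PARAM_NAME, INT_VALUE, DOUBLE_VALUE, STRING_VALUE) rows."""
    rows = []
    for name, value in params.items():
        if isinstance(value, bool):
            rows.append((name, int(value), None, None))  # booleans stored as 0/1
        elif isinstance(value, int):
            rows.append((name, value, None, None))
        elif isinstance(value, float):
            rows.append((name, None, value, None))
        else:
            rows.append((name, None, None, str(value)))
    return rows


def sql_literal(value):
    """Render one Python value as a SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"  # escape embedded quotes
    return repr(value)


def render_inserts(table_name, rows):
    """Build one INSERT statement per parameter row."""
    statements = []
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        statements.append(f"INSERT INTO {table_name} VALUES ({values});")
    return statements


# Assumed parameter names for a k-means run; verify against the PAL reference.
kmeans_params = {"GROUP_NUMBER": 4, "MAX_ITERATION": 100, "THREAD_RATIO": 0.5}
statements = render_inserts("#PAL_PARAMETER_TBL", to_parameter_rows(kmeans_params))
for statement in statements:
    print(statement)
```

Each parameter fills exactly one of the typed value columns and leaves the others as `NULL`, which lets a single table shape carry integer, floating-point, and string settings.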

 

SAP HANA also provides the Application Function Modeler (AFM), a graphical modeling tool in SAP HANA Studio. AFM supports accessing and executing SAP HANA PAL procedures in flowgraph models that can be configured with specific input tables, output tables, and parameter values, without writing any SQLScript code; AFM takes care of the code in the back end.


SAP HANA PAL includes procedures for many types of machine learning algorithms. Thousands of algorithms exist for a wide variety of machine learning scenarios; SAP has adopted the most commonly used ones, based on market surveys, and included them in SAP HANA PAL. The following list shows the algorithms delivered with SAP HANA PAL as of SAP HANA 2.0 SPS 04:

Clustering

  • Accelerated k-means
  • Affinity propagation
  • Agglomerate hierarchical clustering
  • DBSCAN, geometry DBSCAN
  • Gaussian mixture model (GMM)
  • K-means, k-medians, k-medoids
  • Latent Dirichlet allocation
  • Self-organizing maps

Classification

  • Conditional random field
  • Decision tree (including C4.5, CART, CHAID)
  • Hybrid gradient boosting tree
  • KNN
  • Linear discriminant analysis
  • Logistic regression (with elastic net regularization)
  • Multi-class logistic regression
  • Multilayer perceptron
  • Naive Bayes
  • Random decision trees
  • Support vector machine

Regression

  • Bi-variate geometric regression
  • Bi-variate natural logarithmic regression
  • Cox proportional hazard model
  • Exponential regression
  • Generalised linear models
  • Multiple linear regression
  • Polynomial regression 

Social Network Analysis 

  • Link prediction
  • PageRank

Association

  • Apriori
  • FP-growth
  • K-optimal rule discovery (KORD)
  • Sequential pattern mining

Time Series

  • ARIMA, auto-ARIMA
  • Brown exponential smoothing
  • Change-point detection
  • Correlation function
  • Croston's method
  • Fast Fourier transform
  • Forecast accuracy measures
  • Hierarchical forecast
  • Linear regression with damped trend and seasonal adjust
  • Exponential smoothing (single, double, triple, auto)
  • Seasonality test, trend test, white noise test
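For readers unfamiliar with the smoothing methods in this group, the following Python sketch shows textbook single exponential smoothing, where each smoothed value blends the new observation with the previous smoothed value. It is an illustration only: the function name, the choice to start from the first observation, and the sample data are assumptions, and the PAL procedure may initialize and parameterize the method differently.

```python
def single_exponential_smoothing(series, alpha):
    """Return the smoothed series s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    if not series:
        return []
    # Assumption: the first smoothed value equals the first observation.
    smoothed = [float(series[0])]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


demand = [10, 20, 30]  # made-up sample data
smoothed = single_exponential_smoothing(demand, alpha=0.5)
print(smoothed)
```

A larger `alpha` tracks recent observations more closely, while a smaller one produces a smoother, slower-reacting series.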

Pre-processing

  • Discretize
  • Inter-quartile range
  • Multidimensional scaling (MDS)
  • Missing value handling
  • Partition
  • Principal component analysis (PCA)
  • Random distribution sampling
  • Sampling
  • Scale, scale with model
  • Variance test

Recommender System

  • Alternating least squares
  • Factorized polynomial regression models
  • Field-aware factorization machine

Statistics

  • ANOVA
  • Chi-squared goodness-of-fit test
  • Chi-squared test of independence
  • Condition index
  • Cumulative distribution function
  • Distribution fitting
  • Distribution quantile
  • Equal variance test
  • Factor analysis
  • Grubbs' test
  • Kaplan-Meier survival analysis
  • Kernel density
  • Multivariate analysis
  • One-sample median test
  • T-test
  • Univariate analysis
  • Wilcoxon signed-rank test

Miscellaneous

  • ABC analysis
  • T-distributed stochastic neighbor embedding (t-SNE)
  • Weighted score table
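ABC analysis is the simplest entry in this group to show in code. The Python sketch below ranks items by value and labels them A, B, or C according to their cumulative share of the total. The 70% and 90% cut-offs, the function name, and the sample figures are assumptions chosen for illustration; conventions for the boundaries vary, and this is not the PAL implementation.

```python
def abc_analysis(values, a_share=0.7, b_share=0.9):
    """Label each item A, B, or C by cumulative share of the total value."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total value must be positive")
    classes = {}
    running = 0
    # Walk the items from most to least valuable.
    for item, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        share = running / total  # cumulative share including this item
        if share <= a_share:
            classes[item] = "A"
        elif share <= b_share:
            classes[item] = "B"
        else:
            classes[item] = "C"
    return classes


revenue = {"P1": 60, "P2": 25, "P3": 8, "P4": 4, "P5": 3}  # made-up sample data
classes = abc_analysis(revenue)
print(classes)
```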

Editor’s note: This post has been adapted from a section of the book Machine Learning with SAP: Models and Applications by Laboni Bhowmik, Avijit Dhar, and Ranajay Mukherjee.