New Interpolation Methods for Data Synthetization and Prediction

Vincent Granville
2 min readJan 15, 2023

--

With Python code, application to temperature geospatial data and ocean tides dataset.

I describe little-known original interpolation methods with applications to real-life datasets. These simple techniques are easy to implement and can be used for regression or prediction. They offer an alternative to model-based statistical methods. Applications include interpolating ocean tides at Dublin, predicting temperatures in the Chicago area with geospatial data, and a problem in astronomy: planet alignments and frequency of these events. In one example, the 5-min data can be replaced by 80-min measurements, with the 5-min increments reconstructed via interpolation, without noticeable loss. Thus, my algorithm can be used for data compression.

Temperature in the Chicago area: real data (round dots) blended with synthetic data

The first technique has strong ties to Fourier methods. In addition to the above applications, I show how it can be used to efficiently interpolate complex mathematical functions such as Bessel and Riemann zeta. For those familiar with MATLAB or Mathematica, this is an opportunity to play with the MPmath library in Python and see how it compares with the traditional tools in this context. In the process, I also show how the methodology can be used to generate synthetic data, be it time series or geospatial data.

Depending on the parameters, in the geospatial context, the interpolation is either close to nearest-neighbor methods, kriging (also known as Gaussian process regression), or a truly original and hybrid mix of additive and multiplicative techniques. There is an option not to interpolate at locations far away from the training set, where regression or interpolation results may be meaningless, regardless of the technique used.

The second technique is based on ordinary least squares — the same method used to solve polynomial regression — but instead of highly unstable polynomials leading to overfitting, I focus on generic functions that avoid these pitfalls, using an iterative greedy algorithm to find the optimum. In particular, a solution based on orthogonal functions leads to a particularly simple implementation with a direct and elegant solution.

Download the full paper (15 pages) from here.

--

--

Vincent Granville
Vincent Granville

Written by Vincent Granville

Founder, MLtechniques.com. Machine learning scientist. Co-founder of Data Science Central (acquired by Tech Target).

No responses yet