Title: Probabilistic Time Series Forecasting
Description: Probabilistic time series forecasting via Natural Gradient Boosting for Probabilistic Prediction.
Authors: Resul Akay [aut, cre]
Maintainer: Resul Akay <[email protected]>
License: Apache License (>= 2)
Version: 0.1.1
Built: 2025-02-20 03:29:41 UTC
Source: https://github.com/akai01/ngboostforecast
NGBoost distributions
Dist(
  dist = c("Normal", "Bernoulli", "k_categorical", "StudentT", "Laplace",
           "Cauchy", "Exponential", "LogNormal", "MultivariateNormal", "Poisson"),
  k
)
dist
NGBoost distribution name. One of: "Normal", "Bernoulli", "k_categorical", "StudentT", "Laplace", "Cauchy", "Exponential", "LogNormal", "MultivariateNormal", "Poisson".
k
Used only with "k_categorical" and "MultivariateNormal".
An NGBoost Distribution object
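A minimal sketch of the two calling patterns (k is only required for the families noted above):

## Not run:
normal_dist <- Dist("Normal")                 # continuous response; k not needed
cat_dist    <- Dist("k_categorical", k = 3L)  # k sets the number of classes
## End(Not run)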
Only for internal usage.
is_exists_conda()
Logical, TRUE if conda is installed.
Resul Akay
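Although internal, this function illustrates the package's reliance on a Python environment for ngboost itself. A hedged sketch of how such a check could guard model construction:

## Not run:
# Guard model construction behind a conda check; without a usable Python
# environment the ngboost backend cannot be loaded.
if (is_exists_conda()) {
  model <- NGBforecast$new(Dist = Dist("Normal"))
} else {
  message("No conda installation found.")
}
## End(Not run)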
The main forecasting class.
An NGBforecast class
new()
Initialize an NGBforecast model.
NGBforecast$new(
  Dist = NULL,
  Score = NULL,
  Base = NULL,
  natural_gradient = TRUE,
  n_estimators = as.integer(500),
  learning_rate = 0.01,
  minibatch_frac = 1,
  col_sample = 1,
  verbose = TRUE,
  verbose_eval = as.integer(100),
  tol = 1e-04,
  random_state = NULL
)
Dist
Assumed distributional form of Y|X=x. An output of the Dist function, e.g. Dist("Normal").
Score
Rule to compare probabilistic predictions to the observed data. A score from the Scores function, e.g. Scores(score = "LogScore").
Base
Base learner. An output of the sklearner function, e.g. sklearner(module = "tree", class = "DecisionTreeRegressor", ...).
natural_gradient
Logical flag indicating whether the natural gradient should be used
n_estimators
The number of boosting iterations to fit
learning_rate
The learning rate
minibatch_frac
The percent subsample of rows to use in each boosting iteration
col_sample
The percent subsample of columns to use in each boosting iteration
verbose
Flag indicating whether output should be printed during fitting. If TRUE it will print logs.
verbose_eval
Increment (in boosting iterations) at which output should be printed
tol
Numerical tolerance to be used in optimization
random_state
Seed for reproducibility.
An NGBforecast object that can be fit.
fit()
Fit the initialized model.
NGBforecast$fit(
  y,
  max_lag = 5,
  xreg = NULL,
  test_size = NULL,
  seasonal = TRUE,
  K = frequency(y)/2 - 1,
  train_loss_monitor = NULL,
  val_loss_monitor = NULL,
  early_stopping_rounds = NULL
)
y
A time series (ts) object
max_lag
Maximum number of lags
xreg
Optional. A numerical matrix of external regressors, which must have the same number of rows as y.
test_size
The length of the validation set. If NULL, it is chosen automatically.
seasonal
Logical. If seasonal = TRUE, Fourier terms are used to model seasonality.
K
Maximum order(s) of the Fourier terms; used only if seasonal = TRUE.
train_loss_monitor
A custom score or set of scores to track on the training set during training. Defaults to the score defined in the NGBoost constructor. Please do not modify unless you know what you are doing.
val_loss_monitor
A custom score or set of scores to track on the validation set during training. Defaults to the score defined in the NGBoost constructor. Please do not modify unless you know what you are doing.
early_stopping_rounds
The number of consecutive boosting iterations during which the loss has to increase before the algorithm stops early.
Returns NULL; the fitted model is stored inside the NGBforecast object.
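The xreg contract is worth spelling out: at fit time xreg needs one row per observation of y, while at forecast time it needs h rows of future regressor values. A sketch with synthetic data (all variable names are illustrative, not from the package docs):

## Not run:
set.seed(1)
y        <- ts(rnorm(120, mean = 100), frequency = 12)
x_train  <- matrix(rnorm(120), ncol = 1, dimnames = list(NULL, "x1"))
x_future <- matrix(rnorm(12),  ncol = 1, dimnames = list(NULL, "x1"))

model <- NGBforecast$new(Dist = Dist("Normal"),
                         Base = sklearner(module = "linear_model",
                                          class = "Ridge"),
                         Score = Scores("LogScore"))
model$fit(y = y, max_lag = 12, xreg = x_train)  # 120 rows, matching y
fc <- model$forecast(h = 12, xreg = x_future)   # 12 rows, matching h
## End(Not run)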
forecast()
Forecast the fitted model
NGBforecast$forecast(h = 6, xreg = NULL, level = c(80, 95), data_frame = FALSE)
h
Forecast horizon
xreg
A numerical vector or matrix of external regressors
level
Confidence level for prediction intervals
data_frame
Logical. If TRUE, the forecast is returned as a data.frame object; if FALSE, it is returned as a forecast-class object, on which autoplot will work.
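A short sketch of the two return modes, assuming a fitted model (see the Examples at the end of this page):

## Not run:
fc <- model$forecast(h = 12, level = c(80, 95))  # forecast-class object
autoplot(fc)                                     # plots directly

fc_df <- model$forecast(h = 12, level = c(80, 95),
                        data_frame = TRUE)       # plain data.frame
head(fc_df)
## End(Not run)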
feature_importances()
Return the feature importance for all parameters in the distribution (the higher, the more important the feature).
NGBforecast$feature_importances()
A data frame
plot_feature_importance()
Plot feature importance
NGBforecast$plot_feature_importance()
A ggplot object
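Both methods assume a fitted model; a brief sketch:

## Not run:
imp <- model$feature_importances()  # data frame of importances (higher = more important)
head(imp)
model$plot_feature_importance()     # the same information as a ggplot object
## End(Not run)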
get_params()
Get parameters for this estimator.
NGBforecast$get_params(deep = TRUE)
deep
Logical, default TRUE. If TRUE, the parameters for this estimator and for contained subobjects that are estimators are returned.
A named list of parameters.
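A short sketch; the exact element names are not documented here, so inspecting the list with str() is the safe approach:

## Not run:
params <- model$get_params(deep = TRUE)
str(params)  # named list of the estimator's current settings
## End(Not run)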
clone()
The objects of this class are cloneable with this method.
NGBforecast$clone(deep = FALSE)
deep
Whether to make a deep clone.
Resul Akay
Duan, T. et al. (2019). NGBoost: Natural Gradient Boosting for Probabilistic Prediction.
## Not run:
library(ngboostForecast)

model <- NGBforecast$new(Dist = Dist("Normal"),
                         Base = sklearner(module = "linear_model",
                                          class = "Ridge"),
                         Score = Scores("LogScore"),
                         natural_gradient = TRUE,
                         n_estimators = 200,
                         learning_rate = 0.1,
                         minibatch_frac = 1,
                         col_sample = 1,
                         verbose = TRUE,
                         verbose_eval = 100,
                         tol = 1e-5)

model$fit(y = AirPassengers,
          seasonal = TRUE,
          max_lag = 12,
          xreg = NULL,
          early_stopping_rounds = 10L)

fc <- model$forecast(h = 12, level = c(90, 80), xreg = NULL)

autoplot(fc)
## End(Not run)
NGBforecastCV is a wrapper for sklearn's GridSearchCV with TimeSeriesSplit.
new()
Initialize an NGBforecastCV model.
NGBforecastCV$new(
  Dist = NULL,
  Score = NULL,
  Base = NULL,
  natural_gradient = TRUE,
  n_estimators = as.integer(500),
  learning_rate = 0.01,
  minibatch_frac = 1,
  col_sample = 1,
  verbose = TRUE,
  verbose_eval = as.integer(100),
  tol = 1e-04,
  random_state = NULL
)
Dist
Assumed distributional form of Y|X=x. An output of the Dist function, e.g. Dist("Normal").
Score
Rule to compare probabilistic predictions to the observed data. A score from the Scores function, e.g. Scores(score = "LogScore").
Base
Base learner. An output of the sklearner function, e.g. sklearner(module = "tree", class = "DecisionTreeRegressor", ...).
natural_gradient
Logical flag indicating whether the natural gradient should be used
n_estimators
The number of boosting iterations to fit
learning_rate
The learning rate
minibatch_frac
The percent subsample of rows to use in each boosting iteration
col_sample
The percent subsample of columns to use in each boosting iteration
verbose
Flag indicating whether output should be printed during fitting. If TRUE it will print logs.
verbose_eval
Increment (in boosting iterations) at which output should be printed
tol
Numerical tolerance to be used in optimization
random_state
Seed for reproducibility.
An NGBforecastCV object that can be fit.
tune()
Tune an NGBforecast model.
NGBforecastCV$tune(
  y,
  max_lag = 5,
  xreg = NULL,
  seasonal = TRUE,
  K = frequency(y)/2 - 1,
  n_splits = NULL,
  train_loss_monitor = NULL,
  val_loss_monitor = NULL,
  early_stopping_rounds = NULL
)
y
A time series (ts) object
max_lag
Maximum number of lags
xreg
Optional. A numerical matrix of external regressors, which must have the same number of rows as y.
seasonal
Logical. If seasonal = TRUE, Fourier terms are used to model seasonality.
K
Maximum order(s) of the Fourier terms; used only if seasonal = TRUE.
n_splits
Number of splits. Must be at least 2.
train_loss_monitor
A custom score or set of scores to track on the training set during training. Defaults to the score defined in the NGBoost constructor. Please do not modify unless you know what you are doing.
val_loss_monitor
A custom score or set of scores to track on the validation set during training. Defaults to the score defined in the NGBoost constructor. Please do not modify unless you know what you are doing.
early_stopping_rounds
The number of consecutive boosting iterations during which the loss has to increase before the algorithm stops early.
test_size
The length of the validation set. If NULL, it is chosen automatically.
A named list of best parameters.
clone()
The objects of this class are cloneable with this method.
NGBforecastCV$clone(deep = FALSE)
deep
Whether to make a deep clone.
Resul Akay
https://stanfordmlgroup.github.io/ngboost/2-tuning.html
## Not run:
library(ngboostForecast)

dists <- list(Dist("Normal"))

base_learners <- list(
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 1),
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 2),
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 3),
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 4),
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 5),
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 6),
  sklearner(module = "tree", class = "DecisionTreeRegressor", max_depth = 7)
)

scores <- list(Scores("LogScore"))

model <- NGBforecastCV$new(Dist = dists,
                           Base = base_learners,
                           Score = scores,
                           natural_gradient = TRUE,
                           n_estimators = list(10, 100),
                           learning_rate = list(0.1, 0.2),
                           minibatch_frac = list(0.1, 1),
                           col_sample = list(0.3),
                           verbose = FALSE,
                           verbose_eval = 100,
                           tol = 1e-5)

params <- model$tune(y = AirPassengers,
                     seasonal = TRUE,
                     max_lag = 12,
                     xreg = NULL,
                     early_stopping_rounds = NULL,
                     n_splits = 4L)

params
## End(Not run)
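As a hypothetical follow-up (the structure of the returned list is not documented here, so the do.call() pattern below is an assumption), the tuned settings could seed a final model:

## Not run:
# Hypothetical: assumes the names in `params` match NGBforecast$new() arguments
final_model <- do.call(NGBforecast$new, params)
final_model$fit(y = AirPassengers, seasonal = TRUE, max_lag = 12)
## End(Not run)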
Probabilistic time series forecasting via Natural Gradient Boosting for Probabilistic Prediction.
Duan, T. et al. (2019). NGBoost: Natural Gradient Boosting for Probabilistic Prediction.
Select a rule to compare probabilistic predictions to the observed data. A score from ngboost.scores, e.g. LogScore.
Scores(score = c("LogScore", "CRPS", "CRPScore", "MLE"))
score
A string; one of "LogScore", "CRPS", "CRPScore", "MLE".
A score class from ngboost.scores
Resul Akay
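A minimal sketch:

## Not run:
log_score  <- Scores(score = "LogScore")   # log-likelihood scoring rule
crps_score <- Scores(score = "CRPScore")   # continuous ranked probability score
## End(Not run)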
The Seatbelts dataset from the datasets package.
seatbelts
An object of class mts (inherits from ts) with 192 rows and 8 columns.
Harvey, A.C. (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press, pp. 519–523.
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford University Press.
https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/UKDriverDeaths.html
Harvey, A. C. and Durbin, J. (1986). The effects of seat belt legislation on British road casualties: A case study in structural time series modelling. Journal of the Royal Statistical Society series A, 149, 187–227.
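The series plugs directly into NGBforecast; a hedged sketch using column names from the underlying datasets::Seatbelts object (DriversKilled as the target, kms and PetrolPrice as regressors):

## Not run:
library(ngboostForecast)

y      <- seatbelts[, "DriversKilled"]
x_past <- seatbelts[, c("kms", "PetrolPrice")]

model <- NGBforecast$new(Dist = Dist("Normal"),
                         Base = sklearner(module = "linear_model",
                                          class = "Ridge"),
                         Score = Scores("LogScore"))
model$fit(y = y, max_lag = 12, xreg = x_past)
## End(Not run)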
Scikit-Learn interface
sklearner(module = "tree", class = "DecisionTreeRegressor", ...)
module
scikit-learn module name; the default is "tree".
class
scikit-learn module's class name; the default is "DecisionTreeRegressor".
...
Other arguments passed to the model class.
Resul Akay
## Not run:
sklearner(module = "tree", class = "DecisionTreeRegressor",
          criterion = "friedman_mse", min_samples_split = 2)
## End(Not run)