Regressors

The gtime.regressors module contains regression models.

class gtime.regressors.ExplainableRegressor(estimator: RegressorMixin, explainer_type: str)

Wraps the most common scikit-learn regressors in an easy-to-use interface that fits and predicts with the model and, at the same time, explains the predictions.

Since it follows the fit/predict interface of scikit-learn models, it is compatible with scikit-learn pipelines and similar tools.

Two explainers are available: LIME and SHAP.

You can get the explanations by accessing regressor.explainer_.explanations_ after calling predict.

Parameters

estimator: RegressorMixin, required: the scikit-learn model
explainer_type: str, required: 'lime' or 'shap'

Examples

>>> import numpy as np
>>> from gtime.regressors import ExplainableRegressor
>>> from sklearn.ensemble import RandomForestRegressor
>>> X = np.random.random((30, 5))
>>> y = np.random.random(30)
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>>
>>> random_forest = RandomForestRegressor()
>>> explainable_regressor = ExplainableRegressor(random_forest, 'shap')
>>>
>>> explainable_regressor.fit(X_train, y_train, feature_names=['a', 'b', 'c', 'd', 'e'])
>>> explainable_regressor.predict(X_test)
array([0.41323105, 0.40386639, 0.46462663, 0.3795568 , 0.57571486,
       0.37079003, 0.54756082, 0.35160197, 0.30881165, 0.48201442])
>>> explainable_regressor.explainer_.explanations_[0]
{'a': -0.019896434698603117, 'b': 0.029814649814215954, 'c': 0.02447547087613202, 'd': 0.021313815648682066, 'e': -0.10778800140251406}
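To make the shape of explanations_ more concrete, here is a minimal sketch of an additive per-feature explanation for a linear model. It does not use gtime, LIME, or SHAP. It relies only on the fact that, for a linear model, attributing coef * (x - mean) to each feature gives contributions that sum to the prediction minus a baseline prediction. The names baseline, base_value, and contributions are illustrative and not part of the gtime API, and the values ExplainableRegressor reports for a random forest are computed differently.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.random((30, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1
feature_names = ['a', 'b', 'c', 'd', 'e']
X_train, y_train = X[:20], y[:20]
X_test = X[20:]

model = LinearRegression().fit(X_train, y_train)

# Baseline: the prediction at the mean training point (an illustrative choice).
baseline = X_train.mean(axis=0)
base_value = model.predict(baseline.reshape(1, -1))[0]

# Per-feature contribution of each test row relative to the baseline.
contributions = model.coef_ * (X_test - baseline)

# One dict per prediction, keyed by feature name, mirroring the layout
# of the explanations_ entry shown in the example above.
explanations = [dict(zip(feature_names, row)) for row in contributions]

predictions = model.predict(X_test)
# Additivity: baseline prediction plus contributions recovers each prediction.
reconstructed = base_value + contributions.sum(axis=1)
```

Reading an explanation this way, a positive value means the feature pushed that prediction above the baseline and a negative value means it pushed it below.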

fit(X: ndarray, y: ndarray, feature_names: Optional[List[str]] = None)

Fits the estimator and then the explainer.

Parameters

X: np.ndarray, required: train matrix
y: np.ndarray, required: train true values
feature_names: List[str], optional, (default=`None`): the names of the feature columns of X

Returns

Fitted ExplainableRegressor

predict(X: ndarray)

Predict function that calls the predict function of the explainer.

You can access the explanations of the predictions via the regressor.explainer_.explanations_ attribute.

Parameters

X: np.ndarray, required: test matrix

Returns

predictions: np.ndarray

class gtime.regressors.LinearRegressor(loss=<function mean_squared_error>)

Implementation of a LinearRegressor that takes a custom loss function.

Parameters

loss: Callable, optional, (default=`mean_squared_error`): The loss function to use when fitting the model. The loss function must accept y_true, y_pred and return a single real number.
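As an illustration of that contract, the following is a sketch of a custom loss that takes y_true and y_pred and returns one float. The function name pinball_loss and its quantile argument are hypothetical and not part of gtime; the extra argument has a default so the function can still be called with only the two required arguments.

```python
import numpy as np

def pinball_loss(y_true, y_pred, quantile=0.9):
    """Quantile (pinball) loss: penalizes under- and over-prediction asymmetrically."""
    # np.asarray lets the function accept lists, arrays, or pandas objects.
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-predictions are weighted by quantile, over-predictions by 1 - quantile.
    per_sample = np.maximum(quantile * residual, (quantile - 1) * residual)
    # Collapse to a single real number, as the loss contract requires.
    return float(np.mean(per_sample))
```

Returning a plain float (rather than an array) matters because the optimizer needs a scalar objective to minimize.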

Examples

>>> from gtime.regressors.linear_regressor import LinearRegressor
>>> from gtime.metrics import max_error
>>> import numpy as np
>>> import pandas as pd
>>> X = np.random.random((100, 10))
>>> y = np.random.random(100)
>>> lr = LinearRegressor(loss=max_error)
>>> X_train, y_train = X[:90], y[:90]
>>> X_test, y_test = X[90:], y[90:]
>>> x0 = [0]*11
>>> lr.fit(X_train, y_train, x0=x0)
>>> lr.predict(X_test)
array([0.62987155, 0.46971378, 0.50421395, 0.5543149 , 0.50848151,
       0.54768797, 0.50968854, 0.50500384, 0.58069366, 0.54912972])

fit(X: DataFrame, y: DataFrame, **kwargs) → LinearRegressor

Fit the linear model on X and y using the given loss function. The minimization is done with the scipy.optimize.minimize function. For more details on the available options, please refer to the scipy documentation.

Parameters

X: pd.DataFrame, shape (n_samples, n_features), required: The X matrix used as features in the fitting procedure.
y: pd.DataFrame, shape (n_samples, 1), required: The y matrix to use as target values in the fitting procedure.
kwargs: dict, optional: Additional arguments to pass to the minimize function of scipy, for example x0.

Returns

self: LinearRegressor: The fitted model.
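The fitting procedure described above can be approximated with a short sketch that minimizes a custom loss over the weights of a linear model using scipy.optimize.minimize. This is not the gtime implementation. The helper names fit_linear and predict_linear are hypothetical, and the weight layout (intercept first, then one weight per feature) is an assumption suggested by the example, where x0 has n_features + 1 entries.

```python
import numpy as np
from scipy.optimize import minimize

def mean_squared_error_loss(y_true, y_pred):
    # Any callable (y_true, y_pred) -> float can be used here.
    return float(np.mean((y_true - y_pred) ** 2))

def predict_linear(X, weights):
    # Assumed layout: weights[0] is the intercept, weights[1:] the coefficients.
    return weights[0] + X @ weights[1:]

def fit_linear(X, y, loss, **kwargs):
    # x0 is the initial guess; default to all zeros if the caller gives none.
    x0 = kwargs.pop("x0", np.zeros(X.shape[1] + 1))

    def objective(weights):
        return loss(y, predict_linear(X, weights))

    # Remaining kwargs (method, tol, options, ...) go straight to scipy.
    result = minimize(objective, x0, **kwargs)
    return result.x

rng = np.random.default_rng(0)
X = rng.random((100, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5])  # noise-free linear target

weights = fit_linear(X, y, loss=mean_squared_error_loss, x0=[0.0] * 4)
```

With a smooth loss such as the mean squared error, the default solver typically converges well. Non-smooth losses such as a maximum error may benefit from a derivative-free method passed through kwargs, for instance method="Nelder-Mead".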

predict(X: DataFrame) → DataFrame

Predict the y values associated with the features X.

Parameters

X: pd.DataFrame, shape (n_samples, n_features), required: The features used to predict.

Returns

predictions: pd.DataFrame, shape (n_samples, 1): The predictions of the model.

class gtime.regressors.MultiFeatureMultiOutputRegressor(estimator: RegressorMixin, target_to_features_dict: Optional[Dict[int, List[int]]] = None)

Multi target regression with option to choose the features for each target.

This strategy consists of fitting one regressor per target. It is built on top of sklearn.multioutput.MultiOutputRegressor. Unlike that class, it allows choosing a different set of features for each regressor.

Parameters

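estimator: RegressorMixin, required: An estimator object implementing fit and predict.
target_to_features_dict: Dict[int, List[int]], optional, (default=`None`): For each target column index in y, the list of feature column indices in X used to fit that target's regressor.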

Examples

>>> import numpy as np
>>> from gtime.regressors import MultiFeatureMultiOutputRegressor
>>> from sklearn.ensemble import RandomForestRegressor
>>> X = np.random.random((30, 5))
>>> y = np.random.random((30, 3))
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>>
>>> random_forest = RandomForestRegressor()
>>> regressor = MultiFeatureMultiOutputRegressor(estimator=random_forest)
>>>
>>> target_to_features_dict = {0: [0,1,2], 1: [0,1,3], 2: [0,1,4]}
>>> regressor.fit(X_train, y_train, target_to_features_dict=target_to_features_dict)
>>>
>>> predictions = regressor.predict(X_test)
>>> predictions.shape
(10, 3)
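The one-regressor-per-target strategy with per-target feature selection can be sketched with plain scikit-learn as follows. This is an illustration of the idea, not the gtime implementation. The helper names fit_per_target and predict_per_target are hypothetical, and LinearRegression is used only to keep the sketch deterministic.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression

def fit_per_target(estimator, X, y, target_to_features):
    # Fit an independent copy of the estimator for every target column,
    # restricted to the feature columns chosen for that target.
    models = {}
    for target, features in target_to_features.items():
        models[target] = clone(estimator).fit(X[:, features], y[:, target])
    return models

def predict_per_target(models, X, target_to_features):
    # Predict each target from its own feature subset, then stack the
    # results column-wise into an array of shape (n_samples, n_targets).
    columns = [
        models[target].predict(X[:, target_to_features[target]])
        for target in sorted(models)
    ]
    return np.column_stack(columns)

rng = np.random.default_rng(0)
X = rng.random((30, 5))
# Each target depends only on features listed for it in the mapping below.
y = np.column_stack([
    X[:, 0] + X[:, 1] + X[:, 2],
    X[:, 0] - X[:, 3],
    2.0 * X[:, 4],
])
X_train, y_train = X[:20], y[:20]
X_test, y_test = X[20:], y[20:]

target_to_features = {0: [0, 1, 2], 1: [0, 1, 3], 2: [0, 1, 4]}
models = fit_per_target(LinearRegression(), X_train, y_train, target_to_features)
predictions = predict_per_target(models, X_test, target_to_features)
```

Cloning the estimator for each target keeps the fitted models independent, so a feature excluded from one target's list cannot influence that target's predictions.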

fit(X: ndarray, y: ndarray, **kwargs)

Fit the model.

Train the models, one for each target variable in y.

Parameters

X: np.ndarray, shape (n_samples, n_features), required: The data.
y: np.ndarray, shape (n_samples, horizon), required: The matrix containing the target variables.

Returns

self : object

predict(X: ndarray) → ndarray

For each row in X, make a prediction with each fitted model.

Parameters

X: np.ndarray, shape (n_samples, n_features), required: The data.

Returns

predictions: np.ndarray, shape (n_samples, horizon): The predictions.