Regressors

The gtime.regressors module contains regression models.

class gtime.regressors.ExplainableRegressor(estimator: RegressorMixin, explainer_type: str)

Wraps the most common scikit-learn regressors in an easy-to-use interface that fits and predicts with the model and also explains its predictions.

Because it follows the scikit-learn fit/predict interface, it is compatible with scikit-learn pipelines and similar tools.

Two explainers are available: LIME and SHAP.

After calling predict, the explanations are available in regressor.explainer_.explanations_.

Parameters

estimator: RegressorMixin, required

the scikit-learn model

explainer_type: str, required

‘lime’ or ‘shap’

Examples

>>> import numpy as np
>>> from gtime.regressors import ExplainableRegressor
>>> from sklearn.ensemble import RandomForestRegressor
>>> X = np.random.random((30, 5))
>>> y = np.random.random(30)
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>>
>>> random_forest = RandomForestRegressor()
>>> explainable_regressor = ExplainableRegressor(random_forest, 'shap')
>>>
>>> explainable_regressor.fit(X_train, y_train, feature_names=['a', 'b', 'c', 'd', 'e'])
>>> explainable_regressor.predict(X_test)
array([0.41323105, 0.40386639, 0.46462663, 0.3795568 , 0.57571486,
       0.37079003, 0.54756082, 0.35160197, 0.30881165, 0.48201442])
>>> explainable_regressor.explainer_.explanations_[0]
{'a': -0.019896434698603117, 'b': 0.029814649814215954, 'c': 0.02447547087613202, 'd': 0.021313815648682066, 'e': -0.10778800140251406}
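
Each entry of explanations_ in the output above is a dictionary that maps a feature name to a value for one prediction, presumably that feature's contribution to the prediction. As a small sketch of how such a dictionary might be inspected, the snippet below ranks the features by absolute value. The numbers are copied from the output above; the helper rank_contributions is hypothetical and not part of gtime.

```python
# Explanation for the first prediction, copied from the example output above.
explanation = {
    'a': -0.019896434698603117,
    'b': 0.029814649814215954,
    'c': 0.02447547087613202,
    'd': 0.021313815648682066,
    'e': -0.10778800140251406,
}


def rank_contributions(explanation):
    """Return (feature, value) pairs sorted by absolute value, largest first.

    Hypothetical helper for illustration only; not part of gtime.
    """
    return sorted(explanation.items(), key=lambda item: abs(item[1]), reverse=True)


ranking = rank_contributions(explanation)
most_influential = ranking[0][0]  # feature with the largest absolute value
```

Sorting by absolute value keeps strongly negative entries, such as 'e' here, at the top of the ranking.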

fit(X: ndarray, y: ndarray, feature_names: Optional[List[str]] = None)

Fits both the estimator and the explainer.

Parameters

X: np.ndarray, required

training matrix

y: np.ndarray, required

training target values

feature_names: List[str], optional, (default=`None`)

the names of the feature columns of X

Returns

Fitted ExplainableRegressor

predict(X: ndarray)

Calls the predict function of the explainer.

The explanations of the predictions are available via the regressor.explainer_.explanations_ attribute.

Parameters

X: np.ndarray, required

test matrix

Returns

predictions: np.ndarray

class gtime.regressors.LinearRegressor(loss=<function mean_squared_error>)

Implementation of a LinearRegressor that takes a custom loss function.

Parameters

loss: Callable, optional, default: mean_squared_error

The loss function to use when fitting the model. The loss function must accept y_true, y_pred and return a single real number.
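
To illustrate the contract described above, here is a minimal sketch of a custom loss that accepts y_true and y_pred and returns a single real number. The function name mean_absolute_error_loss is hypothetical and not part of gtime; only NumPy is used.

```python
import numpy as np


def mean_absolute_error_loss(y_true, y_pred):
    """Hypothetical custom loss: mean absolute deviation, returned as one float."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean(np.abs(y_true - y_pred)))


# Quick check on three points: absolute errors are 0.5, 0.0 and 1.0.
value = mean_absolute_error_loss([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

A function of this shape should be usable as the loss argument, for example LinearRegressor(loss=mean_absolute_error_loss).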

Examples

>>> from gtime.regressors.linear_regressor import LinearRegressor
>>> from gtime.metrics import max_error
>>> import numpy as np
>>> import pandas as pd
>>> X = np.random.random((100, 10))
>>> y = np.random.random(100)
>>> lr = LinearRegressor(loss=max_error)
>>> X_train, y_train = X[:90], y[:90]
>>> X_test, y_test = X[90:], y[90:]
>>> x0 = [0]*11
>>> lr.fit(X_train, y_train, x0=x0)
>>> lr.predict(X_test)
array([0.62987155, 0.46971378, 0.50421395, 0.5543149 , 0.50848151,
       0.54768797, 0.50968854, 0.50500384, 0.58069366, 0.54912972])

fit(X: DataFrame, y: DataFrame, **kwargs) -> LinearRegressor

Fit the linear model on X and y using the given loss function. The minimization is done with scipy.optimize.minimize; see the SciPy documentation for details and the available options.

Parameters

X: pd.DataFrame, shape (n_samples, n_features), required

The X matrix used as features in the fitting procedure.

y: pd.DataFrame, shape (n_samples, 1), required

The y matrix to use as target values in the fitting procedure.

kwargs: dict, optional

Optional arguments passed on to scipy.optimize.minimize.

Returns

self: LinearRegressor

The fitted model.
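
The fit description above says the weights are found by passing the loss to scipy.optimize.minimize. The following is a minimal sketch of that idea, not the library's actual implementation. It assumes the weight vector holds the intercept first and then one coefficient per feature, which is consistent with the example passing an x0 of length 11 for 10 features. The name fit_linear_weights is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def mean_squared_error(y_true, y_pred):
    """Plain NumPy mean squared error, returned as one float."""
    return float(np.mean((y_true - y_pred) ** 2))


def fit_linear_weights(X, y, loss, x0, **kwargs):
    """Minimize loss(y, intercept + X @ coefficients) over the weight vector.

    Assumption: weights[0] is the intercept and weights[1:] are the
    coefficients. Extra keyword arguments go to scipy.optimize.minimize.
    """
    def objective(weights):
        return loss(y, weights[0] + X @ weights[1:])

    result = minimize(objective, x0, **kwargs)
    return result.x


# Noise-free synthetic data, so the true weights are recoverable.
rng = np.random.default_rng(0)
X = rng.random((50, 3))
true_weights = np.array([1.0, 2.0, -1.0, 0.5])
y = true_weights[0] + X @ true_weights[1:]

weights = fit_linear_weights(X, y, mean_squared_error, x0=np.zeros(4), method="BFGS")
```

Because the objective is an arbitrary callable, a non-smooth loss such as a maximum error may call for a derivative-free method, which can be chosen through the forwarded keyword arguments (for example method="Nelder-Mead").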

predict(X: DataFrame) -> DataFrame

Predict the y values associated with the features X.

Parameters

X: pd.DataFrame, shape (n_samples, n_features), required

The features used to predict.

Returns

predictions: pd.DataFrame, shape (n_samples, 1)

The predictions of the model.

class gtime.regressors.MultiFeatureMultiOutputRegressor(estimator: RegressorMixin, target_to_features_dict: Optional[Dict[int, List[int]]] = None)

Multi target regression with option to choose the features for each target.

This strategy consists of fitting one regressor per target. It is built on sklearn.multioutput.MultiOutputRegressor but, unlike that class, lets you choose different features for each regressor.

Parameters

estimator: RegressorMixin, required

An estimator object implementing fit and predict.

target_to_features_dict: Dict[int, List[int]], optional, (default=`None`)

Maps each target index (a column of y) to the indices of the feature columns of X used to fit that target's regressor.

Examples

>>> import numpy as np
>>> from gtime.regressors import MultiFeatureMultiOutputRegressor
>>> from sklearn.ensemble import RandomForestRegressor
>>> X = np.random.random((30, 5))
>>> y = np.random.random((30, 3))
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>>
>>> random_forest = RandomForestRegressor()
>>> regressor = MultiFeatureMultiOutputRegressor(estimator=random_forest)
>>>
>>> target_to_features_dict = {0: [0,1,2], 1: [0,1,3], 2: [0,1,4]}
>>> regressor.fit(X_train, y_train, target_to_features_dict=target_to_features_dict)
>>>
>>> predictions = regressor.predict(X_test)
>>> predictions.shape
(10, 3)
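
The strategy described above, one regressor per target with its own subset of feature columns, can be sketched with plain scikit-learn as follows. This is an illustration under stated assumptions rather than the class's actual implementation. The helpers fit_per_target and predict_per_target are hypothetical, and LinearRegression stands in for any estimator implementing fit and predict.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def fit_per_target(estimator, X, y, target_to_features):
    """Fit one clone of `estimator` per target column, each on its own features."""
    return {
        target: clone(estimator).fit(X[:, features], y[:, target])
        for target, features in target_to_features.items()
    }


def predict_per_target(models, X, target_to_features):
    """Stack per-target predictions into an array of shape (n_samples, n_targets)."""
    columns = [
        models[target].predict(X[:, target_to_features[target]])
        for target in sorted(models)
    ]
    return np.column_stack(columns)


rng = np.random.default_rng(0)
X = rng.random((30, 5))
y = rng.random((30, 3))
target_to_features = {0: [0, 1, 2], 1: [0, 1, 3], 2: [0, 1, 4]}

models = fit_per_target(LinearRegression(), X[:20], y[:20], target_to_features)
predictions = predict_per_target(models, X[20:], target_to_features)
```

Cloning the estimator for each target keeps the fitted models independent, so each one sees only the columns listed for its target.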

fit(X: ndarray, y: ndarray, **kwargs)

Fit the model.

Train the models, one for each target variable in y.

Parameters

X: np.ndarray, shape (n_samples, n_features), required

The data.

y: np.ndarray, shape (n_samples, horizon), required

The matrix containing the target variables.

Returns

self : object

predict(X: ndarray) -> ndarray

For each row in X, make a prediction with each fitted model.

Parameters

X: np.ndarray, shape (n_samples, n_features), required

The data.

Returns

predictions: np.ndarray, shape (n_samples, horizon)

The predictions.