Regressors
The gtime.regressors module contains regression models.
- class gtime.regressors.ExplainableRegressor(estimator: RegressorMixin, explainer_type: str)
Wraps the most common scikit-learn regressors to offer an easy-to-use interface for fitting and predicting with a model while also explaining its predictions.
Since it follows the fit/predict interface of scikit-learn models, it is compatible with scikit-learn pipelines and similar tools.
Two explainers are available: LIME and SHAP.
After calling predict, the explanations are available in regressor.explainer_.explanations_.
Parameters
- estimator: RegressorMixin, required
the scikit-learn model
- explainer_type: str, required
‘lime’ or ‘shap’
Examples
>>> import numpy as np
>>> from gtime.regressors import ExplainableRegressor
>>> from sklearn.ensemble import RandomForestRegressor
>>> X = np.random.random((30, 5))
>>> y = np.random.random(30)
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>>
>>> random_forest = RandomForestRegressor()
>>> explainable_regressor = ExplainableRegressor(random_forest, 'shap')
>>>
>>> explainable_regressor.fit(X_train, y_train, feature_names=['a', 'b', 'c', 'd', 'e'])
>>> explainable_regressor.predict(X_test)
array([0.41323105, 0.40386639, 0.46462663, 0.3795568 , 0.57571486,
       0.37079003, 0.54756082, 0.35160197, 0.30881165, 0.48201442])
>>> explainable_regressor.explainer_.explanations_[0]
{'a': -0.019896434698603117, 'b': 0.029814649814215954, 'c': 0.02447547087613202, 'd': 0.021313815648682066, 'e': -0.10778800140251406}
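Each entry of explanations_ is a dictionary mapping feature names to a value, as in the output above. The following is a minimal sketch of one way to read such a dictionary, assuming the values are signed per-feature contributions to the prediction. The helper rank_features is hypothetical and not part of gtime.

```python
# Explanation for the first test sample, copied from the example output above.
explanation = {
    "a": -0.019896434698603117,
    "b": 0.029814649814215954,
    "c": 0.02447547087613202,
    "d": 0.021313815648682066,
    "e": -0.10778800140251406,
}


def rank_features(contributions):
    """Hypothetical helper: sort (feature, value) pairs by absolute value, largest first."""
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = rank_features(explanation)
for name, value in ranked:
    # Assumption: a positive value pushes the prediction up, a negative one down.
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {direction} the prediction by {abs(value):.4f}")
```

In this sample, feature 'e' has the largest absolute value, so it would be listed first.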
- fit(X: ndarray, y: ndarray, feature_names: Optional[List[str]] = None)
Calls fit on both the estimator and the explainer.
Parameters
- X: np.ndarray, required
train matrix
- y: np.ndarray, required
train true values
- feature_names: List[str], optional, (default=`None`)
the names of the feature columns of X
Returns
Fitted ExplainableRegressor
- class gtime.regressors.LinearRegressor(loss=<function mean_squared_error>)
Implementation of a LinearRegressor that takes a custom loss function.
Parameters
- loss: Callable, optional, default: mean_squared_error
The loss function to use when fitting the model. The loss function must accept y_true, y_pred and return a single real number.
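As an illustration of that contract, here is a minimal sketch of a loss callable that takes y_true and y_pred and returns a single real number. The function name mean_absolute_error_loss is hypothetical and chosen only for this example.

```python
import numpy as np


def mean_absolute_error_loss(y_true, y_pred):
    """Hypothetical custom loss: mean absolute error, returned as one float."""
    # Flatten both inputs so that (n,) and (n, 1) shapes are treated alike.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean(np.abs(y_true - y_pred)))


# Absolute errors are 0.5, 0.0 and 1.0, so the mean is 0.5.
value = mean_absolute_error_loss([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
print(value)
```

Any callable with this shape could presumably be passed as loss, in the same way the example below passes max_error.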
Examples
>>> from gtime.regressors.linear_regressor import LinearRegressor
>>> from gtime.metrics import max_error
>>> import numpy as np
>>> import pandas as pd
>>> X = np.random.random((100, 10))
>>> y = np.random.random(100)
>>> lr = LinearRegressor(loss=max_error)
>>> X_train, y_train = X[:90], y[:90]
>>> X_test, y_test = X[90:], y[90:]
>>> x0 = [0]*11
>>> lr.fit(X_train, y_train, x0=x0)
>>> lr.predict(X_test)
array([0.62987155, 0.46971378, 0.50421395, 0.5543149 , 0.50848151,
       0.54768797, 0.50968854, 0.50500384, 0.58069366, 0.54912972])
- fit(X: DataFrame, y: DataFrame, **kwargs) LinearRegressor
Fit the linear model on X and y using the given loss function. The minimization is performed with the scipy.optimize.minimize function. For more details and the available options, refer to the scipy documentation.
Parameters
- X: pd.DataFrame, shape (n_samples, n_features), required
The X matrix used as features in the fitting procedure.
- y: pd.DataFrame, shape (n_samples, 1), required
The y matrix to use as target values in the fitting procedure.
- kwargs: dict, optional
Optional arguments to pass to the minimize function of scipy.
Returns
- self: LinearRegressor
The fitted model.
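The general technique of fitting a linear model by minimizing an arbitrary loss with scipy.optimize.minimize can be sketched as follows. This is not the gtime implementation. The function fit_linear and the weight layout (intercept first, then one coefficient per feature) are assumptions made for illustration; the layout is only suggested by the example above, where x0 has one more entry than X has features.

```python
import numpy as np
from scipy.optimize import minimize


def mean_squared_error(y_true, y_pred):
    """Default loss for this sketch: mean squared error as one float."""
    return float(np.mean((y_true - y_pred) ** 2))


def fit_linear(X, y, loss=mean_squared_error, x0=None, **kwargs):
    """Hypothetical helper: find weights minimizing loss(y, w[0] + X @ w[1:])."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    if x0 is None:
        # Assumed layout: one intercept plus one coefficient per feature.
        x0 = np.zeros(X.shape[1] + 1)

    def objective(weights):
        y_pred = weights[0] + X @ weights[1:]
        return loss(y, y_pred)

    # Extra keyword arguments (method, tol, options, ...) go straight to scipy.
    result = minimize(objective, x0, **kwargs)
    return result.x


rng = np.random.default_rng(0)
X = rng.random((90, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5])  # noiseless linear target
weights = fit_linear(X, y, method="BFGS")
print(weights)  # close to the generating values 1.0, 2.0, -1.0, 0.5
```

Because the objective is an ordinary Python function, a non-smooth loss may need a derivative-free method such as method="Nelder-Mead", which can be selected through the same keyword arguments.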
- class gtime.regressors.MultiFeatureMultiOutputRegressor(estimator: RegressorMixin, target_to_features_dict: Optional[Dict[int, List[int]]] = None)
Multi target regression with option to choose the features for each target.
This strategy consists of fitting one regressor per target. It is built on top of sklearn.multioutput.MultiOutputRegressor and, unlike it, allows different features to be chosen for each regressor.
Parameters
- estimator: RegressorMixin, required
An estimator object implementing fit and predict.
Examples
>>> import numpy as np
>>> from gtime.regressors import MultiFeatureMultiOutputRegressor
>>> from sklearn.ensemble import RandomForestRegressor
>>> X = np.random.random((30, 5))
>>> y = np.random.random((30, 3))
>>> X_train, y_train = X[:20], y[:20]
>>> X_test, y_test = X[20:], y[20:]
>>>
>>> random_forest = RandomForestRegressor()
>>> regressor = MultiFeatureMultiOutputRegressor(estimator=random_forest)
>>>
>>> target_to_features_dict = {0: [0,1,2], 1: [0,1,3], 2: [0,1,4]}
>>> regressor.fit(X_train, y_train, target_to_features_dict=target_to_features_dict)
>>>
>>> predictions = regressor.predict(X_test)
>>> predictions.shape
(10, 3)
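To make the one-regressor-per-target strategy concrete, here is a rough sketch that fits a separate model on each target's own feature columns. It is not the gtime implementation. Ordinary least squares from NumPy stands in for the scikit-learn estimator, and the helpers fit_per_target and predict_per_target are hypothetical.

```python
import numpy as np


def design_matrix(X, features):
    """Intercept column followed by the selected feature columns."""
    return np.column_stack([np.ones(len(X)), X[:, features]])


def fit_per_target(X, y, target_to_features):
    """Hypothetical helper: fit one least-squares model per target column."""
    models = {}
    for target, features in target_to_features.items():
        A = design_matrix(X, features)
        coefs, *_ = np.linalg.lstsq(A, y[:, target], rcond=None)
        models[target] = (features, coefs)
    return models


def predict_per_target(models, X):
    """Predict each target with its own model and stack the results by target index."""
    columns = []
    for target in sorted(models):
        features, coefs = models[target]
        columns.append(design_matrix(X, features) @ coefs)
    return np.column_stack(columns)


rng = np.random.default_rng(0)
X = rng.random((30, 5))
y = rng.random((30, 3))
target_to_features = {0: [0, 1, 2], 1: [0, 1, 3], 2: [0, 1, 4]}

models = fit_per_target(X[:20], y[:20], target_to_features)
predictions = predict_per_target(models, X[20:])
print(predictions.shape)  # (10, 3)
```

Each model only ever sees the columns listed for its target, which is the difference from a plain multi-output wrapper that feeds every regressor the full feature matrix.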