Preprocessing
The gtime.preprocessing module deals with the preprocessing of time series data.
- class gtime.preprocessing.TimeSeriesPreparation(start: Optional[datetime] = None, end: Optional[datetime] = None, freq: Optional[Timedelta] = None, resample_if_not_equispaced: bool = False, output_name: str = 'time_series')
Transforms an array-like sequence into a DataFrame with a PeriodIndex and a single column.
The behavior depends on the input type:
- if a list or np.array is passed, the PeriodIndex is built using the parameters start, end and freq;
- if a pd.Series is passed, its index is checked. If it is a time index (DatetimeIndex, TimedeltaIndex, PeriodIndex), it is converted to a PeriodIndex. Otherwise, the index is built as if the input were a list or np.array.
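The branching described above can be illustrated with plain pandas. This is a minimal sketch, not gtime's implementation: the helper name `to_period_index_frame`, its default `start` of 1970-01-01, its daily `freq`, and the choice to anchor a TimedeltaIndex at 1970-01-01 are all assumptions made to mirror the outputs shown in the Examples section.

```python
import pandas as pd

TIME_INDEX_TYPES = (pd.DatetimeIndex, pd.TimedeltaIndex, pd.PeriodIndex)


def to_period_index_frame(values, start="1970-01-01", freq="D", name="time_series"):
    """Hypothetical helper: return a single-column DataFrame with a PeriodIndex."""
    if isinstance(values, pd.Series) and isinstance(values.index, TIME_INDEX_TYPES):
        index = values.index
        if isinstance(index, pd.TimedeltaIndex):
            # Assumption: offsets are anchored at the Unix epoch.
            index = pd.Timestamp("1970-01-01") + index
        if isinstance(index, pd.DatetimeIndex):
            index = index.to_period(freq)
        data = values.to_numpy()
    else:
        # Lists, arrays, and Series without a time index get a freshly built index.
        data = list(values)
        index = pd.period_range(start=start, periods=len(data), freq=freq)
    return pd.DataFrame({name: data}, index=index)


from_list = to_period_index_frame([1, 2, 3])
from_dates = to_period_index_frame(
    pd.Series([1, 2], index=pd.date_range("2010-01-01", periods=2, freq="10D"))
)
from_deltas = to_period_index_frame(
    pd.Series(
        [1, 2],
        index=pd.timedelta_range(start=pd.Timedelta(days=1), freq="10D", periods=2),
    )
)
```

In all three cases the result carries a PeriodIndex, so downstream code does not need to care which kind of input it started from.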
Parameters
- start : datetime, optional, default: None
The date to use as start date.
- end : datetime, optional, default: None
The date to use as end date.
- freq : pd.Timedelta, optional, default: None
The frequency of the output time series. Not required for every type of conversion.
- resample_if_not_equispaced : bool, optional, default: False
Not supported yet; leave it at its default value (False).
- output_name : str, optional, default: 'time_series'
The name of the output column.
Raises
- ValueError
Of the three parameters: start, end, and periods, exactly two must be specified.
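The wording of this error matches the message that pandas.period_range raises when the number of specified values among start, end, and periods is not exactly two. The sketch below reproduces it with pandas alone; the wrapper `build_index` is hypothetical, and whether TimeSeriesPreparation surfaces the error directly from pandas is an assumption.

```python
import pandas as pd


def build_index(start=None, end=None, periods=None, freq="D"):
    """Hypothetical wrapper around pandas.period_range."""
    return pd.period_range(start=start, end=end, periods=periods, freq=freq)


# Exactly two of start, end, periods: accepted.
valid_index = build_index(start="2010-01-01", periods=3)

# All three given: pandas rejects the call with a ValueError.
try:
    build_index(start="2010-01-01", end="2010-01-10", periods=3)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```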
Examples
>>> import pandas as pd
>>> from gtime.preprocessing import TimeSeriesPreparation
>>> time_series = [1, 2, 3, 5, 5, 7]
>>> period_index_time_series = pd.Series(
...     index=pd.period_range(start='01-01-2010', freq='10D', periods=6),
...     data=[1, 2, 3, 5, 5, 7]
... )
>>> datetime_index_time_series = pd.Series(
...     index=pd.date_range(start='01-01-2010', freq='10D', periods=6),
...     data=[1, 2, 3, 5, 5, 7]
... )
>>> timedelta_index_time_series = pd.Series(
...     index=pd.timedelta_range(start=pd.Timedelta(days=1), freq='10D', periods=6),
...     data=[1, 2, 3, 5, 5, 7]
... )
>>> time_series_preparation = TimeSeriesPreparation()
>>> time_series_preparation.transform(time_series)
            time_series
1970-01-01            1
1970-01-02            2
1970-01-03            3
1970-01-04            5
1970-01-05            5
1970-01-06            7
>>> time_series_preparation.transform(period_index_time_series)
            time_series
2010-01-01            1
2010-01-11            2
2010-01-21            3
2010-01-31            5
2010-02-10            5
2010-02-20            7
>>> time_series_preparation.transform(datetime_index_time_series)
            time_series
2010-01-01            1
2010-01-11            2
2010-01-21            3
2010-01-31            5
2010-02-10            5
2010-02-20            7
>>> time_series_preparation.transform(timedelta_index_time_series)
            time_series
1970-01-02            1
1970-01-12            2
1970-01-22            3
1970-02-01            5
1970-02-11            5
1970-02-21            7
- transform(time_series: Union[List, array, Series, DataFrame]) -> DataFrame
Transforms an array-like sequence into a DataFrame with a PeriodIndex and a single column.
Parameters
- time_series : Union[List, np.array, pd.Series, pd.DataFrame], required
The input time series.
Returns
- period_index_dataframe : pd.DataFrame
The output dataframe with a period index.