caikit.interfaces.ts.data_model._single_timeseries

The core data model object for a TimeSeries

Attributes

log

error

Classes

SingleTimeSeries

The TimeSeries object is the central data container for the library.

Module Contents

caikit.interfaces.ts.data_model._single_timeseries.log
caikit.interfaces.ts.data_model._single_timeseries.error
class caikit.interfaces.ts.data_model._single_timeseries.SingleTimeSeries(*args, **kwargs)

Bases: caikit.core.DataObjectBase

The TimeSeries object is the central data container for the library. At present it wraps either a pandas.DataFrame or a pyspark.sql.DataFrame and binds it into the caikit data model.

class StringIDSequence

Bases: caikit.core.DataObjectBase

Nested value sequence of strings

values: py_to_proto.dataclass_to_proto.Annotated[List[str], FieldNumber(1)]
class IntIDSequence

Bases: caikit.core.DataObjectBase

Nested value sequence of ints

values: py_to_proto.dataclass_to_proto.Annotated[List[int], FieldNumber(1)]
time_sequence: py_to_proto.dataclass_to_proto.Annotated[caikit.interfaces.ts.data_model.time_types.PeriodicTimeSequence, OneofField('time_period'), FieldNumber(10)] | py_to_proto.dataclass_to_proto.Annotated[caikit.interfaces.ts.data_model.time_types.PointTimeSequence, OneofField('time_points'), FieldNumber(20)]
values: py_to_proto.dataclass_to_proto.Annotated[List[caikit.interfaces.ts.data_model.time_types.ValueSequence], FieldNumber(1)]
timestamp_label: py_to_proto.dataclass_to_proto.Annotated[str, FieldNumber(2)]
value_labels: py_to_proto.dataclass_to_proto.Annotated[List[str], FieldNumber(3)]
ids: py_to_proto.dataclass_to_proto.Annotated[SingleTimeSeries.IntIDSequence, OneofField('id_int'), FieldNumber(30)] | py_to_proto.dataclass_to_proto.Annotated[SingleTimeSeries.StringIDSequence, OneofField('id_str'), FieldNumber(40)]
_DEFAULT_TS_COL = 'timestamp'
_get_pd_df() -> Tuple[pandas.DataFrame, str, Iterable[str]]

Convert the data to a pandas DataFrame, efficiently if possible.

__len__() -> int

Return the length of the single time series object.

Returns:

int: Length

__eq__(other: SingleTimeSeries) -> bool

Equivalence operator for SingleTimeSeries objects.

Sorts the data by timestamp_label before checking for equivalence. Relies on the underlying pandas testing function pd.testing.assert_frame_equal.

Args:

other (SingleTimeSeries): SingleTimeSeries to test against.

Returns:

bool: True if the SingleTimeSeries are equivalent.
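
The sort-then-compare behavior described for __eq__ can be illustrated with plain pandas. This is a minimal sketch and not the library's implementation: the helper name frames_equivalent, the column names, and the index reset after sorting are assumptions made for illustration. Only pd.testing.assert_frame_equal is named in the docstring above.

```python
import pandas as pd


def frames_equivalent(left: pd.DataFrame, right: pd.DataFrame, timestamp_label: str) -> bool:
    """Hypothetical helper: order both frames by timestamp, then compare them."""
    # Sort by the timestamp column and discard the old index so that
    # row order in the input does not affect the comparison.
    left_sorted = left.sort_values(timestamp_label).reset_index(drop=True)
    right_sorted = right.sort_values(timestamp_label).reset_index(drop=True)
    try:
        # assert_frame_equal raises AssertionError on any mismatch.
        pd.testing.assert_frame_equal(left_sorted, right_sorted)
    except AssertionError:
        return False
    return True


a = pd.DataFrame({"timestamp": [1, 2, 3], "value": [10.0, 20.0, 30.0]})
b = a.iloc[::-1]                         # same rows, reversed order
c = a.assign(value=[10.0, 20.0, 99.0])   # one value differs

same_after_sorting = frames_equivalent(a, b, "timestamp")
different_values = frames_equivalent(a, c, "timestamp")
```

Because assert_frame_equal signals a mismatch by raising, wrapping it in try/except is one way to turn it into a boolean result suitable for an equality operator.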

_as_pandas_ops(adf, include_timestamps: None | bool = False)

Operate on a pandas-like object rather than strictly a pandas DataFrame.

as_pandas(include_timestamps: bool | None = None) -> pandas.DataFrame

Get the view of this timeseries as a pandas DataFrame

Args:

include_timestamps (bool, optional): Controls the addition or removal of timestamps. True includes timestamps, generating them if needed, while False removes them. Use None to return what is available, leaving it unchanged. Defaults to None.

Returns:

pd.DataFrame: The view of the data as a pandas DataFrame
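
The three-way meaning of include_timestamps (True, False, None) can be sketched with plain pandas. This is an illustration of the documented semantics only, not the library's code: the function name apply_include_timestamps is hypothetical, and generating timestamps as the integers 0..n-1 is an assumption, since the docstring does not say how missing timestamps are generated.

```python
from typing import Optional

import pandas as pd


def apply_include_timestamps(
    df: pd.DataFrame,
    timestamp_label: str,
    include_timestamps: Optional[bool] = None,
) -> pd.DataFrame:
    """Hypothetical helper mirroring the documented True/False/None behavior."""
    if include_timestamps is None:
        # None: return what is available, leaving it unchanged.
        return df
    has_timestamps = timestamp_label in df.columns
    if include_timestamps and not has_timestamps:
        # True: include timestamps, generating them if needed.
        # Assumption: generated timestamps are simply 0..n-1.
        out = df.copy()
        out[timestamp_label] = range(len(out))
        return out
    if not include_timestamps and has_timestamps:
        # False: remove the timestamp column.
        return df.drop(columns=[timestamp_label])
    # The frame already matches the request.
    return df


values_only = pd.DataFrame({"value": [1.0, 2.0, 3.0]})
with_ts = apply_include_timestamps(values_only, "timestamp", True)
without_ts = apply_include_timestamps(with_ts, "timestamp", False)
unchanged = apply_include_timestamps(values_only, "timestamp", None)
```

Treating None as a distinct third state, rather than as a synonym for False, is what lets a caller ask for the data as stored without forcing timestamps in or out.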

as_spark(include_timestamps: bool | None = None) -> caikit.interfaces.ts.data_model.toolkit.optional_dependencies.pyspark.sql.DataFrame

Get the view of this timeseries as a spark DataFrame

Args:

include_timestamps (bool, optional): Controls the addition or removal of timestamps. True includes timestamps, generating them if needed, while False removes them. Use None to return what is available, leaving it unchanged. Defaults to None.

Returns:

pyspark.sql.DataFrame: The view of the data as a spark DataFrame