caikit.runtime.model_management.batcher

The Batcher transparently aggregates individual inference calls into unified batches to call the run_batch implementation of the wrapped model.

Attributes

log

Classes

Batcher

Module Contents

caikit.runtime.model_management.batcher.log
class caikit.runtime.model_management.batcher.Batcher(model_name: str, model: caikit.core.ModuleBase, batch_size: int, batch_collect_delay_s: float | None = None)
The Batcher transparently aggregates individual inference calls into unified batches to call the run_batch implementation of the wrapped model.
_model_name
_model
_batch_size
_batch_collect_delay_s = None
_input_q
_finished_tasks
_req_num = 0
_id_lock
_ready_event
_stop_event
_batch_thread_start_lock
_batch_thread = None
_model_run_defaults
__del__()

Shut down the internal thread.

run(**kwargs) -> caikit.core.data_model.base.DataBase

This run function provides a facade over the underlying model’s run function. It is implemented by running batches of individual requests through the model’s run_batch method.

NOTE: Only kwargs are accepted, to simplify batching across inconsistent sets of kwargs (and only kwargs are used in the predict servicer).

stop()

Stop this batcher’s run thread (cannot be undone).

_ensure_batch_thread()

The run thread will stop itself if there’s no work to do, so this function is called to ensure that it’s up and running
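The "restart on demand" idea described above can be sketched as follows. This is a minimal illustration and not the caikit implementation: the class LazyWorker and its method ensure_thread are hypothetical names, and guarding the start with a lock is an assumption suggested by the _batch_thread_start_lock attribute.

```python
import threading
from typing import Callable, Optional


class LazyWorker:
    """Hypothetical helper: (re)starts a worker thread only when needed."""

    def __init__(self, target: Callable[[], None]):
        self._target = target
        self._thread: Optional[threading.Thread] = None
        # Assumption: a lock serializes the check-and-start so that two
        # concurrent callers cannot both start a thread.
        self._start_lock = threading.Lock()

    def ensure_thread(self) -> threading.Thread:
        with self._start_lock:
            # Start a fresh thread if none exists or the previous one exited.
            if self._thread is None or not self._thread.is_alive():
                self._thread = threading.Thread(target=self._target, daemon=True)
                self._thread.start()
            return self._thread
```

Because a Python thread object cannot be started twice, the sketch creates a new Thread each time the previous one has exited.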

_next_req_id()

Make a unique ID for this request.
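One common way to produce unique request IDs across threads is a counter guarded by a lock. The sketch below assumes that the _req_num and _id_lock attributes are combined in this way; the class RequestIdCounter and the method next_req_id are hypothetical names used only for illustration.

```python
import threading


class RequestIdCounter:
    """Hypothetical sketch of a thread-safe, monotonically increasing ID source."""

    def __init__(self) -> None:
        self._req_num = 0
        self._id_lock = threading.Lock()

    def next_req_id(self) -> int:
        # The lock makes the read-and-increment atomic, so no two callers
        # can observe the same value.
        with self._id_lock:
            req_id = self._req_num
            self._req_num += 1
        return req_id
```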

_batch_thread_run()

This function runs in an independent thread and manages pulling requests from the input queue, running the batch, and returning the completed results into _finished_tasks.
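To make the flow concrete, here is a small self-contained sketch of the general pattern: callers enqueue a request and block, while a worker thread collects up to batch_size requests, runs them as one batch, and publishes the results by request ID. It is not the caikit code. ToyBatcher, its run_batch callable, the polling timeout, and the use of a Condition to wake callers are all assumptions made for illustration, and error handling is omitted (an exception in run_batch would leave callers waiting).

```python
import queue
import threading
from typing import Any, Callable, Dict, List, Tuple


class ToyBatcher:
    """Hypothetical stand-in for illustration; not caikit's Batcher."""

    def __init__(self, run_batch: Callable[[List[dict]], List[Any]],
                 batch_size: int, collect_delay_s: float = 0.01):
        self._run_batch = run_batch
        self._batch_size = batch_size
        self._collect_delay_s = collect_delay_s
        self._input_q: "queue.Queue[Tuple[int, dict]]" = queue.Queue()
        self._finished: Dict[int, Any] = {}
        self._cond = threading.Condition()
        self._stop_event = threading.Event()
        self._req_num = 0
        self._thread = threading.Thread(target=self._batch_thread_run, daemon=True)
        self._thread.start()

    def run(self, **kwargs: Any) -> Any:
        # Assign a unique ID, enqueue the request, then wait for its result.
        with self._cond:
            req_id = self._req_num
            self._req_num += 1
        self._input_q.put((req_id, kwargs))
        with self._cond:
            self._cond.wait_for(lambda: req_id in self._finished)
            return self._finished.pop(req_id)

    def stop(self) -> None:
        self._stop_event.set()
        self._thread.join()

    def _batch_thread_run(self) -> None:
        while not self._stop_event.is_set():
            batch: List[Tuple[int, dict]] = []
            try:
                # Wait briefly for the first request, then keep collecting
                # until the batch is full or the collect delay elapses.
                batch.append(self._input_q.get(timeout=0.05))
                while len(batch) < self._batch_size:
                    batch.append(self._input_q.get(timeout=self._collect_delay_s))
            except queue.Empty:
                pass
            if not batch:
                continue
            results = self._run_batch([kwargs for _, kwargs in batch])
            with self._cond:
                for (req_id, _), result in zip(batch, results):
                    self._finished[req_id] = result
                self._cond.notify_all()


if __name__ == "__main__":
    batcher = ToyBatcher(lambda reqs: [r["x"] * 2 for r in reqs], batch_size=4)
    print(batcher.run(x=21))
    batcher.stop()
```

The collect delay trades latency for batch fullness: a longer delay gives concurrent callers more time to land in the same batch, at the cost of making a lone request wait.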