caikit.runtime.protobufs.model_runtime_pb2_grpc
===============================================

.. py:module:: caikit.runtime.protobufs.model_runtime_pb2_grpc

.. autoapi-nested-parse::

   Client and server classes corresponding to protobufs-defined services.


Classes
-------

.. autoapisummary::

   caikit.runtime.protobufs.model_runtime_pb2_grpc.ModelRuntimeStub
   caikit.runtime.protobufs.model_runtime_pb2_grpc.ModelRuntimeServicer
   caikit.runtime.protobufs.model_runtime_pb2_grpc.ModelRuntime


Functions
---------

.. autoapisummary::

   caikit.runtime.protobufs.model_runtime_pb2_grpc.add_ModelRuntimeServicer_to_server


Module Contents
---------------

.. py:class:: ModelRuntimeStub(channel)

   Bases: :py:obj:`object`

   This is the internal "sidecar" API for interfacing with a colocated
   model runtime container.

   .. py:attribute:: loadModel

   .. py:attribute:: unloadModel

   .. py:attribute:: predictModelSize

   .. py:attribute:: modelSize

   .. py:attribute:: runtimeStatus


.. py:class:: ModelRuntimeServicer

   Bases: :py:obj:`object`

   This is the internal "sidecar" API for interfacing with a colocated
   model runtime container.

   .. py:method:: loadModel(request, context)

      Load a model and return once it is fully loaded.
      Include the size of the loaded model in the response if this incurs
      no additional cost.
      A gRPC error code of FAILED_PRECONDITION or INVALID_ARGUMENT should be
      returned if no attempt to load the model was made (so that the caller
      can be sure that no space remains in use).
      Note that the RPC may be cancelled by model-mesh prior to completion,
      after which an unloadModel call will immediately be sent for the same
      model. To avoid state inconsistency and "leaking" memory, implementors
      should ensure that this case is properly handled, i.e. that the model
      doesn't remain loaded after returning successfully from this
      unloadModel call.

   .. py:method:: unloadModel(request, context)

      Unload a previously loaded (or failed) model. Return when the model
      is fully unloaded, or immediately if it is not found/loaded.

   .. py:method:: predictModelSize(request, context)

      Predict the size of a not-yet-loaded model. Must return almost
      immediately and should not perform expensive computation or remote
      lookups. The result should be a conservative estimate.

   .. py:method:: modelSize(request, context)

      Calculate the size (memory consumption) of a currently-loaded model.

   .. py:method:: runtimeStatus(request, context)

      Provide basic runtime status and parameters; called only during
      startup. Before returning a READY status, implementations should check
      for and purge any/all currently-loaded models. Since this is only
      called during startup, there should very rarely be any, but if there
      are, it implies the model-mesh container restarted unexpectedly, and
      such a purge must be done to ensure continued consistency of state and
      to avoid over-committing resources.


.. py:function:: add_ModelRuntimeServicer_to_server(servicer, server)

.. py:class:: ModelRuntime

   Bases: :py:obj:`object`

   This is the internal "sidecar" API for interfacing with a colocated
   model runtime container.

   .. py:method:: loadModel(request, target, options=(), channel_credentials=None, call_credentials=None, insecure=False, compression=None, wait_for_ready=None, timeout=None, metadata=None)
      :staticmethod:

   .. py:method:: unloadModel(request, target, options=(), channel_credentials=None, call_credentials=None, insecure=False, compression=None, wait_for_ready=None, timeout=None, metadata=None)
      :staticmethod:

   .. py:method:: predictModelSize(request, target, options=(), channel_credentials=None, call_credentials=None, insecure=False, compression=None, wait_for_ready=None, timeout=None, metadata=None)
      :staticmethod:

   .. py:method:: modelSize(request, target, options=(), channel_credentials=None, call_credentials=None, insecure=False, compression=None, wait_for_ready=None, timeout=None, metadata=None)
      :staticmethod:

   .. py:method:: runtimeStatus(request, target, options=(), channel_credentials=None, call_credentials=None, insecure=False, compression=None, wait_for_ready=None, timeout=None, metadata=None)
      :staticmethod:
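
The state-handling rules in the ``ModelRuntimeServicer`` docstrings (idempotent unload, purge before reporting READY) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: ``InMemoryRuntime`` and ``loaded_ids`` are hypothetical names, the ``modelId``, ``sizeInBytes`` and ``status`` fields are assumed rather than taken from this page, and ``SimpleNamespace`` objects stand in for the real protobuf request and response messages. A real implementation would presumably subclass ``ModelRuntimeServicer`` and register itself with ``add_ModelRuntimeServicer_to_server``.

```python
import threading
from types import SimpleNamespace


class InMemoryRuntime:
    """Hypothetical stand-in that mirrors the ModelRuntimeServicer method names."""

    def __init__(self):
        self._lock = threading.Lock()
        self._models = {}  # model id -> size in bytes

    def loadModel(self, request, context):
        # Placeholder "load": a real runtime would read model files here.
        size = len(request.modelId) * 1024
        with self._lock:
            self._models[request.modelId] = size
        # Report the size, since it is already known at no extra cost.
        return SimpleNamespace(sizeInBytes=size)

    def unloadModel(self, request, context):
        # Idempotent: returns immediately if the model is not found/loaded,
        # so an unload that follows a cancelled load leaves nothing behind.
        with self._lock:
            self._models.pop(request.modelId, None)
        return SimpleNamespace()

    def modelSize(self, request, context):
        # Size of a currently-loaded model; raises KeyError if not loaded.
        with self._lock:
            return SimpleNamespace(sizeInBytes=self._models[request.modelId])

    def runtimeStatus(self, request, context):
        # Purge any leftover models before reporting READY.
        with self._lock:
            self._models.clear()
        return SimpleNamespace(status="READY")

    def loaded_ids(self):
        # Helper for inspection only; not part of the documented API.
        with self._lock:
            return sorted(self._models)


runtime = InMemoryRuntime()
request = SimpleNamespace(modelId="demo")
runtime.loadModel(request, context=None)
runtime.unloadModel(request, context=None)
runtime.unloadModel(request, context=None)  # second call is a no-op
```

The lock is there because a gRPC server typically dispatches calls on multiple worker threads, so a load and its follow-up unload may overlap; the sketch does not model cancellation itself.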