caikit.core
===========

.. py:module:: caikit.core

.. autoapi-nested-parse::

   Caikit Core AI Framework library.  This is the base framework for core AI/ML libraries.


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/caikit/core/augmentors/index
   /autoapi/caikit/core/data_model/index
   /autoapi/caikit/core/exceptions/index
   /autoapi/caikit/core/model_management/index
   /autoapi/caikit/core/model_manager/index
   /autoapi/caikit/core/module_backends/index
   /autoapi/caikit/core/modules/index
   /autoapi/caikit/core/registries/index
   /autoapi/caikit/core/signature_parsing/index
   /autoapi/caikit/core/task/index
   /autoapi/caikit/core/toolkit/index


Attributes
----------

.. autoapisummary::

   caikit.core.MODEL_MANAGER
   caikit.core.extract
   caikit.core.load
   caikit.core.resolve_and_load
   caikit.core.train
   caikit.core.start_prediction_job
   caikit.core.get_model_future
   caikit.core.get_prediction_future


Exceptions
----------

.. autoapisummary::

   caikit.core.DataValidationError


Classes
-------

.. autoapisummary::

   caikit.core.DataObjectBase
   caikit.core.ModelManager
   caikit.core.BackendBase
   caikit.core.LocalBackend
   caikit.core.ModuleBase
   caikit.core.ModuleConfig
   caikit.core.ModuleLoader
   caikit.core.ModuleSaver
   caikit.core.TaskBase
   caikit.core.EvalTypes
   caikit.core.F1Metrics
   caikit.core.F1MetricsContainer
   caikit.core.QualityEvaluator
   caikit.core.ObjectSerializer
   caikit.core.JSONSerializer
   caikit.core.TextSerializer
   caikit.core.YAMLSerializer
   caikit.core.CSVSerializer
   caikit.core.PickleSerializer


Functions
---------

.. autoapisummary::

   caikit.core.get_valid_module_ids
   caikit.core.module
   caikit.core.task
   caikit.core.load_txt
   caikit.core.load_txt_lines
   caikit.core.save_txt
   caikit.core.load_binary
   caikit.core.save_binary
   caikit.core.load_csv
   caikit.core.save_csv
   caikit.core.load_dict_csv
   caikit.core.save_dict_csv
   caikit.core.load_json
   caikit.core.save_json
   caikit.core.load_yaml
   caikit.core.save_yaml
   caikit.core.load_pickle
   caikit.core.save_pickle
   caikit.core.save_raw
   caikit.core.compress


Package Contents
----------------

.. py:class:: DataObjectBase

   Bases: :py:obj:`caikit.core.data_model.base.DataBase`


   A DataObject is a data model class that is backed by a @dataclass.

   Data model classes that use the @dataobject decorator must derive from this
   base class.


.. py:exception:: DataValidationError(reason: str, item_number: Optional[int] = None)

   Bases: :py:obj:`Exception`


   This error is used for data validation problems during training


   .. py:attribute:: _reason


   .. py:attribute:: _item_number
      :value: None


   .. py:property:: reason
      :type: str


      The reason given for this data validation error


   .. py:property:: item_number
      :type: Optional[int]


      The index of the training data item that failed validation. Probably zero indexed
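
As a rough illustration of how an error of this shape can report both a reason and the position of the offending item, the sketch below defines a hypothetical `SketchValidationError` that mirrors the documented constructor and properties. It is not caikit's implementation: the class name, the `collect_errors` helper, the zero-based indexing, and the reading of a negative `limit` as "no cap" are all assumptions made for the example.

```python
from typing import List, Optional


class SketchValidationError(Exception):
    """Hypothetical stand-in mirroring the documented constructor."""

    def __init__(self, reason: str, item_number: Optional[int] = None):
        super().__init__(reason)
        self._reason = reason
        self._item_number = item_number

    @property
    def reason(self) -> str:
        return self._reason

    @property
    def item_number(self) -> Optional[int]:
        return self._item_number


def collect_errors(items, limit: int = -1) -> List[SketchValidationError]:
    """Collect up to `limit` errors; a negative limit is assumed to mean no cap."""
    errors = []
    for index, item in enumerate(items):
        if 0 <= limit <= len(errors):
            break
        if not isinstance(item, str) or not item.strip():
            # Assumption: item_number is the zero-based position in the input
            errors.append(SketchValidationError("empty or non-string text", index))
    return errors


errors = collect_errors(["good", "", "fine", None])
capped = collect_errors(["good", "", "fine", None], limit=1)
```

Returning a list of errors rather than raising on the first one lets a caller report every bad training item in a single pass.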


.. py:function:: get_valid_module_ids()

   Get a dictionary mapping all module IDs to the string names of the
   implementing classes.


.. py:class:: ModelManager

   Manage the models or resources for the library.


   .. py:attribute:: _singleton_module_cache


   .. py:attribute:: _trainers


   .. py:attribute:: _finders


   .. py:attribute:: _job_predictors


   .. py:attribute:: _initializers


   .. py:attribute:: __singleton_lock


   .. py:method:: initialize_components()

      Proactively initialize all configured trainer/finder/initializer
      component instances. This is a separate call to enable explicit config.


   .. py:method:: train(module: Union[Type[caikit.core.modules.base.ModuleBase], str], *args, trainer: Union[str, caikit.core.model_management.ModelTrainerBase] = 'default', save_path: Optional[Union[str, caikit.interfaces.common.data_model.stream_sources.S3Path]] = None, save_with_id: bool = False, model_name: Optional[str] = None, wait: bool = False, **kwargs) -> caikit.core.model_management.ModelTrainerFutureBase

      Train an instance of the given module with the given args and kwargs
      using the given trainer.

      Each module's train function encapsulates the code needed to perform the
      training locally. This top-level train function provides the wrapper
      functionality to delegate the execution of the module's train function
      to an alternate framework using a ModelTrainerBase. It also allows
      training to be launched asynchronously.

      Args:
          module (Union[Type[ModuleBase], str]): The module class or guid for
              the module to train
          *args: Additional positional args to pass through to the module's
              train function

      Kwargs:
          trainer (Union[str, ModelTrainerBase]): The trainer to use. If given
              as a string, this is a key in the global config at
              model_management.trainers.
          save_path (Optional[Union[str, S3Path]]): Base path where the model should be
              saved (may be relative to a remote trainer's filesystem, or link to S3
              storage)
          save_with_id (bool): Inject the training ID into the save path for
              the output model
          model_name (Optional[str]): Name of model that will be appended
              to the end of the save_path
          wait (bool): Wait for training to complete before returning
          **kwargs: Additional keyword arguments to pass through to the
              module's train function

      Returns:
          model_future (ModelTrainerFutureBase): The future handle
              to the model which holds the status of the in-flight training.
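
      The asynchronous launch and the `wait` flag described above can be pictured with a
      standard-library future. This is only a sketch under stated assumptions: caikit returns
      its own future type, whereas here `concurrent.futures.Future` is a stand-in, and
      `train_module` and `toy_train` are hypothetical names.

      ```python
      from concurrent.futures import Future, ThreadPoolExecutor

      # Stand-in for a trainer backend; a single background worker
      _executor = ThreadPoolExecutor(max_workers=1)


      def train_module(train_fn, *args, wait: bool = False, **kwargs) -> Future:
          """Submit train_fn to a background worker and return a future handle."""
          future = _executor.submit(train_fn, *args, **kwargs)
          if wait:
              future.result()  # block until training finishes; re-raises failures
          return future


      def toy_train(values, scale=1):
          # Hypothetical module train function: "trains" by averaging its inputs
          return scale * sum(values) / len(values)


      handle = train_module(toy_train, [1, 2, 3], scale=2, wait=True)
      model = handle.result()
      _executor.shutdown()
      ```

      With `wait=False` the caller would get the handle back immediately and could poll it
      later, which is the role `get_model_future` plays for a stored training ID.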


   .. py:method:: start_prediction_job(model: caikit.core.modules.base.ModuleBase, prediction_func_name: str, *args, predictor: Union[str, caikit.core.model_management.JobPredictorBase] = 'default', wait: bool = False, **kwargs) -> caikit.core.model_management.JobPredictorFutureBase

      Start a prediction job using a job_predictor.

      Args:
          model (ModuleBase): Loaded model to run prediction on
          prediction_func_name (str): String reference to name of function to run
          predictor (Union[str, JobPredictorBase], optional): Which job_predictor to use.
            Defaults to "default".
          wait (bool, optional): Whether to wait for job to finish. Defaults to False.

      Returns:
          JobPredictorFutureBase: Future to track job result


   .. py:method:: get_model_future(training_id: str) -> caikit.core.model_management.ModelTrainerFutureBase

      Get the future handle to an in-progress training

      Args:
          training_id (str): The ID string from the original training
              submission's ModelFuture

      Returns:
          model_future (ModelTrainerFutureBase): The future handle
              to the model which holds the status of the in-flight training.


   .. py:method:: get_prediction_future(prediction_id: str) -> caikit.core.model_management.JobPredictorFutureBase

      Get the future handle to an in-progress prediction job

      Args:
          prediction_id (str): The ID string from the original prediction
              submission's ModelFuture

      Returns:
          prediction_future (JobPredictorFutureBase): The future handle
              to the job which holds the status of the in-flight prediction.


   .. py:method:: load(module_path: Union[str, io.BytesIO, bytes], *, load_singleton: bool = False, finder: Union[str, caikit.core.model_management.ModelFinderBase] = 'default', initializer: Union[str, caikit.core.model_management.ModelInitializerBase] = 'default', **kwargs)

      Load a model and return an instantiated object on which we can run
      inference.

      Args:
          module_path (str | BytesIO | bytes): A module path (identifier) to
              one of the following:
              1. A directory containing a yaml config file in the top level.
              2. A zip archive containing either a yaml config file in the
                  top level when extracted, or a directory containing a yaml
                  config file in the top level.
              3. A BytesIO object corresponding to a zip archive containing
                  either a yaml config file in the top level when extracted,
                  or a directory containing a yaml config file in the top
                  level.
              4. A bytes object corresponding to a zip archive containing
                  either a yaml config file in the top level when extracted,
                  or a directory containing a yaml config file in the top
                  level.
              5. A string that is understood by the configured
                  finder/initializer

      Kwargs:
          load_singleton (bool): Load this model as a singleton
          finder (Union[str, ModelFinderBase]): Finder to use when loading
              this model. If passed as a string, this names the finder in the
              global config model_management.finders section.
          initializer (Union[str, ModelInitializerBase]): Loader to use when
              initializing this model. If passed as a string, this is the name
              of the initializer in the global
              config model_management.initializers section.

      Returns:
          model (ModuleBase): Model object that is loaded, configured, and
              ready for prediction.


   .. py:method:: extract(zip_path: str, model_path: str, force_overwrite: bool = False) -> str

      Method to extract a downloaded archive to a specified directory.

      Args:
          zip_path (str): Location of .zip file to extract.
          model_path (str): Model directory where the archive should be
              unzipped.
          force_overwrite (bool): Force an overwrite to model_path, even if the
              folder exists. Defaults to False.
      Returns:
          str: Output path where the model archive is unzipped.
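
      A minimal sketch of this kind of extraction, using only the standard library, is shown
      below. The helper name `extract_archive` is hypothetical, and the choice to leave an
      existing folder untouched when `force_overwrite` is False is an assumption about the
      behavior rather than a statement of what caikit does.

      ```python
      import os
      import tempfile
      import zipfile


      def extract_archive(zip_path: str, model_path: str, force_overwrite: bool = False) -> str:
          """Unzip zip_path into model_path and return the output directory."""
          if os.path.isdir(model_path) and not force_overwrite:
              # Assumption: an existing folder is reused as-is unless overwriting is forced
              return model_path
          os.makedirs(model_path, exist_ok=True)
          with zipfile.ZipFile(zip_path) as archive:
              archive.extractall(model_path)
          return model_path


      with tempfile.TemporaryDirectory() as workdir:
          # Build a tiny archive that has a config file at its top level
          zip_path = os.path.join(workdir, "model.zip")
          with zipfile.ZipFile(zip_path, "w") as archive:
              archive.writestr("config.yml", "module_id: example\n")
          out_dir = extract_archive(zip_path, os.path.join(workdir, "model"))
          extracted = sorted(os.listdir(out_dir))
      ```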


   .. py:method:: resolve_and_load(path_or_name_or_model_reference: Union[str, caikit.core.modules.base.ModuleBase], **kwargs)

      Try our best to load a model, given a path or a name. Simply returns any loaded model
      passed in. This exists to ease the burden on workflow developers who need to accept
      individual modules in their API, where users may have references to custom models or may
      only have the ability to give the name of a stock model.

      Args:
          path_or_name_or_model_reference (str, ModuleBase): Either a
              - Path to a model on disk
              - Name of a model that the catalog knows about
              - Loaded module
          **kwargs: Any keyword arguments to pass along to ModelManager.load()
                      or ModelManager.download()
              e.g. parent_dir

      Returns:
          A loaded module
      Examples:
          >>> stock_syntax_model = manager.resolve_and_load('syntax_izumo_en_stock')
          >>> local_categories_model = manager.resolve_and_load('path/to/categories/model')
          >>> some_custom_model = manager.resolve_and_load(some_custom_model)


   .. py:method:: get_singleton_model_cache_info()

      Returns information about the singleton cache in {hash: module type} format

      Returns:
          Dict[str, type]: A dictionary of model hashes to model types


   .. py:method:: clear_singleton_cache()

      Clears the cache of singleton models. Useful to release references of models, as long as
      you know that they are no longer held elsewhere and you won't be loading them again.

      Returns:
          None


   .. py:method:: get_trainer(trainer: Union[str, caikit.core.model_management.ModelTrainerBase]) -> caikit.core.model_management.ModelTrainerBase

      Get the configured model trainer or the one passed by value


   .. py:method:: get_finder(finder: Union[str, caikit.core.model_management.ModelFinderBase]) -> caikit.core.model_management.ModelFinderBase

      Get the configured model finder or the one passed by value


   .. py:method:: get_initializer(initializer: Union[str, caikit.core.model_management.ModelInitializerBase]) -> caikit.core.model_management.ModelInitializerBase

      Get the configured model initializer or the one passed by value


   .. py:method:: get_predictor(inferencer: Union[str, caikit.core.model_management.JobPredictorBase]) -> caikit.core.model_management.JobPredictorBase

      Get the configured job predictor or the one passed by value


   .. py:method:: get_module_backends(initialize: bool = True) -> List[caikit.core.module_backends.base.BackendBase]

      Convenience method to get access to the configured module backends if
      any have been configured

      Args:
          initialize (bool): Initialize the components from config

      Returns:
          backends (List[BackendBase]): The list of backend instances that
              have been configured


   .. py:method:: _do_load(module_path, load_singleton, finder, initializer, **kwargs)

      Load a model from a directory.

      Args:
          module_path (str): Path to directory. At the top level of directory
              is `config.yml` which holds info about the model.
          load_singleton (bool): Load this model as a singleton
          finder (Union[str, ModelFinderBase]): Finder to use when loading
              this model. If passed as a string, this names the finder in the
              global config model_management.finders section.
          initializer (Union[str, ModelInitializerBase]): Loader to use when
              loading this model. If passed as a string, this is the name of
              the initializer in the global
              config model_management.initializers section.

      Returns:
          subclass of caikit.core.modules.ModuleBase: Model object that is
              loaded, configured, and ready for prediction.


   .. py:method:: _load_from_zipfile(module_path, load_singleton, finder, initializer, **kwargs)

      Load a model from a zip archive.

      Args:
          module_path (str): Path to directory. At the top level of directory
              is `config.yml` which holds info about the model.
          load_singleton (bool): Load this model as a singleton
          finder (Union[str, ModelFinderBase]): Finder to use when loading
              this model. If passed as a string, this names the finder in the
              global config model_management.finders section.
          initializer (Union[str, ModelInitializerBase]): Loader to use when
              loading this model. If passed as a string, this is the name of
              the initializer in the global
              config model_management.initializers section.

      Returns:
          subclass of caikit.core.modules.ModuleBase: Model object that is
              loaded, configured, and ready for prediction.


   .. py:method:: _singleton_lock(load_singleton: bool)

      Helper contextmanager that will only lock the singleton cache if this
      load is a singleton load


   .. py:method:: _get_component(component: Union[str, caikit.core.toolkit.factory.FactoryConstructible], component_dict: Dict[str, caikit.core.toolkit.factory.FactoryConstructible], component_factory: caikit.core.toolkit.factory.Factory, component_name: str, component_cfg: dict, component_type: type) -> caikit.core.toolkit.factory.FactoryConstructible
      :staticmethod:


      Common logic for resolving components from config

      NOTE: This is done lazily to avoid relying on import order and to allow
          for dynamic config changes
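
The "configured component or the one passed by value" pattern used by `get_trainer`, `get_finder`, `get_initializer`, and `get_predictor` can be sketched as follows. This is an illustration only: `EchoTrainer`, `COMPONENT_CONFIG`, `FACTORIES`, and the `type`/`config` layout of each entry are hypothetical, and the real logic lives in `_get_component`.

```python
from typing import Callable, Dict, Union


class EchoTrainer:
    """Hypothetical component standing in for a trainer, finder, or initializer."""

    def __init__(self, config: dict):
        self.config = config


# Hypothetical stand-in for a config section such as model_management.trainers
COMPONENT_CONFIG = {"default": {"type": "ECHO", "config": {"workers": 1}}}
# Hypothetical factory table mapping a type name to a constructor
FACTORIES: Dict[str, Callable[[dict], object]] = {"ECHO": EchoTrainer}


def get_component(component: Union[str, object], cache: Dict[str, object]) -> object:
    """Return an instance passed by value, or build and cache a named one."""
    if not isinstance(component, str):
        return component  # passed by value: use it as-is
    if component not in cache:
        # Built lazily on first use, so config edits made before then still apply
        entry = COMPONENT_CONFIG[component]
        cache[component] = FACTORIES[entry["type"]](entry.get("config", {}))
    return cache[component]


cache: Dict[str, object] = {}
first = get_component("default", cache)
second = get_component("default", cache)
custom = EchoTrainer({"workers": 4})
by_value = get_component(custom, cache)
```

Deferring construction until the first lookup is what removes the dependence on import order mentioned in the note above.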


.. py:class:: BackendBase(config: Optional[aconfig.Config] = None)

   Bases: :py:obj:`abc.ABC`


   Interface for creating configuration setup for backends


   .. py:attribute:: config


   .. py:attribute:: _started
      :value: False


   .. py:attribute:: _start_lock


   .. py:property:: backend_type
      :classmethod:

      :abstractmethod:


      Property storing type of the backend


   .. py:property:: is_started


   .. py:method:: register_config(config)
      :abstractmethod:


      Function to allow dynamic merging of configs.
      This can be useful if particular implementations (modules) need to
      register explicit configurations before starting the backend.


   .. py:method:: start()
      :abstractmethod:


      Function to start a distributed backend. This function
      should set self._started variable to True


   .. py:method:: stop()
      :abstractmethod:


      Function to stop a distributed backend. This function
      should set self._started variable to False


   .. py:method:: start_lock()


   .. py:method:: handle_runtime_context(model_id: str, runtime_context: caikit.core.data_model.runtime_context.RuntimeServerContextType)

      Update backend state for the given model based on a runtime request.

      Some backends may need to handle runtime context information for the
      target model in order to correctly configure the backend before finding
      and loading the model. By default, this is a No-Op.

      Args:
          model_id (str): The unique ID of the model that is referenced by the
              runtime context
          runtime_context (RuntimeServerContextType): The context for the
              given runtime request


.. py:class:: LocalBackend(config: Optional[aconfig.Config] = None)

   Bases: :py:obj:`caikit.core.module_backends.base.BackendBase`


   Interface for creating configuration setup for backends


   .. py:attribute:: backend_type
      :value: 'LOCAL'


      Property storing type of the backend


   .. py:method:: register_config(config) -> None

      Function to merge configs with existing configurations


   .. py:method:: start()

      Start local backend. This is a no-op function


   .. py:method:: stop()

      Stop local backend. This is a no-op
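
The start/stop contract described for `BackendBase`, and the trivial local case, can be sketched with the standard library alone. The classes below are hypothetical stand-ins, not caikit's own; in particular, merging configs as a plain dict update where later keys win is an assumption.

```python
import abc
import threading


class SketchBackend(abc.ABC):
    """Hypothetical base mirroring the documented start/stop contract."""

    def __init__(self, config=None):
        self.config = dict(config or {})
        self._started = False
        self._start_lock = threading.Lock()

    @property
    def is_started(self) -> bool:
        return self._started

    @abc.abstractmethod
    def register_config(self, config) -> None: ...

    @abc.abstractmethod
    def start(self) -> None: ...

    @abc.abstractmethod
    def stop(self) -> None: ...


class SketchLocalBackend(SketchBackend):
    backend_type = "LOCAL"

    def register_config(self, config) -> None:
        # Assumption: a shallow merge in which later keys override earlier ones
        self.config = {**self.config, **config}

    def start(self) -> None:
        self._started = True  # nothing to launch locally; only the flag changes

    def stop(self) -> None:
        self._started = False


def ensure_started(backend: SketchBackend) -> None:
    """Start the backend at most once, even if called from several threads."""
    with backend._start_lock:
        if not backend.is_started:
            backend.start()


backend = SketchLocalBackend({"a": 1})
backend.register_config({"b": 2})
ensure_started(backend)
started = backend.is_started
backend.stop()
```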


.. py:class:: ModuleBase

   Abstract base class from which all modules should inherit.


   .. py:attribute:: _metadata


   .. py:attribute:: _load_backend
      :value: None


   .. py:property:: metadata
      :type: Dict[str, Any]


      This module's metadata.

      Returns:
          Dict[str, Any]: A dictionary of this module's metadata

          TODO: Can this be a `ModuleConfig` object instead? (or aconfig.Config)?


   .. py:property:: module_metadata
      :type: Dict[str, Any]


      Helper property to return metadata about a Module. This function
      is separate from `metadata` as this is specific for the class module. This
      function also requires a flat metadata structure without nested dictionaries.

      NOTE: This should be a @classmethod but using @property/@classmethod together has
      been deprecated

      Returns:
          Dict[str, str]: A dictionary of this ModuleBase's metadata


   .. py:property:: public_model_info
      :type: Dict[str, Any]


      Helper property to return public metadata about a specific Model. This
      function is separate from `metadata` as that contains the entire ModelConfig,
      which may not be appropriate to share or expose.

      Returns:
          Dict[str, str]: A dictionary of this model's public metadata


   .. py:method:: set_load_backend(load_backend)

      Method used by the model manager to indicate the load backend that
      was used to load this module


   .. py:method:: get_inference_signature(input_streaming: bool, output_streaming: bool, task: Type[caikit.core.TaskBase] = None) -> Optional[caikit.core.signature_parsing.CaikitMethodSignature]
      :classmethod:


      Returns the inference method signature that is capable of running the module's task
      for the given flavors of input and output streaming


   .. py:method:: get_inference_signatures(task: Type[caikit.core.TaskBase]) -> List[Tuple[bool, bool, caikit.core.signature_parsing.CaikitMethodSignature]]
      :classmethod:


      Returns inference method signatures for all supported flavors
      of input and output streaming for a given task


   .. py:property:: load_backend

      Get the backend instance used to load this module. This can be used
      in module implementations that require use of a specific backend at
      inference time.


   .. py:method:: bootstrap(*args, **kwargs)
      :classmethod:


      Bootstrap a module. This method can be used to initialize the module
      from artifacts created outside of a particular caikit library


   .. py:method:: load(model_path: Union[str, caikit.core.modules.config.ModuleConfig], *args, **kwargs) -> ModuleBase
      :classmethod:


      Load a new instance of this module from a given model_path

      Args:
          model_path (Union[str, ModuleConfig]): Path to saved model or
              in-memory ModuleConfig
      Returns:
          model (ModuleBase): A new instance of this module class


   .. py:method:: _load(module_loader, *args, **kwargs)
      :classmethod:


      Load a model.


   .. py:method:: timed_load(*args, **kwargs)
      :classmethod:


      Time a model `load` call.

      Args:
          *args (list): Will be passed to `self.load`.
          **kwargs (dict): Will be passed to `self.load` -- the only way to
              pass arbitrary arguments to `self.load` from this function.

      Returns:
          int, caikit.core._ModuleBase: The first return value is the total
              time spent in the `self.load` call. The second return value is
              the loaded model.

      Notes:
          You can pass everything that should go to the load function normally using args/kwargs.
          Example: `model.timed_load("/model/path/dir")`


   .. py:method:: validate_loaded_model(*args)

      Validate a loaded model.


   .. py:method:: save(model_path: str, *args, **kwargs)

      Save a model.

      Args:
          model_path (str): Path on disk to export the model to.


   .. py:method:: as_file_like_object(*args, **kwargs) -> io.BytesIO

      Produces a file-like object corresponding to a zip archive affiliated with a given
      model. This method is functionally similar to .save() - it saves a model into
      a temporary directory and produces a zip archive, then loads the result as an io.BytesIO
      object. The result of this function is also compatible with .load(), and cleanup is
      handled automatically.

      Args:
          *args, **kwargs (dict): Optional keyword arguments for saving.
      Returns:
          io.BytesIO: File-like object holding an exported model in memory as
              an io.BytesIO object.


   .. py:method:: as_bytes(*args, **kwargs) -> bytes

      Produces a bytes object corresponding to a zip archive affiliated with a given
      model. This method is functionally similar to .save() - it saves a model into
      a temporary directory and produces a zip archive, then loads the result as a bytes
      object. The result of this function is also compatible with .load(), and cleanup is
      handled automatically.

      Args:
          *args, **kwargs (dict): Optional keyword arguments for saving.
      Returns:
          bytes: bytes object holding an exported model in memory.
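
      The save-to-temporary-directory, zip, and read-back sequence described above can be
      sketched with the standard library. The names `directory_to_zip_bytes` and `fake_save`
      are hypothetical, and the sketch makes no claim about how caikit builds its archive.

      ```python
      import io
      import os
      import tempfile
      import zipfile


      def directory_to_zip_bytes(save_fn) -> bytes:
          """Call save_fn(path) on a temporary directory and return it as zip bytes."""
          with tempfile.TemporaryDirectory() as model_dir:
              save_fn(model_dir)
              buffer = io.BytesIO()
              with zipfile.ZipFile(buffer, "w") as archive:
                  for root, _dirs, files in os.walk(model_dir):
                      for name in files:
                          full_path = os.path.join(root, name)
                          # Store paths relative to the model directory
                          archive.write(full_path, os.path.relpath(full_path, model_dir))
              # The temporary directory is removed when the with-block exits
              return buffer.getvalue()


      def fake_save(path: str) -> None:
          # Hypothetical save step: writes a single config file
          with open(os.path.join(path, "config.yml"), "w") as handle:
              handle.write("module_id: example\n")


      payload = directory_to_zip_bytes(fake_save)
      names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
      ```

      Wrapping the same payload in `io.BytesIO` gives the file-like variant.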


   .. py:method:: run(*args, **kwargs)

      Run a model - this typically makes a single prediction and returns an object from the
      data model.


   .. py:method:: run_batch(*args, **kwargs)

      Run a model in batch mode - this typically ingests an iterable of inputs that can be
      applied to run & returns a list of data model objects that run ordinarily returns. A module
      may override this method to provide faster evaluation capabilities, e.g., by leveraging
      vectorization during prediction.

      All provided args and kwargs that should be expanded with the batch should be provided as
      prebatched iterables. If a provided arg/kwarg is not provided as an iterable, it will be
      passed as is to all self-contained run calls, which may happen in rare cases,
      such as runtime explainability enablement.

      This function is intentionally kept as simple as possible. In order to maintain its
      simplicity, all argument iterables must be the same length, where the length of every
      provided iterable is presumed to be the batch size. If an iterable must be passed as
      arg to each run call, batch run must be called by wrapping it in another iterable and
      duplicating the iterable arg to match the size, or (ideally) overridden in the subclass
      as necessary.

      Args:
          *args: Variable length argument list to be passed directly to run().
          **kwargs: Arbitrary keyword arguments to be passed directly to run().
      Returns:
          tuple: Iterable of prediction outputs, run as a batch.
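
      The batching rules above (prebatched iterables are indexed per call, everything else is
      repeated, and all iterables must share one length) can be sketched as follows. The
      functions here are hypothetical, and treating every list or tuple as expandable is a
      simplifying assumption.

      ```python
      def run(text, prefix=""):
          # Hypothetical single-input run function
          return prefix + text.upper()


      def is_expandable(arg) -> bool:
          # Assumption: any list or tuple counts as a prebatched input
          return isinstance(arg, (list, tuple))


      def run_batch(*args, **kwargs):
          """Call run() once per batch element and return the results as a tuple."""
          values = list(args) + list(kwargs.values())
          sizes = {len(value) for value in values if is_expandable(value)}
          if len(sizes) != 1:
              raise ValueError("need at least one iterable, and all must share one length")
          (batch_size,) = sizes
          results = []
          for idx in range(batch_size):
              call_args = [a[idx] if is_expandable(a) else a for a in args]
              call_kwargs = {k: (v[idx] if is_expandable(v) else v) for k, v in kwargs.items()}
              results.append(run(*call_args, **call_kwargs))
          return tuple(results)


      outputs = run_batch(["a", "b", "c"], prefix=">")
      paired = run_batch(["x", "y"], prefix=["1", "2"])
      ```

      A subclass that can vectorize its prediction would override this loop rather than reuse it.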


   .. py:method:: timed_run(*args, num_seconds=None, num_iterations=None, **kwargs)

      Time a number of runs over set seconds or iterations.

      Args:
          *args (list): Will be passed to `self.run`.
          num_seconds (int): Minimum number of seconds to run timed_run over.
              Will most likely run longer than this value because it waits for
              each call to `self.run` to finish.
          num_iterations (int): Number of iterations to run timed_run over.
              Will run exactly this many times.
          **kwargs (dict): Will be passed to `self.run`.

      Returns:
          int, int, caikit.core.data_model.DataBase: The first return value is
              the total time spent in the `self.run` loop. The second return
              value is the total number of calls made to `self.run`. The third
              return value is the output of the module's run method.

      Notes:
          You can pass everything that should go to the run function normally using args/kwargs.
          Example: `model.timed_run("some example text", num_seconds=60)`

      By default it will run for greater than or equal to 120 seconds.
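
      A timing loop of this kind might look like the sketch below. It is written as a free
      function taking a hypothetical `run_fn` so that it stands alone, and stopping at
      whichever limit is reached first when both are given is an assumption.

      ```python
      import time


      def timed_run(run_fn, *args, num_seconds=None, num_iterations=None, **kwargs):
          """Call run_fn repeatedly; return (elapsed_seconds, call_count, last_result)."""
          if num_seconds is None and num_iterations is None:
              num_seconds = 120  # default stated in the docstring
          start = time.perf_counter()
          calls = 0
          result = None
          while True:
              if num_iterations is not None and calls >= num_iterations:
                  break
              if num_seconds is not None and time.perf_counter() - start >= num_seconds:
                  break
              # The time limit is only checked between calls, so the total can overshoot
              result = run_fn(*args, **kwargs)
              calls += 1
          return time.perf_counter() - start, calls, result


      elapsed, calls, last = timed_run(str.upper, "some example text", num_iterations=5)
      ```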


   .. py:method:: stream(data_stream, *args, **kwargs)

      Lazily evaluate a run() on a given model by constructing a new data stream generator
      from the results. Note that we do not allow datastreams in args/kwargs. In rare cases,
      this may mean that stream() is not available, e.g., for keywords extraction. In these
      cases, stream() should be overridden in the subclass (module implementation) to allow
      and expand along multiple data streams.

      Args:
          data_stream (caikit.core.data_model.DataStream): Datastream to be
              lazily sequentially processed by the module under consideration.
          *args: Variable length argument list to be passed directly to run().
          **kwargs: Arbitrary keyword arguments to be passed directly to run().
      Returns:
          protobufs: A DataBase object.


   .. py:method:: train(*args, **kwargs)
      :classmethod:


      Train a model.


   .. py:method:: validate_training_data(training_data: Union[str, caikit.core.data_model.DataStream], limit: int = -1) -> List[caikit.core.exceptions.validation_error.DataValidationError]
      :classmethod:


      Validate a set of training data, passed as a filename or as a data stream.
      Return up to `limit` number of DataValidationErrors


   .. py:attribute:: evaluation_type
      :value: None


   .. py:attribute:: evaluator
      :value: None


   .. py:method:: find_label_func(*_args, **_kwargs)
      :staticmethod:

      :abstractmethod:


      Function used to extract "label" from a prediction/result of a module's .run method.
      Define if you wish to have more specific evaluation metrics. Implemented in subclass.


   .. py:method:: find_label_data_func(*_args, **_kwargs)
      :staticmethod:

      :abstractmethod:


      Function used to extract data belonging to class "label" from a prediction/result
      of a module's .run method. Define if you wish to have more specific evaluation metrics.
      Implemented in subclass.


   .. py:method:: evaluate_quality(dataset_path, preprocess_func=None, detailed_metrics=False, labels=None, partial_match_metrics=False, max_hierarchy_levels=3, **kwargs)

      Run quality evaluation for an instance of the module

      Args:
          dataset_path (str): Path to where the input "gold set" dataset
              lives. Most often this is a .json file.
          preprocess_func (method): Function used as proxy for any preliminary
              steps that need to be taken to run the model on the input text.
              This helper function ultimately leads to the input to this
              module and may involve executing other modules.
          detailed_metrics: boolean (Optional, defaults to False)
              Only for 'keywords'. Include partial scores and scores over every text in document.
          labels: list (Optional, defaults to None)
              Optional list of class labels to evaluate quality on. By default evaluation is done
              over all class labels. Using this, you can explicitly mention only a subset of
              labels to include in the quality evaluation.
          partial_match_metrics: boolean (Optional, defaults to False)
              Include partial match micro avg F1.
          max_hierarchy_levels (int): Used in hierarchical multilabel
              multiclass evaluation only. The number of levels in the
              hierarchy to run model evaluation on, in addition to complete
              matches.
          *args, **kwargs: Optional arguments which can be used by goldset/prediction
              set extraction.
              Non-keyword arguments: `block_level`: str
                  For any module that has pre processing steps in the
                  middle of raw text and actual module input, use the input from gold standard
                  labels instead of a pre-process function. Useful for measuring quality for the
                  'block' alone (instead of the module + pre_process pipeline)
      Returns:
          dict: Dictionary of results provided by the `self.evaluator.run`
              function, depending on the associated `evaluation_type`. Reports
              things like precision, recall, and f1.


   .. py:method:: _is_expandable_iterable(arg)
      :staticmethod:


      Check to see if something is a list / tuple of data model objects or strings. If it is,
      we consider it "expandable", meaning that each element of the iterable is passed to one
      run call. In contrast, if something is not expandable, it will be passed as is to each call.

      Args:
          arg (any): Argument to run_batch being considered.
      Returns:
          bool: True if the argument is a compatible iterable, False
              otherwise.


   .. py:method:: _validate_and_extract_batch_size(*args, **kwargs)

      Check to ensure that there's at least one iterable whose length is well defined,
      i.e., no generators, and that if multiple iterable arg/kwarg values are provided,
      they are all the same length.

      Args:
          *args: Variable length argument list to be passed directly to run().
          **kwargs: Arbitrary keyword arguments to be passed directly to run().
      Returns:
          int: Inferred batch size based on expandable iterables.


   .. py:method:: _validate_arg_and_verify_batch_size(val, current_batch_size)

      Check an arg value from args/kwargs. If we find that it's an expandable iterable, see
      if it conflicts with what we know about the inferred batch size so far.

      Args:
          val (any): Argument / keyword argument value being inspected.
          current_batch_size (None | int): Current inferred batch size from
              previous args/kwargs, or None if no inferences have been made on
              other expandable iterables yet.
      Returns:
          None | inferred batch size.


   .. py:method:: _build_args_for_default_run_with_batch(fixed_args, expanded_args, idx)
      :staticmethod:


      Build the non keyword arguments for run_batch's default implementation by expanding
      iterable args where possible, and grouping them with repeated noniterable arguments. The
      index corresponds to the current document under consideration.

      Args:
          fixed_args (dict): Noniterable args - common across all documents.
          expanded_args (dict): Iterable args - we'll need to index into this
              to get our doc arg.
          idx (int): Index of the document being considered.
      Returns:
          list: Args to be run for document [idx].


   .. py:method:: _build_kwargs_for_default_run_with_batch(fixed_kwargs, expanded_kwargs, idx)
      :staticmethod:


      Similar to `_build_args_for_default_run_with_batch`, but for kwargs. Note that we can just
      clone our fixed kwargs instead of cycling through them, because order doesn't matter here.

      Args:
          fixed_kwargs (dict): Noniterable valued kwargs - common across all
              documents.
          expanded_kwargs (dict): Iterable valued kwargs - we'll need to index
              into these to get our doc kwarg.
          idx (int): Index of the document being considered.
      Returns:
          dict: Kwargs to be run for document [idx].


   .. py:method:: _extract_gold_set(dataset)

      Method for extracting gold set from dataset. Implemented in subclass.

      Args:
          dataset (object): In-memory version of whatever is loaded from on-
              disk. May be json, txt, etc.

      Returns:
          list: List of labels in the format of the module_type that is being
              called.


   .. py:method:: _extract_pred_set(dataset, preprocess_func=None, **kwargs)

      Method for extracting pred set from dataset. Implemented in subclass.

      Args:
          dataset (object): In-memory version of whatever is loaded from on-
              disk. May be json, txt, etc.
          preprocess_func (method): Function used as proxy for any preliminary
              steps that need to be taken to run the model on the input text.
              This helper function ultimately leads to the input to this
              module and may involve executing other modules.
          **kwargs (dict): Optional keyword arguments for prediction set extraction.
      Returns:
          list: List of labels in the format of the module_type that is being
              called.


   .. py:method:: _load_evaluation_dataset(dataset_path)
      :staticmethod:


      Helper specifically for dataset loading.

      Args:
          dataset_path (str): Path to where the input 'gold set' dataset
              lives. Most often this is a `.json` file.

      Returns:
          object: list, dict, or other python object, depending on the input
              dataset_path extension. Currently only supports `.json` and uses
              fileio from toolkit.


   .. py:method:: _extract_gold_annotations(gold_set)
      :staticmethod:


      Extract the core list of annotations that is needed for quality evaluation

      Args:
          gold_set (list)
      Returns:
          gold_annotations: list


   .. py:method:: _extract_pred_annotations(pred_set)
      :staticmethod:


      Extract the core list of predictions that is needed for quality evaluation

      Args:
          pred_set (list)
      Returns:
          pred_annotations: list


   .. py:method:: _generate_report(report, gold_set)
      :staticmethod:


      Generate the quality report output.

      Args:
          report (dict)
          gold_set (list(dict))
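
The batching helpers above (`_is_expandable_iterable`, `_validate_and_extract_batch_size`, and the two `_build_*_for_default_run_with_batch` methods) can be illustrated with a minimal standalone sketch. This is not caikit's implementation: the function names are hypothetical, only strings are treated as expandable elements (data model objects are left out), empty sequences are treated as non-expandable, and the fixed / expanded positional args are assumed to be dicts keyed by positional index.

```python
from typing import Any, Optional


def is_expandable_iterable(arg: Any) -> bool:
    # Assumption of this sketch: only non-empty lists/tuples of strings expand.
    return (
        isinstance(arg, (list, tuple))
        and len(arg) > 0
        and all(isinstance(item, str) for item in arg)
    )


def verify_batch_size(val: Any, current: Optional[int]) -> Optional[int]:
    # Non-expandable values never change the inferred batch size.
    if not is_expandable_iterable(val):
        return current
    if current is not None and len(val) != current:
        raise ValueError(f"length {len(val)} conflicts with batch size {current}")
    return len(val)


def extract_batch_size(*args: Any, **kwargs: Any) -> int:
    size = None
    for val in (*args, *kwargs.values()):
        size = verify_batch_size(val, size)
    if size is None:
        raise ValueError("no expandable iterable with a well-defined length")
    return size


def build_args(fixed_args: dict, expanded_args: dict, idx: int) -> list:
    # Rebuild the positional argument list for document `idx`.
    total = len(fixed_args) + len(expanded_args)
    return [
        expanded_args[pos][idx] if pos in expanded_args else fixed_args[pos]
        for pos in range(total)
    ]


def build_kwargs(fixed_kwargs: dict, expanded_kwargs: dict, idx: int) -> dict:
    # Fixed kwargs are cloned as-is; expanded kwargs are indexed per document.
    kwargs = dict(fixed_kwargs)
    kwargs.update({key: vals[idx] for key, vals in expanded_kwargs.items()})
    return kwargs
```

Under these assumptions, a call with two documents and one scalar option infers a batch size of 2, and each per-document call receives one element from every expandable argument plus an unchanged copy of every fixed one.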


.. py:class:: ModuleConfig(config_dict)

   Bases: :py:obj:`aconfig.Config`


   Config object used by all modules for config loading, saving, etc.


   .. py:attribute:: reserved_keys
      :value: ['model_path']


   .. py:method:: load(model_path: Union[str, ModuleConfig]) -> ModuleConfig
      :classmethod:


      Load a new module configuration from a directory on disk.

      Args:
          model_path (Union[str, ModuleConfig]): Path to model directory. At
              the top level of directory is `config.yml` which holds info
              about the model. Note that the model_path here is assumed to be
              operating system correct as a consequence of the way this method
              is invoked by the model manager.

      Returns:
          model_config (ModuleConfig): Instantiated ModuleConfig for model
              given model_path.


   .. py:method:: save(model_path)

      Save this module configuration to a top-level `config.yml` file in the specified
      model path.

      Args:
          model_path (str): Path to model directory. The `config.yml` file will be written
              to this location.

      Notes:
          `model_path` must already exist!  This means you must create the directory outside of
          this routine.


.. py:class:: ModuleLoader(model_path: Union[str, caikit.core.modules.config.ModuleConfig])

   .. py:attribute:: MODULE_PATHS_KEY
      :value: 'module_paths'


   .. py:attribute:: config


   .. py:attribute:: model_path


   .. py:method:: load_arg(arg)

      Extract arg value from the loaded model's config


   .. py:method:: load_args(*args)

      Extract values from the loaded model's config


   .. py:method:: load_module(module_paths_key, load_singleton=False)

      Load a CaikitCore module from a module config.module_paths specification.

      Args:
          module_paths_key (str): key in `config.module_paths` looked at to
              load a module
          load_singleton (bool): singleton load flag to pass to individual
              module loads


   .. py:method:: load_module_list(module_paths_key)

      Load a list of CaikitCore modules from a workflow config.module_paths specification.

      Args:
          module_paths_key (str): key in `config.module_paths` looked at to
              load a list of modules

      Returns:
          list: list of loaded modules


.. py:class:: ModuleSaver(module: caikit.core.modules.base.ModuleBase, model_path, exist_ok=True)

   A module saver that provides common functionality used for saving modules and also a context
   manager that cleans up in case an error is encountered during the save process for a model_path
   that did not already exist.


   .. py:attribute:: SAVED_KEY_NAME
      :value: 'saved'


   .. py:attribute:: CREATED_KEY_NAME
      :value: 'created'


   .. py:attribute:: TRACKING_KEY_NAME
      :value: 'tracking_id'


   .. py:attribute:: MODULE_VERSION_KEY_NAME
      :value: 'version'


   .. py:attribute:: MODULE_ID_KEY_NAME
      :value: 'module_id'


   .. py:attribute:: MODULE_CLASS_KEY_NAME
      :value: 'module_class'


   .. py:attribute:: model_path
      :value: b'.'


   .. py:attribute:: exist_ok
      :value: True


   .. py:attribute:: config


   .. py:method:: add_dir(relative_path, base_relative_path='')

      Create a directory inside the `model_path` for this saver.

      Args:
          relative_path (str): A path relative to this saver's `model_path`
              denoting the directory to create.
          base_relative_path (str): A path, relative to this saver's
              `model_path`, in which `relative_path` will be created.

      Returns:
          str, str: A tuple containing both the `relative_path` and
              `absolute_path` to the directory created.

      Examples:
          >>> with ModuleSaver(module, '/path/to/model') as saver:
          ...     rel_path, abs_path = saver.add_dir('word_embeddings', 'model_data')
          >>> print(rel_path)
          model_data/word_embeddings
          >>> print(abs_path)
          /path/to/model/model_data/word_embeddings


   .. py:method:: copy_file(file_path, relative_path='')

      Copy an external file into a subdirectory of the `model_path` for this saver.

      Args:
          file_path (str): Absolute path to the external file to copy.
          relative_path (str): The relative path inside of `model_path` where
              the file will be copied to. If set to the empty string (default)
              then the file will be placed directly in the `model_path`
              directory.

      Returns:
          str, str: A tuple containing both the `relative_path` and
              `absolute_path` to the copied file.


   .. py:method:: save_object(obj, filename, serializer, relative_path='')

      Save a Python object using the provided ObjectSerializer.

      Args:
          obj (any): The Python object to save
          filename (str): The filename to use for the saved object
          serializer (ObjectSerializer): An ObjectSerializer instance (e.g.,
              YAMLSerializer) that should be used to serialize the object
          relative_path (str): The relative path inside of `model_path` where
              the object will be saved


   .. py:method:: update_config(additional_config)

      Add items to this saver's config dictionary.

      Args:
          additional_config (dict): A dictionary of config options to add to
              this saver's configuration.

      Notes:
          The behavior of this method matches `dict.update` and is equivalent to calling
          `saver.config.update`.  The `saver.config` dictionary may be accessed directly for
          more sophisticated manipulation of the configuration.


   .. py:method:: save_module(module, relative_path, **kwargs)

      Save a CaikitCore module within a workflow artifact and add a reference to the config.

      Args:
          module (caikit.core.ModuleBase): The CaikitCore module to save as
              part of this workflow
          relative_path (str): The relative path inside of `model_path` where
              the module will be saved
          **kwargs:  dict
              key-value pair of parameters to be passed to module.save


   .. py:method:: save_module_list(modules, config_key, **kwargs)

      Save a list of CaikitCore modules within a workflow artifact and add a reference to the
      config.

      Args:
          modules (dict{str -> caikit.core.ModuleBase}): A dict with module
              relative path as key and a CaikitCore module as value to save as
              part of this workflow
          config_key (str): The config key inside of `model_path` where the
              modules' relative paths will be referenced
          **kwargs:  dict
              key-value pair of parameters to be passed to module.save


      Returns:
          list_of_rel_path: list(str)
              List of relative paths where the modules are saved
          list_of_abs_path: list(str)
              List of absolute paths where the modules are saved


   .. py:method:: __enter__()

      Enter the module saver context.  This creates the `model_path` directory.  If this
      context successfully exits, then the model configuration and all files it contains will
      be written and saved to disk inside the `model_path` directory.

      If `exist_ok` is False, an exception will be raised before touching existing `model_path`
      files.

      If any uncaught exceptions are thrown inside this context, and `exist_ok` is False,
      then this new `model_path` will be removed. If `exist_ok` is True, the files will be kept
      and may include incomplete updates.


   .. py:method:: __exit__(exc_type, exc_val, exc_tb)

      Exit the module saver context. If this context successfully exits, then the model
      configuration and all files it contains will be written and saved to disk inside the
      `model_path` directory.

      If any uncaught exceptions are thrown inside this context, and `exist_ok` is False,
      then this new `model_path` will be removed. If `exist_ok` is True, the files will be kept
      and may include incomplete updates.
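
The `__enter__` / `__exit__` contract described above can be sketched with a small standalone context manager. This is an illustration only, not caikit's code: `SketchSaver` is a hypothetical class, and the config is written as flat `key: value` lines as a stand-in for real YAML serialization.

```python
import os
import shutil


class SketchSaver:
    """Hypothetical stand-in for the documented saver; not caikit's code."""

    def __init__(self, model_path, exist_ok=True):
        self.model_path = os.path.abspath(model_path)
        self.exist_ok = exist_ok
        self.config = {}

    def add_dir(self, relative_path, base_relative_path=""):
        # Create the directory under model_path and return both path forms.
        rel_path = os.path.join(base_relative_path, relative_path)
        abs_path = os.path.join(self.model_path, rel_path)
        os.makedirs(abs_path, exist_ok=True)
        return rel_path, abs_path

    def __enter__(self):
        # Raises FileExistsError when exist_ok is False and the path exists,
        # before any existing file is touched.
        os.makedirs(self.model_path, exist_ok=self.exist_ok)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None:
            # Only a freshly created model_path is cleaned up on failure.
            if not self.exist_ok:
                shutil.rmtree(self.model_path, ignore_errors=True)
            return False  # never swallow the exception
        config_path = os.path.join(self.model_path, "config.yml")
        with open(config_path, "w", encoding="utf8") as handle:
            for key, value in self.config.items():
                handle.write(f"{key}: {value}\n")  # simplified flat YAML
        return False
```

With `exist_ok=False`, an exception raised inside the `with` block leaves no partial directory behind; with `exist_ok=True`, whatever was written before the failure stays on disk.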


.. py:function:: module(id=None, name=None, version=None, task: Type[caikit.core.task.TaskBase] = None, tasks: Optional[List[Type[caikit.core.task.TaskBase]]] = None, backend_type='LOCAL', base_module: Union[str, Type[caikit.core.modules.base.ModuleBase]] = None, backend_config_override: Optional[Dict] = None)

   Apply this decorator to any class that should be treated as a caikit module
   (i.e., extends `caikit.core.ModuleBase`) and registered with caikit.core so that the library
   "knows" the class is a caikit module and is capable of loading instances of the module.

   Args:
       id:  str
           A UUID to use when registering this module with caikit.core
           Not required if based on another caikit module using `base_module`
       name:  str
           A human-readable name for the module
           Not required if based on another caikit module using `base_module`
       version:  str
           A SemVer for the module
           Not required if based on another caikit module using `base_module`
       task:  Type[TaskBase]
           An ML task class that this module is an implementation for
           Not required if based on another caikit module using `base_module`,
           or if multiple tasks are specified using `tasks`.
       tasks: Optional[List[Type[TaskBase]]]
           List of ML task classes that this module implements.
       backend_type: backend_type
           Associated backend type for the module.
           Default: `LOCAL`
       base_module: str | ModuleBase
           If this module is based on a different caikit module, provide name
           of the base module.
           Default: None
       backend_config_override: Dict
           Dictionary containing configuration required for the specific backend.
           Default: None

   Returns:
       A decorated version of the class to which it was applied, after registering the
       class as a valid module with caikit.core
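
The registration idea behind this decorator can be shown with a minimal standalone sketch. It does not use caikit and is not its implementation: `SKETCH_REGISTRY`, `sketch_module`, and the class attributes set on the decorated class are hypothetical names chosen for illustration, and task, backend, and `base_module` handling are left out.

```python
# Hypothetical registry mapping module id -> class.
SKETCH_REGISTRY = {}


def sketch_module(id, name, version):
    """Return a class decorator that records metadata and registers the class."""

    def decorator(cls):
        if id in SKETCH_REGISTRY:
            raise ValueError(f"duplicate module id: {id}")
        # Attribute names are assumptions of this sketch.
        cls.MODULE_ID = id
        cls.MODULE_NAME = name
        cls.MODULE_VERSION = version
        SKETCH_REGISTRY[id] = cls
        return cls  # the same class object, now registered

    return decorator


@sketch_module(
    id="b9d98408-84c2-488c-8385-9d698effe60b",
    name="Example",
    version="0.0.1",
)
class ExampleModule:
    def run(self, text: str) -> str:
        return text.upper()
```

Because the decorator returns the class it received, the decorated class can still be used directly, while a loader could look it up later by id in the registry.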


.. py:class:: TaskBase

   The TaskBase defines the interface for an abstract AI task

   An AI task is a logical function signature which, when implemented, performs
   a task in some AI domain. The key property of a task is that the set of
   required input argument types and the output value type are consistent
   across all implementations of the task.


   .. py:class:: InferenceMethodPtr

      Little container class that holds a method name and its flavor of streaming.
      i.e. the args to a `@TaskClass.taskmethod` decoration.


      .. py:attribute:: method_name
         :type:  str


      .. py:attribute:: input_streaming
         :type:  bool


      .. py:attribute:: output_streaming
         :type:  bool


      .. py:attribute:: context_arg
         :type:  Optional[str]


   .. py:attribute:: deferred_method_decorators
      :type:  Dict[Type[TaskBase], Dict[str, List[TaskBase]]]


   .. py:method:: taskmethod(input_streaming: bool = False, output_streaming: bool = False, context_arg: Optional[str] = None) -> Callable[[_InferenceMethodBaseT], _InferenceMethodBaseT]
      :classmethod:


      Decorates a module instancemethod and indicates whether the inputs and outputs should
      be handled as streams. This will trigger validation that the signature of this method
      is compatible with the task's definition of input and output types.

      The actual handling of validating the method and registering it is deferred until after
      the module class is created, which happens outside the context of this decoration.


   .. py:method:: deferred_method_decoration(module: Type)
      :classmethod:


      Runs the actual decoration logic that `taskmethod` would have run if the module class
      existed during its lifetime.

      Validates that all decorated methods match the task's API expectations, and stores the
      signatures on the module class for access later.


   .. py:method:: has_inference_method_decorators(module_class: Type) -> bool
      :classmethod:


      Utility that returns true iff a module has any `@TaskClass.taskmethod` decorations


   .. py:method:: validate_run_signature(signature: caikit.core.signature_parsing.CaikitMethodSignature, input_streaming: bool, output_streaming: bool) -> None
      :classmethod:


      Validates that the provided method signature meets the api constraints defined in this
      task, for the given streaming flavors.

      Raises:
          ValueError if no type annotations were provided on the method
          TypeError if the type annotations do not meet the task's api constraints


   .. py:method:: get_required_parameters(input_streaming: bool) -> Dict[str, Union[ValidInputTypes, Type[Iterable[ValidInputTypes]]]]
      :classmethod:


      Get the set of input types required by this task


   .. py:method:: get_output_type(output_streaming: bool) -> Type[caikit.core.data_model.base.DataBase]
      :classmethod:


      Get the output type for this task

      NOTE: This method is automatically configured by the @task decorator
          and should not be overwritten by child classes.


   .. py:method:: get_visibility() -> bool
      :classmethod:


      Get the visibility for this task.

      NOTE: defaults to True even if visibility wasn't provided


   .. py:method:: get_metadata() -> Dict[str, Any]
      :classmethod:


      Get any metadata defined for this task

      NOTE: defaults to an empty dict if one wasn't provided


   .. py:method:: _raise_on_wrong_output_type(output_type, module, output_streaming: bool)
      :classmethod:


   .. py:method:: _subclass_check(this_type, that_type)
      :staticmethod:


      Wrapper around issubclass that first checks if both args are classes.
      Returns True if the types are the same, or they are both classes and this_type
      is a subclass of that_type


   .. py:method:: _is_iterable_type(typ: Type) -> bool
      :staticmethod:


      Returns True if typ is an iterable type.
      Does not work for types like `list`, `tuple`, but we're interested here in `List[T]` etc.

      This is implemented this way to support older python versions where
      isinstance(typ, typing.Iterable) does not work
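
The two type helpers above (`_subclass_check` and `_is_iterable_type`) can be approximated in a few lines on current Python versions. The following is a sketch under that assumption, using `typing.get_origin` (Python 3.8+) rather than whatever compatibility approach the library itself uses; the function names are hypothetical.

```python
import collections.abc
import inspect
import typing
from typing import Iterable, List


def subclass_check(this_type, that_type) -> bool:
    # Identical types always pass, including generic aliases such as List[int].
    if this_type == that_type:
        return True
    # issubclass() raises TypeError on non-classes, so guard it first.
    return (
        inspect.isclass(this_type)
        and inspect.isclass(that_type)
        and issubclass(this_type, that_type)
    )


def is_iterable_type(typ) -> bool:
    # get_origin(List[int]) is list; get_origin(list) is None, so bare
    # builtins are rejected, matching the behavior described above.
    origin = typing.get_origin(typ)
    return inspect.isclass(origin) and issubclass(origin, collections.abc.Iterable)
```

For example, `is_iterable_type(List[int])` and `is_iterable_type(Iterable[str])` hold, while `is_iterable_type(list)` does not.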


.. py:function:: task(unary_parameters: Dict[str, ValidInputTypes] = None, streaming_parameters: Dict[str, Type[Iterable[ValidInputTypes]]] = None, unary_output_type: Type[caikit.core.data_model.base.DataBase] = None, streaming_output_type: Type[Iterable[Type[caikit.core.data_model.base.DataBase]]] = None, visible: bool = True, metadata: Optional[Dict[str, Any]] = None, **kwargs) -> Callable[[Type[TaskBase]], Type[TaskBase]]

   The decorator for AI Task classes.

   This defines an output data model type for the task, and a minimal set of required inputs
   that all public models implementing this task must accept.

   As an example, the `caikit.interfaces.nlp.SentimentTask` might look like::

       @task(
           unary_parameters={
               "raw_document": caikit.interfaces.nlp.RawDocument
           },
           streaming_parameters={
               "raw_documents": Iterable[caikit.interfaces.nlp.RawDocument]
           },
           unary_output_type=caikit.interfaces.nlp.SentimentPrediction,
           streaming_output_type=Iterable[caikit.interfaces.nlp.SentimentPrediction]
       )
       class SentimentTask(caikit.TaskBase):
           pass

   and a module that implements this task might have methods like::

       @module(id="b9d98408-84c2-488c-8385-9d698effe60b", task=SentimentTask)
       class MyModule(ModuleBase):

           @SentimentTask.taskmethod()
           def run(
               self,
               raw_document: caikit.interfaces.nlp.RawDocument,
               inference_mode: str = "fast",
           ) -> caikit.interfaces.nlp.SentimentPrediction:
               ...  # impl

           @SentimentTask.taskmethod(input_streaming=True, output_streaming=True)
           def run_bidi_stream(
               self,
               raw_documents: DataStream[caikit.interfaces.nlp.RawDocument],
           ) -> DataStream[caikit.interfaces.nlp.SentimentPrediction]:
               ...  # impl

   Note the run function may include other arguments beyond the minimal required inputs for
   the task.

   Args:
       unary_parameters (Dict[str, ValidInputTypes]): The required parameters that all modules'
           unary-input inference methods must contain. A dictionary of parameter name to parameter
           type, where the types can be in the set of:
               - Python primitives
               - Caikit data models
               - Iterable containers of the above
               - Caikit model references (maybe?)
       streaming_parameters: The same as unary_parameters, but for streaming-input inference
           methods. All types must be in the form `Iterable[T]`

       unary_output_type (Type[DataBase]): The unary output type of the task, which all modules'
           unary-output inference methods must return. This must be a caikit data model type.
       streaming_output_type (Type[Iterable[Type[DataBase]]]): The streaming output type of the
           task, which all modules' streaming-output inference methods must return. This must be
           in the form Iterable[T].

       visible (bool): If this task should be exposed to the end user in documentation or if
         it should only be used internally

       metadata (Optional[Dict[str, Any]]): Any additional metadata that should
         be included in the documentation for this task

   Returns:
       A decorator function for the task class, registering it with caikit's core registry of
           tasks.
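
The contract described here (required parameters must be present with the right types, extra parameters are allowed, and the return type must match) can be sketched with the standard `inspect` module. This is a simplified illustration, not caikit's validation logic: `validate_signature` is a hypothetical helper, types are compared by identity only, and plain Python types stand in for caikit data models.

```python
import inspect


def validate_signature(func, required_parameters, output_type):
    """Check `func` against a task-like contract; raise on mismatch."""
    signature = inspect.signature(func)
    annotated = {
        name: param.annotation
        for name, param in signature.parameters.items()
        if param.annotation is not inspect.Parameter.empty
    }
    if not annotated:
        raise ValueError(f"{func.__name__} has no type annotations")
    for name, expected in required_parameters.items():
        # Missing parameters and wrongly typed parameters both fail here.
        if annotated.get(name) is not expected:
            raise TypeError(f"parameter {name!r} must be annotated as {expected}")
    if signature.return_annotation is not output_type:
        raise TypeError(f"return type must be {output_type}")


def run(raw_document: str, inference_mode: str = "fast") -> float:
    # The extra `inference_mode` argument is allowed by the contract.
    return 0.5


validate_signature(run, {"raw_document": str}, float)
```

The sketch mirrors the documented error split: a method without annotations raises `ValueError`, and annotations that do not satisfy the contract raise `TypeError`.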


.. py:class:: EvalTypes(*args, **kwds)

   Bases: :py:obj:`enum.Enum`


   Enum that contains set of all possible evaluation types.


   .. py:attribute:: SINGLELABEL_MULTICLASS
      :value: 1


   .. py:attribute:: MULTILABEL_MULTICLASS
      :value: 2


   .. py:attribute:: MULTILABEL_MULTICLASS_HIERARCHICAL
      :value: 3


.. py:class:: F1Metrics

   .. py:attribute:: true_positive
      :type:  Optional[int]
      :value: None


   .. py:attribute:: false_positive
      :type:  Optional[int]
      :value: None


   .. py:attribute:: false_negative
      :type:  Optional[int]
      :value: None


   .. py:attribute:: precision
      :type:  Optional[float]
      :value: None


   .. py:attribute:: recall
      :type:  Optional[float]
      :value: None


   .. py:attribute:: f1
      :type:  Optional[float]
      :value: None


.. py:class:: F1MetricsContainer

   .. py:attribute:: per_class_confusion_matrix
      :type:  Dict[str, F1Metrics]


   .. py:attribute:: macro_metrics
      :type:  F1Metrics


   .. py:attribute:: micro_metrics
      :type:  F1Metrics


.. py:class:: QualityEvaluator(gold, pred)

   Class that holds all evaluation logic for now. May eventually be broken up into
   subclasses.


   .. py:attribute:: gold


   .. py:attribute:: pred


   .. py:method:: run(evaluation_type, find_label_func=None, find_label_data_func=None, detailed_metrics=False, labels=None, partial_match_metrics=False, max_hierarchy_levels=3)

      Main entry point for evaluation.

      Args:
          evaluation_type (str): Which type of evaluation to run. Only a few
              are currently supported.
          find_label_func: function to fetch labels from any one prediction, used in
              multiclass multilabel evaluation.
              eg: if a prediction is of form (token, label), this function should be
              able to tell us how to extract the class labels from the prediction, in
              this case return the second element of the tuple.
          find_label_data_func: function to fetch predictions that belongs to a certain label,
              used only in multiclass multilabel eval type, e.g., if predictions for a data
              example looks like [(tok1, labX), (tok2, labY), (tok3, labX)], then
              the function should be able to return all predictions with a given label - labX
              return should look like [(tok1, labX), (tok3, labX)]
          detailed_metrics: flag to indicate whether or not you want detailed metrics
                            (currently only for multiclass multilabel eval type)
                            Detailed metrics give us metrics for every example, and
                            metrics using a custom partial match function
          labels: list (Optional, defaults to None)
              Optional list of class labels to evaluate quality on. By default evaluation is done
              over all class labels. Using this, you can explicitly mention only a subset of
              labels to include in the quality evaluation.
          partial_match_metrics: flag to indicate whether or not you want partial match
                                 micro avg metrics.
                                 (currently only for multiclass multilabel eval type)
          max_hierarchy_levels (int): Used in hierarchical multilabel
              multiclass evaluation only. The number of levels in the
              hierarchy to run model evaluation on, in addition to complete
              matches.

      Returns:
          dict: Full results from evaluation on dataset and model.


   .. py:method:: singlelabel_multiclass_evaluation(labels=None) -> dict

      Obtain results of evaluation for a single-label, multi-class model.

      Args:
          Note: here class should be initialized with gold and pred in the following format
          self.gold (list): list of gold set labels for every example, where each example
              can have only one label eg: ['label1','label2', 'label3','label4']
          self.pred (list): Predicted-by-the-model set labels for every example.
          labels: list (Optional, defaults to None)
              Optional list of class labels to evaluate quality on. By default evaluation is done
              over all class labels. Using this, you can explicitly mention only a subset of
              labels to include in the quality evaluation.

      Returns:
          dict: Dictionary looks like: {
              'per_class_confusion_matrix': {'entity_type': {'true_positive': int ...}},
              'macro_precision': 0 <= float <= 1,
              'macro_recall': 0 <= float <= 1,
              'macro_f1': 0 <= float <= 1,
              'micro_precision': 0 <= float <= 1,
              'micro_recall': 0 <= float <= 1,
              'micro_f1': 0 <= float <= 1,
              'overall_tp': int,
              'overall_fp': int,
              'overall_fn': int
          }


   .. py:method:: multilabel_multiclass_evaluation(find_label_func, find_label_data_func, labels=None, detailed_metrics=False, partial_match_metrics=False, use_labels_for_matching=False) -> dict

      Obtain results of evaluation for a multi-label, multi-class model.

      Args:
          Note: here class should be initialized with gold and pred in the following format
          self.gold (list(list)): list of gold set labels for every example eg:
              [['label1','label2'], ['label1', 'label4']]
          self.pred (list(list)): Predicted-by-the-model set labels for every example.
          find_label_func: function to fetch labels from any one prediction
          find_label_data_func: function to fetch data that belongs to a certain class
          labels: list (Optional, defaults to None)
              Optional list of class labels to evaluate quality on. By default evaluation is done
              over all class labels. Using this, you can explicitly mention only a subset of
              labels to include in the quality evaluation.
          detailed_metrics: flag to indicate whether or not you want detailed metrics
                            Detailed metrics give us metrics for every example, and
                            metrics using a custom partial match function
          partial_match_metrics: flag to indicate whether or not you want partial match
                                 micro avg metrics.
          use_labels_for_matching (bool): Indicates whether or not we should
              use the output of find_label_func for metric computations, or
              the raw data tuples.

      Returns:
          dict: Dictionary looks like: {
              'per_class_confusion_matrix': {'entity_type': {'true_positive': int ...}},
              'macro_precision': 0 <= float <= 1,
              'macro_recall': 0 <= float <= 1,
              'macro_f1': 0 <= float <= 1,
              'micro_precision': 0 <= float <= 1,
              'micro_recall': 0 <= float <= 1,
              'micro_f1': 0 <= float <= 1,
              'detailed_metrics': {'exact_match_precision' ..., 'partial_match_precision'},
              'micro_precision_partial_match': 0 <= float <= 1,
              'micro_recall_partial_match': 0 <= float <= 1,
              'micro_f1_partial_match': 0 <= float <= 1
          }


   .. py:method:: multilabel_multiclass_hierarchical_evaluation(find_label_func_builder, find_label_data_func_builder, max_hierarchy_levels=3) -> dict

      Evaluate multilabel/multiclass over a hierarchy, e.g., for ESA categories. This method
      evaluates over a set number of hierarchy levels.

      Because each level in the hierarchy needs to be able to compare and extract differently,
      we use builder funcs that create the appropriate functions for a given level of the
      hierarchy.

      Args:
          find_label_func_builder (function): A function that takes in a level
              number (or None if full hierarchy) and returns a find_label_func
              for this level that can be passed to the multilabel multiclass
              evaluator.
          find_label_data_func_builder (function): A function that takes in a
              level number (or None if full hierarchy) and returns a
              find_label_data_func for this level that can be passed to the
              multilabel multiclass evaluator.
          max_hierarchy_levels (int): The number of levels to run in the
              hierarchy, in addition to complete match.
      Returns:
          dict: Dictionary, where each key is a level number, or 'FULL', and
              maps to the dict returned by multilabel_multiclass_evaluation
              for that level of the hierarchy.


   .. py:method:: calc_f1_score(gold, pred, match_fun=None)
      :staticmethod:


      Calculates F1 score.

      Args:
          gold (list): List of gold annotations
          pred (list): List of predictions
          match_fun: Function that finds the matches and returns tuple of matched gold, preds
      Returns:
          tuple: Precision, Recall, F1 score


   .. py:method:: find_partial_matches(groundtruth, prediction)
      :staticmethod:


      Find partial matches between predicted phrases and the ground truth.
      A partial match means a complete predicted phrase is a part of any ground truth phrase or
      a complete ground truth phrase is a part of any predicted phrase.
      Overlaps are not considered.

      Args:
          groundtruth (list): Groundtruth data
          prediction (list): Predictions returned by the model

      Returns:
          tuple: (gold_matched, pred_matched), where gold_matched is the set of gold
              annotations that were matched and pred_matched is the set of predictions
              that partially or fully matched the groundtruth.


   .. py:method:: calc_metrics_from_confusion_matrix(per_class_confusion_matrix: Dict[str, F1Metrics]) -> F1MetricsContainer
      :staticmethod:


      Function to calculate precision, recall, F1 metrics using a confusion matrix containing
      statistics per class label.

      Args:
          per_class_confusion_matrix (Dict[str, F1Metrics]): Dictionary of
               statistics per class label. Class labels are keys for the
               dictionary. For each class label, there should be an F1Metrics
               class object with values true_positive, false_positive, and
               false_negative representing the count of these per class. The
               dictionary looks like: per_class_confusion_matrix[label] =
               F1Metrics(true_positive = val 1, false_positive = val 2,
               false_negative = val 3)

      Returns:
          metrics_summary (F1MetricsContainer): An instance of the F1MetricsContainer
              dataclass containing a summary of F1 metrics.


   .. py:method:: convert_F1MetricsContainer_to_dict() -> dict

      Args:
          metrics_summary (F1MetricsContainer): An object of dataclass
               F1MetricsContainer

      Returns:
          dict
              Dictionary looks like: {
                  'per_class_confusion_matrix': {'entity_type': {'true_positive': int ...}},
                  'macro_precision': 0 <= float <= 1,
                  'macro_recall': 0 <= float <= 1,
                  'macro_f1': 0 <= float <= 1,
                  'micro_precision': 0 <= float <= 1,
                  'micro_recall': 0 <= float <= 1,
                  'micro_f1': 0 <= float <= 1,
                  'overall_tp': int,
                  'overall_fp': int,
                  'overall_fn': int
              }
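
The relationship between the per-class counts and the macro/micro keys listed above can be sketched as follows. This is a minimal illustration, not the library's implementation: ``Counts``, ``prf`` and ``summarize`` are hypothetical names, ``Counts`` merely stands in for ``F1Metrics``, and averaging per-class F1 for ``macro_f1`` is an assumed convention.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Counts:
    """Hypothetical stand-in for F1Metrics: raw counts for one class label."""

    true_positive: int = 0
    false_positive: int = 0
    false_negative: int = 0


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, f1), using 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def summarize(per_class: Dict[str, Counts]) -> dict:
    """Macro metrics average per-class scores; micro metrics pool the counts."""
    scores = [
        prf(c.true_positive, c.false_positive, c.false_negative)
        for c in per_class.values()
    ]
    n = len(scores) or 1  # avoid division by zero for an empty input
    tp = sum(c.true_positive for c in per_class.values())
    fp = sum(c.false_positive for c in per_class.values())
    fn = sum(c.false_negative for c in per_class.values())
    micro_p, micro_r, micro_f1 = prf(tp, fp, fn)
    return {
        "macro_precision": sum(s[0] for s in scores) / n,
        "macro_recall": sum(s[1] for s in scores) / n,
        "macro_f1": sum(s[2] for s in scores) / n,
        "micro_precision": micro_p,
        "micro_recall": micro_r,
        "micro_f1": micro_f1,
        "overall_tp": tp,
        "overall_fp": fp,
        "overall_fn": fn,
    }


summary = summarize({"PERSON": Counts(8, 2, 2), "ORG": Counts(1, 1, 3)})
```

Micro averaging weights every prediction equally, so frequent classes dominate; macro averaging weights every class equally, which makes weak performance on rare classes visible.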


.. py:function:: load_txt(filename)

   Load a string from a file with utf8 encoding.


.. py:function:: load_txt_lines(filename)

   Load a list of lines from a text file with utf8 encoding.


.. py:function:: save_txt(text, filename, mode='w')

   Write a string to a text file with utf8 encoding.


.. py:function:: load_binary(filename)

   Load a binary string from a file.


.. py:function:: save_binary(data, filename)

   Write a binary buffer to a file.


.. py:function:: load_csv(filename)

   Load a csv into a list-of-lists.


.. py:function:: save_csv(text_list, filename, mode='w')

   Write a list-of-lists to a csv file.


.. py:function:: load_dict_csv(filename)

   Load a csv into a list-of-dicts.


.. py:function:: save_dict_csv(dict_list, filename, mode='w')

   Write a list of dicts to a csv file.
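
A list-of-dicts CSV round trip of the kind ``load_dict_csv`` and ``save_dict_csv`` describe might look like the sketch below. It uses only the standard library; the ``*_sketch`` functions are hypothetical stand-ins, and taking the header from the first row's keys is an assumption, not documented behavior.

```python
import csv
import os
import tempfile


def save_dict_csv_sketch(dict_list, filename, mode="w"):
    """Write a list of dicts to a CSV file; the first row's keys form the header."""
    fieldnames = list(dict_list[0].keys())
    with open(filename, mode, encoding="utf8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(dict_list)


def load_dict_csv_sketch(filename):
    """Read a CSV file with a header row into a list of dicts."""
    with open(filename, encoding="utf8", newline="") as handle:
        return list(csv.DictReader(handle))


rows = [{"label": "PERSON", "count": "3"}, {"label": "ORG", "count": "1"}]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rows.csv")
    save_dict_csv_sketch(rows, path)
    loaded = load_dict_csv_sketch(path)
```

CSV carries no type information, so every value comes back as a string; numeric columns need an explicit conversion after loading.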


.. py:function:: load_json(filename)

   Load a json file into a dictionary.


.. py:function:: save_json(save_dict, filename, mode='w')

   Save a dictionary into a json file.
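
As a rough illustration of the save/load pairing, a JSON round trip with the same argument order as ``save_json(save_dict, filename, mode='w')`` could be sketched with the standard library. The ``*_sketch`` functions are hypothetical, and the utf8 encoding and indentation are assumptions rather than documented behavior.

```python
import json
import os
import tempfile


def save_json_sketch(save_dict, filename, mode="w"):
    """Write a dictionary to a JSON file."""
    with open(filename, mode, encoding="utf8") as handle:
        json.dump(save_dict, handle, indent=2)


def load_json_sketch(filename):
    """Read a JSON file back into a dictionary."""
    with open(filename, encoding="utf8") as handle:
        return json.load(handle)


config = {"module_id": "example", "labels": ["PERSON", "ORG"], "threshold": 0.5}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "config.json")
    save_json_sketch(config, path)
    restored = load_json_sketch(path)
```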


.. py:function:: load_yaml(filename)

   Load a yaml file into a dictionary.


.. py:function:: save_yaml(save_dict, filename, mode='w')

   Save a dictionary into a yaml file.


.. py:function:: load_pickle(filename)

   Load an object from a pickle file.


.. py:function:: save_pickle(obj, filename, mode='wb')

   Save an object to a pickle file.


.. py:function:: save_raw(save_content, filename, mode='w')

   Write the given raw string content to output file.


.. py:function:: compress(dir_path, output_path=None, extension='zip')

   Compress a given folder recursively to an archive with a given extension format

   Args:
       dir_path (str): Path of directory to compress
       output_path (str): (Optional) Output path where the archive is created.
           Defaults to current path + 'archive' + format extension
           >>> compress('.', 'my/path', 'tar')
           >>> # saves to 'my/path/archive.tar'

       extension (str): (Optional) One of zip/tar/gztar/bztar/xztar, depending on
           module availability. Defaults to zip

   Returns:
       str: Path to created archive
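
The extension values listed for ``compress`` coincide with the archive formats accepted by the standard library's ``shutil.make_archive``. Assuming the behavior described above, a comparable helper could be sketched as follows; ``compress_sketch`` is a hypothetical name and this is not necessarily how ``compress`` is implemented.

```python
import os
import shutil
import tempfile
import zipfile


def compress_sketch(dir_path, output_path=None, extension="zip"):
    """Archive dir_path recursively into <output_path>/archive.<extension>."""
    base_name = os.path.join(output_path or os.getcwd(), "archive")
    # make_archive appends the format-specific extension and returns the path.
    return shutil.make_archive(base_name, extension, root_dir=dir_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    source_dir = os.path.join(tmp_dir, "model")
    out_dir = os.path.join(tmp_dir, "out")
    os.makedirs(source_dir)
    os.makedirs(out_dir)
    with open(os.path.join(source_dir, "config.yml"), "w", encoding="utf8") as handle:
        handle.write("name: example\n")

    archive_path = compress_sketch(source_dir, out_dir, "zip")
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = archive.namelist()
```

Keeping the output directory outside the directory being compressed avoids the archive picking up a partial copy of itself.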


.. py:class:: ObjectSerializer

   Bases: :py:obj:`abc.ABC`


   Abstract class for serializing an object to disk.


   .. py:method:: serialize(obj, file_path)
      :abstractmethod:


      Serialize the provided object to the specified file path.

      Args:
          obj (object): The object to serialize
          file_path (str): Absolute path to which the object should be
              serialized
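
The abstract ``serialize(obj, file_path)`` contract can be illustrated with a small standalone sketch. ``SerializerSketch`` and ``JSONSerializerSketch`` are hypothetical classes that mirror the shape of ``ObjectSerializer`` and its subclasses; they are not the library's code.

```python
import abc
import json
import os
import tempfile


class SerializerSketch(abc.ABC):
    """Hypothetical base class mirroring the abstract serialize() contract."""

    @abc.abstractmethod
    def serialize(self, obj, file_path):
        """Serialize obj to file_path."""


class JSONSerializerSketch(SerializerSketch):
    """Concrete subclass that writes the object as JSON."""

    def serialize(self, obj, file_path):
        with open(file_path, "w", encoding="utf8") as handle:
            json.dump(obj, handle)


# The abstract base cannot be instantiated until serialize() is implemented.
try:
    SerializerSketch()
    abstract_is_instantiable = True
except TypeError:
    abstract_is_instantiable = False

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "obj.json")
    JSONSerializerSketch().serialize({"a": 1}, target)
    with open(target, encoding="utf8") as handle:
        written = json.load(handle)
```

Because every subclass shares one method signature, calling code can accept any serializer and remain unaware of the on-disk format.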


.. py:class:: JSONSerializer

   Bases: :py:obj:`ObjectSerializer`


   An ObjectSerializer for serializing to a JSON file.


   .. py:method:: serialize(obj, file_path)

      Serialize the provided object to a JSON file.

      Args:
          obj (object): The object to serialize
          file_path (str): Absolute path to which the object should be
              serialized


.. py:class:: TextSerializer

   Bases: :py:obj:`ObjectSerializer`


   An ObjectSerializer for serializing a Python list to a text file.


   .. py:method:: serialize(obj, file_path)

      Serialize the provided Python list to a text file.

      Args:
          obj (list(str)): The list to serialize
          file_path (str): Absolute path to which the object should be
              serialized


.. py:class:: YAMLSerializer

   Bases: :py:obj:`ObjectSerializer`


   An ObjectSerializer for serializing to a YAML file.


   .. py:method:: serialize(obj, file_path)

      Serialize the provided object to a YAML file.

      Args:
          obj (object): The object to serialize
          file_path (str): Absolute path to which the object should be
              serialized


.. py:class:: CSVSerializer

   Bases: :py:obj:`ObjectSerializer`


   An ObjectSerializer for serializing to a CSV file.


   .. py:method:: serialize(obj, file_path)

      Serialize the provided object to a CSV file.

      Args:
          obj (object): The object to serialize
          file_path (str): Absolute path to which the object should be
              serialized


.. py:class:: PickleSerializer

   Bases: :py:obj:`ObjectSerializer`


   An ObjectSerializer for pickling arbitrary Python objects.


   .. py:method:: serialize(obj, file_path)

      Serialize the provided object to a pickle file.

      Args:
          obj (any): The object to serialize
          file_path (str): Absolute path to which the object should be
              serialized


.. py:data:: MODEL_MANAGER

.. py:data:: extract

.. py:data:: load

.. py:data:: resolve_and_load

.. py:data:: train

.. py:data:: start_prediction_job

.. py:data:: get_model_future

.. py:data:: get_prediction_future