.. _learnware:

==========================================
Learnware & Reuser
==========================================

Learnware and Reuser are related...

Concepts
===================
In the learnware paradigm, a learnware is a well-performing trained machine learning model paired with a specification. The specification allows the model to be adequately identified and reused according to the requirements of future users who know nothing about the learnware in advance. Specifications are introduced in `COMPONENTS: Specification <./spec.html>`_.

In our implementation, the class ``Learnware`` has four member variables:

- ``id``: The learnware id, which is generated by the market.
- ``model``: The model in the learnware, which can be a ``BaseModel`` or a dict containing the model name and path. When it is a dict, the function ``Learnware.instantiate_model`` transforms it into a ``BaseModel``. The function ``Learnware.predict`` uses the model to predict for an input ``X``. See more in `COMPONENTS: Model <./model.html>`_.
- ``specification``: The specification, which includes the semantic specification and the statistical specification.
- ``dirpath``: The path of the learnware directory.
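
The relationship between the four members can be illustrated with a minimal sketch. It is not the package's ``Learnware`` class: ``LearnwareSketch``, the dict keys ``class_name`` and ``module_path``, and the toy ``DoublingModel`` are all hypothetical names chosen for this example. It only shows one plausible way a dict description could be turned into a model object before prediction.

.. code-block:: python

    import importlib.util
    import tempfile
    from pathlib import Path


    class LearnwareSketch:
        """Hypothetical stand-in for illustration; not the package's Learnware class."""

        def __init__(self, learnware_id, model, specification, dirpath):
            self.id = learnware_id
            self.model = model  # a model object, or a dict describing where to load it from
            self.specification = specification
            self.dirpath = dirpath

        def instantiate_model(self):
            """Replace a dict description with a live model object (no-op otherwise)."""
            if not isinstance(self.model, dict):
                return
            # "module_path" and "class_name" are assumed key names for this sketch.
            module_file = Path(self.dirpath) / self.model["module_path"]
            module_spec = importlib.util.spec_from_file_location("sketch_model", module_file)
            module = importlib.util.module_from_spec(module_spec)
            module_spec.loader.exec_module(module)
            self.model = getattr(module, self.model["class_name"])()

        def predict(self, X):
            """Instantiate the model if needed, then delegate to its predict method."""
            self.instantiate_model()
            return self.model.predict(X)


    with tempfile.TemporaryDirectory() as dirpath:
        # Write a toy model file into the (temporary) learnware directory.
        Path(dirpath, "model.py").write_text(
            "class DoublingModel:\n"
            "    def predict(self, X):\n"
            "        return [2 * x for x in X]\n"
        )
        learnware = LearnwareSketch(
            learnware_id="demo-0",
            model={"class_name": "DoublingModel", "module_path": "model.py"},
            specification=None,
            dirpath=dirpath,
        )
        predictions = learnware.predict([1, 2, 3])

After the first call to ``predict``, ``learnware.model`` holds the instantiated object rather than the dict, so later calls skip the loading step.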


Learnware for Hetero Reuse (Feature Align + Hetero Map Learnware)
=======================================================================

All Reuse Methods
===========================

JobSelectorReuser
--------------------

The ``JobSelectorReuser`` is a class that inherits from the base reuse class ``BaseReuser``.
Its purpose is to create a job selector that identifies the optimal learnware for each data point in user data.
The class is initialized with three parameters:

- ``learnware_list``: A list of objects of type ``Learnware``. Each ``Learnware`` object should have an RKME specification.
- ``herding_num``: An optional integer that specifies the number of items to herd, which defaults to 1000 if not provided.
- ``use_herding``: A boolean flag indicating whether to use kernel herding.

The job selector is essentially a multi-class classifier :math:`g(\boldsymbol{x}):\mathcal{X}\rightarrow \mathcal{I}` with :math:`\mathcal{I}=\{1,\ldots, C\}`, where :math:`C` is the size of ``learnware_list``.
Given a testing sample :math:`\boldsymbol{x}`, the ``JobSelectorReuser`` predicts it by using the :math:`g(\boldsymbol{x})`-th learnware in ``learnware_list``.
If ``use_herding`` is set to false, the ``JobSelectorReuser`` trains the job selector on the data points in each learnware's RKME specification, labeled with the corresponding learnware index.
If ``use_herding`` is true, the algorithm estimates the mixture weight based on RKME specifications and raw user data, uses the weight to generate ``herding_num`` auxiliary data points mimicking the user distribution through the kernel herding method, and learns a job selector on these data.
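
The case without herding can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the ``JobSelectorReuser`` implementation: the ``reduced_sets`` arrays stand in for the data points of each RKME specification, the two lambda models stand in for learnwares, and a nearest-centroid rule is swapped in as the multi-class classifier because the text above does not say which classifier is used. Indices are 0-based here, whereas the formula above counts from 1.

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed stand-ins for the points stored in each learnware's RKME specification.
    reduced_sets = [
        rng.normal(loc=-3.0, scale=0.5, size=(20, 2)),  # learnware 0
        rng.normal(loc=3.0, scale=0.5, size=(20, 2)),   # learnware 1
    ]

    # Toy learnware models: each returns a constant prediction per sample.
    models = [
        lambda X: np.full(len(X), -1.0),
        lambda X: np.full(len(X), 1.0),
    ]

    # Training set for the job selector: every point is labeled with its learnware index.
    train_X = np.vstack(reduced_sets)
    train_y = np.concatenate(
        [np.full(len(points), index) for index, points in enumerate(reduced_sets)]
    )

    # Nearest-centroid classifier used as the job selector g(x) in this sketch.
    centroids = np.stack(
        [train_X[train_y == index].mean(axis=0) for index in range(len(reduced_sets))]
    )


    def job_selector(X):
        """Return, for each row of X, the index of the closest class centroid."""
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        return distances.argmin(axis=1)


    def predict(X):
        """Route each sample to the learnware chosen by the job selector."""
        selected = job_selector(X)
        predictions = np.empty(len(X))
        for index, model in enumerate(models):
            mask = selected == index
            if mask.any():
                predictions[mask] = model(X[mask])
        return predictions


    user_X = np.array([[-2.5, -3.1], [3.2, 2.8], [-3.4, -2.9]])
    selected_indices = job_selector(user_X)
    user_predictions = predict(user_X)

Each user sample is predicted only by the learnware whose specification it most resembles, which is the per-sample routing described above.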


AveragingReuser
------------------

The ``AveragingReuser`` is a class that inherits from the base reuse class ``BaseReuser``. It implements the average ensemble method, which predicts user data by averaging the outputs of the learnwares.
The class is initialized with two parameters:

- ``learnware_list``: A list of objects of type ``Learnware``.
- ``mode``: The mode of averaging learnware outputs, which can be set to "mean" or "vote" and defaults to "mean".

If ``mode`` is set to "mean", the ``AveragingReuser`` computes the mean of the learnwares' outputs to predict user data, which is commonly used in regression tasks.
If ``mode`` is set to "vote", the ``AveragingReuser`` computes the mean of the softmax of each learnware's output to predict the label probabilities of user data, which is commonly used in classification tasks.
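
The two modes can be sketched with NumPy as follows. This is a minimal illustration of the arithmetic described above, not the ``AveragingReuser`` code: the helper names ``softmax`` and ``average_outputs`` and the example arrays are assumptions made for this sketch, and the learnware outputs are passed in directly as arrays.

.. code-block:: python

    import numpy as np


    def softmax(logits):
        """Row-wise softmax, shifted by the row maximum for numerical stability."""
        shifted = logits - logits.max(axis=1, keepdims=True)
        exponentials = np.exp(shifted)
        return exponentials / exponentials.sum(axis=1, keepdims=True)


    def average_outputs(outputs, mode="mean"):
        """Combine per-learnware outputs according to the given mode."""
        if mode == "mean":
            # Plain average of the raw outputs (typical for regression).
            return np.stack(outputs).mean(axis=0)
        if mode == "vote":
            # Average of per-learnware class probabilities (typical for classification).
            return np.stack([softmax(output) for output in outputs]).mean(axis=0)
        raise ValueError(f"unknown mode: {mode!r}")


    # Regression: two learnwares, two samples each.
    regression_outputs = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
    mean_prediction = average_outputs(regression_outputs, mode="mean")

    # Classification: two learnwares, two samples, two classes (raw scores).
    classification_outputs = [
        np.array([[2.0, 0.0], [0.0, 0.0]]),
        np.array([[0.0, 2.0], [0.0, 4.0]]),
    ]
    vote_probabilities = average_outputs(classification_outputs, mode="vote")
    vote_labels = vote_probabilities.argmax(axis=1)

In "vote" mode each row of the result sums to one, so it can be read as a probability distribution over labels. In the first sample the two learnwares disagree symmetrically, so both classes receive probability 0.5.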