==========================================
Learnwares Reuse
==========================================

This part introduces two baseline methods for reusing a given list of learnwares, namely ``JobSelectorReuser`` and ``AveragingReuser``.
Instead of training a model from scratch, the user can easily reuse a list of learnwares (``List[Learnware]``) to predict the labels of their own data (``numpy.ndarray`` or ``torch.Tensor``).

To illustrate, we provide a code demonstration that obtains the user dataset using ``sklearn.datasets.load_digits``, where ``test_data`` represents the data that requires prediction.
Assuming that ``learnware_list`` is the list of learnwares searched by the learnware market based on user specifications, the user can reuse each learnware in the ``learnware_list`` through ``JobSelectorReuser`` or ``AveragingReuser`` to predict the label of ``test_data``, thereby avoiding training a model from scratch.

.. code-block:: python

    from sklearn.datasets import load_digits
    from learnware.learnware import JobSelectorReuser, AveragingReuser

    # Load user data
    X, y = load_digits(return_X_y=True)
    test_data = X

    # Based on user information, the learnware market returns a list of learnwares (learnware_list)
    # Use jobselector reuser to reuse the searched learnwares to make prediction
    reuse_job_selector = JobSelectorReuser(learnware_list=learnware_list)
    job_selector_predict_y = reuse_job_selector.predict(user_data=test_data)

    # Use averaging ensemble reuser to reuse the searched learnwares to make prediction
    reuse_ensemble = AveragingReuser(learnware_list=learnware_list)
    ensemble_predict_y = reuse_ensemble.predict(user_data=test_data)

JobSelectorReuser
====================

The ``JobSelectorReuser`` is a class that inherits from the base reuse class ``BaseReuser``.
Its purpose is to create a job selector that identifies the optimal learnware for each data point in user data.
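
The idea of routing each data point to a single model can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the library's implementation: the "learnwares" are plain Python functions, the selector is a nearest-centroid rule swapped in for whatever classifier ``JobSelectorReuser`` actually trains, and the helper names ``fit_centroid_selector``, ``select_learnware`` and ``predict_with_selector`` are hypothetical.

.. code-block:: python

    import numpy as np

    def fit_centroid_selector(representative_sets):
        """Compute one centroid per model from its representative points."""
        return np.stack([np.asarray(p).mean(axis=0) for p in representative_sets])

    def select_learnware(centroids, user_data):
        """Return, for each user sample, the index of the closest centroid."""
        # Squared distances with shape (n_samples, n_models)
        diffs = user_data[:, None, :] - centroids[None, :, :]
        return np.argmin((diffs ** 2).sum(axis=2), axis=1)

    def predict_with_selector(models, centroids, user_data):
        """Predict each sample with the model chosen by the selector."""
        chosen = select_learnware(centroids, user_data)
        preds = np.empty(len(user_data))
        for idx, model in enumerate(models):
            mask = chosen == idx
            if mask.any():
                preds[mask] = model(user_data[mask])
        return chosen, preds

    # Two hypothetical models standing in for learnwares
    models = [lambda x: x.sum(axis=1), lambda x: -x.sum(axis=1)]
    # Hypothetical representative points for each model's training distribution
    representative_sets = [
        np.array([[0.0, 0.0], [1.0, 1.0]]),
        np.array([[10.0, 10.0], [11.0, 11.0]]),
    ]
    centroids = fit_centroid_selector(representative_sets)

    user_data = np.array([[0.5, 0.4], [10.2, 10.9]])
    chosen, preds = predict_with_selector(models, centroids, user_data)

Here the first sample lies near the first model's points and the second near the second model's, so each sample is predicted by a different model.
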
The class is initialized with three parameters:

- ``learnware_list``: A list of objects of type ``Learnware``. Each ``Learnware`` object should have an RKME specification.
- ``herding_num``: An optional integer that specifies the number of items to herd, which defaults to 1000 if not provided.
- ``use_herding``: A boolean flag indicating whether to use kernel herding.

The job selector is essentially a multi-class classifier :math:`g(\boldsymbol{x}):\mathcal{X}\rightarrow \mathcal{I}` with :math:`\mathcal{I}=\{1,\ldots, C\}`, where :math:`C` is the size of ``learnware_list``.
Given a testing sample :math:`\boldsymbol{x}`, the ``JobSelectorReuser`` predicts it by using the :math:`g(\boldsymbol{x})`-th learnware in ``learnware_list``.

If ``use_herding`` is set to false, the ``JobSelectorReuser`` trains the job selector on the data points in each learnware's RKME specification, labeled with the index of the corresponding learnware.
If ``use_herding`` is true, the algorithm estimates the mixture weight based on the RKME specifications and the raw user data, uses the weight to generate ``herding_num`` auxiliary data points that mimic the user distribution through the kernel herding method, and learns a job selector on these data.

AveragingReuser
====================

The ``AveragingReuser`` is a class that inherits from the base reuse class ``BaseReuser`` and implements the average ensemble method, which predicts user data by averaging the outputs of all learnwares.

The class is initialized with two parameters:

- ``learnware_list``: A list of objects of type ``Learnware``.
- ``mode``: The mode of averaging learnware outputs, which can be set to "mean" or "vote" and defaults to "mean".

If ``mode`` is set to "mean", the ``AveragingReuser`` computes the mean of the learnwares' outputs to predict user data, which is commonly used in regression tasks.
If ``mode`` is set to "vote", the ``AveragingReuser`` applies softmax to each learnware's output and averages the results to estimate the probability of each label for the user data, which is commonly used in classification tasks.
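
The difference between the two modes can be illustrated with plain NumPy arrays standing in for learnware outputs. This is a minimal sketch of the averaging described above, not the code of ``AveragingReuser``; the helper names ``softmax`` and ``average_outputs`` and the sample arrays are assumptions made for the example.

.. code-block:: python

    import numpy as np

    def softmax(logits):
        """Numerically stable softmax over the last axis."""
        shifted = logits - logits.max(axis=-1, keepdims=True)
        exp = np.exp(shifted)
        return exp / exp.sum(axis=-1, keepdims=True)

    def average_outputs(outputs, mode="mean"):
        """Combine per-model outputs as described for "mean" and "vote"."""
        stacked = np.stack(outputs)  # shape: (n_models, n_samples, ...)
        if mode == "mean":
            return stacked.mean(axis=0)
        if mode == "vote":
            return softmax(stacked).mean(axis=0)
        raise ValueError(f"unknown mode: {mode!r}")

    # Regression: two hypothetical models, two samples each
    reg_outputs = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
    mean_pred = average_outputs(reg_outputs, mode="mean")

    # Classification: two hypothetical models, one sample, two classes (raw scores)
    clf_outputs = [np.array([[2.0, 0.0]]), np.array([[0.0, 1.0]])]
    vote_proba = average_outputs(clf_outputs, mode="vote")
    vote_label = vote_proba.argmax(axis=1)

In the classification case each row of ``vote_proba`` sums to one, so the final label can be read off with ``argmax``.
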