diff --git a/docs/components/learnware.rst b/docs/components/learnware.rst
index b931bf0..79057b9 100644
--- a/docs/components/learnware.rst
+++ b/docs/components/learnware.rst
@@ -1,4 +1,5 @@
 .. _learnware:
+
 ==========================================
 Learnware & Reuser
 ==========================================
@@ -7,51 +8,17 @@ Learnware and Reuser are related...
 
 Concepts
 ===================
-The learnware paradiam, first introduced by Zhi-Hua Zhou, is defined as a proficiently trained machine learning model accompanied by a specification that allows future users with no prior knowledge of the learnware to identify and reuse it according to their needs.
-
-Developers or owners of trained machine learning models can voluntarily submit their models to a learnware marketplace. If the marketplace accepts the model, it assigns a specification to the model and makes it available in the marketplace.
-
-Utilizing Learnware in Practice
--------------------------------
-
-With a learnware marketplace in place, users can tackle machine learning tasks without having to create models from scratch.
-
-Addressing Concerns with Learnware
-----------------------------------
-
-The learnware approach aims to address several challenges:
-
-
-+------------------------+----------------------------------------------------------------------------------------+
-| Concern                | Solution                                                                               |
-+========================+========================================================================================+
-| Limited training data  | Use existing high-quality learnware and require only a small amount of data for        |
-|                        | adaptation or refinement.                                                              |
-+------------------------+----------------------------------------------------------------------------------------+
-| Lack of training skills| Leverage existing learnware instead of building a model from scratch.                  |
-+------------------------+----------------------------------------------------------------------------------------+
-| Catastrophic forgetting| Retain old knowledge in the marketplace as accepted learnware remain available.        |
-+------------------------+----------------------------------------------------------------------------------------+
-| Continual learning     | Facilitate continuous and lifelong learning with the constant influx of high-quality   |
-|                        | learnware, enriching the knowledge base.                                               |
-+------------------------+----------------------------------------------------------------------------------------+
-| Data privacy and       | Ensure data privacy and proprietary protection by having developers only submit        |
-| proprietary concerns   | models, not their data.                                                                |
-+------------------------+----------------------------------------------------------------------------------------+
-| Unplanned tasks        | Ensure the availability of helpful learnware for various tasks, unless entirely new    |
-|                        | to all legal developers.                                                               |
-+------------------------+----------------------------------------------------------------------------------------+
-| Carbon emissions       | Reduce the need to train numerous large models by assembling smaller models that       |
-|                        | provide satisfactory performance.                                                      |
-+------------------------+----------------------------------------------------------------------------------------+
+In the learnware paradigm, a learnware is a well-performing trained machine learning model with a specification that allows it to be adequately identified and reused according to the requirements of future users who know nothing about the learnware in advance. Specifications are introduced in `COMPONENTS: Specification <./spec.html>`_.
 
-Future Work and Progress
-------------------------
+In our implementation, the class ``Learnware`` has 4 member variables:
 
-Despite the promising potential of the learnware proposal, much work remains to bring it to fruition. The following sections will discuss some of the progress made thus far.
+- ``id``: The learnware id generated by the market.
+- ``model``: The model in the learnware, which can be a ``BaseModel`` or a dict containing the model name and path. When it is a dict, the function ``Learnware.instantiate_model`` is used to transform it into a ``BaseModel``. The function ``Learnware.predict`` uses the model to predict for an input ``X``. See more in `COMPONENTS: Model <./model.html>`_.
+- ``specification``: The specification, including the semantic specification and the statistical specification.
+- ``dirpath``: The path of the learnware directory.
 
-Learnware for Hetero Reuse (Feature Aligh + Hetero Map Learnware)
+Learnware for Hetero Reuse (Feature Align + Hetero Map Learnware)
 =======================================================================
 
 All Reuse Methods
diff --git a/docs/workflows/search.rst b/docs/workflows/search.rst
index d0a314c..0111043 100644
--- a/docs/workflows/search.rst
+++ b/docs/workflows/search.rst
@@ -15,7 +15,7 @@ The homogeneous search of helpful learnwares can be divided into two stages: sem
 User information
 -------------------------------
 ``BaseUserInfo`` is a ``Python API`` for users to provide enough information to identify helpful learnwares.
-When initializing ``BaseUserInfo``, three optional information can be provided: ``id``, ``semantic_spec`` and ``stat_info``. The generation of these specifications is seen in `Prepare Learnware <. _submit>`_.
+When initializing ``BaseUserInfo``, three optional pieces of information can be provided: ``id``, ``semantic_spec`` and ``stat_info``. The generation of these specifications is described in `Prepare Learnware <./upload.html>`_.
 
 
 
@@ -28,12 +28,11 @@ identifying potentially helpful leaarnwares whose models solve tasks similar to
 
 In these two searchers, each learnware in the ``learnware_list`` is compared with ``user_info`` according to their ``semantic_spec``, and added to the search result if matched. Two ``semantic_spec`` are matched when all the keys are matched or empty in ``user_info``. Different keys have different matching rules. Their ``__call__`` functions are the same:
 
-- `__call__(self, learnware_list: List[Learnware], user_info: BaseUserInfo)-> SearchResults`
-    - For keys ``Data``, ``Task``, ``Library`` and ``license``, two ``semantic_spec`` keys are matched only if these values(only one value for each key) of learnware ``semantic_spec`` exists in values(may be muliple values for one key) of user ``semantic_spec``.
-    - For the key ``Scenario``, two ``semantic_spec`` keys are matched if their values have nonempty intersections.
-    - For keys ``Name`` and ``Description``, the values are strings and case is ignored;
-        - In ``EasyExactSemanticSearcher``, two ``semantic_spec`` keys are matched if these values of learnware ``semantic_spec`` is a substring of user ``semantic_spec``.
-        - In ``EasyFuzzSemanticSearcher``, first the exact semantic searcher is conducted like ``EasyExactSemanticSearcher``. If the result is empty, the fuzz semantic searcher is activated: the ``learnware_list`` is sorted according to the fuzz score function ``fuzz.partial_ratio`` in ``rapidfuzz``.
+- **EasyExactSemanticSearcher/EasyFuzzSemanticSearcher.__call__(self, learnware_list: List[Learnware], user_info: BaseUserInfo) -> SearchResults**
+
+  - For keys ``Data``, ``Task``, ``Library`` and ``license``, two ``semantic_spec`` keys are matched only if the value (only one value for each key) of the learnware ``semantic_spec`` exists in the values (possibly multiple values for one key) of the user ``semantic_spec``.
+  - For the key ``Scenario``, two ``semantic_spec`` keys are matched if their values have a nonempty intersection.
+  - For keys ``Name`` and ``Description``, the values are strings and case is ignored. In ``EasyExactSemanticSearcher``, two ``semantic_spec`` keys are matched if the value of the learnware ``semantic_spec`` is a substring of the user ``semantic_spec``. In ``EasyFuzzSemanticSearcher``, the exact semantic searcher is conducted first, as in ``EasyExactSemanticSearcher``. If the result is empty, the fuzz semantic searcher is activated: the ``learnware_list`` is sorted according to the fuzz score function ``fuzz.partial_ratio`` in ``rapidfuzz``.
 
 The results are stored in ``single_results`` of the returned ``SearchResults``.
 
@@ -44,19 +43,21 @@ Statistical Specification Search
 
 If you choose to provide your own statistical specification ``stat_info``, the Learnware Market can perform a more accurate learnware selection using ``EasyStatSearcher``.
 
-- `__call__(self, learnware_list: List[Learnware], user_info: BaseUserInfo, max_search_num: int = 5, search_method: str = "greedy",) -> SearchResults`
-    - It searches for helpful learnwares from ``learnware_list`` based on the ``stat_info`` in ``user_info``.
-    - The result ``SingleSearchItem`` and ``MultipleSearchItem`` are both stored in ``SearchResults``. In ``SingleSearchItem``, it searches for single learnwares that could solve the user task; scores are also provided to represent the fitness of each single learnware and user task. In ``MultipleSearchItem``, it searches for a mixture of learnwares that could solve the user task better; the mixture learnware list and a score for the mixture is returned.
-    - The parameter ``search_method`` provides two choice of search strategies for mixture learnwares: ``greedy`` and ``auto``. For the search method ``greedy``, each time it chooses a learnware to make their mixture closer to the user's ``stat_info``; for the search method ``auto``, it directly calculates a best mixture weight for the ``learnware_list``.
-    - For single learnware search, we only return the learnwares with score larger than 0.6; For multiple learnware search, the parameter ``max_search_num`` specifies the maximum length of the returned mixture learnware list.
+- **EasyStatSearcher.__call__(self, learnware_list: List[Learnware], user_info: BaseUserInfo, max_search_num: int = 5, search_method: str = "greedy",) -> SearchResults**
+
+  - It searches for helpful learnwares from ``learnware_list`` based on the ``stat_info`` in ``user_info``.
+  - The results ``SingleSearchItem`` and ``MultipleSearchItem`` are both stored in ``SearchResults``. For ``SingleSearchItem``, it searches for single learnwares that could solve the user task; a score is also provided to represent how well each single learnware fits the user task. For ``MultipleSearchItem``, it searches for a mixture of learnwares that could solve the user task better; the mixture learnware list and a score for the mixture are returned.
+  - The parameter ``search_method`` provides two choices of search strategy for mixture learnwares: ``greedy`` and ``auto``. With ``greedy``, each step chooses a learnware that brings the mixture closer to the user's ``stat_info``; with ``auto``, it directly calculates the best mixture weight for the ``learnware_list``.
+  - For single learnware search, only learnwares with a score larger than 0.6 are returned; for multiple learnware search, the parameter ``max_search_num`` specifies the maximum length of the returned mixture learnware list.
 
 Semantic and Statistical Specification Search
 -------------------------------------------------
 The semantic specification search and statistical specification search have been integrated into the same interface ``EasySearcher``.
-- `__call__(self, user_info: BaseUserInfo, check_status: int = None, max_search_num: int = 5, search_method: str = "greedy",) -> SearchResults`
-    - It conducts the semantic seacher ``EasyFuzzsematicSearcher`` on all the learnwares from the ``organizer`` with the same ``check_status`` (All learnwares if ``check_status`` is None). If the result is not empty and the ``stat_info`` is provided in ``user_info``, then it conducts ``EasyStatSearcher``, and return the ``SearchResults``.
+- **EasySearcher.__call__(self, user_info: BaseUserInfo, check_status: int = None, max_search_num: int = 5, search_method: str = "greedy",) -> SearchResults**
+
+  - It conducts the semantic searcher ``EasyFuzzSemanticSearcher`` on all the learnwares from the ``organizer`` with the same ``check_status`` (all learnwares if ``check_status`` is ``None``). If the result is not empty and ``stat_info`` is provided in ``user_info``, it then conducts ``EasyStatSearcher`` and returns the ``SearchResults``.
 
 Hetero Search
 ======================
\ No newline at end of file