==========================================
Learnware Preparation and Submission
==========================================

This section explains how to submit your own learnware to the ``Learnware Market``.
It first describes the components of a valid learnware, then shows how to upload learnwares to, and remove them from, the ``Learnware Market``.

Prepare Learnware
====================

A valid learnware is packaged as a zip file containing four essential components.
The structure of a learnware zip file is described in detail below.

``__init__.py``
---------------

Within the ``Learnware Market``, every uploader must provide a unified set of interfaces for their model,
so that future users can use it easily.
The ``__init__.py`` file serves as the Python interface for your model's fitting, prediction, and fine-tuning processes.

For example, the code snippet below trains and saves an SVM model on the sklearn digits classification dataset:

.. code-block:: python

    import joblib
    from sklearn import svm
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    data_X, _, data_y, _ = train_test_split(X, y, test_size=0.3, shuffle=True)

    # input dimension: (64, ), output dimension: (10, )
    clf = svm.SVC(kernel="linear", probability=True)
    clf.fit(data_X, data_y)

    joblib.dump(clf, "svm.pkl")  # model is stored as file "svm.pkl"

The corresponding ``__init__.py`` for this SVM model should then be structured as follows:

.. code-block:: python

    import os
    import joblib
    import numpy as np
    from learnware.model import BaseModel


    class SVM(BaseModel):
        def __init__(self):
            super(SVM, self).__init__(input_shape=(64,), output_shape=(10,))
            # load the model file relative to this file, not the working directory
            dir_path = os.path.dirname(os.path.abspath(__file__))
            self.model = joblib.load(os.path.join(dir_path, "svm.pkl"))

        def fit(self, X: np.ndarray, y: np.ndarray):
            pass

        def predict(self, X: np.ndarray) -> np.ndarray:
            # class probabilities, one column per digit class
            return self.model.predict_proba(X)

        def finetune(self, X: np.ndarray, y: np.ndarray):
            pass

Please remember to specify the ``input_shape`` and ``output_shape`` corresponding to your model.
In our sklearn digits classification example, these are ``(64,)`` and ``(10,)`` respectively.

``stat.json``
-------------

To match users accurately and effectively with appropriate learnwares for their tasks, we require information about your training dataset.
Specifically, you must provide a statistical specification
stored as a JSON file, such as ``stat.json``, which contains the statistical information of the dataset.
This JSON file meets all our requirements regarding your training data, so you don't need to upload the original local data.

There are various methods to generate a statistical specification.
If you choose to use Reduced Kernel Mean Embedding (RKME) as your statistical specification,
the following code snippet shows how to construct and store the RKME of a dataset:

.. code-block:: python

    import learnware.specification as specification

    # generate rkme specification for digits dataset
    spec = specification.utils.generate_rkme_spec(X=data_X)
    spec.save("stat.json")

Notably, the RKME generation process runs entirely on your local machine, without any involvement of cloud services,
which protects the security and privacy of your original local data.

``learnware.yaml``
------------------

Additionally, you are asked to prepare a configuration file in YAML format.
The file should specify your model's class name, the type of statistical specification (e.g. Reduced Kernel Mean Embedding, ``RKMEStatSpecification``), and
the file name of your statistical specification file. The following ``learnware.yaml`` shows
how your learnware configuration file should be structured, based on the example above:

.. code-block:: yaml

    model:
      class_name: SVM
      kwargs: {}
    stat_specifications:
      - module_path: learnware.specification
        class_name: RKMEStatSpecification
        file_name: stat.json
        kwargs: {}

``environment.yaml`` or ``requirements.txt``
--------------------------------------------

To allow others to execute your learnware, you need to specify your model's dependencies.
You can do this by providing either an ``environment.yaml`` file or a ``requirements.txt`` file.

- ``environment.yaml`` for conda:

    If you provide an ``environment.yaml``, a new conda environment will be created based on this file
    when users install your learnware. You can generate this YAML file using the following command:

    .. code-block::

        conda env export | grep -v "^prefix: " > environment.yaml

- ``requirements.txt`` for pip:

    If you provide a ``requirements.txt``, the dependent packages will be installed using the ``-r`` option of pip.
    You can find more information about ``requirements.txt`` in the
    `pip documentation <https://pip.pypa.io/en/stable/user_guide/#requirements-files>`_.

We recommend using ``environment.yaml`` as it can help minimize conflicts between different packages.

.. note::
    Whether you choose to use ``environment.yaml`` or ``requirements.txt``,
    it's important to keep your dependencies as minimal as possible.
    This may involve manually opening the file and removing any unnecessary packages.
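
Trimming a dependency file by hand can be tedious when it was produced from a full environment. The sketch below shows one possible way to keep only the packages your model actually needs from a pip-style requirements list. The helper ``prune_requirements`` is hypothetical (it is not part of the ``learnware`` package), and the pinned versions are illustrative placeholders rather than recommended versions.

```python
import re


def prune_requirements(lines, keep):
    """Return only the requirement lines whose package name is listed in `keep`.

    Hypothetical helper for illustration; not part of the learnware package.
    """
    # pip treats "-" and "_" in package names as equivalent, and names are case-insensitive
    wanted = {name.lower().replace("_", "-") for name in keep}
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        # the package name is the leading run of name characters, before any version specifier
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", stripped)
        if match and match.group(0).lower().replace("_", "-") in wanted:
            kept.append(stripped)
    return kept


# Illustrative contents of a requirements file generated from a full environment
frozen = [
    "joblib==1.3.2",
    "numpy==1.24.4",
    "pandas==2.0.3",
    "scikit-learn==1.3.0",
    "# a comment",
]

minimal = prune_requirements(frozen, keep=["joblib", "numpy", "scikit_learn"])
print("\n".join(minimal))
```

The resulting lines can be written back to ``requirements.txt``. It is still worth installing them in a fresh environment to confirm that the model loads.
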

Upload Learnware
==================

After preparing the four required files described above,
you can bundle them into your own learnware zip file. Along with the generated semantic specification that
succinctly describes the features of your task and model (for more details, please refer to :ref:`semantic specification<components/spec:Semantic Specification>`),
you can upload your learnware to the ``Learnware Market`` with a single line of code:

.. code-block:: python

    import learnware
    from learnware.market import EasyMarket

    learnware.init()

    # EasyMarket: most basic set of functions in a Learnware Market
    easy_market = EasyMarket(market_id="demo", rebuild=True)

    # single line uploading
    easy_market.add_learnware(zip_path, semantic_spec)

Here, ``zip_path`` refers to the path of your learnware zip file.
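
The bundling step can be done with Python's standard ``zipfile`` module. The following is a minimal sketch, not the library's own tooling: the helper ``bundle_learnware`` and the archive name ``svm_learnware.zip`` are hypothetical, and placing all files at the root of the archive is an assumption. The demo uses placeholder files in a temporary directory so that it runs on its own.

```python
import os
import tempfile
import zipfile

# Files described in the "Prepare Learnware" section
REQUIRED_FILES = ["__init__.py", "stat.json", "learnware.yaml"]
DEPENDENCY_FILES = ["environment.yaml", "requirements.txt"]  # at least one is needed


def bundle_learnware(src_dir, zip_path, extra_files=()):
    """Zip the learnware files found in `src_dir` into `zip_path`.

    Hypothetical helper; `extra_files` lists model artifacts such as "svm.pkl".
    """
    names = set(os.listdir(src_dir))
    missing = [name for name in REQUIRED_FILES if name not in names]
    if missing:
        raise FileNotFoundError(f"missing required files: {missing}")
    deps = [name for name in DEPENDENCY_FILES if name in names]
    if not deps:
        raise FileNotFoundError("provide environment.yaml or requirements.txt")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in REQUIRED_FILES + deps + list(extra_files):
            # assumption: files are stored at the archive root
            zf.write(os.path.join(src_dir, name), arcname=name)
    return zip_path


# Demo with placeholder files standing in for a real learnware
with tempfile.TemporaryDirectory() as tmp:
    for name in REQUIRED_FILES + ["requirements.txt", "svm.pkl"]:
        with open(os.path.join(tmp, name), "w") as f:
            f.write("placeholder\n")
    zip_path = bundle_learnware(
        tmp, os.path.join(tmp, "svm_learnware.zip"), extra_files=["svm.pkl"]
    )
    with zipfile.ZipFile(zip_path) as zf:
        bundled = sorted(zf.namelist())

print(bundled)
```

Remember to include any file that ``__init__.py`` loads, such as ``svm.pkl`` in the example above, since the model cannot be restored without it.
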

Remove Learnware
==================

Administrators of the ``Learnware Market`` may need to remove learnwares that appear to have been uploaded with suspicious motives.
Once you have the necessary permissions and approvals, you can use the following code to remove a learnware
from the ``Learnware Market``:

.. code-block:: python

    easy_market.delete_learnware(learnware_id)

Here, ``learnware_id`` refers to the market ID of the learnware to be removed.
