[ENH] 1. add load learnware in client. 2. add doc for client (tags/v0.3.2)
| @@ -25,6 +25,7 @@ Document Structure | |||
| Introduction <start/intro.rst> | |||
| Quick Start <start/quick.rst> | |||
| Use Api <start/client.rst> | |||
| Installation <start/install.rst> | |||
| Experiments and Examples <start/performance.rst> | |||
| @@ -0,0 +1,176 @@ | |||
| ============================================================ | |||
| Learnware Client | |||
| ============================================================ | |||
| Introduction | |||
| ==================== | |||
| ``Learnware Client`` is a Python API that provides a convenient interface for interacting with the official market. You can use the client to upload, download, and search for learnwares. | |||
| Installation | |||
| ==================== | |||
| ``Learnware Client`` is contained in the ``learnware`` package. You can install it using pip: | |||
| .. code-block:: bash | |||
| pip install learnware | |||
| Prepare access token | |||
| ==================== | |||
| Before using the ``Learnware Client``, you'll need to obtain a token from the `official website <https://www.lamda.nju.edu.cn/learnware/>`_. Log in to the website and open the "client token" tab in the user center. | |||
| Use Client | |||
| ============================ | |||
| Initialize a Learnware Client | |||
| ------------------------------- | |||
| .. code-block:: python | |||
| import learnware | |||
| from learnware.client import LearnwareClient | |||
| client = LearnwareClient() | |||
| # login to official market | |||
| client.login(email="your email", token="your token") | |||
| Upload Learnware | |||
| ------------------------------- | |||
| Before uploading a learnware, you'll need to prepare its semantic specification. You can create one with the helper method ``create_semantic_specification``. | |||
| .. code-block:: python | |||
| input_description = { | |||
| "Dimension": 16, | |||
| "Description": { | |||
| "0": "gender", | |||
| "1": "age", | |||
| "2": "f2", | |||
| "5": "f5" | |||
| } | |||
| } | |||
| output_description = { | |||
| "Dimension": 3, | |||
| "Description": { | |||
| "0": "the probability of being a cat", | |||
| "1": "the probability of being a dog", | |||
| "2": "the probability of being a bird" | |||
| } | |||
| } | |||
| semantic_spec = client.create_semantic_specification( | |||
| name="mylearnware1", | |||
| description="this is my learnware", | |||
| data_type="Table", | |||
| task_type="Classification", | |||
| library_type="Scikit-learn", | |||
| senarioes=["Business", "Financial"], | |||
| input_description=input_description, output_description=output_description) | |||
| # data_type, task_type, library_type and senarioes only accept predefined values; you can find the possible values in `learnware.C` | |||
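To make the shape of the result concrete, here is a minimal stand-alone sketch of the dictionary that ``create_semantic_specification`` assembles, following the method body added in this commit. The function name ``build_semantic_specification`` and the sample values are illustrative assumptions and are not part of the package.

```python
import json


def build_semantic_specification(name, description, data_type, task_type,
                                 library_type, scenarios,
                                 input_description, output_description):
    """Illustrative re-creation of the dictionary layout (not the package API)."""
    return {
        "Input": input_description,
        "Output": output_description,
        # Single-choice fields are wrapped in a one-element list
        "Data": {"Type": "Class", "Values": [data_type]},
        "Task": {"Type": "Class", "Values": [task_type]},
        "Library": {"Type": "Class", "Values": [library_type]},
        # Scenario is a multi-valued tag field
        "Scenario": {"Type": "Tag", "Values": scenarios},
        "Name": {"Type": "String", "Values": name},
        "Description": {"Type": "String", "Values": description},
    }


spec = build_semantic_specification(
    name="mylearnware1",
    description="this is my learnware",
    data_type="Table",
    task_type="Classification",
    library_type="Scikit-learn",
    scenarios=["Business", "Financial"],
    input_description={"Dimension": 16, "Description": {"0": "gender", "1": "age"}},
    output_description={"Dimension": 3, "Description": {"0": "cat", "1": "dog", "2": "bird"}},
)
print(json.dumps(spec, indent=2))
```

Printing the dictionary before uploading is a cheap way to check that every field holds the value you intended.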
| After defining the semantic specification, | |||
| you can upload your learnware using the ``upload_learnware`` method: | |||
| .. code-block:: python | |||
| learnware_id = client.upload_learnware( | |||
| semantic_specification=semantic_spec, | |||
| learnware_file="path to your learnware zipfile") | |||
| Here, ``learnware_file`` is the local path of your learnware zipfile. | |||
| Semantic Specification Search | |||
| ------------------------------- | |||
| You can search for learnwares in the official market using a semantic specification. The API returns all learnwares that match it. For example, the code below searches for learnwares with the ``Table`` data type: | |||
| .. code-block:: python | |||
| semantic_spec = client.create_semantic_specification( | |||
| name="", | |||
| description="", | |||
| data_type="Table", | |||
| task_type="", | |||
| library_type="", | |||
| senarioes=[], | |||
| input_description={}, output_description={}) | |||
| specification = learnware.specification.Specification() | |||
| specification.update_semantic_spec(semantic_spec) | |||
| learnware_list = client.search_learnware(specification) | |||
| Statistical Specification Search | |||
| --------------------------------- | |||
| You can also search for learnwares by providing a statistical specification, which is stored as a JSON file containing statistical information about your training data. For example, the code below searches for learnwares with an ``RKMEStatSpecification``: | |||
| .. code-block:: python | |||
| import os | |||
| import learnware | |||
| import learnware.specification as specification | |||
| # unzip_path is the folder that contains your "rkme.json" | |||
| user_spec = specification.rkme.RKMEStatSpecification() | |||
| user_spec.load(os.path.join(unzip_path, "rkme.json")) | |||
| # use a separate name so the imported module is not overwritten | |||
| user_info = specification.Specification() | |||
| user_info.update_stat_spec(user_spec) | |||
| learnware_list = client.search_learnware(user_info) | |||
| # you can view the scores of the searched learnwares | |||
| for item in learnware_list: | |||
|     print(f'learnware_id: {item["learnware_id"]}, score: {item["matching"]}') | |||
| Combine Semantic and Statistical Search | |||
| ---------------------------------------- | |||
| You can provide both a semantic and a statistical specification when searching. The engine first filters learnwares by the semantic specification and then searches the remaining ones by the statistical specification. For example, the code below searches for learnwares with the ``Table`` data type and an ``RKMEStatSpecification``: | |||
| .. code-block:: python | |||
| semantic_spec = client.create_semantic_specification( | |||
| name="", | |||
| description="", | |||
| data_type="Table", | |||
| task_type="", | |||
| library_type="", | |||
| senarioes=[], | |||
| input_description={}, output_description={}) | |||
| stat_spec = learnware.specification.rkme.RKMEStatSpecification() | |||
| stat_spec.load(os.path.join(unzip_path, "rkme.json")) | |||
| user_info = learnware.specification.Specification() | |||
| user_info.update_semantic_spec(semantic_spec) | |||
| user_info.update_stat_spec(stat_spec) | |||
| learnware_list = client.search_learnware(user_info) | |||
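The filter-then-rank order described above can be pictured with a toy, fully local sketch. Everything here is an assumption made for illustration: the ``CATALOGUE`` records, the precomputed ``score`` field, the rule that an empty ``data_type`` matches everything, and the way ``page_size`` and ``page_index`` slice the ranking. The real engine runs on the server and its matching logic is not shown in this commit.

```python
# Hypothetical catalogue entries; real market records are far richer.
CATALOGUE = [
    {"learnware_id": "a", "data_type": "Table", "score": 0.42},
    {"learnware_id": "b", "data_type": "Image", "score": 0.91},
    {"learnware_id": "c", "data_type": "Table", "score": 0.77},
]


def two_stage_search(catalogue, data_type, page_size=10, page_index=0):
    """Toy model: semantic filter first, then rank by a statistical score."""
    # Stage 1: keep entries whose data type matches (assumed: empty matches all)
    candidates = [e for e in catalogue
                  if not data_type or e["data_type"] == data_type]
    # Stage 2: order the survivors by the made-up matching score, best first
    ranked = sorted(candidates, key=lambda e: e["score"], reverse=True)
    # Paging: return one window of the ranking (assumed semantics)
    start = page_index * page_size
    return ranked[start:start + page_size]


results = two_stage_search(CATALOGUE, "Table")
result_ids = [e["learnware_id"] for e in results]
print(result_ids)
```

In this toy model the ``Image`` entry is discarded in the first stage even though it has the highest score, which is the practical consequence of filtering before ranking.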
| Download and Use Learnware | |||
| ------------------------------- | |||
| Once you have a learnware id, you can download the learnware, install its environment, and load it with the following code: | |||
| .. code-block:: python | |||
| client.download_learnware(learnware_id, zip_path) | |||
| client.install_environment(zip_path) | |||
| learnware = client.load_learnware(zip_path) | |||
| # you can now use the learnware to make predictions | |||
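Before calling ``install_environment``, it can help to know which dependency file a learnware zip carries. The sketch below mirrors the check order used by ``install_environment`` in this commit (``environment.yaml`` first, then ``requirements.txt``). The helper name ``detect_environment_file`` and the demo zip are hypothetical, and only the standard library is used.

```python
import os
import tempfile
import zipfile


def detect_environment_file(zip_path):
    """Return the dependency file found in a learnware zip, preferring conda."""
    with zipfile.ZipFile(zip_path, "r") as z_file:
        names = z_file.namelist()
    if "environment.yaml" in names:
        return "environment.yaml"
    if "requirements.txt" in names:
        return "requirements.txt"
    raise FileNotFoundError(
        "environment.yaml or requirements.txt not found in the learnware zip file.")


# Build a throwaway zip so the helper can be tried without a real learnware
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo_learnware.zip")
    with zipfile.ZipFile(demo_zip, "w") as z_file:
        z_file.writestr("requirements.txt", "numpy>=1.20\n")
    detected = detect_environment_file(demo_zip)

print(detected)
```

Only files at the top level of the archive are matched, because the comparison is against exact archive member names.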
| @@ -1,2 +1,2 @@ | |||
| from .learnware_client import LearnwareClient | |||
| from .learnware_client import LearnwareClient, SemanticSpecificationKey | |||
| @@ -1,11 +1,17 @@ | |||
| from ..specification import Specification | |||
| from ..config import C | |||
| from .. import learnware | |||
| from ..market.easy import EasyMarket | |||
| from . import package_utils | |||
| import requests | |||
| import json | |||
| from tqdm import tqdm | |||
| import hashlib | |||
| import os | |||
| import tempfile | |||
| import zipfile | |||
| import yaml | |||
| from enum import Enum | |||
| CHUNK_SIZE = 1024 * 1024 | |||
| @@ -39,6 +45,13 @@ def compute_file_hash(file_path): | |||
| return file_hash.hexdigest() | |||
| class SemanticSpecificationKey(Enum): | |||
| DATA_TYPE = "Data" | |||
| TASK_TYPE = "Task" | |||
| LIBRARY_TYPE = "Library" | |||
| SENARIOES = "Scenario" | |||
| pass | |||
| class LearnwareClient: | |||
| def __init__(self, host=None): | |||
| self.headers = None | |||
| @@ -52,14 +65,10 @@ class LearnwareClient: | |||
| self.chunk_size = 1024 * 1024 | |||
| pass | |||
| def login(self, email, password, hash_password=True): | |||
| url = f"{self.host}/auth/login" | |||
| if hash_password: | |||
| password = hashlib.md5(password.encode()).hexdigest() | |||
| pass | |||
| def login(self, email, token): | |||
| url = f"{self.host}/auth/login_by_token" | |||
| response = requests.post(url, json={'email': email, 'password': password}) | |||
| response = requests.post(url, json={'email': email, 'token': token}) | |||
| result = response.json() | |||
| if result['code'] != 0: | |||
| @@ -84,7 +93,7 @@ class LearnwareClient: | |||
| def upload_learnware(self, semantic_specification, learnware_file): | |||
| file_hash = compute_file_hash(learnware_file) | |||
| url_upload = f"{self.host}/storage/chunked_upload" | |||
| url_upload = f"{self.host}/user/chunked_upload" | |||
| num_chunks = os.path.getsize(learnware_file) // CHUNK_SIZE + 1 | |||
| bar = tqdm(total=num_chunks, desc="Uploading", unit="MB") | |||
| @@ -107,7 +116,7 @@ class LearnwareClient: | |||
| pass | |||
| bar.close() | |||
| url_add = f"{self.host}/storage/add_learnware_uploaded" | |||
| url_add = f"{self.host}/user/add_learnware_uploaded" | |||
| response = requests.post(url_add, json={ | |||
| "file_hash": file_hash, | |||
| @@ -159,13 +168,13 @@ class LearnwareClient: | |||
| return learnware_list | |||
| @require_login | |||
| def search_learnware(self, specification: Specification): | |||
| def search_learnware(self, specification: Specification, page_size=10, page_index=0): | |||
| url = f"{self.host}/engine/search_learnware" | |||
| stat_spec = specification.get_stat_spec() | |||
| if len(stat_spec) > 1: | |||
| raise Exception("statistical specification must have only one key.") | |||
| if len(stat_spec) == 1: | |||
| stat_spec = list(stat_spec.values())[0] | |||
| else: | |||
| @@ -195,7 +204,7 @@ class LearnwareClient: | |||
| response = requests.post( | |||
| url, files=files, | |||
| data={"semantic_specification": json.dumps(specification.get_semantic_spec())}, | |||
| data={"semantic_specification": json.dumps(specification.get_semantic_spec()), "limit": page_size, "page": page_index}, | |||
| headers=self.headers) | |||
| result = response.json() | |||
| @@ -226,4 +235,139 @@ class LearnwareClient: | |||
| if result['code'] != 0: | |||
| raise Exception('delete failed: ' + json.dumps(result)) | |||
| pass | |||
| def check_learnware(self, path, semantic_specification): | |||
| if os.path.isfile(path): | |||
| with tempfile.TemporaryDirectory() as tempdir: | |||
| with zipfile.ZipFile(path, "r") as z_file: | |||
| z_file.extractall(tempdir) | |||
| pass | |||
| return self.check_learnware_folder(tempdir, semantic_specification) | |||
| pass | |||
| else: | |||
| return self.check_learnware_folder(path, semantic_specification) | |||
| pass | |||
| pass | |||
| def check_learnware_folder(self, folder, semantic_specification): | |||
| learnware_obj = learnware.get_learnware_from_dirpath('test_id', semantic_specification, folder) | |||
| check_result = EasyMarket.check_learnware(learnware_obj) | |||
| if check_result == EasyMarket.USABLE_LEARWARE: | |||
| return True | |||
| else: | |||
| return False | |||
| pass | |||
| def create_semantic_specification( | |||
| self, name, description, data_type, task_type, library_type, senarioes, input_description, | |||
| output_description): | |||
| semantic_specification = dict() | |||
| semantic_specification["Input"] = input_description | |||
| semantic_specification["Output"] = output_description | |||
| semantic_specification["Data"] = {"Type": "Class", "Values": [data_type]} | |||
| semantic_specification["Task"] = {"Type": "Class", "Values": [task_type]} | |||
| semantic_specification["Library"] = {"Type": "Class", "Values": [library_type]} | |||
| semantic_specification["Scenario"] = {"Type": "Tag", "Values": senarioes} | |||
| semantic_specification["Name"] = {"Type": "String", "Values": name} | |||
| semantic_specification["Description"] = {"Type": "String", "Values": description} | |||
| return semantic_specification | |||
| def list_semantic_specification_values(self, key: SemanticSpecificationKey): | |||
| url = f"{self.host}/engine/semantic_specification" | |||
| response = requests.get(url, headers=self.headers) | |||
| result = response.json() | |||
| semantic_conf = result['data']['semantic_specification'] | |||
| return semantic_conf[key.value]['Values'] | |||
| def load_learnware(self, learnware_file: str, load_model: bool=True): | |||
| with tempfile.TemporaryDirectory(prefix='learnware_') as tempdir: | |||
| with zipfile.ZipFile(learnware_file, "r") as z_file: | |||
| z_file.extractall(tempdir) | |||
| pass | |||
| yaml_file = C.learnware_folder_config["yaml_file"] | |||
| with open(os.path.join(tempdir, yaml_file), "r") as fin: | |||
| learnware_info = yaml.safe_load(fin) | |||
| pass | |||
| learnware_id = learnware_info.get('id') | |||
| if learnware_id is None: | |||
| learnware_id = "test_id" | |||
| pass | |||
| semantic_specification = learnware_info.get('semantic_specification') | |||
| if semantic_specification is None: | |||
| semantic_specification = {} | |||
| pass | |||
| else: | |||
| semantic_file = semantic_specification.get('file_name') | |||
| with open(os.path.join(tempdir, semantic_file), "r") as fin: | |||
| semantic_specification = json.load(fin) | |||
| pass | |||
| pass | |||
| learnware_obj = learnware.get_learnware_from_dirpath(learnware_id, semantic_specification, tempdir) | |||
| if load_model: | |||
| learnware_obj.instantiate_model() | |||
| pass | |||
| return learnware_obj | |||
| pass | |||
| pass | |||
| def system(self, command): | |||
| retcd = os.system(command) | |||
| if retcd != 0: | |||
| raise RuntimeError(f"Command {command} failed with return code {retcd}") | |||
| pass | |||
| def install_environment(self, zip_path, conda_env=None): | |||
| '''install environment of a learnware | |||
| @param: zip_path: path of the learnware zip file | |||
| @param: conda_env: if it is not None, a new conda environment will be created with the given name | |||
| if it is None, use current environment | |||
| ''' | |||
| with tempfile.TemporaryDirectory(prefix='learnware_') as tempdir: | |||
| with zipfile.ZipFile(zip_path, "r") as z_file: | |||
| print(z_file.namelist()) | |||
| if 'environment.yaml' in z_file.namelist(): | |||
| z_file.extract('environment.yaml', tempdir) | |||
| yaml_path = os.path.join(tempdir, 'environment.yaml') | |||
| yaml_path_filter = os.path.join(tempdir, 'environment_filter.yaml') | |||
| package_utils.filter_nonexist_conda_packages_file(yaml_path, yaml_path_filter) | |||
| # create environment | |||
| if conda_env is not None: | |||
| self.system(f'conda env update --name {conda_env} --file {yaml_path_filter}') | |||
| pass | |||
| else: | |||
| self.system(f'conda env update --file {yaml_path_filter}') | |||
| pass | |||
| pass | |||
| elif 'requirements.txt' in z_file.namelist(): | |||
| z_file.extract('requirements.txt', tempdir) | |||
| requirements_path = os.path.join(tempdir, 'requirements.txt') | |||
| requirements_path_filter = os.path.join(tempdir, 'requirements_filter.txt') | |||
| package_utils.filter_nonexist_pip_packages_file(requirements_path, requirements_path_filter) | |||
| if conda_env is not None: | |||
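| self.system(f'conda create --name {conda_env}') | |||
| # run pip inside the newly created environment rather than the current one | |||
| self.system(f'conda run --name {conda_env} --no-capture-output python3 -m pip install -r {requirements_path_filter}') | |||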
| else: | |||
| self.system(f'python3 -m pip install -r {requirements_path_filter}') | |||
| pass | |||
| pass | |||
| else: | |||
| raise Exception("environment.yaml or requirements.txt not found in the learnware zip file.") | |||
| pass | |||
| pass | |||
| pass | |||
| pass | |||
| @@ -0,0 +1,184 @@ | |||
| from typing import List, Tuple | |||
| import subprocess | |||
| import yaml | |||
| import os | |||
| import time | |||
| def try_to_run(args, timeout=5, retry=5): | |||
| success = False | |||
| for i in range(retry): | |||
| try: | |||
| subprocess.check_call(args=args, timeout=timeout) | |||
| success = True | |||
| break | |||
| except subprocess.TimeoutExpired as e: | |||
| pass | |||
| pass | |||
| if not success: | |||
| raise subprocess.TimeoutExpired(args, timeout) | |||
| pass | |||
| def parse_pip_requirement(line: str): | |||
| '''parse pip requirement line to package name | |||
| ''' | |||
| line = line.strip() | |||
| if len(line) == 0: | |||
| return None | |||
| if line[0] in ('#', '-'): | |||
| return None | |||
| package_str = line | |||
| for split_ch in ('=', '>', '<', '!', '~', ' '): | |||
| split_ch_index = package_str.find(split_ch) | |||
| if split_ch_index != -1: | |||
| package_str = package_str[:split_ch_index] | |||
| pass | |||
| pass | |||
| return package_str | |||
| def read_pip_packages_from_requirements(requirements_file: str) -> Tuple[List[str], List[str]]: | |||
| '''read requirements.txt and parse it into package names and raw lines | |||
| ''' | |||
| packages = [] | |||
| lines = [] | |||
| with open(requirements_file, 'r') as fin: | |||
| for line in fin: | |||
| package_str = parse_pip_requirement(line) | |||
| packages.append(package_str) | |||
| lines.append(line) | |||
| pass | |||
| return packages, lines | |||
| def filter_nonexist_pip_packages(packages: list) -> Tuple[List[str], List[str]]: | |||
| '''filter non-exist pip requirements | |||
| Returns: | |||
| exist_packages: list of exist packages | |||
| nonexist_packages: list of non-exist packages | |||
| ''' | |||
| exist_packages = [] | |||
| nonexist_packages = [] | |||
| for package in packages: | |||
| try: | |||
| # os.system("python3 -m pip index versions {0}".format(package)) | |||
| print('check package existence: {0}'.format(package)) | |||
| try_to_run(args=["python3", "-m", "pip", "index", "versions", package], timeout=5) | |||
| exist_packages.append(package) | |||
| except Exception as e: | |||
| print(e) | |||
| nonexist_packages.append(package) | |||
| pass | |||
| pass | |||
| return exist_packages, nonexist_packages | |||
| def filter_nonexist_conda_packages(packages: list) -> Tuple[List[str], List[str]]: | |||
| '''filter non-exist conda requirements | |||
| Returns: | |||
| exist_packages: list of exist packages | |||
| nonexist_packages: list of non-exist packages | |||
| ''' | |||
| exist_packages = [] | |||
| nonexist_packages = [] | |||
| for package in packages: | |||
| try: | |||
| try_to_run(args=["conda", "search", package], timeout=5) | |||
| exist_packages.append(package) | |||
| except Exception as e: | |||
| nonexist_packages.append(package) | |||
| pass | |||
| pass | |||
| return exist_packages, nonexist_packages | |||
| def read_conda_packages_from_dict( | |||
| env_desc: dict) -> Tuple[List[str], List[str]]: | |||
| ''' | |||
| :param env_desc: dict of environment description | |||
| :return conda packages: list of conda packages | |||
| :return pip packages: list of pip packages | |||
| ''' | |||
| conda_packages = env_desc.get('dependencies') | |||
| if conda_packages is None: | |||
| conda_packages = [] | |||
| pip_packages = [] | |||
| pass | |||
| else: | |||
| pip_packages = [] | |||
| conda_packages_ = [] | |||
| for package in conda_packages: | |||
| if isinstance(package, dict) and 'pip' in package: | |||
| pip_packages = package['pip'] | |||
| pip_packages = [parse_pip_requirement(line) for line in pip_packages] | |||
| pass | |||
| elif isinstance(package, str): | |||
| conda_packages_.append(package) | |||
| pass | |||
| pass | |||
| conda_packages = conda_packages_ | |||
| pass | |||
| return conda_packages, pip_packages | |||
| pass | |||
| def filter_nonexist_conda_packages_file(yaml_file: str, output_yaml_file: str): | |||
| with open(yaml_file, 'r') as fin: | |||
| env_desc = yaml.safe_load(fin) | |||
| pass | |||
| conda_packages, pip_packages = read_conda_packages_from_dict(env_desc) | |||
| conda_packages, nonexist_conda_packages = filter_nonexist_conda_packages(conda_packages) | |||
| pip_packages, nonexist_pip_packages = filter_nonexist_pip_packages(pip_packages) | |||
| env_desc['dependencies'] = conda_packages | |||
| if len(pip_packages) > 0: | |||
| env_desc['dependencies'].append({'pip': pip_packages}) | |||
| pass | |||
| with open(output_yaml_file, 'w') as fout: | |||
| yaml.safe_dump(env_desc, fout) | |||
| pass | |||
| return conda_packages, pip_packages, nonexist_conda_packages, nonexist_pip_packages | |||
| pass | |||
| def filter_nonexist_pip_packages_file(requirements_file: str, output_file: str): | |||
| packages, lines = read_pip_packages_from_requirements(requirements_file) | |||
| exist_packages, nonexist_packages = filter_nonexist_pip_packages(packages) | |||
| exist_packages = set(exist_packages) | |||
| with open(output_file, 'w') as fout: | |||
| for package, line in zip(packages, lines): | |||
| if package is not None and package in exist_packages: | |||
| fout.write(line.rstrip('\n') + '\n') | |||
| pass | |||
| pass | |||
| pass | |||
| pass | |||
| print(f"exist packages: {exist_packages}") | |||
| return exist_packages, nonexist_packages | |||
| pass | |||
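As a quick illustration of what `parse_pip_requirement` above extracts, here is a stand-alone copy of its logic run on a few sample lines. The sample requirement strings are made up for the demonstration.

```python
def parse_pip_requirement(line):
    """Reduce one requirements.txt line to a bare package name, or None."""
    line = line.strip()
    # Blank lines, comments and pip options (e.g. "-r other.txt") are skipped
    if not line or line[0] in ("#", "-"):
        return None
    package_str = line
    # Cut at each version-specifier character in turn
    for split_ch in ("=", ">", "<", "!", "~", " "):
        idx = package_str.find(split_ch)
        if idx != -1:
            package_str = package_str[:idx]
    return package_str


examples = [
    "numpy>=1.20",
    "scikit-learn==1.0.2",
    "# a comment",
    "-r base.txt",
    "requests[security]~=2.31",
    "",
]
parsed = [parse_pip_requirement(e) for e in examples]
for raw, name in zip(examples, parsed):
    print(repr(raw), "->", repr(name))
```

Two edge cases follow from this logic: extras such as `[security]` stay attached to the name, and a line with an environment marker such as `pkg; python_version<'3.8'` is reduced to `pkg;` with the semicolon left in place.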
| @@ -140,7 +140,7 @@ _DEFAULT_CONFIG = { | |||
| }, | |||
| "database_url": f"sqlite:///{DATABASE_PATH}", | |||
| "max_reduced_set_size": 1310720, | |||
| "backend_host": "http://36.111.128.21:30008" | |||
| "backend_host": "http://www.lamda.nju.edu.cn/learnware/api" | |||
| } | |||
| C = Config(_DEFAULT_CONFIG) | |||