|
-
- Dataset & Data Structure
- ========================
-
- In this section, we will look at the datasets and data structures in ABL-Package.
-
- .. code:: python
-
- # Import necessary libraries and modules
- import torch
- from abl.structures import ListData
-
- Dataset
- -------
-
- ABL-Package expects user data to be organized as a tuple of the following three components:
-
- - ``X``: List[List[Any]]
-
- A list of sublists representing the input data. We refer to each sublist in ``X`` as an example, and each example may contain several instances.
-
- - ``gt_pseudo_label``: List[List[Any]], optional
-
- A list of sublists, each representing one ground truth pseudo-label example. Each example consists of the ground truth pseudo-labels for each **instance** within the corresponding example of ``X``.
-
- .. note::
-
- ``gt_pseudo_label`` is used only to evaluate the performance of the learning part, not to train the model. If the instances in the dataset have no ground truth pseudo-labels, ``gt_pseudo_label`` can be ``None``.
-
- - ``Y``: List[Any]
-
- A list representing the ground truth reasoning result for each **example** in ``X``.
-
-
- .. warning::
-
- The lengths of ``X``, ``gt_pseudo_label`` (if not ``None``), and ``Y`` should be the same. Also, each sublist in ``gt_pseudo_label`` should have the same length as the corresponding sublist in ``X``.
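-
- The consistency rules above can be checked before training. The following is a minimal sketch, not part of ABL-Package: ``check_dataset`` is a hypothetical helper name, and the placeholder strings merely stand in for real instances.

```python
from typing import Any, Optional, Sequence


def check_dataset(
    X: Sequence[Sequence[Any]],
    gt_pseudo_label: Optional[Sequence[Sequence[Any]]],
    Y: Sequence[Any],
) -> None:
    """Hypothetical helper: raise ValueError if the data tuple is inconsistent."""
    # X and Y must contain the same number of examples.
    if len(X) != len(Y):
        raise ValueError(f"X has {len(X)} examples but Y has {len(Y)}")

    # gt_pseudo_label is optional, so there is nothing more to check if it is None.
    if gt_pseudo_label is None:
        return

    if len(gt_pseudo_label) != len(X):
        raise ValueError(
            f"gt_pseudo_label has {len(gt_pseudo_label)} examples but X has {len(X)}"
        )

    # Each pseudo-label sublist must match its example instance by instance.
    for i, (example, labels) in enumerate(zip(X, gt_pseudo_label)):
        if len(example) != len(labels):
            raise ValueError(
                f"example {i}: {len(example)} instances but {len(labels)} pseudo-labels"
            )


# Placeholder strings stand in for real instances (e.g. images).
X = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]
gt_pseudo_label = [[1, 2, 3], [4, 5, 6]]
Y = [1, 2]

check_dataset(X, gt_pseudo_label, Y)  # consistent, labeled data
check_dataset(X, None, Y)             # gt_pseudo_label may be None
```

- Raising an error early, rather than silently truncating with ``zip``, makes a mismatch visible before it reaches the learning or reasoning part.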
-
- As an illustration, in the MNIST Addition example, the data used for training are organized as follows:
-
- .. image:: ../img/Datasets_1.png
- :width: 350px
- :align: center
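-
- The same organization can be sketched in code. This is an illustrative assumption rather than a copy of the figure: the placeholder strings stand in for MNIST images, the values are made up, and ``train_data`` is just a name chosen here. Each example holds two digit images, its pseudo-labels are the two digits, and its reasoning result is their sum.

```python
# Placeholder strings stand in for MNIST digit images (illustrative values only).
X = [
    ["image_of_7", "image_of_5"],  # example 0: two instances
    ["image_of_8", "image_of_0"],  # example 1: two instances
]

# One ground truth pseudo-label (the digit) per instance.
gt_pseudo_label = [[7, 5], [8, 0]]

# One reasoning result (the sum of the two digits) per example.
Y = [12, 8]

# The three components together form the data tuple.
train_data = (X, gt_pseudo_label, Y)

# Sanity check: every example has one pseudo-label per instance,
# and the pseudo-labels of each example add up to its reasoning result.
for example, labels, y in zip(*train_data):
    assert len(example) == len(labels)
    assert sum(labels) == y
```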
-
- Data Structure
- --------------
-
- In Abductive Learning, various types of data are involved in the training and testing process, such as raw data, pseudo-labels, indices of pseudo-labels, and abduced pseudo-labels. To improve stability and versatility, ABL-Package uses `abstract data interfaces <../API/abl.structures.html>`_ to encapsulate these data during the implementation of the model.
-
- One of the most commonly used abstract data interfaces is ``ListData``. Besides organizing data into a tuple, we can also prepare data in the form of this data interface.
-
- .. code-block:: python
-
- # Prepare data: two examples, each containing three instances (28x28 tensors)
- X = [list(torch.randn(3, 28, 28)), list(torch.randn(3, 28, 28))]
- gt_pseudo_label = [[1, 2, 3], [4, 5, 6]]  # one pseudo-label per instance
- Y = [1, 2]  # one reasoning result per example
-
- # Convert the data into ListData
- data = ListData(X=X, Y=Y, gt_pseudo_label=gt_pseudo_label)
-
- # Get data by attribute access
- X = data.X
- Y = data.Y
- gt_pseudo_label = data.gt_pseudo_label
-
- # Set data by attribute assignment
- data.X = X
- data.Y = Y
- data.gt_pseudo_label = gt_pseudo_label
|