
[DOC] change structure

pull/1/head
troyyyyy 2 years ago
parent commit 24ada2c4e6
10 changed files with 208 additions and 229 deletions
  1. docs/Intro/Basics.rst (+44, -0)
  2. docs/Intro/Bridge.rst (+46, -0)
  3. docs/Intro/Datasets.rst (+13, -0)
  4. docs/Intro/Evaluation.rst (+12, -0)
  5. docs/Intro/Learning.rst (+46, -0)
  6. docs/Intro/Quick Start.rst (+18, -10)
  7. docs/Intro/Reasoning.rst (+3, -0)
  8. docs/Overview/Abductive Learning.rst (+14, -21)
  9. docs/Overview/Usage.rst (+0, -195)
  10. docs/index.rst (+12, -3)

docs/Intro/Basics.rst (+44, -0)

@@ -0,0 +1,44 @@
Learn the Basics
================

In a typical Abductive Learning process, as illustrated below,
data inputs are first mapped to pseudo labels by a machine learning model.
These pseudo labels then pass through a knowledge base :math:`\mathcal{KB}`
to obtain a logical result by deductive reasoning. During training,
alongside this forward flow (i.e., prediction followed by deductive reasoning),
there also exists a reverse flow, which starts from the logical result and
uses abductive reasoning to generate abduced pseudo labels.
Subsequently, these abduced labels are selected so as to minimize their
inconsistency with the machine learning predictions; they revise the
outcomes of the machine learning model and are then fed back into it for
further training.
To implement this process, the following five steps are necessary:

.. image:: ../img/ABL-Package.jpg

1. Prepare datasets

Prepare the data's input, ground truth for pseudo labels (optional), and ground truth for logical results.

2. Build the machine learning part

Build a model that defines how to map inputs to pseudo labels,
and encapsulate it with ``ABLModel``.

3. Build the reasoning part

Build a knowledge base by creating a subclass of ``KBBase``,
and instantiate a ``ReasonerBase`` to minimize inconsistencies
between the knowledge base and the pseudo labels.

4. Define Evaluation Metrics

Define the metrics for measuring accuracy by inheriting from ``BaseMetric``.

5. Bridge machine learning and reasoning

Use ``SimpleBridge`` to bridge the machine learning and reasoning part
for integrated training and testing.
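
Putting these five steps together, a minimal sketch of the whole pipeline might look as follows (reusing names from the MNIST Add quick start; ``get_mnist_add``, ``LeNet5``, ``AddKB``, and the ABL-Package classes are assumed to be already imported or defined as in that example):

.. code:: python

    import torch
    from abl.bridge import SimpleBridge
    from abl.evaluation import SemanticsMetric, SymbolMetric
    # get_mnist_add, LeNet5, AddKB, BasicNN, ABLModel, and ReasonerBase are
    # assumed to be defined/imported as in the MNIST Add quick start.

    # 1. Prepare datasets
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)

    # 2. Build the machine learning part
    cls = LeNet5(num_classes=10)
    base_model = BasicNN(cls, torch.nn.CrossEntropyLoss(), torch.optim.Adam(cls.parameters()))
    model = ABLModel(base_model)

    # 3. Build the reasoning part
    kb = AddKB(pseudo_label_list=list(range(10)))
    reasoner = ReasonerBase(kb, dist_func="confidence")

    # 4. Define evaluation metrics
    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

    # 5. Bridge machine learning and reasoning
    bridge = SimpleBridge(model, reasoner, metric_list)
    bridge.train(train_data, loops=5, segment_size=10000)
    bridge.test(test_data)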





docs/Intro/Bridge.rst (+46, -0)

@@ -0,0 +1,46 @@

Bridge the machine learning and reasoning parts
===============================================

We next need to bridge the machine learning and reasoning parts. In ABL-Package, the ``BaseBridge`` class defines the abstract interfaces needed to bridge the two parts, and ``SimpleBridge`` provides a basic implementation.
We build a bridge with the previously defined ``model``, ``reasoner``, and ``metric_list`` as follows:

.. code:: python

    bridge = SimpleBridge(model, reasoner, metric_list)

``BaseBridge.train`` and ``BaseBridge.test`` trigger the training and testing processes, respectively.

The two methods take the previously prepared ``train_data`` and ``test_data`` as input.

.. code:: python

    bridge.train(train_data)
    bridge.test(test_data)

Aside from data, ``BaseBridge.train`` can also take additional training configuration arguments, shown as follows:

.. code:: python

    bridge.train(
        # training data
        train_data,
        # number of Abductive Learning loops
        loops=5,
        # the data will be divided into segments of this size,
        # and the model is trained iteratively on each segment
        segment_size=10000,
        # evaluate the model every eval_interval loops
        eval_interval=1,
        # save the model every save_interval loops
        save_interval=1,
        # directory in which to save the model
        save_dir='./save_dir',
    )

In the MNIST Add example, the code to train and test looks like

.. code:: python

    bridge.train(train_data, loops=5, segment_size=10000, save_interval=1, save_dir=weights_dir)
    bridge.test(test_data)

docs/Intro/Datasets.rst (+13, -0)

@@ -0,0 +1,13 @@
Prepare datasets
================

Next, we need to prepare the datasets. ABL-Package assumes data to be in the form of ``(X, gt_pseudo_label, Y)``, where ``X`` is the input of the machine learning model, ``Y`` is the ground truth of the reasoning result, and ``gt_pseudo_label`` is the ground-truth label of each element in ``X``. ``X`` should be of type ``List[List[Any]]``, ``Y`` of type ``List[Any]``, and ``gt_pseudo_label`` either ``None`` or of type ``List[List[Any]]``.
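
As a toy illustration of this form (the values below are made up for a two-digit addition task and are not part of the package):

.. code:: python

    # Each element of X is a list of inputs that together form one example;
    # gt_pseudo_label labels each of these inputs, and Y is the reasoning result.
    X = [["img_a", "img_b"], ["img_c", "img_d"]]  # List[List[Any]]: model inputs
    gt_pseudo_label = [[3, 5], [2, 9]]            # List[List[Any]] or None
    Y = [8, 11]                                   # List[Any]: reasoning results
    toy_data = (X, gt_pseudo_label, Y)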

In the MNIST Add example, the data loading looks like

.. code:: python

    # train_data and test_data are tuples consisting of X, gt_pseudo_label, and Y.
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)


docs/Intro/Evaluation.rst (+12, -0)

@@ -0,0 +1,12 @@
Define Evaluation Metrics
=========================

To validate and test the model, we need to define metrics by inheriting from ``BaseMetric`` and implementing its ``process`` and ``compute_metrics`` methods. The ``process`` method accepts a batch of outputs; after processing this batch, we save the relevant information to the ``self.results`` property. The ``results`` argument of ``compute_metrics`` contains all the information saved by ``process``; we use this information to calculate and return a dict that holds the results of the evaluation metrics.
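
For instance, a minimal accuracy metric could be sketched as follows. Note that this is only an illustrative sketch: the assumed structure of the batch argument (here a dict carrying ``pred_pseudo_label`` and ``gt_pseudo_label`` lists) is hypothetical, not the documented interface.

.. code:: python

    # BaseMetric is the package's metric base class (import path omitted here).
    class ToyAccuracyMetric(BaseMetric):
        def process(self, data_samples):
            # Hypothetical field names; the real batch structure may differ.
            preds = data_samples["pred_pseudo_label"]
            gts = data_samples["gt_pseudo_label"]
            correct = sum(p == g for p, g in zip(preds, gts))
            # Save per-batch information to self.results for compute_metrics.
            self.results.append((correct, len(preds)))

        def compute_metrics(self, results):
            # results contains everything accumulated by process.
            correct = sum(c for c, _ in results)
            total = sum(n for _, n in results)
            return {"accuracy": correct / total}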

We provide two basic metrics, namely ``SymbolMetric`` and ``SemanticsMetric``, which are used to evaluate the accuracy of the machine learning model's predictions and the accuracy of the ``logic_forward`` results, respectively.

In the case of the MNIST Add example, the metric definition looks like

.. code:: python

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

docs/Intro/Learning.rst (+46, -0)

@@ -0,0 +1,46 @@
Build the machine learning part
===============================

First, we build the machine learning part, which needs to be wrapped in the ``ABLModel`` class. We can use machine learning models from scikit-learn, or PyTorch-based neural networks, to create an instance of ``ABLModel``.

- For a scikit-learn model, we can directly use the model to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a scikit-learn model
    base_model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

    model = ABLModel(base_model)

- For a PyTorch-based neural network, we first need to encapsulate it within a ``BasicNN`` object and then use this object to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a PyTorch-based neural network
    cls = torchvision.models.resnet18(pretrained=True)

    # criterion and optimizer are used for training
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters())

    base_model = BasicNN(cls, criterion, optimizer)
    model = ABLModel(base_model)

In the MNIST Add example, the machine learning model looks like

.. code:: python

    cls = LeNet5(num_classes=10)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters(), lr=0.001, betas=(0.9, 0.99))

    base_model = BasicNN(
        cls,
        criterion,
        optimizer,
        device=device,
        batch_size=32,
        num_epochs=1,
    )
    model = ABLModel(base_model)

docs/Overview/Quick Start.rst → docs/Intro/Quick Start.rst (+18, -10)

@@ -24,8 +24,10 @@ where ``X`` is the input of the machine learning model,

In the ``get_mnist_add`` above, the return values are tuples of ``(X, gt_pseudo_label, Y)``.

Read more about `prepare datasets <Datasets.html>`_.

Build Machine Learning Models
-----------------------------

We use a simple LeNet5 model to recognize the pseudo labels (numbers) in the images.
We first build the model and define its corresponding criterion and optimizer for training.
@@ -50,6 +52,8 @@ Afterward, we wrap it in ``ABLModel``.
    base_model = BasicNN(cls, criterion, optimizer, device)
    model = ABLModel(base_model)

Read more about `build machine learning models <Learning.html>`_.

Reasoning (Map pseudo labels to reasoning results)
--------------------------------------------------

@@ -76,26 +80,28 @@ how to minimize the inconsistency between the knowledge base and machine learning

.. code:: python

    reasoner = ReasonerBase(kb, dist_func="confidence")

Read more about `build the reasoning part <Reasoning.html>`_.

Bridge Machine Learning and Reasoning
-------------------------------------

Next, we define the metrics to measure accuracy during validation and testing.

.. code:: python

    from abl.evaluation import SemanticsMetric, SymbolMetric

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

Now, we may use ``SimpleBridge`` to combine machine learning and reasoning together,
setting the stage for subsequent integrated training, validation, and testing.

.. code:: python

    from abl.bridge import SimpleBridge

    bridge = SimpleBridge(model, reasoner, metric_list)

Finally, we proceed with training and testing.

@@ -103,3 +109,5 @@ Finally, we proceed with testing and training.

    bridge.train(train_data, loops=5, segment_size=10000)
    bridge.test(test_data)

Read more about `defining evaluation metrics <Evaluation.html>`_ and `bridging machine learning and reasoning <Bridge.html>`_.

docs/Intro/Reasoning.rst (+3, -0)

@@ -0,0 +1,3 @@
Build the reasoning part
========================


docs/Overview/Abductive Learning.rst (+14, -21)

@@ -1,9 +1,6 @@
Abductive Learning
==================

Integrating the Power of Machine Learning and Logical Reasoning
---------------------------------------------------------------

Traditional supervised machine learning, e.g. classification, is
predominantly data-driven, as shown in the below figure.
Here, a set of training examples :math:`\left\{\left(x_1, y_1\right),
@@ -61,23 +58,19 @@ is dual-driven by both data and domain knowledge, integrating and
balancing the use of machine learning and logical reasoning in a unified
model.

.. admonition:: What is Abductive Reasoning?

    Abductive reasoning, also known as abduction, refers to the process of
    selectively inferring certain facts and hypotheses that explain
    phenomena and observations based on background knowledge. Unlike
    deductive reasoning, which leads to definitive conclusions, abductive
    reasoning may arrive at conclusions that are plausible but not conclusively
    proven.

In Abductive Learning, given :math:`\mathcal{KB}` (typically expressed
in first-order logic clauses), one can perform both deductive and
abductive reasoning. Deductive reasoning allows deriving
:math:`b` from :math:`a`, while abductive reasoning allows inferring
:math:`a` as an explanation of :math:`b`. In other words,
deductive reasoning and abductive reasoning differ in which end,
right or left, of the proposition “:math:`a\models b`” serves as conclusion.
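
For example, suppose :math:`\mathcal{KB}` contains the (illustrative) clause
:math:`\text{rain} \rightarrow \text{wet grass}`. Deductive reasoning derives
:math:`\text{wet grass}` from the fact :math:`\text{rain}`, whereas abductive
reasoning, upon observing :math:`\text{wet grass}`, infers :math:`\text{rain}`
as a plausible explanation.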

docs/Overview/Usage.rst (+0, -195)

@@ -1,195 +0,0 @@
Use ABL-Package Step by Step
============================

In a typical Abductive Learning process, as illustrated below,
data inputs are first mapped to pseudo labels by a machine learning model.
These pseudo labels then pass through a knowledge base :math:`\mathcal{KB}`
to obtain a logical result by deductive reasoning. During training,
alongside this forward flow (i.e., prediction followed by deductive reasoning),
there also exists a reverse flow, which starts from the logical result and
uses abductive reasoning to generate abduced pseudo labels.
Subsequently, these abduced labels are selected so as to minimize their
inconsistency with the machine learning predictions; they revise the
outcomes of the machine learning model and are then fed back into it for
further training.
To implement this process, the following four steps are necessary:

.. image:: ../img/ABL-Package.jpg

1. Prepare datasets

Prepare the data's input, ground truth for pseudo labels (optional), and ground truth for logical results.

2. Build the machine learning part

Build a model that defines how to map inputs to pseudo labels,
and encapsulate it with ``ABLModel``.

3. Build the reasoning part

Build a knowledge base by creating a subclass of ``KBBase``,
and instantiate a ``ReasonerBase`` to minimize inconsistencies
between the knowledge base and the pseudo labels.

4. Bridge machine learning and reasoning so as to train and test

Use ``SimpleBridge`` to bridge the machine learning and reasoning parts
for integrated training and testing. Before training or testing, we also have
to define the metrics for measuring accuracy by inheriting from ``BaseMetric``.

Build the machine learning part
--------------------------------

First, we build the machine learning part, which needs to be wrapped in the ``ABLModel`` class. We can use machine learning models from scikit-learn, or PyTorch-based neural networks, to create an instance of ``ABLModel``.

- For a scikit-learn model, we can directly use the model to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a scikit-learn model
    base_model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

    model = ABLModel(base_model)

- For a PyTorch-based neural network, we first need to encapsulate it within a ``BasicNN`` object and then use this object to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a PyTorch-based neural network
    cls = torchvision.models.resnet18(pretrained=True)

    # criterion and optimizer are used for training
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters())

    base_model = BasicNN(cls, criterion, optimizer)
    model = ABLModel(base_model)


In the MNIST Add example, the machine learning model looks like

.. code:: python

    cls = LeNet5(num_classes=10)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters(), lr=0.001, betas=(0.9, 0.99))

    base_model = BasicNN(
        cls,
        criterion,
        optimizer,
        device=device,
        batch_size=32,
        num_epochs=1,
    )
    model = ABLModel(base_model)

Build the reasoning part
------------------------

Next, we build the reasoning part. In ABL-Package, the reasoning part is wrapped in the ``ReasonerBase`` class. To create an instance of this class, we first need to inherit from ``KBBase`` to customize our knowledge base. The ``__init__`` method of the knowledge base should accept at least ``pseudo_label_list``, which is a list of all possible pseudo labels. The ``logic_forward`` method of ``KBBase`` is an abstract method, and we need to implement it in our subclass to give the knowledge base the ability to perform deduction. In general, we can customize our knowledge base by

.. code:: python

    class MyKB(KBBase):
        def __init__(self, pseudo_label_list):
            super().__init__(pseudo_label_list)

        def logic_forward(self, *args, **kwargs):
            # Deduction implementation...
            return deduction_result

Aside from the knowledge base, instantiating ``ReasonerBase`` also requires an extra argument called ``dist_func``, which is the consistency measure used to select the best candidate from all candidates. In general, we can instantiate our reasoner by

.. code:: python

    kb = MyKB(pseudo_label_list)
    reasoner = ReasonerBase(kb, dist_func="hamming")

In the MNIST Add example, the reasoner looks like

.. code:: python

    class AddKB(KBBase):
        def __init__(self, pseudo_label_list):
            super().__init__(pseudo_label_list)

        # Implement the deduction function
        def logic_forward(self, nums):
            return sum(nums)

    kb = AddKB(pseudo_label_list=list(range(10)))
    reasoner = ReasonerBase(kb, dist_func="confidence")

Build datasets and evaluation metrics
-------------------------------------

Next, we need to build datasets and evaluation metrics for training and validation. ABL-Package assumes data to be in the form of ``(X, gt_pseudo_label, Y)``, where ``X`` is the input of the machine learning model, ``Y`` is the ground truth of the reasoning result, and ``gt_pseudo_label`` is the ground-truth label of each element in ``X``. ``X`` should be of type ``List[List[Any]]``, ``Y`` of type ``List[Any]``, and ``gt_pseudo_label`` either ``None`` or of type ``List[List[Any]]``.

In the MNIST Add example, the data loading looks like

.. code:: python

    # train_data and test_data are tuples consisting of X, gt_pseudo_label, and Y.
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)

To validate and test the model, we need to define metrics by inheriting from ``BaseMetric`` and implementing its ``process`` and ``compute_metrics`` methods. The ``process`` method accepts a batch of outputs; after processing this batch, we save the relevant information to the ``self.results`` property. The ``results`` argument of ``compute_metrics`` contains all the information saved by ``process``; we use this information to calculate and return a dict that holds the results of the evaluation metrics.

We provide two basic metrics, namely ``SymbolMetric`` and ``SemanticsMetric``, which are used to evaluate the accuracy of the machine learning model's predictions and the accuracy of the ``logic_forward`` results, respectively.

In the case of the MNIST Add example, the metric definition looks like

.. code:: python

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

Bridge the machine learning and reasoning parts
-----------------------------------------------

We next need to bridge the machine learning and reasoning parts. In ABL-Package, the ``BaseBridge`` class defines the abstract interfaces needed to bridge the two parts, and ``SimpleBridge`` provides a basic implementation.
We build a bridge with the previously defined ``model``, ``reasoner``, and ``metric_list`` as follows:

.. code:: python

    bridge = SimpleBridge(model, reasoner, metric_list)

In the MNIST Add example, the bridge creation looks the same.

Use ``Bridge.train`` and ``Bridge.test`` to train and test
----------------------------------------------------------

``BaseBridge.train`` and ``BaseBridge.test`` trigger the training and testing processes, respectively.

The two methods take the previously prepared ``train_data`` and ``test_data`` as input.

.. code:: python

    bridge.train(train_data)
    bridge.test(test_data)

Aside from data, ``BaseBridge.train`` can also take additional training configuration arguments, shown as follows:

.. code:: python

    bridge.train(
        # training data
        train_data,
        # number of Abductive Learning loops
        loops=5,
        # the data will be divided into segments of this size,
        # and the model is trained iteratively on each segment
        segment_size=10000,
        # evaluate the model every eval_interval loops
        eval_interval=1,
        # save the model every save_interval loops
        save_interval=1,
        # directory in which to save the model
        save_dir='./save_dir',
    )

In the MNIST Add example, the code to train and test looks like

.. code:: python

    bridge.train(train_data, loops=5, segment_size=10000, save_interval=1, save_dir=weights_dir)
    bridge.test(test_data)

docs/index.rst (+12, -3)

@@ -6,8 +6,18 @@

    Overview/Abductive Learning
    Overview/Installation

.. toctree::
    :maxdepth: 1
    :caption: Introduction to ABL-Package

    Intro/Basics
    Intro/Quick Start
    Intro/Datasets
    Intro/Learning
    Intro/Reasoning
    Intro/Evaluation
    Intro/Bridge

.. toctree::
:maxdepth: 1
@@ -17,7 +27,6 @@
    Examples/HWF
    Examples/HED


.. toctree::
    :maxdepth: 1
    :caption: API

