
[DOC] change structure

pull/1/head
troyyyyy 2 years ago
parent commit 24ada2c4e6
10 changed files with 208 additions and 229 deletions
  1. docs/Intro/Basics.rst (+44, -0)
  2. docs/Intro/Bridge.rst (+46, -0)
  3. docs/Intro/Datasets.rst (+13, -0)
  4. docs/Intro/Evaluation.rst (+12, -0)
  5. docs/Intro/Learning.rst (+46, -0)
  6. docs/Intro/Quick Start.rst (+18, -10)
  7. docs/Intro/Reasoning.rst (+3, -0)
  8. docs/Overview/Abductive Learning.rst (+14, -21)
  9. docs/Overview/Usage.rst (+0, -195)
  10. docs/index.rst (+12, -3)

docs/Intro/Basics.rst (+44, -0)

@@ -0,0 +1,44 @@
Learn the Basics
================

In a typical Abductive Learning process, as illustrated below,
data inputs are first mapped to pseudo labels by a machine learning model.
These pseudo labels then pass through a knowledge base :math:`\mathcal{KB}`
to obtain a logical result by deductive reasoning. During training,
alongside this forward flow (i.e., prediction followed by deductive reasoning),
there also exists a reverse flow, which starts from the logical result and
uses abductive reasoning to generate abduced pseudo labels.
Subsequently, these abduced labels are selected so as to minimize their
inconsistency with the machine learning predictions; they revise the
outcomes of the machine learning model and are then fed back into it for
further training.
To implement this process, the following five steps are necessary:

.. image:: ../img/ABL-Package.jpg

1. Prepare datasets

Prepare the data's input, ground truth for pseudo labels (optional), and ground truth for logical results.

2. Build the machine learning part

Build a model that defines how to map inputs to pseudo labels,
and encapsulate it with ``ABLModel``.

3. Build the reasoning part

Build a knowledge base by creating a subclass of ``KBBase``,
and instantiate a ``ReasonerBase`` to minimize inconsistencies
between the knowledge base and the pseudo labels.

4. Define Evaluation Metrics

Define the metrics for measuring accuracy by inheriting from ``BaseMetric``.

5. Bridge machine learning and reasoning

Use ``SimpleBridge`` to bridge the machine learning and reasoning part
for integrated training and testing.
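
Putting these five steps together, a minimal sketch of the whole pipeline might look as follows (reusing names from the MNIST Add quick start; ``get_mnist_add``, ``LeNet5``, ``AddKB``, and the ABL-Package classes are assumed to be already imported or defined as in that example):

.. code:: python

    import torch
    from abl.bridge import SimpleBridge
    from abl.evaluation import SemanticsMetric, SymbolMetric
    # get_mnist_add, LeNet5, AddKB, BasicNN, ABLModel, and ReasonerBase are
    # assumed to be defined/imported as in the MNIST Add quick start.

    # 1. Prepare datasets
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)

    # 2. Build the machine learning part
    cls = LeNet5(num_classes=10)
    base_model = BasicNN(cls, torch.nn.CrossEntropyLoss(), torch.optim.Adam(cls.parameters()))
    model = ABLModel(base_model)

    # 3. Build the reasoning part
    kb = AddKB(pseudo_label_list=list(range(10)))
    reasoner = ReasonerBase(kb, dist_func="confidence")

    # 4. Define evaluation metrics
    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

    # 5. Bridge machine learning and reasoning
    bridge = SimpleBridge(model, reasoner, metric_list)
    bridge.train(train_data, loops=5, segment_size=10000)
    bridge.test(test_data)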





docs/Intro/Bridge.rst (+46, -0)

@@ -0,0 +1,46 @@

Bridge the machine learning and reasoning parts
===============================================

We next need to bridge the machine learning and reasoning parts. In ABL-Package, the ``BaseBridge`` class defines the abstract interfaces needed to bridge the two parts, and ``SimpleBridge`` provides a basic implementation.
We build a bridge with the previously defined ``model``, ``reasoner``, and ``metric_list`` as follows:

.. code:: python

    bridge = SimpleBridge(model, reasoner, metric_list)

``BaseBridge.train`` and ``BaseBridge.test`` trigger the training and testing processes, respectively.

The two methods take the previously prepared ``train_data`` and ``test_data`` as input.

.. code:: python

    bridge.train(train_data)
    bridge.test(test_data)

Aside from data, ``BaseBridge.train`` can also take additional training configuration arguments, shown as follows:

.. code:: python

    bridge.train(
        # training data
        train_data,
        # number of Abductive Learning loops
        loops=5,
        # the data will be divided into segments of this size,
        # and the model is trained iteratively on each segment
        segment_size=10000,
        # evaluate the model every eval_interval loops
        eval_interval=1,
        # save the model every save_interval loops
        save_interval=1,
        # directory in which to save the model
        save_dir='./save_dir',
    )

In the MNIST Add example, the code to train and test looks like

.. code:: python

    bridge.train(train_data, loops=5, segment_size=10000, save_interval=1, save_dir=weights_dir)
    bridge.test(test_data)

docs/Intro/Datasets.rst (+13, -0)

@@ -0,0 +1,13 @@
Prepare datasets
================

Next, we need to prepare the datasets. ABL-Package assumes data to be in the form of ``(X, gt_pseudo_label, Y)``, where ``X`` is the input of the machine learning model, ``Y`` is the ground truth of the reasoning result, and ``gt_pseudo_label`` is the ground-truth label of each element in ``X``. ``X`` should be of type ``List[List[Any]]``, ``Y`` of type ``List[Any]``, and ``gt_pseudo_label`` either ``None`` or of type ``List[List[Any]]``.
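
As a toy illustration of this form (the values below are made up for a two-digit addition task and are not part of the package):

.. code:: python

    # Each element of X is a list of inputs that together form one example;
    # gt_pseudo_label labels each of these inputs, and Y is the reasoning result.
    X = [["img_a", "img_b"], ["img_c", "img_d"]]  # List[List[Any]]: model inputs
    gt_pseudo_label = [[3, 5], [2, 9]]            # List[List[Any]] or None
    Y = [8, 11]                                   # List[Any]: reasoning results
    toy_data = (X, gt_pseudo_label, Y)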

In the MNIST Add example, the data loading looks like

.. code:: python

    # train_data and test_data are tuples consisting of X, gt_pseudo_label, and Y.
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)


docs/Intro/Evaluation.rst (+12, -0)

@@ -0,0 +1,12 @@
Define Evaluation Metrics
=========================

To validate and test the model, we need to define metrics by inheriting from ``BaseMetric`` and implementing its ``process`` and ``compute_metrics`` methods. The ``process`` method accepts a batch of outputs; after processing this batch, we save the relevant information to the ``self.results`` property. The ``results`` argument of ``compute_metrics`` contains all the information saved by ``process``; we use this information to calculate and return a dict that holds the results of the evaluation metrics.
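
For instance, a minimal accuracy metric could be sketched as follows. Note that this is only an illustrative sketch: the assumed structure of the batch argument (here a dict carrying ``pred_pseudo_label`` and ``gt_pseudo_label`` lists) is hypothetical, not the documented interface.

.. code:: python

    # BaseMetric is the package's metric base class (import path omitted here).
    class ToyAccuracyMetric(BaseMetric):
        def process(self, data_samples):
            # Hypothetical field names; the real batch structure may differ.
            preds = data_samples["pred_pseudo_label"]
            gts = data_samples["gt_pseudo_label"]
            correct = sum(p == g for p, g in zip(preds, gts))
            # Save per-batch information to self.results for compute_metrics.
            self.results.append((correct, len(preds)))

        def compute_metrics(self, results):
            # results contains everything accumulated by process.
            correct = sum(c for c, _ in results)
            total = sum(n for _, n in results)
            return {"accuracy": correct / total}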

We provide two basic metrics, namely ``SymbolMetric`` and ``SemanticsMetric``, which are used to evaluate the accuracy of the machine learning model's predictions and the accuracy of the ``logic_forward`` results, respectively.

In the case of the MNIST Add example, the metric definition looks like

.. code:: python

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

docs/Intro/Learning.rst (+46, -0)

@@ -0,0 +1,46 @@
Build the machine learning part
===============================

First, we build the machine learning part, which needs to be wrapped in the ``ABLModel`` class. We can use machine learning models from scikit-learn, or PyTorch-based neural networks, to create an instance of ``ABLModel``.

- For a scikit-learn model, we can directly use the model to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a scikit-learn model
    base_model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

    model = ABLModel(base_model)

- For a PyTorch-based neural network, we first need to encapsulate it within a ``BasicNN`` object and then use this object to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a PyTorch-based neural network
    cls = torchvision.models.resnet18(pretrained=True)

    # criterion and optimizer are used for training
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters())

    base_model = BasicNN(cls, criterion, optimizer)
    model = ABLModel(base_model)

In the MNIST Add example, the machine learning model looks like

.. code:: python

    cls = LeNet5(num_classes=10)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters(), lr=0.001, betas=(0.9, 0.99))

    base_model = BasicNN(
        cls,
        criterion,
        optimizer,
        device=device,
        batch_size=32,
        num_epochs=1,
    )
    model = ABLModel(base_model)

docs/Overview/Quick Start.rst → docs/Intro/Quick Start.rst (+18, -10)

@@ -24,8 +24,10 @@ where ``X`` is the input of the machine learning model,

In the ``get_mnist_add`` above, the return values are tuples of ``(X, gt_pseudo_label, Y)``.

Read more about `prepare datasets <Datasets.html>`_.

Build Machine Learning Models
-----------------------------

We use a simple LeNet5 model to recognize the pseudo labels (numbers) in the images.
We first build the model and define its corresponding criterion and optimizer for training.
@@ -50,6 +52,8 @@ Afterward, we wrap it in ``ABLModel``.
    base_model = BasicNN(cls, criterion, optimizer, device)
    model = ABLModel(base_model)

Read more about `build machine learning models <Learning.html>`_.

Reasoning (Map pseudo labels to reasoning results)
--------------------------------------------------

@@ -76,26 +80,28 @@ how to minimize the inconsistency between the knowledge base and machine learning

.. code:: python

    reasoner = ReasonerBase(kb, dist_func="confidence")

Read more about `build the reasoning part <Reasoning.html>`_.

Bridge Machine Learning and Reasoning
-------------------------------------

Next, we define the metrics to measure accuracy during validation and testing.

.. code:: python

    from abl.evaluation import SemanticsMetric, SymbolMetric

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

Now, we may use ``SimpleBridge`` to combine machine learning and reasoning together,
setting the stage for subsequent integrated training, validation, and testing.

.. code:: python

    from abl.bridge import SimpleBridge

    bridge = SimpleBridge(model, reasoner, metric_list)

Finally, we proceed with training and testing.

@@ -103,3 +109,5 @@ Finally, we proceed with testing and training.

    bridge.train(train_data, loops=5, segment_size=10000)
    bridge.test(test_data)

Read more about `defining evaluation metrics <Evaluation.html>`_ and `bridging machine learning and reasoning <Bridge.html>`_.

docs/Intro/Reasoning.rst (+3, -0)

@@ -0,0 +1,3 @@
Build the reasoning part
========================


docs/Overview/Abductive Learning.rst (+14, -21)

@@ -1,9 +1,6 @@
Abductive Learning
==================

Integrating the Power of Machine Learning and Logical Reasoning
---------------------------------------------------------------

Traditional supervised machine learning, e.g. classification, is
predominantly data-driven, as shown in the below figure.
Here, a set of training examples :math:`\left\{\left(x_1, y_1\right),
@@ -61,23 +58,19 @@ is dual-driven by both data and domain knowledge, integrating and
balancing the use of machine learning and logical reasoning in a unified
model.

.. admonition:: What is Abductive Reasoning?

    Abductive reasoning, also known as abduction, refers to the process of
    selectively inferring certain facts and hypotheses that explain
    phenomena and observations based on background knowledge. Unlike
    deductive reasoning, which leads to definitive conclusions, abductive
    reasoning may arrive at conclusions that are plausible but not conclusively
    proven.

In Abductive Learning, given :math:`\mathcal{KB}` (typically expressed
in first-order logic clauses), one can perform both deductive and
abductive reasoning. Deductive reasoning allows deriving
:math:`b` from :math:`a`, while abductive reasoning allows inferring
:math:`a` as an explanation of :math:`b`. In other words,
deductive reasoning and abductive reasoning differ in which end,
right or left, of the proposition “:math:`a\models b`” serves as conclusion.
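
For example, suppose :math:`\mathcal{KB}` contains the (illustrative) clause
:math:`\text{rain} \rightarrow \text{wet grass}`. Deductive reasoning derives
:math:`\text{wet grass}` from the fact :math:`\text{rain}`, whereas abductive
reasoning, upon observing :math:`\text{wet grass}`, infers :math:`\text{rain}`
as a plausible explanation.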

docs/Overview/Usage.rst (+0, -195)

@@ -1,195 +0,0 @@
Use ABL-Package Step by Step
============================

In a typical Abductive Learning process, as illustrated below,
data inputs are first mapped to pseudo labels by a machine learning model.
These pseudo labels then pass through a knowledge base :math:`\mathcal{KB}`
to obtain a logical result by deductive reasoning. During training,
alongside this forward flow (i.e., prediction followed by deductive reasoning),
there also exists a reverse flow, which starts from the logical result and
uses abductive reasoning to generate abduced pseudo labels.
Subsequently, these abduced labels are selected so as to minimize their
inconsistency with the machine learning predictions; they revise the
outcomes of the machine learning model and are then fed back into it for
further training.
To implement this process, the following four steps are necessary:

.. image:: ../img/ABL-Package.jpg

1. Prepare datasets

Prepare the data's input, ground truth for pseudo labels (optional), and ground truth for logical results.

2. Build the machine learning part

Build a model that defines how to map inputs to pseudo labels,
and encapsulate it with ``ABLModel``.

3. Build the reasoning part

Build a knowledge base by creating a subclass of ``KBBase``,
and instantiate a ``ReasonerBase`` to minimize inconsistencies
between the knowledge base and the pseudo labels.

4. Bridge machine learning and reasoning so as to train and test

Use ``SimpleBridge`` to bridge the machine learning and reasoning parts
for integrated training and testing. Before training or testing, we also have
to define the metrics for measuring accuracy by inheriting from ``BaseMetric``.

Build the machine learning part
--------------------------------

First, we build the machine learning part, which needs to be wrapped in the ``ABLModel`` class. We can use machine learning models from scikit-learn, or PyTorch-based neural networks, to create an instance of ``ABLModel``.

- For a scikit-learn model, we can directly use the model to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a scikit-learn model
    base_model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

    model = ABLModel(base_model)

- For a PyTorch-based neural network, we first need to encapsulate it within a ``BasicNN`` object and then use this object to create an instance of ``ABLModel``. For example, we can customize our machine learning model by

.. code:: python

    # Load a PyTorch-based neural network
    cls = torchvision.models.resnet18(pretrained=True)

    # criterion and optimizer are used for training
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters())

    base_model = BasicNN(cls, criterion, optimizer)
    model = ABLModel(base_model)


In the MNIST Add example, the machine learning model looks like

.. code:: python

    cls = LeNet5(num_classes=10)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cls.parameters(), lr=0.001, betas=(0.9, 0.99))

    base_model = BasicNN(
        cls,
        criterion,
        optimizer,
        device=device,
        batch_size=32,
        num_epochs=1,
    )
    model = ABLModel(base_model)

Build the reasoning part
------------------------

Next, we build the reasoning part. In ABL-Package, the reasoning part is wrapped in the ``ReasonerBase`` class. To create an instance of this class, we first need to inherit from ``KBBase`` to customize our knowledge base. The ``__init__`` method of the knowledge base should accept at least ``pseudo_label_list``, which is a list of all possible pseudo labels. The ``logic_forward`` method of ``KBBase`` is an abstract method, and we need to implement it in our subclass to give the knowledge base the ability to perform deduction. In general, we can customize our knowledge base by

.. code:: python

    class MyKB(KBBase):
        def __init__(self, pseudo_label_list):
            super().__init__(pseudo_label_list)

        def logic_forward(self, *args, **kwargs):
            # Deduction implementation...
            return deduction_result

Aside from the knowledge base, instantiating ``ReasonerBase`` also requires an extra argument called ``dist_func``, which is the consistency measure used to select the best candidate from all candidates. In general, we can instantiate our reasoner by

.. code:: python

    kb = MyKB(pseudo_label_list)
    reasoner = ReasonerBase(kb, dist_func="hamming")

In the MNIST Add example, the reasoner looks like

.. code:: python

    class AddKB(KBBase):
        def __init__(self, pseudo_label_list):
            super().__init__(pseudo_label_list)

        # Implement the deduction function
        def logic_forward(self, nums):
            return sum(nums)

    kb = AddKB(pseudo_label_list=list(range(10)))
    reasoner = ReasonerBase(kb, dist_func="confidence")

Build datasets and evaluation metrics
-------------------------------------

Next, we need to build datasets and evaluation metrics for training and validation. ABL-Package assumes data to be in the form of ``(X, gt_pseudo_label, Y)``, where ``X`` is the input of the machine learning model, ``Y`` is the ground truth of the reasoning result, and ``gt_pseudo_label`` is the ground-truth label of each element in ``X``. ``X`` should be of type ``List[List[Any]]``, ``Y`` of type ``List[Any]``, and ``gt_pseudo_label`` either ``None`` or of type ``List[List[Any]]``.

In the MNIST Add example, the data loading looks like

.. code:: python

    # train_data and test_data are tuples consisting of X, gt_pseudo_label, and Y.
    train_data = get_mnist_add(train=True, get_pseudo_label=True)
    test_data = get_mnist_add(train=False, get_pseudo_label=True)

To validate and test the model, we need to define metrics by inheriting from ``BaseMetric`` and implementing its ``process`` and ``compute_metrics`` methods. The ``process`` method accepts a batch of outputs; after processing this batch, we save the relevant information to the ``self.results`` property. The ``results`` argument of ``compute_metrics`` contains all the information saved by ``process``; we use this information to calculate and return a dict that holds the results of the evaluation metrics.

We provide two basic metrics, namely ``SymbolMetric`` and ``SemanticsMetric``, which are used to evaluate the accuracy of the machine learning model's predictions and the accuracy of the ``logic_forward`` results, respectively.

In the case of the MNIST Add example, the metric definition looks like

.. code:: python

    metric_list = [SymbolMetric(prefix="mnist_add"), SemanticsMetric(kb=kb, prefix="mnist_add")]

Bridge the machine learning and reasoning parts
-----------------------------------------------

We next need to bridge the machine learning and reasoning parts. In ABL-Package, the ``BaseBridge`` class defines the abstract interfaces needed to bridge the two parts, and ``SimpleBridge`` provides a basic implementation.
We build a bridge with the previously defined ``model``, ``reasoner``, and ``metric_list`` as follows:

.. code:: python

    bridge = SimpleBridge(model, reasoner, metric_list)

In the MNIST Add example, the bridge creation looks the same.

Use ``Bridge.train`` and ``Bridge.test`` to train and test
----------------------------------------------------------

``BaseBridge.train`` and ``BaseBridge.test`` trigger the training and testing processes, respectively.

The two methods take the previously prepared ``train_data`` and ``test_data`` as input.

.. code:: python

    bridge.train(train_data)
    bridge.test(test_data)

Aside from data, ``BaseBridge.train`` can also take additional training configuration arguments, shown as follows:

.. code:: python

    bridge.train(
        # training data
        train_data,
        # number of Abductive Learning loops
        loops=5,
        # the data will be divided into segments of this size,
        # and the model is trained iteratively on each segment
        segment_size=10000,
        # evaluate the model every eval_interval loops
        eval_interval=1,
        # save the model every save_interval loops
        save_interval=1,
        # directory in which to save the model
        save_dir='./save_dir',
    )

In the MNIST Add example, the code to train and test looks like

.. code:: python

    bridge.train(train_data, loops=5, segment_size=10000, save_interval=1, save_dir=weights_dir)
    bridge.test(test_data)

docs/index.rst (+12, -3)

@@ -6,8 +6,18 @@

    Overview/Abductive Learning
    Overview/Installation

.. toctree::
    :maxdepth: 1
    :caption: Introduction to ABL-Package

    Intro/Basics
    Intro/Quick Start
    Intro/Datasets
    Intro/Learning
    Intro/Reasoning
    Intro/Evaluation
    Intro/Bridge

.. toctree::
:maxdepth: 1
@@ -17,7 +27,6 @@
    Examples/HWF
    Examples/HED


.. toctree::
    :maxdepth: 1
    :caption: API

