@@ -7,21 +7,12 @@
In TensorFlow, a constant is a special Tensor that cannot be modified while the graph is running. For example, in the linear model $\tilde{y_i}=\boldsymbol{w}x_i+b$, the constant $b$ can be represented by a Constant. Since a constant is a kind of Tensor, it has all the data properties of a Tensor, which include:

* value: a constant value, or a list of constant values, of one of the data types defined in TensorFlow;
* dtype: data type;
* shape: dimensions;
* name: constant's name;
##### How to create a Constant

TensorFlow provides a handy function to create a Constant. In TF.NET, you can use the same function name, `tf.constant`, to create one. TF.NET names its APIs as closely as possible to the Python binding. Although this may feel uncomfortable to developers who are used to C# naming conventions, after careful consideration I decided to give up the C# naming convention.
Initialize a scalar constant:

```csharp
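// a couple of scalar constants, as a minimal sketch (the chapter's full
// example builds a tensor from an NDArray, shown below)
var c1 = tf.constant(3);    // int scalar
var c2 = tf.constant(1.5f); // float scalar
```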
@@ -45,9 +36,7 @@ var tensor = tf.constant(nd);
##### Dive in Constant

Now let's explore how `constant` works.
@@ -2,16 +2,10 @@
One of the most nerve-wracking periods when releasing the first version of an open source project occurs when the [gitter](https://gitter.im/sci-sharp/community) community is created. You are all alone, eagerly hoping and wishing for the first user to come along. I still vividly remember those days.

TensorFlow.NET is my third open source project; BotSharp and NumSharp are the first two. The response has been pretty good, and they got plenty of stars on GitHub. Although the first two projects were quite difficult, I have to admit that TensorFlow.NET is much more difficult than the previous two: it is an area I had never been involved in, mainly concerning GPU parallel computing, distributed computing and neural network models. When I started writing this project, I was also sorting out my thoughts on the coding process. TensorFlow is a huge and complicated project that can easily go beyond the scope of one person's ability, so I want to record my thinking at the time as much as possible; the process of recording and organizing it also clears the way of thinking.

All the examples in this book can be found in the GitHub repository of TensorFlow.NET. When the source code and the code in the book are inconsistent, please refer to the source code. The sample code is typically located in the Example or UnitTest project.
@@ -2,11 +2,9 @@
I would describe TensorFlow as an open source machine learning framework developed by Google which can be used to build neural networks and perform a variety of machine learning tasks. It works on a data flow graph where the nodes are mathematical operations and the edges are the data in the form of tensors, hence the name TensorFlow.

Let's run a classic HelloWorld program first and see if TensorFlow is running on .NET. I can't think of a simpler way than a HelloWorld.
@@ -14,7 +12,7 @@
TensorFlow.NET uses the .NET Standard 2.0 standard, so your new project's target framework can be .NET Framework or .NET Core. All the examples in this book use .NET Core 2.2 and Microsoft Visual Studio Community 2017. To start building a TensorFlow program you just need to download and install the .NET SDK (Software Development Kit). You can download the latest .NET Core SDK from the official website: https://dotnet.microsoft.com/download.

1. Create a new project
@@ -34,7 +32,7 @@ PM> Install-Package TensorFlow.NET
After installing the TensorFlow.NET package, you can use `using Tensorflow` to import the TensorFlow library.
```csharp
using System;
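using Tensorflow;
using static Tensorflow.Python;

// a minimal sketch of the rest of the program (my reconstruction; the full
// source is linked below): create a constant op and run it in a session.
var hello = tf.constant("Hello, TensorFlow!");
with(tf.Session(), sess =>
{
    var result = sess.run(hello);
    Console.WriteLine(result);
});
```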
@@ -76,5 +74,3 @@ Press any key to continue . . .
This sample code can be found [here](https://github.com/SciSharp/TensorFlow.NET/blob/master/test/TensorFlowNET.Examples/HelloWorld.cs).
@@ -1,5 +1,7 @@
# Chapter. Linear Regression

### What is linear regression?

Linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables).
@@ -8,9 +10,7 @@ Consider the case of a single variable of interest y and a single predictor variable x.
The predictor variable is also called a covariate, input or feature; the dependent variable is also called the response variable, output or outcome.

We have some data $D=\{x_i, y_i\}$ and we assume a simple linear model of this dataset with Gaussian noise:
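The equation itself was lost from this excerpt; a standard formulation consistent with the surrounding text (my reconstruction) is:

$$y_i = w x_i + b + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2)$$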
```csharp
// Prepare training Data
var train_X = np.array(3.3f, 4.4f, 5.5f, 6.71f, 6.93f, 4.168f, 9.779f, 6.182f, 7.59f, 2.167f, 7.042f, 10.791f, 5.313f, 7.997f, 5.654f, 9.27f, 3.1f);
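// the matching target values from the standard linear-regression sample
// (my assumption; see the repository example for the authoritative data)
var train_Y = np.array(1.7f, 2.76f, 2.09f, 3.19f, 1.694f, 1.573f, 3.366f, 2.596f, 2.53f, 1.221f, 2.827f, 3.465f, 1.65f, 2.904f, 2.42f, 2.94f, 1.3f);
```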
@@ -21,15 +21,13 @@ var n_samples = train_X.shape[0];
Based on the given data points, we try to plot a line that models the points the best. The red line can be modelled based on the linear equation: $y = wx + b$. The motive of the linear regression algorithm is to find the best values for $w$ and $b$. Before moving on to the algorithm, let's have a look at two important concepts you must know to better understand linear regression.
### Cost Function

The cost function helps us figure out the best possible values for $w$ and $b$ that would provide the best-fit line for the data points. Since we want the best values for $w$ and $b$, we convert this search problem into a minimization problem in which we minimize the error between the predicted value and the actual value.
|  | |||
@@ -37,8 +35,7 @@
We choose the above function to minimize. The difference between the predicted value and the actual value measures the prediction error; we square it, sum over all data points, and divide by the total number of data points. This provides the average squared error over all the data points. Therefore, this cost function is also known as the Mean Squared Error (MSE) function. Now, using this MSE function, we are going to change the values of $w$ and $b$ such that the MSE value settles at its minimum.
```csharp
// tf Graph Input
@@ -56,13 +53,13 @@ var pred = tf.add(tf.multiply(X, W), b);
var cost = tf.reduce_sum(tf.pow(pred - Y, 2.0f)) / (2.0f * n_samples);
```
### Gradient Descent

Another important concept is gradient descent. Gradient descent is a method of updating $w$ and $b$ to minimize the cost function. The idea is that we start with some random values for $w$ and $b$ and then change these values iteratively to reduce the cost. Gradient descent tells us how to update the values, that is, in which direction to move next. Gradient descent is also known as **steepest descent**.
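In symbols, with learning rate $\alpha$ and cost $J(w, b)$, each iteration applies the standard update rule:

$$w \leftarrow w - \alpha \frac{\partial J}{\partial w}, \qquad b \leftarrow b - \alpha \frac{\partial J}{\partial b}$$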
|  | |||
@@ -72,9 +69,7 @@
As an analogy, imagine standing at the top of a U-shaped pit with the goal of reaching the bottom, given that you are unsure how many steps it takes to get there. If you decide to take one small step at a time, you will eventually reach the bottom, but it will take longer. If you take longer strides each time, you may reach the bottom sooner, but there is a chance you overshoot and miss the exact bottom of the pit. In the gradient descent algorithm, the size of the steps you take is the learning rate, which decides how fast the algorithm converges to the minimum.
```csharp
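// a sketch of the training op for this model (assuming the cost defined
// above and a float learning_rate); the repository example uses a plain
// gradient descent optimizer
var optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost);
```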
@@ -4,11 +4,11 @@
Logistic regression is a statistical analysis method used to predict a data value based on prior observations of a data set. A logistic regression model predicts a dependent data variable by analyzing the relationship between one or more existing independent variables.

The dependent variable of logistic regression can be binary or multi-class, but the binary case is more common and easier to explain, so the most common use in practice is binary logistic regression. The example used by TensorFlow.NET is hand-written digit recognition, which is a multi-class problem.
Softmax regression allows us to handle $y^{(i)} \in \{1, \dots, K\}$ where K is the number of classes.
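For reference (the standard definition, not from this excerpt), softmax regression models the probability of class $k$ given input $x$ as:

$$P(y = k \mid x) = \frac{e^{w_k^\top x + b_k}}{\sum_{j=1}^{K} e^{w_j^\top x + b_j}}$$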
@@ -0,0 +1,244 @@
# Neural Network

In this chapter, we'll learn how to build a graph of a neural network model. The key advantage of a neural network compared to a linear classifier is that it can separate data which is not linearly separable. We'll implement this model to classify hand-written digit images from the MNIST dataset.

The structure of the neural network we're going to build is as follows. The input is the hand-written digit images of the MNIST data, which has 10 classes (from 0 to 9). The network has two layers: the first is a hidden layer with 200 hidden units (neurons) and the second (known as the classifier layer) has 10 neurons.

![neural network architecture](_static/nn.png)

Get started with the implementation step by step:
1. **Prepare data**

MNIST is a dataset of handwritten digits which contains 55,000 examples for training, 5,000 examples for validation and 10,000 examples for testing. The digits have been size-normalized and centered in a fixed-size image (28 x 28 pixels) with values from 0 to 1. Each image has been flattened and converted to a 1-D array of 784 features. It is also a common benchmark dataset for deep learning.
|  | |||
We define some variables to make it easier to modify them later. It's important to note that in a linear model, we have to flatten the input images into a vector.
```csharp
using System;
using NumSharp;
using Tensorflow;
using TensorFlowNET.Examples.Utility;
using static Tensorflow.Python;
```

```csharp
const int img_h = 28;
const int img_w = 28;
int img_size_flat = img_h * img_w; // 784, the total number of pixels
int n_classes = 10; // Number of classes, one class per digit
```
We'll write a function that automatically loads the MNIST data and returns it in our desired shape and format. There is an MNIST data helper to make life easier.
```csharp
Datasets mnist;

public void PrepareData()
{
    mnist = MnistDataSet.read_data_sets("mnist", one_hot: true);
}
```
Besides a function for loading the images and corresponding labels, we still need two more functions:

**randomize**: randomizes the order of the images and their labels. At the beginning of each epoch, we re-randomize the order of the data samples to make sure that the trained model is not sensitive to the order of the data.
```csharp
private (NDArray, NDArray) randomize(NDArray x, NDArray y)
{
    // generate a random permutation of the sample indices
    var perm = np.random.permutation(y.shape[0]);
    // reorder images and labels with the same permutation to keep pairs aligned
    return (x[perm], y[perm]);
}
```
**get_next_batch**: selects only a number of images determined by the `batch_size` variable (as per the Stochastic Gradient Descent method).
```csharp
private (NDArray, NDArray) get_next_batch(NDArray x, NDArray y, int start, int end)
{
    // slice rows [start, end) using NumSharp string slicing
    var x_batch = x[$"{start}:{end}"];
    var y_batch = y[$"{start}:{end}"];
    return (x_batch, y_batch);
}
```
2. **Set Hyperparameters**

There are about 55,000 images in the training set, and it takes a long time to calculate the gradient of the model using all these images. Therefore, following Stochastic Gradient Descent, we use a small batch of images in each iteration of the optimizer.

* epoch: one forward pass and one backward pass of all the training examples.
* batch size: the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
* iteration: one forward pass and one backward pass of one batch of training examples (see the example below).
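For example, with 55,000 training examples and a batch size of 100, one epoch consists of 550 iterations.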
```csharp
int epochs = 10;
int batch_size = 100;
float learning_rate = 0.001f;
int display_freq = 100; // frequency of displaying the training results
int h1 = 200; // number of nodes in the 1st hidden layer
```
3. **Building the neural network**

Let's write some functions to help build the computation graph.

**variables**: We need to define two variables `W` and `b` to construct our linear model. We use TensorFlow variables of proper size and initialization to define them.
```csharp
// weight variable, initialized from a truncated normal distribution
var in_dim = x.shape[1];
var initer = tf.truncated_normal_initializer(stddev: 0.01f);
var W = tf.get_variable("W_" + name,
                        dtype: tf.float32,
                        shape: (in_dim, num_units),
                        initializer: initer);

// bias variable, initialized to zeros (one per unit)
var initial = tf.constant(0f, num_units);
var b = tf.get_variable("b_" + name,
                        dtype: tf.float32,
                        initializer: initial);
```
**fully-connected layer**: A neural network consists of stacks of fully-connected (dense) layers. Having the weight (`W`) and bias (`b`) variables, a fully-connected layer is defined as `activation(xW + b)`. The complete `fc_layer` function is as below:
```csharp
private Tensor fc_layer(Tensor x, int num_units, string name, bool use_relu = true)
{
    var in_dim = x.shape[1];

    var initer = tf.truncated_normal_initializer(stddev: 0.01f);
    var W = tf.get_variable("W_" + name,
                            dtype: tf.float32,
                            shape: (in_dim, num_units),
                            initializer: initer);

    var initial = tf.constant(0f, num_units);
    var b = tf.get_variable("b_" + name,
                            dtype: tf.float32,
                            initializer: initial);

    var layer = tf.matmul(x, W) + b;
    if (use_relu)
        layer = tf.nn.relu(layer);

    return layer;
}
```
**inputs**: Now we need to define the proper tensors to feed the input into our model. A placeholder is the suitable choice for the input images and corresponding labels. It allows us to change the inputs (images and labels) of the TensorFlow graph.
```csharp
// Placeholders for inputs (x) and outputs (y)
x = tf.placeholder(tf.float32, shape: (-1, img_size_flat), name: "X");
y = tf.placeholder(tf.float32, shape: (-1, n_classes), name: "Y");
```
Placeholder `x` is defined for the images; its shape is set to `[None, img_size_flat]` (`-1` stands for `None` here), where `None` means that the tensor may hold an arbitrary number of images, with each image being a vector of length `img_size_flat`.

Placeholder `y` is the variable for the true labels associated with the images that were input in placeholder `x`. It holds an arbitrary number of labels, and each label is a vector of length `n_classes`, which is 10.
**network layers**: After creating the proper input, we have to pass it to our model. Since we have a neural network, we can stack multiple fully-connected layers using the `fc_layer` method. Note that we do not use any activation function in the last layer (`use_relu: false`). The reason is that we can use `tf.nn.softmax_cross_entropy_with_logits` to calculate the loss.
```csharp
// Create a fully-connected layer with h1 nodes as hidden layer
var fc1 = fc_layer(x, h1, "FC1", use_relu: true);
// Create a fully-connected layer with n_classes nodes as output layer
var output_logits = fc_layer(fc1, n_classes, "OUT", use_relu: false);
```
**loss function**: After creating the network, we have to define the loss and the optimizer that minimizes it; we also calculate `correct_prediction` and `accuracy` to monitor how the model is doing.
```csharp
// Define the loss function, optimizer, and accuracy
var cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels: y, logits: output_logits);
loss = tf.reduce_mean(cross_entropy, name: "loss");
optimizer = tf.train.AdamOptimizer(learning_rate: learning_rate, name: "Adam-op").minimize(loss);
var correct_prediction = tf.equal(tf.argmax(output_logits, 1), tf.argmax(y, 1), name: "correct_pred");
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name: "accuracy");
```
**initialize variables**: We have to invoke a variable initializer operation to initialize all the variables.

```csharp
var init = tf.global_variables_initializer();
```

The complete computation graph looks like this:

![computation graph](_static/nn-graph.png)
4. **Train**

After creating the graph, we can train our model. To train the model, we have to create a session and run the graph in it.
```csharp
// Number of training iterations in each epoch
var num_tr_iter = mnist.train.labels.len / batch_size;
with(tf.Session(), sess =>
{
    sess.run(init);

    float loss_val = 100.0f;
    float accuracy_val = 0f;

    foreach (var epoch in range(epochs))
    {
        print($"Training epoch: {epoch + 1}");
        // Randomly shuffle the training data at the beginning of each epoch
        var (x_train, y_train) = randomize(mnist.train.images, mnist.train.labels);

        foreach (var iteration in range(num_tr_iter))
        {
            var start = iteration * batch_size;
            var end = (iteration + 1) * batch_size;
            var (x_batch, y_batch) = get_next_batch(x_train, y_train, start, end);

            // Run optimization op (backprop)
            sess.run(optimizer, new FeedItem(x, x_batch), new FeedItem(y, y_batch));

            if (iteration % display_freq == 0)
            {
                // Calculate and display the batch loss and accuracy
                var result = sess.run(new[] { loss, accuracy }, new FeedItem(x, x_batch), new FeedItem(y, y_batch));
                loss_val = result[0];
                accuracy_val = result[1];
                print($"iter {iteration.ToString("000")}: Loss={loss_val.ToString("0.0000")}, Training Accuracy={accuracy_val.ToString("P")}");
            }
        }

        // Run validation after every epoch
        var results1 = sess.run(new[] { loss, accuracy }, new FeedItem(x, mnist.validation.images), new FeedItem(y, mnist.validation.labels));
        loss_val = results1[0];
        accuracy_val = results1[1];
        print("---------------------------------------------------------");
        print($"Epoch: {epoch + 1}, validation loss: {loss_val.ToString("0.0000")}, validation accuracy: {accuracy_val.ToString("P")}");
        print("---------------------------------------------------------");
    }
});
```
5. **Test**

After the training is done, we have to test our model to see how well it performs on a new dataset. Note that the code below runs in the same session that was used for training.

```csharp
var result = sess.run(new[] { loss, accuracy }, new FeedItem(x, mnist.test.images), new FeedItem(y, mnist.test.labels));
loss_test = result[0];
accuracy_test = result[1];
print("---------------------------------------------------------");
print($"Test loss: {loss_test.ToString("0.0000")}, test accuracy: {accuracy_test.ToString("P")}");
print("---------------------------------------------------------");
```

![training result](_static/nn-result.png)
@@ -2,7 +2,7 @@
In this chapter we will talk about another common data type in TensorFlow: the Placeholder. It is a simplified variable whose required value is supplied by the session when the graph is run; that is, when you build the graph, you don't need to specify the value of that variable, but can delay it until the session starts. In TensorFlow terminology, we then feed data into the graph through these placeholders. The difference between placeholders and constants is that placeholders let you specify values more flexibly, without modifying the code that builds the graph. For example, mathematical constants are suitable for a Constant, while something like a model smoothing value can be specified with a Placeholder.
```csharp
var x = tf.placeholder(tf.int32);
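// a minimal sketch of how the example continues (my assumption):
// build an op on the placeholder, then feed a concrete value at run time.
var y = x * 3;
with(tf.Session(), sess =>
{
    var result = sess.run(y, new FeedItem(x, 2));
    Console.WriteLine(result); // 6
});
```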
@@ -6,36 +6,10 @@ Why do I start the TensorFlow.NET project?
In a few days it will be Christmas 2018. Watching my children grow up and become sensible day by day, I feel that time passes too fast. IT technology is being updated faster than ever, and all kinds of technologies keep emerging: big data, artificial intelligence and blockchain, container technology and microservices, distributed computing and serverless technology are dazzling. The Amazon AI service interface claims that engineers without any machine learning experience can use it, which poured cold water on my idea of having just calmed down to study for two years and planning to switch to an AI architecture role in the future.

TensorFlow is an open source project for machine learning, especially for deep learning. It is used for both research and production at Google. It is designed according to the dataflow programming pattern across a range of tasks. TensorFlow is not just a deep learning library: as long as you can represent your calculation process as a data flow graph, you can use TensorFlow for distributed computing. TensorFlow uses a computational graph to build a computing network while operating on the graph. Users can write their own upper-level models in Python based on TensorFlow, or extend the underlying C++ custom operation code into TensorFlow.

In order to avoid confusion, the specific classes defined in TensorFlow are not translated in this book. For example, Tensor, Graph, Session and Shape will retain their English names.
Terminology 术语:

TF: Google TensorFlow
TF.NET: TensorFlow.NET
Graph: 计算图
Session: 会话
Variable: 变量
Tensor: 张量
Operation: 操作
Node: 节点
@@ -2,13 +2,13 @@
A TensorFlow **session** runs parts of the graph across a set of local and remote devices. A session allows executing graphs or parts of graphs. It allocates resources (on one or more machines) for that and holds the actual values of intermediate results and variables.

### Running Computations in a Session

Let's complete the example in the last chapter, where we only defined the structure of a graph. To run any of the operations, we need to create a session for that graph. The session will also allocate memory to store the current values of the variables.
```csharp
with<Graph>(tf.Graph(), graph =>
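{
    // a minimal sketch of how the example continues (my assumption):
    // define a variable in this graph, then initialize and fetch it in a session.
    var foo = tf.Variable(10, name: "foo");
    var init = tf.global_variables_initializer();

    with(tf.Session(graph), sess =>
    {
        sess.run(init);
        var result = sess.run(foo);
        Console.WriteLine(result); // 10
    });
});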
```
@@ -27,5 +27,3 @@
The values of our variables are only valid within one session. If we try to get a value in another session, TensorFlow will raise an error of `Attempting to use uninitialized value foo`. Of course, we can use the graph in more than one session, because the session copies the graph definition to a new memory area; we just have to initialize the variables again. The values in the new session will be completely independent from the previous one.
@@ -8,16 +8,12 @@
A Tensor holds a multi-dimensional array of elements of a single data type, which is very similar to numpy's ndarray. When the dimension is zero, it can be called a scalar; when the dimension is 2, it can be called a matrix; and when the dimension is greater than 2, it is usually called a tensor. If you are very familiar with numpy, then understanding Tensor will be quite easy.

##### How to create a Tensor?

There are many ways to initialize a Tensor object in TF.NET. It can be initialized from a scalar, a string, a matrix or a tensor.
```csharp
// Create a tensor that holds a scalar value
var t1 = new Tensor(3);
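// Create a tensor that holds a string value
var t2 = new Tensor("Hello! TensorFlow.NET");
// Create a tensor from an NDArray (the values here are my assumption)
var t3 = new Tensor(np.array(1f, 2f, 3f));
```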
@@ -42,8 +38,6 @@ Console.WriteLine($"t1: {t1}, t2: {t2}, t3: {t3}");
TF uses column-major order. If we use NumSharp to generate a 2 x 3 matrix and access the data in order from 0 to 5, we won't get the numbers 1-6 in sequence; instead we get the numbers in the order 1, 4, 2, 5, 3, 6.
```csharp
// Generate a matrix: [[1, 2, 3], [4, 5, 6]]
var nd = np.array(1f, 2f, 3f, 4f, 5f, 6f).reshape(2, 3);
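// Iterating the underlying buffer in storage order prints 1 4 2 5 3 6, per the
// text above (a sketch; Data<T>() is assumed to expose the raw buffer).
foreach (var v in nd.Data<float>())
    Console.Write($"{v} ");
```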
@@ -2,7 +2,7 @@
Variables in TensorFlow are mainly used to represent the mutable parameter values of a machine learning model. Variables can be initialized by the `tf.Variable` function. During the graph computation, the variables are modified by other operations. Variables exist in the session; as long as they are in the same session, other computing nodes in the network can access the same variable values. Variables use lazy loading and will only request memory space when they are used.
```csharp
var x = tf.Variable(10, name: "x");
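// a minimal sketch of the flow described below (initialize, run, fetch):
var init = tf.global_variables_initializer();
using (var session = tf.Session())
{
    session.run(init);
    var result = session.run(x);
    Console.WriteLine(result); // 10
}
```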
@@ -16,4 +16,3 @@ using (var session = tf.Session())
The above code first creates a variable operation, initializes the variable, then runs the session, and finally gets the result. This code is very simple, but it shows the complete process of how TensorFlow operates on variables. When creating a variable, you pass a `tensor` as the initial value to the function `Variable()`. TensorFlow provides a series of operators to initialize the tensor; the initial value can be a constant or a random value.
@@ -28,4 +28,5 @@ Welcome to TensorFlow.NET's documentation!
 LinearRegression
 LogisticRegression
 NearestNeighbor
 ImageRecognition
+NeuralNetwork
@@ -56,12 +56,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -34,14 +34,48 @@ namespace TensorFlowNET.Examples
        int num_classes = 10; // The 10 digits
        int num_features = 784; // Each image is 28x28 pixels
+       float accuracy_test = 0f;

        public bool Run()
        {
            PrepareData();
+           var graph = ImportGraph();
+           with(tf.Session(graph), sess =>
+           {
+               Train(sess);
+           });
+           return accuracy_test > 0.70;
        }

+       public void PrepareData()
+       {
+           mnist = MnistDataSet.read_data_sets("mnist", one_hot: true, train_size: train_size, validation_size: validation_size, test_size: test_size);
+           full_data_x = mnist.train.images;
+
+           // download graph meta data
+           string url = "https://raw.githubusercontent.com/SciSharp/TensorFlow.NET/master/graph/kmeans.meta";
+           Web.Download(url, "graph", "kmeans.meta");
+       }
+
+       public Graph ImportGraph()
+       {
+           var graph = tf.Graph().as_default();
+           tf.train.import_meta_graph("graph/kmeans.meta");
+           return graph;
+       }
+
+       public Graph BuildGraph()
+       {
+           throw new NotImplementedException();
+       }
+
-       public bool Train()
+       public void Train(Session sess)
        {
            var graph = tf.Graph();
            // Input images
            Tensor X = graph.get_operation_by_name("Placeholder"); // tf.placeholder(tf.float32, shape: new TensorShape(-1, num_features));
            // Labels (for assigning a label to a centroid and testing)
@@ -60,89 +94,65 @@ namespace TensorFlowNET.Examples
            Tensor cluster_idx = graph.get_operation_by_name("Squeeze_1");
            NDArray result = null;

-           with(tf.Session(graph), sess =>
-           {
            sess.run(init_vars, new FeedItem(X, full_data_x));
            sess.run(init_op, new FeedItem(X, full_data_x));

            // Training
            var sw = new Stopwatch();
            foreach (var i in range(1, num_steps + 1))
            {
                sw.Restart();
                result = sess.run(new ITensorOrOperation[] { train_op, avg_distance, cluster_idx }, new FeedItem(X, full_data_x));
                sw.Stop();

                if (i % 4 == 0 || i == 1)
                    print($"Step {i}, Avg Distance: {result[1]} Elapse: {sw.ElapsedMilliseconds}ms");
            }

            var idx = result[2].Data<int>();

            // Assign a label to each centroid
            // Count total number of labels per centroid, using the label of each training
            // sample to their closest centroid (given by 'idx')
            var counts = np.zeros((k, num_classes), np.float32);
            sw.Start();
            foreach (var i in range(idx.Length))
            {
                var x = mnist.train.labels[i];
                counts[idx[i]] += x;
            }
            sw.Stop();
            print($"Assign a label to each centroid took {sw.ElapsedMilliseconds}ms");

            // Assign the most frequent label to the centroid
            var labels_map_array = np.argmax(counts, 1);
            var labels_map = tf.convert_to_tensor(labels_map_array);

            // Evaluation ops
            // Lookup: centroid_id -> label
            var cluster_label = tf.nn.embedding_lookup(labels_map, cluster_idx);
            // Compute accuracy
            var correct_prediction = tf.equal(cluster_label, tf.cast(tf.argmax(Y, 1), tf.int32));
            var cast = tf.cast(correct_prediction, tf.float32);
            var accuracy_op = tf.reduce_mean(cast);

            // Test Model
            var (test_x, test_y) = (mnist.test.images, mnist.test.labels);
            result = sess.run(accuracy_op, new FeedItem(X, test_x), new FeedItem(Y, test_y));
-           print($"Test Accuracy: {result}");
-           });
-           return (float)result > 0.70;
+           accuracy_test = result;
+           print($"Test Accuracy: {accuracy_test}");
        }

-       public bool Train()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Test(Session sess)
        {
            throw new NotImplementedException();
        }
@@ -122,12 +122,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -132,30 +132,27 @@ namespace TensorFlowNET.Examples
                initializer_nodes: "");
        }

-       public void Predict()
+       public void Predict(Session sess)
        {
            var graph = new Graph().as_default();
            graph.Import(Path.Join("logistic_regression", "model.pb"));

-           with(tf.Session(graph), sess =>
-           {
            // restoring the model
            // var saver = tf.train.import_meta_graph("logistic_regression/tensorflowModel.ckpt.meta");
            // saver.restore(sess, tf.train.latest_checkpoint('logistic_regression'));
            var pred = graph.OperationByName("Softmax");
            var output = pred.outputs[0];
            var x = graph.OperationByName("Placeholder");
            var input = x.outputs[0];

            // predict
            var (batch_xs, batch_ys) = mnist.train.next_batch(10);
            var results = sess.run(output, new FeedItem(input, batch_xs[np.arange(1)]));

            if (results.argmax() == (batch_ys[0] as NDArray).argmax())
                print("predicted OK!");
            else
                throw new ValueError("predict error, should be 90% accuracy");
-           });
        }

        public Graph ImportGraph()
@@ -168,12 +165,12 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       bool IExample.Predict()
+       public void Test(Session sess)
        {
            throw new NotImplementedException();
        }
@@ -189,12 +189,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -86,12 +86,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -162,12 +162,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -171,12 +171,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -50,12 +50,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -29,9 +29,10 @@ namespace TensorFlowNET.Examples
        /// Build dataflow graph, train and predict
        /// </summary>
        /// <returns></returns>
-       bool Train();
+       void Train(Session sess);
+       void Test(Session sess);

-       bool Predict();
+       void Predict(Session sess);

        Graph ImportGraph();
@@ -37,16 +37,21 @@ namespace TensorFlowNET.Examples.ImageProcess
        Operation optimizer;
        int display_freq = 100;
        float accuracy_test = 0f;
+       float loss_test = 1f;

        public bool Run()
        {
-           bool successful = false;
            PrepareData();
            BuildGraph();
-           successful = Train();
-           return successful;
+           with(tf.Session(), sess =>
+           {
+               Train(sess);
+               Test(sess);
+           });
+           return loss_test < 0.09 && accuracy_test > 0.95;
        }

        public Graph BuildGraph()
@@ -98,61 +103,67 @@ namespace TensorFlowNET.Examples.ImageProcess
        public Graph ImportGraph() => throw new NotImplementedException();

-       public bool Predict() => throw new NotImplementedException();
+       public void Predict(Session sess) => throw new NotImplementedException();

        public void PrepareData()
        {
            mnist = MnistDataSet.read_data_sets("mnist", one_hot: true);
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            // Number of training iterations in each epoch
            var num_tr_iter = mnist.train.labels.len / batch_size;

-           return with(tf.Session(), sess =>
-           {
            var init = tf.global_variables_initializer();
            sess.run(init);

            float loss_val = 100.0f;
            float accuracy_val = 0f;

            foreach (var epoch in range(epochs))
            {
                print($"Training epoch: {epoch + 1}");
                // Randomly shuffle the training data at the beginning of each epoch
                var (x_train, y_train) = randomize(mnist.train.images, mnist.train.labels);

                foreach (var iteration in range(num_tr_iter))
                {
                    var start = iteration * batch_size;
                    var end = (iteration + 1) * batch_size;
                    var (x_batch, y_batch) = get_next_batch(x_train, y_train, start, end);

                    // Run optimization op (backprop)
                    sess.run(optimizer, new FeedItem(x, x_batch), new FeedItem(y, y_batch));

                    if (iteration % display_freq == 0)
                    {
                        // Calculate and display the batch loss and accuracy
                        var result = sess.run(new[] { loss, accuracy }, new FeedItem(x, x_batch), new FeedItem(y, y_batch));
                        loss_val = result[0];
                        accuracy_val = result[1];
                        print($"iter {iteration.ToString("000")}: Loss={loss_val.ToString("0.0000")}, Training Accuracy={accuracy_val.ToString("P")}");
                    }
                }

                // Run validation after every epoch
                var results1 = sess.run(new[] { loss, accuracy }, new FeedItem(x, mnist.validation.images), new FeedItem(y, mnist.validation.labels));
                loss_val = results1[0];
                accuracy_val = results1[1];
                print("---------------------------------------------------------");
                print($"Epoch: {epoch + 1}, validation loss: {loss_val.ToString("0.0000")}, validation accuracy: {accuracy_val.ToString("P")}");
                print("---------------------------------------------------------");
            }
-
-           return accuracy_val > 0.95;
-           });
        }

+       public void Test(Session sess)
+       {
+           var result = sess.run(new[] { loss, accuracy }, new FeedItem(x, mnist.test.images), new FeedItem(y, mnist.test.labels));
+           loss_test = result[0];
+           accuracy_test = result[1];
+           print("---------------------------------------------------------");
+           print($"Test loss: {loss_test.ToString("0.0000")}, test accuracy: {accuracy_test.ToString("P")}");
+           print("---------------------------------------------------------");
+       }

        private (NDArray, NDArray) randomize(NDArray x, NDArray y)
@@ -68,12 +68,17 @@ namespace TensorFlowNET.Examples.ImageProcess
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -125,12 +125,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -118,12 +118,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -13,7 +13,6 @@ using static Tensorflow.Python;
namespace TensorFlowNET.Examples
{
    public class ObjectDetection : IExample
    {
        public bool Enabled { get; set; } = true;
@@ -155,12 +154,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

+       public void Predict(Session sess)
+       {
+           throw new NotImplementedException();
+       }

-       public bool Predict()
+       public void Test(Session sess)
        {
            throw new NotImplementedException();
        }
@@ -681,12 +681,17 @@ namespace TensorFlowNET.Examples.ImageProcess
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -148,12 +148,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -37,14 +37,18 @@ namespace TensorFlowNET.Examples
        private const int CHAR_MAX_LEN = 1014;

        protected float loss_value = 0;
+       double max_accuracy = 0;
        int vocabulary_size = 50000;
        NDArray train_x, valid_x, train_y, valid_y;

        public bool Run()
        {
            PrepareData();
+           var graph = IsImportingGraph ? ImportGraph() : BuildGraph();
+           with(tf.Session(graph), sess => Train(sess));

-           return Train();
+           return max_accuracy > 0.9;
        }
        // TODO: this originally is an SKLearn utility function. it randomizes train and test which we don't do here
@@ -235,7 +239,6 @@
            var train_batches = batch_iter(train_x, train_y, BATCH_SIZE, NUM_EPOCHS);
            var num_batches_per_epoch = (len(train_x) - 1) / BATCH_SIZE + 1;
-           double max_accuracy = 0;

            Tensor is_training = graph.OperationByName("is_training");
            Tensor model_x = graph.OperationByName("x");
@@ -301,13 +304,17 @@
            return max_accuracy > 0.9;
        }

-       public bool Train()
+       public void Train(Session sess)
        {
-           var graph = IsImportingGraph ? ImportGraph() : BuildGraph();
-           return with(tf.Session(graph), sess => Train(sess, graph));
+           Train(sess, sess.graph);
        }

+       public void Predict(Session sess)
+       {
+           throw new NotImplementedException();
+       }

-       public bool Predict()
+       public void Test(Session sess)
        {
            throw new NotImplementedException();
        }
@@ -44,12 +44,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -40,12 +40,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -217,12 +217,17 @@ namespace TensorFlowNET.Examples.Text.NER
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -16,7 +16,7 @@ namespace TensorFlowNET.Examples
        public bool IsImportingGraph { get; set; } = false;

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }
@@ -41,7 +41,12 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -280,12 +280,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }
@@ -214,12 +214,17 @@ namespace TensorFlowNET.Examples
            throw new NotImplementedException();
        }

-       public bool Train()
+       public void Train(Session sess)
        {
            throw new NotImplementedException();
        }

-       public bool Predict()
+       public void Predict(Session sess)
        {
            throw new NotImplementedException();
        }
+
+       public void Test(Session sess)
+       {
+           throw new NotImplementedException();
+       }