|
- {
- "nbformat": 4,
- "nbformat_minor": 0,
- "metadata": {
- "accelerator": "GPU",
- "colab": {
- "name": "SHARE MLSpring2021 - HW2-1.ipynb",
- "provenance": [],
- "collapsed_sections": []
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- }
- },
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "OYlaRwNu7ojq"
- },
- "source": [
- "# **Homework 2-1 Phoneme Classification**"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "emUd7uS7crTz"
- },
- "source": [
- "## The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT)\n",
- "The TIMIT corpus of read speech was designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems.\n",
- "\n",
- "This homework is a multiclass classification task:\n",
- "we will train a deep neural network classifier to predict the phoneme of each frame in the TIMIT speech corpus.\n",
- "\n",
- "link: https://academictorrents.com/details/34e2b78745138186976cbc27939b1b34d18bd5b3"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "KVUGfWTo7_Oj"
- },
- "source": [
- "## Download Data\n",
- "Download the data from Google Drive, then unzip it.\n",
- "\n",
- "You should have `timit_11/train_11.npy`, `timit_11/train_label_11.npy`, and `timit_11/test_11.npy` after running this block.<br><br>\n",
- "`timit_11/`\n",
- "- `train_11.npy`: training data<br>\n",
- "- `train_label_11.npy`: training labels<br>\n",
- "- `test_11.npy`: testing data<br><br>\n",
- "\n",
- "**Note: if the Google Drive link is dead, you can download the data directly from Kaggle and upload it to the workspace.**\n",
- "\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "OzkiMEcC3Foq",
- "outputId": "4308c64c-6885-4d1c-8eb7-a2d9b8038401"
- },
- "source": [
- "!gdown --id '1HPkcmQmFGu-3OknddKIa5dNDsR05lIQR' --output data.zip\n",
- "!unzip data.zip\n",
- "!ls "
- ],
- "execution_count": null,
- "outputs": [
- {
- "output_type": "stream",
- "text": [
- "Downloading...\n",
- "From: https://drive.google.com/uc?id=1HPkcmQmFGu-3OknddKIa5dNDsR05lIQR\n",
- "To: /content/data.zip\n",
- "372MB [00:03, 121MB/s]\n",
- "Archive: data.zip\n",
- " creating: timit_11/\n",
- " inflating: timit_11/train_11.npy \n",
- " inflating: timit_11/test_11.npy \n",
- " inflating: timit_11/train_label_11.npy \n",
- "data.zip sample_data timit_11\n"
- ],
- "name": "stdout"
- }
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "_L_4anls8Drv"
- },
- "source": [
- "## Preparing Data\n",
- "Load the training and testing data from the `.npy` files (NumPy arrays)."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "IJjLT8em-y9G",
- "outputId": "8edc6bfe-7511-447f-f239-00b96dba6dcf"
- },
- "source": [
- "import numpy as np\n",
- "\n",
- "print('Loading data ...')\n",
- "\n",
- "data_root='./timit_11/'\n",
- "train = np.load(data_root + 'train_11.npy')\n",
- "train_label = np.load(data_root + 'train_label_11.npy')\n",
- "test = np.load(data_root + 'test_11.npy')\n",
- "\n",
- "print('Size of training data: {}'.format(train.shape))\n",
- "print('Size of testing data: {}'.format(test.shape))"
- ],
- "execution_count": null,
- "outputs": [
- {
- "output_type": "stream",
- "text": [
- "Loading data ...\n",
- "Size of training data: (1229932, 429)\n",
- "Size of testing data: (451552, 429)\n"
- ],
- "name": "stdout"
- }
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "us5XW_x6udZQ"
- },
- "source": [
- "## Create Dataset"
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "Fjf5EcmJtf4e"
- },
- "source": [
- "import torch\n",
- "from torch.utils.data import Dataset\n",
- "\n",
- "class TIMITDataset(Dataset):\n",
- "    def __init__(self, X, y=None):\n",
- "        self.data = torch.from_numpy(X).float()\n",
- "        if y is not None:\n",
- "            y = y.astype(np.int64)  # np.int was removed in NumPy 1.24\n",
- "            self.label = torch.LongTensor(y)\n",
- "        else:\n",
- "            self.label = None\n",
- "\n",
- "    def __getitem__(self, idx):\n",
- "        if self.label is not None:\n",
- "            return self.data[idx], self.label[idx]\n",
- "        else:\n",
- "            return self.data[idx]\n",
- "\n",
- "    def __len__(self):\n",
- "        return len(self.data)\n"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "otIC6WhGeh9v"
- },
- "source": [
- "Split the labeled data into a training set and a validation set. You can modify the variable `VAL_RATIO` to change the fraction of the data held out for validation."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "sYqi_lAuvC59",
- "outputId": "13dabe63-4849-47ee-fe04-57427b9d601c"
- },
- "source": [
- "VAL_RATIO = 0.2\n",
- "\n",
- "percent = int(train.shape[0] * (1 - VAL_RATIO))\n",
- "train_x, train_y, val_x, val_y = train[:percent], train_label[:percent], train[percent:], train_label[percent:]\n",
- "print('Size of training set: {}'.format(train_x.shape))\n",
- "print('Size of validation set: {}'.format(val_x.shape))"
- ],
- "execution_count": null,
- "outputs": [
- {
- "output_type": "stream",
- "text": [
- "Size of training set: (983945, 429)\n",
- "Size of validation set: (245987, 429)\n"
- ],
- "name": "stdout"
- }
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "nbCfclUIgMTX"
- },
- "source": [
- "Create a data loader from each dataset. Feel free to tweak the variable `BATCH_SIZE` here."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "RUCbQvqJurYc"
- },
- "source": [
- "BATCH_SIZE = 64\n",
- "\n",
- "from torch.utils.data import DataLoader\n",
- "\n",
- "train_set = TIMITDataset(train_x, train_y)\n",
- "val_set = TIMITDataset(val_x, val_y)\n",
- "train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)  # only shuffle the training data\n",
- "val_loader = DataLoader(val_set, batch_size=BATCH_SIZE, shuffle=False)"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "_SY7X0lUgb50"
- },
- "source": [
- "Clean up the unneeded variables to save memory.<br>\n",
- "\n",
- "**Note: if you need these variables later, remove this block or clean them up at a later point.<br>The data is large, so be aware of memory usage in Colab.**"
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "y8rzkGraeYeN",
- "outputId": "dc790996-a43c-4a99-90d4-e7928892a899"
- },
- "source": [
- "import gc\n",
- "\n",
- "del train, train_label, train_x, train_y, val_x, val_y\n",
- "gc.collect()"
- ],
- "execution_count": null,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "50"
- ]
- },
- "metadata": {
- "tags": []
- },
- "execution_count": 6
- }
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "IRqKNvNZwe3V"
- },
- "source": [
- "## Create Model"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FYr1ng5fh9pA"
- },
- "source": [
- "Define the model architecture. You are encouraged to change it and experiment."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "lbZrwT6Ny0XL"
- },
- "source": [
- "import torch\n",
- "import torch.nn as nn\n",
- "\n",
- "class Classifier(nn.Module):\n",
- "    def __init__(self):\n",
- "        super(Classifier, self).__init__()\n",
- "        self.layer1 = nn.Linear(429, 1024)\n",
- "        self.layer2 = nn.Linear(1024, 512)\n",
- "        self.layer3 = nn.Linear(512, 128)\n",
- "        self.out = nn.Linear(128, 39)\n",
- "\n",
- "        self.act_fn = nn.Sigmoid()\n",
- "\n",
- "    def forward(self, x):\n",
- "        x = self.layer1(x)\n",
- "        x = self.act_fn(x)\n",
- "\n",
- "        x = self.layer2(x)\n",
- "        x = self.act_fn(x)\n",
- "\n",
- "        x = self.layer3(x)\n",
- "        x = self.act_fn(x)\n",
- "\n",
- "        x = self.out(x)\n",
- "\n",
- "        return x"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "VRYciXZvPbYh"
- },
- "source": [
- "## Training"
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "y114Vmm3Ja6o"
- },
- "source": [
- "# check device\n",
- "def get_device():\n",
- "    return 'cuda' if torch.cuda.is_available() else 'cpu'"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "sEX-yjHjhGuH"
- },
- "source": [
- "Fix random seeds for reproducibility."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "88xPiUnm0tAd"
- },
- "source": [
- "# fix random seed\n",
- "def same_seeds(seed):\n",
- "    torch.manual_seed(seed)\n",
- "    if torch.cuda.is_available():\n",
- "        torch.cuda.manual_seed(seed)\n",
- "        torch.cuda.manual_seed_all(seed)\n",
- "    np.random.seed(seed)\n",
- "    torch.backends.cudnn.benchmark = False\n",
- "    torch.backends.cudnn.deterministic = True"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "KbBcBXkSp6RA"
- },
- "source": [
- "Feel free to change the training parameters here."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "QTp3ZXg1yO9Y"
- },
- "source": [
- "# fix random seed for reproducibility\n",
- "same_seeds(0)\n",
- "\n",
- "# get device \n",
- "device = get_device()\n",
- "print(f'DEVICE: {device}')\n",
- "\n",
- "# training parameters\n",
- "num_epoch = 20  # number of training epochs\n",
- "learning_rate = 0.0001  # learning rate\n",
- "\n",
- "# the path where the checkpoint is saved\n",
- "model_path = './model.ckpt'\n",
- "\n",
- "# create model, define a loss function, and optimizer\n",
- "model = Classifier().to(device)\n",
- "criterion = nn.CrossEntropyLoss() \n",
- "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "CdMWsBs7zzNs",
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "outputId": "c5ed561e-610d-4a35-d936-fd97adf342a0"
- },
- "source": [
- "# start training\n",
- "\n",
- "best_acc = 0.0\n",
- "for epoch in range(num_epoch):\n",
- "    train_acc = 0.0\n",
- "    train_loss = 0.0\n",
- "    val_acc = 0.0\n",
- "    val_loss = 0.0\n",
- "\n",
- "    # training\n",
- "    model.train()  # set the model to training mode\n",
- "    for i, data in enumerate(train_loader):\n",
- "        inputs, labels = data\n",
- "        inputs, labels = inputs.to(device), labels.to(device)\n",
- "        optimizer.zero_grad()\n",
- "        outputs = model(inputs)\n",
- "        batch_loss = criterion(outputs, labels)\n",
- "        _, train_pred = torch.max(outputs, 1)  # get the index of the class with the highest probability\n",
- "        batch_loss.backward()\n",
- "        optimizer.step()\n",
- "\n",
- "        train_acc += (train_pred.cpu() == labels.cpu()).sum().item()\n",
- "        train_loss += batch_loss.item()\n",
- "\n",
- "    # validation\n",
- "    if len(val_set) > 0:\n",
- "        model.eval()  # set the model to evaluation mode\n",
- "        with torch.no_grad():\n",
- "            for i, data in enumerate(val_loader):\n",
- "                inputs, labels = data\n",
- "                inputs, labels = inputs.to(device), labels.to(device)\n",
- "                outputs = model(inputs)\n",
- "                batch_loss = criterion(outputs, labels)\n",
- "                _, val_pred = torch.max(outputs, 1)  # get the index of the class with the highest probability\n",
- "\n",
- "                val_acc += (val_pred.cpu() == labels.cpu()).sum().item()\n",
- "                val_loss += batch_loss.item()\n",
- "\n",
- "            print('[{:03d}/{:03d}] Train Acc: {:3.6f} Loss: {:3.6f} | Val Acc: {:3.6f} loss: {:3.6f}'.format(\n",
- "                epoch + 1, num_epoch, train_acc/len(train_set), train_loss/len(train_loader), val_acc/len(val_set), val_loss/len(val_loader)\n",
- "            ))\n",
- "\n",
- "            # if the model improves, save a checkpoint at this epoch\n",
- "            if val_acc > best_acc:\n",
- "                best_acc = val_acc\n",
- "                torch.save(model.state_dict(), model_path)\n",
- "                print('saving model with acc {:.3f}'.format(best_acc/len(val_set)))\n",
- "    else:\n",
- "        print('[{:03d}/{:03d}] Train Acc: {:3.6f} Loss: {:3.6f}'.format(\n",
- "            epoch + 1, num_epoch, train_acc/len(train_set), train_loss/len(train_loader)\n",
- "        ))\n",
- "\n",
- "# if not validating, save the last epoch\n",
- "if len(val_set) == 0:\n",
- "    torch.save(model.state_dict(), model_path)\n",
- "    print('saving model at last epoch')\n"
- ],
- "execution_count": null,
- "outputs": [
- {
- "output_type": "stream",
- "text": [
- "[001/020] Train Acc: 0.467390 Loss: 1.812880 | Val Acc: 0.564884 loss: 1.440870\n",
- "saving model with acc 0.565\n",
- "[002/020] Train Acc: 0.594031 Loss: 1.332670 | Val Acc: 0.629594 loss: 1.209077\n",
- "saving model with acc 0.630\n",
- "[003/020] Train Acc: 0.644419 Loss: 1.154247 | Val Acc: 0.658295 loss: 1.102313\n",
- "saving model with acc 0.658\n",
- "[004/020] Train Acc: 0.672767 Loss: 1.051355 | Val Acc: 0.675568 loss: 1.040186\n",
- "saving model with acc 0.676\n",
- "[005/020] Train Acc: 0.691564 Loss: 0.982245 | Val Acc: 0.683853 loss: 1.004628\n",
- "saving model with acc 0.684\n",
- "[006/020] Train Acc: 0.705731 Loss: 0.930892 | Val Acc: 0.691707 loss: 0.977562\n",
- "saving model with acc 0.692\n",
- "[007/020] Train Acc: 0.716722 Loss: 0.890210 | Val Acc: 0.691016 loss: 0.973670\n",
- "[008/020] Train Acc: 0.726312 Loss: 0.856612 | Val Acc: 0.690207 loss: 0.971627\n",
- "[009/020] Train Acc: 0.734965 Loss: 0.827445 | Val Acc: 0.698561 loss: 0.942904\n",
- "saving model with acc 0.699\n",
- "[010/020] Train Acc: 0.741926 Loss: 0.801676 | Val Acc: 0.698854 loss: 0.946376\n",
- "saving model with acc 0.699\n",
- "[011/020] Train Acc: 0.748191 Loss: 0.779319 | Val Acc: 0.700944 loss: 0.938454\n",
- "saving model with acc 0.701\n",
- "[012/020] Train Acc: 0.754672 Loss: 0.758071 | Val Acc: 0.699423 loss: 0.940523\n",
- "[013/020] Train Acc: 0.759725 Loss: 0.739450 | Val Acc: 0.699728 loss: 0.951068\n",
- "[014/020] Train Acc: 0.765137 Loss: 0.721372 | Val Acc: 0.701903 loss: 0.938658\n",
- "saving model with acc 0.702\n",
- "[015/020] Train Acc: 0.769828 Loss: 0.704748 | Val Acc: 0.701761 loss: 0.937079\n",
- "[016/020] Train Acc: 0.774698 Loss: 0.688990 | Val Acc: 0.702293 loss: 0.938634\n",
- "saving model with acc 0.702\n",
- "[017/020] Train Acc: 0.779358 Loss: 0.674498 | Val Acc: 0.702492 loss: 0.943941\n",
- "saving model with acc 0.702\n",
- "[018/020] Train Acc: 0.783076 Loss: 0.660028 | Val Acc: 0.695195 loss: 0.966189\n",
- "[019/020] Train Acc: 0.787432 Loss: 0.646340 | Val Acc: 0.700708 loss: 0.958220\n",
- "[020/020] Train Acc: 0.791536 Loss: 0.633378 | Val Acc: 0.700643 loss: 0.957066\n"
- ],
- "name": "stdout"
- }
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "1Hi7jTn3PX-m"
- },
- "source": [
- "## Testing"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "NfUECMFCn5VG"
- },
- "source": [
- "Create a testing dataset and load the model from the saved checkpoint."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "1PKjtAScPWtr",
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "outputId": "8c17272b-536a-4692-a95f-a3292766c698"
- },
- "source": [
- "# create testing dataset\n",
- "test_set = TIMITDataset(test, None)\n",
- "test_loader = DataLoader(test_set, batch_size=BATCH_SIZE, shuffle=False)\n",
- "\n",
- "# create model and load weights from checkpoint\n",
- "model = Classifier().to(device)\n",
- "model.load_state_dict(torch.load(model_path))"
- ],
- "execution_count": null,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "<All keys matched successfully>"
- ]
- },
- "metadata": {
- "tags": []
- },
- "execution_count": 12
- }
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "940TtCCdoYd0"
- },
- "source": [
- "Make predictions."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "84HU5GGjPqR0"
- },
- "source": [
- "predict = []\n",
- "model.eval()  # set the model to evaluation mode\n",
- "with torch.no_grad():\n",
- "    for i, data in enumerate(test_loader):\n",
- "        inputs = data\n",
- "        inputs = inputs.to(device)\n",
- "        outputs = model(inputs)\n",
- "        _, test_pred = torch.max(outputs, 1)  # get the index of the class with the highest probability\n",
- "\n",
- "        for y in test_pred.cpu().numpy():\n",
- "            predict.append(y)"
- ],
- "execution_count": null,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "AWDf_C-omElb"
- },
- "source": [
- "Write the predictions to a CSV file.\n",
- "\n",
- "After this block finishes running, download the file `prediction.csv` from the files section on the left-hand side and submit it to Kaggle."
- ]
- },
- {
- "cell_type": "code",
- "metadata": {
- "id": "GuljYSPHcZir"
- },
- "source": [
- "with open('prediction.csv', 'w') as f:\n",
- "    f.write('Id,Class\\n')\n",
- "    for i, y in enumerate(predict):\n",
- "        f.write('{},{}\\n'.format(i, y))"
- ],
- "execution_count": null,
- "outputs": []
- }
- ]
- }
|