## Introduction
This is a PyTorch implementation of the paper [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882).
* MR dataset, non-static model (word2vec trained by Mikolov et al. (2013) on 100 billion words of Google News)
* It can run on both CPU and GPU
* The best accuracy is 82.61%, which is better than the 81.5% reported in the paper

(by Jingyuan Liu @ Fudan University; Email: fdjingyuan@outlook.com. Discussion is welcome!)
## Requirements
* python 3.6
* pytorch > 0.1
* numpy
* gensim
## Run
STEP 1
Install the required packages, e.g. gensim (the other packages are installed the same way):
```
pip install gensim
```
STEP 2
Download the MR dataset and the word2vec resources:
* MR dataset: download from https://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz
* word2vec: download from https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit

Since the word2vec file is larger than 1.5 GB, it is not included in this repository. After downloading it, remember to modify the path in the function `def word_embeddings(path = './GoogleNews-vectors-negative300.bin/'):`.
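The way the pretrained vectors end up in the model can be sketched roughly as below. This is an illustration, not the repository's actual `word_embeddings` code: the `build_embedding_matrix` helper and the tiny stand-in dict are assumptions made so the sketch runs without the 1.5 GB file; in practice you would load the real vectors with gensim's `KeyedVectors.load_word2vec_format`.

```python
import numpy as np

# In practice, load the real vectors with gensim, e.g.:
#   from gensim.models import KeyedVectors
#   w2v = KeyedVectors.load_word2vec_format(
#       './GoogleNews-vectors-negative300.bin', binary=True)
# Here a tiny dict stands in for the 1.5 GB model so the sketch is runnable.
DIM = 300
w2v = {"movie": np.ones(DIM), "good": np.full(DIM, 0.5)}

def build_embedding_matrix(vocab, w2v, dim=DIM, seed=0):
    """Map each vocabulary word to its pretrained vector; words missing
    from word2vec are randomly initialized, as in Kim (2014)."""
    rng = np.random.RandomState(seed)
    matrix = np.zeros((len(vocab), dim))
    for i, word in enumerate(vocab):
        if word in w2v:
            matrix[i] = w2v[word]
        else:
            # Uniform init in [-0.25, 0.25] is a common choice for the
            # unknown words in this paper's setting.
            matrix[i] = rng.uniform(-0.25, 0.25, dim)
    return matrix

vocab = ["movie", "good", "unseenword"]
emb = build_embedding_matrix(vocab, w2v)
print(emb.shape)  # (3, 300)
```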
STEP 3
Train the model:
```
python train.py
```
You will see information like the following printed to the screen:
```
Epoch [1/20], Iter [100/192] Loss: 0.7008
Test Accuracy: 71.869159 %
Epoch [2/20], Iter [100/192] Loss: 0.5957
Test Accuracy: 75.700935 %
Epoch [3/20], Iter [100/192] Loss: 0.4934
Test Accuracy: 78.130841 %
......
Epoch [20/20], Iter [100/192] Loss: 0.0364
Test Accuracy: 81.495327 %
Best Accuracy: 82.616822 %
Best Model: models/cnn.pkl
```
## Hyperparameters
According to the paper and my experiments, I set:

|Epoch|Kernel Size|Dropout|Learning Rate|Batch Size|
|---|---|---|---|---|
|20|(h, 300, 100)|0.5|0.0001|50|

where h = [3, 4, 5].
If the accuracy does not improve, the learning rate is multiplied by 0.8.
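That decay rule can be sketched as a small helper. This is only a sketch of the rule as stated above; the actual logic in `train.py` may differ, and `step_learning_rate` is a name invented here for illustration.

```python
def step_learning_rate(lr, accuracy, best_accuracy, decay=0.8):
    """Multiply the learning rate by `decay` whenever the test accuracy
    fails to improve on the best accuracy seen so far."""
    if accuracy > best_accuracy:
        return lr, accuracy           # improvement: keep lr, update best
    return lr * decay, best_accuracy  # no improvement: decay the lr

# Example: accuracy stalls on the second evaluation, so lr drops to 0.8x.
lr, best = 0.0001, 0.0
for acc in [71.87, 71.87, 75.70]:
    lr, best = step_learning_rate(lr, acc, best)
print(lr, best)  # 8e-05 75.7
```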
## Result
I only tried one dataset: MR. (The paper covers 6 other datasets: SST-1, SST-2, TREC, CR, MPQA.)
There are four models in the paper: CNN-rand, CNN-static, CNN-non-static, and CNN-multichannel.
I implemented CNN-non-static: a model with pre-trained vectors from word2vec, in which all words, including unknown ones that are randomly initialized, and the pretrained vectors are fine-tuned for each task.
(Among the four models, it has almost the best performance and is the most difficult to implement.)
|Dataset|Class Size|Best Result|Kim's Paper Result|
|---|---|---|---|
|MR|2|82.617% (CNN-non-static)|81.5% (CNN-non-static)|
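To make the (h, 300, 100) kernel sizes concrete: each filter of height h slides over the 300-dimensional word embeddings, producing 100 feature maps per height, which are max-pooled over time and concatenated into a single sentence vector. The numpy sketch below illustrates the shape arithmetic only; the names and the loop-based convolution are assumptions for readability, not the repository's actual model code.

```python
import numpy as np

rng = np.random.RandomState(0)
SENT_LEN, EMB_DIM, N_MAPS = 20, 300, 100

# A sentence as a (length, embedding_dim) matrix of word vectors.
sentence = rng.randn(SENT_LEN, EMB_DIM)

def conv_max_over_time(sentence, h, n_maps, rng):
    """Apply n_maps filters of height h over the full embedding width,
    then take the max over all valid time positions, as in Kim (2014)."""
    filters = rng.randn(n_maps, h, sentence.shape[1])
    n_pos = sentence.shape[0] - h + 1
    feats = np.empty((n_maps, n_pos))
    for t in range(n_pos):
        window = sentence[t:t + h]                      # (h, 300)
        feats[:, t] = (filters * window).sum(axis=(1, 2))
    return feats.max(axis=1)                            # (n_maps,)

# Three filter heights, 100 maps each -> 300-dim sentence representation.
features = np.concatenate(
    [conv_max_over_time(sentence, h, N_MAPS, rng) for h in (3, 4, 5)])
print(features.shape)  # (300,)
```

The concatenated feature vector is what the final dropout + softmax layer classifies.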
## Reference
* [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882)
* https://github.com/Shawn1993/cnn-text-classification-pytorch
* https://github.com/junwang4/CNN-sentence-classification-pytorch-2017/blob/master/utils.py