## Introduction
This is a PyTorch implementation of the paper [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882).
* MR dataset, non-static model (word2vec trained by Mikolov et al. (2013) on 100 billion words of Google News)
* It can be run on both CPU and GPU
* The best accuracy is 82.61%, which is better than the 81.5% reported in the paper
(by Jingyuan Liu @ Fudan University; Email: fdjingyuan@outlook.com. Discussion is welcome!)

## Requirement
* python 3.6
* pytorch > 0.1
* numpy
* gensim

## Run
STEP 1
Install the required packages, e.g. gensim (the other packages listed above are installed the same way):
```
pip install gensim
```

STEP 2
Download the MR dataset and the word2vec resources:
* MR dataset: https://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz
* word2vec: https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit

Since the word2vec file is larger than 1.5 GB, it is not included in this repository. After downloading it, remember to modify the path in the function `def word_embeddings(path = './GoogleNews-vectors-negative300.bin/'):`.
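Once downloaded, the pre-trained vectors can be loaded with gensim's `KeyedVectors` and turned into an embedding matrix for the model. A minimal sketch (the function and variable names here are illustrative, not the repo's actual code):

```python
import numpy as np

def build_embedding_matrix(vocab, vectors, dim=300, seed=0):
    """Build a (len(vocab), dim) matrix: pre-trained vectors where
    available, random initialization for unknown words (as in the
    CNN-non-static setup). `vectors` is any mapping that supports
    `word in vectors` and `vectors[word]`, e.g. gensim KeyedVectors."""
    rng = np.random.RandomState(seed)
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        if word in vectors:
            matrix[i] = vectors[word]
        else:
            # Unknown words: small uniform init, as in Kim (2014)
            matrix[i] = rng.uniform(-0.25, 0.25, dim)
    return matrix

# With the real file (path as in word_embeddings above):
# from gensim.models import KeyedVectors
# vectors = KeyedVectors.load_word2vec_format(
#     './GoogleNews-vectors-negative300.bin/', binary=True)
```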

STEP 3
Train the model:
```
python train.py
```
You will see information printed to the screen, like:
```
Epoch [1/20], Iter [100/192] Loss: 0.7008
Test Accuracy: 71.869159 %
Epoch [2/20], Iter [100/192] Loss: 0.5957
Test Accuracy: 75.700935 %
Epoch [3/20], Iter [100/192] Loss: 0.4934
Test Accuracy: 78.130841 %

......
Epoch [20/20], Iter [100/192] Loss: 0.0364
Test Accuracy: 81.495327 %
Best Accuracy: 82.616822 %
Best Model: models/cnn.pkl
```

## Hyperparameters
According to the paper and my experiments, I set:

|Epoch|Kernel Size|Dropout|Learning Rate|Batch Size|
|---|---|---|---|---|
|20|(h, 300, 100)|0.5|0.0001|50|

where h = [3, 4, 5].
If the accuracy does not improve, the learning rate is multiplied by 0.8.
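The decay rule above can be sketched as a small helper (a minimal illustration with hypothetical names, not necessarily the repo's actual code):

```python
def decayed_lr(lr, current_acc, best_acc, factor=0.8):
    """Return the new learning rate: multiply by `factor` when the
    test accuracy has not improved over the best seen so far."""
    return lr * factor if current_acc <= best_acc else lr

# With a PyTorch optimizer, the new rate would be applied by setting
# param_group['lr'] for each group in optimizer.param_groups.
```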

## Result
I only tried one dataset: MR. (The paper also uses 6 other datasets: SST-1, SST-2, TREC, CR, MPQA.)
There are four models in the paper: CNN-rand, CNN-static, CNN-non-static, CNN-multichannel.
I implemented CNN-non-static: a model initialized with pre-trained word2vec vectors, in which all words (including unknown ones, which are randomly initialized) are fine-tuned for each task.
(Among the four models it has almost the best performance and is the most difficult to implement.)
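The CNN-non-static architecture described above (kernel sizes 3/4/5, 100 feature maps each, dropout 0.5, fine-tuned embeddings) can be sketched in PyTorch roughly as follows; class and parameter names here are illustrative, not the repo's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, num_classes=2,
                 kernel_sizes=(3, 4, 5), num_filters=100, dropout=0.5):
        super().__init__()
        # Non-static: the embedding layer is trainable, so pre-trained
        # word2vec vectors copied into it get fine-tuned during training.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                      # x: (batch, seq_len)
        e = self.embedding(x).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Convolution + ReLU + max-over-time pooling per kernel size
        pooled = [F.relu(c(e)).max(dim=2).values for c in self.convs]
        h = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(h)
```

Pre-trained vectors would be loaded with something like `model.embedding.weight.data.copy_(torch.from_numpy(matrix))` before training.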

|Dataset|Class Size|Best Result|Kim's Paper Result|
|---|---|---|---|
|MR|2|82.617% (CNN-non-static)|81.5% (CNN-non-static)|

## Reference
* [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882)
* https://github.com/Shawn1993/cnn-text-classification-pytorch
* https://github.com/junwang4/CNN-sentence-classification-pytorch-2017/blob/master/utils.py