## Introduction
This is a PyTorch implementation of the paper [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882).
* MR dataset, non-static model (word2vec trained by Mikolov et al. (2013) on 100 billion words of Google News)
* It runs on both CPU and GPU
* The best accuracy is 82.62%, which is higher than the 81.5% reported in the paper

(By Jingyuan Liu, Fudan University; email: fdjingyuan@outlook.com. Discussion is welcome!)

## Requirements
* Python 3.6
* PyTorch > 0.1
* numpy
* gensim

## Run
STEP 1
Install the required packages, for example gensim (the other packages are installed the same way):
```
pip install gensim
```

STEP 2
Download the MR dataset and the word2vec vectors:
* MR dataset: download it from https://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz
* word2vec: download the file from https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit

Because the word2vec file is larger than 1.5 GB, it is not included in this repository. After downloading it, remember to update the path in the function `def word_embeddings(path = './GoogleNews-vectors-negative300.bin/'):`.
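
For orientation, here is a minimal sketch of how vectors in the binary word2vec format can be loaded with gensim. It is not the repository's `word_embeddings` function: the name `load_word2vec` and the default path without a trailing slash are assumptions made for illustration.

```python
import os


def load_word2vec(path="./GoogleNews-vectors-negative300.bin"):
    """Load pre-trained word2vec vectors stored in the binary word2vec format.

    `load_word2vec` is a hypothetical helper, not part of this repository.
    """
    # Fail early with a clear message if the large file was not downloaded.
    if not os.path.exists(path):
        raise FileNotFoundError(f"word2vec file not found: {path}")

    # Imported lazily so the path check above works even without gensim.
    from gensim.models import KeyedVectors

    # binary=True because the Google News vectors ship as a .bin file.
    return KeyedVectors.load_word2vec_format(path, binary=True)
```

With the file in place, `load_word2vec()["movie"]` should return the 300-dimensional vector for that word. Loading the full file takes several gigabytes of memory.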


STEP 3
Train the model:
```
python train.py
```
Training progress is printed to the screen, for example:
```
Epoch [1/20], Iter [100/192] Loss: 0.7008
Test Accuracy: 71.869159 %
Epoch [2/20], Iter [100/192] Loss: 0.5957
Test Accuracy: 75.700935 %
Epoch [3/20], Iter [100/192] Loss: 0.4934
Test Accuracy: 78.130841 %

......
Epoch [20/20], Iter [100/192] Loss: 0.0364
Test Accuracy: 81.495327 %
Best Accuracy: 82.616822 %
Best Model: models/cnn.pkl
```

## Hyperparameters
Based on the paper and my own experiments, I set:

|Epoch|Kernel Size|dropout|learning rate|batch size|
|---|---|---|---|---|
|20|\(h,300,100\)|0.5|0.0001|50|
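
The kernel-size entry can be read as (kernel height h, embedding width 300, 100 filters), with h taking the values 3, 4 and 5. The sketch below works through the resulting shapes in plain Python. It assumes stride 1 and no padding, as in the paper; the function names are hypothetical and not taken from `train.py`.

```python
KERNEL_HEIGHTS = (3, 4, 5)  # h values from this README
EMBED_DIM = 300             # word2vec vector width
NUM_FILTERS = 100           # filters per kernel height


def conv_positions(sentence_len, h):
    """Number of window positions for a kernel of height h (stride 1, no padding)."""
    if sentence_len < h:
        raise ValueError("sentence must be at least as long as the kernel height")
    return sentence_len - h + 1


def feature_map_shapes(sentence_len):
    """Shape (filters, positions) of the feature maps for each kernel height."""
    return {h: (NUM_FILTERS, conv_positions(sentence_len, h)) for h in KERNEL_HEIGHTS}


def pooled_feature_size():
    """Max-over-time pooling keeps one value per filter; the results are concatenated."""
    return NUM_FILTERS * len(KERNEL_HEIGHTS)


def conv_parameter_count():
    """Weights (h * EMBED_DIM per filter) plus one bias per filter."""
    return sum(NUM_FILTERS * (h * EMBED_DIM + 1) for h in KERNEL_HEIGHTS)


print(feature_map_shapes(20))   # {3: (100, 18), 4: (100, 17), 5: (100, 16)}
print(pooled_feature_size())    # 300
print(conv_parameter_count())   # 360300
```

Whatever the sentence length, the classifier after pooling always sees a 300-dimensional feature vector, which is where dropout is applied in the paper.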

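Here h takes the values [3, 4, 5], 300 is the word-vector dimension, and 100 is the number of filters for each kernel height.

If the accuracy does not improve, the learning rate is multiplied by 0.8.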
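
The learning-rate rule can be sketched as a small helper. This is an illustration rather than the code in `train.py`: the class name `DecayOnPlateau` is hypothetical, and it assumes that "not improved" means "not higher than the best accuracy seen so far".

```python
class DecayOnPlateau:
    """Multiply the learning rate by `factor` whenever accuracy fails to improve."""

    def __init__(self, lr=1e-4, factor=0.8):
        self.lr = lr
        self.factor = factor
        self.best_accuracy = float("-inf")

    def step(self, accuracy):
        """Record one evaluation result and return the learning rate to use next."""
        if accuracy > self.best_accuracy:
            # Improvement: remember the new best and keep the current rate.
            self.best_accuracy = accuracy
        else:
            # No improvement: shrink the rate (assumed comparison against the best so far).
            self.lr *= self.factor
        return self.lr


schedule = DecayOnPlateau(lr=1e-4, factor=0.8)
for acc in (71.87, 75.70, 75.10):
    new_lr = schedule.step(acc)
    # With a PyTorch optimizer, the new rate could be applied like this:
    # for group in optimizer.param_groups:
    #     group["lr"] = new_lr
```

After the three evaluations in the example, the rate has been reduced once, because the third accuracy is below the best one.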

## Result
I tried only one dataset: MR. (The other six datasets in the paper are SST-1, SST-2, Subj, TREC, CR, and MPQA.)
There are four models in the paper: CNN-rand, CNN-static, CNN-non-static, and CNN-multichannel.
I implemented CNN-non-static: a model with pre-trained vectors from word2vec in which all word vectors, including the randomly initialized vectors for unknown words, are fine-tuned for each task.
Among the four models, it has almost the best performance and is the most difficult to implement.
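
To make the non-static setup concrete, the following sketch builds the initial embedding matrix: known words copy their pre-trained vector and unknown words get a random one. It is written in plain Python for illustration only. The function name, the uniform range of ±0.25, and the seed are assumptions, not values taken from this repository.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim=300, scale=0.25, seed=0):
    """Return (matrix, num_pretrained) with one row per word in `vocab`.

    `pretrained` is any mapping that supports `word in pretrained` and
    `pretrained[word]`, for example a dict of word vectors.
    """
    rng = random.Random(seed)
    matrix = []
    num_pretrained = 0
    for word in vocab:
        if word in pretrained:
            # Known word: start from the pre-trained vector.
            matrix.append(list(pretrained[word]))
            num_pretrained += 1
        else:
            # Unknown word: random initialization (the range is an assumption).
            matrix.append([rng.uniform(-scale, scale) for _ in range(dim)])
    return matrix, num_pretrained


toy_vectors = {"good": [0.1, 0.2, 0.3, 0.4]}
matrix, hits = build_embedding_matrix(["good", "zzz"], toy_vectors, dim=4)
```

In recent PyTorch versions, such a matrix could be turned into a trainable layer with `nn.Embedding.from_pretrained(torch.tensor(matrix), freeze=False)`. Leaving the weights unfrozen is what makes the model non-static, since every row is then updated during training.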

|Dataset|Class Size|Best Result|Kim's Paper Result|
|---|---|---|---|
|MR|2|82.617% (CNN-non-static)|81.5% (CNN-non-static)|


## Reference
* [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882)
* https://github.com/Shawn1993/cnn-text-classification-pytorch
* https://github.com/junwang4/CNN-sentence-classification-pytorch-2017/blob/master/utils.py