  1. <TOC>
  2. # Pre-Trained Image Processing Transformer (IPT)
  3. This repository is an official implementation of the paper "Pre-Trained Image Processing Transformer" from CVPR 2021.
  4. We study low-level computer vision tasks (e.g., denoising, super-resolution and deraining) and develop a new pre-trained model, the image processing transformer (IPT). To fully exploit the capability of the transformer, we use the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multiple heads and multiple tails. In addition, contrastive learning is introduced so that the model adapts well to different image processing tasks. The pre-trained model can therefore be efficiently employed on the desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.
  5. If you find our work useful in your research or publication, please cite our work:
  6. [1] Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, and Wen Gao. **"Pre-trained image processing transformer"**. <i>**CVPR 2021**.</i> [[arXiv](https://arxiv.org/abs/2012.00364)]
  7. @inproceedings{chen2020pre,
  8. title={Pre-trained image processing transformer},
  9. author={Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen},
  10. booktitle={CVPR},
  11. year={2021}
  12. }
  13. ## Model architecture
  14. ### The overall network architecture of IPT is shown below
  15. ![architecture](./image/ipt.png)
  16. ## Dataset
  17. The benchmark datasets can be downloaded as follows:
  18. For super-resolution:
  19. Set5,
  20. [Set14](https://sites.google.com/site/romanzeyde/research-interests),
  21. [B100](https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/),
  22. Urban100.
  23. For denoising:
  24. [CBSD68](https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/).
  25. For deraining:
  26. [Rain100L](https://www.icst.pku.edu.cn/struct/Projects/joint_rain_removal.html)
  27. The result images are converted into YCbCr color space. The PSNR is evaluated on the Y channel only.
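The Y-channel evaluation described above can be sketched as follows. This is a minimal illustration rather than the repository's `src/metrics.py`: it assumes 8-bit RGB inputs, the ITU-R BT.601 luma conversion that is common in super-resolution benchmarks, and no border cropping. The function names `rgb_to_y` and `psnr_y` are hypothetical.

```python
import math

import numpy as np


def rgb_to_y(img):
    """Return the BT.601 luma (Y) channel of an RGB image with values in [0, 255]."""
    img = np.asarray(img, dtype=np.float64)
    # Studio-range luma: Y lies in [16, 235] for 8-bit input.
    return 16.0 + (65.481 * img[..., 0]
                   + 128.553 * img[..., 1]
                   + 24.966 * img[..., 2]) / 255.0


def psnr_y(result, target, peak=255.0):
    """PSNR in dB between two RGB images, computed on the Y channel only."""
    diff = rgb_to_y(result) - rgb_to_y(target)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return math.inf  # identical luma channels
    return 10.0 * math.log10(peak ** 2 / mse)
```

Many evaluation scripts also crop a border of a few pixels before computing PSNR for super-resolution; that step is omitted here because the text does not specify it.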
  28. ## Requirements
  29. ### Hardware (Ascend)
  30. > Prepare hardware environment with Ascend.
  31. ### Framework
  32. > [MindSpore](https://www.mindspore.cn/install/en)
  33. ### For more information, please check the resources below
  34. [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
  35. [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
  36. ## Script Description
  37. > This is the inference script of IPT. Follow the steps below to test image processing tasks such as SR, denoising, and deraining with the corresponding pretrained models.
  38. ### Scripts and Sample Code
  39. ```bash
  40. IPT
  41. ├── eval.py # inference entry
  42. ├── train.py # pre-training entry
  43. ├── train_finetune.py # fine-tuning entry
  44. ├── image
  45. │   └── ipt.png # the illustration of IPT network
  46. ├── readme.md # Readme
  47. ├── scripts
  48. │   ├── run_eval.sh # inference script for all tasks
  49. │   ├── run_distributed.sh # pre-training script for all tasks
  50. │   └── run_finetune_distributed.sh # fine-tuning script for all tasks
  51. └── src
  52.     ├── args.py # options/hyper-parameters of IPT
  53.     ├── data
  54.     │   ├── common.py # common dataset
  55.     │   ├── bicubic.py # scripts for data pre-processing
  56.     │   ├── div2k.py # DIV2K dataset
  57.     │   ├── imagenet.py # Imagenet data for pre-training
  58.     │   └── srdata.py # All dataset
  59.     ├── metrics.py # PSNR calculator
  60.     ├── utils.py # training scripts
  61.     ├── loss.py # contrastive_loss
  62.     └── ipt_model.py # IPT network
  63. ```
  64. ### Script Parameter
  65. > For details about hyperparameters, see src/args.py.
  66. ## Training Process
  67. ### For pre-training
  68. ```bash
  69. python train.py --distribute --imagenet 1 --batch_size 64 --lr 5e-5 --scale 2+3+4+1+1+1 --alltask --react --model vtip --num_queries 6 --chop_new --num_layers 4 --data_train imagenet --dir_data $DATA_PATH --derain --save $SAVE_PATH
  70. ```
  71. > Alternatively, run the following script for all tasks.
  72. ```bash
  73. sh scripts/run_distributed.sh RANK_TABLE_FILE DATA_PATH
  74. ```
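The `--scale 2+3+4+1+1+1` value in the pre-training command has six `+`-separated entries, the same count as `--num_queries 6`. A plausible reading, stated here as an assumption and not confirmed by this README, is one scale factor per task: three super-resolution factors followed by three tasks that leave the resolution unchanged. The helper `parse_scales` below is hypothetical and only illustrates that reading; see `src/args.py` for the actual handling.

```python
def parse_scales(scale_arg):
    """Split a '+'-separated --scale value into a list of integer factors."""
    return [int(part) for part in scale_arg.split("+")]


scales = parse_scales("2+3+4+1+1+1")
num_queries = 6  # value passed as --num_queries in the command above

# Assumption: one query (task embedding) per entry in --scale.
if len(scales) != num_queries:
    raise ValueError("expected one scale entry per query")
```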
  75. ### For fine-tuning
  76. > For SR tasks:
  77. ```bash
  78. python train_finetune.py --distribute --imagenet 0 --batch_size 64 --lr 2e-5 --scale 2+3+4+1+1+1 --model vtip --num_queries 6 --chop_new --num_layers 4 --task_id $TASK_ID --dir_data $DATA_PATH --pth_path $MODEL --epochs 50
  79. ```
  80. > For denoising tasks:
  81. ```bash
  82. python train_finetune.py --distribute --imagenet 0 --batch_size 64 --lr 2e-5 --scale 2+3+4+1+1+1 --model vtip --num_queries 6 --chop_new --num_layers 4 --task_id $TASK_ID --dir_data $DATA_PATH --pth_path $MODEL --denoise --sigma $Noise --epochs 50
  83. ```
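The `--sigma $Noise` flag above presumably selects the noise level (30 or 50 in the results reported in this README). As a rough sketch of how a noisy input could be synthesized from a clean image, the snippet below adds zero-mean Gaussian noise with standard deviation `sigma` on the 0 to 255 intensity scale. The function name `add_gaussian_noise` and the rounding and clipping choices are assumptions made for illustration, not the repository's data pipeline.

```python
import numpy as np


def add_gaussian_noise(img, sigma, seed=None):
    """Return a uint8 copy of `img` corrupted by additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, size=img.shape)
    # Round and clip back to the valid 8-bit range.
    return np.clip(np.round(noisy), 0, 255).astype(np.uint8)


clean = np.full((8, 8, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise(clean, sigma=30, seed=0)
```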
  84. > For deraining tasks:
  85. ```bash
  86. python train_finetune.py --distribute --imagenet 0 --batch_size 64 --lr 2e-5 --scale 2+3+4+1+1+1 --model vtip --num_queries 6 --chop_new --num_layers 4 --task_id $TASK_ID --dir_data $DATA_PATH --pth_path $MODEL --derain --epochs 50
  87. ```
  88. > Alternatively, run the following script for all tasks.
  89. ```bash
  90. sh scripts/run_finetune_distributed.sh RANK_TABLE_FILE DATA_PATH MODEL TASK_ID
  91. ```
  92. ## Evaluation
  93. ### Evaluation Process
  94. > Inference example:
  95. > For SR x4:
  96. ```bash
  97. python eval.py --dir_data $DATA_PATH --data_test $DATA_TEST --test_only --ext img --pth_path $MODEL --task_id $TASK_ID --scale $SCALE
  98. ```
  99. > Alternatively, run the following script for all tasks.
  100. ```bash
  101. sh scripts/run_eval.sh DATA_PATH DATA_TEST MODEL TASK_ID
  102. ```
  103. ### Evaluation Result
  104. Results are reported as PSNR (Peak Signal-to-Noise Ratio) values in the following format.
  105. ```bash
  106. result: {"Mean psnr of Set5 x4 is 32.68"}
  107. ```
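When collecting numbers from several runs, a line in the format shown above can be parsed with a small regular expression. This is a convenience sketch that assumes the output always follows the `Mean psnr of <dataset> x<scale> is <value>` pattern from the example; the name `parse_result` is hypothetical.

```python
import re

# Matches e.g. 'Mean psnr of Set5 x4 is 32.68'
RESULT_RE = re.compile(r"Mean psnr of (\S+) x(\d+) is ([0-9.]+)")


def parse_result(line):
    """Return (dataset, scale, psnr) from a result line, or None if no match."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), float(match.group(3))


parsed = parse_result('result: {"Mean psnr of Set5 x4 is 32.68"}')
```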
  108. ## Performance
  109. ### Inference Performance
  110. The results on all tasks are listed below.
  111. Super-resolution results:
  112. | Scale | Set5 | Set14 | B100 | Urban100 |
  113. | ----- | ----- | ----- | ----- | ----- |
  114. | ×2 | 38.36 | 34.54 | 32.50 | 33.88 |
  115. | ×3 | 34.83 | 30.96 | 29.39 | 29.59 |
  116. | ×4 | 32.68 | 29.01 | 27.81 | 27.24 |
  117. Denoising results:
  118. | Noise level | CBSD68 | Urban100 |
  119. | ----- | ----- | ----- |
  120. | 30 | 32.37 | 33.82 |
  121. | 50 | 29.94 | 31.56 |
  122. Derain results:
  123. | Task | Rain100L |
  124. | ----- | ----- |
  125. | Derain | 41.98 |
  126. ## ModelZoo Homepage
  127. Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).