Update README.md

2024-08-17 08:30:32 +03:00 · 2018-09-06 14:51:07 +08:00 · a164a118b9
commit a164a118b9
parent 8a94e4bff4
1 changed file with 41 additions and 7 deletions
--- a/README.md
+++ b/README.md
@@ -50,28 +50,28 @@ python test.py models/RRDB_PSNR_x4.pth
 ```
 5. The results are in `./results` folder.
 ### Network interpolation demo
-We can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].
+You can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].

-1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter.
+1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter; you can change it to any value in [0, 1].
 2. Run `python test.py models/interp_08.pth`, where *models/interp_08.pth* is the model path.

 <p align="center">
  <img height="400" src="figures/43074.gif">
 </p>
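
The interpolation step above can be sketched as follows. This is a minimal illustration, not the repository's `net_interp.py`: it assumes the interpolated weights are the element-wise blend `(1 - alpha) * PSNR + alpha * ESRGAN` (so alpha = 0 would give the RRDB_PSNR model and alpha = 1 the RRDB_ESRGAN model), and it uses plain Python dicts of float lists in place of real checkpoints. The name `interpolate_weights` and its parameters are hypothetical.

```python
def interpolate_weights(psnr_weights, esrgan_weights, alpha):
    """Blend two parameter sets element-wise.

    Assumption: result = (1 - alpha) * psnr + alpha * esrgan.
    Both inputs map parameter names to equally sized lists of floats
    (a stand-in for the tensors stored in a real checkpoint).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if psnr_weights.keys() != esrgan_weights.keys():
        raise ValueError("both models must have the same parameter names")

    blended = {}
    for name, psnr_values in psnr_weights.items():
        esrgan_values = esrgan_weights[name]
        if len(psnr_values) != len(esrgan_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        blended[name] = [
            (1.0 - alpha) * p + alpha * g
            for p, g in zip(psnr_values, esrgan_values)
        ]
    return blended


# Toy example with a single two-element parameter.
psnr = {"conv.weight": [0.0, 1.0]}
esrgan = {"conv.weight": [1.0, 3.0]}
interp = interpolate_weights(psnr, esrgan, alpha=0.8)
```

In a real script the two dicts would presumably be the state dicts loaded from the two `.pth` files, and the blended dict would be saved under a name such as `models/interp_08.pth` before running `test.py` on it.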

-## Introduction 
+## ESRGAN
 We improve the [SRGAN](https://arxiv.org/abs/1609.04802) from three aspects:
 1. adopt a deeper model using Residual-in-Residual Dense Block (RRDB) without batch normalization layers.
-2. employ [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of vanilla GAN.
+2. employ [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of the vanilla GAN.
 3. improve the perceptual loss by using the features before activation.

 In contrast to SRGAN, which claimed that **deeper models are increasingly difficult to train**, our deeper ESRGAN model shows its superior performance with easy training.

 <p align="center">
-  <img height="100" src="figures/architecture.jpg">
+  <img height="120" src="figures/architecture.jpg">
 </p>
 <p align="center">
-  <img height="130" src="figures/RRDB.jpg">
+  <img height="150" src="figures/RRDB.jpg">
 </p>

 ## Network Interpolation
@@ -82,6 +82,7 @@ We propose the **network interpolation strategy** to balance the visual quality
 </p>

 We show the smooth animation with the interpolation parameters changing from 0 to 1. 
+Interestingly, we observe that the network interpolation strategy provides smooth control between the RRDB_PSNR model and the fine-tuned ESRGAN model.

 <p align="center">
  <img height="480" src="figures/81.gif">
@@ -90,7 +91,7 @@ We show the smooth animation with the interpolation parameters changing from 0 t
 </p>
  
 ## Qualitative Results
-PSNR (evaluated on the luminance channel in YCbCr color space) and the perceptual index used in the PIRM-SR challenge are also provided for reference.
+PSNR (evaluated on the Y channel) and the perceptual index used in the PIRM-SR challenge are also provided for reference.

 <p align="center">
  <img src="figures/qualitative_cmp_01.jpg">
@@ -105,3 +106,36 @@ PSNR (evaluated on the luminance channel in YCbCr color space) and the perceptua
  <img src="figures/qualitative_cmp_04.jpg">
 </p>
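
For reference, PSNR on the Y channel can be computed roughly as sketched below. The README does not state which RGB-to-Y conversion is used, so this sketch assumes 8-bit RGB input and the ITU-R BT.601 conversion in the form used by MATLAB's `rgb2ycbcr` (Y in the range 16 to 235). The function names `rgb_to_y` and `psnr_y` are hypothetical, and images are represented as flat lists of `(r, g, b)` tuples to keep the example dependency-free.

```python
import math


def rgb_to_y(pixel):
    """Convert one 8-bit (r, g, b) pixel to BT.601 luma in [16, 235].

    Assumption: same coefficients as MATLAB's rgb2ycbcr.
    """
    r, g, b = pixel
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(image_a, image_b):
    """PSNR in dB between two images, measured on the Y channel only.

    Both images are equally long, non-empty lists of (r, g, b) tuples.
    The peak value is taken as 255.
    """
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    squared_error = sum(
        (rgb_to_y(pa) - rgb_to_y(pb)) ** 2
        for pa, pb in zip(image_a, image_b)
    )
    mse = squared_error / len(image_a)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(255.0 ** 2 / mse)


# Toy example: a 2-pixel "image" compared with a slightly darker copy.
reference = [(200, 120, 40), (10, 250, 90)]
restored = [(198, 121, 40), (12, 247, 91)]
score = psnr_y(reference, restored)
```

Published numbers may additionally crop image borders before measuring; that step is omitted here.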

+## Ablation Study
+Overall visual comparison showing the effect of each component in
+ESRGAN. Each column represents a model, with its configuration listed at the top.
+The red sign marks the main improvement over the previous model.
+<p align="center">
+  <img src="figures/abalation_study.png">
+</p>
+
+## BN artifacts
+We empirically observe that BN layers tend to introduce artifacts. These artifacts,
+which we call BN artifacts, appear occasionally across iterations and settings,
+violating the need for stable performance during training. We find that
+the network depth, BN position, training dataset, and training loss
+all have an impact on the occurrence of BN artifacts.
+<p align="center">
+  <img src="figures/BN_artifacts.jpg">
+</p>
+
+## Useful techniques to train a very deep network
+We find that residual scaling and smaller initialization can help to train a very deep network. More details are in the supplementary file attached to our [paper](https://arxiv.org/abs/1809.00219).
+
+<p align="center">
+  <img height="250" src="figures/train_deeper_neta.png">
+  <img height="250" src="figures/train_deeper_netb.png">
+</p>
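
The two techniques named above can be illustrated with a small sketch. It assumes that residual scaling means multiplying the residual branch by a constant `beta` between 0 and 1 before adding it to the main path, and that smaller initialization means shrinking He-style initial weights by a constant factor. The values `beta=0.2` and `scale=0.1`, as well as the function names, are illustrative assumptions and are not taken from this README.

```python
import math
import random


def small_init(fan_in, count, scale=0.1, rng=None):
    """Draw `count` weights from a He-style normal distribution
    (std = sqrt(2 / fan_in)) and shrink them by `scale`.

    `scale=0.1` is an assumed value for illustration.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    std = math.sqrt(2.0 / fan_in)
    return [scale * rng.gauss(0.0, std) for _ in range(count)]


def residual_add(x, residual, beta=0.2):
    """Add a scaled residual branch to the main path: x + beta * residual.

    `beta=0.2` is an assumed value for illustration.
    """
    if len(x) != len(residual):
        raise ValueError("x and residual must have the same length")
    return [xi + beta * ri for xi, ri in zip(x, residual)]


# Toy example: a large residual is damped before it reaches the main path.
features = [1.0, 2.0]
branch_output = [10.0, -10.0]
combined = residual_add(features, branch_output)
weights = small_init(fan_in=64, count=1000)
```

Both choices keep the signal added by each block small at the start of training, which is one plausible reading of why they help very deep networks train.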
+
+## The influence of training patch size
+We observe that training a deeper network benefits from a larger patch size. Moreover, the deeper model achieves a larger improvement (∼0.12 dB) than the shallower one (∼0.04 dB), since a larger model capacity can take full advantage of
+a larger training patch size.
+<p align="center">
+  <img height="250" src="figures/patch_a.png">
+  <img height="250" src="figures/patch_b.png">
+</p>