Update README.md

Xintao 2018-09-06 14:51:07 +08:00 committed by GitHub
parent 8a94e4bff4
commit a164a118b9

@ -50,28 +50,28 @@ python test.py models/RRDB_PSNR_x4.pth
```
5. The results are in `./results` folder.
### Network interpolation demo
We can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].
You can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].
1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter.
1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter; you can change it to any value in [0, 1].
2. Run `python test.py models/interp_08.pth`, where *models/interp_08.pth* is the model path.
<p align="center">
<img height="400" src="figures/43074.gif">
</p>
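The interpolation step above can be sketched in a few lines. This is a minimal illustration, not the contents of `net_interp.py`: the function name `interpolate_weights` is hypothetical, and it assumes that `alpha = 0` reproduces the RRDB_PSNR weights, that `alpha = 1` reproduces the RRDB_ESRGAN weights, and that both models share the same parameter names.

```python
from collections import OrderedDict


def interpolate_weights(weights_psnr, weights_esrgan, alpha):
    """Blend two parameter dictionaries entry by entry.

    Assumption (not taken from net_interp.py): alpha=0 returns the PSNR
    weights and alpha=1 returns the ESRGAN weights. Values can be floats,
    NumPy arrays, or tensors, as long as they support * and +.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if weights_psnr.keys() != weights_esrgan.keys():
        raise ValueError("both models must have the same parameter names")

    blended = OrderedDict()
    for name, w_psnr in weights_psnr.items():
        w_esrgan = weights_esrgan[name]
        # Linear interpolation in parameter space.
        blended[name] = (1 - alpha) * w_psnr + alpha * w_esrgan
    return blended


# Toy example with scalar "weights" standing in for real tensors.
psnr = OrderedDict(conv_weight=0.0, conv_bias=1.0)
esrgan = OrderedDict(conv_weight=10.0, conv_bias=3.0)
interp = interpolate_weights(psnr, esrgan, 0.8)
```

With real checkpoints, the two dictionaries would typically be the loaded parameter dictionaries of the two `.pth` files, and the blended result would be saved as a new model file.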
## Introduction
## ESRGAN
We improve the [SRGAN](https://arxiv.org/abs/1609.04802) from three aspects:
1. adopt a deeper model using Residual-in-Residual Dense Block (RRDB) without batch normalization layers.
2. employ [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of vanilla GAN.
2. employ [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of the vanilla GAN.
3. improve the perceptual loss by using the features before activation.
In contrast to SRGAN, which claimed that **deeper models are increasingly difficult to train**, our deeper ESRGAN model shows superior performance and is easy to train.
<p align="center">
<img height="100" src="figures/architecture.jpg">
<img height="120" src="figures/architecture.jpg">
</p>
<p align="center">
<img height="130" src="figures/RRDB.jpg">
<img height="150" src="figures/RRDB.jpg">
</p>
## Network Interpolation
@ -82,6 +82,7 @@ We propose the **network interpolation strategy** to balance the visual quality
</p>
We show the smooth animation with the interpolation parameters changing from 0 to 1.
Interestingly, we observe that the network interpolation strategy provides smooth control between the RRDB_PSNR model and the fine-tuned ESRGAN model.
<p align="center">
<img height="480" src="figures/81.gif">
@ -90,7 +91,7 @@ We show the smooth animation with the interpolation parameters changing from 0 t
</p>
## Qualitative Results
PSNR (evaluated on the luminance channel in YCbCr color space) and the perceptual index used in the PIRM-SR challenge are also provided for reference.
PSNR (evaluated on the Y channel) and the perceptual index used in the PIRM-SR challenge are also provided for reference.
<p align="center">
<img src="figures/qualitative_cmp_01.jpg">
@ -105,3 +106,36 @@ PSNR (evaluated on the luminance channel in YCbCr color space) and the perceptua
<img src="figures/qualitative_cmp_04.jpg">
</p>
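For readers who want to reproduce a Y-channel PSNR, the sketch below shows one common way to compute it. It is an assumption-laden illustration rather than the evaluation script behind these figures: the function names are hypothetical, the luma conversion uses the ITU-R BT.601 coefficients with the 16 to 235 range that is common in super-resolution benchmarks, and any border cropping an actual evaluation might apply is omitted.

```python
import math


def rgb_to_y(r, g, b):
    """Luma (Y) of an 8-bit RGB pixel, BT.601 with the 16-235 range.

    Assumption: this convention is not stated in the README.
    """
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(img_a, img_b):
    """PSNR in dB between two images, measured on the Y channel only.

    Each image is a list of rows, each row a list of (r, g, b) tuples
    with values in 0..255. Both images must have the same size.
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same height")
    squared_error = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same width")
        for (ra, ga, ba), (rb, gb, bb) in zip(row_a, row_b):
            diff = rgb_to_y(ra, ga, ba) - rgb_to_y(rb, gb, bb)
            squared_error += diff * diff
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(255.0 ** 2 / mse)
```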
## Ablation Study
Overall visual comparisons showing the effect of each component in
ESRGAN. Each column represents a model, with its configuration shown at the top.
The red sign indicates the main improvement compared with the previous model.
<p align="center">
<img src="figures/abalation_study.png">
</p>
## BN artifacts
We empirically observe that BN layers tend to introduce artifacts. These artifacts,
which we call BN artifacts, appear occasionally across iterations and across different settings,
which conflicts with the need for stable performance during training. We find that
the network depth, BN position, training dataset and training loss
all have an impact on the occurrence of BN artifacts.
<p align="center">
<img src="figures/BN_artifacts.jpg">
</p>
## Useful techniques to train a very deep network
We find that residual scaling and smaller initialization help to train a very deep network. More details are in the supplementary file attached to our [paper](https://arxiv.org/abs/1809.00219).
<p align="center">
<img height="250" src="figures/train_deeper_neta.png">
<img height="250" src="figures/train_deeper_netb.png">
</p>
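The two techniques can be illustrated with a small, framework-free sketch. Everything here is hypothetical: the function names, the scaling constant `beta=0.2`, and the initialization factor `scale=0.1` are illustrative assumptions, not values confirmed by this README. The idea is that the residual branch is multiplied by a constant below 1 before being added back, and that the initial weights are drawn with a smaller standard deviation than a standard He (MSRA) initialization would use.

```python
import math
import random


def scaled_he_init(fan_in, fan_out, scale=0.1, seed=0):
    """Draw a fan_out x fan_in weight matrix with a shrunken He-style std.

    Assumption: 'smaller initialization' is modeled here as multiplying
    the usual sqrt(2 / fan_in) standard deviation by `scale`.
    """
    rng = random.Random(seed)
    std = scale * math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def residual_add(x, branch_out, beta=0.2):
    """Residual scaling: add the branch output scaled by beta to the input.

    x and branch_out are equal-length lists of floats; beta is assumed
    to lie in (0, 1].
    """
    if len(x) != len(branch_out):
        raise ValueError("input and branch output must have the same length")
    return [xi + beta * bi for xi, bi in zip(x, branch_out)]


# The skip path passes x through unchanged; only the branch is damped.
y = residual_add([1.0, 2.0], [10.0, -10.0], beta=0.2)
```

Damping the branch keeps the magnitude of each block's update small, which is one plausible reason why stacking many such blocks stays stable.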
## The influence of training patch size
We observe that training a deeper network benefits from a larger patch size. Moreover, the deeper model gains more (0.12 dB) than the shallower one (0.04 dB), since a larger model capacity can take fuller advantage of a larger training patch size.
<p align="center">
<img height="250" src="figures/patch_a.png">
<img height="250" src="figures/patch_b.png">
</p>