
ESRGAN RRDB CONVERTER


SHORT DESCRIPTION OF THE REPOSITORY

This repository provides resources for converting models from the old ESRGAN architecture to the new ESRGAN architecture, and for converting RealESRGAN models to new ESRGAN models. There is also a converter from new ESRGAN models to RealESRGAN models. These converters can be found in the folder converter. As bonus material there are standalone upscalers for ESRGAN and RealESRGAN (both are only rudimentarily documented). The latest standalone upscaler versions can be found in the folder upscaler. For everything else, read the documentation.


CONTRIBUTION

If you like what I present here, or if it helps you, or if it is useful, you are welcome to donate a small contribution. It motivates me a lot and speeds up my work considerably 😏.


Note

All scripts are programmed for use with the Linux operating system. I do not consider other operating systems due to time constraints. The software used is freely available and makes no special demands on the operating system.

TL;DR: Use the Table of Contents for Topics of Interest

Preface

A small idea has now turned into a small project. To make myself independent of the web user interfaces for AI image generation, which offer highly sophisticated upscaling capabilities, I am implementing different approaches to image upscaling myself.

I worked in parallel with ESRGAN and RealESRGAN. Both approaches delivered quite impressive results from a standing start. However, the number of functioning models is limited for the time being. This is where the converters come into play, with which the number of models can potentially be increased.

At short notice and on a whim, I took a closer and more intensive look at the topic. The interim result of my deliberations can be found here in the repository. But only some of the things I have tried and done can be found here.

The quality of the results achieved in the meantime speaks for itself and justifies the use of ESRGAN and RealESRGAN.

Motivation

While I was working on implementing an upscaling method for images using ESRGAN models, the question arose as to how I can use all available ESRGAN models with the approach I am working with.

If one uses ESRGAN models from external sources instead of the originally proposed ESRGAN models, one sometimes gets errors that cause the upscaling process to fail.

A simple converter could be found in the original ESRGAN sources I intended to use. It was not obvious that this script is a converter. I have adapted this converter for my personal purposes.

Some upscalers that I already use with the web user interface AUTOMATIC1111 can now also be used in my own upscaler application or standalone. I can now also use upscalers from various other sources.

Introduction

I implemented ESRGAN [1] and RealESRGAN [2] in my Lazy Image Upscaler. To be able to test more than the four models provided by xinntao, I searched for and collected other ESRGAN models and tried them out, most of the time without success. One model used by the web user interface AUTOMATIC1111 [5] worked out of the box. Using models from other sources failed for the time being.

I want to use the ESRGAN method (and the RealESRGAN method) in the form given by xinntao, independent of other software tools. The only thing I need is a converter that prepares most ESRGAN models for use with xinntao's approach.

Note

I am interested in generative AI image creation. Of course, I am also interested in theory, but only when it is necessary and helpful. With this in mind, my interest is focussed on how to use a model and not on how to train a model.

Name of the Repository

ESRGAN is used together with RRDB. This is the base concept. I provide tools for the conversion between ESRGAN models that are built on this methodology. So after some back and forth the name was born, at the latest when I understood the connections better.

Technical Background

Brief Introduction

ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) improves the SRGAN model architecture by using the RRDB (Residual-in-Residual Dense Block) without batch normalization, based on the observations of EDSR. ESRGAN replaces the standard adversarial loss of SRGAN with the relativistic loss of RaGAN (Relativistic average GAN). It also improves the VGG-based perceptual loss used in SRGAN by taking the VGG features before activation instead of after activation. ESRGAN is a direct improvement of SRGAN, and RealESRGAN in turn builds on ESRGAN. The influence of EDSR should also be noted. A good article on the topic is [8].

Explanations of the abbreviations can be found here.

ESRGAN & RealESRGAN

Within this repository I am focussing on ESRGAN and RealESRGAN. When talking about the implementation of what the last section explains, we have to discuss the internal structure of a given ESRGAN model.

The old (outdated) models have a different internal structure from the new (current) models with respect to the keys used. The value belonging to a key is still a tensor.

For the conversion one needs a translation table (conversion table) for the keys, which looks like this:

    "model.0.weight"                     ->  "conv_first.weight"
    "model.0.bias"                       ->  "conv_first.bias"
    "model.1.sub.0.RDB1.conv1.0.weight"  ->  "RRDB_trunk.0.RDB1.conv1.weight" 
    "model.1.sub.0.RDB1.conv1.0.bias"    ->  "RRDB_trunk.0.RDB1.conv1.bias"
    "model.1.sub.23.weight"              ->  "trunk_conv.weight"
    "model.1.sub.23.bias"                ->  "trunk_conv.bias" 
    "model.3.weight"                     ->  "upconv1.weight" 
    "model.3.bias"                       ->  "upconv1.bias" 
    "model.6.weight"                     ->  "upconv2.weight" 
    "model.6.bias"                       ->  "upconv2.bias" 
    "model.8.weight"                     ->  "HRconv.weight" 
    "model.8.bias"                       ->  "HRconv.bias" 
    "model.10.weight"                    ->  "conv_last.weight"
    "model.10.bias"                      ->  "conv_last.bias"

The value related to each key is a tensor. While converting, one has to make sure that the shape of each tensor matches the tensor shape required by the final model.
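The translation table and the shape requirement can be put together in a few lines. The following is only a minimal sketch and not the converter from this repository: the names translate_key and convert_state_dict are made up for illustration, and I assume that the RDB rows of the table stand for all blocks and convolutions of a 4x model. The values only need a .shape attribute, as tensors have.

```python
import re

# Fixed renames from the table above (prefix without ".weight"/".bias").
FIXED_KEYS = {
    "model.0": "conv_first",
    "model.1.sub.23": "trunk_conv",
    "model.3": "upconv1",
    "model.6": "upconv2",
    "model.8": "HRconv",
    "model.10": "conv_last",
}

# Dense block keys: model.1.sub.<block>.RDB<n>.conv<m>.0.<weight|bias>
RDB_PATTERN = re.compile(
    r"^model\.1\.sub\.(\d+)\.(RDB\d+\.conv\d+)\.0\.(weight|bias)$"
)


def translate_key(old_key):
    """Return the new-architecture name for an old-architecture key."""
    match = RDB_PATTERN.match(old_key)
    if match:
        block, layer, kind = match.groups()
        return f"RRDB_trunk.{block}.{layer}.{kind}"
    prefix, _, kind = old_key.rpartition(".")
    if prefix in FIXED_KEYS and kind in ("weight", "bias"):
        return f"{FIXED_KEYS[prefix]}.{kind}"
    raise KeyError(f"no translation known for key {old_key!r}")


def convert_state_dict(old_state, expected_shapes):
    """Rename all keys and check every value against the target shape.

    old_state:       dict of old key -> value with a .shape attribute
    expected_shapes: dict of new key -> shape required by the final model
    """
    new_state = {}
    for old_key, value in old_state.items():
        new_key = translate_key(old_key)
        if new_key not in expected_shapes:
            raise KeyError(f"target model has no key {new_key!r}")
        actual = tuple(value.shape)
        expected = tuple(expected_shapes[new_key])
        if actual != expected:
            raise ValueError(f"{new_key}: shape {actual}, expected {expected}")
        new_state[new_key] = value  # the tensor itself is not modified
    return new_state
```

With PyTorch, the expected shapes could be collected from the target network, for example with {k: tuple(v.shape) for k, v in net.state_dict().items()}.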

RealESRGAN has its own naming convention, which is similar but not the same. Within the models that are compatible with the approach of xinntao, I can easily convert back and forth. I found further differences in other ESRGAN models that require a further investigation.
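Converting back and forth between the two naming conventions can be sketched as a pair of rename functions. This is a minimal sketch under assumptions: the RealESRGAN key names used here (conv_body, conv_up1, conv_up2, conv_hr and body.<i>.rdb<n>.conv<m>) are my assumption about that naming convention and are not taken from the text above, and the function names are made up for illustration.

```python
import re

# Assumed RealESRGAN name -> new ESRGAN name (prefix without ".weight"/".bias").
REAL_TO_NEW = {
    "conv_first": "conv_first",
    "conv_body": "trunk_conv",
    "conv_up1": "upconv1",
    "conv_up2": "upconv2",
    "conv_hr": "HRconv",
    "conv_last": "conv_last",
}
NEW_TO_REAL = {new: real for real, new in REAL_TO_NEW.items()}

# Dense block keys in both conventions (RealESRGAN pattern is an assumption).
REAL_BLOCK = re.compile(r"^body\.(\d+)\.rdb(\d+)\.(conv\d+\.(?:weight|bias))$")
NEW_BLOCK = re.compile(r"^RRDB_trunk\.(\d+)\.RDB(\d+)\.(conv\d+\.(?:weight|bias))$")


def _rename(key, table, pattern, template):
    """Rename one key using a block pattern first, then the fixed table."""
    match = pattern.match(key)
    if match:
        return template.format(*match.groups())
    prefix, _, kind = key.rpartition(".")
    if prefix in table and kind in ("weight", "bias"):
        return f"{table[prefix]}.{kind}"
    raise KeyError(f"no translation known for key {key!r}")


def real_to_new_key(key):
    """RealESRGAN key -> new ESRGAN key."""
    return _rename(key, REAL_TO_NEW, REAL_BLOCK, "RRDB_trunk.{}.RDB{}.{}")


def new_to_real_key(key):
    """New ESRGAN key -> RealESRGAN key."""
    return _rename(key, NEW_TO_REAL, NEW_BLOCK, "body.{}.rdb{}.{}")
```

Only the keys are renamed in this sketch. Whether the tensors can be used unchanged depends on the model, so a shape check against the target network is still advisable.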

Description of What Is Implemented So Far

I adapted the originally supplied converter from old ESRGAN to new ESRGAN for my personal purposes. Then I modernized the implementation a little bit. Next I figured out that some models look like ESRGAN models but are not models I can use, are not ESRGAN models at all, or have something strange about them. To get an idea of what a model contains, I wrote a simple analysing tool. This tool identifies old and new ESRGAN models with high reliability, and it is able to find out whether a model is possibly a RealESRGAN model. This is important because a model is sometimes declared incorrectly. Next I wrote a converter from RealESRGAN to ESRGAN. So far this converter works quite well.
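The idea behind such an analysing tool can be sketched by looking only at the key names of a loaded model. This is a minimal sketch and not the tool from the folder tool_scripts: the function name classify_model_keys is hypothetical, the old and new key names follow the translation table in the section ESRGAN & RealESRGAN, and the "body." prefix for RealESRGAN is my assumption.

```python
def classify_model_keys(keys):
    """Guess the model family from the key names of a state dict."""
    keys = set(keys)
    # Old ESRGAN: everything lives below "model.", the trunk below "model.1.sub."
    if "model.0.weight" in keys and any(k.startswith("model.1.sub.") for k in keys):
        return "old ESRGAN"
    if "conv_first.weight" in keys:
        # New ESRGAN: the trunk is named "RRDB_trunk."
        if any(k.startswith("RRDB_trunk.") for k in keys):
            return "new ESRGAN"
        # Assumption: RealESRGAN names the trunk "body."
        if any(k.startswith("body.") for k in keys):
            return "possibly RealESRGAN"
    return "unknown"


if __name__ == "__main__":
    sample = ["conv_first.weight", "RRDB_trunk.0.RDB1.conv1.weight"]
    print(classify_model_keys(sample))
```

A check like this only looks at names. A wrongly declared model can still fail later, for example because of tensor shapes that do not fit.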

Testing

For testing purposes I wrote an upscaler for ESRGAN and an upscaler for RealESRGAN. Both upscale an image and simply save the result.

I used these two upscalers next to an experimental version of my Lazy Image Upscaler to check that the converters work correctly.

Easy Way to Test the Converter

Installation

Clone this GitHub repository and move into the directory:

git clone https://github.com/zentrocdot/ESRGAN_RRBD_CONVERTER.git
cd ESRGAN_RRBD_CONVERTER

Alternatively, use the zip file or simply download the files you need.

How to Use the Main OLD Converter

Note

The following is the old description. Go to the folder converter for the new converters and descriptions.

You need the following two files from the old_stuff folder to run the converter:

RRDBNet_arch.py
converter_RRDB_models.py

converter_RRDB_models.py is the converter script and RRDBNet_arch.py is the Python module which contains the classes that are required to run the converter script. The latter file is imported by the converter.

Run the converter as follows:

python3 converter_RRDB_models.py <upscaler_model_file_name.pth>

Conversion Process

If you run the experimental converter with

lucifer@hades:~/ESRGAN/models$ python3 experimenta_converter_RRDB_models.py 8x_NMKD-Superscale_150000_G.pth

the output in the terminal window is as follows:

***  ESRGAN CONVERTER  ***
Reference keywords to found keywords:
['model.8.weight', 'model.8.bias', 'model.10.weight', 'model.10.bias']
['model.11.weight', 'model.11.bias', 'model.13.weight', 'model.13.bias']
Input: 8x_NMKD-Superscale_150000_G.pth
Output: 8x_NMKD-Superscale_150000_G_CED.pth
Start conversion ...
... conversion completed!
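The two keyword lists in this output show that the 8x model stores its last two layers under model.11 and model.13 instead of model.8 and model.10. One way to find such keys, and to build the output file name, can be sketched as follows. This is my own minimal sketch and not necessarily how the converter does it. The function names are hypothetical, and the suffix _CED is simply taken from the output above.

```python
import re
from pathlib import Path

# Keys of the last two layers in a 4x old-architecture model.
REFERENCE_TAIL = ["model.8.weight", "model.8.bias",
                  "model.10.weight", "model.10.bias"]

# Top-level layers look like "model.<n>.weight"; trunk keys do not match.
TOP_LEVEL = re.compile(r"^model\.(\d+)\.weight$")


def find_tail_keys(keys):
    """Return the weight/bias keys of the last two top-level layers found."""
    matches = (TOP_LEVEL.match(key) for key in keys)
    indices = sorted(int(m.group(1)) for m in matches if m)
    return [f"model.{i}.{kind}" for i in indices[-2:] for kind in ("weight", "bias")]


def output_name(input_name, tag="_CED"):
    """Insert the tag between file stem and extension."""
    path = Path(input_name)
    return str(path.with_name(path.stem + tag + path.suffix))


if __name__ == "__main__":
    keys_8x = ["model.0.weight", "model.1.sub.23.weight", "model.3.weight",
               "model.6.weight", "model.9.weight", "model.11.weight",
               "model.13.weight"]
    print(REFERENCE_TAIL)
    print(find_tail_keys(keys_8x))
    print(output_name("8x_NMKD-Superscale_150000_G.pth"))
```

The key list in the demo part is an illustrative guess at the top-level layers of an 8x model, chosen so that the last two indices match the output shown above.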

Error Handling

Errors are caught and the traceback is printed to the terminal window. This looks like:

Traceback (most recent call last):
  File "/home/hades/ssd-sandisk/AI_Tools/ESRGAN/ESRGAN/models/experimentell_converter_RRDB_models.py", line 97, in main
    crt_net[key] = pretrained_net[value]
KeyError: 'model.8.weight'

Traceback (most recent call last):
  File "/home/hades/ssd-sandisk/AI_Tools/ESRGAN/ESRGAN/models/experimentell_converter_RRDB_models.py", line 97, in main
    crt_net[key] = pretrained_net[value]
KeyError: 'model.8.bias'

This is no longer an error. The latest version of the converter takes this behaviour into account. But other errors still need to be caught.

Comparing the reference keywords with the found keywords shows whether there was a mismatch.
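Catching such a KeyError for every key, instead of stopping at the first one, can be sketched as follows. This is a minimal sketch, not the code of the converter: the function name copy_mapped_keys is hypothetical, and only the assignment crt_net[key] = pretrained_net[value] is taken from the traceback above.

```python
import traceback


def copy_mapped_keys(pretrained_net, key_map):
    """Copy values from the old model into a new dict of renamed keys.

    key_map maps new key -> old key. Missing old keys are reported
    and collected instead of aborting the whole conversion.
    """
    crt_net = {}
    missing = []
    for key, value in key_map.items():
        try:
            crt_net[key] = pretrained_net[value]
        except KeyError:
            traceback.print_exc()  # print the traceback, then carry on
            missing.append(value)
    return crt_net, missing


if __name__ == "__main__":
    old = {"model.0.weight": "tensor placeholder"}
    mapping = {"conv_first.weight": "model.0.weight",
               "HRconv.weight": "model.8.weight"}
    converted, not_found = copy_mapped_keys(old, mapping)
    print("converted:", list(converted))
    print("missing:", not_found)
```

The list of missing keys can then be used to decide whether the result is usable or whether the model needs a different key table.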

Compatible NEW (current) ESRGAN Models

Some models were analysed and found to be new models. They need no conversion:

  • RRDB_ESRGAN_x4.pth (xinntao)
  • RRDB_PSNR_x4.pth (xinntao)
  • ESRGAN.pth (KAIR)
  • DF2K.pth

Converted Models from Old ESRGAN to New ESRGAN

The focus of this repository lies on ESRGAN and everything related to it. Later, I will look into how I can take RealESRGAN into account in an appropriate way.

The old ESRGAN models which I have tested and successfully converted to new ESRGAN models are listed below:

  • 4xFSMangaV2_CVTD.pth
  • 4xLSDIRplus.pth
  • 4xNomos2_otf_esrgan.pth
  • 4xNomosWebPhoto_esrgan.pth
  • 4xNomos8kSC.pth
  • 4xFSDedither.pth
  • 4x-eula-digimanga-MiA-65k.pth (Color reduction ???)
  • 4x-UltraSharp.pth
  • 4x-UniScale_Restore.pth
  • 4x-UniScaleV2_Sharp.pth
  • 4x_BooruGan_600k_CVTD
  • 4x_Fatality_Comix_260000_G.pth
  • 4x_foolhardy_Remacri.pth
  • 4x_FuzzyBox.pth
  • 4x_IllustrationJaNai_V1_ESRGAN_135k_CVTD
  • 4x_NMKD-Siax_175k.pth
  • 4x_NMKD-Siax_200k.pt
  • 4x_UniversalUpscalerV2-Sharp_101000_G.pth
  • 4xPSNR.pth
  • 8xPSNR.pth
  • 8x_NMKD-Superscale_150000_G.pth (color shift after conversion ???)
  • realesrgan-x4minus.pth
  • RRDB_PSNR_x4_old_arch.pth (xinntao)
  • RRDB_ESRGAN_x4_old_arch.pth (xinntao)

After conversion I tried out all of the listed models. Where there was a problem, it is noted in round brackets.

The list is not complete, but it shows that almost every old model can already be converted. The algorithm seems to work well so far.

ESRGAN and AUTOMATIC1111

AUTOMATIC1111 uses the ESRGAN model which can be downloaded from [5].

Practical Application Example

I used the following AI-created image to show how the conversion from one model to another works. The original JPG image has a resolution of 512 x 512 pixels.

Figure 1: Original image with a resolution of 512 x 512 pixels

Figure 2: Original image with marked area with a resolution of 512 x 512 pixels

I used the old ESRGAN model RRDB_PSNR_x4_old_arch.pth from xinntao and converted this model to a new ESRGAN model. I also used the RealESRGAN model RRDB_RealESRGAN_x4plus.pth from xinntao and converted this model to a new ESRGAN model.

The following collage shows what the theory behind both approaches states. There are upscaling artifacts in the image created with ESRGAN and no such artifacts in the image created with RealESRGAN. I zoomed in on the face of the girl in the created images to show the effects.

Figure 3: Collage of the cropped area of the original image with 512 x 512 pixels and the upscaled images with 2048 x 2048 pixels

Advantages and Disadvantages

ESRGAN

The disadvantage of ESRGAN is that the models are restricted to upscaling by a factor of 4. The advantage is that there are a lot of models which can be used.

RealESRGAN

The disadvantage of RealESRGAN is that there is only a limited number of models which can be used. The advantage is that, in principle, upscaling with an arbitrary scale factor is possible.

Pickle Tensor

In the case of the ESRGAN models, files in the Pickle Tensor file format have the extension .pth. Binary files as well as zip files can be used. There is no need for a further distinction or conversion.
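If one is curious which of the two container variants a given .pth file uses, the Python standard library is enough to find out. The following is a minimal sketch with a hypothetical function name. It only inspects the container and does not load the model.

```python
import zipfile


def pth_container(path):
    """Return "zip" for a zip-based .pth file, otherwise "binary"."""
    # zipfile.is_zipfile only checks the file structure, nothing is unpickled.
    return "zip" if zipfile.is_zipfile(path) else "binary"
```

For example, pth_container("RRDB_ESRGAN_x4.pth") returns one of the two strings. As far as I know, torch.load reads both variants, which fits the statement that no further distinction is needed.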

Repository & Directory Structure

The repository and directory structure of the ESRGAN RRDB CONVERTER looks as follows:

    └── esrgan_rrbd_converter
        ├── images
        ├── converter
        ├── upscaler
        ├── old_stuff
        ├── original_files
        ├── tool_scripts
        ├── test_scripts
        └── helper_scripts

The folder original_files contains the original sources from xinntao [1]. In the folder tool_scripts there are tools like the one for analysing the model structure. The folder converter contains the current converters.

To-Do

I have to check under which conditions the converter works and under which conditions it does not.

I have to analyse the internal model structure to understand the formatting of the different RRDB-related models.

I have to catch errors while executing the script. In the experimental versions I implemented this in a rudimentary way.

Error handling and the output of warnings must be improved. Dealing with three similar model types harbours the risk of confusing the models.

The comments and documentation in the scripts need improvement. This documentation also needs improvement.

Test Environment

I developed and tested the Python scripts using the following software development environment:

  • Linux Mint 21.3 (Virginia)
  • Python 3.10.14
  • Numpy 2.1.3
  • OpenCV 4.10.0
  • PIL 11.0.0
  • Torch 2.4.1+cu121

Licenses

The algorithms with respect to ESRGAN of xinntao are published under the Apache License. The original scripts and the improved scripts are covered by this license. I always publish my work under the MIT License.

Credits

My thanks go to the excellent work of Xintao Wang (xinntao). The results that can be achieved with his approach are more than good.

References

[1] https://github.com/xinntao/ESRGAN

[2] https://github.com/xinntao/Real-ESRGAN

[3] https://huggingface.co/

[4] https://openmodeldb.info/

[5] https://github.com/cszn/KAIR/releases/download/v1.0/ESRGAN.pth

[6] https://drive.google.com/drive/u/0/folders/17VYV_SoZZesU6mbxz2dMAIccSSlqLecY

[7] https://pan.baidu.com/s/1-Lh6ma-wXzfH8NqeBtPaFQ

[8] https://ar5iv.labs.arxiv.org/html/1809.00219

[9] https://openaccess.thecvf.com/content/ICCV2021W/AIM/papers/Wang_Real-ESRGAN_Training_Real-World_Blind_Super-Resolution_With_Pure_Synthetic_Data_ICCVW_2021_paper.pdf

Donation

If you like what I present here, or if it helps you, or if it is useful, you are welcome to donate a small contribution. Or as you might say: Every TRON counts! Many thanks in advance! 😃

Crypto Coin Tron

TQHgMpVzWkf3Lkd68QS7tcn7S5Y9ayg6Fu

Crypto Coin Dogecoin

DQYkNGW8VfCuUbM9Womnp6KiFdtMa4NUkD

Crypto Coin Bitcoin

bc1quy5nr5x4xgrtmnknuayysapg8t8tl4jqsgcu2t

Crypto Coin Ethereum

0xD238a487F5147AC23c25908FC86129889E38BfB9

Have a wonderful, beautiful and successful day. I also wish everyone peace on earth.
