VSGAN¶
Release v1.6.4 (Installation)
A Single Image Super-Resolution Generative Adversarial Network (GAN) that uses the VapourSynth processing framework to handle input and output image data.
Short example:
from vsgan import ESRGAN

# chain two models: a PSNR pre-pass, then a second model applied
# twice over with increasing tile overlap
clip = ESRGAN(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\PSNR_x4_DB.pth").\
    apply().\
    load(r"C:\Users\PHOENiX\Documents\4X_DoubleRunExample.pth").\
    apply(overlap=16).\
    apply(overlap=32).\
    clip
For more information, see Getting Started.
Features of VSGAN¶
VapourSynth — Transform, Filter, or Enhance your input video, or the VSGAN result with VapourSynth, a Script-based NLE.
Easy Model Chaining — Chain multiple models together, or run the same model two or more times over.
Seamless Tiling — Have low VRAM? Don’t worry! The Network will be applied in quadrants of the image to reduce up-front VRAM usage.
Supports All RGB formats — You can use any RGB video input, including float32 (e.g., RGBS) inputs.
No Frame Extraction Necessary — Using VapourSynth you can pass a video directly to VSGAN, without any frame extraction needed; see the sketch after this list.
Repeatable Edits — Any edit you make in the VapourSynth script with or without VSGAN can be re-used for any other video.
Freedom — VSGAN is released under the MIT License, ensuring it will stay free, with the ability to be used commercially.
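As a minimal sketch of how these pieces fit together in a single script (the source path, model path, and color matrix are placeholders, and ffms2 is just one choice of VapourSynth source filter):

import vapoursynth as vs
from vsgan import ESRGAN

core = vs.core

# pass a video in directly; no frame extraction needed
clip = core.ffms2.Source(r"C:\Users\PHOENiX\Videos\input.mkv")

# VSGAN takes any RGB format; RGBS is float32 (use your source's matrix)
clip = core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s="709")

# overlap enables seamless tiling, lowering up-front VRAM usage
# at a small speed cost
clip = ESRGAN(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\PSNR_x4_DB.pth").\
    apply(overlap=16).\
    clip

# convert back to YUV for typical encoders and set the script output
clip = core.resize.Bicubic(clip, format=vs.YUV420P10, matrix_s="709")
clip.set_output()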
Supported Models¶
- ESRGAN
Enhanced Super-Resolution Generative Adversarial Networks. Supports both old and new-arch models of any scale.
- ESRGAN+
Further Improving Enhanced Super-Resolution Generative Adversarial Network.
- Real-ESRGAN
Training Real-World Blind Super-Resolution with Pure Synthetic Data. Supports 1x and 2x models, provided they use pixel-shuffle. Also supports Real-ESRGAN v2, an arch mainly intended as an ultra-light model for fast video inference; note that it is still a single-image network, not a video network.
- A-ESRGAN
Training Real-World Blind Super-Resolution with Attention U-Net Discriminators.
- EGVSR
Real-Time Super-Resolution System of 4K-Video Based on Deep Learning.
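All of the networks above are driven through the same chaining interface as the ESRGAN example at the top of this page. Below is a hedged sketch for EGVSR only, assuming it is importable the same way as ESRGAN; the model path is a placeholder, and interval is an assumed apply() parameter for how many frames are batched per inference pass:

from vsgan import EGVSR  # assumed importable the same way as ESRGAN

# interval is an assumed parameter for how many frames EGVSR
# super-resolves together per inference pass
clip = EGVSR(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\EGVSR_iter420000.pth").\
    apply(interval=5).\
    clip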
Quick shoutout to pvsfunc.PD2V¶
A lot of Super-Resolution users work with old low-resolution media. If you plan to work with DVD video files, or generally NTSC/PAL standard MPEG-1/2 media, you should take a look at my other project pvsfunc.
In pvsfunc there’s a class, PD2V, intended specifically for DVD video files, though it can also work on other sourced MPEG-1/2 media. It loads the video data optimally and frame-accurately, with various helper functions:
Frame-accurate frame-serving. Checks for and supports mixed scan-type inputs.
recover() — Recover progressive frames from an interlaced stream in some scenarios, usually on animation or fake-interlaced content.
floor() or ceil() — Convert Variable Frame Rate (VFR) to Constant Frame Rate (CFR). floor() is experimental.
deinterlace() — Deinterlace efficiently, only when needed. This is only a wrapper; an actual deinterlacing function is still needed.
All of this and more with self-chaining support, just like VSGAN.
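For illustration, a hedged sketch of what a PD2V chain might look like; the source path is a placeholder, and the deinterlace() kernel argument is an assumption, with havsfunc's QTGMC standing in as one common choice of deinterlacer:

import functools

import havsfunc
from pvsfunc import PD2V

# serve the DVD source frame-accurately, recover progressive frames
# where possible, deinterlace only what remains, then force CFR
clip = PD2V(r"C:\Users\PHOENiX\Videos\VTS_01_1.VOB").\
    recover().\
    deinterlace(kernel=functools.partial(havsfunc.QTGMC, FPSDivisor=2)).\
    ceil().\
    clip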
Example Results¶
Mickey’s Christmas Carol¶
American Dad S01E01¶
This model was trained to fix inaccuracies in the DVD's color, remove haloing/glow, and remove chroma droop. The result is a very crisp output for a show originally animated in SD.
Family Guy S01E01¶
As with the American Dad model above, this model was trained to fix the DVD's color inaccuracies, haloing/glow, and chroma droop, again yielding a very crisp output for a show originally animated in SD. Do note that the warping/stretching at the edges is an animation/DVD edit and not caused by VSGAN or the model.