VSGAN

Release v1.6.4. (Installation)


A single-image super-resolution Generative Adversarial Network (GAN) that uses the VapourSynth processing framework to handle input and output image data.


Short example:

from vsgan import ESRGAN

# `clip` must be an existing RGB VapourSynth clip (see the full sketch below)
clip = ESRGAN(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\PSNR_x4_DB.pth").\
    apply().\
    load(r"C:\Users\PHOENiX\Documents\4X_DoubleRunExample.pth").\
    apply(overlap=16).\
    apply(overlap=32).\
    clip
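
For context, here is how that example might sit in a complete VapourSynth script. This is a minimal sketch: the input path, the ffms2 source plugin, and the color matrix are assumptions that depend on your source.

import vapoursynth as vs
from vsgan import ESRGAN

core = vs.core

# Load any video directly; ffms2 is just one example of a source plugin.
clip = core.ffms2.Source(r"C:\Users\PHOENiX\Videos\input.mkv")

# VSGAN expects an RGB clip; convert from YUV (matrix depends on your source).
clip = core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s="709")

clip = ESRGAN(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\PSNR_x4_DB.pth").\
    apply().\
    clip

# Convert back to YUV for encoding, then serve the frames.
clip = core.resize.Bicubic(clip, format=vs.YUV420P16, matrix_s="709")
clip.set_output()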

For more information see (Getting Started).

Features of VSGAN

  • VapourSynth — Transform, filter, or enhance your input video, or the VSGAN result, with VapourSynth, a script-based NLE (see the sketch after this list).

  • Easy Model Chaining — You can chain multiple models, or run the same model two or more times in a row.

  • Seamless Tiling — Have low VRAM? Don’t worry! The network will be applied in quadrants of the image to reduce up-front VRAM usage.

  • Supports All RGB formats — You can use any RGB video input, including float32 (e.g., RGBS) inputs.

  • No Frame Extraction Necessary — With VapourSynth you can pass a video directly to VSGAN; no frame extraction to image files is needed.

  • Repeatable Edits — Any edit you make in the VapourSynth script with or without VSGAN can be re-used for any other video.

  • Freedom — VSGAN is released under the MIT License, ensuring it will stay free, with the ability to be used commercially.
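
As a sketch of the first point above: the VSGAN result is an ordinary VapourSynth clip, so any filter can run before or after it. The specific filters and values here are illustrative only.

import vapoursynth as vs

core = vs.core

# `clip` is the RGB result of a VSGAN apply() chain (see the short example above).
clip = core.std.Crop(clip, left=2, right=2)                 # e.g. trim edge artifacts
clip = core.resize.Spline36(clip, width=1920, height=1080)  # scale to a target size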

Supported Models

ESRGAN

Enhanced Super-Resolution Generative Adversarial Networks. Supports both old- and new-arch models of any scale.

ESRGAN+

Further Improving Enhanced Super-Resolution Generative Adversarial Network.

Real-ESRGAN

Training Real-World Blind Super-Resolution with Pure Synthetic Data. Supports 2x and 1x models, provided they use pixel-shuffle. Also includes support for Real-ESRGAN v2, an arch intended mainly as an ultra-light model for fast video inference; note, however, that it is not a video network.
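
A sketch of loading a Real-ESRGAN model, assuming such models go through the same ESRGAN class used in the example above; the model filename is only an illustration.

from vsgan import ESRGAN

# `clip` is an existing RGB VapourSynth clip; the model path is illustrative.
clip = ESRGAN(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\RealESRGAN_x4plus.pth").\
    apply(overlap=16).\
    clip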

A-ESRGAN

Training Real-World Blind Super-Resolution with Attention U-Net Discriminators.

EGVSR

Real-Time Super-Resolution System of 4K-Video Based on Deep Learning.
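
Usage follows the same self-chaining pattern; this sketch assumes the network is importable as EGVSR, and the model path is illustrative.

from vsgan import EGVSR

# `clip` is an existing RGB VapourSynth clip; import name and path are assumptions.
clip = EGVSR(clip, device="cuda").\
    load(r"C:\Users\PHOENiX\Documents\EGVSR_iter420000.pth").\
    apply().\
    clip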

Quick shoutout to pvsfunc.PD2V

A lot of super-resolution users work with old, low-resolution media. If you plan to work with DVD video files, or NTSC/PAL-standard MPEG-1/2 media in general, you should take a look at my other project, pvsfunc.

pvsfunc provides a class, PD2V, intended specifically for DVD video files, though it can work on MPEG-1/2 media from other sources. It loads the video data frame-accurately and offers various helper functions:

  • Frame-accurate frame-serving. Checks for and supports mixed scan-type inputs.

  • recover() — Recovers progressive frames from an interlaced stream in some scenarios, usually on animation or fake-interlaced content.

  • floor() or ceil() — Convert variable frame rate (VFR) to constant frame rate (CFR). floor() is experimental.

  • deinterlace() — Deinterlaces efficiently, and only when needed. This is only a wrapper; an actual deinterlacing function must still be supplied.

  • All of this and more, with self-chaining support just like VSGAN (see the sketch below).
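
A sketch of what a PD2V chain might look like, mirroring VSGAN’s style. The .d2v path is illustrative, and the kernel parameter name passed to deinterlace() is an assumption; supply whatever deinterlacing function you prefer.

import functools

import havsfunc
from pvsfunc import PD2V

# Illustrative path; PD2V is aimed at DVD-sourced MPEG-1/2 media.
# The `kernel` parameter name is an assumption; check pvsfunc's docs.
clip = PD2V(r"C:\Users\PHOENiX\Videos\movie.d2v").\
    recover().\
    ceil().\
    deinterlace(kernel=functools.partial(havsfunc.QTGMC, FPSDivisor=2)).\
    clip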

Example Results

Mickey’s Christmas Carol

This is what the official Disney Blu-ray looks like.

American Dad S01E01

This model was trained to fix inaccuracies in the DVD’s color, remove haloing/glow, and remove chroma droop. The result is a very crisp output for a show originally animated in SD.

Family Guy S01E01

The same kind of model as the American Dad example above: trained to fix inaccuracies in the DVD’s color, remove haloing/glow, and remove chroma droop, again yielding a very crisp output for a show originally animated in SD. Do note that the warping/stretching at the edges is an animation/DVD edit, not something caused by VSGAN or the model.