An abstract video generator based on the BigGAN image-generation model.
A video synthesizer is a machine (hardware, software, or otherwise) that produces video signals an operator can change in real time through a set of inputs.
A Generative Adversarial Network (GAN) is a machine learning model that produces images. As with a video synthesizer, the images can be changed by changing numerical inputs. The BigGAN model is a specific GAN that was trained to produce images that look like photographs. If you're comfortable programming in Python, you could just use this model directly without any of the code in this repo.
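As a rough illustration of what "numerical inputs" means here, the sketch below builds the two kinds of vectors a class-conditional GAN like BigGAN takes: a latent noise vector and a class vector over the 1000 ImageNet classes. The latent size of 128, the plain normal sampling, and the helper names `make_label` and `make_latent` are assumptions for illustration, not this repo's code.

```python
import numpy as np

NUM_CLASSES = 1000  # ImageNet class count
Z_DIM = 128         # assumed latent size; it differs between BigGAN variants


def make_label(classes, num_classes=NUM_CLASSES):
    """Return a class vector with equal weight on each given class ID."""
    y = np.zeros(num_classes, dtype=np.float32)
    y[list(classes)] = 1.0 / len(classes)
    return y


def make_latent(seed, z_dim=Z_DIM):
    """Return a reproducible noise vector (plain normal, a simplification)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(z_dim).astype(np.float32)


y = make_label([398, 402])  # blend of two ImageNet classes
z = make_latent(seed=0)

# Nudging a single latent coordinate is the kind of small numerical change
# that would shift the generated image slightly.
z_nudged = z.copy()
z_nudged[0] += 0.5
```

Feeding a slowly changing sequence of such vectors to the model, one pair per frame, is what turns a still-image generator into an animation.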
This repo contains a simple program that runs this model repeatedly to generate an animation frame by frame, thereby producing a video. The inputs to the program are designed to be useful when you're creating videos that pulse in time with music.
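To give a feel for what "pulsing in time with music" could mean numerically, here is a minimal sketch. It is not the repo's implementation: the 30 fps frame rate, the sine shape, and the function name `pulse` are all assumptions chosen for illustration.

```python
import math


def pulse(frame, bpm, fps=30, period_beats=1.0, magnitude=1.0):
    """Value of a sine wave that repeats every `period_beats` beats.

    `frame` is the frame index, `bpm` the tempo in beats per minute,
    and `fps` an assumed frame rate.
    """
    beat = frame * bpm / (60.0 * fps)  # position of this frame, in beats
    return magnitude * math.sin(2 * math.pi * beat / period_beats)


# One value per frame for the first two beats at 120 bpm (15 frames per beat).
envelope = [pulse(f, bpm=120, period_beats=4) for f in range(30)]
```

A value like this could be added to one coordinate of the model's input on each frame, so the image moves with the beat.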
I call this a video generator rather than a video synthesizer because the process of generating the data for each frame and then rendering it as a video is too slow to run in real time. I really want to make this a real-time tool, and if I succeed then I'll call it a true video synthesizer.
The set of programs in this repo may grow over time, as I think about new ways to provide interesting inputs to GANs.
Much of the code in this repo is from the BigGAN TF Hub Demo Colab.
The goal of this repo is to make it easy for artists to generate videos with a significant machine learning model.
The software in this repo runs a neural network on your GPU. If you'd like to generate video with GANs without doing the following local setup, try using the BigGAN Colab linked above. The advantages of running locally are that you won't have to download the generated files, and if you have a nice GPU, you might be able to generate videos faster than in the Colab.
- Ensure that Python 3.7 is installed on your machine.
- Install pipenv:

      pip install --user pipenv

  This will allow you to create a virtual environment for an easy and clean install of the libraries used in this repo.
- Install CUDA 10.0. Note this is not the newest version of CUDA. It is the version that is compatible with the tensorflow-gpu version used in this repo.
- Download cuDNN v7.6.1 and set it up with your CUDA 10.0 installation. If you've never done this before, have no fear, it's just copying a few files and is easier than it looks :)
- Install the Python libraries used in this repo:

      pipenv install

- Start python in your virtual environment:

      pipenv run python -i

- Load the BigGAN model:

      from example import *
      import numpy as np
      gvs = GanVideoSynth()

- Define performance parameters, like tempo, number of beats, class IDs, and more. See calls.py for an example.

      ### See full list of parameters in calls.py ###
      # Tempo of the song in beats per minute.
      bpm = 135
      # Number of beats to generate an animation for.
      num_beats = 16
      # ImageNet class IDs to include in the y vector.
      classes = [398, 402]

- Run the model with the given parameters:

      generate_in_tempo(gvs, bpm=bpm, magnitudes=magnitudes, funcs=funcs,
                        axis_sets=axis_sets, random_label=random_label,
                        classes=classes, quantize_label=quantize_label,
                        y_scale=y_scale, periods=periods, num_beats=num_beats)

  This will generate a numpy array in the npys directory. The first time you run this, it will take a few minutes as the model "warms up". After that, it will take significantly less time to run. If you're trying to generate more frames (i.e. num_beats is higher or bpm is lower), it will take longer.
- Visualize the output. Do the following in a separate python process:

      pipenv run python view_npy.py
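The link between tempo, beat count, and run time mentioned in the steps above comes down to a little arithmetic, sketched here. The 30 fps frame rate and the function name `frames_for` are assumptions for illustration; the repo's actual frame rate may differ.

```python
def frames_for(bpm, num_beats, fps=30):
    """Number of frames needed to cover `num_beats` at the given tempo."""
    seconds = num_beats * 60.0 / bpm  # duration of the animation
    return round(seconds * fps)


# The example parameters from the steps above: 16 beats at 135 bpm.
print(frames_for(135, 16))

# The same 16 beats at a slower tempo last longer, so they need more frames.
print(frames_for(120, 16))
```

Since the model runs once per frame, generation time grows roughly in proportion to this frame count.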
The model generates a numpy array and saves it as npys/out.npy.
To visualize this as a video on your screen, run view_npy.py in a separate Python process.
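If you want to sanity-check the saved array before viewing it, a small helper like the one below can help. It assumes only that npys/out.npy is a regular NumPy file. The function name `describe_frames` is hypothetical, and the shape and value range of the array depend on the repo's code, so treat the result as something to inspect rather than something guaranteed.

```python
import os

import numpy as np


def describe_frames(path="npys/out.npy"):
    """Load a saved frame array and summarize its shape, dtype, and value range."""
    frames = np.load(path)
    return {
        "shape": frames.shape,
        "dtype": str(frames.dtype),
        "min": float(frames.min()),
        "max": float(frames.max()),
    }


# Only report on the file if a previous run has actually produced it.
if os.path.exists("npys/out.npy"):
    print(describe_frames())
```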
To save the output as a video file, set ext='.gif' or ext='.mp4' in the arguments to generate_in_tempo().
This will create an animated GIF or MP4 video file in the renders directory.
Check out this blog post I wrote to help explain what's going on in this repo, and how you can change the parameters to fit your artistic goals.
