This dark, cataclysmic ambient song was composed by gloomghost while he was battling COVID. More than 100,000 people have died in the US, and even more across the globe. I wanted to make a more empathetic statement about the severity of this pandemic, something a mere number cannot convey. This video morphs ~100,000 artificially generated faces into one another, giving the piece a meditative, eerie look.
The audio tokens of the song, produced using VQVAE and Spleeter, are mapped to ~100,000 artificial faces generated with the StyleGAN2 FFHQ model. The images are then processed through U2Net (for precise masking), Processing, and OpenCV. Delaunay triangulation on each frame renders the final result.
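The triangulation step can be sketched roughly like this. In the real pipeline the points would come from face landmarks on the masked image; here random points stand in for them, so this is only a minimal illustration of the Delaunay pass, not the actual code.

```python
import numpy as np
from scipy.spatial import Delaunay

# Stand-in for face landmarks: 30 random points in a 512x512 frame.
rng = np.random.default_rng(0)
points = rng.uniform(0, 512, size=(30, 2))

# Delaunay triangulation over the landmark points.
tri = Delaunay(points)

# tri.simplices holds the vertex indices of each triangle; in the
# video pipeline each triangle region would be filled with a flat
# or averaged color to produce the low-poly look.
triangles = points[tri.simplices]  # shape: (num_triangles, 3, 2)
```

Each triangle can then be drawn per frame (e.g. with OpenCV's polygon fill) to get the faceted rendering.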
Back in my early days I practiced painting Mughal+Indian Miniature Art, a very complex, extravagant, but unfortunately lost form of art. I experimented for months with various ML systems to see if they could generate these forms. The astonishing results on the left show how StyleGAN2 was able to understand and generate distinguishing elements like the color tones, the border frames, and the intricate flower patterns, all of which are unique to Mughal+Indian Miniature Art.
Here, I mapped the audio track from gloomghost to the latent vectors of a StyleGAN2 model trained on a dataset of Mughal+Indian Miniature Art. A higher quality version here.
It's crazy how realistically StyleGAN2 can mimic me. The groovy, colorful background in the first video comes from a model I trained on random textures; it reacts to my machine-generated facial expressions. Other frameworks used: U2Net, OpenCV, VQVAE, Spleeter.
An audio-visualization experiment: the audio tokens of the song, produced using VQVAE and Spleeter, are mapped to the latent space of a StyleGAN2 model trained on a dataset of modern art.
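The audio-to-latent mapping can be sketched as interpolating between anchor latent vectors by a per-frame audio feature. The 512-dimensional vectors match StyleGAN2's z-space, but the feature values and anchors here are placeholders, not the actual VQVAE/Spleeter tokens.

```python
import numpy as np

# Two random anchor latents standing in for trained StyleGAN2 latents.
rng = np.random.default_rng(42)
z_a, z_b = rng.standard_normal((2, 512))

def latent_for_frame(feature: float) -> np.ndarray:
    """Blend between the anchors by an audio feature in [0, 1]
    (e.g. a normalized per-frame amplitude or token activation)."""
    t = float(np.clip(feature, 0.0, 1.0))
    return (1 - t) * z_a + t * z_b

# One latent per video frame, driven by the (placeholder) audio feature.
frames = [latent_for_frame(f) for f in (0.0, 0.5, 1.0)]
```

Each resulting latent would then be fed to the generator to synthesize one video frame, so the imagery moves in step with the audio.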
This was a tough experiment: understanding and generating something as delicate as Indian Classical Hastas and Mudras. It yielded some fascinating lessons for more refined iterations in the coming weeks.
Generated using StyleGAN2 trained on a dataset of 10,000 images of beaches around Seattle.
The model is trained on a dataset of sunset photos taken through a window in my room. The videos are generated by mapping the latent space to various inputs, including an audio track. Playing with the artificial look produced some very groovy, delightful results.
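For the latent walks behind videos like this, spherical interpolation (slerp) is a common choice because it keeps interpolated vectors at a plausible norm for Gaussian latents. This is a generic sketch of that technique, not the project's actual code; the 512-d size matches StyleGAN2's z-space and the vectors are random.

```python
import numpy as np

def slerp(t: float, z0: np.ndarray, z1: np.ndarray) -> np.ndarray:
    """Spherical interpolation between two latent vectors, t in [0, 1]."""
    z0n = z0 / np.linalg.norm(z0)
    z1n = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return z0  # vectors are (nearly) parallel; nothing to blend
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(7)
z0, z1 = rng.standard_normal((2, 512))
midpoint = slerp(0.5, z0, z1)
```

Stepping `t` frame by frame (optionally modulated by an audio feature) gives a smooth drift between sunsets rather than a straight linear blend.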