Essence AI

In this project, I use generative AI to explore the latent representations that lie between lyric-space and pixel-space. My goal is to capture the essence of music through visual art.
I believe that creativity, imagination, and truth exist in a complex, high-dimensional space. This space is where the human mind operates, and in some ways it is not unlike the latent space that sits beneath today's growing foundation models (e.g. GPT-X, DALL-E). Such a space is obscure from the outside, but when probed correctly it can reveal the beautiful concepts and patterns that are encoded deep in the maze of weights and biases. We call these concepts and patterns ideas.
Ideas are intangible. They exist as amorphous things in some complex latent space. As humans, we assign them identity through one or more modalities of our sensory capability: light, sound, taste, temperature, pressure, and smell. For example, when I see a wood-burning fireplace, I am stimulated to recall the idea of a cold winter day at my grandparents' house. This idea is much greater than just the visual depiction of a fireplace. It's the smell of dried, burning oak, the sound of my grandmother whistling in the kitchen, and the feel of warmth on my back, all encoded in a high-dimensional, graph-like representation of information.
As humans, we do not have the ability to communicate in this high-dimensional space of true signal. So we must first compress ideas into a lower-dimensional representation: a modality through which that information can be transferred (e.g. a visual depiction of a fireplace). Of course, there is loss associated with this compression. Similarly, there is a reconstruction loss associated with the reception of a modal signal. When another person is prompted with an image of a fireplace, they instinctively reconstruct that signal into a high-dimensional idea that is biased by their life experience and worldview. This is why two humans can receive the same signal but perceive two entirely different ideas from it. It is the reason for misunderstanding, and also the reason for creative interpretation.
In this project, I intend to use generative AI to combine the signals of words and sound (lyrics and audio) from my favorite songs into visual art. By imparting biases from both my own brain's neural network and modern artificial neural networks, I hope to visually capture the essence of music.
I'm inspired by the idea that one day humans will communicate at the idea level, without the need to compress information.