Imagen is an advanced image compositing engine. Users supply their own art as separate pieces, and Imagen combines those assets into thousands of variations. Creators control how the assets interact with one another and how often each asset appears in the collection as a whole.


Generative NFT collections (like Bored Apes) are created by compositing different combinations of elements into a final image, one of thousands of variations. Ordinary generative image tools simply stack layers at random, like slapping random sandwich ingredients on top of each other. But what if a collection is more complicated?


We created a sophisticated program with logic to handle complex combinations of rules. What rules? In this Wall St Fam example, each character holds a random object in either hand, which means that each hand's grip type must match the object it holds. To hold something realistically, the hand needs separate elements below and above the item so that it wraps around it.
With multiple skin tones, the engine must also choose, for each grip, the hand that matches both the body's skin tone and the object being held. Gloves and nail colors must coordinate with each hand. Hair, clothes, hats, extras, and foregrounds/backgrounds often have to exist on multiple layers, both behind and in front of the character.
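
To make the matching rules above concrete, here is a minimal Python sketch. It is not Imagen's actual code: the asset names, the OBJECT_GRIP table, and the build_layers function are all made up for illustration. It assumes each object maps to exactly one grip type and that each hand is drawn as two pieces, one below and one above the object.

```python
import random

# Hypothetical asset tables; the names are illustrative, not Imagen's real data.
OBJECT_GRIP = {"coffee": "cup_grip", "phone": "flat_grip", "briefcase": "handle_grip"}
SKIN_TONES = ["light", "medium", "dark"]


def build_layers(rng):
    """Pick traits that satisfy the matching rules; return layer names back to front."""
    skin = rng.choice(SKIN_TONES)
    layers = [f"body_{skin}"]
    for side in ("left", "right"):
        obj = rng.choice(sorted(OBJECT_GRIP))
        grip = OBJECT_GRIP[obj]  # rule: the grip type must match the object
        # Rule: both hand pieces share the body's skin tone and sandwich the object.
        layers += [
            f"{side}_{grip}_{skin}_below",
            f"{side}_{obj}",
            f"{side}_{grip}_{skin}_above",
        ]
    return layers


if __name__ == "__main__":
    for name in build_layers(random.Random()):
        print(name)
```

Each returned name stands in for an image file. A real pipeline would load the files and composite them in this order, and would add further rules for gloves, nails, hair, and clothing in the same way.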

Sandwich on steroids

It’s a complicated sandwich of many sandwiches where multiple ingredients have a top and matching bottom, and everything inside needs to coordinate and fit together.

While handling all the complex visual logic above, Imagen also lets creators define how likely each element is to appear, giving them macro-level control over the distribution of attributes across the entire collection.
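
One common way to implement this kind of frequency control is weighted random selection. The Python sketch below shows the idea under stated assumptions: the HAT_WEIGHTS table, its numbers, and the function names are hypothetical and are not taken from Imagen. It relies only on random.choices from the standard library.

```python
import random
from collections import Counter

# Hypothetical rarity weights for one trait category (relative, not percentages).
HAT_WEIGHTS = {"none": 60, "cap": 30, "crown": 10}


def pick_trait(weights, rng):
    """Return one trait name, chosen with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def generate(n, seed=0):
    """Draw n traits and count how often each one appears in the collection."""
    rng = random.Random(seed)
    return Counter(pick_trait(HAT_WEIGHTS, rng) for _ in range(n))


if __name__ == "__main__":
    print(generate(10_000))
```

Over a large collection the counts settle close to the weights, so a creator can make a trait rare by lowering its weight. Independent draws only approximate the target; a collection that needs exact counts would have to allocate traits up front instead.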