The Origins of DeepDream
DeepDream emerged from a Google research blog post in 2015. It wasn't initially designed for art creation, but rather as a way to visualize the inner workings of neural networks. The core idea was to take a trained image-classification network, such as Inception, feed it an image, and then amplify the patterns the network detected. This amplification often produced images filled with dreamlike, psychedelic imagery.
To truly understand DeepDream, it's important to grasp a few key aspects of convolutional neural networks (CNNs) and how Inception works. Deep learning networks, especially CNNs, have spurred remarkable progress in image recognition and speech recognition. Inception, a deep CNN, is trained on large datasets of images to perform classification tasks. Google engineers Alexander Mordvintsev, Christopher Olah, and Mike Tyka discovered the distinctive art style DeepDream could generate.
How DeepDream Works: A Peek Inside Neural Networks
DeepDream works by turning a neural network 'upside down'. Typically, you'd feed an image into a network and get a classification output. DeepDream takes an existing trained network, feeds in an image, and then asks the network to enhance whatever it 'sees'. This process involves gradient ascent, where the input image is tweaked iteratively to maximize the activation of specific layers or neurons.
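A minimal sketch of that gradient-ascent idea, in NumPy: here a fixed random filter stands in for a trained layer's weights (a hypothetical simplification, so the gradient can be written by hand), and its response to the image plays the role of the layer activation being maximized.

```python
import numpy as np

# Stand-in "layer": a fixed random filter. Its dot product with the image
# is the scalar "activation" we want to amplify. A real DeepDream would
# use the activations of a trained CNN layer instead.
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))       # the input image we will modify
filt = rng.normal(size=(8, 8))        # stand-in for learned layer weights

def activation(img):
    """How strongly the 'layer' responds to this image."""
    return float(np.sum(img * filt))

step_size = 0.1
before = activation(image)
for _ in range(20):
    grad = filt                       # d(activation)/d(image) for this linear layer
    image = image + step_size * grad  # ascend: tweak pixels to increase activation
after = activation(image)
```

Each iteration nudges every pixel in the direction that excites the layer most, which is exactly the "enhance whatever it sees" loop described above.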
Different layers in a CNN learn different features. Lower layers typically identify simple features like edges, lines, and textures, while higher layers learn more complex patterns and objects, like faces, animals, or buildings. By amplifying different layers, DeepDream can generate images with varying styles and complexities.

To create a DeepDream image, engineers start with an input image and select a layer, or even a specific neuron, within the neural network. They then instruct the network to modify the input image so that the selected layer or neuron becomes more 'excited'. Concretely, the image pixels are adjusted along the gradient of the chosen activation with respect to the input. This feedback loop continues iteratively, gradually altering the input image to maximize the desired activation.
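The steps above can be sketched end to end. The tiny two-layer "network" below (fixed random weights, with a ReLU) is a hypothetical stand-in for Inception; the gradient of the selected layer's mean activation is back-propagated to the pixels by hand, and only the input image changes.

```python
import numpy as np

# Toy stand-in for a trained network: a ReLU layer feeding a linear one.
# All weights are random and fixed -- only the input image is adjusted.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 64)) * 0.1   # "lower layer" filters
W2 = rng.normal(size=(8, 16)) * 0.1    # "higher layer" weights

def layer_mean_activation(img):
    """Mean activation of the selected (higher) layer for an 8x8 image."""
    h1 = np.maximum(W1 @ img.ravel(), 0.0)  # lower-layer features (ReLU)
    return float(np.mean(W2 @ h1))          # higher layer's mean response

def dream(img, steps=100, step_size=0.05):
    """Iteratively adjust pixels so the selected layer becomes more 'excited'."""
    img = img.copy()
    c = W2.mean(axis=0)                 # each h1 unit's weight in the objective
    for _ in range(steps):
        pre = W1 @ img.ravel()
        # Hand-derived gradient of the mean activation w.r.t. each pixel:
        grad = (W1.T @ (c * (pre > 0))).reshape(img.shape)
        norm = np.linalg.norm(grad)
        if norm == 0:
            break
        img += step_size * grad / norm  # normalized gradient-ascent step
    return img

image = rng.normal(size=(8, 8))
dreamed = dream(image)
```

Swapping which layer's activations feed the objective (here, the lower ReLU layer versus the higher linear one) is what shifts the result between texture-like and object-like amplification in the real system.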