"Don't paint from nature too much. Art is an abstraction. Derive this abstraction from nature while dreaming before it, and think more of the creation that will result."
- Paul Gauguin
↓ Trailer for "Moving Still" (2022).
/ / What is Moving Still?
Moving Still is a 13-minute experimental short film and art installation. It takes the viewer on an odyssey through constantly morphing and pulsating nature scenes with an eerie, dreamlike atmosphere. The visuals, both interconnected and disintegrating, evoke a haunting liminal space.
An ever-growing stem of memories, past or present? Clutching onto silhouettes and shadows in a world too fast to perceive, running into the unknown abyss.
//Project Info
data = {
    "date": "2022-05", //+ continuous work till present
    "type": "personal project",
    "contributor": "Benno Schulze"
}
Moving Still was created as a passion project, stemming from lengthy experiments (more about that in the project insight) with GauGAN Beta and, later on, GauGAN 2. I found the basic concept of being able to produce artificial, photorealistic scenes of nature immensely intriguing. What I found even more fascinating, however, were the technical aspects—the inner workings of the GAN: understanding how it works, dissecting its processes, testing its limits, and using its weaknesses as a stylistic device rather than trying to create a perfect copy of reality.
Supporting and enhancing the visual narrative with the use of AI became my primary focus. From what I learned about GANs, I always drew parallels to the human brain: neurons firing, creating artificial imagery right before your very own eyes. You can imagine the shape of a house, the number of windows, the color of the door, and, drawing from images you've seen and environmental influences (essentially the training data), your brain fills in the shapes to produce a somewhat realistic image with ease. Back to the GAN: the strong divergence between its individual video frames stems directly from the limited capabilities of GauGAN Beta (2019) / GauGAN 2 (2021), developed by Taesung Park et al. at NVIDIA Research AI. Although it is no longer available, it was (to my knowledge) the first image generator made available to the public. The GAN (Generative Adversarial Network) was trained on 10 million—unconnected—reference images of landscapes and, as such, lacks frame consistency, since video synthesis was never part of its training data.
Even though I created the first version of the short film back in 2022, I have since made multiple additions to both the visual and auditory layers, and I still have things to work on and experiment with out of pure joy for the base idea. Some of those changes found their way into the project insight.
↓ Further technical input / documentation on GauGAN2
data = {
    "web-resources": [
        "Semantic Image Synthesis with Spatially-Adaptive Normalization",
            //[Taesung_Park;Ming-Yu_Liu;Ting-Chun_Wang;Jun-Yan_Zhu]
            //[arxiv.org][PDF]
        "Understanding GauGAN",
            //[Ayoosh_Kathuria]
            //[paperspace.com]
            //[Part1]:_Unraveling_Nvidia's_Landscape_Painting_GANs
            //[Part2]:_Training_on_Custom_Datasets
            //[Part3]:_Model_Evaluation_Techniques
            //[Part4]:_Debugging_Training_&_Deciding_If_GauGAN_Is_Right_For_You
        "GauGAN for conditional image generation",
            //[Soumik_Rakshit;Sayak_Paul]
            //[keras.io]
    ]
}
/ / Concept
The lack of frame consistency, which results in a surreal, abstract pulsation of shapes and edges, abrupt changes in lighting moods, or even the complete replacement of objects, sets a new layer of narration. The image surface is held together solely by the silhouette and composition of its visual elements. Further discomfort, intentionally evoked in the recipient, stems from the dissonance between various visual elements within a single frame. While the camera pans and objects or trees move, other elements, such as the ground, appear to remain static. Depending on the subjective focus of the viewer, the scenes, despite their linear progression, can therefore have a completely different impact and perceived controllable component. Apart from the image-controlling segmentation maps (LINK), the outcome is entirely left to the GAN. The recipient is watching a virtual, artificial copy of a landscape that never existed, or did it?
On another immersive level—parallel to the video—the auditory layer creates its own abstraction of the senses. At first, there are low-frequency sound effects, barely consciously perceptible, such as an almost omnipresent rattling, the playback of memories or video frames, similar to an old film projector. During certain phases, calibrated highs and lows offer the viewer moments to dive in as well as moments to breathe. The intra-diegetic soundscape is enriched with subtle, experimental music elements created by Azure Studios.
↓ Frame 6964 (Moving Still)
<INSIGHT>
Excerpt from the work-in-progress material, showcasing the steady improvement in quality and animation.
The short film began as an exploration of various techniques using Cinema4D and "GauGAN2". The core idea and workflow centered on creating segmentation maps, where solid colors were used to define shapes and objects.
Each specific hex color code corresponded to a distinct object or material type—such as light blue for the sky, green for a meadow, or gray for a stone. Further direction for "GauGAN2" can be given by uploading a style image as a reference for the rough color palette and mood.
Those segmentation maps were created in Cinema4D and rendered out as image sequences to be processed by "GauGAN2".
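To illustrate the principle, a single frame of such a map could also be generated in code. This is only a minimal sketch; the hex values below are placeholders, not the exact palette GauGAN2 expects:
from PIL import Image, ImageDraw

# Placeholder colors - in GauGAN2 each material is tied to one fixed hex code.
SKY, MEADOW, STONE = "#87CEEB", "#3BA13B", "#808080"

frame = Image.new("RGB", (512, 512), SKY)        # sky fills the whole canvas
draw = ImageDraw.Draw(frame)
draw.rectangle([0, 320, 511, 511], fill=MEADOW)  # lower third becomes meadow
draw.ellipse([200, 360, 300, 430], fill=STONE)   # one stone in the foreground
frame.save("frame_0001.png")                     # one frame of the sequence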
↓ Segmentation sequence, rendered in Cinema4D
GauGAN2 output with two different style filters enabled. Notice that the silhouettes are not strictly followed; they rather guide the overall composition, which the GAN is still free to adjust. In this example, the small patches of clouds connect with each other in the generated image, even though they are disconnected on the segmentation map.
The web interface of GauGAN (Beta) around 2020 (it was first released in 2019). As you can tell, it looks rather rudimentary compared to today's GANs.
Though one must keep in mind that it was the first generative adversarial network (GAN) for artificial image generation, at least the first one released to the public.
↓ Comment on Reddit about the newly released GauGAN back in 2019... 2025, here we are.
/ / Jumping to GauGAN2 + Automation
I had played around with GauGAN (Beta) a bit but kind of forgot about it. In 2022, I got back to it with "GauGAN2", initially for a Luft & Laune event, where it was used as a social media story ad and as live visual content on stage.
↓ The web interface of GauGAN 2 – just as with the beta, the processing was outsourced to servers provided by NVIDIA.
While creating still images and short video sequences was enjoyable, I found the long, aggressively pulsating video scenes of nature to be the most fascinating. This was due to the GAN's lack of frame consistency—unsurprising, given that it was only trained to generate single images.
As we’ve seen before, there’s always some variability in how the GAN processes input, even when using the exact same segmentation map.
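As a quick way to put a number on that variability, one could compare two outputs generated from the exact same segmentation map. A minimal sketch; the file names are placeholders:
import imageio
import numpy as np

# Two outputs rendered from the very same segmentation map (hypothetical names).
frame_a = imageio.imread("same_map_run_a.png").astype(np.float32)
frame_b = imageio.imread("same_map_run_b.png").astype(np.float32)

# Mean absolute per-pixel difference: 0 means identical, 255 means maximal.
difference = np.abs(frame_a - frame_b).mean()
print(f"Mean per-pixel difference: {difference:.1f} / 255")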
/ / Utilizing a .py (Python) script for bulk processing
The main issue, though, was the web interface, which, at the time, was the only way to use the GAN. It allowed just one upload at a time—you had to click “process,” wait about seven seconds, and then manually download the generated output. Doing this hundreds or even thousands of times would have been absolutely dreadful and mind-numbing.
So with the help of @Paul Schulze, I enhanced a Python script—originally created by @gormlabenz—for bulk uploading and downloading of input segmentation maps. Modifications also made it possible to set a style image and execute multiple iterations simultaneously.
import base64
import os
import time
from glob import glob
from tqdm import tqdm
import imageio
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options
class Gaugan2Renderer:
    def __init__(self, waiting_time=5):
        # Seconds to wait for the NVIDIA servers to finish rendering a frame
        self.waiting_time = waiting_time
        self.output_images = []
        # Chrome setup kept for reference; the script currently drives Firefox
        chrome_options = Options()
        #chrome_options.add_argument("--headless")
        #chrome_options.add_argument("--remote-debugging-port=9222")
        #chrome_options.binary_location = "/usr/bin/chromedriver"
        self.driver = webdriver.Firefox(
            #ChromeDriverManager().install(),
            #options=chrome_options
        )

    def open(self):
        # Load the GauGAN2 web interface and wait until the viewport exists
        self.driver.get("http://gaugan.org/gaugan2/")
        WebDriverWait(self.driver, 10).until(
            EC.presence_of_element_located((By.ID, "viewport"))
        )
        self.close_popups()

    def close_popups(self):
        # Dismiss the intro popup and accept the terms and conditions
        close_button = self.driver.find_element(
            By.XPATH, "/html/body/div[2]/div/header/button")
        if close_button:
            close_button.click()
        terms_and_conditions = self.driver.find_element(
            By.XPATH, '//*[@id="myCheck"]')
        if terms_and_conditions:
            terms_and_conditions.click()

    def download_image(self, file_path):
        # Read the rendered result straight from the output canvas as PNG
        output_canvas = self.driver.find_element(By.ID, 'output')
        canvas_base64 = self.driver.execute_script(
            "return arguments[0].toDataURL('image/png').substring(21);",
            output_canvas)
        canvas_png = base64.b64decode(canvas_base64)
        with open(file_path, 'wb') as f:
            f.write(canvas_png)

    def create_output_dir(self):
        os.makedirs(self.output_path, exist_ok=True)

    def render_image(self, file_path, style_filter_path):
        # Upload the segmentation map
        self.driver.find_element(
            By.XPATH, '//*[@id="segmapfile"]').send_keys(file_path)
        self.driver.find_element(
            By.XPATH, '//*[@id="btnSegmapLoad"]').click()
        # Upload the custom style filter image
        self.driver.find_element(
            By.XPATH, '//*[@id="imgfile"]').send_keys(style_filter_path)
        self.driver.find_element(
            By.XPATH, '//*[@id="btnLoad"]').click()
        # Trigger the rendering
        self.driver.find_element(
            By.XPATH, '//*[@id="render"]').click()

    def run(self, input_folder, style_filter_path, output_path):
        # Process every segmentation map in the input folder, one by one;
        # sorting keeps the frames in sequence order for the final video
        self.image_paths = sorted(glob(input_folder + "/*.png"))
        self.output_path = output_path
        self.open()
        self.create_output_dir()
        for file_path in tqdm(self.image_paths):
            file_path = os.path.abspath(file_path)
            basename = os.path.basename(file_path)
            output_image = os.path.join(self.output_path, basename)
            self.render_image(file_path, style_filter_path)
            time.sleep(self.waiting_time)
            self.download_image(output_image)
            self.output_images.append(output_image)
        self.driver.close()

    def create_video(self, output_video):
        # Stitch the downloaded frames back together into a video
        images = [imageio.imread(image) for image in self.output_images]
        imageio.mimsave(output_video, images, fps=10)
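For reference, this is roughly how the class could be used; the folder and file names here are placeholders, not the actual project paths:
# Hypothetical usage of the renderer above.
renderer = Gaugan2Renderer(waiting_time=7)
renderer.run(
    input_folder="segmentation_maps",              # Cinema4D image sequence (.png)
    style_filter_path="/absolute/path/style.png",  # style reference image
    output_path="gaugan2_output",
)
renderer.create_video("preview.mp4")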
A quick test involved using shapes that didn’t align with their designated "colors" (object/material types, such as stone). I noticed that all objects and materials on the segmentation map seemed interconnected. For example, if a small patch of snow was placed in the foreground, trees in the background would also appear snow-covered, even if the segmentation map didn’t explicitly include snow in those areas. The same goes for the fog in the examples below.
↓ Fog + building
↓ Fog + stone
↓ Fog + stone
↓ Fog + tree
Same test but with a shark I rigged and animated
↓ Building + clouds
↓ Fog + building
↓ Fog + clouds
↓ Mountains + clouds
/ / Starting the journey
Over time, I kind of figured out what works and what doesn’t, discovered a visual aesthetic, and developed a visual narrative and perception I was excited to explore more deeply.
However, a major issue persisted. As mentioned before, every element of the segmentation map (e.g., dirt) is connected to the other elements on it (e.g., snow). But when the same elements appear on two otherwise different segmentation maps, even though they differ in size and location, the segmentation map seems to act similarly to a masking process.
This means that if the bottom half is covered in light blue, representing straw, this part — in its output — will almost always have the same look [1]. One could even say it’s the same picture. Even if the pattern is broken up by smaller dots, like stones or bushes (in the segmentation input) [2], it still remains unchanged, as the GAN only seems to include parts once they reach a certain size threshold.
And this isn’t an isolated issue with just this combination of elements — it happens with almost anything. This could be due to several factors: insufficient variation in the training data, issues with the seed (which basically adds a randomness factor to the result), or something in the script used for bulk processing.
Regardless, when attempting to create moving scenery, it becomes distracting — perhaps even nauseating — when some elements appear to move along while others, like the ground, seem to remain still, at least with this degree of persistence.
↓ [1] Issue visualized: Similar output, different input
↓ [2] Issue visualized: Similar output, different input
↓ Issue visualized: Lack of power to recognize camera movement
The learnings from this are that the ground and other large elements need to meet a few conditions:
A: They have to be small enough / far enough away that the difference between adjacent frames is large enough for the outputs to look distinguishable from one another.
B: They can’t be too small either, as beyond a certain threshold (a minimum size of roughly 15x15 px on a 512x512 full-resolution input map) elements are no longer processed (see the sketch below).
C: Big elements, such as the ground, need to be constantly broken up with various DIFFERENT elements (represented as colors in the segmentation map) in order for the camera movement to be recognized by the recipient.
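To check point B in advance, a segmentation frame can be scanned per color before uploading it. A minimal sketch: it counts the total area per color, which is only a rough approximation of the ~15x15 px observation above, not an official limit, and the frame path is a placeholder:
import numpy as np
from PIL import Image

MIN_PIXELS = 15 * 15  # rough, observed threshold on a 512x512 map

def undersized_elements(segmap_path):
    # Count how many pixels each color (= element type) covers in total.
    pixels = np.array(Image.open(segmap_path).convert("RGB")).reshape(-1, 3)
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    return [(tuple(int(v) for v in c), int(n))
            for c, n in zip(colors, counts) if n < MIN_PIXELS]

# Hypothetical frame from the rendered segmentation sequence.
for color, area in undersized_elements("segmentation_maps/frame_0001.png"):
    print(f"{color} covers only {area}px and will likely be ignored")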
↓ Issue visualized: Lack of power to recognize camera movement, less visible here due to the low contrast of the sand texture
In the example below, you can tell that fixing the problems mentioned above (regarding the segmentation maps) substantially improved the output generated by GauGAN2.