In Part A, we play with diffusion models: we implement diffusion sampling loops and use them for other tasks such as inpainting and creating optical illusions. All experiments use the DeepFloyd IF diffusion model.
In this part, I used num_inference_steps=20 for Stage 1 and num_inference_steps=5 for Stage 2. The images are shown below.
Stage 1 with num_inference_steps=20 (Top) and Stage 2 with num_inference_steps=5 (Bottom)
To implement the forward process, I created a forward function that applies the equations from the project spec to the test image. The results are shown below.
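A minimal sketch of this forward function, assuming alphas_cumprod is the scheduler's cumulative product of alphas (variable names are mine, not necessarily the project's):

```python
import torch

def forward(im, t, alphas_cumprod):
    """DDPM forward process: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    a_bar = alphas_cumprod[t]
    eps = torch.randn_like(im)  # fresh Gaussian noise, same shape as the image
    return torch.sqrt(a_bar) * im + torch.sqrt(1 - a_bar) * eps
```

At t where a_bar is close to 1 the image is nearly untouched; as a_bar shrinks, the output approaches pure noise.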
In this part, I applied torchvision.transforms.functional.gaussian_blur to each of the noisy images from the previous part. The results are shown below, with the top image being the noisy input and the bottom being the blurred "denoised" result.
To one-step denoise, I first applied the forward process to the test image, then passed the result through stage_1.unet to estimate the noise. Then, I used the equation (im_noisy - noise_est * torch.sqrt(1 - alpha_cumprod)) / torch.sqrt(alpha_cumprod) to recover a clean-image estimate. The results are shown below:
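The clean-image estimate above is just the forward process solved for x_0. A hedged sketch, with noise_est standing in for the UNet's output:

```python
import torch

def one_step_denoise(im_noisy, noise_est, t, alphas_cumprod):
    """Invert the forward process given a noise estimate:
    x0_hat = (x_t - sqrt(1 - a_bar_t) * eps_hat) / sqrt(a_bar_t)."""
    a_bar = alphas_cumprod[t]
    return (im_noisy - torch.sqrt(1 - a_bar) * noise_est) / torch.sqrt(a_bar)
```

If the noise estimate were exact, this would recover x_0 perfectly; in practice the estimate is imperfect, which is why iterative denoising works better.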
To iteratively denoise, I applied equations 6 and 7 of the DDPM paper in the project spec, with strided timesteps from 990 to 0. The results are displayed below:
Noisy Campanile at t=690 to 0
Single Denoising Step (Top) and Gaussian Blurring (Bottom)
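One strided update from equations 6 and 7 can be sketched roughly as below (variable names are mine; x0_hat is the one-step clean-image estimate, and add_noise=False models the final step at t' = 0, where no noise is added):

```python
import torch

def iterative_denoise_step(x_t, x0_hat, t, t_prime, alphas_cumprod, add_noise=True):
    """One strided DDPM update: interpolate between the current noisy image
    and the clean-image estimate, then add scaled Gaussian noise."""
    a_bar_t = alphas_cumprod[t]
    a_bar_tp = alphas_cumprod[t_prime]
    alpha = a_bar_t / a_bar_tp   # effective alpha for this stride
    beta = 1 - alpha
    x = (torch.sqrt(a_bar_tp) * beta / (1 - a_bar_t)) * x0_hat \
        + (torch.sqrt(alpha) * (1 - a_bar_tp) / (1 - a_bar_t)) * x_t
    if add_noise:
        x = x + torch.sqrt(beta) * torch.randn_like(x_t)  # variance term
    return x
```

Looping this from t=990 down to t=0 with a stride (e.g. 30) gives the full iterative_denoise procedure.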
To generate images from scratch, I generated random noise with torch.randn and applied my iterative_denoise function on it. The 5 sampled images are shown below:
To implement iterative_denoise_cfg, I edited my iterative_denoise function to compute noise_est = uncond_noise_est + scale * (noise_est - uncond_noise_est), which requires a second, unconditional UNet pass per step. Then, using the same procedure as 1.5, I generated random noise with torch.randn and applied my iterative_denoise_cfg function to it. The 5 sampled images are shown below:
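The classifier-free guidance combination is a one-liner; a sketch (argument names are mine):

```python
import torch

def cfg_noise_estimate(cond_est, uncond_est, scale=7.0):
    """Classifier-free guidance: push the conditional noise estimate away
    from the unconditional one by a factor of `scale`. scale=1 recovers
    plain conditional sampling; larger values strengthen the prompt."""
    return uncond_est + scale * (cond_est - uncond_est)
```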
In this part, I applied the forward process to the test image, Kingaroo, and Uniqloroo, then applied my iterative_denoise_cfg function to each at the starting noise levels [1, 3, 5, 7, 10, 20]. The results are shown below:
Test Image
Kingaroo
Uniqloroo
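The image-to-image procedure amounts to partially noising the input and then running the usual denoising loop from that point. A minimal driver sketch, where forward_fn and denoise_fn are stand-ins for the forward process and the per-step iterative_denoise_cfg update from earlier parts (hypothetical signatures):

```python
def sdedit(im, i_start, timesteps, forward_fn, denoise_fn):
    """SDEdit-style image-to-image editing: noise the input to
    timesteps[i_start], then denoise back. Smaller i_start (more noise)
    gives outputs that stray further from the original image."""
    x = forward_fn(im, timesteps[i_start])  # partially destroy the input
    for t in timesteps[i_start:]:
        x = denoise_fn(x, t)  # project back toward the natural-image manifold
    return x
```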
In this part, I found an image online and hand-drew two more. Then, I passed the images through the starter code. The results are shown below:
Online Image
Hand Drawn 1
Hand Drawn 2
In this part, I implemented the inpaint function by editing my iterative_denoise_cfg code to apply equation 5 from the project spec, which uses the mask to keep the region outside it fixed. The results are shown below (left: image and mask; right: inpainted result):
Test Image
Kingaroo
Uniqloroo
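The masking step applied after each denoising update can be sketched as follows (a hedged sketch; forward_fn stands in for the forward process, and mask == 1 marks pixels the model is free to repaint):

```python
import torch

def inpaint_step(x_t, t, im_orig, mask, forward_fn):
    """Equation-5-style constraint: keep the masked region from the
    denoising loop, and overwrite everything else with the original
    image noised to the current timestep."""
    return mask * x_t + (1 - mask) * forward_fn(im_orig, t)
```

Re-noising the original each step (rather than pasting it in clean) keeps the noise statistics consistent with what the UNet expects at timestep t.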
In this part, I did the same as in 1.7 but conditioned on the prompt "a rocket ship". The results are below.
Test Image
Kingaroo
Uniqloroo
In this part, I implemented make_flip_illusion by editing my iterative_denoise_cfg function to apply the equations from the project spec (running the UNet on both the image and its flipped version). I then ran it on three different pairs of prompts, shown below:
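The combined noise estimate for a flip illusion can be sketched like this (a sketch under my own naming, with `unet` standing in for the conditioned UNet call):

```python
import torch

def flip_illusion_noise_est(unet, x_t, t, emb1, emb2):
    """Average the noise estimate for prompt 1 on the upright image with
    the estimate for prompt 2 on the vertically flipped image (flipped
    back), so the sample denoises toward prompt 1 upright and prompt 2
    upside-down simultaneously."""
    eps1 = unet(x_t, t, emb1)
    eps2 = torch.flip(unet(torch.flip(x_t, dims=[-2]), t, emb2), dims=[-2])
    return (eps1 + eps2) / 2
```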
In this part, I implemented make_hybrids by editing my iterative_denoise_cfg function to apply the equations from the project spec (combining the two UNet noise estimates with low- and high-pass filters). I then ran it on three different pairs of prompts, shown below:
"a lithograph of a skull" and "a lithograph of waterfalls"
"a rocket ship" and "a pencil"
"an oil painting of a snowy mountain village" and "a photo of a hipster barista"
In this sub-project, we train diffusion models on the MNIST dataset: a Single-Step Denoising UNet, a Time-Conditioned UNet, and a Class-Conditioned UNet. Credit to my CS189 Homework 6 from last semester for inspiring my training code.
Using the skeleton code and the diagram provided in the project spec, I implemented the Conv and UnconditionalUNet classes. I then created the helper functions add_noise and add_noise_to_dataset, which add noise to a single image and to an entire dataset, respectively, and ran them with sigma = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]. Then I trained the model on the MNIST dataset and sampled results on the test set after the first and fifth epochs, and, once training finished, with out-of-distribution noise levels. The results are shown below:
Noising Process for Different Sigmas
Training Loss Curve
Sample Results on the Test Set (Epoch 1)
Sample Results on the Test Set (Epoch 5)
Sample Results on the Test Set With Out-of-Distribution Noise Levels
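The noising helpers are simple additive-Gaussian operations; a minimal sketch under my own naming:

```python
import torch

def add_noise(x, sigma):
    """z = x + sigma * eps: build a (noisy, clean) training pair for the
    single-step denoising UNet."""
    return x + sigma * torch.randn_like(x)

def add_noise_to_dataset(images, sigma):
    """Noise every image in a dataset at one fixed sigma level."""
    return torch.stack([add_noise(im, sigma) for im in images])
```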
Using the skeleton code and the diagram provided in the project spec, I implemented the FCBlock, TimeConditional, and DDPM classes, as well as the ddpm_schedule, ddpm_forward, and ddpm_sample functions. Then I trained the model on the MNIST dataset and sampled results on the test set after the fifth and twentieth epochs. The results are shown below:
Training Loss Curve
Sample Results on the Test Set (Epoch 5)
Sample Results on the Test Set (Epoch 20)
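A sketch of what ddpm_schedule precomputes, assuming the common linear beta ramp (the default beta1/beta2/T values here are illustrative, not necessarily the spec's):

```python
import torch

def ddpm_schedule(beta1=1e-4, beta2=0.02, T=300):
    """Precompute the DDPM variance schedule: a linear beta ramp and the
    derived alpha and cumulative-product alpha-bar tensors, which
    ddpm_forward and ddpm_sample then index by timestep."""
    betas = torch.linspace(beta1, beta2, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    return {"beta": betas, "alpha": alphas, "alpha_bar": alpha_bars}
```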
My process for this part was essentially identical to the previous one, except I added two extra FCBlocks, as mentioned in the spec. Then I once again trained the model on the MNIST dataset and sampled results on the test set after the fifth and twentieth epochs. The results are shown below:
Training Loss Curve
Sample Results on the Test Set (Epoch 5)
Sample Results on the Test Set (Epoch 20)
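Class conditioning feeds a one-hot class vector through the extra FCBlocks, and during training the conditioning is randomly dropped for a fraction of the batch so the same model also learns the unconditional distribution (which classifier-free guidance needs at sampling time). A sketch of that conditioning step (function name and the 10% drop rate are my illustrative choices):

```python
import torch

def class_condition(labels, num_classes=10, p_uncond=0.1):
    """One-hot encode class labels, then zero out the conditioning vector
    for roughly p_uncond of the batch (unconditional training examples)."""
    c = torch.nn.functional.one_hot(labels, num_classes).float()
    mask = (torch.rand(labels.shape[0], 1) >= p_uncond).float()
    return c * mask
```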