In Part A, we play with diffusion models, implement diffusion sampling loops, and use them for other tasks such as inpainting and creating optical illusions. We use the DeepFloyd IF diffusion model throughout.
In this part, I used num_inference_steps=20 for Stage 1 and num_inference_steps=5 for Stage 2. The images are shown below.
To implement the forward process, I created a forward function that applies the equations from the project spec to the test image. The results are shown below.
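A minimal sketch of what my forward function computes, assuming a precomputed alphas_cumprod tensor (the argument names here are my own, not from the spec):

```python
import torch

def forward(im, t, alphas_cumprod):
    """Noise a clean image im to timestep t:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    eps = torch.randn_like(im)
    a_bar = alphas_cumprod[t]
    return torch.sqrt(a_bar) * im + torch.sqrt(1 - a_bar) * eps
```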
In this part, I applied torchvision.transforms.functional.gaussian_blur to each of the noisy images from the previous part. The results are shown below, with the top image being the noisy image and the bottom being the denoised one.
To one-step denoise, I first applied the forward process to the test image, then passed the result through stage_1.unet to get a noise estimate. Then, I used the equation (im_noisy - noise_est * torch.sqrt(1 - alpha_cumprod)) / torch.sqrt(alpha_cumprod) to remove the noise from the image. The results are shown below:
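As a sketch, the one-step clean-image estimate is just the forward equation solved for x_0 (argument names here are illustrative):

```python
import torch

def one_step_denoise(im_noisy, noise_est, alpha_cumprod_t):
    """Invert the forward process in one step given a noise estimate:
    x_0 ~= (x_t - sqrt(1 - alpha_bar_t) * eps_hat) / sqrt(alpha_bar_t)."""
    return (im_noisy - noise_est * torch.sqrt(1 - alpha_cumprod_t)) / torch.sqrt(alpha_cumprod_t)
```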
To iteratively denoise, I applied equations 6 and 7 of the DDPM paper as given in the project spec, with strided timesteps from 990 down to 0. The results are displayed below:
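A sketch of the strided sampling loop, assuming unet(x, t) returns a noise estimate and alphas_cumprod is precomputed (the helper signature is my own, not DeepFloyd's API):

```python
import torch

def iterative_denoise(x, unet, alphas_cumprod, timesteps):
    """Strided DDPM sampling: timesteps runs from high noise (e.g. 990) down to 0.
    unet(x, t) is assumed to return a noise estimate with the same shape as x."""
    for i in range(len(timesteps) - 1):
        t, t_prev = timesteps[i], timesteps[i + 1]
        a_bar, a_bar_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
        alpha = a_bar / a_bar_prev  # effective per-stride alpha
        beta = 1 - alpha
        eps = unet(x, t)
        # Estimate the clean image, then interpolate (DDPM eqs. 6 and 7).
        x0_est = (x - torch.sqrt(1 - a_bar) * eps) / torch.sqrt(a_bar)
        mean = (torch.sqrt(a_bar_prev) * beta / (1 - a_bar)) * x0_est \
             + (torch.sqrt(alpha) * (1 - a_bar_prev) / (1 - a_bar)) * x
        # Add fresh noise except at the final step.
        noise = torch.randn_like(x) if t_prev > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(beta) * noise
    return x
```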
To generate images from scratch, I generated random noise with torch.randn and applied my iterative_denoise function to it. The 5 sampled images are shown below:
To implement iterative_denoise_cfg, I edited my iterative_denoise function so that noise_est = uncond_noise_est + scale * (noise_est - uncond_noise_est). Then, using the same procedure as in 1.5, I generated random noise with torch.randn and applied my iterative_denoise_cfg function to it. The 5 sampled images are shown below:
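The classifier-free guidance combination itself is only a few lines; the unet signature below is a simplification of the real DeepFloyd call:

```python
import torch

def cfg_noise_est(unet, x, t, cond_emb, uncond_emb, scale=7.0):
    """Classifier-free guidance: push the noise estimate past the conditional
    one, away from the unconditional one, by a factor of `scale`."""
    noise_est = unet(x, t, cond_emb)
    uncond_noise_est = unet(x, t, uncond_emb)
    return uncond_noise_est + scale * (noise_est - uncond_noise_est)
```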
In this part, I applied the forward process to the test image, Kingaroo, and Uniqloroo, then ran my iterative_denoise_cfg function on each at the noise levels [1, 3, 5, 7, 10, 20]. The sampled images are shown below:
In this part, I found one image online and drew two myself. Then, I passed these images through the same procedure using the starter code. The results are shown below:
In this part, I implemented the inpaint function by editing my iterative_denoise_cfg code to apply equation 5 from the project spec, which accounts for the mask. The results are shown below (left is the image and mask, right is the inpainted result):
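A sketch of the per-step masking that equation 5 describes, with my own argument names; mask == 1 marks the region to generate, and the rest is forced back to a re-noised copy of the original:

```python
import torch

def inpaint_step(x_t, x_orig, mask, t, alphas_cumprod):
    """After each denoising step, replace the unmasked region with the
    original image noised to the current timestep t."""
    a_bar = alphas_cumprod[t]
    eps = torch.randn_like(x_orig)
    x_orig_t = torch.sqrt(a_bar) * x_orig + torch.sqrt(1 - a_bar) * eps
    return mask * x_t + (1 - mask) * x_orig_t
```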
In this part, I did the same as in 1.7 but with the prompt "a rocket ship". The results are below.
In this part, I implemented make_flip_illusion by editing my iterative_denoise_cfg function to apply the equations from the project spec (which account for the two UNet passes and the flipping). I then ran it on three different pairs of prompts, shown below:
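The core change is how the noise estimate is formed at each step; a sketch with an assumed unet(x, t, emb) signature:

```python
import torch

def flip_illusion_noise_est(unet, x, t, emb1, emb2):
    """Average the noise estimate for prompt 1 with the un-flipped estimate
    for prompt 2 on the vertically flipped image, so the result reads as one
    image right side up and the other upside down."""
    eps1 = unet(x, t, emb1)
    eps2 = torch.flip(unet(torch.flip(x, dims=[-2]), t, emb2), dims=[-2])
    return (eps1 + eps2) / 2
```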
In this part, I implemented make_hybrids by editing my iterative_denoise_cfg function to apply the equations from the project spec (which account for the two UNet passes and the filtering). I then ran it on three different pairs of prompts, shown below:
In this sub-project, we train diffusion models on the MNIST dataset: a single-step denoising UNet, a time-conditioned UNet, and a class-conditioned UNet. Credit to my CS189 Homework 6 from last semester for inspiring my training code.
Using the provided skeleton code and the diagram in the project spec, I implemented the Conv and UnconditionalUNet classes. I then created the functions add_noise and add_noise_to_dataset to add noise to a single image and to an entire dataset, respectively. I ran these functions with sigma = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0] on the dataset. Then I trained the model on MNIST and sampled results on the test set after the first and fifth epochs, as well as with out-of-distribution noise levels after training finished. The results are shown below:
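A sketch of the noising helpers, assuming the pure additive-noise model z = x + sigma * eps that the single-step denoiser is trained against:

```python
import torch

def add_noise(x, sigma):
    """Additive Gaussian noise at a single level: z = x + sigma * eps."""
    return x + sigma * torch.randn_like(x)

# Noise a dummy MNIST-sized batch at every level used in the writeup.
sigmas = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
x = torch.rand(16, 1, 28, 28)
noised = {s: add_noise(x, s) for s in sigmas}
```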
Using the provided skeleton code and the diagram in the project spec, I implemented the FCBlock, TimeConditional, and DDPM classes, as well as the ddpm_schedule, ddpm_forward, and ddpm_sample functions. Then I trained the model on MNIST and sampled results on the test set after the fifth and twentieth epochs. The results are shown below:
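A sketch of what ddpm_schedule computes under a linear beta schedule (the return format and argument names are my own choices):

```python
import torch

def ddpm_schedule(beta1, beta2, num_ts):
    """Linear beta schedule plus the derived per-step and cumulative alphas."""
    betas = torch.linspace(beta1, beta2, num_ts)
    alphas = 1 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    return {"betas": betas, "alphas": alphas, "alphas_cumprod": alphas_cumprod}
```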
My process for this part was essentially identical to the previous one, except I added two extra FCBlocks, as mentioned in the spec. Then I once again trained the model on MNIST and sampled results on the test set after the fifth and twentieth epochs. The results are shown below: