Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It does use the UNet to denoise the VAE compressed image:

"The dithering of the palettized latents has introduced noise, which distorts the decoded result. But since Stable Diffusion is based on de-noising of latents, we can use the U-Net to remove the noise introduced by the dithering."

The included Colab doesn't have line numbers, but you can see the code doing it:

  # Use Stable Diffusion U-Net to de-noise the dithered latents
  latents = denoise(latents)
  denoised_img = to_img(latents)
  display(denoised_img)
  del latents
  print('VAE decoding of de-noised dithered 8-bit latents')
  print('size: {}b = {}kB'.format(sd_bytes, sd_bytes/1024.0))
  print_metrics(gt_img, denoised_img)


I stand corrected, then :) cheers.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: