DiffuScene

Notes

3D Scene $s$ → fully-connected scene graph $x_0$
- each object is a graph node storing all object attributes, i.e. location, size, orientation, class label, and latent shape code.
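
A minimal sketch of how such a node and the fully-connected graph could be represented. The `ObjectNode` class, the `scene_to_graph` helper, the cos/sin orientation encoding, and all dimensions are illustrative assumptions, not taken from the DiffuScene code.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ObjectNode:
    """One object of the scene (hypothetical layout, for illustration only)."""
    location: List[float]    # (x, y, z) centre of the bounding box
    size: List[float]        # (width, height, depth)
    orientation: float       # rotation about the vertical axis, in radians
    class_id: int            # index into an assumed list of object classes
    shape_code: List[float]  # latent shape code

    def to_vector(self, num_classes: int) -> List[float]:
        # Concatenate all attributes into a single feature vector.
        one_hot = [0.0] * num_classes
        one_hot[self.class_id] = 1.0
        # Assumption: the angle is encoded as (cos, sin) to avoid wrap-around.
        angle = [math.cos(self.orientation), math.sin(self.orientation)]
        return list(self.location) + list(self.size) + angle + one_hot + list(self.shape_code)


def scene_to_graph(objects: List[ObjectNode], num_classes: int) -> Tuple[List[List[float]], List[Tuple[int, int]]]:
    """Return node features x_0 and the edges of a fully-connected graph."""
    x0 = [obj.to_vector(num_classes) for obj in objects]
    n = len(objects)
    edges = [(i, j) for i in range(n) for j in range(n) if i != j]
    return x0, edges
```

With two objects, `scene_to_graph` returns two feature vectors and the two directed edges `(0, 1)` and `(1, 0)`.
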
DiffuScene → a diffusion model defined over the set of all possible $x_0$
- Forward: gradually add noise to $x_0$ until it becomes standard Gaussian noise $x_T$
- Reverse: a denoising network cleans the noisy graph using ancestral sampling
  - use the denoised object features to perform shape retrieval
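
The forward and reverse processes above can be sketched with the standard DDPM update rules on a flattened feature vector. This is a sketch under assumptions: the linear noise schedule, the function names, and the `predict_eps` callback (a stand-in for the denoising network) are not from the notes, and the step index `t` is 0-based.

```python
import math
import random


def linear_betas(T, beta_start=1e-4, beta_end=0.02):
    # Assumed linear variance schedule with T > 1 steps.
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]


def alpha_bars(betas):
    # Cumulative products abar_t = prod_{s <= t} (1 - beta_s).
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Forward: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, s = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    return [a * x + s * e for x, e in zip(x0, eps)], eps


def ancestral_sample(predict_eps, dim, betas, abar, rng):
    """Reverse: start from Gaussian noise x_T and denoise step by step."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps_hat = predict_eps(x, t)  # stand-in for the denoising network
        coef = betas[t] / math.sqrt(1.0 - abar[t])
        mean = [(xi - coef * e) / math.sqrt(1.0 - betas[t]) for xi, e in zip(x, eps_hat)]
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the last step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x
```

As a sanity check, passing an oracle `predict_eps` that returns the true noise for a known $x_0$ makes `ancestral_sample` return that $x_0$. A trained network would replace the oracle.
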

employ a pretrained BERT encoder to extract word embeddings
inject the language guidance into the denoising network using cross-attention layers
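
A rough sketch of the cross-attention idea: each object feature acts as a query over the word embeddings, and the attended text feature is added back to the object feature. Assumptions here: a single head, no learned query/key/value projections, and object features and word embeddings sharing one dimension. The `cross_attention` function is hypothetical, and the word embeddings are placeholder vectors rather than real BERT outputs.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(object_feats, word_embs):
    """Objects attend to words with scaled dot-product attention."""
    d = len(word_embs[0])
    out = []
    for q in object_feats:
        # Similarity of this object feature to every word embedding.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in word_embs]
        weights = softmax(scores)
        # Weighted sum of word embeddings (the words serve as keys and values).
        attended = [sum(w * v[c] for w, v in zip(weights, word_embs)) for c in range(d)]
        # Residual connection: add the text guidance to the object feature.
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out
```

In a real denoising network the queries, keys, and values would pass through learned linear layers, and the layer would typically be multi-headed.
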