Part of Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Main Conference Track
Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan Barron, Ben Poole
Advances in 3D reconstruction have enabled high-quality 3D capture, but require a user to collect hundreds to thousands of images to create a 3D scene. We present CAT3D, a method for creating anything in 3D by simulating this real-world capture process with a multi-view diffusion model. Given any number of input images and a set of target novel viewpoints, our model generates highly consistent novel views of a scene. These generated views can be used as input to robust 3D reconstruction techniques to produce 3D representations that can be rendered from any viewpoint in real time. CAT3D can create entire 3D scenes in as little as one minute, and outperforms existing methods for single-image and few-view 3D scene creation.
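To make the two-stage pipeline described above concrete, the sketch below illustrates the data flow: a multi-view diffusion sampler produces novel views at requested camera poses, and those views are then handed to a standard 3D reconstruction step. This is a minimal illustration of the interface only; `Camera`, `sample_novel_views`, and `reconstruct_3d` are hypothetical names, and both stages are stubbed rather than reproducing the authors' actual model or code.

```python
# Hypothetical sketch of a CAT3D-style pipeline: (1) generate novel views at
# target camera poses with a multi-view diffusion model, (2) fit a 3D
# representation to the generated views. All names and internals here are
# illustrative placeholders, not the authors' implementation.
from dataclasses import dataclass
import numpy as np


@dataclass
class Camera:
    """Pinhole camera: 3x3 intrinsics and 4x4 camera-to-world pose."""
    intrinsics: np.ndarray  # (3, 3)
    pose: np.ndarray        # (4, 4)


def sample_novel_views(images: list[np.ndarray],
                       cameras: list[Camera],
                       target_cameras: list[Camera]) -> list[np.ndarray]:
    """Stand-in for the multi-view diffusion sampler: conditioned on the
    observed images and their cameras, it would jointly denoise one image per
    target viewpoint so the outputs are mutually consistent. Here we only
    return arrays of the right shape to show the interface."""
    h, w, _ = images[0].shape
    rng = np.random.default_rng(0)
    return [rng.random((h, w, 3)) for _ in target_cameras]


def reconstruct_3d(views: list[np.ndarray], cameras: list[Camera]) -> dict:
    """Stand-in for the robust reconstruction stage (e.g. a NeRF-style
    optimizer) that fits a renderable 3D representation to the views."""
    return {"num_views": len(views)}  # placeholder "scene" object


if __name__ == "__main__":
    cam = Camera(np.eye(3), np.eye(4))
    observed = [np.zeros((64, 64, 3))]        # a single input photo
    targets = [cam for _ in range(8)]         # eight requested novel viewpoints
    novel_views = sample_novel_views(observed, [cam], targets)
    scene = reconstruct_3d(observed + novel_views, [cam] + targets)
    print(scene)
```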