STaR

Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering


Wentao Yuan¹
Zhaoyang Lv²
Tanner Schmidt²
Steven Lovegrove²
¹University of Washington
²Facebook Reality Labs

Paper (arXiv)

Code (coming soon)

Overview

We present STaR, a novel method that performs Self-supervised Tracking and Reconstruction of dynamic scenes with rigid motion from multi-view RGB videos.

Without any manual annotation, our method can reconstruct a dynamic scene with a single rigid object in motion by simultaneously decomposing it into its two constituent parts, the static background and the moving object, and encoding each with its own neural representation. This is achieved by jointly optimizing the parameters of two neural radiance fields and a set of rigid poses that align the two fields at each frame.
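To make the role of the per-frame rigid pose concrete, here is a minimal NumPy sketch of how two fields might be queried together at one frame. It is an illustration under stated assumptions, not the released implementation: the pose convention (`R`, `t` map object coordinates to world coordinates), the density-weighted colour mix, and the toy fields `toy_static` and `toy_dynamic` standing in for the two networks are all hypothetical.

```python
import numpy as np

def query_composite(static_field, dynamic_field, R, t, pts):
    """Query both fields at world-space points of shape (N, 3).

    Assumed convention: x_world = R @ x_object + t for the current frame,
    so the dynamic field is queried at R^T (x_world - t).
    """
    sigma_s, rgb_s = static_field(pts)
    sigma_d, rgb_d = dynamic_field((pts - t) @ R)  # row-vector form of R^T (x - t)
    sigma = sigma_s + sigma_d
    # Density-weighted colour; the floor avoids division by zero in empty space.
    denom = np.maximum(sigma, 1e-10)[:, None]
    rgb = (sigma_s[:, None] * rgb_s + sigma_d[:, None] * rgb_d) / denom
    return sigma, rgb

def toy_static(pts):
    """Hypothetical background: a grey half-space below z = 0."""
    sigma = np.where(pts[:, 2] < 0.0, 5.0, 0.0)
    rgb = np.full((len(pts), 3), 0.5)
    return sigma, rgb

def toy_dynamic(pts):
    """Hypothetical object: a red unit ball centred at the object-frame origin."""
    sigma = np.where(np.linalg.norm(pts, axis=1) < 1.0, 10.0, 0.0)
    rgb = np.tile(np.array([1.0, 0.0, 0.0]), (len(pts), 1))
    return sigma, rgb
```

In an actual optimization, the parameters of both fields and the per-frame `R`, `t` would all receive gradients from the same photometric loss; the sketch only shows the forward query.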

On both synthetic and real-world datasets, we demonstrate that our method can render photorealistic novel views, where novelty is measured on both spatial and temporal axes. Our factored representation furthermore enables animation of unseen object motion.

4D Novel View Synthesis

The following videos show renderings of novel spatial-temporal views of two synthetic scenes (Lamp and Desk, Kitchen Table) and one real-world scene (Moving Banana). Each rendered video plays the training sequence in 20x slow motion, from a continuously varying camera viewpoint that was not seen during training.
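Rendering in slow motion requires object poses at timestamps that fall between training frames. The page does not say how these are obtained; one plausible approach, sketched here purely as an assumption, is to interpolate the per-frame poses with spherical linear interpolation (slerp) for rotation and linear interpolation for translation. The function names and the `(w, x, y, z)` quaternion layout are hypothetical.

```python
import numpy as np

def slerp(q0, q1, u):
    """Spherical linear interpolation between two quaternions (w, x, y, z)."""
    q0 = q0 / np.linalg.norm(q0)
    q1 = q1 / np.linalg.norm(q1)
    dot = float(np.dot(q0, q1))
    if dot < 0.0:  # q and -q encode the same rotation; take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:  # nearly identical: fall back to normalized lerp
        q = (1.0 - u) * q0 + u * q1
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1.0 - u) * theta) * q0 + np.sin(u * theta) * q1) / np.sin(theta)

def slow_motion_poses(quats, trans, factor=20):
    """Upsample K keyframe poses to (K - 1) * factor + 1 interpolated poses.

    quats: (K, 4) rotations as quaternions, trans: (K, 3) translations.
    """
    out_q, out_t = [], []
    for k in range(len(quats) - 1):
        for i in range(factor):
            u = i / factor
            out_q.append(slerp(quats[k], quats[k + 1], u))
            out_t.append((1.0 - u) * trans[k] + u * trans[k + 1])
    # Close the sequence with the final keyframe itself.
    out_q.append(quats[-1] / np.linalg.norm(quats[-1]))
    out_t.append(trans[-1])
    return np.array(out_q), np.array(out_t)
```

With `factor=20`, each interval between consecutive training frames is filled with 20 evenly spaced poses, which matches the slow-motion ratio quoted above.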

Animating Unseen Motion

STaR's factored representation of motion and appearance allows it to synthesize novel views of animated trajectories of the dynamic object which have not been seen during training, without any 3D ground truth or supervision.
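Because motion is stored as explicit rigid poses rather than baked into the appearance model, animating the object amounts to supplying a new pose sequence. The sketch below builds a hypothetical circular trajectory and shows how world-space sample points would be mapped into the object's frame before querying its field. The trajectory, the function names, and the object-to-world pose convention are illustrative assumptions.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def circular_trajectory(n_frames, radius=1.0):
    """Hypothetical unseen motion: the object circles the z-axis while turning.

    Returns a list of (R, t) with the assumed convention x_world = R @ x_obj + t.
    """
    poses = []
    for a in np.linspace(0.0, 2.0 * np.pi, n_frames, endpoint=False):
        t = np.array([radius * np.cos(a), radius * np.sin(a), 0.0])
        poses.append((rot_z(a), t))
    return poses

def object_to_world(R, t, pts):
    """Map object-frame points of shape (N, 3) into the world frame."""
    return pts @ R.T + t

def world_to_object(R, t, pts):
    """Invert object_to_world: x_obj = R^T (x_world - t)."""
    return (pts - t) @ R
```

For each animated frame, the ray samples would be passed through `world_to_object` with that frame's pose and then fed to the object's field, while the background field is queried at the unmodified world coordinates.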

Scene Decomposition

With the decomposition into static and dynamic components learned by STaR, either the static background or the dynamic foreground can be removed cleanly during spatial-temporal novel view rendering.
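Removing one component can be illustrated by zeroing its density before compositing along a ray. The following NumPy sketch uses standard alpha compositing over per-sample densities and a density-weighted colour mix; the function name, the `keep` switch, and the mixing rule are assumptions made for illustration and may differ from the authors' renderer.

```python
import numpy as np

def render_ray(sigma_s, rgb_s, sigma_d, rgb_d, deltas, keep="both"):
    """Composite N samples along one ray, optionally dropping one component.

    sigma_*: (N,) densities, rgb_*: (N, 3) colours, deltas: (N,) sample spacings.
    keep: "both", "static" (foreground removed) or "dynamic" (background removed).
    """
    if keep == "static":
        sigma_d = np.zeros_like(sigma_d)
    elif keep == "dynamic":
        sigma_s = np.zeros_like(sigma_s)
    elif keep != "both":
        raise ValueError(f"unknown keep mode: {keep!r}")

    sigma = sigma_s + sigma_d
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: probability that the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    denom = np.maximum(sigma, 1e-10)[:, None]
    mix = (sigma_s[:, None] * rgb_s + sigma_d[:, None] * rgb_d) / denom
    return (weights[:, None] * mix).sum(axis=0)
```

When the object is dropped with `keep="static"`, the samples it used to occupy become transparent, so the ray continues to whatever background lies behind it.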

Citation

@inproceedings{yuan2021star,
    title={STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering},
    author={Yuan, Wentao and Lv, Zhaoyang and Schmidt, Tanner and Lovegrove, Steven},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages={13144--13152},
    year={2021}
}