Richard Szeliski

University of Washington

H-index: 132

North America-United States

Description

Richard Szeliski is a distinguished researcher at the University of Washington, with an exceptional h-index of 132 overall and a recent h-index of 59 (since 2020). He specializes in Computer Vision and Computer Graphics.

Professor Information

University

University of Washington

Position

CSE

Citations(all)

106568

Citations(since 2020)

24819

Cited By

91196

hIndex(all)

132

hIndex(since 2020)

59

i10Index(all)

329

i10Index(since 2020)

208

University Profile Page

University of Washington

Research & Interests List

Computer Vision

Computer Graphics

Top articles of Richard Szeliski

Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis

While surface-based view synthesis algorithms are appealing due to their low computational requirements, they often struggle to reproduce thin structures. In contrast, more expensive methods that model the scene's geometry as a volumetric density field (e.g. NeRF) excel at reconstructing fine geometric detail. However, density fields often represent geometry in a "fuzzy" manner, which hinders exact localization of the surface. In this work, we modify density fields to encourage them to converge towards surfaces, without compromising their ability to reconstruct thin structures. First, we employ a discrete opacity grid representation instead of a continuous density field, which allows opacity values to discontinuously transition from zero to one at the surface. Second, we anti-alias by casting multiple rays per pixel, which allows occlusion boundaries and subpixel structures to be modelled without using semi-transparent voxels. Third, we minimize the binary entropy of the opacity values, which facilitates the extraction of surface geometry by encouraging opacity values to binarize towards the end of training. Lastly, we develop a fusion-based meshing strategy followed by mesh simplification and appearance model fitting. The compact meshes produced by our model can be rendered in real-time on mobile devices and achieve significantly higher view synthesis quality compared to existing mesh-based approaches.
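The binary-entropy term mentioned in the abstract can be illustrated with a small sketch; the function below is a generic NumPy version under assumed array shapes, not the authors' implementation.

```python
import numpy as np

def binary_entropy_loss(opacity, eps=1e-7):
    """Mean binary entropy of opacity values in [0, 1].

    Low when opacities sit near 0 or 1, high near 0.5, so adding this
    term to the training loss pushes the grid toward hard surfaces.
    """
    a = np.clip(opacity, eps, 1.0 - eps)
    return float(np.mean(-a * np.log(a) - (1.0 - a) * np.log(1.0 - a)))

# Example: a half-fuzzy grid has higher entropy than a nearly binary one.
fuzzy = np.array([0.4, 0.5, 0.6])
crisp = np.array([0.01, 0.99, 0.98])
print(binary_entropy_loss(fuzzy))   # ~0.68
print(binary_entropy_loss(crisp))   # ~0.07
```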

Authors

Christian Reiser,Stephan Garbin,Pratul P Srinivasan,Dor Verbin,Richard Szeliski,Ben Mildenhall,Jonathan T Barron,Peter Hedman,Andreas Geiger

Journal

arXiv preprint arXiv:2402.12377

Published Date

2024/2/19

SMERF: Streamable Memory Efficient Radiance Fields for Real-Time Large-Scene Exploration

Recent techniques for real-time view synthesis have rapidly advanced in fidelity and speed, and modern methods are capable of rendering near-photorealistic scenes at interactive frame rates. At the same time, a tension has arisen between explicit scene representations amenable to rasterization and neural fields built on ray marching, with state-of-the-art instances of the latter surpassing the former in quality while being prohibitively expensive for real-time applications. In this work, we introduce SMERF, a view synthesis approach that achieves state-of-the-art accuracy among real-time methods on large scenes with footprints up to 300 m at a volumetric resolution of 3.5 mm. Our method is built upon two primary contributions: a hierarchical model partitioning scheme, which increases model capacity while constraining compute and memory consumption, and a distillation training strategy that simultaneously yields high fidelity and internal consistency. Our approach enables full six degrees of freedom (6DOF) navigation within a web browser and renders in real-time on commodity smartphones and laptops. Extensive experiments show that our method exceeds the current state-of-the-art in real-time novel view synthesis by 0.78 dB on standard benchmarks and 1.78 dB on large scenes, renders frames three orders of magnitude faster than state-of-the-art radiance field models, and achieves real-time performance across a wide variety of commodity devices, including smartphones. We encourage the reader to explore these models in person at our project website: https://smerf-3d.github.io.
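As a rough illustration of the hierarchical partitioning idea, the sketch below buckets a camera position into one of a coarse grid of submodels; the grid size, bounds, and function names are assumptions, and the real system also handles streaming and memory limits.

```python
def select_submodel(camera_xyz, scene_min, scene_max, grid=(4, 4)):
    """Map a camera position to the index of the submodel whose region
    of the scene footprint contains it (2D partition in x/z).

    Only an illustration of coordinate-based partitioning; the actual
    method constrains compute and memory and streams submodels on demand.
    """
    x, _, z = camera_xyz
    nx, nz = grid
    # Normalize to [0, 1) within the scene bounds, then bucket.
    u = min(max((x - scene_min[0]) / (scene_max[0] - scene_min[0]), 0.0), 1.0 - 1e-9)
    v = min(max((z - scene_min[2]) / (scene_max[2] - scene_min[2]), 0.0), 1.0 - 1e-9)
    return int(u * nx), int(v * nz)

# A 300 m footprint split into a 4x4 grid of ~75 m tiles.
print(select_submodel((120.0, 1.6, 40.0), (0, 0, 0), (300, 50, 300)))  # (1, 0)
```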

Authors

Daniel Duckworth,Peter Hedman,Christian Reiser,Peter Zhizhin,Jean-François Thibert,Mario Lučić,Richard Szeliski,Jonathan T Barron

Journal

arXiv preprint arXiv:2312.07541

Published Date

2023/12/12

The lumigraph

This paper discusses a new method for capturing the complete appearance of both synthetic and real-world objects and scenes, representing this information, and then using this representation to render images of the object from new camera positions. Unlike the shape capture process traditionally used in computer vision and the rendering process traditionally used in computer graphics, our approach does not rely on geometric representations. Instead we sample and reconstruct a 4D function, which we call a Lumigraph. The Lumigraph is a subset of the complete plenoptic function that describes the flow of light at all positions in all directions. With the Lumigraph, new images of the object can be generated very quickly, independent of the geometric or illumination complexity of the scene or object. The paper discusses a complete working system including the capture of samples, the construction of the Lumigraph …
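For readers unfamiliar with 4D light fields, here is a minimal sketch of sampling one under the common two-plane (s, t, u, v) parameterization; the plane placement, table resolution, and nearest-neighbor lookup are assumptions for illustration, not the Lumigraph system's capture or reconstruction pipeline.

```python
import numpy as np

def ray_to_st_uv(origin, direction, z_st=0.0, z_uv=1.0):
    """Intersect a ray with two parallel planes z = z_st and z = z_uv,
    returning the 4D (s, t, u, v) coordinate of that ray."""
    o, d = np.asarray(origin, float), np.asarray(direction, float)
    s, t = o[:2] + d[:2] * (z_st - o[2]) / d[2]
    u, v = o[:2] + d[:2] * (z_uv - o[2]) / d[2]
    return s, t, u, v

def sample_lumigraph(L, s, t, u, v):
    """Nearest-neighbor lookup into a discretized 4D radiance table
    L[s_idx, t_idx, u_idx, v_idx, rgb] sampled on [0, 1]^4."""
    idx = [int(round(np.clip(c, 0, 1) * (n - 1)))
           for c, n in zip((s, t, u, v), L.shape[:4])]
    return L[tuple(idx)]

# Toy 8^4 radiance table and one ray passing through both planes.
L = np.random.rand(8, 8, 8, 8, 3)
print(sample_lumigraph(L, *ray_to_st_uv((0.5, 0.5, -1.0), (0.1, 0.0, 1.0))))
```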

Authors

Steven J Gortler,Radek Grzeszczuk,Richard Szeliski,Michael F Cohen

Published Date

2023/8/1

Outputting warped images from captured video data

Each image in a sequence of images includes three-dimensional locations of object features depicted in the image and a first camera position of the camera when the image is captured. A gap is detected between the first camera positions associated with a first continuous subset and those associated with a second continuous subset, and the first camera positions associated with the second continuous subset are adjusted to close the gap. A view path for a virtual camera is determined based on the first camera positions and the adjusted first camera positions. Second camera positions are determined for the virtual camera; for each of the second camera positions, one of the images in the sequence is selected and warped using its first camera position, the second camera position, and the three-dimensional locations of object features depicted in the selected image. A sequence …
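A toy sketch of the gap-closing step described above, assuming camera positions are 3D points and the second subset is simply translated; the threshold and the rigid shift are illustrative choices, not the patented method.

```python
import numpy as np

def close_gap(positions_a, positions_b, gap_threshold=0.5):
    """If the distance between the end of subset A and the start of
    subset B exceeds a threshold, translate subset B so its first
    position coincides with A's last position.

    A toy version of "detect a gap, adjust the second subset"; the
    threshold and rigid translation are illustrative only.
    """
    a, b = np.asarray(positions_a, float), np.asarray(positions_b, float)
    gap = b[0] - a[-1]
    if np.linalg.norm(gap) > gap_threshold:
        b = b - gap  # shift the whole second subset to close the gap
    return a, b

a = [[0, 0, 0], [1, 0, 0], [2, 0, 0]]
b = [[5, 0, 0], [6, 0, 0]]          # 3-unit jump from the end of a
_, b_adj = close_gap(a, b)
print(b_adj)                        # [[2. 0. 0.] [3. 0. 0.]]
```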

Published Date

2023/8/10

Bakedsdf: Meshing neural sdfs for real-time view synthesis

We present a method for reconstructing high-quality meshes of large unbounded real-world scenes suitable for photorealistic novel view synthesis. We first optimize a hybrid neural volume-surface scene representation designed to have well-behaved level sets that correspond to surfaces in the scene. We then bake this representation into a high-quality triangle mesh, which we equip with a simple and fast view-dependent appearance model based on spherical Gaussians. Finally, we optimize this baked representation to best reproduce the captured viewpoints, resulting in a model that can leverage accelerated polygon rasterization pipelines for real-time view synthesis on commodity hardware. Our approach outperforms previous scene representations for real-time rendering in terms of accuracy, speed, and power consumption, and produces high quality meshes that enable applications such as appearance …
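The view-dependent appearance model based on spherical Gaussians can be sketched as follows; the lobe count, parameterization, and clamping are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def sg_color(view_dir, diffuse, lobe_axes, lobe_sharpness, lobe_colors):
    """Evaluate a small set of spherical Gaussian lobes for one surface
    point: color = diffuse + sum_i c_i * exp(lambda_i * (dot(d, mu_i) - 1)).

    A minimal sketch of a spherical-Gaussian appearance term; the exact
    storage and fitting in the paper differ from this toy version.
    """
    d = np.asarray(view_dir, float)
    d = d / np.linalg.norm(d)
    dots = lobe_axes @ d                          # (num_lobes,)
    weights = np.exp(lobe_sharpness * (dots - 1.0))
    return np.clip(diffuse + weights @ lobe_colors, 0.0, 1.0)

# One point with three lobes; sharper lobes give tighter highlights.
axes = np.array([[0, 0, 1], [0.7, 0, 0.7], [-0.7, 0, 0.7]], float)
axes /= np.linalg.norm(axes, axis=1, keepdims=True)
print(sg_color([0, 0, 1], np.array([0.2, 0.2, 0.2]),
               axes, np.array([30.0, 30.0, 30.0]),
               np.array([[0.5, 0.4, 0.3]] * 3)))
```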

Authors

Lior Yariv,Peter Hedman,Christian Reiser,Dor Verbin,Pratul P Srinivasan,Richard Szeliski,Jonathan T Barron,Ben Mildenhall

Published Date

2023/7/23

Merf: Memory-efficient radiance fields for real-time view synthesis in unbounded scenes

Neural radiance fields enable state-of-the-art photorealistic view synthesis. However, existing radiance field representations are either too compute-intensive for real-time rendering or require too much memory to scale to large scenes. We present a Memory-Efficient Radiance Field (MERF) representation that achieves real-time rendering of large-scale scenes in a browser. MERF reduces the memory consumption of prior sparse volumetric radiance fields using a combination of a sparse feature grid and high-resolution 2D feature planes. To support large-scale unbounded scenes, we introduce a novel contraction function that maps scene coordinates into a bounded volume while still allowing for efficient ray-box intersection. We design a lossless procedure for baking the parameterization used during training into a model that achieves real-time rendering while still preserving the photorealistic view synthesis …
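The contraction idea, mapping unbounded scene coordinates into a bounded volume, can be illustrated with the radial mip-NeRF 360-style contraction below; MERF's own contraction is a related per-coordinate variant designed to keep ray-box intersection efficient, so this is only a sketch.

```python
import numpy as np

def contract(x):
    """Map an unbounded point x in R^3 into a ball of radius 2:
    points with ||x|| <= 1 are unchanged, points farther away are
    squashed toward the radius-2 boundary.

    MERF uses a related per-coordinate contraction so that contracted
    rays remain easy to intersect with voxel grids; this radial version
    is only for illustration.
    """
    x = np.asarray(x, float)
    n = np.linalg.norm(x)
    return x if n <= 1.0 else (2.0 - 1.0 / n) * (x / n)

print(contract([0.3, 0.0, 0.0]))    # unchanged: inside the unit ball
print(contract([100.0, 0.0, 0.0]))  # ~[1.99, 0, 0]: squashed near radius 2
```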

Authors

Christian Reiser,Rick Szeliski,Dor Verbin,Pratul Srinivasan,Ben Mildenhall,Andreas Geiger,Jon Barron,Peter Hedman

Journal

ACM Transactions on Graphics (TOG)

Published Date

2023/8/1

Accidental light probes

Recovering lighting in a scene from a single image is a fundamental problem in computer vision. While a mirror ball light probe can capture omnidirectional lighting, light probes are generally unavailable in everyday images. In this work, we study recovering lighting from accidental light probes (ALPs)---common, shiny objects like Coke cans, which often accidentally appear in daily scenes. We propose a physically-based approach to model ALPs and estimate lighting from their appearances in single images. The main idea is to model the appearance of ALPs by photogrammetrically principled shading and to invert this process via differentiable rendering to recover incidental illumination. We demonstrate that we can put an ALP into a scene to allow high-fidelity lighting estimation. Our model can also recover lighting for existing images that happen to contain an ALP.
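As a worked illustration of why a shiny sphere acts as a light probe, the sketch below maps a pixel on a mirror ball to the world direction it reflects toward an orthographic camera; the paper's actual pipeline inverts a far richer shading model via differentiable rendering.

```python
import numpy as np

def reflect(view_dir, normal):
    """Mirror reflection of a view direction about a surface normal."""
    v = np.asarray(view_dir, float)
    n = np.asarray(normal, float)
    return v - 2.0 * np.dot(v, n) * n

def probe_direction(x, y):
    """For a pixel (x, y) on the image of a unit mirror ball seen under
    orthographic projection along -z, return the world direction whose
    incoming light that pixel reflects toward the camera.

    A toy forward model of a light probe, not the paper's method.
    """
    z = np.sqrt(max(0.0, 1.0 - x * x - y * y))   # sphere surface height
    normal = np.array([x, y, z])
    return reflect(np.array([0.0, 0.0, -1.0]), normal)

print(probe_direction(0.0, 0.0))  # [0, 0, 1]: center pixel sees light from behind the camera
print(probe_direction(0.7, 0.0))  # grazing pixel reflects light from the side
```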

Authors

Hong-Xing Yu,Samir Agarwala,Charles Herrmann,Richard Szeliski,Noah Snavely,Jiajun Wu,Deqing Sun

Published Date

2023

Video textures

This report looks into a new type of medium called a video texture, which was introduced in a SIGGRAPH 2000 paper by Arno Schödl, Richard Szeliski, David H. Salesin, and Irfan Essa. A video texture is described by the authors of the original paper as having qualities somewhere between a photograph and a video, providing a continual, infinitely varying sequence of images. While individual frames of a video texture may be reused now and again, the sequence in its entirety is never repeated exactly.
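The core mechanism behind video textures, jumping between visually similar frames, can be sketched with a frame-distance matrix; the frame shapes, the distance metric, and the threshold below are assumptions, and the original paper additionally preserves temporal dynamics when choosing transitions.

```python
import numpy as np

def transition_candidates(frames, max_distance):
    """Find pairs of frames similar enough to jump between without a
    visible seam.

    frames: array of shape (num_frames, H, W[, C]).
    Returns (i, j) pairs, i != j, whose root-mean-square per-pixel
    difference is below max_distance. A sketch of the similarity
    analysis; the original work also filters the distance matrix to
    preserve dynamics.
    """
    f = np.asarray(frames, float).reshape(len(frames), -1)
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2ab.
    sq = (f ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * f @ f.T, 0.0)
    dist = np.sqrt(d2) / f.shape[1] ** 0.5
    return [(i, j) for i in range(len(f)) for j in range(len(f))
            if i != j and dist[i, j] < max_distance]

# Tiny synthetic example: frames 0 and 3 are near-duplicates.
rng = np.random.default_rng(0)
clip = rng.random((4, 8, 8))
clip[3] = clip[0] + 0.01
print(transition_candidates(clip, max_distance=0.05))  # [(0, 3), (3, 0)]
```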

Authors

Alexander JV White,Aphrodite Galata

Published Date

2007/5/2

Professor FAQs

What is Richard Szeliski's h-index at University of Washington?

Richard Szeliski's h-index is 132 overall and 59 since 2020.

What are Richard Szeliski's research interests?

Richard Szeliski's research interests are Computer Vision and Computer Graphics.

What is Richard Szeliski's total number of citations?

Richard Szeliski has 106,568 citations in total.

What are the co-authors of Richard Szeliski?

The co-authors of Richard Szeliski are Steve Seitz, Maneesh Agrawala, Brian Curless, Noah Snavely, Steven J. Gortler, and Daniel Scharstein.

Co-Authors

Steve Seitz, University of Washington (H-index: 81)

Maneesh Agrawala, Stanford University (H-index: 81)

Brian Curless, University of Washington (H-index: 66)

Noah Snavely, Cornell University (H-index: 66)

Steven J. Gortler, Harvard University (H-index: 44)

Daniel Scharstein, Middlebury College (H-index: 36)
