Research:
I'm interested in computational photography and inverse problems that span the whole imaging
pipeline, from signal collection to scene reconstruction. Whether the device is an MRI scanner, a
modulated light source, or a mobile phone, I love working with real devices and real data.
Over the course of my research I've written a number of open-source data collection apps (a sketch
of how one might load their outputs follows the list):
- Pani (Android, camera2): An all-in-one camera app for continuous recording of Bayer RAWs,
accelerometer values, gyroscope measurements, and a metric ton of device metadata from multiple
camera configurations (main, ultrawide, telephoto). I am actively using this app in my current
work, and so plan to continue expanding its features over time.
- SoaP-App (iOS, AVFoundation): A "long-burst" capture app for recording sequences of up to 42
frames of Bayer RAWs, depth maps, accelerometer values, gyroscope measurements, and metadata.
- HNDR-App (iOS, ARKit): A "long-burst" capture app for recording sequences of up to 120 frames of
processed RGB images, depth maps, and pose estimates (from ARKit world tracking).
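These apps write their captures to disk for offline processing. As a rough illustration of what
working with this kind of data looks like, here is a minimal Python sketch for loading a long-burst
capture; the folder layout, filenames, resolution, and CSV columns are hypothetical placeholders,
so check each app's repository for the formats it actually writes.

```python
# Minimal sketch: loading a long-burst capture folder. The file layout
# (frame_*.raw at a fixed resolution, plus a gyro.csv) is HYPOTHETICAL;
# see each app's README for the real formats.
from pathlib import Path
import numpy as np

def load_capture(folder, height=3024, width=4032):
    """Load 16-bit Bayer RAW frames and gyroscope samples from a capture folder."""
    folder = Path(folder)
    frames = [
        np.fromfile(f, dtype=np.uint16).reshape(height, width)
        for f in sorted(folder.glob("frame_*.raw"))
    ]
    # Assumed CSV columns: timestamp_ns, wx, wy, wz (rad/s).
    gyro = np.loadtxt(folder / "gyro.csv", delimiter=",", skiprows=1)
    return np.stack(frames), gyro

frames, gyro = load_capture("capture_001")
print(frames.shape, gyro.shape)  # e.g. (42, 3024, 4032), (N, 4)
```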
"If you try and take a cat apart to see how it works, the first thing you have on your hands is a non-working cat."
- Douglas Adams
Publications:
Neural Light Spheres for Implicit Image Stitching and View Synthesis
Ilya Chugunov,
Amogh Joshi,
Kiran Murthy,
Francois Bleibel,
Felix Heide
SIGGRAPH Asia, 2024
We design a spherical neural light field model for implicit panoramic image stitching and re-rendering, capable of handling depth parallax, view-dependent lighting, and scene motion. Our compact model decomposes the scene into view-dependent ray offset and color components and, as it requires no volume sampling, achieves real-time 1080p rendering.
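To make the no-volume-sampling point concrete, here is a toy PyTorch sketch of the idea as
described above: each camera ray is intersected with a sphere, and small networks predict a
view-dependent offset to that intersection plus a color, so rendering costs one network evaluation
per ray rather than many samples along it. The network sizes and exact parameterization are
placeholders, not the paper's architecture.

```python
# Toy sketch of a spherical neural light field (not the paper's architecture).
# One MLP evaluation per ray -- no volume sampling.
import torch
import torch.nn as nn

def mlp(d_in, d_out, width=64):
    return nn.Sequential(nn.Linear(d_in, width), nn.ReLU(),
                         nn.Linear(width, width), nn.ReLU(),
                         nn.Linear(width, d_out))

class NeuralLightSphere(nn.Module):
    def __init__(self):
        super().__init__()
        self.offset = mlp(6, 3)  # view-dependent ray offset component
        self.color = mlp(6, 3)   # view-dependent color component

    def forward(self, origins, dirs, radius=1.0):
        # Intersect each ray with a sphere of the given radius (camera inside).
        b = (origins * dirs).sum(-1, keepdim=True)
        c = (origins * origins).sum(-1, keepdim=True) - radius**2
        t = -b + torch.sqrt(b * b - c)
        hit = origins + t * dirs
        x = torch.cat([hit, dirs], dim=-1)
        # Offset the sphere intersection to model parallax, then shade.
        shifted = hit + self.offset(x)
        return torch.sigmoid(self.color(torch.cat([shifted, dirs], dim=-1)))

model = NeuralLightSphere()
o = torch.zeros(8, 3)
d = torch.nn.functional.normalize(torch.randn(8, 3), dim=-1)
rgb = model(o, d)  # (8, 3), one forward pass per ray
```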
Split-Aperture 2-in-1 Computational Cameras
Zheng Shi*,
Ilya Chugunov*,
Mario Bijelic,
Geoffroi Côté,
Jiwoon Yeom,
Qiang Fu,
Hadi Amata,
Wolfgang Heidrich,
Felix Heide
SIGGRAPH, 2024
Split-aperture 2-in-1 computational cameras encode half the aperture with a diffractive optical
element to simultaneously capture optically coded and conventional images in a single device. Using
a dual-pixel sensor, our camera separates the wavefronts, retaining high-frequency content and
enabling single-shot high-dynamic-range, hyperspectral, and depth imaging.
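As a crude illustration of the dual-pixel idea, the numpy sketch below simulates the two sub-images
under a simple convolutional image-formation assumption: each dual-pixel half integrates light from
one half of the aperture, so one sub-image sees a DOE-coded point spread function and the other a
conventional one. The PSFs here are stand-ins; the paper uses a learned DOE and a full wave-optics
forward model.

```python
# Crude simulation of a split-aperture dual-pixel capture (convolutional
# approximation; the paper models wave optics and a learned DOE).
import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(size=15, sigma=2.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return psf / psf.sum()

rng = np.random.default_rng(0)
scene = rng.uniform(size=(256, 256))

psf_conventional = gaussian_psf()                         # open aperture half
psf_coded = gaussian_psf() * rng.uniform(size=(15, 15))   # stand-in coded PSF
psf_coded /= psf_coded.sum()

# Each dual-pixel half integrates light from one half of the aperture,
# so the two sub-images see different PSFs of the same scene.
left = fftconvolve(scene, psf_conventional, mode="same")
right = fftconvolve(scene, psf_coded, mode="same")
```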
Neural Spline Fields for Burst Image Fusion and Layer Separation
Ilya Chugunov,
David Shustin,
Ruyu Yan,
Chenyang Lei,
Felix Heide
CVPR, 2024
We propose neural spline fields, coordinate networks trained to map input 2D points to vectors of
spline control points, as a versatile representation of pixel motion during burst photography. This
flow model can fuse images during test-time optimization with only a photometric loss, no
regularization required. By layering these representations, we can separate effects such as
occlusions, reflections, shadows, and more.
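The core mechanism is compact enough to sketch. Below, a small PyTorch MLP (sizes are illustrative,
and I use a piecewise-linear spline where the paper may use a higher-order one) maps each pixel
coordinate to control points, and evaluating the spline at a frame's timestamp yields that pixel's
flow:

```python
# Sketch of a neural spline field: an MLP maps a pixel coordinate to K control
# points, and a spline evaluated at time t gives that pixel's 2D flow.
import torch
import torch.nn as nn

class NeuralSplineField(nn.Module):
    def __init__(self, n_ctrl=8, width=128):
        super().__init__()
        self.n_ctrl = n_ctrl
        self.net = nn.Sequential(
            nn.Linear(2, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, 2 * n_ctrl),  # K control points, each a 2D offset
        )

    def forward(self, xy, t):
        """xy: (N, 2) coords in [0,1]^2; t: (N,) times in [0,1] -> (N, 2) flow."""
        ctrl = self.net(xy).view(-1, self.n_ctrl, 2)
        # Piecewise-linear spline evaluation at time t.
        u = t * (self.n_ctrl - 1)
        i0 = u.floor().long().clamp(max=self.n_ctrl - 2)
        w = (u - i0.float()).unsqueeze(-1)
        idx = torch.arange(len(xy))
        return (1 - w) * ctrl[idx, i0] + w * ctrl[idx, i0 + 1]

nsf = NeuralSplineField()
xy = torch.rand(1024, 2)
flow = nsf(xy, torch.full((1024,), 0.5))  # flow for every queried pixel at t=0.5
```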
Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography
Ilya Chugunov,
Yuxuan Zhang,
Felix Heide
CVPR, 2023
In a “long-burst” (forty-two 12-megapixel RAW frames captured over a two-second sequence), there is
enough parallax information from natural hand tremor alone to recover high-quality scene depth. We
fit a neural RGB-D model directly to this long-burst data to recover depth and camera motion with no
LiDAR, no external pose estimates, and no disjoint preprocessing steps.
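A stripped-down sketch of what "fitting a model directly to the long-burst" can look like: a
coordinate network predicts depth, per-frame pose offsets are free parameters, and both are
optimized against a purely photometric objective. The pinhole warp, small-angle rotation, network
sizes, and random stand-in frames below are simplifications for illustration, not the paper's
model.

```python
# Sketch of fitting a neural RGB-D model to a long-burst with photometric loss
# alone (toy pinhole warp and small-angle pose; not the paper's architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

depth_net = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                          nn.Linear(128, 128), nn.ReLU(),
                          nn.Linear(128, 1), nn.Softplus())
poses = nn.Parameter(torch.zeros(42, 6))  # per-frame rotation + translation

def warp(xy, depth, pose, f=1.0):
    # Back-project, apply a small rigid transform, re-project (pinhole model).
    X = torch.cat([xy * depth / f, depth], dim=-1)
    w, t = pose[:3], pose[3:]
    X = X + torch.cross(w.expand_as(X), X, dim=-1) + t  # first-order rotation
    return f * X[:, :2] / X[:, 2:3]

opt = torch.optim.Adam([*depth_net.parameters(), poses], lr=1e-3)
frames = torch.rand(42, 3, 512, 512)  # stand-in for the RAW long-burst
for _ in range(1000):
    xy = torch.rand(4096, 2) * 2 - 1  # sample pixels in [-1,1]^2
    ref = F.grid_sample(frames[:1], xy.view(1, 1, -1, 2), align_corners=True)
    loss = 0.0
    for i in range(1, 42):
        xy_i = warp(xy, depth_net(xy), poses[i])
        tgt = F.grid_sample(frames[i:i+1], xy_i.view(1, 1, -1, 2),
                            align_corners=True)
        loss = loss + (ref - tgt).abs().mean()  # photometric-only objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```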
GenSDF: Two-Stage Learning of Generalizable Signed Distance Functions
Gene Chou,
Ilya Chugunov,
Felix Heide
NeurIPS, 2022 (Featured)
Signed distance fields (SDFs) can be a compact and convenient way of representing 3D objects, but
state-of-the-art learned methods for SDF estimation struggle to fit more than a few shapes at a
time. This work presents a two-stage semi-supervised meta-learning approach that learns generic
shape priors to reconstruct over a hundred unseen object classes.
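For context, here is a minimal sketch of the kind of SDF network and self-supervision that make
semi-supervised training on unlabeled point clouds possible (not GenSDF's two-stage pipeline
itself): the network should vanish on observed surface points and have unit-norm gradients
elsewhere.

```python
# Minimal SDF coordinate network with self-supervised losses: zero distance on
# surface samples, plus the eikonal constraint |∇f| = 1 in free space.
import torch
import torch.nn as nn

sdf = nn.Sequential(nn.Linear(3, 256), nn.ReLU(),
                    nn.Linear(256, 256), nn.ReLU(),
                    nn.Linear(256, 1))

surface = torch.randn(2048, 3)        # points sampled on a scanned surface
space = torch.rand(2048, 3) * 2 - 1   # random points in the volume
space.requires_grad_(True)

loss_surf = sdf(surface).abs().mean()  # SDF should vanish on the surface
d = sdf(space)
(grad,) = torch.autograd.grad(d.sum(), space, create_graph=True)
loss_eik = ((grad.norm(dim=-1) - 1) ** 2).mean()  # unit-gradient constraint
loss = loss_surf + 0.1 * loss_eik
```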
Centimeter-Wave Free-Space Time-of-Flight Imaging
Seung-Hwan Baek,
Noah Walsh,
Ilya Chugunov,
Zheng Shi,
Felix Heide
SIGGRAPH, 2022
Modern AMCW time-of-flight (ToF) cameras are limited to modulation frequencies of several hundred
MHz by the absorption properties of silicon. In this work we leverage electro-optic modulators to
build the first free-space GHz ToF imager. To resolve the resulting high-frequency phase
ambiguities, we also introduce a segmentation-inspired neural phase unwrapping network.
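The phase ambiguity is simple to quantify: an AMCW camera measures depth as d = c·φ/(4πf), which
wraps every c/(2f). The frequencies in the snippet below are illustrative round numbers, not the
exact ones used in the paper.

```python
# Why GHz modulation needs phase unwrapping: depth from an AMCW ToF phase is
# d = c * phi / (4 * pi * f), unambiguous only up to c / (2 * f).
c = 3e8  # speed of light, m/s

for f in [200e6, 10e9]:  # a typical silicon ToF frequency vs. a GHz modulator
    print(f"f = {f/1e9:g} GHz -> unambiguous range = {100 * c / (2 * f):.1f} cm")
# f = 0.2 GHz -> unambiguous range = 75.0 cm
# f = 10 GHz -> unambiguous range = 1.5 cm
```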
The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement
Ilya Chugunov,
Yuxuan Zhang,
Zhihao Xia,
Xuaner (Cecilia) Zhang,
Jiawen Chen,
Felix Heide
CVPR, 2022 (Oral)
Modern smartphones can stream multi-megapixel RGB images, high-quality 3D pose information, and
low-resolution depth estimates at 60Hz. In tandem, the natural shake of a phone photographer's hand
provides us with dense micro-baseline parallax depth cues during viewfinding. This work explores how
we can combine these data streams to get a high-fidelity depth map from a single snapshot.
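A toy sketch of one way these streams can be combined (a simplification for illustration; see the
paper for the actual model): a coordinate network predicts full-resolution depth, anchored to the
phone's coarse depth stream, with the posed RGB frames supplying a reprojection term that sharpens
it.

```python
# Sketch of multi-frame depth refinement with known poses: a coordinate MLP
# predicts full-resolution depth, anchored to the phone's low-res depth and
# sharpened by reprojection consistency across posed viewfinder frames.
import torch
import torch.nn as nn
import torch.nn.functional as F

depth_mlp = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                          nn.Linear(128, 1), nn.Softplus())

xy = torch.rand(4096, 2) * 2 - 1            # query pixels in [-1,1]^2
d = depth_mlp(xy)                           # predicted high-res depth

lowres = torch.rand(1, 1, 24, 32)           # phone's coarse depth stream
d_coarse = F.grid_sample(lowres, xy.view(1, 1, -1, 2),
                         align_corners=True).view(-1, 1)
loss_anchor = (d - d_coarse).abs().mean()   # stay close to the coarse depth
# ...plus a photometric term warping posed RGB frames with d (omitted here).
```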
Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging
Ilya Chugunov,
Seung-Hwan Baek,
Qiang Fu,
Wolfgang Heidrich,
Felix Heide
CVPR, 2021
Flying pixels are pervasive depth artifacts in time-of-flight imaging, formed by light paths from
both an object and its background connecting to the same sensor pixel. Mask-ToF jointly learns a
microlens-level occlusion mask and refinement network to respectively encode and decode geometric
information in device measurements, helping reduce these artifacts while remaining light efficient.
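The encode/decode structure can be sketched as a joint optimization: a learnable mask (kept
differentiable through a sigmoid) modulates simulated measurements, and a refinement network
decodes them, with a bonus term rewarding light efficiency. The forward model and network below are
toy stand-ins for the paper's ToF simulator and architecture.

```python
# Sketch of the Mask-ToF encode/decode idea: a learnable per-microlens mask
# modulates simulated ToF measurements, a refinement network decodes them,
# and both are optimized jointly. Simulator and network are toy stand-ins.
import torch
import torch.nn as nn

mask_logits = nn.Parameter(torch.zeros(1, 1, 64, 64))   # one value per microlens
refine = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(32, 1, 3, padding=1))
opt = torch.optim.Adam([mask_logits, *refine.parameters()], lr=1e-3)

def simulate(depth, mask):
    # Placeholder forward model: masked measurement with flying-pixel-style blur.
    blurred = nn.functional.avg_pool2d(depth, 3, stride=1, padding=1)
    return mask * blurred

depth_gt = torch.rand(8, 1, 64, 64)
for _ in range(100):
    mask = torch.sigmoid(mask_logits)  # soft mask in (0,1), kept differentiable
    meas = simulate(depth_gt, mask)
    loss = (refine(meas) - depth_gt).abs().mean() \
         - 0.01 * mask.mean()          # reward light efficiency (open mask)
    opt.zero_grad()
    loss.backward()
    opt.step()
```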
Self-Contained Jupyter Notebook Labs Promote Scalable Signal Processing Education
Dominic Carrano,
Ilya Chugunov,
Jonathan Lee,
Babak Ayazifar
6th International Conference on Higher Education Advances (HEAd), 2020
Jupyter Notebook labs can offer an experience similar to in-person lab sections while remaining
self-contained, with the relevant resources embedded directly in their cells. They interactively
demonstrate real-life applications of signal processing while reducing overhead for course staff.
Multiscale Low-Rank Matrix Decomposition for Reconstruction of Accelerated Cardiac CEST MRI
Ilya Chugunov,
Wissam AlGhuraibawi,
Kevin Godines,
Bonnie Lam,
Frank Ong,
Jonathan Tamir,
Moriel Vandsburger
28th Annual Meeting of the International Society for Magnetic Resonance in Medicine (ISMRM), 2020
Leveraging sparsity in the Z-spectrum domain, multiscale low-rank reconstruction of cardiac
chemical exchange saturation transfer (CEST) MRI enables 4-fold acceleration of scans while
preserving accurate Lorentzian line-fit analysis.
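To give a flavor of the reconstruction machinery, here is a single-scale numpy sketch of low-rank
recovery from undersampled k-space via iterative singular value thresholding; the paper's method
decomposes the data into multiple scales of low-rank blocks, and the sizes, sampling rate, and
threshold below are arbitrary illustrations.

```python
# Single-scale sketch of low-rank MRI reconstruction via singular value
# thresholding (the paper uses a multiscale low-rank decomposition).
import numpy as np

rng = np.random.default_rng(0)
x_true = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 32))  # low-rank
mask = rng.uniform(size=x_true.shape) < 0.4          # undersampling pattern
y = mask * np.fft.fft2(x_true)                       # measured k-space

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt   # soft-threshold spectrum

x = np.zeros_like(x_true, dtype=complex)
for _ in range(100):
    grad = np.fft.ifft2(mask * (np.fft.fft2(x) - y))  # data-consistency step
    x = svt(x - grad, tau=0.5)
print(np.linalg.norm(x.real - x_true) / np.linalg.norm(x_true))
```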
Duodepth: Static Gesture Recognition Via Dual Depth Sensors
Ilya Chugunov,
Avideh Zakhor
IEEE International Conference on Image Processing (ICIP), 2019
Implicitly registering point clouds from two structured light sensors with a 3D spatial transform
network, trained end-to-end for gesture recognition, can outperform explicit iterative closest
point (ICP) registration of the same clouds.
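The "implicit" half of that comparison can be sketched quickly: a small permutation-invariant
network predicts a transform (affine here for simplicity) that aligns the second sensor's cloud
before classification, so alignment is learned from the recognition loss rather than computed by
ICP. The architecture below is a generic stand-in, not the paper's network.

```python
# Sketch of implicit registration: a small network predicts a transform
# (affine here for simplicity) aligning the second sensor's point cloud,
# trained end-to-end with the recognition loss instead of an explicit ICP step.
import torch
import torch.nn as nn

class SpatialTransform3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 64))
        self.head = nn.Linear(64, 12)  # 3x3 linear part + 3D translation

    def forward(self, pts):                     # pts: (B, N, 3)
        feat = self.net(pts).max(dim=1).values  # permutation-invariant pooling
        theta = self.head(feat)
        A = theta[:, :9].view(-1, 3, 3) + torch.eye(3)  # init near identity
        t = theta[:, 9:].unsqueeze(1)
        return pts @ A.transpose(1, 2) + t

stn = SpatialTransform3D()
cloud_b = torch.randn(4, 1024, 3)  # second sensor's point cloud
aligned = stn(cloud_b)             # fed with the first cloud into the classifier
```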