NeRF Renderer

A real-time NeRF render of a hot dog in OpenGL. Note that the frame rate may appear lower than it actually is due to the GIF format.

In this project, I implemented a NeRF-based volumetric rendering pipeline using PyTorch, OpenGL, and GLSL, training a neural network to reconstruct a 3D scene from a dataset of 2D images. The final result? A real-time NeRF renderer capable of generating smooth, realistic views from arbitrary camera angles.

Instead of storing a scene as a 3D mesh or point cloud, NeRF learns a function that maps spatial coordinates (x, y, z) and viewing direction to color (RGB) and density (sigma). This function is represented as a multi-layer perceptron (MLP), which I trained in PyTorch.
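To make this concrete, here is a minimal sketch of such an MLP in PyTorch. The layer widths, depth, and the omission of positional encoding are simplifications for illustration; the network I actually trained may differ.

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """Minimal NeRF-style MLP: (x, y, z) + view direction -> (RGB, sigma).
    Widths/depth are illustrative; a full NeRF also applies positional encoding."""
    def __init__(self, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)          # density depends on position only
        self.rgb_head = nn.Sequential(                  # color also depends on view direction
            nn.Linear(hidden + 3, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, viewdir):
        feat = self.trunk(xyz)
        sigma = torch.relu(self.sigma_head(feat))       # density must be non-negative
        rgb = self.rgb_head(torch.cat([feat, viewdir], dim=-1))
        return rgb, sigma
```

Splitting the heads this way follows the original NeRF design: density is a property of the point itself, while color can vary with viewing direction to capture view-dependent effects like specular highlights.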

Once the MLP is trained, we render images by simulating light traveling through the scene. This involves the following steps (a minimal code sketch follows the list):

1. Generating camera rays for each pixel from the camera intrinsics (focal length) and extrinsics (camera pose).

2. Sampling points along each ray and querying the MLP for their color and density.

3. Applying the volume rendering equation to accumulate color along the ray, producing a final image.
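The sketch below strings these three steps together, assuming the `TinyNeRF` model above and a simple pinhole camera. The function names, near/far bounds, and sample count are illustrative, not the exact values used in this project.

```python
import torch

def get_rays(H, W, focal, c2w):
    """Step 1: one ray per pixel from the focal length and a 4x4 camera-to-world pose."""
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    dirs = torch.stack([(i - W * 0.5) / focal,
                        -(j - H * 0.5) / focal,
                        -torch.ones_like(i)], dim=-1)           # camera-space directions
    rays_d = (dirs[..., None, :] * c2w[:3, :3]).sum(-1)         # rotate into world space
    rays_o = c2w[:3, 3].expand(rays_d.shape)                    # all rays start at the camera origin
    return rays_o, rays_d

def render_rays(model, rays_o, rays_d, near=2.0, far=6.0, n_samples=64):
    """Steps 2-3: sample along each ray, query the MLP, and composite the colors."""
    t = torch.linspace(near, far, n_samples)                    # Step 2: sample depths
    pts = rays_o[..., None, :] + rays_d[..., None, :] * t[..., :, None]
    viewdirs = torch.nn.functional.normalize(rays_d, dim=-1)
    viewdirs = viewdirs[..., None, :].expand(pts.shape)
    rgb, sigma = model(pts, viewdirs)                           # query color and density
    # Step 3: C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i,
    # where T_i is the transmittance accumulated from earlier samples.
    delta = torch.cat([t[1:] - t[:-1], torch.tensor([1e10])])   # spacing between samples
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[..., :1]),
                                     1.0 - alpha + 1e-10], dim=-1), dim=-1)[..., :-1]
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=-2)               # final pixel colors
```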

Comparison between the ground-truth hot dog and my NeRF prediction. Peak signal-to-noise ratio (PSNR) is visualized on the right, showing a decent PSNR for a relatively small model.
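For reference, PSNR compares the rendered image against the ground-truth photo. A minimal version, assuming both images are normalized to [0, 1], looks like this:

```python
import torch

def psnr(pred, target):
    """Peak signal-to-noise ratio in dB, assuming pixel values in [0, 1]."""
    mse = torch.mean((pred - target) ** 2)
    return -10.0 * torch.log10(mse)   # equivalent to 10 * log10(1 / MSE)
```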

NeRFs are typically slow because they require evaluating a deep network at many sample points along every pixel's ray. To make it real-time, I converted the trained MLP into GLSL by hardcoding the weights as constants and expressing each layer as matrix multiplications. I then implemented ray marching in an OpenGL fragment shader for interactive visualization.
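One way to do this conversion is to generate GLSL source from the trained weights with a small Python script. The sketch below is a hypothetical exporter, not the exact one used here: the function name, array layout, and fused ReLU are assumptions for illustration.

```python
import numpy as np

def layer_to_glsl(name, weight, bias):
    """Emit GLSL constants and a matrix-multiply routine for one linear layer.
    weight: (out_dim, in_dim) numpy array; format and names are illustrative."""
    out_dim, in_dim = weight.shape
    w_flat = ", ".join(f"{v:.6f}" for v in weight.flatten())
    b_flat = ", ".join(f"{v:.6f}" for v in bias)
    return f"""
const float {name}_w[{out_dim * in_dim}] = float[]({w_flat});
const float {name}_b[{out_dim}] = float[]({b_flat});

void {name}(in float x[{in_dim}], out float y[{out_dim}]) {{
    for (int o = 0; o < {out_dim}; ++o) {{
        float acc = {name}_b[o];
        for (int i = 0; i < {in_dim}; ++i) acc += {name}_w[o * {in_dim} + i] * x[i];
        y[o] = max(acc, 0.0);   // fused ReLU; the output layers would use their own activations
    }}
}}
"""
```

In practice you would call this for each trained layer (e.g. `model.trunk[0].weight.detach().numpy()`) and paste the generated functions into the fragment shader, which ray-marches through the volume and evaluates them at every sample.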

This project was an exciting deep dive into neural rendering, graphics programming, and high-performance computing. The combination of machine learning and real-time graphics opens the door to more advanced applications, from VR and gaming to robotics and autonomous vision systems.

Up next, I hope to explore Gaussian splatting as an even faster way to build 3D representations from 2D images.
