Harnessing NeRFs and Exploring Volumetric Applications

October 12, 2023 • 15 minute read

We caught up with our dedicated Research & Development team, who are continuously exploring what's possible in the world of virtual humans and environments, from volumetric video to NeRFs, AI, virtual production, and beyond.

A Close Look at NeRFs

We're pursuing a number of goals in NeRF-land: NeRFs (Neural Radiance Fields) that capture whole environments, NeRFs that move, and NeRFs that are relightable.

NeRFs use sophisticated machine learning techniques to create 3D renderings of objects or scenes from 2D images (be sure to watch Corridor Crew's video for an excellent introduction). They are 3D reconstructions whose underlying structure doesn't rely on conventional, texture-mapped meshes.
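To make that concrete, here's a minimal sketch of the core idea (a toy illustration of our own, not production code): a small neural network maps a 3D position and viewing direction to a colour and a volume density, and a renderer integrates those samples along each camera ray. Real systems add positional encodings and many other refinements, but the scene ultimately lives in the network's weights rather than in a mesh.

```python
# Minimal sketch of the core NeRF idea (illustrative only):
# an MLP maps a 3D position and viewing direction to an RGB
# colour and a volume density, which a renderer integrates
# along each camera ray.
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        # 3 position coords + 3 direction coords in, RGB + density out
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),
        )

    def forward(self, xyz, view_dir):
        out = self.mlp(torch.cat([xyz, view_dir], dim=-1))
        rgb = torch.sigmoid(out[..., :3])   # colour in [0, 1]
        sigma = torch.relu(out[..., 3:])    # non-negative density
        return rgb, sigma

# Query 1024 random sample points along rays.
rgb, sigma = TinyNeRF()(torch.rand(1024, 3), torch.rand(1024, 3))
```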

Capturing Environments

The NeRFs we've seen so far come from cameras that either look inwards towards a central subject or outwards in a single direction. The Fast Free NeRF academic paper describes a mechanism for creating NeRFs from images captured along an arbitrary camera flight path, and it's the first open-source code we've seen that deals with wide-open spaces.

Our team put it to the test and succeeded in getting our own capture data into the paper's format. Even though the video presented here looks inwards, we know the technique works for free flight too.

Key takeaway: This capability is significant, as we can leverage the same high realism that NeRFs bring to centre-stage actors for our LED wall environments too. Not having to work with highly-constrained camera set-ups is a real bonus.
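As a rough illustration of what "free flight" input looks like (a toy example of our own, not the paper's actual data format): a set of camera-to-world poses sampled along an arbitrary wandering path, rather than an orbit around a single centre point.

```python
# Toy illustration of "free flight" capture input (not the paper's
# actual data format): camera-to-world poses sampled along an
# arbitrary path, each with its own look direction.
import numpy as np

def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
    """Build a 4x4 camera-to-world matrix looking from eye towards target."""
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    c2w = np.eye(4)
    # OpenGL-style convention: camera looks down its local -Z axis.
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, true_up, -forward, eye
    return c2w

# A wandering path through a wide-open space, rather than an orbit
# around a single centre point.
t = np.linspace(0, 1, 60)
path = np.stack([10 * t, 1.5 + np.sin(4 * np.pi * t), 5 * np.cos(2 * np.pi * t)], axis=1)
poses = [look_at(eye, eye + np.array([1.0, 0.0, -0.3])) for eye in path]
```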

Moving NeRFs

On the moving NeRF front, our team has captured sequences frame-by-frame, but that leads to inconsistencies between frames, especially in the lighting. Any given NeRF is stored as a long list of numbers - the weights in a neural network - and the MLPMaps paper describes a way to encode the differences between consecutive NeRF frames as the weights of a neural network too.

The video below, from this research project, presents a sequence featuring one of our favourite tennis players. We have applied the approach outlined in the MLPMaps paper to create a sequence of neural networks for playback. Currently the technique is limited to human actors, but the real achievement lies in understanding the overarching network approach.

Key takeaway: Similar to how video stores frame differences for compression, we are compressing a sequence of NeRFs, and the lighting remains remarkably steady too.
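A rough sketch of that frame-difference idea (our own illustration, not the MLPMaps method itself): each frame's NeRF is a long vector of weights, so a sequence can be stored as a base frame plus small per-frame deltas.

```python
# Rough illustration (not the MLPMaps method itself) of the
# frame-difference idea: each NeRF frame is a long vector of
# network weights, so consecutive frames can be stored as a
# base frame plus small deltas.
import torch
import torch.nn as nn

def flatten_weights(model):
    return torch.cat([p.detach().flatten() for p in model.parameters()])

def load_weights(model, flat):
    i = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat[i:i + n].view_as(p))
        i += n

frames = [nn.Linear(64, 4) for _ in range(30)]            # stand-ins for per-frame NeRFs
base = flatten_weights(frames[0])
deltas = [flatten_weights(f) - base for f in frames[1:]]  # small if frames are similar

# Playback: reconstruct frame 10 from the base weights plus its delta.
frame_10 = nn.Linear(64, 4)
load_weights(frame_10, base + deltas[9])
```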

Conventional Meshes

Back in the world of conventional meshes, one problem we face comes with a change in the topology of the subject - if the actor removes her hat, that creates an entirely new mesh of triangles, and this gives the rendered surface an unwanted shimmer, like boiling water. 

We've long pursued the goal of keeping a single mesh and texture mapping through extreme movements, and the open-source code from the "Surface Maps via Adaptive Triangulations" paper shows one way. By marking corresponding features on the meshes, the code can bend a single mesh into extreme poses.

Below, there's a skeleton inserted into the mesh, and the coloured balls appear where rays projected from the joints puncture the mesh - they follow the body's movement and show us how the mesh could move to match both poses. Although it's not obvious, the mesh in both poses is topologically identical - we've solved the famous "One UV" problem!

Key takeaway: Again, this works only for human subjects for now, but it will reduce the amount of manual intervention needed to achieve high quality in the final delivery, as well as the volume of data we need to store and transmit.
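For the curious, here's a minimal sketch of how those puncture points can be found with an off-the-shelf ray-mesh intersection (an illustration of our own, with made-up joint positions, not our production tooling):

```python
# Minimal sketch of finding where rays cast from skeleton joints
# puncture the surrounding mesh (illustrative only; the mesh,
# joint positions, and ray directions here are made up).
import numpy as np
import trimesh

mesh = trimesh.creation.capsule(height=1.6, radius=0.25)  # stand-in for the actor's mesh

# Hypothetical joint positions inside the mesh, and outward ray directions.
joints = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.3]])
directions = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# Each hit location is a candidate correspondence marker on the surface.
locations, ray_idx, tri_idx = mesh.ray.intersects_location(
    ray_origins=joints, ray_directions=directions)
print(locations)  # points where the rays exit the mesh
```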

Volumetric Crowds at Scale

Our ability to handle crowds at an impressive scale isn't just a one-time feat; it's become part of our DNA. Our volumetric crowd work on the production of "I Wanna Dance with Somebody" has garnered recognition on prestigious stages, with presentations at both SIGGRAPH and FMX conferences. 

We’re currently working on a project of similar magnitude, which leverages our virtual production and volumetric expertise.

Volumetric Video & Virtual Production

Bringing together two of our core offerings, we've successfully tested high-action volumetric video as mid-ground characters on an LED volume for virtual production, with the findings shared at SIGGRAPH 2023. This was the first time Arcturus's Accelerated Volumetric Video (AVV) codec for Unreal Engine had been tested on an LED wall.

This builds on the volumetric crowd solution used for I Wanna Dance with Somebody, and offers new possibilities for virtual production.

Nvdiffrec Exploration

Our team is exploring various techniques and technologies to make our volumetric captures even more flexible, such as NVIDIA's Instant Neural Graphics Primitives and Nvdiffrec projects.

In this case, we applied Nvdiffrec's joint optimisation of topology, materials, and lighting from multi-view image observations to our model of the tennis player.

Key takeaway: Inconsistent lighting is one of the unconscious triggers that signals to the audience that they're not viewing a real person. Being able to enhance the model and light it accurately helps overcome the 'uncanny valley' effect.
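At its heart, that joint optimisation is gradient descent through a differentiable renderer. Below is a heavily simplified sketch of the kind of loop involved (render_views is a hypothetical placeholder of our own, not Nvdiffrec's actual API):

```python
# Heavily simplified sketch of joint optimisation through a
# differentiable renderer. `render_views` is a hypothetical
# placeholder, not Nvdiffrec's actual API.
import torch

def render_views(vertices, material, light, cameras):
    # Stand-in for a differentiable renderer; a real pipeline would
    # rasterise the mesh with these parameters for each camera view.
    return vertices.mean() + material.mean() + light.mean() + 0 * cameras

target_images = torch.rand(8)   # stand-in for multi-view photographs
cameras = torch.zeros(8)        # stand-in for camera parameters

vertices = torch.rand(1000, 3, requires_grad=True)       # geometry
material = torch.rand(256, 256, 4, requires_grad=True)   # e.g. albedo + roughness
light = torch.rand(16, 32, 3, requires_grad=True)        # environment lighting

# All three sets of parameters are updated together against the
# same photometric loss.
optimiser = torch.optim.Adam([vertices, material, light], lr=1e-2)
for step in range(1000):
    optimiser.zero_grad()
    rendered = render_views(vertices, material, light, cameras)
    loss = torch.nn.functional.mse_loss(rendered, target_images)
    loss.backward()
    optimiser.step()
```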

Volumetric Colour Pipeline

In the realm of volumetric video, colour plays an integral role in achieving authenticity. We strive to ensure that assets look realistic and can seamlessly blend into scenes. 

Throughout our workflow, from the initial image capture to the final delivery of volumetric video assets, we maintain a consistent colour pipeline to ensure the best results for our clients.
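One small but recurring example of what consistency means in practice is keeping track of when images are display-encoded versus scene-linear. Here is a minimal sketch of the standard sRGB conversions (our own illustration, not our full pipeline):

```python
# Minimal illustration of one consistency rule in a colour pipeline:
# convert display-encoded sRGB values to scene-linear before any
# blending or processing, and only re-encode at delivery.
import numpy as np

def srgb_to_linear(srgb):
    """Inverse of the standard sRGB transfer function."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045,
                    srgb / 12.92,
                    ((srgb + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(lin):
    """Standard sRGB transfer function."""
    lin = np.asarray(lin, dtype=np.float64)
    return np.where(lin <= 0.0031308,
                    lin * 12.92,
                    1.055 * lin ** (1.0 / 2.4) - 0.055)

# Averaging two pixels in linear light gives a physically sensible mix;
# averaging their encoded sRGB values would darken the result.
a, b = srgb_to_linear(0.2), srgb_to_linear(0.8)
print(linear_to_srgb((a + b) / 2.0))
```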

Below is the latest output of our volumetric colour pipeline from our AR Queens project. 

We’ve covered all that our inquisitive R&D team can unveil for now. If you're eager to dive deeper into the world of volumetric video and explore the possibilities, don't hesitate to reach out.