
Researchers from the University of Maryland and Meta AI Propose OmnimatteRF: A Novel Video Matting Method that Combines Dynamic 2D Foreground Layers and a 3D Background Model


Video matting is the task of separating a video into multiple layers, each with its own alpha matte, such that the layers recompose into the original video. Because layers can be swapped out or processed separately before being composited back, the technique has many uses in video editing and has been studied for decades. Applications that need a mask of only the subject of interest include rotoscoping in video production and background blurring in online meetings. Often, however, the desired matte covers not just the object of interest but also its associated effects, such as shadows and reflections. Such mattes could improve the realism of the final edited video while reducing the need for laborious hand segmentation of secondary effects.
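The recomposition step can be illustrated with a minimal sketch, assuming the standard back-to-front "over" operator on straight (non-premultiplied) alpha. The function name `composite_layers` and the toy 2x2 frame are hypothetical and chosen only for illustration; they are not taken from the paper or its code.

```python
import numpy as np

def composite_layers(background, layers):
    """Composite RGBA layers over an RGB background, back to front.

    background: float array of shape (H, W, 3), values in [0, 1]
    layers: list of float arrays of shape (H, W, 4), ordered back to front
    """
    out = background.astype(np.float64).copy()
    for layer in layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        # "Over" operator: the layer covers what is behind it in proportion to alpha.
        out = alpha * rgb + (1.0 - alpha) * out
    return out

# Toy 2x2 frame (illustrative values only).
background = np.full((2, 2, 3), 0.8)

# A half-transparent black "shadow" at pixel (0, 0).
shadow = np.zeros((2, 2, 4))
shadow[0, 0, 3] = 0.5

# An opaque red "object" at pixel (1, 1).
obj = np.zeros((2, 2, 4))
obj[1, 1] = [1.0, 0.0, 0.0, 1.0]

frame = composite_layers(background, [shadow, obj])

# Dropping every foreground layer leaves only the background.
clean = composite_layers(background, [])
```

Because each layer carries its own alpha, a layer can be edited or removed and the frame recomposed without touching the others.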

Factoring out the effects associated with foreground objects also helps reconstruct a clean background, which is what applications like object removal need. Despite these advantages, the problem is ill-posed and has received significantly less research attention than standard matting.

Omnimatte is the most promising effort to date to address this problem. Omnimattes are RGBA layers that capture moving foreground objects together with the effects they produce. Because Omnimatte models the background with a homography, it is effective only for videos in which the background is planar or the camera motion is purely rotational.
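To see why a homography suffices only in those cases, here is a small sketch under a pinhole camera assumption. For a pure rotation R with intrinsics K, the induced homography is H = K R K^-1, and it maps a pixel the same way regardless of scene depth. Once the camera also translates, points at different depths along the same ray land on different pixels (parallax), which a single homography cannot express. The intrinsics, rotation angle, and translation below are made-up values, and this is not Omnimatte's implementation.

```python
import numpy as np

def rotation_homography(K, R):
    """Homography induced by a pure camera rotation R: H = K R K^-1."""
    return K @ R @ np.linalg.inv(K)

def warp_pixel(H, uv):
    """Apply a 3x3 homography to a pixel (u, v)."""
    p = H @ np.array([uv[0], uv[1], 1.0])
    return p[:2] / p[2]

def project(K, R, t, X):
    """Project 3D point X (first-camera coordinates) into a camera with pose (R, t)."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

# Assumed intrinsics and a 5 degree rotation about the vertical axis.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
H = rotation_homography(K, R)

# Two 3D points on the same viewing ray through pixel (400, 300), at depths 2 and 10.
pixel = np.array([400.0, 300.0])
ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
near, far = 2.0 * ray, 10.0 * ray

# Pure rotation: both depths land where the homography predicts.
no_shift = np.zeros(3)
rot_near = project(K, R, no_shift, near)
rot_far = project(K, R, no_shift, far)
warped = warp_pixel(H, pixel)

# Rotation plus translation: the two depths separate (parallax).
shift = np.array([0.5, 0.0, 0.0])
moved_near = project(K, R, shift, near)
moved_far = project(K, R, shift, far)
```

Under pure rotation, `rot_near`, `rot_far`, and `warped` coincide; with the added translation, `moved_near` and `moved_far` differ, so no single homography can align a background with varying depth.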

D2NeRF attempts to overcome this limitation by modeling the dynamic and static components of the scene separately with two radiance fields. All processing is done in 3D, so the system can handle complex scenes with significant camera movement. It also requires no mask input, which makes it fully self-supervised. It effectively segments all moving objects from a static background, but it is unclear how to incorporate 2D guidance defined on the video, such as rough masks.

Recent research from the University of Maryland and Meta AI proposes an approach that combines the advantages of both by pairing a 3D background model with 2D foreground layers.

The lightweight 2D foreground layers can represent objects, actions, and effects that would be difficult to model in 3D. At the same time, the 3D background model handles backgrounds with complex geometry and non-rotational camera motion, which makes it possible to process a wider variety of videos than 2D approaches can. The researchers call this technique OmnimatteRF.

Experimental results show strong performance across a wide range of videos without per-video parameter tuning. To evaluate background separation in 3D scenes objectively, the team uses a dataset of five videos produced by the D2NeRF authors and rendered using Kubrics. These are relatively uncluttered indoor scenes with a few moving objects that cast solid shadows. In addition, the team generated five videos from open-source Blender movies with complex animation and lighting for more difficult and realistic scenarios. On both datasets, the method outperforms prior approaches.

If a region of the background is always in shadow, the background model cannot accurately recover its color. In principle, because a foreground layer has an alpha channel, it could record only the additive shadow while preserving the original color of the background. Unfortunately, the problem is under-constrained in the current setting, which makes a workable solution difficult to find.
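A short derivation may help show why the always-shadowed case is ambiguous. It assumes the standard "over" compositing model with a pure-black shadow layer; the symbols are introduced here for illustration and are not taken from the paper.

```latex
% C: observed pixel color, B: true (unshadowed) background color,
% F: shadow layer color (assumed black, F = 0), \alpha: shadow layer opacity.
\begin{align}
  C &= \alpha F + (1 - \alpha)\, B = (1 - \alpha)\, B \\
  % Any other opacity \alpha' with a rescaled background gives the same pixel:
  B' &= \frac{C}{1 - \alpha'}, \qquad 0 \le \alpha' < 1
  \quad\Longrightarrow\quad (1 - \alpha')\, B' = C
\end{align}
```

As long as B' stays a valid color, every such pair explains the observed pixel equally well, so a region that is never seen without its shadow gives no signal for choosing between a darker background and a stronger shadow.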


Check out the Paper, GitHub, and Project Page. All credit for this research goes to the researchers on this project.



Dhanshree Shenwai is a computer science engineer with experience at FinTech companies in the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier.


