OpenAI calls its new system Sora, after the Japanese word for sky. The team behind the technology, including the researchers Tim Brooks and Bill Peebles, chose the name because it “evokes the idea of limitless creative potential.”
Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.
Sora builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data.
In addition to being able to generate a video solely from text instructions, the model is able to take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small detail.
OpenAI recently released its new text-to-video product, Sora, which is an absolute game-changer.
Sora is essentially a physics engine: a program that takes a 3D object and renders it according to the laws of physics.
Physics engines create simulations of many worlds, real or imaginary.
Dr. Jim Fan says, “The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some de-noising and gradient maths.”
Sora is an end-to-end, diffusion transformer model. It inputs text/image and outputs video pixels directly. Sora learns a physics engine implicitly in the neural parameters by gradient descent through massive amounts of videos.
Sora is a learnable simulator, or "world model". Of course it does not call UE5 explicitly in the loop, but it's possible that UE5-generated (text, video) pairs are added as synthetic data to the training set.
A physics engine is a software program used to simulate physical phenomena. The first physics engines were used in military simulations to predict where artillery shells would land. These engines factored in the shells' weights, forces, and trajectories to simulate the result.
In computer video games, physics engines enhance the player's enjoyment by simulating the complex physical characteristics of a virtual world. Unlike scientific physics engines, video game engines use an approximation of real-world physics to quickly simulate complex world interactions in real time, computed on the computer's GPU (graphics processing unit).
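The artillery case can be sketched in a few lines. This is a minimal illustration, not a reconstruction of any historical system: it ignores air drag and wind, assumes flat ground, and the names `simulate_shell` and `analytic_range` are made up for this example. It steps the shell forward in small time increments and compares the landing point with the textbook closed-form range.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def simulate_shell(speed, angle_deg, dt=0.001):
    """Step a drag-free shell forward in time until it returns to the ground.

    Hypothetical helper for illustration. Returns the horizontal distance in metres.
    """
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    while True:
        # Semi-implicit Euler: update velocity first, then position.
        vy -= G * dt
        new_x, new_y = x + vx * dt, y + vy * dt
        if new_y < 0.0:
            # Interpolate to the point where the shell crosses ground level.
            frac = y / (y - new_y)
            return x + frac * (new_x - x)
        x, y = new_x, new_y

def analytic_range(speed, angle_deg):
    """Closed-form range on flat ground with no drag: v^2 * sin(2 * theta) / g."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / G

simulated = simulate_shell(speed=300.0, angle_deg=45.0)
exact = analytic_range(speed=300.0, angle_deg=45.0)
print(f"simulated range: {simulated:.1f} m, closed-form range: {exact:.1f} m")
```

The two numbers agree closely because the only force here is gravity. A real ballistic model would add drag and other effects, and then no simple closed form exists, which is why stepping a simulation is useful.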
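The speed-versus-accuracy trade-off can be shown with a toy example. The sketch below assumes the simplest possible integrator (explicit Euler) and a made-up helper, `fall_distance`; real game engines use more careful schemes, but the principle is the same: fewer, larger time steps are cheaper to compute and less accurate.

```python
G = 9.81  # gravitational acceleration in m/s^2

def fall_distance(duration, dt):
    """Distance fallen from rest after `duration` seconds, using explicit Euler steps.

    Hypothetical helper for illustration only.
    """
    steps = round(duration / dt)
    v, d = 0.0, 0.0
    for _ in range(steps):
        d += v * dt   # move using the velocity from the start of the step
        v += G * dt   # then update the velocity
    return d

exact = 0.5 * G * 1.0 ** 2                                  # d = g * t^2 / 2
coarse_error = abs(fall_distance(1.0, dt=1 / 30) - exact)   # 30 steps per second
fine_error = abs(fall_distance(1.0, dt=1 / 1000) - exact)   # 1000 steps per second
print(f"error at 30 steps/s: {coarse_error:.4f} m, at 1000 steps/s: {fine_error:.4f} m")
```

The coarse run does about 3% of the work of the fine run and lands further from the exact answer. A game will usually accept that kind of error in exchange for hitting its frame rate.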
Sora is a diffusion model that is able to "generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background."
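To give a feel for what "diffusion" and "de-noising" mean, here is a toy sketch of the standard forward noising step used by diffusion models, applied to three numbers standing in for pixels. It is not Sora's code and says nothing about how Sora is actually implemented: the linear noise schedule and the names `alpha_bar`, `add_noise`, and `remove_noise` are assumptions for illustration, and the "model" is simply handed the true noise instead of predicting it with a trained transformer.

```python
import math
import random

def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t, linear schedule (assumed)."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def add_noise(x0, t, num_steps, rng):
    """Forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, with eps ~ N(0, 1)."""
    a = alpha_bar(t, num_steps)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

def remove_noise(xt, eps_hat, t, num_steps):
    """Invert the forward step given a noise estimate eps_hat."""
    a = alpha_bar(t, num_steps)
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps_hat)]

rng = random.Random(0)
pixels = [0.1, 0.5, 0.9]  # stand-in for video pixels
noisy, true_eps = add_noise(pixels, t=500, num_steps=1000, rng=rng)
# A trained network would have to predict the noise; here we cheat and pass the true noise.
recovered = remove_noise(noisy, true_eps, t=500, num_steps=1000)
```

With a perfect noise estimate, `recovered` matches `pixels` up to floating-point error. Training a diffusion model amounts to teaching a network to produce that noise estimate from the noisy input and the conditioning text.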
A physics engine takes a list of objects along with their positions and other state, computes how that state changes over time, and lets you render views of the result.
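That description maps onto a very small program. The sketch below is a toy, not how any production engine is built: the `Body` class, the `step` and `render` functions, and the text-grid "view" are all assumptions chosen to make the three parts (object list, temporal update, rendering) visible.

```python
import math
from dataclasses import dataclass

@dataclass
class Body:
    """Hypothetical object record: a name plus position and velocity."""
    name: str
    x: float   # position in metres
    y: float
    vx: float  # velocity in metres per second
    vy: float

GRAVITY = -9.81  # m/s^2, acting along the y axis

def step(bodies, dt):
    """Temporal update: advance every body by one time step (semi-implicit Euler)."""
    for b in bodies:
        b.vy += GRAVITY * dt
        b.x += b.vx * dt
        b.y += b.vy * dt

def render(bodies, width=20, height=10):
    """Render a side-on text view, one character cell per metre."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for b in bodies:
        col, row = math.floor(b.x), math.floor(b.y)
        if 0 <= col < width and 0 <= row < height:
            grid[height - 1 - row][col] = b.name[0]  # y grows upward on screen
    return "\n".join("".join(r) for r in grid)

world = [Body("ball", 2.0, 8.0, 3.0, 0.0), Body("crate", 10.0, 5.0, 0.0, 0.0)]
for _ in range(50):  # simulate 0.5 s in 10 ms steps
    step(world, dt=0.01)
frame = render(world)
print(frame)
```

Calling `step` and `render` in a loop gives the basic shape of a simulation: state in, updated state out, and a view produced from the state whenever one is needed.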
A physics engine is a software component designed to simulate physical interactions in a virtual environment. It is commonly used in video games, computer graphics, and simulations to model the behavior of objects based on the laws of physics.
Physics engines simulate aspects such as gravity, friction, and collision detection and response. They allow developers to create realistic and dynamic virtual worlds where objects interact with each other in a way that mimics real-world physics.
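A bouncing ball is enough to show gravity, friction, collision detection, and collision response together. The following is a minimal sketch under simplifying assumptions: a single ball, a flat floor at y = 0, and crude constant factors for bounce and friction. The constants `RESTITUTION` and `FRICTION` and the function `step` are illustrative choices, not values or APIs from any particular engine.

```python
GRAVITY = 9.81      # m/s^2, pulling toward the floor at y = 0
RESTITUTION = 0.8   # fraction of vertical speed kept after a bounce (assumed)
FRICTION = 0.9      # fraction of horizontal speed kept on each floor contact (assumed)

def step(state, dt, radius=0.5):
    """Advance a ball (x, y, vx, vy) by dt, bouncing it off the floor."""
    x, y, vx, vy = state
    vy -= GRAVITY * dt          # gravity
    x += vx * dt
    y += vy * dt
    if y - radius < 0.0:        # collision detection: the ball overlaps the floor
        y = radius              # response: push the ball back out of the floor,
        vy = -vy * RESTITUTION  # reverse and damp its vertical velocity,
        vx *= FRICTION          # and let friction slow it horizontally
    return (x, y, vx, vy)

state = (0.0, 5.0, 2.0, 0.0)    # dropped from 5 m while moving sideways at 2 m/s
lowest_y = state[1]
bounces = 0
for _ in range(2000):           # 20 s at 10 ms per step
    previous_vy = state[3]
    state = step(state, dt=0.01)
    lowest_y = min(lowest_y, state[1])
    if previous_vy < 0.0 and state[3] > 0.0:
        bounces += 1            # the ball was falling and is now rising
print(f"bounces: {bounces}, final horizontal speed: {state[2]:.3f} m/s")
```

The ball never sinks below the floor, each bounce is lower than the last, and its sideways motion dies away. A full engine does the same thing for many objects of many shapes at once, which is where most of the complexity lies.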