TECH NEWS – OpenAI, the company behind DALL·E and ChatGPT, has unveiled a new text-to-video model called Sora.
According to OpenAI, Sora will serve as the basis for models that can understand and simulate the real world, bringing the company one step closer to AGI (artificial general intelligence). It can generate videos up to 60 seconds long from text prompts such as “stylish woman walking down a Tokyo street” or “a movie trailer featuring the adventures of the 30-year-old spaceman wearing a red woolen motorcycle helmet”.
Previous video-generating AI models have struggled with consistency, as faces, objects, and clothing can vary from frame to frame. In contrast, OpenAI says Sora understands not only what the user wrote in the prompt but also how those things exist in the physical world. Its video celebrating the Lunar New Year, for example, could at first glance pass for real documentary footage; only on closer inspection do the proportions of the people look off, with some appearing to stumble.
“The current model has flaws. It can struggle to accurately simulate the physics of a complex scene, and it may not understand certain instances of cause and effect. For example, a person may bite into a cookie, but the cookie may not have a bite mark afterward. The model may also confuse spatial details of a request, such as confusing left and right, and may struggle with precise descriptions of events that occur over time, such as following a particular camera trajectory,” OpenAI wrote. Sora is not yet widely available: the company is still assessing the model’s social risks and working on a detection tool that can identify whether a given video was generated with Sora.
Several visual artists, designers, and filmmakers have been granted early access, and OpenAI plans to use their feedback to make the model as useful as possible for creative professionals. One question remains open, however: where does the model get its source material?