TECH NEWS – Natural language commands can be used to manipulate images. Sounds pretty groundbreaking!
Although Apple lags behind OpenAI’s ChatGPT and Google’s Gemini, the US tech giant has poured no small amount of money into AI to make sure the iPhone 16 ships with plenty of AI features when iOS 18 arrives. Now we hear that Apple researchers have created an image-editing model called MGIE that users can drive with simple, plain-language commands. The technology may well be shown off at WWDC 2024 in June.
MGIE is short for MLLM-Guided Image Editing, where MLLM stands for multimodal large language model; in other words, it is an image editor guided by a multimodal large language model that can interpret and execute user commands at the pixel level. The tool can adjust brightness, sharpness and contrast, as well as the shape, color or texture of a selected object. Photoshop-like operations (cropping, resizing, rotating, filters) are also supported, and even the background can be swapped out.
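For a concrete sense of what these pixel-level edits involve, the sketch below performs the same kinds of adjustments by hand with Python's Pillow library. The file name and the adjustment factors are illustrative assumptions, and this is not MGIE's own code, just the manual work that a single MGIE instruction is meant to replace.

from PIL import Image, ImageEnhance, ImageFilter

# Open a photo (hypothetical file name, for illustration only).
img = Image.open("photo.jpg")

# Global adjustments the article mentions: brightness, sharpness, contrast.
img = ImageEnhance.Brightness(img).enhance(1.2)   # brighten by 20%
img = ImageEnhance.Sharpness(img).enhance(1.5)    # sharpen
img = ImageEnhance.Contrast(img).enhance(1.1)     # slight contrast boost

# Photoshop-like operations: cropping, resizing, rotating, filters.
img = img.crop((100, 100, 900, 700))              # crop to a region
img = img.resize((800, 600))                      # resize
img = img.rotate(90, expand=True)                 # rotate 90 degrees
img = img.filter(ImageFilter.GaussianBlur(2))     # apply a blur filter

img.save("edited.jpg")

With MGIE, a single instruction such as "brighten the photo and sharpen it a little" is supposed to stand in for this kind of manual parameter tuning.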
Apple’s new AI model also applies context and common-sense reasoning. For example, if you give it a picture of a pizza and ask it to make the dish healthier, it will add vegetable toppings, because that is what MGIE infers from the context.
The model was created by Apple in collaboration with researchers at the University of California, Santa Barbara, and may appear in a number of apps once the technology is ready. The research was presented at ICLR (the International Conference on Learning Representations). The code and pre-trained models are available on GitHub, so with a little know-how you can already try out what the technology can do, and the Cupertino-based giant seems serious about eventually bringing it to the iPhone, iPad and even Apple Vision Pro.
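If you do not want to set up the full MGIE repository, a quick way to get a feel for instruction-based editing is the related InstructPix2Pix model, which the MGIE paper compares against and which is available through Hugging Face's diffusers library. The model ID, prompt and parameter values below are illustrative assumptions, and MGIE itself wraps a richer MLLM-guided pipeline around this kind of diffusion editor.

import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

# Load an instruction-following image-editing model (not MGIE itself).
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

# The article's pizza example, expressed as a plain-language instruction.
image = Image.open("pizza.jpg").convert("RGB")
edited = pipe(
    "make it healthier by adding vegetable toppings",
    image=image,
    num_inference_steps=20,
    image_guidance_scale=1.5,
).images[0]

edited.save("pizza_healthier.jpg")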
With Siri lagging behind Amazon’s Alexa and Google Assistant, this would be a good place for Apple to start catching up.
Source: WCCFTech, VentureBeat, GitHub