Since June, Craiyon (formerly known as DALL-E Mini), a website that generates a set of nine images from a simple text description, has been a resounding success. Just type, in English, “a raw chicken slides down a slide”, “Jesus uses a virtual reality headset” or “SpongeBob painted in the manner of Claude Monet”, and these curious descriptions turn into more or less successful images. Its virality is explained in particular by the fact that it is free to use: according to its creator, Boris Dayma, a Frenchman living in Houston, it currently receives more than a million requests per day.
Built on the computer code of the now obsolete DALL-E project, developed by OpenAI, a company founded by Elon Musk and Sam Altman, it uses artificial intelligence (AI) to create an endless variety of images. But while DALL-E 2, the new OpenAI project, and the Parti and Imagen tools developed by Google offer results so close to reality that they are virtually indistinguishable from it, Craiyon, which relies on an older technology and more limited computing power, has always been designed for the general public. “I really wanted this technology to be in the hands of all users, so they could see what the cutting-edge models are capable of,” Mr. Dayma tells Le Monde.
It is an approach embraced by the development team, but also by Clément Delangue, the French CEO of the company Hugging Face, which supports the development of Craiyon. “There are two different approaches in the field,” he says. “Some companies, like OpenAI and Gafam [Google, Amazon, Facebook, Apple and Microsoft], build their tools in a very private way, which can create a concentration of power and is a big risk in terms of machine learning. In our case, the idea was to release the model publicly to continue working on it in a collaborative way, so that users can have all the information on how it works.”
A sort of open-air laboratory
Google, for the time being, has chosen not to release a public version of its Imagen project. In a post published in June, a researcher and an engineer from the company acknowledge that while “models for generating images from text are stimulating tools for inspiration and creativity, (…) they also carry risks related to misinformation, bias and security.”