OpenAI: What Is OpenAI and How Does It Work?



What is OpenAI?

OpenAI is an artificial intelligence (AI) research laboratory and company. It has built a variety of programs using artificial intelligence and machine learning algorithms that allow computers to do all sorts of things, such as generating pictures from text or controlling a robotic hand that solves a Rubik’s Cube.

Their latest project, OpenAI Codex, aims to make programming software and applications accessible to ordinary people, saving professional programmers time and energy writing code.

What is OpenAI Codex?

Codex, an AI coding tool, was built on OpenAI’s language generation model, GPT-3, and acts as a translator between users and computers. In early demos, users were able to create simple websites and games using natural language (plain English) instead of a specialized programming language.

Early last year, OpenAI demonstrated a remarkable new AI model called DALL-E (a combination of WALL-E and Dali) that is capable of drawing almost anything in almost any style. But the results were rarely anything you’d want to hang on the wall. Now DALL-E 2 is out, and it does what its predecessor did much, much better, frighteningly well in fact. But the new abilities come with new restrictions to prevent abuse.

OpenAI Benefits

OpenAI also applies prompt and image filters to DALL-E 2, although some customers have complained that the filters are overzealous and imprecise. The company has also focused some of its research on diversifying the types of images DALL-E 2 generates, to combat the biases that text-to-image AI systems fall prey to (such as generating predominantly white men when prompted with “examples of CEOs”).

What is DALL-E 2?

DALL-E 2 was detailed in our original post on it, but the bottom line is that it’s capable of taking fairly complex prompts, such as “A bear riding a bike around the mall, next to a picture of a cat stealing the Declaration of Independence.” It would happily oblige and, out of hundreds of outputs, find the one most likely to meet the user’s standards.

Advantages of DALL-E 2

DALL-E 2 basically does the same thing, turning a text prompt into a surprisingly accurate image. But it has learned a few new tricks.

First, it’s simply better at doing the original thing. The images that come out of DALL-E 2 are several times larger and more detailed. It’s actually faster, even though it produces more imagery, which means more variations can be created in the few seconds the user might be willing to wait.

Part of this improvement comes from switching to a diffusion model, a type of image creation that starts with pure noise and refines the image over time, repeatedly making it look a little more like the desired image until there’s no noise left. But it’s also just a smaller, more efficient model, some of the engineers who worked on it told me.
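To make the “start from noise and refine” idea concrete, here is a toy sketch in Python. It is not how DALL-E 2 is implemented: a real diffusion model uses a trained neural network to predict each correction, while this hypothetical toy_denoise_step simply nudges the pixels toward a target that is known in advance. All names and numbers below are assumptions made for illustration.

```python
import random


def toy_denoise_step(image, target, strength):
    """Move every pixel a fraction of the way toward the target.

    Stand-in for a trained denoising network (assumption): a real model
    has no access to the target and must predict the correction itself.
    """
    return [p + strength * (t - p) for p, t in zip(image, target)]


def toy_reverse_diffusion(target, steps=50, seed=0):
    """Start from pure Gaussian noise and refine it step by step."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for step in range(steps):
        remaining = steps - step
        # Remove 1/remaining of the leftover noise, so that the final
        # step (remaining == 1) removes whatever is still there.
        image = toy_denoise_step(image, target, 1.0 / remaining)
    return image


def mean_abs_error(image, target):
    """Average per-pixel distance between two images."""
    return sum(abs(p - t) for p, t in zip(image, target)) / len(target)


# A five-pixel "image" stands in for the desired picture.
target = [0.0, 0.25, 0.5, 0.75, 1.0]
result = toy_reverse_diffusion(target)
print(mean_abs_error(result, target))
```

Each pass makes the picture look a little more like the desired image, and after the last pass essentially no noise is left, which is the behavior the paragraph above describes.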

Second, DALL-E 2 does what they call “inpainting”, basically a clever replacement of a given area in the image. Let’s say you have a photo of your place, but there are some dirty dishes on the table. Simply select that area and describe what you want instead: “empty wooden table” or “table without dishes”, whichever makes sense. Within seconds, the model will show you several interpretations of the prompt and you can choose the one that looks best.
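The “select an area, replace only that area” part of inpainting can be sketched as simple mask bookkeeping. This is only an illustration under stated assumptions: the generated pixels here are a hard-coded stand-in for whatever the model would produce, and the function name composite_inpaint is hypothetical, not part of any OpenAI tool.

```python
def composite_inpaint(original, generated, mask):
    """Keep original pixels where mask is 0, take generated pixels where it is 1."""
    return [
        [g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


# 3x3 grayscale "photo": the 9s stand in for the dirty dishes.
original = [
    [1, 1, 1],
    [1, 9, 9],
    [1, 1, 1],
]
# Stand-in for model output (assumption): an "empty wooden table" everywhere.
generated = [
    [5, 5, 5],
    [5, 5, 5],
    [5, 5, 5],
]
# The user's selection: 1 marks the area to replace.
mask = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
]

edited = composite_inpaint(original, generated, mask)
print(edited)
```

Everything outside the selection is left untouched, which is why the rest of the photo survives the edit.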

Photoshop vs DALL-E 2

You may be familiar with something similar in Photoshop, “Content-Aware Fill”. But that tool is more for filling a space with more of the same, like when you want to remove a bird from an otherwise clear sky and don’t want to bother with cloning. DALL-E 2’s abilities go much further: it can invent new things, like a different kind of bird or a cloud, or in the case of the table, a vase of flowers or a spilled bottle of ketchup. It’s not hard to imagine useful applications for this.

Capabilities of DALL-E 2 and OpenAI

Notably, the model will include things like appropriate lighting and shadows or choose the right materials because it is aware of the rest of the scene. I use “aware” loosely here – no one, not even its creators, knows how DALL-E represents these concepts internally, but for these purposes the important thing is that the results indicate that it has some form of understanding.

The third new capability is “variations”, which is an accurate enough name: you give the system an example image and it will generate as many variations on it as you want, from very close approximations to impressionistic edits. You can even give it a second image and it will kind of cross-pollinate them, combining the best aspects of each. The demo they showed me had DALL-E 2 generating street murals based on an original, and it really captured the artist’s style for the most part, although on inspection it was probably clear which one was the original.

It’s hard to overstate the quality of these images compared to other generators I’ve seen. Although there are almost always the kinds of “tells” you expect from AI-generated images, they’re less obvious, and the rest of the image is much better than the best that others have produced.

Almost anything in OpenAI

I’ve written before that DALL-E 2 can draw “almost anything”, although there’s really no technical limitation preventing the model from convincingly drawing anything you can think of. But OpenAI is aware of the risks posed by deepfakes and other abuses of AI-generated images and content, so it added some restrictions to its latest model.

For now, DALL-E 2 is running on a hosted platform, an invite-only test environment where developers can try it out in a controlled manner. In part, this means that every prompt they give the model is assessed for violations of the content policy, which prohibits what they say are “non-G-rated images.”

This means no hate, harassment, violence, self-harm, explicit or “shocking” images, illegal activities, deception (e.g. fake news), political actors or situations, medical or disease-related images, or general spam. In fact, much of this won’t be possible anyway because the offending images have been excluded from the training set: DALL-E 2 can do a shiba inu in a beret, but it doesn’t even know what a missile strike is.
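As a rough idea of what screening a prompt against policy categories could look like, here is a minimal keyword-based sketch in Python. OpenAI has not described how its actual filter works, so everything here is an assumption: the category names are borrowed from the list above, while the terms, the BLOCKED_TERMS table, and the screen_prompt function are invented purely for illustration. A production system would likely rely on trained classifiers rather than a word list.

```python
# Hypothetical term lists, invented for this example only.
BLOCKED_TERMS = {
    "violence": {"missile strike", "gunfight"},
    "deception": {"fake news"},
    "political": {"election rally"},
}


def screen_prompt(prompt):
    """Return the sorted policy categories a prompt appears to violate."""
    text = prompt.lower()
    return sorted(
        category
        for category, terms in BLOCKED_TERMS.items()
        if any(term in text for term in terms)
    )


for prompt in ["A shiba inu in a beret", "A missile strike on a city"]:
    flagged = screen_prompt(prompt)
    status = "blocked: " + ", ".join(flagged) if flagged else "allowed"
    print(prompt, "->", status)
```

A simple list like this is exactly the sort of filter that ends up overzealous and imprecise, since it matches words rather than meaning.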

In addition to evaluating prompts, all resulting images will (for now) be reviewed by human inspectors. Of course, this isn’t scalable, but the team told me it’s part of the learning process. They’re not sure exactly how the boundaries should work, so they’re keeping the platform small and proprietary for now.

In time, DALL-E 2 will likely turn into an API that can be called like other OpenAI features, but the team said they want to be sure it’s wise before they take off the training wheels.
