After a long wait, OpenAI (https://openai.com/) has finally released the new GPT-4 model - a major update to the core structure behind the popular ChatGPT system, as well as the GPT-3.5 API. There's a lot going on in the new GPT-4 release.

Based on early examples, API documentation, and code samples provided by OpenAI, here are the key things to know about the project:

The project is multimodal. What does it mean?

Before the release of GPT-4, there was a lot of speculation about whether the project would remain a text-only model like ChatGPT, or become multi-modal. Multimodal models are capable of handling a wide range of media types, both output and input, from text to images and ultimately video.

Currently, GPT-4 supports both input and output images. Initially, this capability is only available to one third-party company that is helping OpenAI test image processing. As the system becomes faster, images as input will be available to more users.

But OpenAI has some examples of how this could end up working. One example includes a photo of eggs and flour with a cooking-related query. GPT-4 recommends recipes that can be made with the ingredients shown in the photo. The model can also be used to create image captions or write amazing alt text for images on websites. The video is not yet available, but it will likely appear since GPT-4 is multimodal.

OpenAI will provide API access to the new model almost immediately. Many companies already integrate with existing APIs from OpenAI, so migrating to GPT-4 is easy. By default, GPT-4 can handle 8,000 tokens, which is about 50 pages of text.

Processing more data will allow the system to process many more instructions, write longer articles, and perhaps even write very long documents or full-length works of literature. The evolution of neural networks is happening literally by leaps and bounds, and in the near future we will see repeated updates.