Amazon Nova debuts: new generative AI models for images, video, and text

Amazon has introduced Amazon Nova, a new family of generative artificial intelligence models. According to the company, the models are designed to reduce costs and speed up AI-based content creation, and they can be used in 200 languages, including Hebrew. This broad language coverage means teams do not have to build separate models for different regions.

According to Amazon, Nova allows users to “analyse complex documents and videos, understand graphs and charts, generate immersive video content, and build sophisticated AI agents”. The models process not only text but also images and video, which opens up new opportunities for creative and business tasks.

Three understanding models

Amazon Nova launches with three understanding models, and the company says a fourth will be introduced in the near future:

  1. Amazon Nova Micro is a text-only model that generates output quickly and at low cost. It accepts up to 128,000 tokens of input.
  2. Amazon Nova Lite is a low-cost multimodal model. It can process up to 300,000 tokens of input and analyse multiple images or up to 30 minutes of video in a single query.
  3. Amazon Nova Pro is a more powerful multimodal model that also processes up to 300,000 tokens and is suited to agent-based workflows. It handles visual question answering and video analysis.
  4. Amazon Nova Premier, the forthcoming fourth model, is the most powerful. It is aimed at complex reasoning tasks and at serving as a teacher model for custom model distillation, and it is due to become available in early 2025.
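
To make the quoted limits concrete, here is a minimal Python sketch that checks whether a job plausibly fits in a single query. It uses only the two figures mentioned above (300,000 input tokens and 30 minutes of video). Everything else is an assumption for illustration: the function names are hypothetical, the 4-characters-per-token ratio is a rough rule of thumb for English text rather than a real tokenizer, and the sketch does not model how many tokens a video itself consumes.

```python
import math

# Figures quoted in the article.
CONTEXT_LIMIT_TOKENS = 300_000
MAX_VIDEO_MINUTES = 30

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (a heuristic, not a tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_single_query(text: str, video_minutes: float = 0.0,
                         reserved_output_tokens: int = 5_000) -> bool:
    """Return True if the text and video plausibly fit in one request."""
    if video_minutes > MAX_VIDEO_MINUTES:
        return False
    needed = estimate_tokens(text) + reserved_output_tokens
    return needed <= CONTEXT_LIMIT_TOKENS


def split_into_chunks(text: str, max_tokens: int = CONTEXT_LIMIT_TOKENS) -> list[str]:
    """Split text into pieces whose estimated size stays within max_tokens."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

If a document does not fit, `split_into_chunks` gives a simple fallback: each piece can be sent as its own query. A production system would count tokens with the model's actual tokenizer and split on sentence or section boundaries instead.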

All of these models support Retrieval-Augmented Generation (RAG), function calling, and agent-based applications, so they can be integrated into complex workflows.
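
For readers unfamiliar with function calling, the general pattern is that the model returns a structured request naming a tool and its arguments, and the application runs the tool and passes the result back. The Python sketch below illustrates only the application side of that loop. It is not Amazon's API: the JSON shape of the tool call, the `get_order_status` tool, and the `handle_tool_call` helper are all hypothetical, chosen for illustration.

```python
import json
from typing import Any, Callable


def get_order_status(order_id: str) -> dict[str, Any]:
    """Hypothetical tool: a stand-in for a real business function."""
    return {"order_id": order_id, "status": "shipped"}


# Registry mapping tool names the model may request to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_order_status": get_order_status}


def handle_tool_call(tool_call_json: str) -> str:
    """Run the tool named in a model-issued call and return a JSON result."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    func = TOOLS.get(name)
    if func is None:
        # Unknown tools are reported back rather than raising an exception.
        return json.dumps({"error": f"unknown tool: {name}"})
    result = func(**call.get("arguments", {}))
    return json.dumps({"tool": name, "result": result})


# Assumed shape of a tool request emitted by a model.
example_call = '{"name": "get_order_status", "arguments": {"order_id": "A-1001"}}'
print(handle_tool_call(example_call))
```

In an agent-based workflow, the JSON string returned by `handle_tool_call` would be sent back to the model as the next message, so that it can compose a final answer from the tool's output.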

Key capabilities

One of the main advantages of Amazon Nova is that it can be customised to a user's specific needs. The models can be adapted to industry terminology, tuned to a brand’s voice, and optimised for specific use cases, which allows for more accurate and personalised AI agents for business, science, and creative work.

The technology supports 200 languages, which makes it suitable for global users and addresses the growing need for multilingual solutions in business and the content industry.

Safety and ethics

As with other generative models, ethics and security remain important concerns. Amazon emphasises that the new models include built-in safety mechanisms and that AI-generated content will be watermarked to deter misuse and counterfeiting.

Amazon Nova is aimed at business development, education, entertainment, and many other industries, and it could open up new uses for generative AI across all of them.