
Generative AI models

Unleashing the power of AI to transform your digital assets

Derivative assets

A derivative asset refers to a modified or transformed version of the original digital asset. These derivatives are created to meet specific needs, such as different file formats, resolutions or variations in content. The goal is to make the asset more suitable for various platforms or use cases, without altering the integrity of the original.

For instance, a high-resolution image can have derivatives in different resolutions for web, print or mobile applications. These derivatives matter because they enable efficient distribution and use across diverse channels and media.

Derivative generation

Derivative generation is the automated or manual process of creating derivative assets from the original master asset. Derivatives can usually be generated on the fly or through predefined workflows. Automation reduces the manual effort required to produce them and ensures consistent output quality.

Additionally, derivative generation allows users to access and use the right version of an asset that suits their specific requirements. This functionality streamlines asset management, reduces duplication, and ensures that the right content is efficiently delivered to the right audience.
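
As a rough illustration of a predefined workflow, the sketch below plans resized derivatives for a master image. The preset names and target widths are hypothetical and are not taken from the Filerobot product; the only rules applied are that the aspect ratio is preserved and that a derivative is never larger than the master.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivative:
    name: str
    width: int
    height: int


# Hypothetical presets (assumption): target width in pixels per channel.
PRESETS = {"web": 1200, "mobile": 480, "thumbnail": 160}


def plan_derivatives(master_width, master_height, presets=None):
    """Return one Derivative per preset, keeping the master's aspect ratio."""
    presets = PRESETS if presets is None else presets
    derivatives = []
    for name, target_width in presets.items():
        # Never upscale: cap the derivative width at the master width.
        width = min(target_width, master_width)
        # Scale the height by the same factor as the width.
        height = max(1, round(master_height * width / master_width))
        derivatives.append(Derivative(name, width, height))
    return derivatives


for derivative in plan_derivatives(6000, 4000):
    print(derivative.name, derivative.width, derivative.height)
# web 1200 800
# mobile 480 320
# thumbnail 160 107
```

Keeping the plan separate from the actual pixel resizing means the same presets can drive either on-the-fly generation or a batch workflow.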

Generative AI

Derivative generation can be enhanced and accelerated using different machine learning models. ML models, particularly those in the field of computer vision and image processing, can play a significant role in automating the creation of derivative assets.

A Generative AI model is a type of artificial intelligence algorithm designed to generate new content that resembles or is similar to the data it was trained on. Unlike traditional AI models that are used for classification or prediction tasks, generative models focus on creating new data rather than making decisions based on existing data.

ML models can be trained to recognize and understand different elements of an image, such as objects, faces, backgrounds, and other features. This understanding allows for intelligent resizing, cropping, background removal and other manipulations that produce derivatives without compromising overall quality and aesthetics. By incorporating such capabilities, the process of derivative generation becomes more accurate and scalable.

Generative AI models have a wide range of applications, including image synthesis, text generation, music composition, video generation, and more. They are particularly valuable in creative fields and content generation tasks where producing new and original content is essential. However, they can also be used in other domains, such as data augmentation for training other machine learning models or in data generation for simulations and testing.

The following subpages describe some of the derivative generation models offered by ASK Filerobot.

Background removal

An ML model that accurately separates foreground subjects from backgrounds, enabling easy and efficient generation of transparent assets

Overview

The Background remover is an advanced machine-learning model, designed to make background removal an effortless and time-saving process. By leveraging the power of deep learning algorithms, it can accurately separate foreground subjects from their backgrounds, resulting in high-quality transparent assets.

The model is built on a foundation of state-of-the-art deep learning techniques, particularly focused on semantic segmentation. It is trained on extensive datasets containing a diverse range of images, ensuring robustness and adaptability to handle various edge cases and complexities. By utilizing a multi-layered neural network architecture, it processes each pixel in an image to classify it as part of the foreground or background.

The model undergoes a rigorous training process, learning to identify different object shapes, fine details and semi-transparent elements in the images. As a result, it can accurately separate foreground subjects from their backgrounds, even in challenging scenarios.
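
To make the per-pixel classification more concrete, here is a minimal sketch of how a foreground score per pixel could be turned into a transparent asset. It does not reproduce the model itself: the mask is assumed to be already available as a grid of scores between 0 and 1, and the function name and data layout are illustrative assumptions.

```python
def apply_foreground_mask(pixels, mask, threshold=None):
    """Combine RGB pixels with per-pixel foreground scores into RGBA pixels.

    pixels: rows of (r, g, b) tuples with values 0-255.
    mask: rows of floats in [0, 1] with the same shape (1.0 = foreground).
    threshold: if given, produce a hard cut-out; otherwise use the score as
        a soft alpha, which keeps semi-transparent edges such as hair.
    """
    if len(pixels) != len(mask):
        raise ValueError("pixels and mask must have the same number of rows")
    result = []
    for pixel_row, mask_row in zip(pixels, mask):
        if len(pixel_row) != len(mask_row):
            raise ValueError("pixels and mask rows must have the same length")
        out_row = []
        for (r, g, b), score in zip(pixel_row, mask_row):
            if threshold is not None:
                alpha = 255 if score >= threshold else 0
            else:
                # Clamp the score, then map it onto the 0-255 alpha range.
                alpha = round(max(0.0, min(1.0, score)) * 255)
            out_row.append((r, g, b, alpha))
        result.append(out_row)
    return result


image = [[(10, 20, 30), (40, 50, 60)]]
scores = [[1.0, 0.0]]
print(apply_foreground_mask(image, scores))
# [[(10, 20, 30, 255), (40, 50, 60, 0)]]
```

A soft alpha is usually the better default for fine details and semi-transparent elements, while a threshold gives a crisp cut-out for hard-edged products.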

Automatic background removal can be a game-changer for many workflows. By transforming product photos, portraits and creative artworks into transparent assets in a matter of seconds, it eliminates any fiddling with tedious image editing software or outsourcing the task to graphic designers. Professional-grade background removal can be achieved effortlessly, saving valuable time and resources.

Use cases

Automatic background removal can prove invaluable for multiple use cases, such as:

E-commerce product catalogs - The model ensures consistent, visually appealing product images that seamlessly blend into any website or marketing material and can streamline any e-commerce business.
Portrait photography - The background remover offers a quick and efficient way to remove distracting backgrounds from portrait shots, enabling better focus on the subject's features and expressions.
Design projects - Designers can explore boundless creative possibilities by easily overlaying graphics, text or new backgrounds, allowing for eye-catching collages, posters, and social media posts.
Presentations and marketing materials - The model supports professional presentations by allowing images to be placed on any background, ensuring a clean and polished look that captivates the audience.
Image localization - The model facilitates localization for global audiences by enabling easy background replacement to suit different cultural contexts and brand aesthetics.

API endpoints

An up-to-date reference with all API endpoints is available here:

Example API responses

Image-to-text

Enhance content management with general-purpose visual and language understanding

Overview

Bridging the gap between visual and textual content is a crucial step in unlocking the full potential of digital assets. The Image-to-text ML model is an advanced solution designed to do just that by providing general-purpose visual and language understanding.

The model leverages state-of-the-art natural language processing and computer vision techniques to facilitate the understanding of images and textual data. When a user submits an image and an accompanying textual prompt (typically in the form of a question regarding the image), the model processes the visual and textual data, identifying objects, context and relationships within the image, and generates a relevant response.

Users can pose a wide range of questions, from object recognition and content analysis to more complex queries related to the image. The output is a properly constructed natural language answer that provides insights or information pertaining to the submitted data.

Our Image-to-text functionality is a versatile tool that gives customers the ability to extract insights, enrich content and enhance the overall management of digital assets.

Typical use cases

The Image-to-text functionality is powerful enough to be applied across a spectrum of industries and domains, such as:

Content tagging - Customers can automatically generate descriptive metadata for images, simplifying the organization and retrieval of digital assets.
E-commerce and product catalogs - E-commerce platforms can utilize the model to answer user queries about product images, providing detailed information and enhancing the shopping experience.
Media and entertainment - Media companies can analyze and describe scenes, characters and objects in images, aiding in content categorization and analysis.
Educational content - Educational institutions can enhance e-learning platforms by automatically generating explanations and descriptions for visual content in course materials.

API endpoints

Information about the specific API endpoints is available in the always up-to-date documentation, which can be accessed via the following link:

There, you can find detailed information about the API endpoints, together with all required request parameters, so you know how to interact with them.

Example API responses

Number plate blurring

Automatic and accurate blurring of license plate numbers in images to protect privacy and comply with data regulations

Overview

License plate blurring is the process of obscuring the license plates of vehicles in images so they become unreadable. With online privacy a growing concern, automatically blurring license plates prevents vehicles from being identified in published images.

The Number plate anonymizer is an advanced ML model, designed to protect privacy and comply with data protection regulations. It efficiently detects vehicle registration plates within images and automatically applies a precise blur filter, rendering the plate numbers and characters illegible, while preserving the integrity of the surrounding content.
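
The blurring step can be sketched independently of the detection step. The example below assumes that a detector has already returned a bounding box for the plate and applies a simple box blur inside it; the function name, the grayscale list-of-rows image format and the choice of box blur are assumptions for illustration, not the filter used by the product.

```python
def blur_region(image, box, radius=2):
    """Return a copy of a grayscale image with a box blur applied inside box.

    image: list of rows of ints (0-255).
    box: (left, top, right, bottom) with right and bottom exclusive.
    radius: how many neighbouring pixels to average in each direction.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    left, top, right, bottom = box
    # Clamp the box to the image so out-of-range detections are safe.
    left, top = max(0, left), max(0, top)
    right, bottom = min(width, right), min(height, bottom)

    blurred = [row[:] for row in image]  # pixels outside the box stay as-is
    for y in range(top, bottom):
        for x in range(left, right):
            total = 0
            count = 0
            # Average only over neighbours inside the box, so surrounding
            # content does not bleed into the blurred area.
            for ny in range(max(top, y - radius), min(bottom, y + radius + 1)):
                for nx in range(max(left, x - radius), min(right, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            blurred[y][x] = total // count
    return blurred


plate_image = [
    [0, 0, 0, 0],
    [0, 0, 255, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
]
for row in blur_region(plate_image, (1, 1, 3, 3), radius=1):
    print(row)
# [0, 0, 0, 0]
# [0, 127, 127, 0]
# [0, 127, 127, 0]
# [0, 0, 0, 0]
```

Reading from the original image and writing to a copy keeps the result independent of the order in which pixels are processed.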

Use cases

Automatic plate blurring can be useful across various scenarios:

Security - The model can be utilized to automatically blur license plates in user images, protecting the privacy of individuals and vehicles by ensuring that sensitive information remains confidential.
Public image galleries - Any visible license plates are automatically blurred, adhering to privacy regulations and protecting personal data.
Vehicle sales and auctions - In online vehicle sales or auction platforms, license plates in vehicle images can be blurred, thus safeguarding the identities of sellers and buyers.
Insurance and claims - Insurance companies can anonymize sensitive information on damaged vehicles in images used for claims processing.

API endpoints

An up-to-date reference with all API endpoints is available here:

Example API responses

Quality improvement

An ML model that efficiently detects and removes compression artifacts, enhancing image quality while preserving vital visual elements

Overview

The Artifact remover is specifically designed to detect and eliminate a wide array of artifacts, primarily caused by heavy lossy compression, ensuring a remarkable enhancement of image quality and the preservation of essential visual elements.

The model is based on deep learning techniques that analyze and learn from vast and diverse datasets of images featuring various compression artifacts. Through this extensive training, the model becomes proficient in recognizing specific patterns and distortions linked to aggressive compression methods.

Once it receives an image, the model efficiently identifies the presence of compression artifacts and applies sophisticated image restoration algorithms to remove or reduce them. This process recovers crucial details, textures, and sharpness, resulting in an image with heightened clarity and visual appeal.
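
For intuition about what a compression artifact looks like numerically, the sketch below computes a hand-written "blockiness" score. It is a simple heuristic, not the deep learning model described above: it assumes that block-based lossy compression leaves visible edges every 8 pixels and compares the average brightness jump across those boundaries with the average jump elsewhere. The function name, the 8-pixel block size and the horizontal-only check are all assumptions.

```python
def blockiness(image, block_size=8):
    """Compare brightness jumps at block boundaries with jumps elsewhere.

    image: list of rows of grayscale ints (0-255).
    Returns a ratio: around 1.0 means boundaries look like the rest of the
    image, while values well above 1.0 suggest visible block edges.
    Only horizontal neighbours are checked to keep the sketch short.
    """
    boundary_total = boundary_count = 0
    interior_total = interior_count = 0
    for row in image:
        for x in range(1, len(row)):
            jump = abs(row[x] - row[x - 1])
            if x % block_size == 0:
                # The step from column x-1 to x crosses a block boundary.
                boundary_total += jump
                boundary_count += 1
            else:
                interior_total += jump
                interior_count += 1
    if boundary_count == 0 or interior_count == 0:
        return 1.0  # image too narrow to compare
    boundary_mean = boundary_total / boundary_count
    interior_mean = interior_total / interior_count
    if interior_mean == 0:
        return float("inf") if boundary_mean > 0 else 1.0
    return boundary_mean / interior_mean


smooth_row = list(range(16))                         # even gradient
blocky_row = list(range(8)) + list(range(20, 28))    # jump at column 8
print(blockiness([smooth_row]))  # 1.0
print(blockiness([blocky_row]))  # 13.0
```

A score like this could be computed before and after restoration to check that block edges have actually been reduced.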

Use cases

Automatic artifact removal and quality improvement find valuable applications in various domains:

E-commerce platforms - In the world of online retail, image quality plays a crucial role in customer engagement and purchasing decisions. The model ensures that product images are of top-notch quality by removing compression artifacts, thus improving the overall shopping experience by delivering visually appealing product showcases.
Digital advertising - High-quality visuals are essential for successful digital advertising campaigns. Captivating ad campaigns with artifact-free images can boost engagement and strengthen the brand message.
Archives and galleries - Historical archives and art galleries often house valuable images that may have undergone degradation due to outdated compression techniques. Restoring such images ensures the preservation of their visual authenticity.
Printing and publishing - In print media, image quality is crucial. By employing the model to remove compression artifacts, publishers can achieve clear, vivid images that resonate with readers and convey their intended message effectively.

API endpoints

An up-to-date reference with all API endpoints is available here:

Example API responses

Text-to-image

Advanced deep learning techniques that generate high-quality and realistic images based on provided text prompts

Overview

The Text-to-image generator is a cutting-edge ML model, designed to generate captivating and realistic images based on provided text prompts. It allows users to bring their textual ideas and concepts to life, creating visually stunning assets that complement their digital content library.

To synthesize images from text descriptions, the model first processes the textual input, understanding the context, objects and settings described. Then, it generates high-resolution, visually coherent images that capture the essence of the text prompt.

Because images are produced directly from a text prompt, users can create visuals without manual design work. This makes the generator a valuable tool for a wide range of creative tasks.

Use cases

Text-to-image generation can be a game-changer in several use cases, including:

Marketing and advertising - Marketers can visualize and prototype advertising campaigns. By converting taglines into images, they can create compelling visuals that resonate with the target audience, boosting engagement and brand recall.
Product prototyping - Product development teams can visualize concepts and design ideas. By describing product features in text, they can rapidly generate images that represent potential product variations and iterate through designs effectively.
Educational content - Textual descriptions of historical events, scientific concepts or literary settings can be transformed into visually engaging images that enhance the experience for users of e-learning platforms.
Content creation - By providing textual prompts for abstract ideas, designers, artists and content creators can generate unique and imaginative visuals that can be used in digital art, illustrations or graphic designs.

API endpoints

An up-to-date reference with all API endpoints is available here: