
AI-driven Techniques for Generating Image Alt Text

Learn how to use modern AI tools to add useful text descriptions to images, making your web content accessible to more people.

As the online world becomes ever more central to our everyday lives, making content accessible to all users remains a priority for digital creators. One important element of accessibility is image alternative (‘alt’) text.

Alt text enables all users, especially people with visual impairments and those on slow internet connections, to engage fully with your content by retaining the context that images provide. Beyond that, high-quality alt text can help improve your search engine rankings, attracting a larger reader base.

Crafting image alt text can be a laborious and time-consuming task. Fortunately, with emerging AI technologies at our disposal, generating rich and descriptive alt text for your images has become easier.

In this article, we’ll look at how you can leverage AI to easily generate alt text for your images and ensure all your readers get the full value from your content.

The Importance of Alt Text

Alt text is an important but often overlooked part of digital accessibility. By describing images, alt text supports accessibility, usability, and compliance, contributing to a more inclusive online experience.

Alt text becomes a voice for visually impaired users, enabling them to perceive and engage with visual content via screen readers. Similarly, users on slow internet connections benefit, as alt text provides context when images fail to load.

In many countries, compliance with web accessibility standards, such as WCAG 2.1, is a legal requirement. Providing text alternatives for images is a WCAG requirement (Success Criterion 1.1.1, Non-text Content), so adding alt text can help you avoid unnecessary legal risks.
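Before generating any alt text, it helps to know which images on a page are missing it. The following is a minimal sketch, not a full accessibility audit: it uses Python's standard html.parser module to list the src of every img tag whose alt attribute is absent or blank. The names MissingAltFinder and find_images_missing_alt are made up for this example. Note that an empty alt attribute is legitimate for purely decorative images, so treat the result as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> whose alt is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # alt is None when the attribute is absent (or has no value);
        # a whitespace-only alt is treated as blank as well.
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src", "(no src)"))


def find_images_missing_alt(html):
    """Return the src values of images that need a closer look."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


if __name__ == "__main__":
    page = '<p><img src="temple.jpg"><img src="camera.jpg" alt="A camera"></p>'
    for src in find_images_missing_alt(page):
        print("Needs alt text:", src)
```

You could run this over your published pages to build a to-do list of images to describe with the tools below.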

Challenges in Creating Alt Text

Despite the importance of alt text, many content creators neglect it. In some cases, creators are simply unaware of how much alt text matters from an accessibility standpoint.

Adding alt text is often the last step in creating an article or blog post. It can be laborious and sometimes mentally taxing, which often results in generic, ineffective alt text that offers minimal value to readers.

A simpler, more streamlined way to generate alt text is essential for improving accessibility for all users.

How to Create Alt Text with AI

In the last few years, we’ve seen a significant increase in the accessibility of AI tools. By leveraging readily available AI image and language tools, creating alt text for images is now easier.

1. Midjourney Describe

Midjourney is a revolutionary generative AI text-to-image tool that creates high-quality imagery from a text input – anything from logos to Disney-style characters to ultra-real stock lifestyle photos.

The Midjourney ‘Describe’ feature is essentially the text-to-image feature in reverse. It lets you upload an image and gives you four distinct prompts that match the image, which makes it a handy starting point for alt text.

Let’s say you wanted to use the image below in an article: 

Aerial view of ancient temple ruins in Rajasthan, India.

You can see how creating a quality alt-text description might be difficult for an image like this. But, not to worry!

Simply copy the image, paste it into Midjourney, and use the ‘Describe’ feature. Midjourney will then output four descriptions of the image:

Text prompts from Midjourney for creating alt text for an image of a temple ruin site.

Text in the picture
1. Old fort of annur hastinapur sunset mountain, india stock photo, in the style of UDH image, columns and totems, panorama, gray and crimson, aerial photography, tamron 24mm f/2.8 di iii osd m1:2, sky-blue and brown –ar 129:85

2. The mounds of ancient jadodhav temple, rajasthan stock photo, in the style of UHD image, apocalypse landscape, paorama, emek golan, flickr, light crimson and gray, birds-eye-view — ar 128:85

3. A view of a temple in the middle of the old city of jaipur, in the style of post-apocalyptic backdrops, expansive landscapes, columns and totems, light sky-blue and dark crimson, birds-eye-view, post-apocalyptic ruins, rural landscapes — ar 128:85

4. An old ruin and taj mahal, ear jaipur, india, stock photo, in the style of post-apocalyptic landscapes, 32l UHD, columns and totems, aerial view, orange and gray, rural landscapes, post-apocalyptic imagery — ar 128:85. 

As you can see, these descriptions are close, but they read like image prompts, with details such as lens names and aspect ratios that don’t belong in alt text. To take it further, I like to use ChatGPT to refine the output. Copy the four prompts from Midjourney and paste them into ChatGPT along with the following prompt:

Prompt: I gathered the following four descriptions of an image using an AI image-to-text tool. I am using the image on my website. Could you please generate an appropriate alt text for the image? Disregard the aspect ratio.

Here is the final output from ChatGPT:

Ancient ruins of an Indian temple with alt text provided beneath.

Text in the picture
Aerial view of the ancient ruins of a temple in Rajasthan, India, under a sunset sky. The image showcases the expansive landscape in light sky-blue and dark crimson hues, resonating with styles of post-apocalyptic landscapes, while highlighting the historic columns and totems of the site.

You can see how much value this description adds for the reader. It would be hard to come up with such a detailed description on your own!
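If you have many images to process, the refinement step above can be prepared in code. The sketch below is only an illustration: it strips the trailing aspect-ratio parameter from each Midjourney description and assembles the same prompt text used in this article, ready to paste into ChatGPT. The function names and the regular expression are assumptions made for this example, and nothing here calls Midjourney or ChatGPT.

```python
import re

# Matches a trailing aspect-ratio parameter such as "--ar 128:85".
# Copied text may use an en dash (\u2013) or em dash (\u2014) instead of hyphens.
ASPECT_RATIO = re.compile(r"\s*[-\u2013\u2014]{1,2}\s*ar\s+\d+:\d+\.?\s*$")

# The prompt wording from this article.
INSTRUCTION = (
    "I gathered the following four descriptions of an image using an AI "
    "image-to-text tool. I am using the image on my website. Could you "
    "please generate an appropriate alt text for the image? "
    "Disregard the aspect ratio."
)


def strip_aspect_ratio(description):
    """Remove a trailing aspect-ratio parameter, if there is one."""
    return ASPECT_RATIO.sub("", description).strip()


def build_alt_text_prompt(descriptions):
    """Combine the instruction with the cleaned, numbered descriptions."""
    cleaned = [strip_aspect_ratio(d) for d in descriptions]
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(cleaned, start=1)
    )
    return f"{INSTRUCTION}\n\n{numbered}"


if __name__ == "__main__":
    sample = [
        "A view of a temple in the middle of the old city --ar 128:85",
        "An old ruin, aerial view, orange and gray --ar 128:85",
    ]
    print(build_alt_text_prompt(sample))
```

Stripping the aspect ratio up front is optional, since the prompt already asks ChatGPT to disregard it, but it keeps the input a little cleaner.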

2. Google Bard

Google hoped Bard would be its breakthrough AI chatbot to rival ChatGPT. After months of hype, Bard launched in early 2023 as an experimental conversational service to enhance Google Search. However, its debut floundered due to relatively poor responses and factual inaccuracies.

Since releasing Bard, Google has added a number of features, including an image-to-text feature. You simply paste your image into a chat with Bard and ask it to provide some alt text. Here’s an example:

A Nikon camera and a bead necklace on a bed

Text in the picture
A camera and a necklace on a bed. The camera is a Nikon DSLR camera, and the necklace is made of beads. The image shows a camera and a necklace sitting on a bed. The camera is a Nikon DSLR camera, and it is black in color. The necklace is made of beads, and it is blue and green in color. The bed is white in color and there is a pillow on the bed.

As you can see, the description isn’t quite perfect, but it’s a great start!
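Output like this tends to repeat itself and run long, so a little tidying helps before you use it. Here is a rough sketch of one way to do that: it drops sentences that repeat word for word and keeps leading sentences until a length budget is reached. The function name tidy_alt_text and the default of 125 characters are assumptions for this example; the limit is a common rule of thumb rather than a WCAG rule, and you should still read the result before publishing it.

```python
import re


def tidy_alt_text(text, max_chars=125):
    """Keep leading, non-repeated sentences within a length budget.

    The first sentence is always kept, even if it is longer than
    max_chars, so the result is never empty for non-empty input.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    seen = set()
    kept = []
    for sentence in sentences:
        key = sentence.lower()
        if not sentence or key in seen:
            continue  # skip blanks and exact repeats
        seen.add(key)
        candidate = " ".join(kept + [sentence])
        if kept and len(candidate) > max_chars:
            break  # adding this sentence would exceed the budget
        kept.append(sentence)
    return " ".join(kept)


if __name__ == "__main__":
    draft = "A camera on a bed. A camera on a bed. The camera is black."
    print(tidy_alt_text(draft))
```

This only catches exact repeats, so near-duplicates such as reworded sentences still need a human edit.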

Conclusion

AI-based tools offer tremendous potential for creating high-quality image alt text, enhancing the accessibility and discoverability of digital content.

By automating this often laborious task, creators can focus more on their core content while ensuring a seamless, inclusive experience for all users.

As technology evolves, we can anticipate further advancements in AI-driven alt-text generation, contributing to a more accessible and inclusive digital world.

Considering Google’s ability to analyze images automatically, we might even see this process automated in years to come.

Author's Bio

Matt Duffin is the Founder of rareconnections.io. Combining his background in Mechanical Engineering with a passion for tech, Matt utilizes his expertise to help others leverage AI technology and tools.
