
AI systems and people “see” the world differently – and that’s why AI images look so vibrant

How do computers see the world? Not quite the way people do.

Recent advances in generative artificial intelligence (AI) make it possible to do far more with computer vision. For example, you can ask an AI tool to describe an image, or to create an image from a description you provide.

As generative AI tools and services become more integrated into everyday life, it is increasingly important to understand how computer vision compares to human vision.

My latest research, published in Visual Communication, uses AI-generated descriptions and images to get a sense of how AI models “see” – and found a vibrant, sensational world of generic imagery that is markedly different from the human visual realm.

Algorithms see things very differently than people do.
Elise Racine / Better Images from AI / Emotion: Joy, CC BY

Comparison of human and computer vision

Humans see when light enters our eyes through the cornea, iris and lens. A light-sensitive surface at the back of the eyeball, called the retina, converts the light into electrical signals, and our brains then interpret these signals into the images we see.

Our vision focuses on key elements such as color, shape, movement and depth. With our eyes we can detect changes in the environment and identify potential threats and dangers.

Computers work completely differently. They process images by standardizing them, inferring context from an image's metadata (such as the time and location stored in the image file), and comparing images with other images they have previously encountered. Computers focus on features such as edges, corners or textures present in the image. They also look for patterns and try to classify objects.
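To make that concrete, here is a minimal sketch of the kind of low-level features computer vision works with, assuming the OpenCV (cv2) and NumPy libraries are installed; the file name photo.jpg is a placeholder, not an image from the study.

    # A minimal sketch of low-level feature extraction, assuming OpenCV and NumPy
    # are installed. "photo.jpg" is a placeholder path, not study data.
    import cv2
    import numpy as np

    image = cv2.imread("photo.jpg")                    # load the image as a pixel array
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)     # standardize to a single channel

    # Edge map via the Canny detector, and the strongest corner points.
    edges = cv2.Canny(gray, threshold1=100, threshold2=200)
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=50,
                                      qualityLevel=0.01, minDistance=10)

    print("Edge pixels:", np.count_nonzero(edges))
    print("Corners found:", 0 if corners is None else len(corners))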

A screenshot of a CAPTCHA test that asks a user to select all images with a bus.
Solving CAPTCHAs helps prove you are human and also helps computers learn to “see.”
CAPTCHA

You have probably helped computers learn to “see” by completing CAPTCHA tests online.

These are typically used to help differentiate between humans and bots. But they are also used to train and improve machine learning algorithms.

So when you are asked to “select all images with a bus,” you are proving you are human while also helping the software learn the difference between different types of vehicles.

Exploring how computers “see” differently

In my latest research, I asked a large language model to describe two visually distinct sets of human-generated images.

One set contained hand-drawn illustrations, while the other consisted of photographs taken on camera.

A screenshot of several thumbnails, some illustrations, and some photos.
Some of the nuances of algorithmic vision can be revealed by asking an AI tool to describe images and then visualizing those descriptions.
TJ Thomson, Author provided (no reuse)

I then fed those descriptions back into an AI tool and asked it to visualize what it had described. Finally, I compared the original human-made images with the computer-generated ones.
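For readers who want to try a similar loop themselves, here is a minimal sketch of a describe-then-regenerate pipeline, assuming the OpenAI Python SDK and an API key in the environment; the model names and file path are placeholders, and this is not the exact workflow used in the study.

    # A minimal describe-then-regenerate sketch, assuming the OpenAI Python SDK
    # and an OPENAI_API_KEY in the environment. Model names and file paths are
    # placeholders; this is not the study's exact pipeline.
    import base64
    from openai import OpenAI

    client = OpenAI()

    def describe_image(path: str) -> str:
        """Ask a multimodal model to describe a local image."""
        with open(path, "rb") as f:
            encoded = base64.b64encode(f.read()).decode("utf-8")
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this image in detail."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
                ],
            }],
        )
        return response.choices[0].message.content

    def regenerate_from_description(description: str) -> str:
        """Ask an image model to visualize the description; returns an image URL."""
        result = client.images.generate(model="dall-e-3", prompt=description)
        return result.data[0].url

    description = describe_image("source_image.jpg")      # placeholder path
    image_url = regenerate_from_description(description)  # compare with the original by eye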

The resulting descriptions noted that the hand-drawn images were illustrations, but the descriptions of the other set did not mention that those images were photographs or had a high level of realism. This suggests AI tools treat photorealism as the default visual style unless another style is explicitly requested.

Cultural context was also largely missing from the descriptions. For example, the AI tool could not, or would not, draw conclusions about cultural context from the presence of Arabic or Hebrew script in the photographs. This highlights the dominance of certain languages, such as English, in the training data of AI tools.

Although color is central to human vision, it was largely absent from the AI tools' image descriptions. Visual depth and perspective were also largely ignored.

The AI images were boxier than the hand-drawn illustrations, which used more organic shapes.

Two similar but different black and white illustrations of a bookcase on wheels.
The AI-generated images were much boxier than the hand-drawn illustrations, which used more organic shapes and had a different ratio of positive to negative space.
Left: Cruz Medar; right: ChatGPT

The AI images were also far more saturated than the source images: they contained brighter, more vibrant colors. This highlights the prevalence of stock photos, which tend to be higher in contrast, in AI tools' training data.
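If you want a rough way to check the saturation difference on your own image pairs, here is a small sketch assuming the Pillow and NumPy libraries are installed; the file names are placeholders, not the study's data.

    # A rough comparison of average color saturation between two images, assuming
    # Pillow and NumPy. The file paths are placeholders.
    import numpy as np
    from PIL import Image

    def mean_saturation(path: str) -> float:
        """Average saturation (0-255) of an image, via the HSV color space."""
        hsv = Image.open(path).convert("HSV")
        saturation = np.asarray(hsv)[:, :, 1]   # channel 1 of HSV is saturation
        return float(saturation.mean())

    print("Human-made:", mean_saturation("original.jpg"))
    print("AI-generated:", mean_saturation("ai_version.jpg"))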

The AI images were also more sensational. A single vehicle in the original image became part of a long motorcade in the AI version. AI seems to exaggerate details not only in text but also in visual form.

A photo of people with guns driving through a desert and a generated photorealistic image of several cars containing people with guns driving through a desert.
The images generated by the AI were more sensational and richer in contrast than the photos created by humans.
Left: Ahmed Zakot; right: ChatGPT

Because AI images are so generic, they can be used in many contexts and across countries. But the lack of specificity also means audiences may perceive them as less authentic and less engaging.

Deciding when to use human or machine vision

This research supports the idea that people and computers “see” differently. Knowing when to rely on computer vision or human vision to describe or create images can be a competitive advantage.

While AI-generated images can be striking, they may seem hollow upon closer inspection. This can limit their value.

Images can trigger an emotional response, and audiences may find human-made images that authentically reflect specific conditions more appealing than computer-generated alternatives.

However, AI's capabilities can make it an attractive option for quickly describing large datasets and helping people categorize them.

Ultimately, both human and AI vision have a role to play. Knowing more about each one's capabilities and limitations can help you be safer, more productive and better equipped to communicate in the digital age.
