ChatGPT and other language AIs are nothing without people - a sociologist explains how countless hidden people create the magic

March 11, 2024


The media hype surrounding ChatGPT and other large language model artificial intelligence systems covers a range of themes, from the prosaic – large language models could replace traditional web search – to the worrisome – AI will eliminate many jobs – and the overwrought – AI poses an extinction-level threat to humanity. All of these themes have a common denominator: large language models herald artificial intelligence that will supersede humanity.

But large language models, for all their complexity, are actually really dumb. And despite the name “artificial intelligence,” they are completely dependent on human knowledge and human labor. They can't reliably generate new knowledge, of course, but there's more to it than that.

ChatGPT cannot learn, improve, or even stay up to date without people giving it new content and telling it how to interpret that content, not to mention programming the model and building, maintaining, and powering its hardware. To understand why, you first need to understand how ChatGPT and similar models work, and the role humans play in making them work.

This is how ChatGPT works

Large language models like ChatGPT work, broadly, by predicting which characters, words, and sentences should follow one another in sequence, based on training data sets. In the case of ChatGPT, the training data set contains immense quantities of public text scraped from the internet.

ChatGPT works by statistics, not by understanding words.

Imagine I trained a language model on the following set of sentences:

Bears are large, furry animals. Bears have claws. Bears are secretly robots. Bears have noses. Bears are secretly robots. Bears sometimes eat fish. Bears are secretly robots.

The model would be more inclined to tell me that bears are secretly robots than anything else, because that sequence of words appears most frequently in its training data set. This is obviously a problem for models trained on fallible and inconsistent data sets – which is all of them, even those trained on academic literature.
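The bears example can be sketched in a few lines of Python. This is only a toy, assuming the simplest possible "model": a count of which word follows which, with the most frequent follower always chosen. It is not how ChatGPT is actually built, and the names `train_bigrams`, `generate`, and the `<end>` marker are made up for this illustration.

```python
from collections import Counter, defaultdict

# The seven training sentences from the example above.
TRAINING_TEXT = (
    "Bears are large, furry animals. Bears have claws. "
    "Bears are secretly robots. Bears have noses. "
    "Bears are secretly robots. Bears sometimes eat fish. "
    "Bears are secretly robots."
)

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for sentence in text.split("."):
        words = sentence.split()
        if not words:
            continue
        # Pair each word with its successor; the last word is followed by <end>.
        for current, nxt in zip(words, words[1:] + ["<end>"]):
            follows[current][nxt] += 1
    return follows

def generate(follows, start, max_words=10):
    """Always pick the most frequent next word (no understanding involved)."""
    words = [start]
    while len(words) < max_words:
        best = follows[words[-1]].most_common(1)
        if not best or best[0][0] == "<end>":
            break
        words.append(best[0][0])
    return " ".join(words)

model = train_bigrams(TRAINING_TEXT)
sentence = generate(model, "Bears")
print(sentence)  # Bears are secretly robots
```

The counter has no way of knowing that the most frequent sentence is also the false one. It only knows what it has seen most often.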

People write lots of different things about quantum physics, Joe Biden, healthy eating, or the Jan. 6 insurrection, some of which are more accurate than others. How is the model supposed to know what to say about something when people say lots of different things?

The need for feedback

This is where feedback comes in. If you use ChatGPT, you'll notice that you have the option to rate answers as good or bad. If you rate an answer as bad, you'll be asked to provide an example of what a good answer would contain. ChatGPT and other large language models learn which answers, and which predicted sequences of text, are good and bad through feedback from users, the development team, and contractors hired to label the output.

ChatGPT cannot compare, analyze, or evaluate arguments or information on its own. It can only generate sequences of text similar to those that other people have used when comparing, analyzing, or evaluating, and it prefers sequences similar to those it has been told were good answers in the past.
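As a rough sketch of why the ratings matter, the toy below combines how often a continuation appeared in the training text with human thumbs-up and thumbs-down votes. The scoring rule, the `rerank` function, the `weight` parameter, and the vote counts are all invented for illustration; the way real systems incorporate feedback is far more elaborate.

```python
def rerank(frequencies, feedback, weight=2.0):
    """Order candidate answers by training frequency adjusted by human votes.

    frequencies: answer -> how often it appeared in the training text
    feedback:    answer -> (thumbs_up, thumbs_down) counts from human raters
    weight:      hypothetical knob for how much one vote counts
    """
    scores = {}
    for answer, count in frequencies.items():
        ups, downs = feedback.get(answer, (0, 0))
        scores[answer] = count + weight * (ups - downs)
    return sorted(scores, key=scores.get, reverse=True)

# How often each continuation of "Bears ..." appeared in the toy training set.
frequencies = {
    "are secretly robots": 3,
    "are large, furry animals": 1,
    "have claws": 1,
}

# Made-up ratings: people flag the robot claim and approve the accurate one.
feedback = {
    "are secretly robots": (0, 2),
    "are large, furry animals": (3, 0),
}

ranked = rerank(frequencies, feedback)
print(ranked[0])  # are large, furry animals
```

Without the `feedback` dictionary, the most frequent continuation wins by default. Every entry in it stands for a judgment that a person had to make.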

So when the model gives you a good answer, it is drawing on a large amount of human labor that has already gone into telling it what is and isn't a good answer. There are many, many human workers hidden behind the screen, and they will always be needed if the model is to keep improving or to expand its content coverage.

A recent investigation published by journalists in Time magazine revealed that hundreds of Kenyan workers spent thousands of hours reading and labeling racist, sexist, and disturbing text, including graphic descriptions of sexual violence, from the darkest depths of the internet to teach ChatGPT not to repeat such content. They were paid no more than $2 an hour, and many understandably reported suffering psychological distress because of this work.

Language AIs require humans to tell them what makes a good answer – and what makes toxic content.

What ChatGPT can't do

The importance of feedback can be seen directly in ChatGPT's tendency to “hallucinate”; that is, to confidently give inaccurate answers. ChatGPT cannot give good answers on a topic without training, even if good information about that topic is widely available on the internet. You can try this yourself by asking ChatGPT about more and less obscure things. I have found it particularly effective to ask ChatGPT to summarize the plots of various works of fiction, because the model appears to have been trained more rigorously on nonfiction than on fiction.

In my own testing, ChatGPT summarized the plot of J.R.R. Tolkien's “The Lord of the Rings,” a very famous novel, with only a few errors. But its summaries of Gilbert and Sullivan's “The Pirates of Penzance” and of Ursula K. Le Guin's “The Left Hand of Darkness” – both slightly more niche, but far from obscure – come close to playing Mad Libs with the character and place names. It doesn't matter how good the respective Wikipedia pages for these works are. The model needs feedback, not just content.

Because large language models don't actually understand or evaluate information, they depend on humans to do that for them. They draw on human knowledge and labor. When new sources are added to their training data sets, they need new training on whether and how to build sentences based on those sources.

They cannot judge whether news reports are accurate or not. They cannot evaluate arguments or weigh trade-offs. They can't even read an encyclopedia page and make only statements consistent with it, or accurately summarize the plot of a movie. They depend on people to do all of these things for them.

Then they paraphrase and remix what people have said, relying on yet more people to tell them whether they paraphrased and remixed it well. If the common wisdom on a subject changes – for instance, whether salt is bad for your heart or whether early breast cancer screenings are useful – they will need to be extensively retrained to incorporate the new consensus.

Many people behind the curtain

In short, far from being harbingers of fully independent AI, large language models illustrate the total dependence of many AI systems, not only on their designers and maintainers but also on their users. So when ChatGPT gives you a good or useful answer to something, remember to thank the thousands or millions of hidden people who wrote the words it drew on and who taught it what good and bad answers are.

Far from being an autonomous superintelligence, ChatGPT, like all technologies, is nothing without us.
