
New study attempts to align AI with crowdsourced human values

Researchers from the Meaning Alignment Institute have proposed a new approach, Moral Graph Elicitation (MGE), to align AI systems with human values.

As AI becomes more advanced and integrated into our daily lives, ensuring it serves and represents everyone fairly is paramount.

The study argues that aligning AI systems solely with users’ goals or operator intent is insufficient to ensure positive outcomes.

They state, “AI systems might be deployed in contexts where blind adherence to operator intent could cause harm as a byproduct. This may be seen most clearly in environments with competitive dynamics, like political campaigns or managing financial assets.”


To address this issue, the researchers propose aligning AI with a deeper understanding of human values.

The MGE method has two key components: value cards and the moral graph. These form an alignment goal for training machine learning models.

  • Value cards capture what is important to a person in a particular situation. They consist of “constitutive attentional policies” (CAPs), the things a person pays attention to when making a meaningful choice. For instance, when advising a friend, one might focus on understanding their emotions, suggesting helpful resources, or considering the potential outcomes of different choices.
  • The moral graph visually represents the relationships between value cards, indicating which values are more insightful or applicable in a given context. To construct the moral graph, participants compare different value cards, discerning which ones they believe offer wiser guidance for a particular situation. This harnesses the collective wisdom of participants to identify the strongest and most widely recognized values for each context (see the sketch after this list).
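To make the two components concrete, here is a minimal Python sketch. This is not the authors’ implementation: the class names, the example values, and the simple vote tally used to pick the “wisest” card are all illustrative assumptions; the paper’s aggregation is more involved.

```python
# Sketch of the two MGE components: value cards holding constitutive
# attentional policies (CAPs), and a moral graph of "wiser than" edges
# elicited from participant comparisons. Names are hypothetical.
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ValueCard:
    title: str
    # CAPs: what someone attends to when making a meaningful
    # choice guided by this value.
    caps: tuple[str, ...]


@dataclass
class MoralGraph:
    # edges[context] collects (wiser_card, less_wise_card) judgments
    # gathered from participant comparisons in that context.
    edges: dict[str, list[tuple[ValueCard, ValueCard]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def add_judgment(self, context: str, wiser: ValueCard, other: ValueCard) -> None:
        self.edges[context].append((wiser, other))

    def wisest(self, context: str) -> ValueCard:
        # Rank cards by how often participants judged them wiser.
        # (A naive tally; stands in for the paper's aggregation.)
        score: dict[ValueCard, int] = defaultdict(int)
        for wiser, _ in self.edges[context]:
            score[wiser] += 1
        return max(score, key=score.get)


# Hypothetical example: two values someone might hold when advising a friend.
empathy = ValueCard("Empathic presence", ("their emotions", "what they left unsaid"))
pragmatism = ValueCard("Practical help", ("helpful resources", "outcomes of choices"))

graph = MoralGraph()
graph.add_judgment("advising a friend", wiser=empathy, other=pragmatism)
print(graph.wisest("advising a friend").title)  # -> Empathic presence
```

The key design point the sketch tries to capture is that the graph is context-dependent: the same pair of values can be compared separately in different situations, so a value judged wiser in one context need not win in another.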

To test the MGE method, the researchers conducted a study with 500 Americans who used the method to explore three controversial topics: abortion, parenting, and the weapons used in the January 6th Capitol riot.

The results were promising, with 89.1% of participants feeling well-represented by the process and 89% thinking the final moral graph was fair, even if their own value wasn’t voted the wisest.

The study also outlines six criteria an alignment target must meet to shape model behavior in line with human values: it must be fine-grained, generalizable, scalable, robust, legitimate, and auditable. The researchers argue that the moral graph produced by MGE performs well on all six.

The approach is similar to Anthropic’s Collective Constitutional AI, which also crowdsources values for AI alignment.


Limitations

There are limitations to AI alignment approaches that crowdsource values from the public.

For example, dissenting views have been integral to societal decision-making for centuries, and history shows that the majority often comes to adopt viewpoints that began with a minority. Examples include Darwin’s theory of evolution and the struggles to abolish slavery and grant women the right to vote.

While direct public input is democratic, it can result in populism, where the majority could override minority rights or disregard expert advice.

Ensuring that the voices of marginalized groups are heard and acted upon is a critical challenge.

Moreover, involving the public in decision-making is crucial, but it risks oversimplifying complex issues. Striking a balance between global principles and local nuances is another challenge, as a principle widely accepted in one culture or region may be controversial in another.

AI constitutions could also reinforce Western values, potentially eroding the views and ideas of those on the periphery.

While the study acknowledges its limitations and the need for further development, it offers another strategy for creating AI systems that align with human values. Every attempt counts.
