In the race for dominance in artificial intelligence, technology giants are crossing ethical lines and testing the limits of public trust.
Recent reports have revealed a pattern of behavior that is raising alarm bells about data protection, fair competition and the concentration of power within the technology industry.
First, an investigation by Proof News and WIRED revealed that Apple, NVIDIA, Anthropic and Salesforce used a dataset of subtitles from over 170,000 YouTube videos to train their AI models.
This dataset, known as “YouTube Subtitles,” was compiled without the consent of the content creators and may violate YouTube’s Terms of Service.
The scale of this data mining operation is staggering, encompassing content from educational institutions like Harvard, popular YouTubers like MrBeast and PewDiePie, and even major news outlets such as The Wall Street Journal and the BBC.
YouTube has not yet responded, but back in April CEO Neal Mohan said that OpenAI's potential use of videos to train its Sora text-to-video model would violate the terms of use, telling Bloomberg: “If Sora were to make use of content from YouTube, it might be a 'clear violation' of its terms of service.”
OpenAI is not among the accused companies this time, and it remains unclear whether YouTube will take action if the new allegations are proven true.
This is not the first time that technology companies have come under fire.
In 2018, Facebook came under intense scrutiny over the Cambridge Analytica scandal, in which the data of tens of millions of users was collected for political advertising without their consent.
In 2023, it emerged that a dataset named Books3, which contains over 180,000 copyrighted books, was used to train AI models without the authors' permission. This led to a wave of lawsuits against AI companies, with the authors claiming copyright infringement.
This is only one example from a growing pile of lawsuits from all corners of the creative industries. Universal Music Group, Sony Music and Warner Records recently added their names to the list.
In their rush to develop more advanced AI models, technology companies appear to be taking an “ask forgiveness, not permission” approach to data collection.
The merger of Microsoft and Inflection
As the YouTube scandal unfolds, Microsoft's recent hiring spree at AI startup Inflection has caught the eye of UK regulators.
The Competition and Markets Authority (CMA) has launched a phase one merger investigation to examine whether this mass hiring constitutes a de facto merger that could hamper competition in the AI sector.
Microsoft's move, which included hiring Inflection co-founder Mustafa Suleyman (a former Google DeepMind executive) and a significant portion of the startup's workforce, was swift and drastic.
It carries additional weight when you consider Microsoft’s existing partnerships in the AI space. The company has already invested a total of around $13 billion in OpenAI, raising questions about market concentration.
It also came after Microsoft gave up its non-voting seat at OpenAI. According to experts, the withdrawal was likely a self-imposed restriction intended to appease regulators.
Alex Haffner, partner in antitrust law at Fladgate, said: “One inevitably concludes that Microsoft's decision was heavily influenced by ongoing competition and antitrust scrutiny of its influence (and the influence of other major technology corporations) on emerging AI corporations like OpenAI.”
Critics and regulators alike warn that this could create an oligopoly in the AI sector, potentially hampering innovation and limiting consumer choice.
A trust deficit?
Both the YouTube data mining scandal and Microsoft’s hiring practices contribute to a growing trust deficit between Big Tech and the general public.
Fearing exploitation, content creators have become more cautious about their work.
This could have a ripple effect on content creation and sharing, ultimately weakening the very platforms whose data is critical to tech companies.
Similarly, the concentration of AI talent in just a few large companies risks homogenizing AI development and limiting diversity.
To restore trust, technology companies will likely need to do more than simply comply with future regulations and antitrust investigations.
The question remains: Can we harness the potential of AI while maintaining ethics, fair competition and public trust?