OpenAI has reached an agreement with Reddit to use the social news site's data to train AI models.
In a blog post on OpenAI's press page, the company explained that the Reddit partnership gives it access to “structured and unique real-time content” from Reddit, such as posts and replies, so that its tools and models can “better understand and present” this content. Reddit content will be integrated into ChatGPT, OpenAI's popular conversational AI, and the companies will work together to bring unspecified new “AI-powered features” to both Reddit users and moderators.
OpenAI will also become a Reddit advertising partner.
“Reddit will build on OpenAI’s AI modeling platform to bring its powerful vision to life,” OpenAI wrote in the post. “By leveraging LLMs, ML, and AI, Reddit can improve the user experience for everybody.”
OpenAI has entered into several similar licensing agreements with content providers ranging from media libraries to news publishers. But what is unusual about this one is that Sam Altman, CEO of OpenAI, has an 8.7% stake in Reddit, making him the third-largest shareholder, and was once a member of the company's board of directors.
To head off further scrutiny, OpenAI says in its press release that while Altman remains a Reddit shareholder, the partnership was “led by OpenAI's COO (Brad Lightcap)” and “approved by (OpenAI's) independent board.” (I should note here that Altman is a member of OpenAI's board; however, he recused himself from this decision, an OpenAI spokesperson tells TechCrunch.)
Reddit has made data licensing agreements an increasingly central part of its growth strategy as it enters the market as a publicly traded company.
In its IPO prospectus, Reddit disclosed that it has contractual agreements to license its data to customers including Google, with a total value of over $200 million. And in its first earnings report as a public company, Reddit reported a 450% year-over-year increase in non-advertising revenue, largely thanks to these agreements.
Reddit shares rose 11% in extended trading following the OpenAI deal announcement.
“The paradox I see is that as more content on the web is written by machines, there is an increasing premium on content that comes from real people,” Reddit CEO Steve Huffman said during the company’s March earnings call. “And now we have almost twenty years of authentic conversations.”
Reddit's platform has over 1 billion posts and more than 16 billion comments, numbers that are growing by the day thanks to its hundreds of millions of active users. That makes it a goldmine for generative AI companies, whose models learn from examples of content such as text and images to generate new, similar content.
But the company could face resistance from users concerned about how it monetizes their data.
It's instructive to look at Stack Overflow, the question-and-answer forum for software developers, which recently signed an agreement with OpenAI to provide data for its model training. In protest, some users deleted their top-rated answers to questions from the community. But Stack Overflow restored the deleted posts and banned those users, saying they had not complied with the terms of service.
Reddit has already expressed its displeasure with an attempt to give Reddit users more control over their own data.
Vana, a blockchain-based startup, is attempting to launch a data “DAO” (decentralized autonomous organization) to give Reddit users the ability to pool their data and collectively decide how that combined data is used (or sold). Reddit banned Vana's subreddit dedicated to discussing the DAO and, in a statement to TechCrunch, accused the company of “exploiting” its data export controls.