If 2023 was the year of generative AI-powered chatbots and search, 2024 was all about AI agents. What began with Devin earlier this year grew into a full-blown phenomenon, offering enterprises and individuals a way to transform how they work at various levels, from programming and development to personal tasks like planning and booking tickets for a vacation.
Among these wide-ranging applications, this year also saw the rise of data agents: AI-powered agents that handle different types of tasks across the data infrastructure stack. Some did basic data integration work, while others handled downstream tasks such as analysis and management within the pipeline, making the job simpler and easier for enterprise users.
The benefits included improved efficiency and cost savings, leading many to ask: How will things change for data teams in the coming years?
Gen AI agents took over data tasks
While agentic capabilities that let enterprises automate certain basic tasks have been around for a while, the rise of generative AI has taken things to the next level.
With gen AI's natural language processing and tool use capabilities, agents can go beyond simple reasoning and answering to actually plan multi-step actions. They can independently interact with digital systems to complete actions while collaborating with other agents and people. They also learn to improve their performance over time.
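To make the idea of planning and executing multi-step actions concrete, here is a minimal Python sketch. It is purely illustrative: the tool names, the `plan` function and the `run_agent` loop are hypothetical, and the planning step that a gen AI model would normally perform is replaced by a hard-coded stub.

```python
from typing import Callable

# Hypothetical tool registry: each tool is a plain function the agent may call.
# A real agent would wrap actual systems (APIs, databases, booking services).
TOOLS: dict[str, Callable[[str], str]] = {
    "search_flights": lambda query: f"3 flights found for {query}",
    "book_ticket": lambda query: f"ticket booked for {query}",
}


def plan(goal: str) -> list[tuple[str, str]]:
    """Stand-in for the model's planning step: returns (tool, argument) pairs.

    Assumption: in practice a language model would produce this plan.
    """
    return [("search_flights", goal), ("book_ticket", goal)]


def run_agent(goal: str) -> list[str]:
    """Execute each planned step in order and collect the observations."""
    observations = []
    for tool_name, argument in plan(goal):
        tool = TOOLS[tool_name]  # look up the tool the plan asked for
        observations.append(tool(argument))
    return observations


if __name__ == "__main__":
    for line in run_agent("Lisbon in May"):
        print(line)
```

The sketch leaves out what makes real agents hard to build, such as re-planning after a failed step, collaborating with other agents and learning from past runs.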
Cognition AI's Devin was the first major agentic offering to enable engineering operations at scale. Then larger providers began offering more targeted enterprise and personal agents based on their models.
Speaking to VentureBeat earlier this year, Google Cloud's Gerrit Kazmaier said he had heard from customers that their data practitioners constantly face challenges, including automating manual work for data teams, reducing the cycle time of data pipelines and analytics, and simplifying data management. Essentially, teams did not lack ideas about how to create value from their data, but they lacked the time to implement those ideas.
To address this, Kazmaier explained, Google revamped BigQuery, its core data infrastructure offering, with Gemini AI. The resulting agentic capabilities not only give organizations the ability to discover, cleanse and prepare data for downstream applications, breaking down data silos and ensuring quality and consistency, but also support pipeline management and analysis, freeing teams to focus on higher-value tasks.
Today, many enterprises use Gemini's agentic capabilities in BigQuery, including a fintech firm that leveraged Gemini's ability to understand complex data structures to automate its query generation process. Japanese IT company Tireless also uses Gemini's SQL generation capabilities in BigQuery to help its data teams deliver insights faster.
But discovering, preparing and assisting with analysis was only the start. As the underlying models evolved, even granular data operations, pioneered by startups specializing in their respective domains, were targeted with deeper agent-driven automation.
For example, Airbyte and Fastn made headlines in the data integration category. The former launched a wizard that created data connectors from an API documentation link in seconds. Meanwhile, the latter expanded its broader application development offering with agents that generated enterprise-grade APIs, whether for reading or writing information on any topic, using only a natural language description.
San Francisco-based Altimate AI, in turn, targeted various data operations, including documentation, testing and transformations, with a new DataMates technology that used agentic AI to pull context from the entire data stack. Several other startups, including Redbird and RapidCanvas, were also working in the same direction, claiming to offer AI agents that can handle up to 90% of the data tasks required in AI and analytics pipelines.
Agents supporting RAG and more
Beyond broad data operations, agentic capabilities have also been explored in areas such as retrieval-augmented generation (RAG) and downstream workflow automation. For example, the team behind the vector database Weaviate recently discussed the idea of agentic RAG, a process that lets AI agents access a wide range of tools, such as web search, a calculator or a software API (like Slack, Gmail or a CRM), to retrieve data from multiple sources and validate it to improve the accuracy of answers.
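The retrieve-from-several-tools-then-validate pattern can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any vendor's implementation: the two retrieval tools return canned strings, `choose_tools` stands in for a routing decision a language model would normally make, and `validate` uses a crude keyword overlap in place of a real relevance check.

```python
# Hypothetical retrieval tools. Real ones would query a vector database,
# a web search API or a SaaS API; here they return canned passages.
def search_knowledge_base(query: str) -> list[str]:
    return ["Q3 revenue grew 12% year over year.", "The office closes at 6 pm."]


def search_web(query: str) -> list[str]:
    return ["Analysts expected Q3 revenue growth of 10%."]


TOOLS = {"knowledge_base": search_knowledge_base, "web": search_web}


def choose_tools(query: str) -> list[str]:
    """Stand-in for the agent's routing decision (an LLM call in practice)."""
    names = ["knowledge_base"]
    lowered = query.lower()
    if "expected" in lowered or "analyst" in lowered:
        names.append("web")  # outside context is needed for this question
    return names


def normalize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {word.strip("?.,").lower() for word in text.split()}


def validate(query: str, passages: list[str]) -> list[str]:
    """Keep only passages sharing a query word longer than three characters."""
    keywords = {word for word in normalize(query) if len(word) > 3}
    return [p for p in passages if keywords & normalize(p)]


def agentic_retrieve(query: str) -> list[str]:
    """Gather passages from every chosen tool, then filter out irrelevant ones."""
    passages: list[str] = []
    for name in choose_tools(query):
        passages.extend(TOOLS[name](query))
    return validate(query, passages)


if __name__ == "__main__":
    print(agentic_retrieve("How did Q3 revenue compare with what analysts expected?"))
```

In a production setup, both the routing and the validation would typically be model-driven, and the validated passages would be passed to the generator as context for the final answer.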
Snowflake Intelligence was also released towards the end of the year, giving enterprises the ability to set up data agents that can tap not only business intelligence data stored in their Snowflake instance, but also structured and unstructured data across siloed third-party tools, such as sales transactions in a database, documents in knowledge bases like SharePoint, and data in productivity tools like Slack, Salesforce and Google Workspace.
With this additional context, the agents surface relevant insights in response to natural language questions and take specific actions around the insights generated. For example, a user could ask their data agent to enter the surfaced insights into an editable form and upload the file to their Google Drive. The agent could even be asked to write to Snowflake tables and make data modifications as needed.
There is much more to come
While we may not have covered every data agent application seen or announced this year, one thing is pretty clear: the technology is here to stay. As generative AI models continue to evolve, the adoption of AI agents will continue at a rapid pace, with most enterprises, regardless of their industry or size, opting to delegate repetitive tasks to specialized agents. This will translate directly into increased efficiency.
As evidence, in a recent survey of 1,100 tech executives conducted by Capgemini, 82% of respondents said they plan to integrate AI-based agents into their stacks within the next three years, up from 10% currently. More importantly, 70 to 75% of respondents said they would trust an AI agent to analyze and synthesize data on their behalf, as well as perform tasks such as generating and iteratively improving code.
This agent-driven shift would also mean significant changes to how data teams function. Currently, agents' output is not production-grade, meaning that at some point a human has to take over and adapt the work to their needs. However, with further advances in the coming years, this gap is likely to close, giving teams AI agents that are faster, more accurate and less prone to the mistakes humans usually make.
In summary, the roles of data scientists and analysts we see today are likely to change, with users potentially moving into AI oversight (where they keep an eye on the AI's actions) or higher-level tasks that the system may have difficulty performing.