March 17, 2024: US-based startup Cognition introduced Devin, an AI-powered tool the company claims is the “world’s first fully autonomous AI software engineer.”
Devin is designed to resolve engineering tasks independently using its own shell, code editor, and web browser.
According to demonstrations provided by Cognition, Devin can utilize its web browser to access and learn from API documentation, enabling it to plug into various APIs.
When the AI agent encounters an error, it automatically adds a debugging print statement to the relevant code in its code editor and reruns the code.
Cognition has showcased Devin’s capabilities in building and deploying apps, identifying and fixing bugs in codebases, and even fine-tuning AI models.
To assess Devin’s accuracy, Cognition tested the AI agent on SWE-bench, a benchmarking platform that challenges agents to resolve real-world issues present in open-source projects on GitHub.
Devin successfully resolved 13.86% of the problems end-to-end, surpassing the performance of GPT-4 (1.74%) and the previous best result, held by Anthropic’s Claude 2 (4.80%).
Notably, Devin achieved this without assistance in locating the relevant files within the repository.
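To put the reported figures in perspective, the short Python sketch below compares the three resolve rates quoted above. It uses only the percentages cited in this article; the names `resolve_rates` and `relative_gain` are illustrative and do not come from Cognition or SWE-bench.

```python
# Share of SWE-bench issues resolved end-to-end, in percent,
# exactly as quoted in the article above.
resolve_rates = {
    "Devin": 13.86,
    "Claude 2": 4.80,
    "GPT-4": 1.74,
}


def relative_gain(rate: float, baseline: float) -> float:
    """Return how many times larger `rate` is than `baseline`."""
    return rate / baseline


devin = resolve_rates["Devin"]
for model, rate in resolve_rates.items():
    if model == "Devin":
        continue
    ratio = relative_gain(devin, rate)
    gap = devin - rate  # difference in percentage points
    print(f"Devin vs {model}: {ratio:.1f}x, +{gap:.2f} percentage points")
```

By this simple ratio, Devin's reported rate is roughly 2.9 times that of Claude 2 and about 8 times that of GPT-4, although in absolute terms it still leaves most of the benchmark's issues unresolved.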
While Microsoft offers AI-powered developer tools like GitHub Copilot, which provides code completion and assistive features for programmers, these tools cannot complete coding tasks end-to-end without human intervention or assistance.
In contrast, Devin is able to complete coding tasks autonomously.
Today we’re excited to introduce Devin, the first AI software engineer.
Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork.
Devin is… pic.twitter.com/ladBicxEat
— Cognition (@cognition_labs) March 12, 2024
Cognition is currently offering early access to Devin for businesses that want to use the AI agent for engineering work. Interested customers can request early access through the company’s website.
With its impressive performance on SWE-bench and its ability to operate independently, Devin represents a big step forward in the development of AI-powered software engineering solutions.