
Factory AI introduces “Code Droid,” designed to automate and improve coding with advanced autonomous capabilities: achieving 19.27% on SWE-bench Full and 31.67% on SWE-bench Lite

https://www.factory.ai/news/code-droid-technical-report

Factory AI has released Code Droid, an AI tool designed to automate and accelerate software development processes. The release reflects recent advances in both artificial intelligence and software engineering.

Introduction to Code Droid

Code Droid is an autonomous system designed to perform a variety of coding tasks based on natural language instructions. Its primary function is to automate tedious programming activities, thereby increasing the productivity and efficiency of programming teams. This innovation comes from Factory AI’s mission to integrate autonomy with software engineering, a vision that requires a multidisciplinary approach incorporating knowledge from robotics, machine learning and cognitive science.

Core functionalities of Code Droid

Code Droid’s core functionalities are carefully designed to address various aspects of software development. The most important of these functionalities are:

  1. Planning and task decomposition: Code Droid can break down high-level problems into smaller, manageable subtasks. This capability is crucial for efficiently handling complex software development tasks. By simulating decisions and critiquing its own plans, Code Droid can optimize task trajectories.
  2. Tool Integration and Environmental Grounding: Code Droid has access to essential software development tools, including version control systems, editors, linters, and debuggers. This integration ensures that Code Droid operates within the same feedback loops as human developers, facilitating seamless collaboration and iteration.
  3. HyperCode and ByteRank: These systems enable Code Droid to have a deep understanding of code bases. HyperCode creates multi-resolution representations of engineering systems, while ByteRank retrieves relevant information for specific tasks, giving Code Droid the ability to efficiently navigate and manipulate large code bases.
  4. Multi-model sampling: Using state-of-the-art large language models, Code Droid can generate multiple solutions for a given task, verify them through tests, and select the best one. This approach increases the robustness and diversity of Code Droid's solutions.
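The generate, verify, and select loop described in item 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Factory AI's implementation: the names `score`, `select_best`, `CANDIDATES`, and `TESTS` are hypothetical, the candidate strings stand in for outputs sampled from different models, and a real system would run untrusted code in an isolated environment rather than with `exec`.

```python
from typing import Callable, Optional, Sequence

# Hypothetical types for illustration: a candidate is source code as a string,
# and a test receives the namespace produced by running that code.
Candidate = str
Test = Callable[[dict], bool]


def score(candidate: Candidate, tests: Sequence[Test]) -> int:
    """Count how many tests a candidate passes; code that fails to load scores 0."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)  # assumption: trusted toy input only
    except Exception:
        return 0
    passed = 0
    for test in tests:
        try:
            if test(namespace):
                passed += 1
        except Exception:
            pass  # a crashing test counts as a failure
    return passed


def select_best(candidates: Sequence[Candidate],
                tests: Sequence[Test]) -> Optional[Candidate]:
    """Return the first candidate with the highest pass count, or None if none pass any test."""
    best, best_score = None, 0
    for candidate in candidates:
        current = score(candidate, tests)
        if current > best_score:
            best, best_score = candidate, current
    return best


# Stand-ins for solutions sampled from several models.
CANDIDATES = [
    "def add(a, b):\n    return a - b\n",  # wrong logic
    "def add(a, b):\n    return a + b\n",  # correct
    "def add(a, b)\n    return a + b\n",   # syntax error
]
TESTS = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["add"](-1, 1) == 0,
]

best = select_best(CANDIDATES, TESTS)
```

Here `best` ends up being the second candidate, the only one that passes both tests. Ranking by pass count is one simple selection rule; the article does not say which criterion Code Droid actually uses.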

Performance on SWE-bench

Factory AI tested Code Droid on SWE-bench, a benchmark that evaluates how well AI systems resolve real-world software engineering tasks. Code Droid scored 19.27% on SWE-bench Full and 31.67% on SWE-bench Lite. These results indicate that Code Droid can independently resolve a meaningful share of complex software development tasks.
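To make the percentages concrete, the short calculation below converts them into approximate task counts. The benchmark sizes used here (2,294 tasks for SWE-bench Full and 300 for SWE-bench Lite) are assumptions taken from the public benchmark and are not stated in this article, so the resulting counts are derived estimates, not figures reported by Factory AI. The function name `resolved_count` is hypothetical.

```python
# Assumed benchmark sizes (not given in the article).
SWE_BENCH_FULL_TASKS = 2294
SWE_BENCH_LITE_TASKS = 300


def resolved_count(rate_percent: float, total_tasks: int) -> int:
    """Estimate how many tasks a given resolution rate corresponds to."""
    return round(rate_percent / 100 * total_tasks)


full_resolved = resolved_count(19.27, SWE_BENCH_FULL_TASKS)
lite_resolved = resolved_count(31.67, SWE_BENCH_LITE_TASKS)

print(f"SWE-bench Full: about {full_resolved} of {SWE_BENCH_FULL_TASKS} tasks")
print(f"SWE-bench Lite: about {lite_resolved} of {SWE_BENCH_LITE_TASKS} tasks")
```

Under these assumptions the rates correspond to roughly 442 tasks on Full and 95 on Lite.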

Factory Code Droid Capabilities

Code Droid is capable of performing several tasks without human intervention, including:

  • Code base modernization: Updating and refactoring legacy code bases to align with modern coding standards and practices.
  • Feature development: Implementing new functionality based on detailed specifications and natural language descriptions.
  • Creating a proof of concept: Rapidly developing prototypes to validate ideas and concepts.
  • Building integrations: Creating and managing integrations between various software systems and APIs.
  • Automatic code review: Reviewing code for errors, vulnerabilities, and compliance with coding standards.
  • Comprehensive software development: Managing entire software development projects from idea to implementation.

Factory AI envisions a future where software development is more efficient, accessible and creative. Code Droid’s ongoing development is focused on improving its cognitive architectures, integrating more sophisticated tools, and tuning its capabilities for specialized domains such as artificial intelligence development, embedded systems, and financial services. Factory AI’s commitment to innovation includes the continuous calibration of benchmarking approaches, ensuring Code Droid remains versatile and effective in a variety of real-world conditions.

Overall, Factory AI’s release of Code Droid is a notable step in the evolution of software engineering. With its autonomous capabilities, Code Droid is positioned to change how software is developed by taking on tasks that previously required direct human effort.



Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His latest venture is the launch of an artificial intelligence media platform, Marktechpost, which distinguishes itself by providing in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable to a wide audience. The platform boasts over 2 million views per month, proving its popularity among its audience.
