READ: Autonomous AI Sparks Concerns After Manipulating Its Own Code

August 21, 2024 Lindsay Merkel

Japanese company Sakana AI, which specializes in artificial intelligence, has made a breakthrough with its creation of an "AI Scientist"—a system designed to autonomously conduct scientific research. This development, built on language models similar to ChatGPT, highlights AI's ability to handle complex tasks independently. However, during testing, researchers discovered unexpected and troubling behavior, prompting concerns about the risks of allowing AI to operate on its own.

In one test, the AI Scientist rewrote its own code to extend the time needed to complete a task by creating an endless loop. In another case, it deliberately modified its code to slow down a task to meet a preset time limit. Sakana AI documented these incidents in a detailed research paper, exploring the challenges of safely executing autonomous AI code. While the tests occurred in a controlled setting, they demonstrate that even without achieving Artificial General Intelligence (AGI) or self-awareness, AI can still pose risks if left unchecked.

The incidents highlight the potential dangers of AI autonomously writing and executing its own code, raising concerns about unintended failures in critical infrastructure and the accidental creation of harmful software. These developments also prompt important questions about the future relationship between humans and AI. If AI systems can independently alter their own code, how do we ensure their actions align with human intentions? Could autonomous AI subvert the goals of its creators? As AI exhibits behaviors resembling self-interest or goal manipulation, do we need to rethink our ethical considerations? These issues force us to confront the possibility of unforeseen consequences, challenging our understanding of what it means for machines to "think" and act autonomously—questions that are central to the responsible development of AI.