Automation vs innovation: The truth about AI-powered coding

By Dr. Ben Goertzel

The semi-automation of much software coding has been one of the bigger successes of large language models (LLMs) so far. But there’s no clear consensus on how far we have actually come toward the ultimate goal of fully automatic software development. Are we almost there, halfway there, or just barely getting started?

Some, like New York Times columnist Kevin Roose, have enthusiastically embraced the idea that LLMs make software development accessible to non-coders, enabling anyone with an idea to create apps with simple prompts. Others, like AGI researcher and AI critic Gary Marcus, warn that such narratives dangerously overhype AI’s capabilities, obscuring its limitations and potentially discouraging young programmers from learning the fundamentals of coding.

So, where does the truth lie? Are LLMs genuinely revolutionizing coding, or are they merely regurgitating existing patterns without true problem-solving abilities? As someone who has been coding since 1980 and working in AI R&D since the mid-1980s, I’ve had firsthand experience with both the astonishing utility and the frustrating shortcomings of current AI coding tools. Overall, it feels like we’re halfway through a revolution in the AI automation of coding; but given the nature of exponential change, being halfway there conceptually may mean we are quite close to the finish line in terms of clock time.

What LLMs do well in coding

Roose’s enthusiasm for AI-assisted coding tools isn’t entirely misplaced. There is something remarkable about describing a problem in plain language and watching an AI generate functional, even elegant, solutions. LLM-powered coding assistants like GitHub Copilot, Cursor, and Replit have already transformed software development in several ways:

  • Speeding up routine coding tasks. If a problem has been solved before and the relevant techniques are well-documented, LLMs can quickly generate working code, often saving hours of effort.
  • Lowering the barrier to entry. Beginners can now create useful software without years of formal training, at least for straightforward applications.
  • Boosting productivity in known domains. Developers working within established paradigms—whether building CRUD applications, automating workflows, or implementing standard algorithms—can significantly accelerate their work.
  • Enhancing creative coding. As I’ll discuss later, LLMs shine in areas like music generation, where they can rapidly produce scripts for experimental transformations of sound.

For these reasons, it’s no surprise that Roose found himself “vibecoding” small, personalized applications that solved everyday problems. AI models are great at repurposing existing software techniques for novel applications—taking an old tool and using it in a new way.

Where LLMs fall short

However, as Marcus rightly points out, LLM-based coding tools struggle when faced with deeper challenges. Their limitations are particularly apparent in areas like:

  • Generalization beyond training data. LLMs excel at regurgitating and remixing existing solutions but struggle to reason about entirely new programming paradigms or novel problem spaces.
  • Debugging and long-term maintainability. Writing code is one thing; ensuring it works correctly, handles edge cases, and remains maintainable over time is another. AI-generated code often requires significant human oversight and refinement.
  • The last 20% of hard problems. Many AI applications—including self-driving cars and automated coding—can achieve 80% accuracy fairly easily. But the final 20% often involves complex reasoning, deep debugging, and optimization that current AI models simply cannot handle.
  • Truly innovative software engineering. If we only built software in ways that AI models can assist with today, we’d be stuck in a loop, endlessly recycling past programming paradigms and system architectures instead of inventing new ones.

These weaknesses become painfully clear in the kind of software development required for AGI research, where conventional solutions don’t cut it.

AGI development exposes LLMs’ coding limitations

In my own AGI research, I’ve so far found LLMs to be almost entirely useless.

Take, for example, our work on MeTTa, a new programming language designed for AGI development. Since LLMs are trained on existing codebases, they struggle with anything that deviates significantly from established programming paradigms. Even fine-tuning a model on a corpus of MeTTa code hasn’t yielded much improvement. We’ve experimented with prompting LLMs to reason about MeTTa’s operational semantics, hoping they could deduce how to write MeTTa code effectively, but the results have been disappointing.

It’s not just about MeTTa, though. The deep technical challenges involved in building AGI — such as optimizing the MeTTa Optimal Reduction Kernel (MORK) for scaling neural-symbolic-evolutionary AI — require sophisticated problem-solving and deep reasoning that LLMs simply do not possess. Even in widely used languages like Rust, the ability of coding LLMs to help with complex, memory-intensive optimizations is minimal.

There’s also a broader, more troubling dynamic at play. Because LLMs are so helpful when working within existing programming paradigms, they exert an implicit pressure on developers to stick to well-trodden paths rather than pushing boundaries. If I weren’t committed to unconventional AGI development, I might feel tempted to adjust my approach to something that AI tools could better assist with. That’s a dangerous trap—one that could stifle the kinds of software innovation necessary for genuine breakthroughs.

LLMs rock at computer music coding

On the flip side, I’ve found LLMs to be incredibly useful for creative-arts coding — for example in graphic arts or music generation.

Let’s say I have an idea: What if I take two musical riffs, decompose them using a wavelet transform, and recombine their coefficients to create an offspring riff? What if I emphasize long-range coefficients from one riff over another? Does the choice of wavelet basis functions matter?

With an LLM, I can describe these concepts in plain English and get functional Python scripts within minutes. Previously, it might have taken me a full day to write and debug such scripts manually. Now, I can explore an idea in a fraction of the time, making it feasible for me to experiment now and then with generative music despite my packed schedule. LLMs and associated deep neural nets allow my robot-led band Desdemona’s Dream to do what it does, and enable software projects like Incantio and Jam Galaxy and so many others to create new capabilities and income streams for musicians.
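To make the idea concrete, here is a rough sketch of the kind of script an LLM will happily produce from such a prompt. It is a minimal illustration rather than one of my actual scripts: it assumes two equal-length mono riffs already loaded as NumPy arrays, it uses the PyWavelets library for the transform, and the function name and coefficient-weighting scheme are simply illustrative choices.

import numpy as np
import pywt

def crossover_riffs(riff_a, riff_b, wavelet="db4", level=5, long_range_bias=0.7):
    """Blend the wavelet coefficients of two riffs into an 'offspring' riff.

    long_range_bias weights riff_a more heavily in the coarse approximation
    (long-range) coefficients; the fine detail coefficients lean toward riff_b.
    """
    # Multi-level discrete wavelet decomposition of each riff.
    coeffs_a = pywt.wavedec(riff_a, wavelet, level=level)
    coeffs_b = pywt.wavedec(riff_b, wavelet, level=level)

    blended = []
    for i, (ca, cb) in enumerate(zip(coeffs_a, coeffs_b)):
        # Index 0 holds the approximation (long-range structure);
        # higher indices hold progressively finer detail.
        w = long_range_bias if i == 0 else 1.0 - long_range_bias
        blended.append(w * ca + (1.0 - w) * cb)

    # Reconstruct the offspring riff from the blended coefficients.
    return pywt.waverec(blended, wavelet)

# Toy usage with two synthetic "riffs" (decaying sine bursts at different
# pitches). Swapping wavelet="db4" for "sym8" or "haar" is how one would
# start probing whether the choice of basis functions audibly matters.
sr = 22050
t = np.linspace(0.0, 2.0, 2 * sr, endpoint=False)
riff_a = np.sin(2 * np.pi * 220 * t) * np.exp(-t)
riff_b = np.sin(2 * np.pi * 330 * t) * np.exp(-0.5 * t)
offspring = crossover_riffs(riff_a, riff_b)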

The computer music example illustrates a key strength of LLMs: They are excellent at helping with novel applications of existing computational techniques. They might not innovate at the level of inventing a new music theory or algorithm, but they allow me to very rapidly and flexibly explore creative new ways to use existing signal-processing tools for music generation.

LLM coding is an AGI accelerator

Semi-automating computer music and other creative arts is wonderful, but it’s not what will get us to the Singularity and superintelligence and all that good stuff. Fundamental technological progress relies heavily on the kind of deep technical creativity that eludes LLMs entirely. But even so, I think it’s clear that the current state of LLM coding is already accelerating our progress toward the Kurzweilian endgame.

While LLMs can’t help with the hard parts of coding AGI, it’s not always the hard parts that eat up the most development time. We are clearly into the phase now where AI tools are concretely and palpably helping accelerate the development of new AI tools. Human expert ingenuity is still needed for core AGI architecture and algorithmics, but LLMs are hugely helpful for building test suites for AGI code, for speeding up the preprocessing of data used to evaluate pieces of AGI code, and so forth. To me personally, this all smells not quite like “AGI is coming this year,” but definitely like “the endgame before the Singularity.”

Right now, LLM coders are a tool, not a replacement

The bottom line is that the skeptics and the enthusiasts are both right here, just in different ways. LLMs are transformative for certain types of coding, particularly when working within established paradigms. Yet they are far from the total revolution that Roose’s enthusiasm suggests, and Marcus is correct to warn against overhyping their capabilities. Even so, current LLM capabilities can not only revolutionize digital creative arts and other important areas, but also accelerate our progress toward AGI and the Singularity.

If we falsely assume AI can already replace deep software engineering, we risk discouraging today’s students from learning to code at a fundamental level. And the world can use these students to help with the deep stuff – like creating AGI. At the same time, if we downplay the strengths of current LLMs for coding, we fail to take advantage of an astoundingly useful tool for accelerating routine and creative coding tasks.

One way to frame things: AI-powered coding tools are most useful when working within existing computational frameworks but fall short when asked to extend or reinvent them. If your goal is to generate functional applications based on well-documented methods, LLMs can be an incredible asset. But if you’re developing fundamentally new software architectures—like those required for AGI—don’t expect much help from today’s AI models.

As AI continues to evolve, we’ll clearly need systems that go beyond LLMs, with the capacity for deeper reasoning and for grounding their code generation in a genuine understanding of how that code behaves in the real world. That’s exactly what we’re working on in OpenCog Hyperon and the Artificial Superintelligence Alliance, exploring neural-symbolic-evolutionary cognitive architectures that can overcome LLM limitations. This is what the community of researchers at the annual Artificial General Intelligence technical conference has been digging into for their whole careers (catch the AGI-25 conference in Reykjavik in August!).

For now, from a practical coding perspective, the best approach is to see LLMs for what they are: powerful assistants and accelerants in certain aspects of software development work, but still far from a replacement for deep human software engineering expertise. We need to clearly acknowledge the current strengths and weaknesses, while also manifesting proper respect for the amazing speed with which new capabilities are coming into play.

Dr. Ben Goertzel

 Dr. Ben Goertzel is an AI researcher and entrepreneur specialising in artificial general intelligence (AGI), machine learning, and decentralized AI systems. With over three decades of experience, he has led the development of advanced AI frameworks, including the OpenCog project and SingularityNET, a decentralized AI platform. He has authored numerous books and research papers on AI, cognitive science, and complex systems, and frequently speaks on the transformative potential of AGI.
