In my observation, AI agents "work" by repeatedly making and then fixing mistakes, largely because of context window limitations. They suggest elaborate refactors and go down rabbit holes for hours on problems that already have simple solutions a couple of directories over. The underlying issue is that agents can't hold the full codebase in context at once.
To get around this, most agentic tools use RAG or similar retrieval approaches that pull in relevant code snippets. This works reasonably well for Q&A, but it has obvious limitations for code generation, where every line written needs to take the entire codebase into consideration.
Working Around the Limitations
After experiencing these issues firsthand, I started focusing on what I can actually control: how to present as much codebase context as possible to an LLM.
What I found helps is giving agents as much code from the project as possible within their context limits: the complete codebase is best, but for larger projects or dependencies, a smaller filtered document containing the API / method signatures works very well. The goal is simply to make sure the model has enough context to fully understand the architecture and structure of the codebase as it relates to the task at hand.
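To make that concrete, here is a minimal sketch of the signature-filtering idea for a Python codebase, using only the standard library. This illustrates the technique rather than blobify's implementation, and the `src` directory is a placeholder:

```python
import ast
from pathlib import Path


def extract_signatures(source: str) -> list[str]:
    """Return the class and def lines from a Python source file."""
    tree = ast.parse(source)
    lines = source.splitlines()
    sigs = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # The def/class line alone is usually enough to convey the API surface.
            sigs.append(lines[node.lineno - 1].strip())
    return sigs


def signature_summary(root: str) -> str:
    """Concatenate per-file signature listings for every .py file under root."""
    parts = []
    for path in sorted(Path(root).rglob("*.py")):
        sigs = extract_signatures(path.read_text(encoding="utf-8"))
        if sigs:
            parts.append(f"# {path}\n" + "\n".join(sigs))
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(signature_summary("src"))  # "src" is a placeholder project directory
```

A summary like this is a fraction of the size of the full source, yet it still tells the model what exists and where, which is most of what it needs to avoid reinventing things.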
Building Blobify
I built blobify to solve this problem for myself. It started off as a simple script that just blobbed together all the files in a folder, but I added new features as I needed them.
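That first version was conceptually not much more than the following sketch. Again, this is an illustration of the idea rather than blobify's actual code, and the include list is an assumption you would adjust per project:

```python
from pathlib import Path

# Assumed extensions worth including; adjust per project.
INCLUDE = {".py", ".md", ".toml", ".cfg"}


def blob(root: str) -> str:
    """Concatenate every matching file under root into one document,
    with a path header before each file so the model knows where code lives."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in INCLUDE:
            body = path.read_text(encoding="utf-8", errors="replace")
            parts.append(f"# File: {path}\n{body}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(blob("."))
```

The features that came later were mostly about deciding *which* files belong in that document for a given task, which is where the filtering below comes in.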
This filtering is configurable through `.blobify` files with different views or "contexts" for different tasks. That lets a developer quickly generate the required LLM context from any codebase with a single CLI command, which, even run manually, I find much faster and less headache-inducing than explaining the codebase to an agent and then undoing the mistakes it can spend hours making.
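To illustrate what a task-specific "context" boils down to (a sketch of the concept, not the actual `.blobify` file format): a context is just a named set of include/exclude patterns that decides which files end up in the generated document.

```python
from pathlib import Path

# Hypothetical context definitions: each names the file patterns relevant to a task.
# The real .blobify format differs; this only illustrates the idea.
CONTEXTS = {
    "api-review": {"include": ["src/**/*.py"], "exclude": ["src/**/test_*.py"]},
    "docs":       {"include": ["docs/**/*.md", "README.md"], "exclude": []},
}


def select_files(root: str, context: str) -> list[Path]:
    """Resolve a named context to the concrete list of files it selects."""
    cfg = CONTEXTS[context]
    base = Path(root)
    included = {p for pat in cfg["include"] for p in base.glob(pat) if p.is_file()}
    excluded = {p for pat in cfg["exclude"] for p in base.glob(pat) if p.is_file()}
    return sorted(included - excluded)


# Example: list the files the "api-review" context would feed to the model.
for path in select_files(".", "api-review"):
    print(path)
```

Keeping these selections in a checked-in file means the same context can be regenerated on demand as the codebase changes, instead of being rebuilt by hand for every conversation.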
Results
The difference is noticeable. Agents with more complete architectural context make fewer obvious mistakes than those that only see the output of individual RAG queries. They reuse existing patterns more often, suggest more appropriate abstractions, and retain the codebase's design conventions from earlier in the conversation.
I've been using this approach for months on both personal projects and at work. Development conversations with AI chatbots become productive much more quickly than waiting for an AI agent to eventually stumble onto something that works.