procrastinator.pro

I built an AI agent that coded my Chrome extension


Just by playing around, I created an AI agent that coded my Chrome extension.

I wanted to get my hands dirty and understand what's actually happening under the hood when LLMs do their magic. So I set out to build a small AI worker that would run on a local LLM, spit out React code, and that's it. In my head this was a 2-3 hour project: easy, fun, and I'd get my product done fast at no cost.

Worker flow

So the plan was simple: I describe a task in a JSON file, click run, the worker talks to the local LLM via Ollama, and code appears.
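
Here is a minimal sketch of what that round trip could look like, assuming Ollama is running on its default port and using its `/api/generate` endpoint. The `EarlyTask` shape, the prompt wording, and the function names are assumptions for illustration, not the worker's actual code.

```typescript
// Assumed task shape: early tasks were just a description and a file name.
interface EarlyTask {
  description: string;
  file: string;
}

// Build the JSON body for Ollama's /api/generate endpoint.
function buildRequest(task: EarlyTask, model: string) {
  return {
    model,
    prompt: `File: ${task.file}\nTask: ${task.description}\nReturn only the code.`,
    stream: false, // ask for a single JSON reply instead of a token stream
  };
}

// Send the prompt to the local Ollama server and return the generated text.
async function generate(task: EarlyTask, model = "qwen2.5:14b"): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(task, model)),
  });
  if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

Keeping the request builder separate from the network call makes the prompt easy to inspect and tweak without a model running.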

The setup was actually the fun part. I picked Qwen2.5-14b as my model, spun up Ollama, and started building the worker itself. File guard so it can only touch files I allow, task loader, prompt builder, logger. First task ran. It worked. I felt like a genius.
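
A file guard like the one mentioned above can be quite small. This is a sketch under assumptions: the names `isAllowed` and `assertAllowed` are hypothetical, and it compares resolved paths so that tricks like `src/../secret.txt` don't slip through.

```typescript
import * as path from "node:path";

// True only if `target` resolves to exactly one of the allowed files.
function isAllowed(target: string, allowedFiles: string[], root: string): boolean {
  const resolved = path.resolve(root, target);
  return allowedFiles.some((f) => path.resolve(root, f) === resolved);
}

// Throw before any write happens, so a bad path never reaches the disk.
function assertAllowed(target: string, allowedFiles: string[], root: string): void {
  if (!isAllowed(target, allowedFiles, root)) {
    throw new Error(`File guard: "${target}" is not in the allowed list`);
  }
}
```

Calling `assertAllowed` right before every write keeps the check in one place instead of scattering it across the worker.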

Then task two broke everything.

The problem was TypeScript. The model would generate code that looked fine, but wouldn't compile. Build fails, task fails, I cry. So I started writing ts-repair.ts — basically a list of known patterns the model kept getting wrong, with manual fixes for each one. Not elegant. Not reusable. But it kept us moving.
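
To give an idea of the shape, a pattern-based repair pass could look roughly like this. The two rules are made-up examples, not the patterns from the real ts-repair.ts, and the `RepairRule` type is an assumption.

```typescript
// One known mistake and its manual fix.
interface RepairRule {
  name: string;
  pattern: RegExp;
  replacement: string;
}

// Hypothetical rules, for illustration only.
const rules: RepairRule[] = [
  {
    name: "require-to-import",
    pattern: /const (\w+) = require\("([^"]+)"\);/g,
    replacement: 'import $1 from "$2";',
  },
  {
    name: "untyped-click-handler",
    pattern: /\(e\) =>/g,
    replacement: "(e: React.MouseEvent) =>",
  },
];

// Apply every rule in order and report which ones changed the code.
function repair(source: string): { code: string; applied: string[] } {
  const applied: string[] = [];
  let code = source;
  for (const rule of rules) {
    const next = code.replace(rule.pattern, rule.replacement);
    if (next !== code) applied.push(rule.name);
    code = next;
  }
  return { code, applied };
}
```

Returning the list of applied rules is handy for the logs: it shows which mistakes the model keeps repeating.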

And as the extension grew, the worker had to grow with it. I added rollback.ts because sometimes the model would confidently write code that broke things two tasks later — you need to be able to go back. I added run-logger.ts because without logs you're just guessing what went wrong. Prompt adjustments became a constant thing, small tweaks that made a surprising difference in output quality.
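
One simple way to make rollback possible is to snapshot the files a task may touch before it runs, then restore them if something breaks. The sketch below assumes in-memory snapshots and hypothetical names (`takeSnapshot`, `rollback`); the real rollback.ts may work differently.

```typescript
import * as fs from "node:fs";

// File path -> content before the task ran (null if the file did not exist yet).
type Snapshot = Record<string, string | null>;

// Record the current state of every file the task is allowed to touch.
function takeSnapshot(files: string[]): Snapshot {
  const snap: Snapshot = {};
  for (const file of files) {
    snap[file] = fs.existsSync(file) ? fs.readFileSync(file, "utf8") : null;
  }
  return snap;
}

// Put every file back the way it was when the snapshot was taken.
function rollback(snap: Snapshot): void {
  for (const file of Object.keys(snap)) {
    const content = snap[file];
    if (content === null) {
      fs.rmSync(file, { force: true }); // the task created this file, so remove it
    } else {
      fs.writeFileSync(file, content);
    }
  }
}
```

Keeping one snapshot per task, rather than only the latest, is what would let you go back more than one step when a problem shows up a couple of tasks later.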

Task structure

The task structure itself evolved a lot too. Early tasks were just a description and a file name. By the end I had goal, allowedFiles, acceptanceCriteria, maxAttempts, and maxFilesToEdit, all because without constraints the model would touch the wrong files, do too much, or just confidently generate broken code on repeat. Two attempts per task, sequential dependencies, build check after every task. By task 035 the system finally felt stable.
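
Put together, the task fields and the attempt loop could be sketched like this. The field names come from the list above; the `id` field, the `Edit` type, and the `generate`/`build` callbacks are assumptions standing in for whatever the worker actually does.

```typescript
interface Task {
  id: string; // assumed: something like "035"
  goal: string;
  allowedFiles: string[];
  acceptanceCriteria: string[];
  maxAttempts: number;
  maxFilesToEdit: number;
}

// Hypothetical shape of what the model proposes for one file.
type Edit = { file: string; content: string };

// Check proposed edits against the task's constraints; returns a list of problems.
function validateEdits(task: Task, edits: Edit[]): string[] {
  const problems: string[] = [];
  if (edits.length > task.maxFilesToEdit) {
    problems.push(`too many files: ${edits.length} > ${task.maxFilesToEdit}`);
  }
  for (const edit of edits) {
    if (task.allowedFiles.indexOf(edit.file) === -1) {
      problems.push(`file not allowed: ${edit.file}`);
    }
  }
  return problems;
}

// Retry up to maxAttempts; an attempt passes only if the edits are valid and the build is green.
async function runTask(
  task: Task,
  generate: (task: Task, attempt: number) => Promise<Edit[]>,
  build: (edits: Edit[]) => Promise<boolean>,
): Promise<boolean> {
  for (let attempt = 1; attempt <= task.maxAttempts; attempt++) {
    const edits = await generate(task, attempt);
    if (validateEdits(task, edits).length > 0) continue; // constraint broken, try again
    if (await build(edits)) return true;
  }
  return false; // out of attempts: stop here so dependent tasks don't run
}
```

Rejecting edits before the build runs means a model that wanders into the wrong files costs one attempt, not a broken project.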

Looking at that file tree now — 35 tasks, 13 source files — it doesn't look like a weekend project anymore. But that's exactly what it was.

Worker evolution

After a whole weekend of adjusting and playing, we had a product.

SimplyTrack was created. Opening the extension and trying it out was surreal. I had actually built something that generated the code for my extension. I know this isn't anything crazy, but I was still blown away: after all the setup, fixes, and prompt adjustments, we had a working Chrome extension.

Before I even started coding I did research. If you're interested in the case study behind why I wanted to build this, go to wearempa. And if you want to test the extension, you can check out SimplyTrack.

This was a weekend well spent. Creating this opened my mind to many more possibilities, and if it pushed you to try something out, that's a win for me.

(P.S. If you need any help, you can always contact me ♥)