My current passion project is pushing the frontier of what is possible with 1B parameter LLMs (or more accurately SLMs).
Specifically, the goal is to create reliable AI agents running purely on a 1B parameter LLM.
Modern LLMs used for agents are around 1 trillion parameters in size, so getting this to work poses significant challenges.
LLMs of this size frequently struggle to adhere to schemas and have very small context windows, which requires pushing the agent harness to its absolute limits.
I'll keep this file relatively updated with my attempts.
In Progress
Cinfer
Cinfer is an agent SDK that works by constraining generation at inference time to obey grammars generated from tool definitions.
My initial benchmarks with 1B LLMs show very promising first results: this SDK outperforms the common ones used for larger LLMs, but further work is still needed to make it reliable.
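As a rough illustration of the core idea (my own sketch, not Cinfer's actual API): constrained decoding only ever emits tokens that keep the output a valid prefix of a grammar derived from the tool definition. Here the "grammar" is reduced to a single call template and the "model" to a toy character-level vocabulary:

```python
import json

# Hypothetical sketch: `tool_def`, `templates`, and `allowed_next_tokens`
# are illustrative names, not part of Cinfer.
tool_def = {"name": "concede", "parameters": {"reason": "string"}}

# Derive the set of valid completions from the tool definition.
# Here the "grammar" is just one well-formed call template.
templates = ['{"tool": "%s", "args": {"reason": "<text>"}}' % tool_def["name"]]

def allowed_next_tokens(prefix: str, vocab: list[str]) -> list[str]:
    """Return vocab tokens that keep `prefix + token` a prefix of a valid call."""
    return [t for t in vocab
            if any(tpl.startswith(prefix + t) for tpl in templates)]

# Toy character-level vocabulary; a real SDK masks model logits instead.
vocab = sorted({c for tpl in templates for c in tpl})

out = ""
while out not in templates:
    choices = allowed_next_tokens(out, vocab)
    out += choices[0]  # a real decoder takes the highest-probability allowed token

assert json.loads(out)["tool"] == "concede"
```

A real implementation applies the same prefix check as a logit mask at every decoding step, so even a 1B model physically cannot emit schema-violating output.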
wagent
Wagent is an LLM coding agent that runs purely in the browser (including inference).
If I can get this to work, I would consider it the ultimate proof that 1B SLMs can be used for agents.
Try it here
Previously...
debate
Debate is the first working version of an agent I got running with a 1B parameter SLM.
It is very simple: two agents, each with a single tool that allows them to concede the debate.
However, when you listen to the debates, one common problem is that this tool call will fail.
This can result in entertaining cases where the agent says it has conceded in its speech, but the program doesn't register it because the schema didn't match, so the debate continues with the other debater simply restating that they have won.
This problem resurfaced with wagent, where the agent often doesn't terminate when it is done and instead reimplements the file.
I hope to document these common agent behavior patterns (ideally accompanied by solutions).
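The failure mode is easy to reproduce in miniature (hypothetical harness code, not the actual debate implementation): the harness only registers a concession when the output both parses as JSON and matches the expected schema, so a concession stated in prose is invisible to it:

```python
import json

def parse_concede(output: str) -> bool:
    """Return True only if `output` is a well-formed concede tool call."""
    try:
        call = json.loads(output)
    except json.JSONDecodeError:
        return False
    return call.get("tool") == "concede" and isinstance(call.get("args"), dict)

# The model *says* it concedes, but the JSON is malformed, so the harness
# never sees the concession and the debate continues.
good = '{"tool": "concede", "args": {"reason": "you win"}}'
bad = 'I concede the debate. {"tool": concede}'  # unquoted value: invalid JSON

assert parse_concede(good)
assert not parse_concede(bad)
```

Constrained generation (as in Cinfer) attacks exactly this gap: instead of validating after the fact, it makes malformed calls unrepresentable at decode time.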
Try it here
unravel
Unravel generates infinite dropdowns.
It's interesting because you can see how quickly the SLM loses the ability to remember the hierarchy chain.
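A back-of-envelope sketch (all numbers here are my assumptions, not measurements from unravel) of why the chain eventually overflows a small context window as depth grows:

```python
TOKENS_PER_LEVEL = 8   # assumed tokens to name one dropdown level
CONTEXT_WINDOW = 2048  # assumed small-model context size
OVERHEAD = 256         # assumed system prompt + instruction tokens

def prompt_tokens(depth: int) -> int:
    """Tokens needed to restate the whole hierarchy chain at a given depth."""
    return OVERHEAD + depth * TOKENS_PER_LEVEL

# Find the first depth at which the chain no longer fits the window.
depth = 0
while prompt_tokens(depth) <= CONTEXT_WINDOW:
    depth += 1
```

Under these assumptions the chain fits for a couple hundred levels, which suggests the model loses track of the hierarchy well before the window itself is full.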
Try it here