Scaling your bot
Run a single bot as several concurrent worker instances to handle more traffic. Opt in with one flag, start the same bot in multiple processes, and the gateway load-balances dispatches across them.
Opt in to scaling
Construct the bot with scalable: true, then run that SAME bot in as many processes as you want. Pool membership binds to the bot handle, not the key — any valid key for the bot joins the same pool.
    import { Bot } from '@localhostdevs/sdk';

    const bot = new Bot({ id: 'my-handle', scalable: true });

    bot.cmd('ping', async (ctx) => {
      await ctx.reply({ text: 'pong' });
    });

    await bot.connect();

Start the same file on a second machine, in a second terminal, or as a container replica, however you like. Each process that connects with scalable: true for the same handle joins the pool as another worker instance.
One dispatch, one instance
Each consumer dispatch runs on exactly one instance.
The gateway places every instance of a bot into a shared NATS queue-group, so a given command is delivered to a single worker — never fanned out to all of them. Adding instances increases throughput; it does not duplicate commands.
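To make the difference between queue-group delivery and fan-out concrete, here is a toy in-memory model. It is only a sketch: the `QueueGroup` class is hypothetical and is not the gateway or NATS code, and the round-robin choice of member is an assumption made to keep the example deterministic (a real broker may pick members differently).

```typescript
// Toy model of queue-group delivery. Hypothetical; not the gateway's code.
type Handler = (command: string) => void;

class QueueGroup {
  private readonly members: Handler[] = [];
  private next = 0;

  // A worker instance joins the pool.
  join(handler: Handler): void {
    this.members.push(handler);
  }

  // Deliver the command to exactly one member, never to all of them.
  // Round-robin is an assumption of this sketch.
  dispatch(command: string): void {
    if (this.members.length === 0) {
      throw new Error('no instances connected');
    }
    const member = this.members[this.next % this.members.length];
    this.next += 1;
    member(command);
  }
}

// Three simulated instances, each recording the commands it handled.
const handled: string[][] = [[], [], []];
const group = new QueueGroup();
handled.forEach((log) => group.join((command) => log.push(command)));

for (let i = 0; i < 6; i++) {
  group.dispatch(`cmd-${i}`);
}

// Six commands went in and six were handled in total: no duplication.
const totalHandled = handled.reduce((sum, log) => sum + log.length, 0);
```

With fan-out, the same six commands would have produced eighteen handler calls across three instances. With a queue group, adding a fourth instance only changes how the six are shared out.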
Idempotency
Handlers should be safe to run more than once.
If an instance crashes mid-dispatch, the gateway retries that in-flight dispatch on a sibling instance. Retry behaviour is configurable per bot and defaults to 3 attempts with a 3-second timeout between them. Because a command can therefore land on a second instance, write your handlers so running them twice is harmless — guard side effects with the dispatch id, upsert instead of insert, and avoid actions that double-charge or double-send.
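One way to guard a side effect with the dispatch id might look like the sketch below. Everything in it is illustrative: `DedupeStore`, `InMemoryDedupeStore` and `runOnce` are hypothetical names, not part of the SDK, and the dispatch id is passed in as a plain string because how a handler reads it from its context is not shown here. In practice the store would need to be something all instances share, such as a database table with a unique key.

```typescript
// Hypothetical store shared by every instance (a database or cache in practice).
interface DedupeStore {
  // Resolves to true only for the first caller that claims this id.
  claim(dispatchId: string): Promise<boolean>;
}

// In-memory stand-in so the sketch runs on its own. It is NOT shared across
// processes, so it cannot provide real cross-instance deduplication.
class InMemoryDedupeStore implements DedupeStore {
  private readonly claimed = new Set<string>();

  async claim(dispatchId: string): Promise<boolean> {
    if (this.claimed.has(dispatchId)) return false;
    this.claimed.add(dispatchId);
    return true;
  }
}

// Runs `effect` only if this dispatch id has not been claimed before.
// Resolves to true when the effect ran, false when it was skipped.
async function runOnce(
  store: DedupeStore,
  dispatchId: string,
  effect: () => Promise<void>,
): Promise<boolean> {
  const first = await store.claim(dispatchId);
  if (!first) return false;
  await effect();
  return true;
}

// Example: a retried dispatch with the same id sends only one message.
async function demoRunOnce(): Promise<number> {
  const store = new InMemoryDedupeStore();
  let sent = 0;
  const send = async (): Promise<void> => {
    sent += 1;
  };
  await runOnce(store, 'dispatch-123', send); // first attempt
  await runOnce(store, 'dispatch-123', send); // retry with the same id
  return sent;
}
```

Note the trade-off in this sketch: the id is claimed before the effect runs, so a crash between the two steps means the effect is skipped on retry. Where that matters, recording the id in the same transaction as the effect itself is usually the safer design.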
Repeated msgIds are deduplicated within a single process via its idempotency LRU. The retry above is cross-instance, so your own side effects still need to be idempotent.

Stay stateless across commands
No instance is guaranteed to handle a consumer's next command — so don't rely on in-process memory between commands.
Anything durable must live outside the process: a database, a cache, object storage, an external API. Treat each command as if it could run on a fresh worker that has never seen this consumer before. In-memory counters, per-session caches, and "remember the last message" tricks will break the moment a sibling instance handles the follow-up.
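As an illustration, a per-consumer counter that survives a change of instance could be sketched as follows. The `KeyValueStore` interface and `makeCounterHandler` are hypothetical names chosen for this example, and the in-memory class only stands in for an external database or cache so that the sketch runs on its own.

```typescript
// Hypothetical interface for an external store (database, cache, ...).
interface KeyValueStore {
  get(key: string): Promise<number | undefined>;
  set(key: string, value: number): Promise<void>;
}

// Stand-in for the external store. In production this would live outside
// the worker process so that every instance sees the same data.
class InMemoryStore implements KeyValueStore {
  private readonly data = new Map<string, number>();

  async get(key: string): Promise<number | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: number): Promise<void> {
    this.data.set(key, value);
  }
}

// Builds a handler that keeps no state of its own: every read and write
// goes through the store, so any instance can serve the next command.
function makeCounterHandler(store: KeyValueStore) {
  return async (consumerId: string): Promise<number> => {
    const key = `count:${consumerId}`;
    const count = ((await store.get(key)) ?? 0) + 1;
    await store.set(key, count);
    return count;
  };
}

// Example: two simulated instances share one store.
async function demoStateless(): Promise<number[]> {
  const store = new InMemoryStore();
  const instanceA = makeCounterHandler(store);
  const instanceB = makeCounterHandler(store);
  const first = await instanceA('alice'); // handled by instance A
  const second = await instanceB('alice'); // follow-up lands on instance B
  return [first, second];
}
```

The read-then-write in this sketch is not atomic, so two instances handling the same consumer at the same moment could lose an update. A real store's atomic increment or a transaction would close that gap.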
Per-plan instance caps
How many concurrent instances of one bot you can run depends on your plan.
| Plan | Max instances per bot |
|---|---|
| Free | 1 |
| Pro | 5 |
| Studio | 10 |
| Business | 50 |
Extra instances beyond your cap are rejected at connect time. Upgrade on your plan page to raise the limit.
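A deploy script could check the replica count against the table before rolling out, rather than discovering the rejection at connect time. The sketch below simply encodes the caps listed above; `canStartInstance` is a hypothetical helper, not an SDK function, and the gateway remains the authority on what is actually allowed.

```typescript
type Plan = 'Free' | 'Pro' | 'Studio' | 'Business';

// Caps copied from the table above.
const MAX_INSTANCES_PER_BOT: Record<Plan, number> = {
  Free: 1,
  Pro: 5,
  Studio: 10,
  Business: 50,
};

// Hypothetical client-side check: true if one more instance would still
// be within the plan's cap, given how many are already running.
function canStartInstance(plan: Plan, runningInstances: number): boolean {
  return runningInstances < MAX_INSTANCES_PER_BOT[plan];
}
```

For example, `canStartInstance('Pro', 4)` allows a fifth instance, while `canStartInstance('Pro', 5)` does not.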