Budgeting rebalances from a single signal
May 2026
Rebalancing buys inbound liquidity by paying a circular route back to yourself. The cost has to stay below what you will earn routing that liquidity out again, or every cycle loses money. This note describes how the budgeting, retry, and fallback logic work, all driven by one measured signal.
The budget
Each channel's maximum rebalance price comes from a single number, nudged upward when recent attempts have failed:
budget_ppm = base · (1 + 0.20 · failures_since_success)
base = last_refill_ppm (or 500 if no history)
cap = 5000
The escalation term does two jobs with one mechanism. For a fresh channel with no history it bootstraps at 500 and climbs until a route is found (500 → 600 → 700…). For an established channel it absorbs market drift: one that last refilled at 350 but whose cheap routes have since dried up climbs 350 → 420 → 490… instead of failing silently forever at a now-too-low cap.
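The formula can be sketched in a few lines of Python. This is a minimal illustration, not the executor's actual code: the function name, the constant names, the use of None for "no history", and the rounding to whole ppm are all assumptions made for the example.

```python
from typing import Optional

DEFAULT_BASE_PPM = 500   # base when a channel has no refill history
ESCALATION_STEP = 0.20   # +20% of base per failure since the last success
BUDGET_CAP_PPM = 5000    # hard ceiling on the budget


def budget_ppm(last_refill_ppm: Optional[int], failures_since_success: int) -> int:
    """Maximum rebalance price for one channel, in ppm."""
    # Assumption: a channel with no history is represented by None.
    base = last_refill_ppm if last_refill_ppm is not None else DEFAULT_BASE_PPM
    escalated = base * (1 + ESCALATION_STEP * failures_since_success)
    # Assumption: round to whole ppm before applying the cap.
    return min(round(escalated), BUDGET_CAP_PPM)
```

Note that the escalation is linear in the failure count (each failure adds 20% of the base), not compounding, which is why the sequences read 500 → 600 → 700 rather than 500 → 600 → 720.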
refilled 350, 0 failures → 350 ppm
refilled 350, 3 failures → 350 · 1.60 = 560 ppm
no history, 2 failures → 500 · 1.40 = 700 ppm
escalation is capped at 5000 ppm
From ppm to a fee cap
At execution the ppm budget becomes a hard sats cap on the payment:
max_fee_sats = amount · budget_ppm / 1e6 · 1.1
The 1.1 absorbs rounding and base fees encountered along the route. A 500,000-sat rebalance at 490 ppm allows roughly 270 sats of fee.
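As a hedged sketch, the conversion might look like the following. The function name mirrors the formula, but the integer arithmetic and the choice to round up to a whole sat are assumptions for illustration; the source only says the result is "roughly" 270 sats.

```python
HEADROOM_PERCENT = 110  # the 1.1 multiplier, kept as an integer percentage


def max_fee_sats(amount_sats: int, budget_ppm: int) -> int:
    """Hard fee cap in sats for a rebalance of amount_sats at budget_ppm."""
    numerator = amount_sats * budget_ppm * HEADROOM_PERCENT
    denominator = 1_000_000 * 100
    # Ceiling division in integers avoids floating-point drift.
    # Assumption: the cap is rounded up to a whole sat.
    return (numerator + denominator - 1) // denominator
```

For the example in the text, max_fee_sats(500_000, 490) gives 270: 245 sats of proportional fee plus 10% headroom, rounded up from 269.5.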
Chunking: degrade instead of failing
A rebalance for the full deficit often cannot find a single route with enough liquidity. Rather than give up, the executor halves the amount and retries, down to a 100k floor:
want 800k → fails → 400k → ok (recorded)
remaining 400k → fails → 200k → ok (recorded)
…
Each chunk that lands is written as its own success row at the price it
actually paid. That keeps last_refill_ppm and the cost history
honest, rather than letting a failed full-size attempt distort them.
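The halving loop can be sketched as below. This is an illustration under stated assumptions rather than the real executor: try_pay is a hypothetical callback that returns the ppm actually paid or None on failure, make_fake_route is a stand-in used only to exercise the loop, and stopping once a chunk at the floor fails is an assumed policy.

```python
from typing import Callable, List, Optional, Tuple

CHUNK_FLOOR_SATS = 100_000  # smallest chunk worth attempting, per the text


def rebalance_in_chunks(
    deficit_sats: int,
    try_pay: Callable[[int], Optional[int]],
) -> List[Tuple[int, int]]:
    """Fill a deficit in chunks, halving the amount after each failure.

    Returns one (amount_sats, paid_ppm) row per landed chunk.
    """
    landed: List[Tuple[int, int]] = []
    remaining = deficit_sats
    while remaining >= CHUNK_FLOOR_SATS:
        amount = remaining
        paid_ppm = try_pay(amount)
        # Halve and retry until a chunk lands or the next halving
        # would fall below the floor.
        while paid_ppm is None and amount // 2 >= CHUNK_FLOOR_SATS:
            amount //= 2
            paid_ppm = try_pay(amount)
        if paid_ppm is None:
            break  # assumption: give up once the smallest chunk fails
        landed.append((amount, paid_ppm))  # recorded at the price paid
        remaining -= amount
    return landed


def make_fake_route(liquidity_sats: int, price_ppm: int) -> Callable[[int], Optional[int]]:
    """Hypothetical stand-in for a payment: lands while liquidity lasts."""
    state = {"liquidity": liquidity_sats}

    def try_pay(amount: int) -> Optional[int]:
        if amount > state["liquidity"]:
            return None
        state["liquidity"] -= amount
        return price_ppm

    return try_pay
```

With a fake route holding 600k of liquidity at 300 ppm, a request for 800k lands a 400k chunk and then a 200k chunk, matching the trace above; the failed 800k attempt leaves no row behind.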
Two ledgers decide everything
Across all plans in a run the executor keeps two running tallies:
- target_deficits: sats each depleted channel still needs
- source_remaining: sats each overfull channel can still send
Every attempt is capped to min(plan amount, target deficit, source
remaining), and any plan is skipped once either side drops below 50k.
Fallbacks then emerge from this bookkeeping without any explicit "if the
primary failed" branch — a fallback is simply another plan entry pointing a
different source at the same target. Walking the list:
- a target whose deficit is already filled → its remaining plans skip themselves;
- a source drained by an earlier success → its remaining plans skip themselves;
- a target left only partially filled → the next plan for it (a fallback) fires automatically.
A worked example. Target T needs 600k; source A has 500k of surplus, source B has 1M. Plan 1 (A→T) moves 500k — A is now exhausted and T still needs 100k. Plan 2, a fallback (B→T), fires because T's deficit is still positive, moves 100k, and the deficit closes. There is no branching logic; the ledgers carry the state and the plan list is just walked in order.
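The same walk can be written out as a short Python sketch. The plan tuple layout, the execute callback (which returns the sats actually moved, 0 on failure), and the 600k amount on each plan are assumptions made for the example; only the two ledgers, the min() cap, and the 50k skip threshold come from the text.

```python
from typing import Callable, Dict, List, Tuple

MIN_MOVE_SATS = 50_000  # a plan is skipped once either side drops below this

Plan = Tuple[str, str, int]  # (source, target, planned amount in sats)


def walk_plans(
    plans: List[Plan],
    target_deficits: Dict[str, int],
    source_remaining: Dict[str, int],
    execute: Callable[[str, str, int], int],
) -> List[Plan]:
    """Walk the plan list in order, updating both ledgers in place."""
    moves: List[Plan] = []
    for source, target, amount in plans:
        deficit = target_deficits.get(target, 0)
        surplus = source_remaining.get(source, 0)
        if deficit < MIN_MOVE_SATS or surplus < MIN_MOVE_SATS:
            continue  # target already filled or source already drained
        capped = min(amount, deficit, surplus)
        moved = execute(source, target, capped)
        target_deficits[target] = deficit - moved
        source_remaining[source] = surplus - moved
        if moved > 0:
            moves.append((source, target, moved))
    return moves


# The worked example: every payment lands in full.
deficits = {"T": 600_000}
surpluses = {"A": 500_000, "B": 1_000_000}
plans = [("A", "T", 600_000), ("B", "T", 600_000)]  # second entry is the fallback
result = walk_plans(plans, deficits, surpluses, lambda s, t, amt: amt)
```

Here result is [("A", "T", 500_000), ("B", "T", 100_000)]. The fallback is never flagged as one; it runs only because T's entry in target_deficits is still at 100k when the walk reaches it.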
Why one signal, not a scoring model
An earlier version of this logic was much more elaborate, and it is worth
explaining the reasoning for collapsing it to a single measured signal.
Channels were sorted into tiers — PROVEN, DISCOVERY,
DEADWEIGHT — each with its own behaviour; candidate scores blended
an earned_ppm × revenue_ratio term; and several inputs were
smoothed over rolling median windows. Most of it went, because most of those
inputs were estimates standing in for data we did not reliably have:
- Local graph fee data is unreliable, so anything derived from "the market" inherited that noise.
- A channel's revenue ratio over a short window is dominated by variance — a single large forward swings it — so it was a weak basis for a standing decision.
- Tier boundaries are discontinuities: a channel crossing a threshold changed its treatment abruptly, for reasons no operator could feel or easily debug.
The smoothing deserves its own note, because a rolling median is the obvious
thing to reach for and we deliberately did not. A median over the last N
refills is more stable, but stability is exactly the wrong property
here. The whole point of last_refill_ppm is to track the current
cost of buying inbound on a given route, and that cost moves — a corridor that
was cheap last week dries up, a new well-capitalised peer makes another cheap.
A median lags every one of those moves by roughly half its window: it keeps
quoting a stale price after the cheap routes are gone (so rebalances fail
against a too-low cap) and overshoots after they return (so we overpay or
leave the fee floor too high). The failure-escalation term already supplies the
only smoothing that helps — it ratchets the budget up when reality says the
last price no longer works — without averaging in prices that no longer exist.
Using the single most recent landed price keeps the signal honest and
current; a median would trade that for a smoothness we have no use for.
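A small numeric illustration of the lag, using Python's statistics.median. The price series and the window of 5 are invented for the example and are not measured data; the helper name is hypothetical.

```python
from statistics import median
from typing import List, Tuple


def compare_signals(prices: List[int], window: int) -> List[Tuple[int, float]]:
    """For each refill, pair the latest price with the rolling median."""
    rows = []
    for i in range(len(prices)):
        recent = prices[max(0, i - window + 1): i + 1]
        rows.append((prices[i], median(recent)))
    return rows


# Hypothetical refill prices in ppm: the cheap corridor dries up
# after the fourth refill and the going rate roughly triples.
landed_ppm = [300, 310, 305, 300, 900, 950, 920]
rows = compare_signals(landed_ppm, window=5)
```

In this series the last-value signal reads 900 as soon as the first expensive refill lands, while the 5-refill median still quotes 305, then 310, and only reaches 900 on the third expensive refill, about half a window later.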
The principle that fell out: prefer one number you measured to several you estimated. A parameter earns its place only when you can both measure its input reliably and predict the effect of changing it. In practice the simpler version performs at least as well — about what you'd expect if the extra machinery was mostly fitting noise.
The loop back to fees
Each landed chunk does more than top up the channel: it writes a new
last_refill_ppm, which is the same number that sets the channel's
outbound fee floor. A more expensive refill automatically raises the price we
charge to drain that liquidity again; a cheaper one lowers it. Budget and fee
floor move together off one signal, which is also why the rebalance budget
ceiling (5000 ppm) is set equal to the fee hard ceiling — a channel can always
charge enough to recover what we were willing to pay to fill it. The
fee note covers the floor side of this
loop and its corner cases in detail.