feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown) by KarloAldrete · Pull Request #2716 · PostHog/code

KarloAldrete · 2026-06-16T22:08:09Z

Problem

Streamed agent replies have two issues:

- Choppy: tokens arrive in irregular bursts and are painted exactly as they land, so the text jumps forward unevenly instead of reading as smooth typing.
- Slow on long messages: AgentMessage → MarkdownRenderer re-parses the entire accumulated markdown (react-markdown + remark-gfm) on every token, which is O(n) per token and O(n²) per message. Past ~8k chars a single token costs ~19 ms of parsing, over the 16.6 ms frame budget, so long answers stutter and saturate the main thread.

Addresses #2517 (smooth token streaming).

Changes

Two pieces that reinforce each other.

1. Block-split rendering (StreamingMarkdown + splitMarkdownBlocks) — the active message is split into top-level blocks (fenced code kept intact). Completed blocks keep a stable string, so the memoized MarkdownRenderer skips them and only the growing tail re-parses. An open code fence renders as plain text until it closes, so syntax highlighting runs once, not per token.

Measured on a 3,000-char answer streamed in ~250 tokens (jsdom + react-dom/server, default components — a lower bound):

| renderer | total markdown CPU | per token | last token |
|---|---|---|---|
| full re-parse | 878 ms | 3.5 ms | 6.8 ms |
| block-split | 88 ms | 0.35 ms | 0.16 ms |

~10× less, and the per-token cost stays flat instead of growing with length.

2. Steady reveal (useSmoothedText) — reveals the accumulated text at a constant ~120 chars/sec, character by character, honoring prefers-reduced-motion. Decouples paint from the bursty token arrival so the text reads as even typing.

The two are complementary: the reveal makes streaming look smooth, and the block-split keeps each frame cheap so it stays ~60 fps even on long messages. Completed messages render via a single full MarkdownRenderer parse, so the final output is byte-identical.
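To make the block-split idea concrete, here is a minimal sketch of a splitter, assuming blocks are separated by blank lines and that a fence opens and closes on a line starting with three or more backticks or tildes. The name `splitBlocks` and the regex are hypothetical; this is not the PR's `splitMarkdownBlocks`, and it ignores CommonMark details such as fence length and info strings on closing fences.

```typescript
// Hypothetical sketch of block splitting; not the PR's implementation.
// A fence line: up to 3 spaces of indent, then ``` or ~~~ (or longer).
const FENCE_RE = /^ {0,3}(`{3,}|~{3,})/;

function splitBlocks(text: string): string[] {
  const blocks: string[] = [];
  let current = "";
  let fenceChar: string | null = null; // "`" or "~" while inside a fence
  let sawBlank = false; // previous line was a blank line outside a fence

  // Keep the newline terminators so that blocks.join("") === text.
  const lines = text.match(/[^\n]*\n|[^\n]+/g) ?? [];

  for (const line of lines) {
    if (FENCE_RE.test(line)) {
      const ch = line.trimStart().charAt(0);
      if (fenceChar === null) fenceChar = ch; // fence opens
      else if (ch === fenceChar) fenceChar = null; // fence closes
    }
    // Blank lines inside a fence never end a block.
    const isBlank = fenceChar === null && line.trim() === "";
    // The first non-blank line after a blank line starts a new block.
    if (!isBlank && sawBlank) {
      blocks.push(current);
      current = "";
    }
    sawBlank = isBlank;
    current += line;
  }
  if (current !== "") blocks.push(current);
  return blocks;
}

const sample = "Intro\n\n```ts\nconst a = 1;\n\nconst b = 2;\n```\n\nTail";
const before = splitBlocks(sample);
const after = splitBlocks(sample + " grows");
// Completed blocks are identical strings across updates; only the tail differs.
console.log(before[0] === after[0], before[1] === after[1], after[2]);
```

Because completed blocks come back as equal strings on every update, a renderer memoized on its string prop can skip them, which is the property the description above relies on.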

Relationship to #2685

#2685 also implements smooth token streaming. Per @charlesvien's review, this PR's reveal now matches the steadier cadence from #2685 (constant ~120 chars/sec, char by char). The distinct value here is the block-split renderer: it keeps the per-token markdown cost flat so the smooth reveal holds up on long messages — the part #2685 doesn't address. Happy to coordinate or fold these together however the team prefers.

How did you test this?

- Unit tests: splitMarkdownBlocks.test.ts (round-trip / no text dropped, paragraph + fenced-block splitting, parseOpenFence incl. the completed-then-open-fence edge case) and useSmoothedText.test.ts (the pure nextRevealLength easing + the hook: immediate on mount, gradual steady reveal, snap-on-replace, reduced-motion).
- pnpm --filter @posthog/ui test → 779 pass.
- pnpm --filter @posthog/ui typecheck passes; pre-commit pnpm typecheck passes.
- biome check clean on changed files; node scripts/check-host-boundaries.mjs reports no new violations.
- Block-split numbers come from a local micro-benchmark of the markdown render path (not committed).

Automatic notifications

Publish to changelog?
Alert Sales and Marketing teams?

Created with PostHog Code

Each streamed token re-parsed the entire accumulated markdown of the active agent message (react-markdown + remark-gfm): O(n) per token, O(n^2) per message. Past ~8k chars a single token costs ~19ms of parsing, over the 16.6ms frame budget, so long answers stutter and saturate the main thread.

StreamingMarkdown splits the active message into top-level blocks (fenced code kept intact). Completed blocks keep a stable string so the memoized MarkdownRenderer skips them and only the growing tail re-parses; an open code fence renders as plain text until it closes, so syntax highlighting runs once rather than per token.

On a 3k-char answer this is ~10x less markdown CPU (878ms -> 88ms total; 3.5ms -> 0.35ms per token) and the per-token cost stays flat instead of growing with message length. Completed messages still render via a single full MarkdownRenderer parse, so output is byte-identical: no visual change, only less work per token.

Complements PostHog#2685 (smooth token reveal): that PR eases when text appears, this one keeps rendering it cheap so the smooth reveal stays at 60fps even on long messages.

Generated-By: PostHog Code
Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca

greptile-apps · 2026-06-16T22:11:58Z

Prompt To Fix All With AI

Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/ui/src/features/editor/components/StreamingMarkdown.tsx:16-22
**`parseOpenFence` targets the first fence, not the unterminated one**

`/.exec(block)` finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. `` ```ts\ncode\n```\ntext\n```ts\npartial `` (no blank lines, so `splitMarkdownBlocks` keeps it as one block) — `before` becomes `""` and `code` absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and `MarkdownRenderer` takes over, but the in-progress display is wrong).

A fix is to scan all lines with the same fence-tracking logic as `hasOpenCodeFence`, record the last fence-open line index, then slice `before` and `code` relative to that line.

### Issue 2 of 3
packages/ui/src/features/editor/components/splitMarkdownBlocks.ts:59-74
**Fence-tracking logic duplicated between the two exported functions**

`hasOpenCodeFence` re-implements the identical per-line regex match + `inFence`/`fenceChar` state machine that lives inside `splitMarkdownBlocks`. Extracting a shared helper (e.g. `parseFenceState(lines)` that returns `{ inFence, fenceChar }`) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.

### Issue 3 of 3
packages/ui/src/features/editor/components/splitMarkdownBlocks.test.ts:5-44
**Tests prefer parameterised style over manual loops and multi-assertion cases**

The "never drops text" case iterates over samples with a `for` loop, and the `hasOpenCodeFence` suite packs four independent assertions into one `it`. Vitest's `it.each` / `test.each` makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.


…fence, it.each

- Extract a single stepFence state machine shared by splitMarkdownBlocks, hasOpenCodeFence and parseOpenFence (OnceAndOnlyOnce: no duplicated fence-tracking logic).
- parseOpenFence now targets the LAST unterminated fence, so a completed fence earlier in the same block stays in `before` and renders normally instead of being swallowed as plain text mid-stream.
- Parameterize the splitter tests with it.each and add parseOpenFence cases, including the completed-then-open fence edge case.

Generated-By: PostHog Code
Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca

KarloAldrete · 2026-06-16T22:33:01Z

Thanks for the review! Addressed all three in 3f6bc14:

- parseOpenFence targeting the wrong fence: it now walks the block tracking fence state and targets the last unterminated fence, so a completed fence earlier in the same block stays in before and renders normally instead of being swallowed as plain text mid-stream. Added a test for the completed-then-open-fence case.
- Duplicated fence logic: extracted a single stepFence state machine shared by splitMarkdownBlocks, hasOpenCodeFence, and parseOpenFence.
- Tests: parameterized the splitter suite with it.each and split the hasOpenCodeFence assertions into individual cases.

pnpm --filter @posthog/ui test → 762 pass; typecheck + biome clean.
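For readers following the fence fix, the sketch below shows one way a shared fence state machine could drive both `hasOpenCodeFence` and `parseOpenFence`. The function names, `before`, `code`, `inFence` and `fenceChar` come from the thread; the signatures, the `lang` field and the regex are assumptions, and the real code in 3f6bc14 may differ.

```typescript
// Hypothetical sketch; signatures and the `lang` field are assumptions.
type FenceState = { inFence: boolean; fenceChar: string };
type OpenFence = { before: string; lang: string; code: string };

const FENCE_LINE = /^ {0,3}(`{3,}|~{3,})/;
const CLOSED: FenceState = { inFence: false, fenceChar: "" };

// Advance the fence state by one line. Simplification: any fence line using
// the same character closes an open fence, regardless of length or info string.
function stepFence(state: FenceState, line: string): FenceState {
  if (!FENCE_LINE.test(line)) return state;
  const ch = line.trimStart().charAt(0);
  if (!state.inFence) return { inFence: true, fenceChar: ch };
  return ch === state.fenceChar ? CLOSED : state;
}

function hasOpenCodeFence(block: string): boolean {
  let state = CLOSED;
  for (const line of block.split("\n")) state = stepFence(state, line);
  return state.inFence;
}

// Split a block at its LAST unterminated fence; null if no fence is open.
function parseOpenFence(block: string): OpenFence | null {
  const lines = block.split("\n");
  let state = CLOSED;
  let openIndex = -1;
  for (let i = 0; i < lines.length; i++) {
    const next = stepFence(state, lines[i] ?? "");
    if (!state.inFence && next.inFence) openIndex = i; // remember latest opener
    state = next;
  }
  if (!state.inFence || openIndex < 0) return null;
  const opener = lines[openIndex] ?? "";
  return {
    before: lines.slice(0, openIndex).join("\n"), // rendered as normal markdown
    lang: opener.replace(FENCE_LINE, "").trim(),
    code: lines.slice(openIndex + 1).join("\n"), // shown as plain text for now
  };
}

// The edge case from the review: a completed fence, then an open one.
const parsed = parseOpenFence("```ts\ncode\n```\ntext\n```ts\npartial");
console.log(parsed);
```

With a single `stepFence`, a later spec-compliance change (for example, comparing fence lengths) would only need to be made in one place.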

Smooth the reveal of streamed tokens on top of the block-split renderer. useSmoothedText eases the displayed prefix toward the accumulated text via requestAnimationFrame with a step proportional to the backlog, so it tracks a fast token stream (catching up in ~6 frames) instead of a fixed chars/sec that lags behind and then snaps. It reveals on word boundaries so partial markdown tokens never flash, and shows completed (non-streaming) messages in full immediately. Pairs with the block-split renderer: only the growing tail re-parses each frame, so the reveal holds ~60fps even on long messages — which a fixed-rate reveal over a full markdown re-parse can't. Generated-By: PostHog Code Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca

Resolve AgentMessage.tsx: keep main's Box className (drops py-1) together with the streaming changes (StreamingMarkdown + adaptive reveal). Generated-By: PostHog Code Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca

charlesvien · 2026-06-18T22:42:17Z

Hey @KarloAldrete, the streaming in #2685 is much smoother and aligned with the expectation I had for how this feels. Any way to adjust that here?

Thanks for your contribution!


…review) @charlesvien preferred the steadier reveal in PostHog#2685. Switch useSmoothedText from an adaptive (backlog-proportional) rate to a constant ~120 chars/sec, character by character (dropping the word-boundary stepping), and honor prefers-reduced-motion — so the cadence reads as even typing. The block-split renderer is unchanged and still keeps the per-token markdown cost flat on long messages. Generated-By: PostHog Code Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca

KarloAldrete · 2026-06-18T23:04:30Z

Thanks @charlesvien! Switched the reveal in c51416c to match #2685's cadence — a constant ~120 chars/sec, character by character (dropped the adaptive, backlog-proportional stepping and the word-boundary snapping that made ours feel jumpier), and it now honors prefers-reduced-motion too. Should feel like #2685's now.

The block-split renderer is unchanged, so the steady reveal stays ~60fps even on long messages (only the growing tail re-parses each frame). Let me know how it feels!
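As an illustration of the constant-rate reveal described in this comment, the following is a minimal sketch of a pure step function. `nextRevealLength` is named in the PR's test list, but this signature, the fractional `shown` value and the `visiblePrefix` helper are assumptions; in a hook, the step would presumably be driven from `requestAnimationFrame` with the reduced-motion flag read from `window.matchMedia("(prefers-reduced-motion: reduce)")`.

```typescript
// Hypothetical sketch of a constant-rate reveal step; not the PR's code.
// `shown` may be fractional so that short frames do not lose progress.
function nextRevealLength(
  shown: number,
  target: number,
  dtMs: number,
  reducedMotion = false,
  charsPerSecond = 120,
): number {
  // Reduced motion, already caught up, or text replaced by something shorter:
  // snap straight to the target length.
  if (reducedMotion || shown >= target) return target;
  return Math.min(target, shown + (dtMs * charsPerSecond) / 1000);
}

// The prefix to paint for a given (possibly fractional) reveal length.
function visiblePrefix(text: string, shown: number): string {
  return text.slice(0, Math.floor(shown));
}

// Simulate one second of 25 ms frames against a long accumulated message.
const message = "x".repeat(1000);
let shown = 0;
for (let frame = 0; frame < 40; frame++) {
  shown = nextRevealLength(shown, message.length, 25);
}
console.log(visiblePrefix(message, shown).length);
```

Keeping the step pure makes the cadence testable without timers, which matches how the PR describes testing `nextRevealLength` separately from the hook.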

charlesvien · 2026-06-19T00:46:01Z

> Thanks @charlesvien! Switched the reveal in c51416c to match #2685's cadence: a constant ~120 chars/sec, character by character (dropped the adaptive, backlog-proportional stepping and the word-boundary snapping that made ours feel jumpier), and it now honors prefers-reduced-motion too. Should feel like #2685's now.
>
> The block-split renderer is unchanged, so the steady reveal stays ~60fps even on long messages (only the growing tail re-parses each frame). Let me know how it feels!

Is the performance still decent from your testing?

KarloAldrete · 2026-06-19T01:05:55Z

Yep — perf is the whole point of this PR, and the cadence change didn't touch it (the reveal rate and the block-split renderer are independent). Two angles:

1. Measured in real Chromium (Playwright, client-rendered into a real DOM, on a Ryzen 9 5900X) — total main-thread CPU to stream a full message token by token:

| message length | full re-parse (main) | block-split (this PR) | speedup |
|---|---|---|---|
| 1,000 chars | 46 ms | 18 ms | 2.6× |
| 2,000 chars | 130 ms | 25 ms | 5.2× |
| 4,000 chars | 417 ms | 43 ms | 9.7× |
| 8,000 chars | 1,361 ms | 81 ms | 17× |

Full re-parse is ~O(n²) (re-renders the whole message every token); block-split is ~O(n) (only the growing tail re-parses), so the longer the answer the bigger the win — an 8k-char answer drops from ~1.4 s of main-thread CPU to ~80 ms. Default components here, so it's a lower bound (the app also re-runs syntax highlighting per token, which block-split avoids too).

2. I've been dogfooding it as my daily driver. I'm in a single conversation that's ~370K tokens deep right now and it still hasn't stuttered or felt janky — the perf issues I used to hit are gone. That's honestly why I kept iterating on it.

And if the cadence still doesn't feel right to you, I'm happy to keep tuning and digging into it — just say the word. Thanks for the thoughtful review!
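The ~O(n²) versus ~O(n) scaling claimed for the benchmark above can be illustrated with a toy cost model. This is an assumption-laden sketch, not the benchmark itself: it counts re-parsed characters rather than milliseconds, treats every token as the same size, and assumes blocks of a fixed length. The function name and parameters are hypothetical.

```typescript
// Toy cost model (illustrative assumption, not the PR's benchmark):
// the cost of one token is the number of characters that get re-parsed.
function streamingParseCost(
  totalChars: number,
  tokenChars: number,
  blockChars: number,
): { fullReparse: number; blockSplit: number } {
  let fullReparse = 0;
  let blockSplit = 0;
  for (let len = tokenChars; len <= totalChars; len += tokenChars) {
    // Full re-parse touches the whole accumulated message on every token.
    fullReparse += len;
    // Block-split touches only the growing tail block.
    const tail = len % blockChars;
    blockSplit += tail === 0 ? blockChars : tail;
  }
  return { fullReparse, blockSplit };
}

const shortMsg = streamingParseCost(1000, 10, 200);
const longMsg = streamingParseCost(2000, 10, 200);
console.log(shortMsg); // { fullReparse: 50500, blockSplit: 10500 }
console.log(longMsg); // { fullReparse: 201000, blockSplit: 21000 }
```

In this model, doubling the message length roughly quadruples the full re-parse cost but only doubles the block-split cost, which is the same shape as the widening speedup column in the table.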


KarloAldrete changed the title ~~perf(markdown): re-parse only the tail block while streaming~~ feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown) Jun 17, 2026

Merge origin/main into perf/streaming-markdown-blocks

612fedf


charlesvien mentioned this pull request Jun 18, 2026

feat(ui): smooth streaming of agent message tokens #2685

Closed

2 tasks

KarloAldrete wants to merge 5 commits into PostHog:main from KarloAldrete:perf/streaming-markdown-blocks
