Skip to content

feat: add perf memory diagnostics#759

Merged
thymikee merged 4 commits into
mainfrom
feat/perf-memory
Jun 11, 2026
Merged

feat: add perf memory diagnostics#759
thymikee merged 4 commits into
mainfrom
feat/perf-memory

Conversation

@thymikee

@thymikee thymikee commented Jun 11, 2026

Copy link
Copy Markdown
Member

Summary

Adds perf memory under the consolidated perf family for token-efficient memory diagnostics and artifact escalation.

Before, agents had to choose between broad perf metrics payloads or raw platform output such as dumpsys meminfo, am dumpheap, and leaks, which is noisy, platform-specific, easy to over-fetch, and risky to paste into model context. After this change, agents get a compact first-pass memory view and can escalate to heap/memgraph files without dumping large artifacts into CLI/MCP/default JSON output.

Examples:

agent-device perf memory sample --json
agent-device perf memory snapshot --kind android-hprof --out app.hprof
agent-device perf memory snapshot --kind memgraph --out app.memgraph

Compact sample shape:

{
  "metrics": {
    "memory": {
      "available": true,
      "totalPssKb": 562958,
      "totalRssKb": 570304,
      "topConsumers": [{ "name": "Dalvik Heap", "pssKb": 213456 }]
    }
  }
}

Default snapshot output:

Memory artifact (android-hprof): /tmp/app.hprof (42MB)

Why this is helpful:

  • perf memory sample keeps arrays bounded, reports top offenders compactly, preserves the same memory source as perf metrics, and omits unrelated startup/CPU/frame data.
  • perf memory snapshot returns path, size, kind, method, and support metadata while heap/memgraph data stays as a file for external inspection.
  • Unsupported platforms return artifact.available=false with stable reason/hint instead of pretending a heap or memgraph was captured.
  • Android HPROF capture resolves package/PID, cleans remote paths, removes partial local artifacts on failed pulls/empty files, and gives debuggable/profileable hints.
  • Apple memgraph capture has a dedicated longer leaks --outputGraph timeout, cleans partial local artifacts on nonzero/timeout/empty files, and reports physical iOS or missing simulator process tools as unavailable.
  • --kind now uses one perf-kind model for xctrace, simpleperf, perfetto, android-hprof, and memgraph; perf memory snapshot deliberately narrows that to android-hprof|memgraph.
  • Perf-style --kind values no longer trigger deferred Metro config or confusing Metro errors.

Refs #698. Parent context: #694.

Docs/skills impact: updated versioned CLI help, website command/debugging/client docs, and SkillGym planning guidance. Agents can discover this through agent-device help, where help debugging now calls out perf memory, and agent-device help debugging includes commands, example output, and guidance on why this is better than raw memory dumps. skills/**/SKILL.md was not updated; repo guidance routes command behavior through versioned CLI help.

Support matrix:

Touched-file count: 35. Scope spans perf command contracts, CLI/client/MCP metadata, daemon routing, Android/Apple platform helpers, provider scenarios, docs, and SkillGym guidance because this command crosses the public command surface and artifact lifecycle.

Residual risk: iOS simulator memgraph depends on runtime-provided process tools (ps/leaks) and macOS inspection permissions; unsupported environments now return explicit unavailable metadata.

Validation

Latest branch validation:

  • pnpm exec vitest run src/utils/__tests__/args.test.ts (111 tests passed)
  • pnpm format
  • pnpm check:quick
  • git diff --check

Review follow-up validation:

  • Rebasing: git rebase origin/main succeeded against 3780ca6ed.
  • Focused Vitest: src/__tests__/cli-perf.test.ts, src/__tests__/client.test.ts, src/utils/__tests__/perf-args.test.ts, src/daemon/handlers/__tests__/session-observability.test.ts, src/platforms/android/__tests__/perf.test.ts, src/platforms/ios/__tests__/perf.test.ts, test/integration/provider-scenarios/android-lifecycle.test.ts, src/__tests__/remote-connection.test.ts (8 files / 106 tests passed)
  • pnpm check:unit (247 unit files / 2381 tests passed; 8 smoke tests passed)
  • pnpm build

Earlier manual E2E evidence:

  • Android emulator: Pixel 9 Pro XL API 37, serial emulator-5554.
  • Android package com.justforfun.neoncity, session perf-memory-android:
    • perf metrics --json succeeded with memory.totalPssKb=571406, memory.totalRssKb=578720, and bounded top consumers including Native Heap.
    • perf memory sample --json succeeded with only metrics.memory plus memory/snapshot sampling metadata; sample included totalPssKb=562958, totalRssKb=570304, 5 top consumers.
    • perf memory snapshot --kind android-hprof --out /private/tmp/agent-device-neoncity.hprof reached am dumpheap and failed with SecurityException: Process not debuggable, returning hint: use a debuggable/profileable build.
  • Android package host.exp.exponent was also tried for HPROF and returned the same non-debuggable process failure. No local HPROF artifact was created in this environment.
  • iOS simulator: iPhone 17, UDID D74E0B66-57EB-4EC1-92DC-DA0A30581FE7.
  • iOS bundle com.callstack.agentdevicelab, session perf-memory-ios:
    • perf metrics --json succeeded for startup; simulator CPU/memory metric path reported simctl spawn ps unavailable (No such file or directory).
    • perf memory sample --json returned only metrics.memory with the same unavailable reason.
    • perf memory snapshot --kind memgraph --out /private/tmp/agent-device-ios.memgraph returned success with artifact.available=false, reason that this simulator runtime did not provide ps, and a hint to retry on a simulator runtime with process tools.

SkillGym: pnpm test:skillgym:case perf-memory-diagnostics is blocked in this sandbox by external-runner data-export policy. The first guarded run reported sandbox network disabled; the escalated run was denied because it would send workspace-derived prompts to external Codex/Claude runners.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-11 12:58 UTC

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

Size Report

Metric Base Current Diff
JS raw 1.2 MB 1.2 MB +16.6 kB
JS gzip 390.8 kB 395.3 kB +4.5 kB
npm tarball 504.6 kB 509.3 kB +4.7 kB
npm unpacked 1.7 MB 1.7 MB +16.9 kB

Startup median (7 runs, lower is better):

Scenario Base Current Diff
CLI --version 25.6 ms 27.6 ms +2.0 ms
CLI --help 41.0 ms 42.8 ms +1.8 ms

Top changed chunks:

Chunk Raw diff Gzip diff
dist/src/session.js +13.1 kB +3.4 kB
dist/src/args.js +1.7 kB +625 B
dist/src/8173.js +571 B +153 B
dist/src/1352.js +518 B +145 B
dist/src/cli.js +368 B +64 B

Copy link
Copy Markdown
Member Author

Code review

Verdict: minor issues — implementation is solid and well-tested, but one likely real-world failure (Apple memgraph timeout) and very high merge-conflict risk with the sibling perf PRs.

Findings

  1. Majorsrc/platforms/ios/perf.ts:175-187: leaks --outputGraph runs with APPLE_PERF_TIMEOUT_MS (15 s), while the Android heap dump path gets a dedicated 120 s timeout; leaks memgraph generation on non-trivial apps routinely exceeds 15 s, so this path will time out in practice and deserves its own snapshot timeout like Android's ANDROID_HEAP_DUMP_TIMEOUT_MS.

  2. Major (cross-PR design conflict)src/utils/cli-flags.ts:387-403: this PR repurposes the global --kind flag (formerly metroKind) into a shared enum (auto|react-native|expo|android-hprof|memgraph) and renames metro's to --metro-kind, while feat: add Android native perf profiling #757 instead adds a separate perfKind key and keeps --kind bound to metroKind, and feat: add Apple xctrace perf profiling #755 expects --kind xctrace. Three incompatible designs of the same flag; whichever merges second conflicts both textually and semantically — the enum list and validation must be reconciled deliberately.

  3. Minorsrc/cli/commands/connection-runtime.ts:306-311: hasDeferredMetroConfig now treats any flags.kind as deferred Metro config; an open against a remote config with a perf-style --kind android-hprof routes into prepareConnectedMetro and throws the metro error message — a confusing failure mode created by sharing the flag across subsystems.

  4. Minorsrc/platforms/android/perf.ts:134-160 and src/platforms/ios/perf.ts:189-210: on pull failure, empty-artifact failure, or leaks nonzero exit, a partial/empty local file at outPath is left behind (remote cleanup is correctly in finally; local is not) — with timestamped naming each retry accumulates junk in the session artifacts dir.

  5. Minorsrc/platforms/android/perf.ts:109,254-271: the PID is resolved via pidof (first token, arbitrary order for multi-process apps with :remote services) but never used for capture — am dumpheap gets the package name — so the reported pid metadata can name a different process than the one actually dumped.

  6. Minorsrc/platforms/android/perf.ts:153: the stat.size > 0 check doesn't detect truncated dumps; on older Android, am dumpheap can return before the app finishes writing, so adb pull may capture a truncated HPROF that passes the check. Polling for a stable remote size (or validating the HPROF header) would harden this.

  7. Minorsrc/daemon/handlers/session-perf.ts:155-162: perf memory snapshot on an unsupported platform returns metrics.memory.{available:false} instead of the documented artifact shape — the same command returns two different JSON payload shapes depending on platform.

  8. Minor (tests) — no tests for the Android am dumpheap/adb pull failure paths (hint mapping and remote rm -f cleanup unverified), no Apple leaks nonzero-exit hint test, and the new daemon-side flag rejections in session-observability.ts:112-127 are untested.

Verified clean

No command injection — host-side adb/xcrun/leaks invocations are array-arg, the remote heap path is sanitized, --out passes as an execFile arg. The package name interpolated into the on-device shell comes from the session's own appBundleId and matches the pre-existing dumpsys meminfo pattern.

Overall

Well-structured, reuses the perf scaffolding cleanly, good happy-path plus several failure-path tests. The biggest code concern is the 15 s leaks timeout; everything else is robustness polish. Merge-conflict risk with #755/#757 is severe (all three rewrite contracts/perf.ts, cli-grammar/observability.ts, cli-flags.ts, metadata, docs) — these need an agreed merge order and a deliberate reconciliation of the perf enums and --kind design.


Generated by Claude Code

@thymikee

Copy link
Copy Markdown
Member Author

Review follow-up pushed in 3f96beedd.

Addressed:

  • Gave Apple memgraph capture its own 120s timeout instead of sharing the 15s metrics timeout.
  • Introduced a shared perf kind model for xctrace, simpleperf, perfetto, android-hprof, and memgraph, while keeping perf memory snapshot narrowed to android-hprof|memgraph.
  • Stopped deferred Metro config from reading generic --kind; it now uses explicit metroKind, so perf-style kinds do not trigger Metro prep/errors.
  • Cleaned partial local heap/memgraph artifacts on Android pull/empty failures and Apple leaks nonzero/timeout/empty failures.
  • Normalized unsupported perf memory snapshot responses to compact artifact.available=false payloads with support metadata.
  • Added focused daemon, CLI, Android, Apple, and remote-connection tests for those contracts.

Validation:

  • Rebasing: git rebase origin/main succeeded against 3780ca6ed.
  • pnpm check:quick
  • Focused Vitest: src/__tests__/cli-perf.test.ts, src/__tests__/client.test.ts, src/utils/__tests__/perf-args.test.ts, src/daemon/handlers/__tests__/session-observability.test.ts, src/platforms/android/__tests__/perf.test.ts, src/platforms/ios/__tests__/perf.test.ts, test/integration/provider-scenarios/android-lifecycle.test.ts, src/__tests__/remote-connection.test.ts (8 files / 106 tests passed)
  • pnpm check:unit (247 unit files / 2381 tests passed; 8 smoke tests passed)
  • pnpm format
  • pnpm build
  • git diff --check

Sibling PR reconciliation: no base PR is required. The branch now exposes one shared perf-kind enum for the expected trace/profile/artifact backends and keeps Metro-specific kind handling on metroKind, so #755/#757-style perf kind additions should compose without relying on unmerged code.

@thymikee thymikee merged commit a35c444 into main Jun 11, 2026
20 checks passed
@thymikee thymikee deleted the feat/perf-memory branch June 11, 2026 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant