[pocl] Back `@private` Scratchpad with GPUCompiler.alloca by vchuravy · Pull Request #714 · JuliaGPU/KernelAbstractions.jl

vchuravy · 2026-06-22T15:38:58Z

Summary

Replaces the POCL back-end's MArray-backed @private scratchpad with a direct per-workitem stack allocation via GPUCompiler.alloca. The returned Ptr is wrapped in a CLDeviceArray over OpenCL "Function" storage (LLVM addrspace 0), which is where the SPIR-V target places allocas.

@device_override @inline function KA.Scratchpad(ctx, ::Type{T}, ::Val{Dims}) where {T, Dims}
    ptr = POCL.GPUCompiler.alloca(T, Val(prod(Dims)))
    CLDeviceArray(Dims, reinterpret(POCL.LLVMPtr{T, POCL.AS.Function}, ptr))
end

This drops the StaticArrays dependency from the POCL back-end (StaticArrays is still used by the CPU back-end).

Why

GPUCompiler.alloca emits a real entry-block alloca that the optimizer can promote, in the target's alloca address space — avoiding the unsoundness of llvmcall + alloca and the overhead/semantics of MArray. See the motivation in the companion GPUCompiler PR.

Alignment

The alloca is aligned to Base.datatype_alignment(T), which is exactly the alignment CLDeviceArray uses for its element loads/stores (alignment(::CLDeviceArray{T})), so accesses are consistent. isbits-union element types are intentionally unsupported (GPUCompiler.alloca guards on isbitstype(T)).

Status

Draft — depends on JuliaGPU/GPUCompiler.jl#859 (adds the alloca intrinsic). Project.toml compat is bumped to GPUCompiler = "1.23"; this can be un-drafted once that is merged and released.

Testing

Verified end-to-end against the local GPUCompiler branch: @private Float32 (4,) lowers to alloca [16 x i8], align 4 in addrspace 0 with no surviving julia.gpu.alloca, and the kernel runs correctly on the POCL CPU device.
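
The `[16 x i8], align 4` figure can be cross-checked on the host with plain Base Julia. This is only a sketch: `scratchpad_layout` is a hypothetical helper written for this illustration, not part of KernelAbstractions or GPUCompiler, and it assumes the allocation is sized as `sizeof(T) * prod(Dims)` bytes and aligned to `Base.datatype_alignment(T)`, as described above.

```julia
# Hypothetical helper for illustration only; not an API of KernelAbstractions
# or GPUCompiler. It mirrors the sizing and alignment rules described in the PR.
function scratchpad_layout(::Type{T}, dims::Tuple{Vararg{Int}}) where {T}
    # The PR states that GPUCompiler.alloca guards on isbitstype(T),
    # so isbits-union element types are rejected here as well.
    isbitstype(T) || throw(ArgumentError("element type $T is not isbits"))
    return (bytes = sizeof(T) * prod(dims), align = Base.datatype_alignment(T))
end

# `@private Float32 (4,)` corresponds to T = Float32 and Dims = (4,).
layout = scratchpad_layout(Float32, (4,))
println(layout)
```

For `Float32` and `(4,)` this gives 16 bytes at alignment 4, which agrees with the lowered IR quoted in the Testing section. A union such as `Union{Int32, Nothing}` fails the `isbitstype` check, consistent with isbits-union element types being unsupported.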

🤖 Generated with Claude Code

Back the POCL `Scratchpad` (`@private`) with `GPUCompiler.alloca`, a direct per-workitem stack allocation, instead of a StaticArrays `MArray`. The returned `Ptr` is wrapped in a `CLDeviceArray` over OpenCL "Function" storage (LLVM addrspace 0), where the SPIR-V target places allocas. Its alignment (`Base.datatype_alignment(T)`) matches `CLDeviceArray`'s element accesses.

Requires GPUCompiler 1.23 (JuliaGPU/GPUCompiler.jl#859), which adds the `alloca` intrinsic. Drops the now-unused StaticArrays import from the POCL back-end (StaticArrays is still used by the CPU back-end).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codecov · 2026-06-22T21:37:26Z

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (a8022b2) to head (0b58f9b).

Files with missing lines	Patch %	Lines
src/pocl/backend.jl	0.00%	2 Missing ⚠️

❗ A different number of reports was uploaded for BASE (a8022b2) and HEAD (0b58f9b): HEAD has 28 fewer uploads than BASE (20 vs. 48).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #714       +/-   ##
==========================================
- Coverage   62.51%   0.00%   -62.52%     
==========================================
  Files          23      22        -1     
  Lines        1926    1737      -189     
==========================================
- Hits         1204       0     -1204     
- Misses        722    1737     +1015


vchuravy and others added 2 commits June 22, 2026 17:28

remove StaticArrays

c33f8bb

vchuravy wants to merge 2 commits into main from vc/alloca_intrinsic
