Agentic Memory Management for GPU Code Generation

A post contributed to the AI-Driven Research for Systems (ADRS) blog series from the Berkeley Sky Computing Lab

Written by

Yahya Emara

Mohamed Abdelfattah

Ali Tehrani

Published on

May 12, 2026

This is a contributed post in the blog series run by Berkeley's Sky Computing Lab.

In this post, we examine how to balance memory against other knowledge sources for GPU kernel generation agents. Memory helps these agents only when it saves more search than it costs in context. Search discovers useful coding patterns; memory keeps the agent from rediscovering them.

Read the full post on the ADRS website: https://ucbskyadrs.github.io/blog/makora/