Building Persistent Agent-Native Memory with Memori and OpenAI

These articles are AI-generated summaries. Please check the original sources for full details.

A Coding Implementation to Build Agent-Native Memory Infrastructure with Memori for Persistent Multi-User and Multi-Session LLM Applications

The Memori infrastructure layer enables LLM agents to retain useful context across disparate interactions rather than treating conversations in isolation. This tutorial demonstrates the implementation of a memory layer that supports synchronous and asynchronous OpenAI clients for multi-tenant applications.

Why This Matters

Standard LLM implementations often lose state between sessions, forcing developers to manage context windows and database lookups by hand. That overhead adds latency and cost, and the user experience degrades when agents forget earlier exchanges. Memori addresses this with an agent-native memory layer that automatically intercepts chat completion calls to handle attribution and retrieval. The same mechanism supports multi-agent systems in which a single user interacts with different personas, such as a fitness coach or a meal planner, without data leaking between roles.
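To make the idea of interception concrete, here is a minimal sketch of the general pattern: a wrapper around a completion function that injects remembered facts before the call and records new user turns after it. This is not Memori's implementation. `MemoryInterceptor`, `wrap`, and `fake_create` are hypothetical names invented for illustration, and the "model" is an offline stub so the example runs without an API key.

```python
from typing import Callable


class MemoryInterceptor:
    """Toy stand-in for a memory layer; not Memori's actual implementation."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def wrap(self, create: Callable[..., str]) -> Callable[..., str]:
        """Return a drop-in replacement for `create` that adds recall and capture."""

        def wrapped(*, model: str, messages: list[dict]) -> str:
            enriched = list(messages)
            if self.facts:
                # Retrieval: prepend remembered facts as a system message.
                recall = "Known about this user: " + " | ".join(self.facts)
                enriched.insert(0, {"role": "system", "content": recall})
            reply = create(model=model, messages=enriched)
            # Capture: store user turns from the original, un-enriched messages.
            self.facts.extend(m["content"] for m in messages if m["role"] == "user")
            return reply

        return wrapped


def fake_create(*, model: str, messages: list[dict]) -> str:
    """Hypothetical model call that only reports how many messages it received."""
    return f"{model} saw {len(messages)} message(s)"


memory = MemoryInterceptor()
create = memory.wrap(fake_create)
first = create(model="demo", messages=[{"role": "user", "content": "I run 5K in 30 minutes."}])
second = create(model="demo", messages=[{"role": "user", "content": "What is my 5K time?"}])
```

The caller's code does not change between the two calls; the second call simply arrives at the model with one extra system message carrying the earlier fact.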

Key Insights

  • Multi-tenant isolation via entity_id allows distinct user data to remain scoped and secure within the same application infrastructure.
  • Agent persona separation using process_id enables a single user to maintain different context sets for specific roles like fitness coaches or meal planners.
  • Session management through set_session() groups related turns, such as technical project decisions, while excluding unrelated personal data from the active context.
  • Automatic interception by registering OpenAI clients with Memori ensures every model call is enriched with historical context without manual prompt engineering.
  • Native support for streaming and asynchronous calls allows memory persistence in high-performance, real-time agent workflows.
  • Memori provides a Bring Your Own Database (BYODB) option to point agent memory at private Postgres instances for enterprise data control.
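The scoping rules in the list above can be modeled with a small in-memory store. This is a conceptual sketch only: `ScopedMemory`, `remember`, and `recall` are hypothetical names, the user identifiers are made up, and a real memory layer would persist to a database rather than a dictionary. It shows how keying on (entity_id, process_id) isolates users and personas, with an optional session filter on top.

```python
from collections import defaultdict
from typing import Optional


class ScopedMemory:
    """Toy store illustrating entity, process, and session scoping."""

    def __init__(self) -> None:
        # Maps (entity_id, process_id) to a list of (session_id, fact) rows.
        self._store: dict = defaultdict(list)

    def remember(self, entity_id: str, process_id: str, session_id: str, fact: str) -> None:
        self._store[(entity_id, process_id)].append((session_id, fact))

    def recall(self, entity_id: str, process_id: str,
               session_id: Optional[str] = None) -> list[str]:
        # .get avoids creating empty entries for scopes that were never written.
        rows = self._store.get((entity_id, process_id), [])
        return [fact for sid, fact in rows if session_id is None or sid == session_id]


store = ScopedMemory()
store.remember("user-a", "fitness-coach", "s1", "Goal: sub-25-minute 5K by June")
store.remember("user-a", "meal-planner", "s1", "Prefers low-carb dinners on weekdays")
store.remember("user-b", "fitness-coach", "s2", "Training for a marathon")

coach_view = store.recall("user-a", "fitness-coach")
planner_view = store.recall("user-a", "meal-planner")
other_session = store.recall("user-a", "fitness-coach", session_id="s9")
```

Here `coach_view` contains only the running goal, `planner_view` only the dinner preference, and `other_session` is empty because nothing was stored under that session.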

Working Examples

Setup and registration of Memori with synchronous and asynchronous OpenAI clients.

from memori import Memori
from openai import OpenAI, AsyncOpenAI

client = OpenAI()
async_client = AsyncOpenAI()

# Register both clients so Memori can intercept their chat completion calls.
mem = Memori()
mem.llm.register(client)
mem.llm.register(async_client)

MODEL = "gpt-4o-mini"

def ask(prompt, system=None):
    msgs = []
    if system:
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": prompt})
    r = client.chat.completions.create(model=MODEL, messages=msgs)
    return r.choices[0].message.content
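The ask() helper above only exercises the synchronous client. An asynchronous counterpart might look like the sketch below, which assumes the caller passes in the registered AsyncOpenAI instance. `ask_async` and `make_stub_client` are hypothetical names; the stub mimics the response shape so the example runs offline, and in real use you would pass `async_client` instead.

```python
import asyncio
from types import SimpleNamespace


async def ask_async(client, prompt, system=None, model="gpt-4o-mini"):
    """Async twin of ask(); `client` is expected to behave like AsyncOpenAI."""
    msgs = []
    if system:
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": prompt})
    r = await client.chat.completions.create(model=model, messages=msgs)
    return r.choices[0].message.content


def make_stub_client():
    """Hypothetical offline stand-in that mirrors the response shape used above."""

    async def create(model, messages):
        text = f"echo: {messages[-1]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


async def main():
    stub = make_stub_client()
    # Independent questions can be awaited concurrently.
    return await asyncio.gather(
        ask_async(stub, "first question"),
        ask_async(stub, "second question", system="Be brief."),
    )


answers = asyncio.run(main())
```

Taking the client as a parameter keeps the helper testable and lets the same function serve several registered clients.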

Example of multi-persona memory isolation for the same user using process_id.

# Fitness-coach persona stores the running goal.
mem.attribution(entity_id="[email protected]", process_id="fitness-coach")
ask("Goal: sub-25-minute 5K by June. Currently I run 30 minutes flat.")

# Meal-planner persona for the same user keeps a separate memory scope.
mem.attribution(entity_id="[email protected]", process_id="meal-planner")
ask("Prefer low-carb dinners on weekdays.")

# Switching back to the fitness coach should recall only the running goal.
mem.attribution(entity_id="[email protected]", process_id="fitness-coach")
print(ask("Remind me of my running goal."))

Practical Applications

  • Customer Support Bots: Use entity_id to remember billing plans and email records across different support tickets to avoid repetitive user input. Pitfall: Failing to reset sessions can lead to the agent referencing outdated subscription data from a closed ticket.
  • Personal Assistant Frameworks: Implement process_id to separate work-related project notes from personal health goals. Pitfall: Mixing process IDs can result in context leakage where the agent suggests dinner recipes during a technical code review.
  • Multi-Agent Systems: Group project-specific conversations using session IDs so developers can context-switch between software repositories without mixing technical constraints. Pitfall: In high-frequency loops, reading memory before the previous write has persisted can cause race conditions; allow a write delay (e.g., 6 seconds) between turns.
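The write-delay pitfall in the last item can be illustrated without any memory library. The sketch below simulates a background write that lands shortly after the call returns, then shows a read that arrives too early and a bounded polling wait as an alternative to a fixed sleep. `wait_until` is a hypothetical helper, and the predicate you would poll in a real application depends on what your storage backend exposes, which is an assumption here.

```python
import threading
import time


def wait_until(predicate, timeout=6.0, interval=0.05):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()


saved = []
# Simulated background persistence that lands 0.2 s after the model call.
writer = threading.Timer(0.2, lambda: saved.append("Goal: sub-25-minute 5K"))
writer.start()

read_too_early = list(saved)  # snapshot taken before the write lands
ready = wait_until(lambda: bool(saved), timeout=2.0)
writer.join()
```

A fixed delay is simpler but always pays the full wait; polling with a timeout returns as soon as the write is visible and still fails safely if it never arrives.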
