Building Persistent Agent-Native Memory with Memori and OpenAI

A Coding Implementation to Build Agent-Native Memory Infrastructure with Memori for Persistent Multi-User and Multi-Session LLM Applications

The Memori infrastructure layer lets LLM agents retain useful context across separate interactions rather than treating each conversation in isolation. This tutorial implements a memory layer that works with both synchronous and asynchronous OpenAI clients in multi-tenant applications.

Why This Matters

Standard LLM integrations lose state between sessions, so developers end up managing context windows and database lookups by hand. That overhead adds latency and cost, and the user experience degrades when an agent forgets earlier history. Memori addresses this with an agent-native memory layer that intercepts chat completion calls and handles attribution and retrieval automatically. This supports multi-agent systems in which a single user interacts with different personas, such as a fitness coach or a meal planner, without data leaking between roles.
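
The interception idea can be pictured with a small stand-in. The sketch below is not Memori's implementation: `fake_completion`, `MemoryWrapper`, and the "Known context" format are hypothetical names invented for illustration. It wraps a completion function so that each call first prepends remembered user turns as a system message and then records the new turn.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def fake_completion(messages: List[Message]) -> str:
    """Stand-in for a chat completion call: reports how many messages it saw."""
    return f"saw {len(messages)} message(s)"


class MemoryWrapper:
    """Hypothetical wrapper that enriches calls with remembered user turns."""

    def __init__(self, complete: Callable[[List[Message]], str]) -> None:
        self._complete = complete
        self._facts: List[str] = []  # earlier user turns, oldest first

    def create(self, messages: List[Message]) -> str:
        enriched = list(messages)
        if self._facts:
            # Prepend recalled context as a system message.
            recalled = "Known context: " + " | ".join(self._facts)
            enriched.insert(0, {"role": "system", "content": recalled})
        reply = self._complete(enriched)
        # Record the user turns only after the call has returned.
        self._facts.extend(m["content"] for m in messages if m["role"] == "user")
        return reply


wrapped = MemoryWrapper(fake_completion)
first = wrapped.create([{"role": "user", "content": "I run 5K in 30 minutes."}])
second = wrapped.create([{"role": "user", "content": "What is my 5K time?"}])
```

The caller passes a single message both times, yet the second call reaches the model with an extra system message carrying the earlier turn. That is the effect the article attributes to registering a client with Memori.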

Key Insights

Multi-tenant isolation via entity_id allows distinct user data to remain scoped and secure within the same application infrastructure.
Agent persona separation using process_id enables a single user to maintain different context sets for specific roles like fitness coaches or meal planners.
Session management through set_session() groups related turns, such as technical project decisions, while excluding unrelated personal data from the active context.
Automatic interception by registering OpenAI clients with Memori ensures every model call is enriched with historical context without manual prompt engineering.
Native support for streaming and asynchronous calls allows memory persistence in high-performance, real-time agent workflows.
Memori provides a Bring Your Own Database (BYODB) option to point agent memory at private Postgres instances for enterprise data control.
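
The scoping rules above can be illustrated with a toy store. This is a minimal sketch under stated assumptions, not Memori's storage model: `ScopedMemory`, `remember`, `recall`, and the `session_only` flag are hypothetical, and only the method names `attribution` and `set_session` are borrowed from the article. Records are keyed by (entity_id, process_id) and tagged with the active session.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple

Scope = Tuple[str, str]             # (entity_id, process_id)
Record = Tuple[Optional[str], str]  # (session_id, fact)


class ScopedMemory:
    """Toy in-memory store; not Memori's real storage or API."""

    def __init__(self) -> None:
        self._store: DefaultDict[Scope, List[Record]] = defaultdict(list)
        self._scope: Optional[Scope] = None
        self._session: Optional[str] = None

    def attribution(self, entity_id: str, process_id: str) -> None:
        """Select which user and persona later reads and writes belong to."""
        self._scope = (entity_id, process_id)

    def set_session(self, session_id: str) -> None:
        """Tag later writes with a session so related turns can be grouped."""
        self._session = session_id

    def remember(self, fact: str) -> None:
        if self._scope is None:
            raise RuntimeError("call attribution() before remember()")
        self._store[self._scope].append((self._session, fact))

    def recall(self, session_only: bool = False) -> List[str]:
        if self._scope is None:
            raise RuntimeError("call attribution() before recall()")
        records = self._store.get(self._scope, [])
        if session_only:
            records = [r for r in records if r[0] == self._session]
        return [fact for _, fact in records]


toy_mem = ScopedMemory()
toy_mem.attribution("user-1", "fitness-coach")
toy_mem.set_session("training-plan")
toy_mem.remember("Goal: sub-25-minute 5K by June")

toy_mem.attribution("user-1", "meal-planner")
toy_mem.remember("Prefers low-carb dinners on weekdays")

toy_mem.attribution("user-1", "fitness-coach")
coach_view = toy_mem.recall()  # only the fitness-coach fact

toy_mem.set_session("injury-log")
new_session_view = toy_mem.recall(session_only=True)  # nothing written in this session

toy_mem.attribution("user-2", "fitness-coach")
other_user_view = toy_mem.recall()  # a different entity sees nothing
```

Keying on the pair rather than on the user alone is what keeps the meal-planner preference out of the fitness-coach view, even though both belong to the same user.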

Working Examples

Setup and registration of Memori with synchronous and asynchronous OpenAI clients.

from memori import Memori
from openai import OpenAI, AsyncOpenAI

client = OpenAI()
async_client = AsyncOpenAI()

mem = Memori()
mem.llm.register(client)
mem.llm.register(async_client)

MODEL = "gpt-4o-mini"

def ask(prompt, system=None):
    msgs = []
    if system:
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": prompt})
    r = client.chat.completions.create(model=MODEL, messages=msgs)
    return r.choices[0].message.content
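
For streaming and asynchronous calls, a memory layer can only record the assistant turn once the stream has been fully consumed. The sketch below illustrates that ordering with a fake async token stream. `fake_stream`, `stream_and_record`, and the `turn_log` list are hypothetical and do not reflect Memori's internals or the OpenAI SDK's chunk objects.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def fake_stream(text: str) -> AsyncIterator[str]:
    """Yield a reply word by word, mimicking streamed chunks."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word + " "


async def stream_and_record(prompt: str, turn_log: List[Dict[str, str]]) -> str:
    """Consume the whole stream, then persist the complete turn in one step."""
    parts: List[str] = []
    async for chunk in fake_stream("Your goal is a sub-25-minute 5K"):
        parts.append(chunk)  # a real app would also forward each chunk to the UI
    reply = "".join(parts).strip()
    turn_log.append({"user": prompt, "assistant": reply})
    return reply


turn_log: List[Dict[str, str]] = []
reply = asyncio.run(stream_and_record("Remind me of my running goal.", turn_log))
```

Recording after the final chunk means a stream that is abandoned halfway leaves no partial assistant turn in the log, which is one plausible reason to defer the write.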

Example of multi-persona memory isolation for the same user using process_id.

# Same user, fitness-coach persona: store the running goal.
mem.attribution(entity_id="[email protected]", process_id="fitness-coach")
ask("Goal: sub-25-minute 5K by June. Currently I run 30 minutes flat.")

# Same user, meal-planner persona: store a dietary preference.
mem.attribution(entity_id="[email protected]", process_id="meal-planner")
ask("Prefer low-carb dinners on weekdays.")

# Switch back to the fitness coach and ask about the running goal.
mem.attribution(entity_id="[email protected]", process_id="fitness-coach")
print(ask("Remind me of my running goal."))

Practical Applications

Customer Support Bots: Use entity_id to remember billing plans and email records across different support tickets to avoid repetitive user input. Pitfall: Failing to reset sessions can lead to the agent referencing outdated subscription data from a closed ticket.
Personal Assistant Frameworks: Implement process_id to separate work-related project notes from personal health goals. Pitfall: Mixing process IDs can result in context leakage where the agent suggests dinner recipes during a technical code review.
Multi-Agent Systems: Group project-specific conversations using session IDs to ensure developers can context-switch between different software repositories without mixing technical constraints. Pitfall: Not implementing a write delay (e.g., 6 seconds) in high-frequency loops may lead to race conditions in memory persistence.
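
The write-delay pitfall can be reproduced with a toy store whose writes land in the background. This is a sketch under stated assumptions: `SlowStore` and its 0.1-second latency are hypothetical stand-ins, and the short pause plays the role of the article's 6-second delay. Nothing here describes how Memori actually persists memories.

```python
import threading
import time
from typing import List


class SlowStore:
    """Toy store whose writes become visible only after a background delay."""

    def __init__(self, write_latency: float) -> None:
        self._latency = write_latency
        self._items: List[str] = []
        self._lock = threading.Lock()

    def write_async(self, item: str) -> threading.Thread:
        def _commit() -> None:
            time.sleep(self._latency)  # simulated persistence latency
            with self._lock:
                self._items.append(item)

        worker = threading.Thread(target=_commit)
        worker.start()
        return worker

    def read(self) -> List[str]:
        with self._lock:
            return list(self._items)


store = SlowStore(write_latency=0.1)
worker = store.write_async("Repo A targets Python 3.11")
too_early = store.read()    # read issued before the write has landed
time.sleep(0.5)             # fixed pause before reading again
after_pause = store.read()
worker.join()
```

A fixed pause is the simplest guard, but it is only a guess at the latency. Where the persistence layer exposes a handle to wait on, as `worker.join()` does in this toy, waiting on it is more deterministic than sleeping.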

References:

https://www.marktechpost.com/2026/05/11/a-coding-implementation-to-build-agent-native-memory-infrastructure-with-memori-for-persistent-multi-user-and-multi-session-llm-applications/
