
pg-agent-memory

Stateful AI agent memory layer for PostgreSQL with pgvector. TypeScript-first with intelligent context management and zero-cost embeddings.

  • <5ms memory operations
  • Local, zero-cost embeddings
  • Token counting for 5+ model providers
  • 28M/sec ULID generation

The Challenge

AI agents suffer from memory amnesia: they forget everything between conversations. Existing solutions typically suffer from one or more of these problems:

  • Basic conversation storage with no semantic understanding
  • High API costs for embeddings and token counting
  • A separate vector database that adds infrastructure complexity
  • No intelligent compression for large conversation histories
  • No multi-model support across AI providers

The Solution

Built the first TypeScript-native AI memory layer that combines PostgreSQL reliability with intelligent context management:

Local Embeddings

Zero-cost semantic search using local Sentence Transformers with @xenova/transformers

PostgreSQL Native

Uses existing PostgreSQL infrastructure with pgvector for vector similarity search

Multi-Model Support

Universal tokenizer with accurate token counting for OpenAI, Anthropic, DeepSeek, Google, and Meta models

Intelligent Compression

Automatic memory compression with 4 strategies to manage large conversation histories
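
To make the idea concrete, here is a hedged sketch of one such strategy, a sliding token window; the interface and names are hypothetical illustrations, not the library's actual API:

// Hypothetical types for illustration; the published API may differ.
interface StoredMessage {
  id: string;
  content: string;
  tokens: number;
}

interface CompressionStrategy {
  compress(history: StoredMessage[], tokenBudget: number): StoredMessage[];
}

// Sliding window: keep the newest messages that fit the budget,
// dropping the oldest first, then restore chronological order.
const slidingWindow: CompressionStrategy = {
  compress(history, tokenBudget) {
    const kept: StoredMessage[] = [];
    let used = 0;
    for (const msg of [...history].reverse()) {
      if (used + msg.tokens > tokenBudget) break;
      kept.push(msg);
      used += msg.tokens;
    }
    return kept.reverse();
  },
};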

Technical Architecture


┌─────────────────┐    ┌──────────────────┐     ┌─────────────────┐
│   AgentMemory   │────│  EmbeddingService│─────│ @xenova/trans.. │
│  remember()     │    │  generate()      │     │ all-MiniLM-L6-v2│
│  recall()       │    │                  │     │   (384 dims)    │
└─────────────────┘    └──────────────────┘     └─────────────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   PostgreSQL    │────│     pgvector     │────│   TokenCounter  │
│ memories table  │    │ cosine similarity│    │ OpenAI/Claude   │
│ ULID + content  │    │   <=> operator   │    │  DeepSeek/etc   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
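
A minimal usage sketch of the flow above, assuming the remember()/recall() surface named in the diagram; the option shapes and signatures here are illustrative, not the published API:

import { AgentMemory } from 'pg-agent-memory';

// Hypothetical constructor options for illustration.
const memory = new AgentMemory({
  connectionString: process.env.DATABASE_URL!,
  agent: 'support-bot',
});

// remember(): embeds the content locally and writes a row
// (ULID id + content + 384-dim vector) to the memories table.
await memory.remember({
  conversation: 'conv-123',
  role: 'user',
  content: 'My order #4521 arrived damaged.',
});

// recall(): embeds the query and runs a pgvector similarity search.
const related = await memory.recall('What went wrong with the order?', {
  conversation: 'conv-123',
  limit: 5,
});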

Tech Stack

TypeScript · PostgreSQL · pgvector · Sentence Transformers · Node.js · Zod · Vitest · Docker

Results & Impact

Technical Achievements

  • ✓ Sub-5ms memory operations with PostgreSQL indexing
  • ✓ Zero-cost embeddings eliminating API dependencies
  • ✓ Multi-model token counting with provider-specific optimizations
  • ✓ Intelligent compression with 4 distinct strategies

Implementation Quality

  • ✓ Published on NPM with semantic versioning
  • ✓ Docker containerized test environment
  • ✓ Unit and integration test coverage
  • ✓ TypeScript strict mode compilation

Key Technical Decisions

PostgreSQL-First Approach

Leveraged existing PostgreSQL infrastructure with pgvector instead of requiring a separate vector database
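
A sketch of what this looks like at the SQL level, driven from the node-postgres client; the table definition here is an assumption for illustration, not the package's actual schema:

import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Assumed schema: ULID primary key, raw content, and a 384-dimension
// embedding column sized for all-MiniLM-L6-v2 output.
await pool.query(`
  CREATE EXTENSION IF NOT EXISTS vector;
  CREATE TABLE IF NOT EXISTS memories (
    id        TEXT PRIMARY KEY,  -- ULID: lexicographically time-sortable
    content   TEXT NOT NULL,
    embedding vector(384)
  );
`);

// Nearest memories by cosine distance (pgvector's <=> operator).
async function recallNearest(queryEmbedding: number[]) {
  const { rows } = await pool.query(
    `SELECT id, content FROM memories
      ORDER BY embedding <=> $1::vector
      LIMIT 5`,
    [JSON.stringify(queryEmbedding)],
  );
  return rows;
}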

Local Embeddings for Cost Efficiency

Used @xenova/transformers for local embedding generation, eliminating API costs and latency
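
A minimal sketch of zero-cost embedding generation, following @xenova/transformers' documented feature-extraction pipeline (how pg-agent-memory wraps this internally is not shown):

import { pipeline } from '@xenova/transformers';

// The model is downloaded once and cached; inference then runs fully
// locally in Node, so there are no per-call API costs.
const extractor = await pipeline(
  'feature-extraction',
  'Xenova/all-MiniLM-L6-v2',
);

// Mean-pooled, normalised sentence embedding: 384 dimensions.
const output = await extractor('My order arrived damaged.', {
  pooling: 'mean',
  normalize: true,
});

const embedding = Array.from(output.data as Float32Array);
console.log(embedding.length); // 384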

Universal Token Counting

Built provider-specific token counting based on official documentation for accurate cost estimation
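
A hedged sketch of how provider dispatch for token counting can look; the OpenAI branch uses the real js-tiktoken package, while the fallback characters-per-token ratios are illustrative placeholders, not the project's actual provider tables:

import { encodingForModel } from 'js-tiktoken';

type Provider = 'openai' | 'anthropic' | 'deepseek' | 'google' | 'meta';

// Placeholder ratios for illustration only; real counters would apply
// provider-specific rules derived from official documentation.
const approxCharsPerToken: Record<Provider, number> = {
  openai: 4,
  anthropic: 3.8,
  deepseek: 3.6,
  google: 4,
  meta: 3.7,
};

export function countTokens(text: string, provider: Provider): number {
  if (provider === 'openai') {
    // Exact BPE count for cl100k_base models such as gpt-4.
    return encodingForModel('gpt-4').encode(text).length;
  }
  return Math.ceil(text.length / approxCharsPerToken[provider]);
}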

ULID-Based Performance Optimization

Used ULIDs as time-sortable IDs, generated at 28M operations/second, for optimal database performance
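
A short sketch with the ulid package showing the property that matters for the database: IDs sort by creation time, so primary-key inserts append rather than scatter across the index (the 28M/sec figure is the project's own benchmark, not reproduced here):

import { monotonicFactory } from 'ulid';

// Monotonic factory: IDs minted within the same millisecond still
// sort in creation order, keeping insert order aligned with index order.
const nextId = monotonicFactory();

const a = nextId();
const b = nextId();
console.log(a < b); // true: lexicographic order matches creation order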

My Role

As sole architect and maintainer, I:

  • Designed the PostgreSQL schema with pgvector integration for optimal performance
  • Implemented local embedding pipeline using Sentence Transformers
  • Built universal tokenizer supporting 5+ AI providers with accurate token counting
  • Created intelligent compression system with multiple strategies
  • Established comprehensive testing with Docker integration environments
  • Optimized performance, achieving sub-5ms memory operations and 28M ops/sec ID generation