LMDB API Performance

Note

Status: This page is under development.

Performance characteristics and optimization techniques specific to the LMDB API.

Overview

This page focuses on performance aspects of the low-level LMDB API, including:

  • Read and write throughput

  • Transaction overhead

  • Cursor performance

  • Memory usage patterns

For overall zlmdb performance comparisons, see Performance.

Coming Soon

This page will cover:

  1. Read Performance - Lock-free MVCC reads - Memory-mapped I/O benefits - Zero-copy operations - Cursor iteration speed - Benchmarks: reads per second

  2. Write Performance - Transaction overhead - Batch write strategies - Sync vs async modes - fsync impact - Benchmarks: writes per second

  3. Transaction Overhead - Transaction begin/commit cost - Optimal transaction sizes - Long-lived transaction impact - MVCC snapshot management

  4. Cursor Performance - Sequential vs random access - Forward vs reverse iteration - Key positioning strategies - Range query optimization

  5. Memory Usage - Memory-mapped file behavior - Page cache utilization - Working set considerations - Multi-process memory sharing

  6. Optimization Techniques - Batching strategies - Transaction sizing - Database configuration (map_size, max_dbs) - Key design for performance

  7. Platform-Specific Notes - Linux performance characteristics - macOS differences - Windows performance - ARM64 vs x86-64

Benchmark Results

The following results highlight the performance of zLMDB as an embedded, ACID key–value database for Python applications.

All benchmarks were executed using the example scripts examples/lmdb/nastybench.py and examples/lmdb/parabench.py, which exercise both the low–level LMDB API and the higher–level ORM layer.

Note

The numbers below are representative of real hardware tests on modern x86 servers and are intended to show the achievable order of magnitude of performance. Results depend on CPU, memory, storage medium, and dataset size.

Key Metrics

  • Runtime environment: PyPy 3.11 (JIT enabled)

  • Hardware: 32 CPU cores, 100 GB RAM

  • Dataset: 100 GB LMDB environment (entirely memory-resident)

  • Transactions: Read-only workloads distributed across multiple concurrent processes

Observed Performance

Operation

Throughput (approx.)

Random lookups (single process)

≈ 4 million ops / sec

Sequential reads

≈ 6 million ops / sec

Multi-process random reads (32 cores)

> 20 million IOPS

Bulk inserts (append, sequential)

≈ 4 million ops / sec

Random inserts (transactional)

≈ 1–2 million ops / sec

Highlights

  • Zero-copy reads: Data pages are memory-mapped, eliminating intermediate copies between the kernel and Python.

  • True ACID transactions: Single-writer, multi-reader concurrency with crash-safe durability.

  • PyPy acceleration: Under PyPy, inner loops and cursor iterations are JIT-compiled to native code, giving 5–10× faster ORM queries compared to CPython.

  • Scales with cores: Read transactions scale linearly across processes since LMDB supports concurrent readers without locks.

Comparison (Indicative)

System / Library

Typical Read Throughput

Notes

SQLite (memory)

0.3 – 0.8 M ops / s

Single-threaded SQL engine

Redis (localhost)

1 – 2 M ops / s

In-memory cache via TCP

RocksDB (C++)

10 – 20 M ops / s

Compiled LSM-tree engine

zLMDB (PyPy 3.11)

> 20 M ops / s

Pure-Python API, transactional

Summary

Even when accessed through Python, zLMDB achieves performance comparable to native C/C++ databases. When running on PyPy, read-intensive workloads can exceed 20 million random key–value lookups per second on multi-core hardware — making zLMDB one of the fastest fully transactional, embedded databases accessible from Python.

See Also