Architecture

Understanding Bingsan's architecture helps with deployment planning, performance tuning, and troubleshooting.

Overview

Bingsan is a stateless Go application that implements the Apache Iceberg REST Catalog specification. All persistent state is stored in PostgreSQL.

┌─────────────────────────────────────────────────────────────┐
│                      Clients                                │
│         (Spark, Trino, Flink, PyIceberg, etc.)              │
└─────────────────────────┬───────────────────────────────────┘
                          │ REST API (HTTP)
┌─────────────────────────▼───────────────────────────────────┐
│                    Bingsan Cluster                          │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐                      │
│  │ Node 1  │  │ Node 2  │  │ Node N  │  (Stateless)         │
│  │  :8181  │  │  :8181  │  │  :8181  │                      │
│  └────┬────┘  └────┬────┘  └────┬────┘                      │
│       │            │            │                           │
│       └────────────┼────────────┘                           │
│                    │ Distributed Locking                    │
└────────────────────┼────────────────────────────────────────┘
                     │
        ┌────────────┴────────────┐
        │                         │
┌───────▼───────┐        ┌────────▼────────┐
│  PostgreSQL   │        │   S3 / GCS      │
│  (Metadata)   │        │   (Data Lake)   │
└───────────────┘        └─────────────────┘

Key Components

HTTP Server

  • Built on Fiber (fasthttp)
  • High-performance, low-memory HTTP handling
  • Supports HTTP/1.1 with keep-alive

Database Layer

  • PostgreSQL for all metadata storage
  • Connection pooling via pgx/v5
  • Automatic schema migrations
  • Advisory locks for distributed locking
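
PostgreSQL advisory locks are keyed by a 64-bit integer, so a catalog has to map each table identifier to a numeric key. The sketch below shows one possible mapping using an FNV-1a hash. The `advisoryKey` helper and the namespace/table inputs are illustrative assumptions, not taken from Bingsan's source.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// advisoryKey is a hypothetical helper: it maps a namespace and table name
// to the signed 64-bit key that pg_advisory_xact_lock(bigint) expects.
func advisoryKey(namespace, table string) int64 {
	h := fnv.New64a()
	h.Write([]byte(namespace))
	h.Write([]byte{0}) // separator so ("a", "bc") and ("ab", "c") differ
	h.Write([]byte(table))
	return int64(h.Sum64())
}

func main() {
	key := advisoryKey("analytics", "events")
	// The key would be passed as a query parameter, for example:
	//   SELECT pg_advisory_xact_lock($1)
	// which holds the lock until the surrounding transaction ends.
	fmt.Println(key == advisoryKey("analytics", "events")) // same input, same key
}
```

Hash collisions are possible with any such mapping. A collision only causes two unrelated tables to wait on each other, which costs throughput but not correctness.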

Storage Integration

  • Generates storage paths for tables
  • Vends credentials for client data access
  • Supports S3, GCS, and local filesystem
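
As an illustration of path generation, the sketch below derives a table location from a warehouse root, namespace levels, and a table name. The `tableLocation` helper, the `<warehouse>/<namespace>/<table>` layout, and the bucket name are assumptions for the example. Bingsan's actual layout may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// tableLocation is a hypothetical helper that derives a table's base location
// as <warehouse>/<namespace levels>/<table>. String joining is used instead
// of path.Join, which would collapse the "//" in "s3://".
func tableLocation(warehouse string, namespace []string, table string) string {
	parts := append([]string{strings.TrimRight(warehouse, "/")}, namespace...)
	parts = append(parts, table)
	return strings.Join(parts, "/")
}

func main() {
	fmt.Println(tableLocation("s3://lake/warehouse/", []string{"analytics", "web"}, "events"))
}
```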

Event Streaming

  • WebSocket-based real-time events
  • Publish/subscribe model
  • Namespace-level filtering
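
The publish/subscribe model with namespace-level filtering can be sketched with an in-memory broker. This is a simplified illustration, not Bingsan's implementation: the `Event` fields, the `Broker` type, and the drop-on-full policy are all assumptions, and in a real server each subscriber channel would feed a WebSocket connection.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical catalog event; the field names are illustrative.
type Event struct {
	Namespace string
	Type      string
	Table     string
}

// Broker fans events out to subscribers, filtered by namespace.
type Broker struct {
	mu   sync.RWMutex
	subs map[string][]chan Event // namespace -> subscriber channels
}

func NewBroker() *Broker {
	return &Broker{subs: make(map[string][]chan Event)}
}

// Subscribe returns a buffered channel that receives events for one namespace.
func (b *Broker) Subscribe(namespace string, buffer int) <-chan Event {
	ch := make(chan Event, buffer)
	b.mu.Lock()
	b.subs[namespace] = append(b.subs[namespace], ch)
	b.mu.Unlock()
	return ch
}

// Publish delivers the event to matching subscribers and returns how many
// received it. A subscriber whose buffer is full is skipped so that a slow
// consumer cannot block the publisher.
func (b *Broker) Publish(ev Event) int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	delivered := 0
	for _, ch := range b.subs[ev.Namespace] {
		select {
		case ch <- ev:
			delivered++
		default: // buffer full: drop instead of stalling
		}
	}
	return delivered
}

func main() {
	b := NewBroker()
	analytics := b.Subscribe("analytics", 8)
	b.Subscribe("billing", 8)
	n := b.Publish(Event{Namespace: "analytics", Type: "table_created", Table: "events"})
	fmt.Println(n, (<-analytics).Table)
}
```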

Design Principles

Stateless Nodes

Each Bingsan instance is stateless:

  • All state in PostgreSQL
  • No inter-node communication
  • Any node can handle any request
  • Easy horizontal scaling
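
Because any node can handle any request, a client or load balancer is free to spread requests across nodes without session affinity. A minimal round-robin sketch follows. The `NodePicker` type and the host names are hypothetical, and in most deployments a load balancer would play this role.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// NodePicker cycles through catalog endpoints. Because nodes hold no state,
// consecutive requests from one client can go to different nodes.
type NodePicker struct {
	nodes []string
	next  atomic.Uint64
}

// Pick returns the next endpoint in round-robin order and is safe for
// concurrent use.
func (p *NodePicker) Pick() string {
	i := p.next.Add(1) - 1
	return p.nodes[i%uint64(len(p.nodes))]
}

func main() {
	p := &NodePicker{nodes: []string{"http://node1:8181", "http://node2:8181"}}
	for i := 0; i < 3; i++ {
		fmt.Println(p.Pick())
	}
}
```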

Optimistic Concurrency

Table commits use optimistic concurrency control:

  1. Client reads current metadata
  2. Client submits changes with requirements
  3. Server validates requirements against current state
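  4. If valid, changes are applied atomically
  5. If not, the commit is rejected (the Iceberg REST specification uses 409 Conflict for this) and the client can re-read the metadata and retry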
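
The validate-then-apply steps can be sketched with an in-memory table. This is a simplified model under stated assumptions: the `Table`, `Requirement`, and `Catalog` types are hypothetical, the two assertions only loosely mirror the specification's `assert-table-uuid` and `assert-ref-snapshot-id` requirement types, and a mutex stands in for the database transaction a real catalog would use.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrCommitConflict signals that a requirement no longer holds.
var ErrCommitConflict = errors.New("commit conflict: requirement not met")

// Table is a simplified stand-in for table metadata.
type Table struct {
	UUID              string
	CurrentSnapshotID int64
}

// Requirement is an assertion about the table state the client last read.
type Requirement func(t Table) bool

func AssertTableUUID(uuid string) Requirement {
	return func(t Table) bool { return t.UUID == uuid }
}

func AssertSnapshotID(id int64) Requirement {
	return func(t Table) bool { return t.CurrentSnapshotID == id }
}

// Catalog keeps one table in memory for the sake of the example.
type Catalog struct {
	mu    sync.Mutex
	table Table
}

// Commit validates every requirement against the current state and applies
// the update only if all of them hold.
func (c *Catalog) Commit(reqs []Requirement, newSnapshotID int64) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, r := range reqs {
		if !r(c.table) {
			return ErrCommitConflict
		}
	}
	c.table.CurrentSnapshotID = newSnapshotID
	return nil
}

func main() {
	c := &Catalog{table: Table{UUID: "t-1", CurrentSnapshotID: 10}}
	reqs := []Requirement{AssertTableUUID("t-1"), AssertSnapshotID(10)}
	fmt.Println(c.Commit(reqs, 11)) // first writer succeeds
	fmt.Println(c.Commit(reqs, 12)) // second writer used stale state and is rejected
}
```

The second commit fails because its snapshot assertion was made against state that the first commit has already replaced. No lock is held while the client prepares its changes, which is what makes the scheme optimistic.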

Distributed Locking

PostgreSQL row-level locking with configurable timeouts prevents conflicting concurrent modifications to the same row:

  • Row-level locks with SELECT ... FOR UPDATE
  • Configurable lock_timeout per transaction
  • Automatic retry with exponential backoff
  • Handles lock conflicts gracefully

See Distributed Locking for configuration details.

Object Pooling

Memory optimization through buffer reuse:

  • sync.Pool-based buffer pooling
  • Reduces GC pressure under high load
  • Prometheus metrics for pool health

See Object Pooling for implementation details.
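
For readers unfamiliar with the pattern, here is a minimal sketch of `sync.Pool`-based buffer reuse. The `bufPool` variable and `render` function are illustrative names rather than Bingsan's own, and the hand-built JSON skips escaping for brevity.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so hot paths avoid allocating a new
// buffer for every request.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render builds a small response body using a pooled buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold old data
	defer bufPool.Put(buf) // return it for reuse

	buf.WriteString(`{"table":"`)
	buf.WriteString(name)
	buf.WriteString(`"}`)
	return buf.String() // String copies, so the result stays valid after Put
}

func main() {
	fmt.Println(render("events"))
}
```

The pool may discard idle buffers at any garbage collection, so it reduces allocation churn under load without pinning memory when traffic is low.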
