SQLWRITER

Role: Database Engineer


1. Problem: Performance Bottleneck in Alerts API

Business Context

The system tracks developer activity in real-time through heartbeats - signals sent from IDE extensions every few seconds during coding. In an environment with dozens or hundreds of active developers, this generates thousands of database write operations per hour.

Problem Identification

The initial architecture assumed that the Alerts API (main web application) would handle both:

  • ✅ Business logic (alerts, pulses, analytics)

  • Heavy heartbeat write operations to MySQL

Consequences of Synchronous Writes:

┌─────────────────────────────────────────────────────────────┐
│  BEFORE: Synchronous Model (Alerts API)                    │
└─────────────────────────────────────────────────────────────┘

VS Code Extension


┌─────────────────────────┐
│   Alerts API            │
│   (main thread)         │
│                         │
│  1. Receive heartbeat   │  ◄─── Blocks thread
│  2. Validation          │  ◄─── Blocks thread
│  3. INSERT to MySQL     │  ◄─── SLOW I/O OPERATION (50-200ms)
│  4. Process alerts      │  ◄─── Blocks thread
│  5. Return response     │
└─────────────────────────┘

⚠️ ISSUES:
- Each heartbeat blocks the thread for 50-200ms
- 100 developers × 10 heartbeats/min = 1000 req/min
- Average response time: 150ms × 1000 = 2.5 minutes CPU time/min
- Risk of timeouts under heavy load
- Alerts API cannot scale horizontally (DB bottleneck)

2. Solution: Asynchronous Consumer Pattern

Event-Driven Architecture

HB SQL Writer implements the Consumer pattern (Message Queue Pattern) which:

1

Separates write operations from business logic

Moving DB writes out of the main API prevents heartbeat processing from blocking user-facing requests.

2

Asynchronizes heartbeat processing

Heartbeats are queued and processed asynchronously, avoiding synchronous DB latency in the main API.

3

Scales independently of the main application

A dedicated consumer can autoscale and be tuned specifically for DB throughput without affecting the main web application.

┌─────────────────────────────────────────────────────────────┐
│  AFTER: Asynchronous Model (Event-Driven)                  │
└─────────────────────────────────────────────────────────────┘

VS Code Extension


┌─────────────────────────┐
│   Google Pub/Sub        │  ◄─── Message Queue (buffer)
│   (Topic: heartbeats)   │       - Guaranteed delivery
└────────┬────────────────┘       - Retry mechanism
         │                        - Backpressure handling
         │ (asynchronous triggers)


┌─────────────────────────┐
│  HB SQL Writer          │  ◄─── DEDICATED CONSUMER
│  (Cloud Run)            │
│                         │
│  Specialization:        │
│  Receive from Pub/Sub   │  (instant, non-blocking)
│  Data validation       │  (< 5ms)
│  INSERT to MySQL       │  (50-200ms, but doesn't block API)
│  Webhook to Alerts     │  (async, fire-and-forget)
└─────────────────────────┘

         │ (after write)

┌─────────────────────────┐
│  Alerts API             │  ◄─── LIGHTWEIGHT WEBHOOK
│  /api/alerts/           │       - Only alert logic
│  process-heartbeat      │       - Doesn't wait for DB write
│                         │       - Can scale freely
│  Process alerts        │
│  Update pulses         │
└─────────────────────────┘

3. Performance Benefits

3.1 Main API Offloading

Metric
Before (Sync)
After (Async)
Improvement

API Response Time

150-200ms

5-10ms

95% faster

API Throughput

~100 req/s

~1000 req/s

10x more

CPU Usage (API)

80-90%

20-30%

70% less

Timeout Risk

High

Minimal

Scalability

DB Limited

Horizontal

3.2 Preventing Thread Blocking

// BEFORE: Synchronous write (blocks thread)
app.post('/api/heartbeats', async (req, res) => {
  const heartbeat = req.body;
  
  // Blocks thread for 50-200ms
  await db.query('INSERT INTO heartbeats ...', heartbeat);
  
  // Subsequent operations wait
  await processAlerts(heartbeat);
  
  res.send('OK');
});

// AFTER: Asynchronous consumer (doesn't block API)
// HB SQL Writer - dedicated process
app.post('/ingest', async (req, res) => {
  const heartbeat = decodePubSubMessage(req.body);
  
  // Dedicated process, doesn't affect API
  await db.query('INSERT INTO heartbeats ...', heartbeat);
  
  // Fire-and-forget webhook (async)
  triggerAlertProcessing(heartbeat).catch(err => {
    console.error('Webhook failed, but heartbeat saved');
  });
  
  res.send('OK'); // Immediate response
});

3.3 Backpressure Handling

Google Pub/Sub automatically manages load:

Scenario: 1000 heartbeats/second

┌─────────────────────────┐
│  Pub/Sub Topic          │
│  (message buffer)       │
│                         │
│  Queues messages        │
│  Retries on errors      │
│  Dead Letter Queue      │
└────────┬────────────────┘

         │ (controlled rate)

┌─────────────────────────┐
│  HB SQL Writer          │
│  (autoscaling)          │
│                         │
│  Instances: 1-10        │  ◄─── Auto-scales
│  Processes: 100/s/inst  │
└─────────────────────────┘

Results:
- No data loss (Pub/Sub guarantees delivery)
- Automatic scaling under load
- Graceful degradation (slower but stable)

4. Technical Implementation

4.1 Technology Stack

// package.json
{
  "name": "hb-sql-writer",
  "dependencies": {
    "express": "^4.18.2",      // Lightweight HTTP server
    "mysql2": "^3.9.4",         // MySQL driver (promise-based)
    "dotenv": "^16.3.1"         // Environment configuration
  }
}

4.2 Key Components

A. Endpoint /ingest - Pub/Sub Consumer

app.post('/ingest', async (req, res) => {
  try {
    // 1. Decode Pub/Sub message (base64)
    const dataBase64 = req.body?.message?.data;
    const jsonStr = Buffer.from(dataBase64, 'base64').toString('utf-8');
    const hb = JSON.parse(jsonStr);

    // 2. Data validation and normalization
    const values = [
      hb.user_id,
      convertTimestamp(hb.ts),
      hb.project ?? null,
      hb.entity ?? null,
      validateBranch(hb.branch),
      validateUrl(hb.repository_url),
      // ... additional fields
    ];

    // 3. Database INSERT (connection pool)
    const sql = `INSERT INTO heartbeats (...) VALUES (...)`;
    
    await pool.query(sql, values);

    // 4. Asynchronous webhook (fire-and-forget)
    triggerAlertProcessing(hb).catch(err => {
      console.error('Webhook failed, but heartbeat saved');
    });

    res.status(200).send('Heartbeat saved');
  } catch (err) {
    console.error('Error:', err);
    res.status(500).send('Error');
  }
});

B. Connection Pool - DB Optimization

const pool = mysql.createPool({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASS,
  database: process.env.DB_NAME,
  
  // Optimizations (mysql2 default values):
  // connectionLimit: 10          // Max connections in pool
  // queueLimit: 0                // Unlimited queue
  // waitForConnections: true     // Wait for free connection
  // enableKeepAlive: true        // Keep-alive for long connections
});

// Benefits:
// Connection reuse (no TCP handshake overhead)
// Automatic connection management
// Graceful handling under load

C. Data Validation - Preventing DB Errors

The system implements robust data validation including:

  • String length validation to prevent database errors

  • Timestamp normalization for consistent storage format

  • URL validation with reasonable length limits

  • Null handling for optional fields

This ensures data integrity and prevents common database insertion errors.

4.3 Database Optimization

The system uses strategic database indexing for optimal performance:

  • Branch indexing for fast filtering by development branch

  • URL indexing with prefix optimization for text fields

  • Composite indexes for common query patterns

Benefits:

  • Fast queries filtering by branch or repository

  • Efficient joins with related tables

  • Optimized GROUP BY operations


5. Data Flow (End-to-End)

1

IDE Extension → Message Queue

  1. IDE Extension

    • POST https://message-queue-service/topics/heartbeats

    • Body: { user_id, ts, project, entity, branch, ... }

2

Message Queue Service

  1. Message Queue Service

    • Queues message

    • Guarantees delivery (at-least-once delivery)

    • Retries on errors (exponential backoff)

3

SQL Writer Service (Container)

  1. SQL Writer Service (Container)

    • POST /ingest

    • Decodes base64

    • Validates data

    • INSERT to Database (50-200ms)

    • Returns 200 OK to Pub/Sub

4

Database (after write)

4a. Database

  • INSERT INTO heartbeats (...)

  • Saves heartbeat

  • Updates indexes

  • Data persisted

5

Webhook → Main API

4b. Webhook (async)

  • POST https://main-app.../api/alerts/process-heartbeat

  • Fire-and-forget (doesn't block write)

  • Timeout: 5s

  • Errors logged, don't fail main process

  1. Main API

    • Selects team (smart team selection)

    • Updates task activity

    • Updates pulse status

    • Triggers relevant alerts and notifications

    • Alerts generated


6. Monitoring and Observability

6.1 Key Metrics

// Logs in HB SQL Writer
console.log('Connected to database');
console.log('Parsed heartbeat:', hb);
console.log('Executing SQL INSERT with values:', values);
console.log('SQL insert result:', result);
console.log('Triggering alert processing');
console.log('Alert processing triggered successfully');

// Error logs
console.error('Error saving heartbeat:', err);
console.error('Alert webhook error:', error.message);

6.2 Metrics to Monitor

Metric
Target
Alert Threshold

INSERT latency

< 200ms

> 500ms

Message queue age

< 10s

> 60s

Error rate

< 1%

> 5%

Webhook success rate

> 95%

< 90%

DB connection pool

< 80%

> 90%

Container instances

1-5

> 10 (check load)

6.3 Diagnostic Commands

# Check service logs
cloud-service logs read sql-writer-service \
  --region=your-region \
  --limit=50

# Check message queue metrics
cloud-service subscriptions describe writer-subscription \
  --format="table(ackDeadlineSeconds, messageRetentionDuration)"

# Check DB connections
mysql> SHOW PROCESSLIST;
mysql> SHOW STATUS LIKE 'Threads_connected';

7. Scalability and Performance

7.1 Autoscaling Configuration

The service is configured for automatic scaling:

  • Minimum instances: 1 (prevents cold starts)

  • Maximum instances: 10 (handles peak load)

  • Concurrency: 80 concurrent requests per instance

  • Estimated throughput: ~5,300 heartbeats/second at full scale

7.2 Database Optimizations

-- Partitioning the heartbeats table (for large volumes)
ALTER TABLE heartbeats
PARTITION BY RANGE (YEAR(ts)) (
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION p2025 VALUES LESS THAN (2026),
  PARTITION p2026 VALUES LESS THAN (2027)
);

-- Benefits:
-- ✅ Faster queries (query pruning)
-- ✅ Easier archiving of old data
-- ✅ Smaller indexes (per partition)

7.3 Caching Strategy (Future Enhancement)

// Potential optimization: Batch INSERT
const batchBuffer = [];
const BATCH_SIZE = 100;
const BATCH_TIMEOUT = 5000; // 5s

function addToBatch(heartbeat) {
  batchBuffer.push(heartbeat);
  
  if (batchBuffer.length >= BATCH_SIZE) {
    flushBatch();
  }
}

async function flushBatch() {
  if (batchBuffer.length === 0) return;
  
  const sql = 'INSERT INTO heartbeats (...) VALUES ?';
  await pool.query(sql, [batchBuffer.map(hb => [/* values */])]);
  
  batchBuffer.length = 0;
}

// Benefits:
// ✅ 100 INSERTs → 1 batch INSERT (10x faster)
// ✅ Lower DB load
// ⚠️ Trade-off: Up to 5s delay (acceptable for heartbeats)

8. Security and Reliability

8.1 Webhook Authentication

The system implements secure communication between the SQL Writer and the main API using:

  • Shared secret authentication via environment variables

  • Request timeout protection (5 second limit)

  • Header-based verification to ensure authorized requests

This ensures that only legitimate heartbeat processing requests are accepted by the main application.

8.2 Error Handling

The system implements robust error handling with graceful degradation:

  • Database failures: Automatic retry mechanism via message queue

  • Webhook failures: Non-blocking - heartbeat data is preserved even if alert processing fails

  • Connection issues: Connection pooling with automatic reconnection

  • Timeout protection: All external calls have defined timeout limits

This ensures high availability and data integrity even during partial system failures.

8.3 Dead Letter Queue

The system uses a Dead Letter Queue (DLQ) pattern for handling persistent failures:

  • Maximum retry attempts: 5 attempts before moving to DLQ

  • Failure isolation: Failed messages don't block processing of valid messages

  • Manual recovery: Failed messages can be analyzed and reprocessed manually

  • System stability: Prevents infinite retry loops that could overwhelm the system

This ensures that temporary issues don't cause permanent data loss while maintaining system performance.


9. Summary: Why HB SQL Writer Exists

Problem

The main API must be fast and responsive for users. Heavy heartbeat write operations (50-200ms each) were blocking the main thread, causing:

  • Slow API response times

  • Risk of timeouts under heavy load

  • Inability to scale horizontally

Solution

HB SQL Writer operates as an asynchronous Consumer that:

  1. Retrieves raw heartbeats from message queue

  2. Executes INSERT/UPDATE operations in a dedicated process

  3. Triggers webhook to main API (fire-and-forget)

Benefits

Offloads main API - response time from 150ms → 5ms (95% faster) ✅ Prevents thread blocking - DB operations in separate process ✅ Scales independently - autoscaling container instances (1-10) ✅ Guarantees delivery - message queue retry + Dead Letter Queue ✅ Graceful degradation - slower but stable under overload

Event-Driven Architecture

Client → Message Queue → HB SQL Writer → Database

                         Webhook (async)

                         Main API

10. Future Optimizations (Roadmap)

Short-term

Long-term


Author: Database Engineer Date: 2025-11-30 Version: 1.0