Started July 5, 2025

Completed

Real-Time SKU Leaderboard

Built a low-latency leaderboard for top-clicked SKUs using Flink, Redis, and Iceberg.

Apache Flink, Redis, Iceberg, Streaming, Kafka

Overview

Built a real-time SKU leaderboard for e-commerce marketing dashboards. Events were ingested via Kafka, processed with Apache Flink for sliding-window aggregations, and surfaced via Redis for low-latency querying. Iceberg was used to store hourly snapshots for audit and replay.

The project demonstrates:

  • Real-time stream aggregation
  • Sliding window logic in Flink
  • Low-latency read layer via Redis
  • Iceberg for append-only analytics
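The sliding-window logic can be illustrated without Flink. The following is a minimal sketch in plain Python, not the project's actual job: the 60-second window and 10-second slide are assumed values (the write-up does not state them), and `window_starts`, `aggregate`, and `top_n` are hypothetical helper names. It shows how a single click lands in several overlapping windows, and why epoch-aligned window boundaries stay the same no matter when the job starts.

```python
from collections import Counter, defaultdict

WINDOW_MS = 60_000  # assumed window length (not stated in the write-up)
SLIDE_MS = 10_000   # assumed slide interval (not stated in the write-up)


def window_starts(ts_ms, size_ms=WINDOW_MS, slide_ms=SLIDE_MS):
    """Return the start time of every sliding window that contains ts_ms.

    Windows are aligned to the epoch, so a given timestamp always maps to
    the same windows regardless of when the job was started.
    """
    last_start = ts_ms - (ts_ms % slide_ms)
    # Walk backwards while the window [start, start + size) still covers ts_ms.
    return list(range(last_start, ts_ms - size_ms, -slide_ms))


def aggregate(events, size_ms=WINDOW_MS, slide_ms=SLIDE_MS):
    """Count clicks per SKU in each window; events are (ts_ms, sku) pairs."""
    counts = defaultdict(Counter)
    for ts_ms, sku in events:
        for start in window_starts(ts_ms, size_ms, slide_ms):
            counts[start][sku] += 1
    return counts


def top_n(counter, n=3):
    """Return the n most-clicked SKUs as (sku, count) pairs."""
    return counter.most_common(n)
```

With these assumed settings each event is counted in six windows (window length divided by slide), which is the main cost of a sliding window compared with a tumbling one.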

Architecture

  • Kafka ingests clickstream events
  • Apache Flink performs real-time aggregation
  • Redis stores hot leaderboard counts
  • Iceberg stores durable historical snapshots
  • Airflow handles periodic TTL cleanup and exports
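One common way to hold hot leaderboard counts in Redis is a sorted set per window. The sketch below assumes that layout; the source does not say which Redis data structure was used. It also assumes `client` is a redis-py `redis.Redis` instance created with `decode_responses=True`. The key scheme `leaderboard:<window_start>` and the one-hour TTL are hypothetical.

```python
LEADERBOARD_TTL_S = 3600  # assumed retention for a hot window key


def leaderboard_key(window_start_ms):
    """Hypothetical key scheme: one sorted set per window."""
    return f"leaderboard:{window_start_ms}"


def publish_counts(client, window_start_ms, sku_counts, ttl_s=LEADERBOARD_TTL_S):
    """Add one batch of per-SKU click counts to the window's sorted set."""
    key = leaderboard_key(window_start_ms)
    for sku, clicks in sku_counts.items():
        client.zincrby(key, clicks, sku)  # ZINCRBY key increment member
    client.expire(key, ttl_s)             # let Redis drop stale windows
    return key


def top_skus(client, window_start_ms, n=10):
    """Read the n highest-scoring SKUs, highest first, with their scores."""
    key = leaderboard_key(window_start_ms)
    return client.zrevrange(key, 0, n - 1, withscores=True)
```

Because `ZINCRBY` is additive, this sketch suits a job that emits incremental counts. A job that re-emits full window totals would overwrite scores with `ZADD` instead, so that a replay does not double-count.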

Status

The system was deployed as a prototype and delivered real-time metrics within 1–2 seconds of ingest.

Challenges

  • Sliding window alignment across Flink job restarts
  • Tuning Redis memory limits to avoid key eviction under burst load
  • Backpressure in Kafka during simulated traffic spikes

Lessons Learned

  • Always simulate burst load before go-live
  • Use state TTL and lazy expiration in Redis for leaderboard rollups
  • Flink watermarking config affects latency under clock skew
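The watermark point can be made concrete with a small model. This sketch imitates a bounded-out-of-orderness watermark in plain Python: the watermark trails the highest event time seen by a fixed bound, and a window fires only once the watermark passes its end. The 2-second bound in the usage note is an assumed value, and the class and function names are hypothetical, not Flink API.

```python
class BoundedOutOfOrdernessWatermark:
    """Tracks the highest event time seen and lags the watermark behind it."""

    def __init__(self, max_out_of_orderness_ms):
        self.lag_ms = max_out_of_orderness_ms
        self.max_ts_ms = None

    def observe(self, ts_ms):
        """Record an event time; out-of-order events never move it backwards."""
        if self.max_ts_ms is None or ts_ms > self.max_ts_ms:
            self.max_ts_ms = ts_ms

    def current(self):
        """Return the current watermark, or None before any event is seen."""
        if self.max_ts_ms is None:
            return None
        return self.max_ts_ms - self.lag_ms - 1


def window_can_fire(watermark_ms, window_end_ms):
    """A window [start, end) is complete once the watermark reaches end - 1."""
    return watermark_ms is not None and watermark_ms >= window_end_ms - 1
```

With a 2,000 ms bound, a window ending at 60,000 ms fires only after an event stamped 62,000 ms or later arrives. A larger bound tolerates more clock skew between producers but adds the same amount to result latency, which is the trade-off behind the lesson above.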

Related Services

  • Real-time analytics dashboards
  • Event-driven architecture
  • Redis-based low-latency APIs
  • Stream-to-lake architecture using Iceberg

Recent Activity

Optimized the Flink job by tuning watermark lag and operator parallelism; P95 latency dropped to 800 ms.

Flink, Kafka

Deployed Redis-backed real-time leaderboard using Flink sliding window aggregates.

Redis, Flink