Started July 5, 2025
Completed
Real-Time SKU Leaderboard
Built a low-latency leaderboard for top-clicked SKUs using Flink, Redis, and Iceberg.
Apache Flink, Redis, Iceberg, Streaming, Kafka
Overview
Built a real-time SKU leaderboard for e-commerce marketing dashboards. Events were ingested via Kafka, processed with Apache Flink for sliding-window aggregations, and surfaced via Redis for low-latency querying. Iceberg was used to store hourly snapshots for audit and replay.
The project demonstrates:
- Real-time stream aggregation
- Sliding window logic in Flink
- Low-latency read layer via Redis
- Iceberg for append-only analytics
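The sliding-window aggregation can be illustrated with a small pure-Python sketch. It is not the project's Flink job. It only mimics how a sliding window assigner aligned to the epoch maps each event to every window that contains it, then counts clicks per SKU in each window. The function names, the 10-second window, the 5-second slide, and the sample events are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def assign_sliding_windows(ts_ms, size_ms, slide_ms):
    """Return the start times of every sliding window containing ts_ms.

    Windows are aligned to the epoch, so the result depends only on the
    event timestamp and not on when the job was started.
    """
    last_start = ts_ms - (ts_ms % slide_ms)
    starts = []
    start = last_start
    # A window [start, start + size) contains ts when start > ts - size.
    while start > ts_ms - size_ms:
        starts.append(start)
        start -= slide_ms
    return starts


def top_skus_per_window(events, size_ms, slide_ms, n):
    """events: iterable of (timestamp_ms, sku) pairs.

    Returns {window_start_ms: [(sku, clicks), ...]} with the top n SKUs.
    """
    counts = defaultdict(Counter)
    for ts_ms, sku in events:
        for start in assign_sliding_windows(ts_ms, size_ms, slide_ms):
            counts[start][sku] += 1
    return {start: c.most_common(n) for start, c in sorted(counts.items())}


# Hypothetical click events: (event time in ms, SKU id).
events = [(21_000, "sku-a"), (24_000, "sku-b"), (26_000, "sku-a"), (31_000, "sku-a")]
leaderboard = top_skus_per_window(events, size_ms=10_000, slide_ms=5_000, n=2)
for window_start, ranking in leaderboard.items():
    print(window_start, ranking)
```

Because each event lands in size / slide windows (two here), state and output volume grow with that ratio, which is worth keeping in mind when choosing the slide interval.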
Architecture
- Kafka ingests clickstream events
- Apache Flink performs real-time aggregation
- Redis stores hot leaderboard counts
- Iceberg stores durable historical snapshots
- Airflow handles periodic TTL cleanup and exports
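One common way to hold hot leaderboard counts in Redis is a sorted set, incremented with ZINCRBY and read with ZREVRANGE. The source does not say which Redis data structure the project used, so the sketch below is an assumption. To keep it runnable without a server, it uses a tiny in-memory stand-in that exposes the same method names and argument order as redis-py's `zincrby(name, amount, value)` and `zrevrange(name, start, end, withscores=False)`. The key name `leaderboard:clicks` is hypothetical.

```python
class InMemorySortedSets:
    """Minimal stand-in for the two Redis sorted-set commands used below.

    Only supports a non-negative `end` or -1 in zrevrange.
    """

    def __init__(self):
        self._sets = {}

    def zincrby(self, name, amount, value):
        scores = self._sets.setdefault(name, {})
        scores[value] = scores.get(value, 0.0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        scores = self._sets.get(name, {})
        # Highest score first; ties broken by member name, descending.
        ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
        stop = None if end == -1 else end + 1
        page = ranked[start:stop]
        return page if withscores else [member for member, _ in page]


def record_clicks(client, key, sku, clicks):
    """Add a window's click count for one SKU to the leaderboard."""
    client.zincrby(key, clicks, sku)


def top_n(client, key, n):
    """Return the n most-clicked SKUs as (sku, score) pairs."""
    return client.zrevrange(key, 0, n - 1, withscores=True)


client = InMemorySortedSets()
record_clicks(client, "leaderboard:clicks", "sku-a", 3)
record_clicks(client, "leaderboard:clicks", "sku-b", 2)
record_clicks(client, "leaderboard:clicks", "sku-a", 2)
top_two = top_n(client, "leaderboard:clicks", 2)
print(top_two)
```

With a real server, a `redis.Redis(decode_responses=True)` client could be passed in place of the stand-in. Without `decode_responses`, redis-py returns members as bytes.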
Status
The system was deployed as a prototype and delivered real-time metrics within 1–2 seconds of ingest.
Challenges
- Sliding window alignment across Flink job restarts
- Tuning Redis memory to avoid key eviction under burst load
- Backpressure during simulated traffic spikes, visible as growing Kafka consumer lag
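The effect of a traffic spike on a consumer with fixed throughput can be worked through with a few lines of arithmetic. The sketch below is a deliberately simple queue model, not a measurement from the project: the arrival rates, the 1,000 events/second capacity, and the function name are all assumed values chosen for illustration.

```python
def simulate_backlog(arrivals_per_sec, capacity_per_sec):
    """Return the unprocessed backlog at the end of each second.

    Each second the consumer drains at most capacity_per_sec events;
    anything beyond that carries over to the next second.
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_sec:
        backlog = max(0, backlog + arrivals - capacity_per_sec)
        history.append(backlog)
    return history


# Hypothetical load: 3 s of steady traffic, a 2 s spike, then 5 s steady.
steady, spike = 800, 3_000
traffic = [steady] * 3 + [spike] * 2 + [steady] * 5
backlog = simulate_backlog(traffic, capacity_per_sec=1_000)
print(backlog)
```

In this model a two-second spike leaves a backlog that is still draining five seconds later, because the spare capacity after the spike is only 200 events per second. The same reasoning suggests sizing headroom for recovery time, not only for peak rate.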
Lessons Learned
- Always simulate burst load before go-live
- Use Flink state TTL and Redis key expiry (which Redis applies lazily) for leaderboard rollups
- Flink watermarking config affects latency under clock skew
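The watermarking trade-off can be sketched with a simplified model of a bounded-out-of-orderness watermark: the watermark trails the largest event timestamp seen so far by a fixed bound, and an event whose timestamp falls behind the watermark counts as late. This is an approximation written for illustration, not Flink's implementation, and the class name, bounds, and timestamps are assumptions.

```python
class BoundedOutOfOrdernessWatermark:
    """Simplified watermark: max event time seen minus a fixed bound."""

    def __init__(self, max_out_of_orderness_ms):
        self.max_out_of_orderness_ms = max_out_of_orderness_ms
        self.max_event_ts_ms = None

    def current(self):
        """Return the current watermark, or None before any event."""
        if self.max_event_ts_ms is None:
            return None
        return self.max_event_ts_ms - self.max_out_of_orderness_ms

    def observe(self, event_ts_ms):
        """Record an event; return True if it arrived behind the watermark."""
        watermark = self.current()
        is_late = watermark is not None and event_ts_ms < watermark
        if self.max_event_ts_ms is None or event_ts_ms > self.max_event_ts_ms:
            self.max_event_ts_ms = event_ts_ms
        return is_late


def count_late(event_timestamps_ms, max_out_of_orderness_ms):
    """Count events that would be considered late under the given bound."""
    tracker = BoundedOutOfOrdernessWatermark(max_out_of_orderness_ms)
    return sum(tracker.observe(ts) for ts in event_timestamps_ms)


# Hypothetical arrival order; 9_100 comes from a producer with a skewed clock.
arrivals = [10_000, 10_200, 9_100, 10_400, 9_900]
late_tight = count_late(arrivals, max_out_of_orderness_ms=500)
late_loose = count_late(arrivals, max_out_of_orderness_ms=2_000)
print(late_tight, late_loose)
```

A tight bound lets windows close sooner but drops or sidelines skewed events. A loose bound accepts them at the cost of holding every window open longer, which adds directly to end-to-end latency.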
Related Services
- Real-time analytics dashboards
- Event-driven architecture
- Redis-based low-latency APIs
- Stream-to-lake architecture using Iceberg
Recent Activity
Optimized the Flink job by tuning watermark lag and operator parallelism; P95 latency dropped to 800 ms.
Deployed Redis-backed real-time leaderboard using Flink sliding window aggregates.