### caching

```{eval-rst}
.. plantuml::

   @startuml
   title Kafka + Fast Cache + Tiered Storage + Lakehouse - Reference Architecture

   skinparam backgroundColor #FFFFFF
   skinparam defaultTextAlignment center
   skinparam packageStyle rectangle
   skinparam rectangle {
     BorderColor black
     RoundCorner 12
   }
   skinparam linetype ortho
   skinparam Arrow {
     Color black
     Thickness 1
     Padding 2
     ' small arrowheads
   }
   ' left to right direction

   ' ===== Nodes =====
   ' Top row
   rectangle "Producers\n(Services, CDC, IoT)" as Producers #cce5ff
   rectangle "Ingress\n(Load Balancer)" as Ingress #cce5ff
   rectangle "Kafka Brokers\n(Cluster)" as Kafka #ffedcc
   rectangle "Policy Engine\nLRU / LFU / Hybrid\nWarm-up & Promotion" as Policy #e6ccff

   ' Middle row
   rectangle "Fast Cache (NVMe/SSD)\nSegment/Key Cache" as FastCache #ffd9b3

   ' Bottom row left (Consumers include ML/RAG)
   rectangle "Consumers\n(Apps, Stream proc, ML/RAG, BI, Replays)" as Consumers #cce5ff
   rectangle "AI Pipelines\n(Chunk/Embed, Feature Compute)" as AIPipe #cce5ff

   ' Bottom row right: Cold Tier containing Lakehouse tables + object store
   rectangle "Cold Tier\n(Storage: S3 / GCS / HDFS)" as ColdTier #d6f5d6 {
     rectangle "Lakehouse Tables\n(Iceberg / Delta / Hudi)" as Lakehouse #d6f5d6 {
       rectangle "AI Tables\n(Features, Embeddings, Labels)" as AITables #d6f5d6
     }
     rectangle "Object Store\n(Parquet/ORC files)" as ObjectStore #d6f5d6
   }

   rectangle "Control Plane & Admin UI\nTopics, Policies, SLAs, Quotas" as Control #f2f2f2
   rectangle "Observability\nMetrics, Traces, Cache Hit%, Backfill Latency, Cost" as Observability #f2f2f2

   ' ----- Soft positioning (hidden links to guide layout) -----
   Producers -[hidden]- Ingress
   Ingress -[hidden]- Kafka
   Kafka -[hidden]- Policy
   FastCache -[hidden]- ColdTier
   Consumers -[hidden]- AIPipe
   AIPipe -[hidden]- Lakehouse
   ColdTier -[hidden]- Control
   Control -[hidden]- Observability

   ' ===== Flows (dashed with small arrowheads) =====
   Producers -[#black,dashed]-> Ingress : produce
   Ingress -[#black,dashed]-> Kafka : ingest
   Kafka <-[#black,dashed]-> FastCache : spill / fetch
   FastCache <-[#black,dashed]-> ColdTier : tier / hydrate

   ' Operational consumption
   Kafka -[#black,dashed]-> Consumers : real-time events

   ' Control & policy
   Policy -[#black,dashed]-> FastCache : evict / promote policy
   Control -[#black,dashed]-> Kafka : configure policies

   ' Observability (components emit telemetry)
   Kafka -[#black,dashed]-> Observability : broker metrics / traces
   FastCache -[#black,dashed]-> Observability : cache hit% / latency
   ColdTier -[#black,dashed]-> Observability : storage metrics
   Policy -[#black,dashed]-> Observability : policy decisions / actions
   Consumers -[#black,dashed]-> Observability : app/ML latency & drift

   ' ===== Lakehouse (realized through Kafka) =====
   Kafka -[#black,dashed]-> Lakehouse : sink to tables\n(Stream→Batch materialization)
   Kafka -[#black,dashed]-> AIPipe : events/docs for features & RAG
   AIPipe -[#black,dashed]-> AITables : write features & embeddings
   Consumers <-[#black,dashed]- Lakehouse : BI / Analytics / RAG batch

   ' ===== Notes =====
   note bottom of Lakehouse
   Lakehouse is a logical layer on the Cold Tier,
   realized via Kafka sinks that write to Iceberg/Delta/Hudi tables.
   AI Pipelines produce unified AI Tables;
   Consumers (incl. ML/RAG) query them.
   end note

   ' ===== Legend =====
   legend right
   |= Color |= Component / Role |
   |<#cce5ff> Blue| Producers, Consumers & AI pipelines |
   |<#ffedcc> Orange (light)| Kafka Brokers |
   |<#ffd9b3> Orange (dark)| Fast Cache (NVMe/SSD) |
   |<#d6f5d6> Green| Cold Tier + Lakehouse tables (on object store) |
   |<#e6ccff> Purple| Policy Engine |
   |<#f2f2f2> Gray| Control Plane & Observability |
   endlegend
   @enduml
```