SPRINT-RAG: Secure Partitioned Retrieval-Augmented LLM Diagnosis for Privacy-Preserving Incident Analysis in Distributed Clouds
Abstract
Large language models are increasingly used to interpret operational evidence during cloud incidents, yet centralized retrieval over logs, traces, and service graphs creates privacy, governance, and scalability risks in distributed environments. This article presents SPRINT-RAG, a secure partitioned retrieval-augmented diagnosis framework for privacy-preserving incident analysis. The framework keeps raw observability evidence within tenant, service, or region partitions, performs local event retrieval and risk encoding, and exposes only policy-filtered evidence summaries to a global diagnosis layer. A controlled partition benchmark with 4,800 incident windows, 18 service groups, and three workload-drift regimes is used to evaluate diagnosis quality, evidence use, communication cost, and leakage proxies. Compared with centralized retrieval-augmented diagnosis, SPRINT-RAG improves F1 from 0.842 to 0.874 and top-1 root-cause accuracy from 0.681 to 0.724, while reducing the leakage proxy from 0.139 to 0.043. Compared with local-only diagnosis, it improves top-3 root-cause accuracy by 9.8 percentage points with a moderate p95 latency increase of 1.54 s. Ablation analysis shows that partition trust weighting and cross-partition evidence summaries are both necessary for stable performance under drift, while leakage filtering provides the largest privacy gain with limited diagnostic cost. The results indicate that secure partitioned retrieval can make LLM-based incident diagnosis more compatible with multi-tenant cloud governance without sacrificing operational usefulness.
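The partitioned flow described in the abstract (local retrieval and risk encoding inside each partition, policy-filtered summaries sent upward, trust-weighted aggregation at the global layer) can be illustrated with a minimal sketch. This is not the article's implementation. The names `Summary`, `local_summary`, and `global_rank`, the regex redaction policy, the error-share risk score, and the default trust value are all assumptions chosen for illustration, and the sample events are invented toy data.

```python
import re
from dataclasses import dataclass

# Hypothetical redaction policy: mask IPv4 addresses and tenant identifiers.
_SENSITIVE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b|tenant-\w+")


@dataclass
class Summary:
    partition: str
    suspect_service: str
    risk: float  # toy local risk score in [0, 1]
    note: str    # policy-filtered text; raw log lines stay in the partition


def local_summary(partition: str, events: list[dict], query: str) -> Summary:
    """Retrieve matching events inside one partition and emit a filtered summary."""
    terms = query.lower().split()
    hits = [e for e in events if any(t in e["msg"].lower() for t in terms)]
    if not hits:
        return Summary(partition, "unknown", 0.0, "no matching evidence")
    # Assumed risk encoding: share of retrieved events logged at ERROR level.
    risk = sum(e["level"] == "ERROR" for e in hits) / len(hits)
    # Suspect service: the one with the most retrieved events.
    counts: dict[str, int] = {}
    for e in hits:
        counts[e["service"]] = counts.get(e["service"], 0) + 1
    suspect = max(counts, key=counts.get)
    # Only a redacted excerpt leaves the partition.
    note = _SENSITIVE.sub("[REDACTED]", hits[0]["msg"])
    return Summary(partition, suspect, risk, note)


def global_rank(summaries: list[Summary], trust: dict[str, float]) -> list[tuple[str, float]]:
    """Rank candidate root causes by trust-weighted risk across partitions."""
    scores: dict[str, float] = {}
    for s in summaries:
        weight = trust.get(s.partition, 0.5)  # assumed default trust
        scores[s.suspect_service] = scores.get(s.suspect_service, 0.0) + weight * s.risk
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy usage with two region partitions (invented events).
eu_events = [
    {"service": "checkout", "level": "ERROR", "msg": "timeout calling 10.0.0.7 for tenant-a1"},
    {"service": "checkout", "level": "INFO", "msg": "timeout retry scheduled"},
]
us_events = [
    {"service": "auth", "level": "ERROR", "msg": "timeout on token refresh"},
]
summaries = [
    local_summary("eu", eu_events, "timeout"),
    local_summary("us", us_events, "timeout"),
]
ranked = global_rank(summaries, {"eu": 0.9, "us": 0.6})
```

In this sketch the global layer sees only the `Summary` objects, so the IP address and tenant identifier in the first event never cross the partition boundary. In a real system the ranked candidates and redacted notes would presumably be passed to the LLM as diagnosis context; that step is omitted here.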