Redis cluster zone-discovery UAF on overlapping CLUSTER SLOTS refresh
#45,870 opened on Jun 29, 2026
Description
Original reporter: @omkhar
Summary
Envoy's Redis cluster discovery session checks current_request_ to avoid concurrent CLUSTER SLOTS requests, but it does not check for the in-flight zone-discovery INFO requests that follow a CLUSTER SLOTS completion. An external refresh trigger arriving between CLUSTER SLOTS completion and the final INFO reply re-enters startZoneDiscovery(), overwrites zone_callbacks_[addr] for requests that are still in flight, and frees the ZoneDiscoveryCallback that the upstream ClientImpl::PendingRequest still holds by reference. The result is a use-after-free (UAF) when the response arrives.
Details
In source/extensions/clusters/redis/redis_cluster.cc, RedisCluster::RedisDiscoverySession::startResolveRedis() (near line 357):
void RedisCluster::RedisDiscoverySession::startResolveRedis() {
  parent_.info_->configUpdateStats().update_attempt_.inc();
  if (current_request_) {
    return; // misses the case where zone-discovery is still in flight
  }
  ...
  startZoneDiscovery(); // overwrites zone_callbacks_[addr]
}
The CLUSTER SLOTS request completes and zone-discovery INFO requests are sent, one per replica. Before all of them return, an external refresh (e.g., DNS update, periodic refresh timer) calls startResolveRedis(). Because current_request_ is empty, a new CLUSTER SLOTS request goes out and a new startZoneDiscovery() overwrites zone_callbacks_[addr]. The previous round's ZoneDiscoveryCallback is freed, and the in-flight ClientImpl::PendingRequest that captured it by reference dereferences freed memory when the INFO response finally arrives.
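The lifetime problem described above can be illustrated with a small standalone model. This is only a sketch and not Envoy code: ZoneCallbackModel, PendingInfoModel, SessionModel, and the address string are hypothetical names invented for illustration. The model uses shared_ptr/weak_ptr purely so that the dangling state can be observed safely, whereas the report says the real pending request holds the callback by reference.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the ZoneDiscoveryCallback named in the report.
struct ZoneCallbackModel {
  int round; // which discovery round created this callback
};

// Hypothetical stand-in for an in-flight INFO request. A weak_ptr is an
// assumption of this model: it lets us detect the destroyed callback without
// triggering undefined behavior.
struct PendingInfoModel {
  std::weak_ptr<ZoneCallbackModel> callback;
  bool callbackAlive() const { return !callback.expired(); }
};

// Hypothetical model of the session's per-address callback map.
struct SessionModel {
  std::map<std::string, std::shared_ptr<ZoneCallbackModel>> zone_callbacks;
  int round = 0;

  // Models startZoneDiscovery(): installs a fresh callback for addr, replacing
  // (and thereby destroying) whatever callback was stored there before.
  PendingInfoModel startZoneDiscovery(const std::string& addr) {
    auto cb = std::make_shared<ZoneCallbackModel>(ZoneCallbackModel{++round});
    zone_callbacks[addr] = cb;
    return PendingInfoModel{cb};
  }
};

// Returns true if the first round's callback is still alive after a second
// round starts for the same address while the first INFO is still pending.
bool firstRoundCallbackSurvivesOverlap() {
  SessionModel session;
  const std::string addr = "10.0.0.1:6379"; // arbitrary illustrative address
  PendingInfoModel first = session.startZoneDiscovery(addr);
  assert(first.callbackAlive());
  session.startZoneDiscovery(addr); // overlapping refresh overwrites the entry
  return first.callbackAlive();
}
```

In this model firstRoundCallbackSurvivesOverlap() returns false: the overwrite drops the only owner of the first callback while the first pending request still refers to it. With a raw reference instead of a weak_ptr, the same sequence would leave the reference dangling.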
PoC
The attached patch (patch-zone-discovery-overlap.patch) adds a pending_zone_requests_ counter and gates re-entry on both current_request_ and pending zone replies.
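The gating idea can be sketched roughly as follows. This is a simplified model and not the contents of the attached patch: DiscoveryGateModel and its methods are hypothetical, and only the names current_request_ and pending_zone_requests_ come from this report.

```cpp
#include <cassert>

// Hypothetical model of the re-entry gate described above; not the patch itself.
class DiscoveryGateModel {
public:
  // Models startResolveRedis(): returns true if a new CLUSTER SLOTS round starts.
  bool tryStartResolve() {
    // Gate on both the CLUSTER SLOTS request and outstanding INFO replies.
    if (current_request_ || pending_zone_requests_ > 0) {
      return false;
    }
    current_request_ = true;
    return true;
  }

  // Models CLUSTER SLOTS completion followed by one INFO request per replica.
  void onClusterSlotsDone(int replica_count) {
    assert(current_request_);
    current_request_ = false;
    pending_zone_requests_ = replica_count;
  }

  // Models one INFO reply (or failure) arriving for the current round.
  void onZoneReply() {
    assert(pending_zone_requests_ > 0);
    --pending_zone_requests_;
  }

private:
  bool current_request_ = false;
  int pending_zone_requests_ = 0;
};

// Walks through the overlap scenario and counts rejected refresh attempts.
int rejectedRefreshesDuringZoneDiscovery() {
  DiscoveryGateModel gate;
  int rejected = 0;
  gate.tryStartResolve();                  // round 1 starts
  gate.onClusterSlotsDone(2);              // two INFO requests now in flight
  if (!gate.tryStartResolve()) ++rejected; // refused: 2 replies pending
  gate.onZoneReply();
  if (!gate.tryStartResolve()) ++rejected; // refused: 1 reply pending
  gate.onZoneReply();
  if (!gate.tryStartResolve()) ++rejected; // accepted: nothing pending
  return rejected;
}
```

In this model the two refresh attempts made while INFO replies are outstanding are refused, and the third succeeds once the counter reaches zero. A counter-based gate of this kind presumably needs to be decremented on every completion path (reply, failure, connection close); otherwise refreshes would be blocked indefinitely.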
Reproduction: configure Envoy with a Redis cluster (cluster mode); arrange for a slow INFO reply (slow replica or slowlog) while triggering a CLUSTER SLOTS refresh from outside (e.g., DNS update). Run under ASAN to observe the UAF; without ASAN the symptom is a sporadic crash.
patch-zone-discovery-overlap.patch
git apply <PATH_TO_DOWNLOADED>/patch-zone-discovery-overlap.patch
bazel test --config=clang --config=clang-asan //test/extensions/clusters/redis:redis_cluster_test
The patch applies cleanly to current envoyproxy/envoy main. Note: the redis_proxy filter is part of mainline Envoy (not contrib).
Impact
Use-after-free in the worker process (CWE-416). With heap grooming and favorable object lifetimes, this could lead to RCE. Reachable on any deployment using mainline Redis cluster discovery with periodic refresh and slow INFO replies (real-world Redis topologies, slowlog-enabled replicas).