envoyproxy/envoy

Redis cluster zone-discovery UAF on overlapping CLUSTER SLOTS refresh

Open

#45,870 opened on Jun 29, 2026

Labels: area/redis, bug, help wanted, no stalebot


Description

Original reporter: @omkhar

Summary

Envoy's Redis cluster discovery session checks current_request_ to avoid concurrent CLUSTER SLOTS requests, but it does not check for the in-flight zone-discovery INFO requests that follow a CLUSTER SLOTS completion. An external refresh trigger arriving between CLUSTER SLOTS completion and the final INFO reply re-enters startZoneDiscovery(), overwrites zone_callbacks_[addr] for requests that are still in flight, and frees the ZoneDiscoveryCallback that the upstream ClientImpl::PendingRequest still holds by reference. The result is a use-after-free (UAF) when the INFO response arrives.

Details

source/extensions/clusters/redis/redis_cluster.cc::RedisDiscoverySession::startResolveRedis() near line 357:

void RedisCluster::RedisDiscoverySession::startResolveRedis() {
  parent_.info_->configUpdateStats().update_attempt_.inc();
  if (current_request_) {
    return;  // misses the case where zone-discovery is still in flight
  }
  ...
  startZoneDiscovery();  // overwrites zone_callbacks_[addr]
}

After the CLUSTER SLOTS request completes, zone-discovery INFO requests are sent, one per replica. If an external refresh (e.g., a DNS update or the periodic refresh timer) calls startResolveRedis() before all of them have returned, current_request_ is empty, so a new CLUSTER SLOTS request goes out and a new startZoneDiscovery() overwrites zone_callbacks_[addr]. The previous round's ZoneDiscoveryCallback is freed, and the in-flight ClientImpl::PendingRequest that captured it by reference dereferences freed memory when the INFO response finally arrives.
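The lifetime hazard described above can be illustrated with a small standalone model. This is only a sketch, not Envoy code: CallbackModel, PendingRequestModel, SessionModel and the address string are hypothetical stand-ins, and the map of std::unique_ptr is an assumption about how zone_callbacks_ owns its entries. The model counts destructions instead of touching the freed object, so it runs without undefined behavior.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for ZoneDiscoveryCallback. It bumps a counter on
// destruction so the lifetime problem is observable without reading freed memory.
struct CallbackModel {
  explicit CallbackModel(int& destroyed) : destroyed_(destroyed) {}
  ~CallbackModel() { ++destroyed_; }
  int& destroyed_;
};

// Hypothetical stand-in for a pending request that holds its callback by reference.
struct PendingRequestModel {
  CallbackModel& callback_;
};

struct SessionModel {
  int destroyed_ = 0; // declared first so it outlives the map below
  // Assumption: callbacks are owned per address, as with zone_callbacks_[addr].
  std::map<std::string, std::unique_ptr<CallbackModel>> zone_callbacks_;

  PendingRequestModel startZoneDiscovery(const std::string& addr) {
    // Assigning to an existing key destroys the previous round's callback.
    zone_callbacks_[addr] = std::make_unique<CallbackModel>(destroyed_);
    return PendingRequestModel{*zone_callbacks_[addr]};
  }
};

// Returns how many callbacks were destroyed while a request from the first
// round was still outstanding.
int callbacksFreedByOverlap() {
  SessionModel session;
  PendingRequestModel first = session.startZoneDiscovery("10.0.0.1:6379");
  assert(session.destroyed_ == 0);
  // Second round for the same address, before the first reply arrived.
  session.startZoneDiscovery("10.0.0.1:6379");
  // first.callback_ now dangles; using it here would be the use-after-free.
  (void)first;
  return session.destroyed_;
}
```

In this model, callbacksFreedByOverlap() reports that the first round's callback has already been destroyed while the first pending request still refers to it, which is the state in which the real INFO response would dereference freed memory.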

PoC

The attached patch (patch-zone-discovery-overlap.patch) adds a pending_zone_requests_ counter and gates re-entry on both current_request_ and pending zone replies.
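The gating idea can be sketched as a small state model. The patch itself is not reproduced in this report, so everything below is an assumption about its shape: DiscoverySessionModel, onClusterSlotsComplete(), onZoneReply() and rounds_started_ are hypothetical names, and only current_request_ and pending_zone_requests_ come from the text above.

```cpp
#include <cassert>

// Hypothetical model of the discovery session state; not Envoy's actual class.
struct DiscoverySessionModel {
  bool current_request_ = false;   // a CLUSTER SLOTS request is in flight
  int pending_zone_requests_ = 0;  // INFO replies still outstanding
  int rounds_started_ = 0;         // for observation only

  // Returns true if a new discovery round was started.
  bool startResolveRedis() {
    // Gate on both the CLUSTER SLOTS request and pending zone replies.
    if (current_request_ || pending_zone_requests_ > 0) {
      return false;
    }
    current_request_ = true;
    ++rounds_started_;
    return true;
  }

  // CLUSTER SLOTS finished; one INFO request goes out per replica.
  void onClusterSlotsComplete(int replicas) {
    current_request_ = false;
    pending_zone_requests_ = replicas;
  }

  // Called for every INFO reply; a real fix would also need to call this on
  // failure or timeout so the counter cannot get stuck above zero.
  void onZoneReply() {
    if (pending_zone_requests_ > 0) {
      --pending_zone_requests_;
    }
  }
};
```

With this gate, a refresh that arrives while INFO replies are outstanding is ignored, so the callbacks for the current round are never overwritten while requests still refer to them.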

Reproduction: configure Envoy with a Redis cluster (CLUSTER mode); arrange for a slow INFO reply (slow replica or slowlog) while triggering a CLUSTER SLOTS refresh from outside (e.g., a DNS update). Run under ASAN to observe the UAF; without ASAN, the bug shows up as sporadic crashes.

patch-zone-discovery-overlap.patch

git apply <PATH_TO_DOWNLOADED>/patch-zone-discovery-overlap.patch
bazel test --config=clang --config=clang-asan //test/extensions/clusters/redis:redis_cluster_test

The patch applies cleanly to current envoyproxy/envoy main. Note: the redis_proxy filter is part of mainline Envoy (not contrib).

Impact

Use-after-free in the worker process (CWE-416). With heap grooming and favorable object lifetimes, this may be exploitable for remote code execution. Reachable on any deployment that uses mainline Redis cluster discovery with periodic refresh and slow INFO replies (real-world Redis topologies, slowlog-enabled replicas).
