Redesigning Ingress: Docker's Transition to the Next-Gen Ingress System

Introduction

The ingress system manages external access to services in a Kubernetes cluster. Traditional ingress solutions work, but they often run into scalability, observability, and maintainability problems. This article describes Docker's redesign of its ingress system, which moves from legacy components such as NGINX and HAProxy to an architecture centered on Envoy Gateway. The goals are better performance, lower operational complexity, and alignment with the CNCF (Cloud Native Computing Foundation) ecosystem.

Key Components of the New Architecture

1. Envoy Gateway as the Core Solution

Envoy Gateway is a Kubernetes-native ingress controller that uses Envoy Proxy as its data plane. It configures the proxies dynamically through xDS (Envoy's family of discovery service APIs), so routing rules and policies can be updated at runtime. Key advantages include:

  • Dynamic Reconfiguration: Supports live updates to routing policies without service interruption.
  • Strong Community & Extensibility: The Envoy Gateway control plane is written in Go, which allows for custom extensions and integrations.
  • Gateway API Compliance: Implements the Kubernetes Gateway API, a standardized way to define routes, TLS configurations, and traffic policies.

2. System Architecture Overview

The redesigned architecture is structured into multiple layers:

  • Entry Layer: AWS ALB (Application Load Balancer) handles TLS termination, host-based routing, and load balancing at the L7 level.
  • Data Plane: A group of Envoy Proxy instances, segmented by traffic type (external/internal), manages application-level routing and termination.
  • Control Plane: Envoy Gateway manages xDS configurations, translates routing policies, and enforces validation rules.
  • Abstraction Layer: A custom API encapsulates the Gateway API, providing default values and policy validation.
  • Policy Enforcement: OPA (Open Policy Agent) validates route configurations, ensuring uniqueness of domains and paths.
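The abstraction and policy layers above can be illustrated with a small sketch. It is not the actual implementation: OPA policies are written in Rego, and the field names (`name`, `host`, `path`, `timeout_seconds`) and default values here are hypothetical. The sketch only shows the shape of the two checks the text describes, filling in defaults and rejecting routes that claim the same domain and path.

```python
from collections import defaultdict

# Hypothetical defaults that an abstraction layer might fill in.
DEFAULTS = {"path": "/", "timeout_seconds": 30}


def apply_defaults(route):
    """Return a copy of the route with missing fields taken from DEFAULTS."""
    return {**DEFAULTS, **route}


def find_duplicates(routes):
    """Map each (host, path) claimed by more than one route to its owners."""
    owners = defaultdict(list)
    for route in routes:
        owners[(route["host"], route["path"])].append(route["name"])
    return {key: names for key, names in owners.items() if len(names) > 1}


# Example: "web" and "docs" both end up claiming app.example.com/.
submitted = [
    {"name": "web", "host": "app.example.com"},
    {"name": "docs", "host": "app.example.com", "path": "/"},
    {"name": "api", "host": "app.example.com", "path": "/api"},
]
routes = [apply_defaults(r) for r in submitted]
print(find_duplicates(routes))
# {('app.example.com', '/'): ['web', 'docs']}
```

Applying defaults before validation matters in this sketch: the conflict between "web" and "docs" only becomes visible once the implicit path has been made explicit.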

3. Critical Integrations

  • Rate Limiting: Uses the Envoy rate limit service (a Go service), configured dynamically through xDS.
  • Traffic Switching: Uses ALB weighted routing for canary deployments and immediate rollback.
  • Observability: Integrates OpenTelemetry and Envoy's native tracing capabilities for end-to-end visibility.
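To make the rate-limiting integration more concrete, here is a minimal in-memory sketch of a fixed-window limiter. It is a generic illustration of the idea of counting requests per key and time window, not the Envoy rate limit service's implementation or API; the class name, parameters, and the injectable `clock` are all assumptions made for the example.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.counts = {}    # (key, window index) -> requests seen

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        # Drop counters from other windows so the dict stays small.
        self.counts = {b: c for b, c in self.counts.items() if b[1] == window}
        bucket = (key, window)
        count = self.counts.get(bucket, 0)
        if count >= self.limit:
            return False
        self.counts[bucket] = count + 1
        return True


# Example with a fake clock: two requests pass, the third is rejected,
# and the counter resets when the next window starts.
now = [0.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
print([limiter.allow("client-a") for _ in range(3)])  # [True, True, False]
now[0] = 60.0
print(limiter.allow("client-a"))  # True
```

A limiter shared by all proxies would keep these counters in a shared store rather than in process memory; the sketch keeps them local for brevity.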

Migration Process and Challenges

1. Network Layer Migration

  • From NLB (L4) to ALB (L7): Traffic was shifted gradually by adjusting the weights of Route 53 weighted DNS records.
  • Address Header Adjustment: Resolved application-layer issues with the X-Forwarded-For header.
  • Smooth Traffic Transition: ALB's real-time routing enabled service-by-service migration.
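The source does not say exactly what the X-Forwarded-For issue was. One common effect of moving from an L4 to an L7 load balancer is that the load balancer starts appending the client address to this header, so applications have to pick the right entry. A minimal sketch of that selection, assuming a hypothetical `trusted_hops` setting that counts the proxies you operate in front of the application:

```python
import ipaddress


def client_ip(xff_header, trusted_hops=1):
    """Pick the client address from an X-Forwarded-For header value.

    Each proxy appends the address it received the connection from, so the
    entry `trusted_hops` positions from the right is the one written by the
    outermost proxy we control. Entries further left are client-supplied
    and must not be trusted.
    """
    entries = [e.strip() for e in xff_header.split(",") if e.strip()]
    if not entries:
        raise ValueError("empty X-Forwarded-For header")
    index = max(len(entries) - trusted_hops, 0)
    candidate = entries[index]
    ipaddress.ip_address(candidate)  # raises ValueError if not a valid IP
    return candidate


# Client -> ALB -> Envoy -> application: two trusted hops.
print(client_ip("203.0.113.7, 10.0.0.5", trusted_hops=2))
# 203.0.113.7

# A spoofed leading entry is ignored because it lies left of the trusted range.
print(client_ip("1.2.3.4, 203.0.113.7, 10.0.0.5", trusted_hops=2))
# 203.0.113.7
```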

2. Configuration Management

  • Helm Charts: Used to manage routing configurations, with Kubernetes annotations defining traffic rules.
  • Example Configuration: http.route and trafficPolicy annotations specify traffic ratios and routing targets.
  • Canary Deployment: 90% of traffic remained on HAProxy, while 10% was directed to Envoy for testing.
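As an illustration of what a 90/10 split means in practice, the sketch below maps requests to backends in proportion to a pair of weights. The load balancer performs this selection itself; the function name, the hashing approach, and the backend labels are assumptions chosen to keep the example deterministic and easy to test.

```python
import hashlib


def pick_backend(request_key, weights):
    """Deterministically map a request key to a backend by weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    digest = hashlib.sha256(request_key.encode()).digest()
    # Use the first 8 bytes of the hash as a point in [0, total).
    point = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for backend, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return backend
    raise AssertionError("unreachable: point is always below total")


# Canary stage from the text: 90% HAProxy, 10% Envoy.
weights = {"haproxy": 90, "envoy": 10}
sample = [pick_backend(f"request-{i}", weights) for i in range(10_000)]
print(sample.count("envoy") / len(sample))  # close to 0.10
```

Setting the Envoy weight to 0 sends every request back to HAProxy, which is what makes weight-based switching usable as an immediate rollback.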

3. Migration Outcomes

  • Performance: Throughput increased by 4x, with core resource usage reduced by 50%.
  • Reliability: Separation of load balancer and control plane minimized failure risks.
  • Resource Optimization: Reduced sidecar containers and proxy layers simplified operations.
  • Observability: OpenTelemetry integration enabled full-chain tracing.

Technical Challenges and Solutions

1. Gateway API Limitations

  • Missing Features: The Gateway API did not support global rate limiting or IP tagging, so these were added by patching the generated xDS configuration with a patch policy.
  • Control Plane Extensions: Custom policies were implemented via the control plane to avoid developing new services.

2. Migration Risk Mitigation

  • ALB Weight Routing: Enabled gradual traffic switching and rollback.
  • OPA Policy Validation: Prevented configuration errors (e.g., duplicate domains/paths).
  • Helm Chart Standardization: Reduced manual errors in configuration management.
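The gradual switch with rollback can be sketched as a simple decision rule: raise the Envoy weight step by step while the canary looks healthy, and drop it to zero as soon as it does not. The error-rate threshold, step size, and function name below are hypothetical; the source does not state which signals or increments were used.

```python
def next_weight(current, error_rate, max_error_rate=0.01, step=10):
    """Return the next Envoy traffic weight in the range 0-100.

    Roll back to 0 when the observed error rate exceeds the threshold;
    otherwise advance by `step`, capped at 100.
    """
    if error_rate > max_error_rate:
        return 0
    return min(current + step, 100)


# A healthy ramp followed by a rollback when errors spike.
weight = 10
for observed in [0.001, 0.002, 0.05]:
    weight = next_weight(weight, observed)
    print(weight)
# 20
# 30
# 0
```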

Key Technical Insights

1. Service Migration Strategy

  • Service Registration: 90% of traffic was routed through Envoy, with 10% in canary mode.
  • Account Service: 90% remained on HAProxy, while 10% transitioned to Envoy.
  • Traffic Switching: ALB-based mechanisms ensured smooth migration and rollback.

2. Ongoing Challenges

  • Gateway API Limitations: Required manual xDS patching for unsupported features.
  • Canary Deployment: Needed custom solutions due to lack of native support.
  • Envoy Admin Console: Limited remote access, requiring port forwarding for troubleshooting.

Future Roadmap and Improvements

1. Technical Goals

  • Reduce Cross-Region Outbound Costs: Optimize routing for cost efficiency.
  • Control Plane Canary Deployments: Enhance confidence in upgrades.
  • Integration with CNCF: Participate in the Envoy Gateway Steering Committee to influence development.

2. Architecture Adjustments

  • NGINX Deprecation: Move its logic to Envoy for streamlined operations.
  • Kubernetes Integration: Optimize custom strategies for better alignment with Kubernetes.

Conclusion

Docker's transition to an ingress system built on Envoy Gateway addresses the main pain points of the legacy setup: scalability, observability, and operational complexity. By combining xDS, the Gateway API, and OpenTelemetry, the new architecture delivers higher throughput and better reliability. Challenges such as Gateway API limitations and migration risk remain, but the reduced resource consumption and improved observability justify the investment. The redesign shows how adopting CNCF standards and modern proxy technologies can help future-proof cloud-native infrastructure.