As large language models (LLMs) become central to modern applications, the need for efficient, secure, and scalable infrastructure to serve them has grown rapidly. Envoy Proxy, a high-performance edge and service proxy, has evolved to address the unique challenges of LLM serving. This article explores how Envoy applies its advanced load balancing, upstream cluster management, and CNCF-aligned architecture to the demands of LLM deployment, while maintaining reliability, security, and cost efficiency.