Generative AI on Kubernetes: Operationalizing Large Language Models by Roland Huß, Daniele Zonca

Regular price Tk 600.00 BDT, sale price Tk 350.00 BDT
Sale Sold out
Shipping calculated at checkout.

🚚 Cash on delivery across Bangladesh 🕒 Delivery nationwide within 72 hours

The core thesis of Generative AI on Kubernetes is that the biggest challenge facing generative AI today is no longer model design but orchestration at scale. Large Language Models (LLMs) are unusually demanding: they consume large amounts of GPU memory, require specialized hardware acceleration, run up high cloud compute bills, and behave non-deterministically at runtime. Running these workloads on rigid, traditional virtual machines leads to underused hardware, slow scaling, and frequent production outages.

Huß and Zonca position Kubernetes as the control plane for production-grade AI. They provide a practical roadmap for training, fine-tuning, deploying, and auto-scaling generative models. The authors walk readers through configuring advanced GPU scheduling, setting up fractional hardware virtualization, and minimizing the cost of idle cluster capacity. Crucially, the text details how to run open runtimes like vLLM and TGI within a cloud-native mesh, connect secure agent networks to real-time external tools, and establish monitoring pipelines to track LLM health metrics.

As our regional software ecosystem moves from simple AI experiments to live, production-grade applications, teams are hitting an infrastructure wall. Companies are burning through capital on unmanaged, always-running cloud servers that sit idle for hours but crash under sudden surges in user traffic. MLOps teams are struggling to manage GPU allocations, protect underlying data from security vulnerabilities, and balance compute costs against real-time application speed.

Language: English.

Genre: Systems Architecture.

Binding: Sewn binding.

Quality: Premium Quality Books.

Printing: High Quality Printing.

Paper: Eye Friendly paper (Cream White)

Cover: Matt cover (Paperback).
