
If you would rather not manage the underlying Kubernetes infrastructure yourself, you can host DeepSeek on Amazon EKS Auto Mode for greater flexibility and scalability. This article walks through hosting the DeepSeek-R1 model on Amazon EKS.

1. Introduction to Amazon EKS

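Amazon EKS is a highly available, scalable, and secure managed Kubernetes service for deploying, managing, and scaling containerized applications on Amazon Web Services (AWS). It runs the Kubernetes management infrastructure (the control plane) across multiple AWS Availability Zones, automatically detects and replaces unhealthy control plane nodes, and provides on-demand upgrades and patching. You only need to provision worker nodes and connect them to the Amazon EKS endpoint provided for the cluster.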


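This article uses the DeepSeek-R1-Distill-Llama-8B distilled model. Compared with the full DeepSeek-R1 model and its 671B parameters, it needs far fewer resources; it is less capable but offers a lighter-weight option. To deploy the full DeepSeek-R1 model instead, replace the distilled model in the vLLM configuration.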

2. Install prerequisites

To simplify setup, this article uses AWS CloudShell.

# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# Install Terraform
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/AmazonLinux/hashicorp.repo
sudo yum -y install terraform
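
The later steps also call git, jq, aws, docker, and openssl, so it can help to confirm everything is on the PATH before starting. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not part of the sample repository.

```shell
# Print the name of every listed tool that is not found on PATH.
# Prints nothing and returns 0 when all tools are available;
# returns 1 when at least one tool is missing.
missing_tools() {
  mt_status=0
  for mt_tool in "$@"; do
    if ! command -v "$mt_tool" >/dev/null 2>&1; then
      printf '%s\n' "$mt_tool"
      mt_status=1
    fi
  done
  return "$mt_status"
}
```

For example, `missing_tools kubectl terraform git jq aws docker openssl` lists whichever of the tools used in this walkthrough still need to be installed.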

3. Create an Amazon EKS cluster with Auto Mode enabled using Terraform

Terraform provisions the infrastructure, including an Amazon VPC, an Amazon ECR repository, and an Amazon EKS cluster with Auto Mode enabled.

# Clone the GitHub repo with the manifests
git clone -b v0.1 https://github.com/aws-samples/deepseek-using-vllm-on-eks
cd deepseek-using-vllm-on-eks
# Apply the Terraform configuration
terraform init
terraform apply -auto-approve
# After Terraform finishes, configure kubectl with the new EKS cluster
$(terraform output configure_kubectl | jq -r)

4. Create an Amazon EKS Auto Mode NodePool

Create a custom NodePool that supports GPUs.

# Create a custom NodePool with GPU support
kubectl apply -f manifests/gpu-nodepool.yaml
# Check if the NodePool is in 'Ready' state
kubectl get nodepool/gpu-nodepool

5. Deploy the DeepSeek model

To simplify deploying DeepSeek-R1-Distill-Llama-8B with vLLM, this walkthrough provides a sed command that sets the model name and parameters.

# Use the sed command to replace the placeholder with the model name and configuration parameters
sed -i "s|__MODEL_NAME_AND_PARAMETERS__|deepseek-ai/DeepSeek-R1-Distill-Llama-8B --max-model-len 2048|g" manifests/deepseek-deployment-gpu.yaml
# Deploy the DeepSeek model on Kubernetes
kubectl apply -f manifests/deepseek-deployment-gpu.yaml
# Check the pods in the 'deepseek' namespace 
kubectl get po -n deepseek
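
sed does not report an error when a placeholder is missing or misspelled, so the manifest can be applied with the literal token still in it. The sketch below wraps the substitution in a hypothetical helper, `fill_placeholder`, that fails if any copy of the token remains. It uses `|` as the sed delimiter, as the command above does, because model names contain `/`.

```shell
# Replace every occurrence of PLACEHOLDER in FILE with VALUE, then fail
# if the placeholder is still present afterwards.
# Usage: fill_placeholder FILE PLACEHOLDER VALUE
# Assumes VALUE contains no '|', '&', or backslash, which sed treats specially.
fill_placeholder() {
  fp_file=$1; fp_token=$2; fp_value=$3
  fp_tmp=$(mktemp) || return 1
  # Write to a temporary file first so a sed failure leaves FILE untouched.
  sed "s|$fp_token|$fp_value|g" "$fp_file" > "$fp_tmp" || { rm -f "$fp_tmp"; return 1; }
  cat "$fp_tmp" > "$fp_file"
  rm -f "$fp_tmp"
  # grep -q succeeds when the token is still present, so invert the result.
  ! grep -qF -- "$fp_token" "$fp_file"
}
```

Calling it with `manifests/deepseek-deployment-gpu.yaml`, `__MODEL_NAME_AND_PARAMETERS__`, and the same model string used in the sed command above should leave the manifest ready for `kubectl apply`.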

At first, the pod may stay in the Pending state while Amazon EKS Auto Mode provisions the underlying EC2 instance with the required GPU drivers.

If the pod stays in Pending for several minutes, confirm that your AWS account has enough service quota to launch the required instances by checking the quota limits for G or P instances.

Note: these quotas are based on vCPUs, not on the number of instances, so request quota accordingly.
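
Because the quota counts vCPUs, the figure to request is the instance count multiplied by the vCPUs per instance. As an illustration, two g5.2xlarge instances (8 vCPUs each) need 16 vCPUs of quota. The article does not say which instance types the NodePool selects, so treat that instance type as an assumption. The helper names below are hypothetical, and the quota-name filters in `show_gpu_quotas` are also assumptions that should be checked against the Service Quotas console.

```shell
# vCPUs that must fit under the quota for COUNT instances of one size.
# Usage: required_vcpus COUNT VCPUS_PER_INSTANCE
required_vcpus() {
  echo $(( $1 * $2 ))
}

# Succeeds when QUOTA_VCPUS covers the request.
# Usage: quota_is_enough QUOTA_VCPUS COUNT VCPUS_PER_INSTANCE
quota_is_enough() {
  [ "$1" -ge "$(required_vcpus "$2" "$3")" ]
}

# List EC2 quotas whose names mention G or P instances (assumed name filters).
# Defined but not invoked here, because it needs AWS credentials.
show_gpu_quotas() {
  aws service-quotas list-service-quotas --service-code ec2 \
    --query "Quotas[?contains(QuotaName, 'G and VT') || contains(QuotaName, 'P instances')].[QuotaName,Value]" \
    --output table
}
```

For example, `quota_is_enough 8 2 8` fails, which signals that a quota of 8 vCPUs cannot hold two 8-vCPU instances.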

# Wait for the pod to reach the 'Running' state
watch -n 1 kubectl get po -n deepseek
# Verify that a new Node has been created
kubectl get nodes -l owner=data-engineer
# Check the logs to confirm that vLLM has started
kubectl logs deployment.apps/deepseek-deployment -n deepseek

When the deployment is ready, the log shows an "Application startup complete" entry.
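
Instead of re-running the log command by hand, the check can be polled. This is a minimal sketch with a hypothetical helper, `wait_for_output`. The pattern to pass is assumed to be the English log text "Application startup complete", and the attempt count and interval are arbitrary choices.

```shell
# Run COMMAND repeatedly until its output contains PATTERN, or give up.
# Usage: wait_for_output PATTERN ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the pattern is seen, 1 after ATTEMPTS failed tries.
wait_for_output() {
  wo_pattern=$1; wo_attempts=$2; wo_interval=$3
  shift 3
  wo_i=1
  while [ "$wo_i" -le "$wo_attempts" ]; do
    # grep -F treats the pattern as a fixed string, not a regular expression.
    if "$@" 2>/dev/null | grep -qF -- "$wo_pattern"; then
      return 0
    fi
    wo_i=$((wo_i + 1))
    sleep "$wo_interval"
  done
  return 1
}
```

One way to use it: `wait_for_output "Application startup complete" 60 10 kubectl logs deployment.apps/deepseek-deployment -n deepseek`, which checks the logs every 10 seconds for up to about 10 minutes.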

6. Interact with the DeepSeek LLM

Next, create a local proxy and interact with the model using curl requests.

# Set up a proxy to forward the service port to your local terminal
kubectl port-forward svc/deepseek-svc -n deepseek 8080:80 > port-forward.log 2>&1 &
# Send a curl request to the model
curl -X POST "http://localhost:8080/v1/chat/completions" -H "Content-Type: application/json" --data '{
"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
"messages": [
{
"role": "user",
"content": "What is Kubernetes?"
}
]
}'

Depending on the complexity of the model's output, the response may take a few seconds. You can monitor progress in the deepseek-deployment logs.
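
To send different prompts without editing the JSON by hand, the request body can be generated. The sketch below assumes the port-forward from the previous step is still running. `json_escape`, `chat_payload`, and `ask_model` are hypothetical helper names, `max_tokens` is the OpenAI-style parameter for capping the reply length, and the default of 512 is an arbitrary choice.

```shell
# Escape backslashes and double quotes so TEXT can sit inside a JSON string.
# Limitation: newlines, tabs, and other control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build an OpenAI-style chat completion request body.
# Usage: chat_payload MODEL PROMPT [MAX_TOKENS]
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %s}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "${3:-512}"
}

# Send PROMPT to the port-forwarded service and print the raw JSON response.
# Usage: ask_model PROMPT [MAX_TOKENS]
ask_model() {
  curl -s -X POST "http://localhost:8080/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data "$(chat_payload "deepseek-ai/DeepSeek-R1-Distill-Llama-8B" "$1" "${2:-512}")"
}
```

Assuming the response follows the OpenAI chat completion format, `ask_model "What is Kubernetes?" | jq -r '.choices[0].message.content'` prints only the reply text. For prompts containing newlines or other control characters, building the body with `jq -n --arg` would be more robust than the sed-based escaping used here.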

7. Build a chatbot UI for the model

Calling the API directly works fine, but you can also build a friendlier chatbot UI for interacting with the model. The source code for this UI is available on GitHub.

# Retrieve the ECR repository URI created by Terraform
export ECR_REPO=$(terraform output ecr_repository_uri | jq -r)
# Build the container image for the Chatbot UI
docker build -t $ECR_REPO:0.1 chatbot-ui/application/.
# Login to ECR and push the image
aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REPO
docker push $ECR_REPO:0.1
# Update the deployment manifest to use the image
sed -i "s#__IMAGE_DEEPSEEK_CHATBOT__#$ECR_REPO:0.1#g" chatbot-ui/manifests/deployment.yaml
# Generate a random password for the Chatbot UI login
sed -i "s|__PASSWORD__|$(openssl rand -base64 12 | tr -dc A-Za-z0-9 | head -c 16)|" chatbot-ui/manifests/deployment.yaml
# Deploy the UI and create the ingress class required for load balancers
kubectl apply -f chatbot-ui/manifests/ingress-class.yaml
kubectl apply -f chatbot-ui/manifests/deployment.yaml
# Get the URL for the load balancer to access the application
echo http://$(kubectl get ingress/deepseek-chatbot-ingress -n deepseek -o json | jq -r '.status.loadBalancer.ingress[0].hostname')
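
A note on the password step above: `openssl rand -base64 12` emits exactly 16 base64 characters, and `tr -dc A-Za-z0-9` then drops any `+` or `/`, so the resulting password can be shorter than 16 characters. If a fixed length matters, the following sketch keeps drawing random data until enough characters remain. It reads from /dev/urandom rather than openssl, and `gen_password` is a hypothetical helper name.

```shell
# Generate a password of exactly LENGTH alphanumeric characters (default 16).
# Usage: gen_password [LENGTH]
gen_password() {
  gp_len=${1:-16}
  gp_out=""
  # Keep drawing random bytes until enough alphanumeric characters survive the filter.
  while [ "${#gp_out}" -lt "$gp_len" ]; do
    gp_out="$gp_out$(head -c 64 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9')"
  done
  # Trim to the requested length.
  printf '%s\n' "$gp_out" | cut -c "1-$gp_len"
}
```

It can stand in for the inline pipeline, for example `sed -i "s|__PASSWORD__|$(gen_password 16)|" chatbot-ui/manifests/deployment.yaml`.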

Wait a few seconds for the load balancer to finish provisioning.

To access the chatbot UI, you need the username and password stored in a Kubernetes secret.

echo -e "Username=$(kubectl get secret deepseek-chatbot-secrets -n deepseek -o jsonpath='{.data.admin-username}' | base64 --decode)\nPassword=$(kubectl get secret deepseek-chatbot-secrets -n deepseek -o jsonpath='{.data.admin-password}' | base64 --decode)"

After you log in, the interface shows a new "Chatbot" tab where you can interact with the model.

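By following these steps, you can deploy the DeepSeek-R1 model efficiently on Amazon EKS and use its flexible scaling options and fine-grained resource control to optimize cost while maintaining high performance. The solution builds on native Kubernetes capabilities and features such as Amazon EKS Auto Mode to provide a highly configurable deployment that can be matched closely to operational needs and budget.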
