DeepSeek Model Linux Deployment Guide

Deploying DeepSeek Models on Linux

I. Deploy and Call Ollama

1. Install Ollama

# Install with the official script
curl -fsSL https://ollama.com/install.sh | sh
# Start the service (some systems require starting it manually)
sudo systemctl start ollama

2. Download and Run a Model

# For example, run the llama2 model (it is downloaded on first use)
ollama run llama2
# Or run the mistral model
ollama run mistral

3. Verify That Ollama Is Running

Open http://localhost:11434. If the response is the text "Ollama is running", the service is up.
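
The same check can be scripted, which is convenient on a headless server. The following is a minimal sketch using only the Python standard library; the function name ollama_is_running and the 3-second timeout are illustrative choices, not part of Ollama.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=3.0):
    """Return True if the root endpoint answers HTTP 200 and mentions Ollama."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200 and "Ollama" in body
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_running())
```

A False result only says the endpoint did not answer as expected from this machine; it does not distinguish a stopped service from a blocked port.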

II. Deploy DeepSeek

1. Install Dependencies

# Install Python and pip
sudo apt update
sudo apt install python3 python3-pip
# Install PyTorch (pick the build that matches your CUDA version)
pip3 install torch torchvision torchaudio
# Install the Hugging Face libraries
pip3 install transformers huggingface_hub

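2. Download the DeepSeek Model

# Log in with the Hugging Face CLI (requires an account)
huggingface-cli login
# Download the model (assuming the model is deepseek-ai/deepseek-r1)
# git-lfs must be installed first, e.g. sudo apt install git-lfs
git lfs install
git clone https://huggingface.co/deepseek-ai/deepseek-r1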
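
A common failure at this step is a clone that finishes quickly but contains only Git LFS pointer stubs instead of the real weight files, typically because git-lfs was not active. Pointer stubs are small text files that begin with the line "version https://git-lfs.github.com/spec/v1". The sketch below scans a cloned directory for such stubs; the helper name find_lfs_pointers is hypothetical and the check is a heuristic, not an official Hugging Face tool.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    stubs = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip Git's own metadata and anything that is not a regular file
        if ".git" in rel.parts or not path.is_file():
            continue
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            stubs.append(path)
    return stubs
```

Calling find_lfs_pointers on the cloned directory should return an empty list once the download is complete; if it lists any files, running git lfs pull inside the repository usually fetches the missing content.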

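3. Create the FastAPI Service

Create app.py:

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_PATH = "/path/to/deepseek-r1"  # replace with the actual model directory

app = FastAPI()
# Load the tokenizer and model once, when the server starts
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.float16)

class AskRequest(BaseModel):
    prompt: str  # read from the JSON body: {"prompt": "..."}

@app.post("/ask")
def ask(req: AskRequest):
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(req.prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=50)
    # The decoded text contains the prompt followed by the generated tokens
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return {"answer": response}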

4. Start the Service

pip3 install fastapi uvicorn
uvicorn app:app --host 0.0.0.0 --port 8000
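
If uvicorn fails to start because the address is already in use, another process is holding port 8000. As a rough pre-flight check, the sketch below tests whether anything is accepting connections on a local port. The function name port_is_free is hypothetical; the check only probes the loopback address, so a service bound to a different interface would not be detected.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for port in (11434, 8000):
        print(port, "free" if port_is_free(port) else "in use")
```

The result is a snapshot: another process can still claim the port between the check and the server start.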

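III. Example Client Calls

Calling the Ollama API (Python)

import requests

def ask_ollama(prompt, model="llama2"):
    # "stream": False asks for one JSON object instead of a stream of chunks
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
    )
    response.raise_for_status()  # fail early on HTTP errors
    return response.json()["response"]

answer = ask_ollama("Why is the sky blue?")
print(answer)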
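
The example above sets "stream" to False. When streaming is left on, the /api/generate endpoint sends newline-delimited JSON objects, each carrying a fragment of text in its "response" field, with "done" set to true on the last one. The sketch below shows one way to reassemble such a stream; collect_stream is a hypothetical helper, and it takes any iterable of lines so that it does not depend on a particular HTTP client.

```python
import json


def collect_stream(lines):
    """Concatenate the "response" fragments from newline-delimited JSON chunks."""
    parts = []
    for raw in lines:
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)  # accepts str or bytes
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the answer
    return "".join(parts)
```

With requests, the lines could come from response.iter_lines() on a request made with stream=True; printing each fragment as it arrives instead of joining them gives incremental output.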

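Calling the DeepSeek API (Python)

import requests

def ask_deepseek(prompt):
    # The prompt is sent as a JSON body to the /ask endpoint defined in app.py
    response = requests.post(
        "http://localhost:8000/ask",
        json={"prompt": prompt},
    )
    response.raise_for_status()  # fail early on HTTP errors
    return response.json()["answer"]

answer = ask_deepseek("Why is the sky blue?")
print(answer)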

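IV. Notes

Hardware requirements: make sure there is enough memory to run the model (GPU memory when running on a GPU). Ollama's 7B models need at least 8 GB of RAM; DeepSeek models may need considerably more.
Model path: replace /path/to/deepseek-r1 with the directory where the model is actually stored.
Service ports: avoid port conflicts. Ollama listens on 11434 by default; the DeepSeek example uses 8000.
Security: if external access is needed, configure firewall rules to open the relevant ports.
Model selection: ollama list shows the models already downloaded locally; the models available for download are listed in the Ollama model library. For DeepSeek, make sure you have legitimate access to the model and the right to use it.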
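
For the hardware note, a rough lower bound on memory is the parameter count multiplied by the bytes stored per parameter. The sketch below applies that arithmetic; it counts weights only (activations and the generation cache add more), and the 0.5 bytes per parameter figure assumes roughly 4-bit quantization, which is an assumption rather than something stated above.

```python
def weight_memory_gib(num_params, bytes_per_param=2.0):
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# A 7B-parameter model in float16 (2 bytes per parameter)
print(round(weight_memory_gib(7e9), 1))       # 13.0
# The same model at roughly 4 bits (0.5 bytes) per parameter
print(round(weight_memory_gib(7e9, 0.5), 1))  # 3.3
```

The second figure is consistent with a quantized 7B model fitting in 8 GB, while loading the same model with torch_dtype=torch.float16 as in app.py needs about 13 GiB for the weights before any other overhead.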