How to Build a DeepSeek Telegram Bot With V4 in Python
You want a private chat assistant that lives in Telegram, answers in seconds, and costs cents per thousand conversations — not a SaaS subscription. A DeepSeek Telegram bot does exactly that. You wire Telegram’s Bot API to DeepSeek’s OpenAI-compatible chat endpoint, run a small Python process, and you have a working assistant your team can message from any device.
This tutorial builds that bot end to end on DeepSeek V4 (released April 24, 2026), the current generation. You’ll register a bot with BotFather, write an async Python handler with `python-telegram-bot`, call `deepseek-v4-flash` for cheap general chat (and switch to thinking mode when you need it), persist per-user conversation history, and ship a production-ready cost calculation. Total build time: about 30 minutes.
What you’ll build
A single-file Python bot that:
- Receives messages in Telegram via long polling (no public webhook required for development).
- Maintains per-chat conversation history in memory, because the DeepSeek API is stateless and will not remember turns for you.
- Calls `deepseek-v4-flash` by default for fast, cheap replies, and lets users flip to V4-Pro thinking mode with a `/think` command.
- Streams responses back so users see typing-style updates instead of staring at a long blank.
- Logs token usage so you can verify cost math against the DeepSeek API pricing page.
By the end you’ll have a bot you could plausibly leave running for a small team, plus the patterns to harden it for production. If you’ve never touched the DeepSeek API before, skim the DeepSeek API getting started guide first — this tutorial assumes you know what an API key is.
Prerequisites
- Python 3.10 or newer. The `python-telegram-bot` library is async-first and requires 3.10+.
- A Telegram account on any device.
- A DeepSeek API key from `platform.deepseek.com`. See get a DeepSeek API key if you don’t have one yet.
- Two PyPI packages: `python-telegram-bot` and `openai` (we’ll use the OpenAI SDK against DeepSeek’s OpenAI-compatible base URL).
- A code editor and a terminal. No public IP, no SSL certificate, no Docker required for the polling-based version in this tutorial.
Step 1: Register the bot with BotFather
Telegram’s @BotFather is the official tool for creating bots. Obtaining a token is as simple as contacting @BotFather, issuing the /newbot command and following the steps until you’re given a new token.
- Open Telegram and search for @BotFather. Select the result with the blue verification checkmark — this confirms it’s the official BotFather, not an impostor.
- Send `/newbot`.
- Choose a display name (e.g. “DeepSeek Helper”) when asked.
- Choose a username. It must be between 5 and 32 characters long, is not case-sensitive, may only include Latin characters, numbers, and underscores, and must end in “bot”, like `tetris_bot` or `TetrisBot`.
- Copy the token BotFather sends you. It looks like `123456789:AA...`.
Treat that token like a password. If you suspect your token has been compromised, revoke it immediately using BotFather’s /revoke command. Never commit it to git; we’ll load it from an environment variable.
Optional: register slash commands
While you’re in BotFather, send /setcommands, pick your bot, and paste:
start - Greet and explain the bot
reset - Clear conversation history
think - Toggle V4-Pro thinking mode
mode - Show current model and mode
Telegram will then show these as a menu in the chat input.
Step 2: Project setup
Create a directory and install dependencies. The shell snippet below uses bash:
mkdir deepseek-tg && cd deepseek-tg
python3 -m venv .venv
source .venv/bin/activate
pip install "python-telegram-bot==21.*" "openai>=1.40"
Create a .env file (and add it to .gitignore):
TELEGRAM_BOT_TOKEN=123456789:AA-your-token
DEEPSEEK_API_KEY=sk-your-deepseek-key
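If you’d rather not export the variables in your shell (Step 4 uses `set -a && source .env`), a stdlib-only loader works too. The `load_env_file` helper below is a hypothetical convenience, not part of either SDK, and handles only plain `KEY=VALUE` lines with `#` comments:

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines and '#' comments only.

    Hypothetical helper -- no quoting or interpolation rules, unlike the
    python-dotenv package. Real environment variables take precedence.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: never override variables already set in the shell.
            os.environ.setdefault(key.strip(), value.strip())
```

Call `load_env_file()` at the top of `bot.py` before reading `os.environ`, and you can skip the shell dance entirely.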
Step 3: Wire DeepSeek to Telegram
This is the core file, bot.py. The python-telegram-bot library is async — it provides a pure Python, asynchronous interface for the Telegram Bot API and is compatible with Python versions 3.10+. Chat requests hit POST /chat/completions, the OpenAI-compatible endpoint at https://api.deepseek.com, so we use the OpenAI Python SDK with a swapped base_url.
import os
import logging
from collections import defaultdict, deque

from openai import AsyncOpenAI
from telegram import Update
from telegram.ext import (
    ApplicationBuilder, CommandHandler, MessageHandler,
    ContextTypes, filters,
)

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("deepseek-tg")

TG_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
DS_KEY = os.environ["DEEPSEEK_API_KEY"]

client = AsyncOpenAI(
    base_url="https://api.deepseek.com",
    api_key=DS_KEY,
)

SYSTEM_PROMPT = (
    "You are a concise assistant inside Telegram. "
    "Reply in plain text under 1500 characters unless asked for code."
)

# Per-chat rolling history. Stateless API => we resend each turn.
HISTORY: dict[int, deque] = defaultdict(lambda: deque(maxlen=20))
THINKING: dict[int, bool] = defaultdict(bool)


async def start(update: Update, ctx: ContextTypes.DEFAULT_TYPE):
    await update.message.reply_text(
        "DeepSeek bot online. Send a message. "
        "/think toggles V4-Pro thinking mode. /reset clears history."
    )


async def reset(update: Update, ctx: ContextTypes.DEFAULT_TYPE):
    HISTORY[update.effective_chat.id].clear()
    await update.message.reply_text("History cleared.")


async def toggle_think(update: Update, ctx: ContextTypes.DEFAULT_TYPE):
    cid = update.effective_chat.id
    THINKING[cid] = not THINKING[cid]
    await update.message.reply_text(
        f"Thinking mode: {'ON (deepseek-v4-pro)' if THINKING[cid] else 'OFF (deepseek-v4-flash)'}"
    )


async def show_mode(update: Update, ctx: ContextTypes.DEFAULT_TYPE):
    cid = update.effective_chat.id
    model = "deepseek-v4-pro" if THINKING[cid] else "deepseek-v4-flash"
    await update.message.reply_text(f"Model: {model} | thinking: {THINKING[cid]}")


async def on_message(update: Update, ctx: ContextTypes.DEFAULT_TYPE):
    cid = update.effective_chat.id
    user_text = update.message.text
    HISTORY[cid].append({"role": "user", "content": user_text})
    messages = [{"role": "system", "content": SYSTEM_PROMPT}, *HISTORY[cid]]

    kwargs = {"model": "deepseek-v4-flash", "messages": messages,
              "temperature": 1.3, "max_tokens": 1500, "stream": True}
    if THINKING[cid]:
        kwargs["model"] = "deepseek-v4-pro"
        kwargs["reasoning_effort"] = "high"
        kwargs["extra_body"] = {"thinking": {"type": "enabled"}}

    placeholder = await update.message.reply_text("…")
    buffer = ""
    try:
        stream = await client.chat.completions.create(**kwargs)
        async for chunk in stream:
            delta = chunk.choices[0].delta.content or ""
            buffer += delta
            # Edit every ~80 chars to keep Telegram happy
            if len(buffer) % 80 < len(delta):
                try:
                    await placeholder.edit_text(buffer[:4000])
                except Exception:
                    pass  # e.g. "message is not modified"
        final = buffer[:4000] or "(empty response)"
        try:
            await placeholder.edit_text(final)
        except Exception:
            pass  # final text may equal the last streamed edit
        HISTORY[cid].append({"role": "assistant", "content": final})
    except Exception as e:
        log.exception("DeepSeek call failed")
        await placeholder.edit_text(f"Error: {e}")


def main():
    app = ApplicationBuilder().token(TG_TOKEN).build()
    app.add_handler(CommandHandler("start", start))
    app.add_handler(CommandHandler("reset", reset))
    app.add_handler(CommandHandler("think", toggle_think))
    app.add_handler(CommandHandler("mode", show_mode))
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, on_message))
    app.run_polling(allowed_updates=Update.ALL_TYPES)


if __name__ == "__main__":
    main()
A few things to notice:
- Stateless API, client-side memory. DeepSeek's API does not remember previous turns. The `HISTORY` dict resends the last 20 messages on every call, capped to keep token spend predictable. The web/app surface keeps history server-side; the API does not.
- Two model IDs, one parameter for thinking. Both `deepseek-v4-pro` (1.6T total / 49B active) and `deepseek-v4-flash` (284B / 13B active) are open-weight MoE models under MIT. Thinking mode is enabled by setting `reasoning_effort="high"` plus `extra_body={"thinking": {"type": "enabled"}}` — there's no separate "reasoner" model ID in V4.
- Temperature 1.3 matches DeepSeek's official guidance for general conversation and translation. Drop it to 0.0 for code, 1.5 for creative writing.
- Streaming via `stream=True` yields server-sent-event chunks; we edit the placeholder message progressively so the user sees the answer build.
Step 4: Run it
From the project directory, with the venv active:
set -a && source .env && set +a
python bot.py
You should see Application started in the logs. Open Telegram, find your bot by the username you chose, and send /start. A reply within a couple of seconds means polling, the API call, and streaming all work.
Verify it worked
- Send a multi-turn question ("My name is Sam." then "What's my name?"). The second answer should reference Sam — proof that history is being resent correctly.
- Run `/think` and ask a reasoning-heavy question ("If a train leaves Boston at 70 mph and another from NYC at 60 mph…"). The reply should be visibly more deliberate. Behind the scenes, V4-Pro returns `reasoning_content` alongside the final `content`; the SDK exposes only `content` by default unless you read `chunk.choices[0].delta.model_extra`.
- Run `/reset`, ask "What's my name?" again — the bot should not know.
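The multi-turn check passes only because the bot resends history. Stripped of the Telegram plumbing, the mechanism reduces to this sketch (chat id 42 and the replies are arbitrary):

```python
from collections import defaultdict, deque

SYSTEM_PROMPT = "You are a concise assistant inside Telegram."
HISTORY = defaultdict(lambda: deque(maxlen=20))

def build_messages(chat_id: int, user_text: str) -> list[dict]:
    """Append the new user turn and return the full payload the API needs."""
    HISTORY[chat_id].append({"role": "user", "content": user_text})
    return [{"role": "system", "content": SYSTEM_PROMPT}, *HISTORY[chat_id]]

# Turn 1: the name enters the client-side history.
build_messages(42, "My name is Sam.")
HISTORY[42].append({"role": "assistant", "content": "Nice to meet you, Sam."})

# Turn 2: "Sam" is in the payload again only because we resent it --
# the API itself remembered nothing between the two calls.
payload = build_messages(42, "What's my name?")
```

`/reset` simply clears the deque, which is why the bot forgets instantly.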
Step 5: Cost math you can trust
Here's a worked example for a small team using deepseek-v4-flash, the default in this bot. Assume 10,000 messages per month, with an average 400-token system + history prefix (cached across calls), 60-token user message, and 250-token reply.
| Bucket | Tokens | Rate (V4-Flash, USD/1M) | Cost |
|---|---|---|---|
| Input, cache hit | 4,000,000 | $0.028 | $0.112 |
| Input, cache miss | 600,000 | $0.14 | $0.084 |
| Output | 2,500,000 | $0.28 | $0.700 |
| Total | — | — | $0.896 |
Under one dollar a month at this volume. Switch the same workload to deepseek-v4-pro ($0.145 / $1.74 / $3.48 per 1M tokens) and you'd pay roughly $10.30 — a 12× jump for the frontier tier. For most chat-style traffic, V4-Flash is the right default; reserve V4-Pro for hard reasoning, agentic loops, or coding tasks where the benchmark lift matters. A clean way to plan this is the DeepSeek pricing calculator.
Two cost gotchas to remember:
- The system-prompt cache hit doesn't cover the user message on each call — those fresh tokens are always billed at the cache-miss rate. Both lines must appear in any cost example.
- Off-peak discounts ended September 5, 2025 and have not returned with V4. Don't budget for a "night-time" 50% off.
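The table's arithmetic is easy to script so you can re-run it with your own traffic profile. The rates below are this article's V4-Flash and V4-Pro numbers; refresh them from the pricing page before trusting the output:

```python
def monthly_cost_usd(msgs: int, cached_prefix: int, user_msg: int, reply: int,
                     hit: float, miss: float, out: float) -> float:
    """Reproduce the cost table: rates are USD per 1M tokens.

    The prefix tokens hit the cache on every call; the fresh user message
    is always billed at the cache-miss rate.
    """
    millions = msgs / 1_000_000
    return millions * (cached_prefix * hit + user_msg * miss + reply * out)

# Worked example from the table: 10,000 msgs, 400-token cached prefix,
# 60-token user message, 250-token reply.
flash = monthly_cost_usd(10_000, 400, 60, 250, hit=0.028, miss=0.14, out=0.28)
pro = monthly_cost_usd(10_000, 400, 60, 250, hit=0.145, miss=1.74, out=3.48)
print(f"V4-Flash: ${flash:.3f}  V4-Pro: ${pro:.2f}")  # 0.896 vs ~10.32
```

Scale `msgs` up and watch where the output line dominates: replies are by far the most expensive bucket at every volume.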
Step 6: Migrating from legacy model IDs
If you copied an older tutorial that uses deepseek-chat or deepseek-reasoner, those still work — they currently route to deepseek-v4-flash in non-thinking and thinking mode respectively. They will be retired on 2026-07-24 at 15:59 UTC, after which requests using those IDs will fail. The migration is a one-line model= swap in your chat.completions.create call; base_url does not change.
Common errors and fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| `Conflict: terminated by other getUpdates request` | Two copies of the bot running against the same token | Stop the duplicate; only one polling instance per token |
| `401 Unauthorized` from DeepSeek | Wrong or expired API key | Regenerate at platform.deepseek.com; verify the env var is loaded |
| `insufficient_balance` | Account out of credit | Top up; check the billing console for any granted balance |
| Telegram `message is not modified` | Streaming edit produced identical text | Already wrapped in try/except in the example |
| Empty `content` in response | JSON mode without "json" in the prompt, or `max_tokens` too low | Include the word "json" plus a schema example; raise `max_tokens` |
| Long replies cut off mid-sentence | Telegram's 4096-char message limit | Split with a paginator or send as a file |
For deeper diagnostics, the DeepSeek API error codes reference lists every status code DeepSeek returns and what it usually means.
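For the truncation problem, a simple splitter that respects Telegram's 4096-character cap looks like this. It prefers to break at newlines so paragraphs stay intact; it is a sketch, not a full Markdown-aware paginator:

```python
TELEGRAM_LIMIT = 4096  # Telegram's per-message character cap

def split_message(text: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split a long reply into chunks under the per-message limit.

    Breaks at the last newline inside each window when one exists,
    otherwise hard-cuts at the limit.
    """
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit  # no newline in the window; hard cut
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

In the bot, replace the single `edit_text(final)` with an edit for the first chunk and `reply_text` calls for the rest.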
Hardening for production
The polling bot above is fine for personal use. If you're putting it in front of users, layer in:
- Rate limiting per user — wrap `on_message` with a token-bucket keyed on `update.effective_user.id`. Without it, one user can drain your DeepSeek balance.
- Persistent history — swap the in-memory `HISTORY` dict for SQLite or Redis. The current setup loses everything on restart.
- Webhooks instead of polling for higher throughput. With a webhook you register an HTTPS URL with Telegram, and whenever there is an update for the bot, Telegram sends an HTTPS POST request to that URL containing a JSON-serialized Update.
- Allowlist — gate `on_message` on a set of approved `chat_id` values until you're confident in your spend cap.
- Function calling — for tools like web search or database lookups, declare them in OpenAI-compatible format and let V4 invoke them. See DeepSeek API function calling.
- Context caching awareness — keep your system prompt stable across calls so the cache-hit tier kicks in. Details in the DeepSeek context caching guide.
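The first item on that list, per-user rate limiting, can be sketched with a classic token bucket. The class below is illustrative (the rate and capacity defaults are arbitrary), keyed by Telegram user id:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-user token bucket: `rate` requests/second, bursting to `capacity`."""

    def __init__(self, rate: float = 0.5, capacity: float = 3.0):
        self.rate, self.capacity = rate, capacity
        self.tokens = defaultdict(lambda: capacity)   # start with a full bucket
        self.stamp = defaultdict(time.monotonic)      # last-seen time per user

    def allow(self, user_id: int) -> bool:
        now = time.monotonic()
        elapsed = now - self.stamp[user_id]
        self.stamp[user_id] = now
        # Refill proportionally to time passed, capped at capacity.
        self.tokens[user_id] = min(self.capacity,
                                   self.tokens[user_id] + elapsed * self.rate)
        if self.tokens[user_id] >= 1.0:
            self.tokens[user_id] -= 1.0
            return True
        return False
```

In `on_message`, bail out early with `if not bucket.allow(update.effective_user.id): return` (optionally replying with a "slow down" notice).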
One technical fact worth keeping in mind as you scale: V4's context window is 1,000,000 tokens by default on both tiers, with output up to 384,000 tokens. You will not run out of context for a chat workload; you'll run out of latency budget first. Trim history aggressively.
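The count-capped deque in the example is a blunt trimming instrument. If you want to trim by token budget instead, a rough character-based estimate (about 4 characters per token for English text, an approximation rather than DeepSeek's actual tokenizer) is usually enough:

```python
def trim_history(history: list[dict], budget_tokens: int = 2000) -> list[dict]:
    """Drop oldest turns until the estimated token total fits the budget.

    Uses the rough ~4-chars-per-token heuristic -- an approximation, not
    DeepSeek's tokenizer. Always keeps at least the newest message.
    """
    est = lambda m: len(m["content"]) // 4 + 1
    trimmed = list(history)
    while len(trimmed) > 1 and sum(est(m) for m in trimmed) > budget_tokens:
        trimmed.pop(0)  # discard the oldest turn first
    return trimmed
```

Run it on `HISTORY[cid]` just before building `messages`; a tight budget keeps latency and the cache-miss bill down far more reliably than a fixed turn count.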
Next steps
Two natural follow-ups once your bot is alive:
- Wrap the same DeepSeek call in a DeepSeek Discord bot so your team has the same assistant in both apps.
- Layer document Q&A on top with the DeepSeek RAG tutorial — the bot becomes a private knowledge assistant rather than a generic chat.
For a wider menu of integration patterns, the DeepSeek tutorials hub collects every step-by-step guide on the site.
Last verified: 2026-04-25. DeepSeek AI Guide is an independent resource and is not affiliated with DeepSeek or its parent company. Model IDs, pricing and API behaviour change; check the official DeepSeek documentation and pricing page before committing to a production decision.
How do I create a Telegram bot for DeepSeek?
Open Telegram, message @BotFather, send /newbot, and follow the prompts to pick a name and username ending in "bot". BotFather replies with a token — that's your Telegram credential. Pair it with a DeepSeek API key from platform.deepseek.com, then run a small Python script using python-telegram-bot and the OpenAI SDK pointed at https://api.deepseek.com. The full code lives in Step 3 above.
Is the DeepSeek API free for a Telegram bot?
The API is paid per token, not free, but rates are low: deepseek-v4-flash charges $0.028 cache-hit / $0.14 cache-miss / $0.28 output per 1M tokens. A small team's bot typically costs under a dollar a month. DeepSeek may offer a granted balance — a small promotional credit that can expire — so check the billing console for current offers. For more detail see the is DeepSeek free guide.
What is the difference between deepseek-v4-flash and deepseek-v4-pro?
Both are V4 MoE models released April 24, 2026 under MIT. deepseek-v4-flash is 284B total / 13B active and is the cost-efficient default. deepseek-v4-pro is 1.6T total / 49B active, frontier tier, and roughly 12× more expensive on output. Use Flash for chat; switch to Pro for agentic coding or hard reasoning. Compare both on the DeepSeek V4 model page.
Does the DeepSeek API remember previous Telegram messages?
No. The DeepSeek POST /chat/completions endpoint is stateless — it does not store conversation history on the server. Your bot must resend the prior messages array on every request to sustain context. The web chat and mobile app maintain session history, but the API does not. The example in this tutorial keeps a per-chat deque capped at 20 turns. Background in DeepSeek API documentation.
Can I use thinking mode in a DeepSeek Telegram bot?
Yes. Thinking is a request parameter on either V4 model, not a separate model ID. Set reasoning_effort="high" plus extra_body={"thinking": {"type": "enabled"}} on your chat.completions.create call. The response then returns reasoning_content alongside the final content. The bot in this guide toggles it via a /think command. More patterns in DeepSeek prompt engineering.
