Model information
DeepSeek-R1 is a 671-billion-parameter open-source reasoning model. It spends tokens reflecting on how to respond to the user prompt, shown within the <think> tags in its output. Because this structure differs from typical model requests, consider the following adjustments.
System prompt
- Remove instructional system prompts and minimize additional prompting beyond the user query. Excessive instructions can restrict the model’s reasoning scope and reduce output quality.
User prompt
- Avoid additional chain-of-thought prompting or explicit instructions on how to respond, since the model already reasons through queries on its own. Zero-shot or single-instruction plain-language prompts work best for complex tasks, allowing the model's internal reasoning capabilities to shine.
- Experiment with the structure of these sections: goal, return format, warnings, and context dump.
- In the rare case the <think> tags are bypassed, you can enforce them by telling the model to start its response with <think>.
- To generate more concise thinking, you can apply the chain-of-draft idea, for example by adding "only keep a minimum draft for each thinking step, with 5 words at most" as a constraint in your prompt. Both patterns are sketched below.

Parameters
- For general (non-math) reasoning, the suggested temperature is 0.6 and top-p is 0.95. If you prefer a more factual response, temperature can be set lower, such as 0.5.
- For math-style reasoning, the suggested temperature is 0.7 and top-p is 1.0.
Example request
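Below is a minimal sketch of a request using the suggested general-reasoning parameters. It assumes an OpenAI-compatible Python client; the base_url, API key, and model identifier are placeholders to replace with the values for your deployment.

```python
from openai import OpenAI

# Placeholder endpoint, key, and model identifier; substitute your own values.
client = OpenAI(
    base_url="https://api.sambanova.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="DeepSeek-R1",
    temperature=0.6,   # suggested for general (non-math) reasoning
    top_p=0.95,
    messages=[
        {"role": "user", "content": "Explain how HTTP caching headers interact with CDNs."}
    ],
)

# The reply contains the reasoning inside <think>...</think> followed by the final answer.
print(response.choices[0].message.content)
```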
Use cases
Report generator
DeepSeek-R1 is particularly adept at processing unstructured information, making it ideal for analyzing complex documents like legal contracts, financial statements, or scientific papers. Reasoning models can facilitate pattern recognition by analyzing multiple facets of the information before distilling it into a comprehensive summary.
Example prompt
Develop a comprehensive report on the state of autonomous vehicles. Present this report with organized sections and a brief summarization. Be careful to cite each achievement with the proper entity that made that achievement or contribution. For context: I am knowledgeable in this field and have a technical understanding of autonomous vehicle systems. I've been working most of my career in artificial intelligence but have not yet joined a company with the sole focus of autonomous vehicles. I am considering making the career change and wanted to understand the current ecosystem before I go through the job search process.
Planner for workflows and agents
Reasoning models excel at tackling ambiguous and complex tasks. They can break down intricate problems, strategize solutions, and make decisions based on large volumes of ambiguous information. Use DeepSeek-R1 as a strategic planner for complex, multi-step problems: it can break down tasks, develop detailed solutions, and even orchestrate other AI models for specific subtasks within an agentic system (a minimal planner sketch follows below). To showcase thoughtful planning in a powerful workflow, the demo app implements DeepSeek-R1 as a planner that orchestrates agents across workflows such as Deep Research, Financial Analysis, and Sales Leads. The app is open-source, allowing developers to experiment quickly and build their own agent workflows within the system.
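As a rough sketch of the planner pattern (not the demo app's actual implementation), the code below has DeepSeek-R1 produce a numbered plan and hands each step to a faster executor model. The endpoint, model names, and chat helper are placeholders.

```python
from openai import OpenAI

# Placeholder endpoint and key; substitute your own values.
client = OpenAI(base_url="https://api.sambanova.ai/v1", api_key="YOUR_API_KEY")

def chat(model: str, prompt: str) -> str:
    """Send a single-turn prompt and return the reply text."""
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

# 1) The reasoning model breaks the task into steps.
plan = chat(
    "DeepSeek-R1",  # placeholder planner model
    "Plan the research needed to compare battery chemistries for grid storage. "
    "Return a short numbered list of steps.",
)
plan = plan.split("</think>")[-1]  # keep only the final plan, not the reasoning

# 2) A faster, non-reasoning model executes each step.
for step in (line for line in plan.splitlines() if line.strip()):
    print(chat("Llama-3.3-70B", f"Complete this step:\n{step}"))  # placeholder executor model
```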
Coding and mathematical guru
These models are effective at reviewing and improving code. They can detect minor changes in a codebase that might be missed by human reviewers. In solving math problems, reasoning helps break the task into many steps and verify the work along the way.
Best Practices
Latency and cost
Reasoning model outputs have higher latency and token usage because of the <think> process, so consider using non-reasoning models for simpler tasks to optimize for budget and response-time needs. Its approach has the advantage of considering a user prompt more holistically, but it also takes more token capacity and time to produce a complete answer. Apply this powerful model in situations that suit its response approach.
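Because the reasoning block counts toward output tokens, applications often separate it from the final answer before display. A minimal sketch, assuming the reply contains a closing </think> tag:

```python
def split_reasoning(reply: str) -> tuple[str, str]:
    """Split a DeepSeek-R1 reply into (reasoning, final answer)."""
    reasoning, _, answer = reply.partition("</think>")
    return reasoning.replace("<think>", "").strip(), answer.strip()

# Hypothetical reply used for illustration.
reply = "<think>Compare both options. Pick the simpler one.</think>Use a hash map."
reasoning, answer = split_reasoning(reply)
print(answer)  # -> "Use a hash map."
```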
Streaming
Enabling streaming can improve the user experience in applications using DeepSeek-R1. Because generation takes longer, streaming reduces ambiguity and perceived wait time by displaying tokens as they become available. Implement this by adding the parameter stream=True to the model request, as in the sketch below.
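A minimal streaming sketch, assuming an OpenAI-compatible Python client; the endpoint and model name are placeholders:

```python
from openai import OpenAI

# Placeholder endpoint, key, and model identifier; substitute your own values.
client = OpenAI(base_url="https://api.sambanova.ai/v1", api_key="YOUR_API_KEY")

stream = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role": "user", "content": "Outline a migration plan from REST to gRPC."}],
    stream=True,  # emit tokens as they are generated
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```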
Function calling
DeepSeek-R1's function calling support is currently unstable, occasionally resulting in looped calls or empty responses, as noted in the DeepSeek documentation. Prompt engineering can work around this with some trial and error, but using a different model is preferable. Learn more about function calling with models like Llama-3.3-70B by viewing the Function calling and JSON mode document.
FAQs
I got BadRequestError: 400 about the maximum context length of DeepSeek-R1. Where can I check this information?
Check the SambaCloud models page for more information on the current context length of the model. Also feel free to reach out about any errors in our Community.
If the model was created by a Chinese company, where is it being hosted?
We are hosting the model primarily in US-based data centers, with some additional data centers in Japan.
How can I access it?
Try the model in our Playground, then get an API key for access as it becomes available.