Amazon Bedrock: Model-Specific Prompting Guide

Shunku

Each foundation model in Amazon Bedrock has its own prompting format and best practices. This guide covers model-specific techniques to get optimal results from Claude, Titan, Llama, and Mistral.

Why Model-Specific Prompting Matters

Each model was trained differently and responds best to specific formats. Using the right format can significantly improve output quality.

Anthropic Claude

Claude excels with structured prompts using XML tags and clear instructions.

System Prompt Usage

import boto3

client = boto3.client('bedrock-runtime')

response = client.converse(
    modelId='anthropic.claude-3-sonnet-20240229-v1:0',
    system=[{
        "text": "You are an expert Python developer. Write clean code with type hints."
    }],
    messages=[
        {"role": "user", "content": [{"text": "Write a JSON parser function."}]}
    ]
)
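
The generated text sits inside the returned message structure. A minimal extraction, per the Converse response shape:

# The reply is the first content block of the output message
response_text = response['output']['message']['content'][0]['text']
print(response_text)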

XML Tags for Structure

Analyze this feedback:

<feedback>
The product arrived late but quality was excellent.
</feedback>

Respond in this format:
<analysis>
<sentiment>positive/negative/mixed</sentiment>
<summary>Brief summary</summary>
</analysis>
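
Once Claude answers in the requested format, the tagged fields are easy to pull out. A minimal sketch using the standard library (response_text is the model's reply, extracted from the Converse response as shown earlier; it assumes the model honored the template):

import re

def extract_tag(text: str, tag: str) -> str | None:
    """Return the contents of the first <tag>...</tag> block, if present."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return match.group(1).strip() if match else None

sentiment = extract_tag(response_text, "sentiment")  # e.g. "mixed"
summary = extract_tag(response_text, "summary")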

Prefilled Assistant Response

messages = [
    {"role": "user", "content": [{"text": "List three benefits:"}]},
    {"role": "assistant", "content": [{"text": "1."}]}  # Guides format
]
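
With a prefill, Claude continues from the seeded text, and the returned completion does not repeat it, so stitch the prefill back on when you need the full answer:

response = client.converse(
    modelId='anthropic.claude-3-sonnet-20240229-v1:0',
    messages=messages
)

# The continuation starts after "1.", so rejoin the prefill
continuation = response['output']['message']['content'][0]['text']
full_answer = "1." + continuation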

Claude Parameters

Parameter    Range    Recommended
temperature  0-1      0.3-0.7
top_p        0-1      0.9
max_tokens   1-4096+  Task-dependent

Amazon Titan

Titan uses straightforward instruction-based prompts.

Basic Format

import json

response = client.invoke_model(
    modelId='amazon.titan-text-express-v1',
    body=json.dumps({
        "inputText": "Explain machine learning simply.",
        "textGenerationConfig": {
            "maxTokenCount": 1024,
            "temperature": 0.7
        }
    })
)
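
invoke_model returns the body as a JSON stream; for Titan the text sits under results[0].outputText:

# Decode the streaming body, then read Titan's result structure
result = json.loads(response['body'].read())
output_text = result['results'][0]['outputText']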

Structured Instructions

Task: Summarize the following text in 3 bullet points.

Text: [Your content here]

Summary:
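
In code, the same template is just a formatted string. A small illustration (summarize_prompt is a hypothetical helper, not part of any SDK):

def summarize_prompt(text: str, bullets: int = 3) -> str:
    """Build a Titan-style instruction prompt for summarization."""
    return (
        f"Task: Summarize the following text in {bullets} bullet points.\n\n"
        f"Text: {text}\n\n"
        "Summary:"
    )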

Titan Parameters

Parameter      Range   Notes
temperature    0-1     Lower = more deterministic
topP           0-1     Nucleus sampling
maxTokenCount  1-8192  Output limit

Meta Llama

Llama 3 uses special tokens for instruction formatting.

Llama 3 Format

prompt = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful coding assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Write a Python hello world program.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""

response = client.invoke_model(
    modelId='meta.llama3-1-70b-instruct-v1:0',
    body=json.dumps({
        "prompt": prompt,
        "max_gen_len": 1024,
        "temperature": 0.7
    })
)
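
Llama's response body puts the completion under the generation key:

# Decode the body; Llama also returns token counts and a stop reason
result = json.loads(response['body'].read())
output_text = result['generation']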

Llama Special Tokens

Token                Purpose
<|begin_of_text|>    Start of prompt
<|start_header_id|>  Role header start
<|end_header_id|>    Role header end
<|eot_id|>           End of turn
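
Assembling these tokens by hand is error-prone, so it helps to wrap them in a small builder. A sketch (build_llama3_prompt is a hypothetical helper, assuming the token layout in the table above):

def build_llama3_prompt(system: str, user: str) -> str:
    """Compose a single-turn Llama 3 prompt from system and user text."""
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )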

Llama Parameters

Parameter    Range   Notes
temperature  0-1     Creativity control
top_p        0-1     Nucleus sampling
max_gen_len  1-2048  Output length

Mistral AI

Mistral uses [INST] instruction tags rather than a separate system role, so system guidance goes inside the first instruction block.

Mistral Format

prompt = """<s>[INST] You are a helpful assistant. [/INST]</s>
[INST] Explain cloud computing. [/INST]"""

response = client.invoke_model(
    modelId='mistral.mistral-large-2402-v1:0',
    body=json.dumps({
        "prompt": prompt,
        "max_tokens": 1024,
        "temperature": 0.7
    })
)
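
Mistral's response body nests the text under outputs:

# Decode the body; Mistral returns a list of outputs
result = json.loads(response['body'].read())
output_text = result['outputs'][0]['text']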

Mistral Tokens

Token    Purpose
<s>      Start of sequence
[INST]   Instruction start
[/INST]  Instruction end
</s>     End of sequence; closes each completed assistant answer in multi-turn prompts
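
In multi-turn prompts, </s> closes each completed assistant answer, and each new instruction gets its own [INST] block. A sketch of a two-turn history:

# The prior answer is closed with </s>; the new question follows in [INST] tags
prompt = (
    "<s>[INST] Explain cloud computing. [/INST]"
    " Cloud computing delivers on-demand IT resources over the internet.</s>"
    "[INST] Give one example provider. [/INST]"
)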

Using Converse API (Unified)

The Converse API abstracts model-specific formats:

# Works with ANY text model
response = client.converse(
    modelId='anthropic.claude-3-sonnet-20240229-v1:0',  # or titan, llama, mistral
    system=[{"text": "You are a helpful assistant."}],
    messages=[
        {"role": "user", "content": [{"text": "Hello!"}]}
    ],
    inferenceConfig={"maxTokens": 1024, "temperature": 0.7}
)
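
Because the request shape never changes, comparing models is a one-line swap of modelId. A sketch looping over the IDs used earlier in this guide (system prompt omitted, since Titan's support for it is limited):

for model_id in [
    'anthropic.claude-3-sonnet-20240229-v1:0',
    'amazon.titan-text-express-v1',
    'meta.llama3-1-70b-instruct-v1:0',
    'mistral.mistral-large-2402-v1:0',
]:
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
        inferenceConfig={"maxTokens": 256, "temperature": 0.7},
    )
    print(model_id, response['output']['message']['content'][0]['text'][:80])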

Model Comparison

Feature        Claude         Titan    Llama            Mistral
System prompt  Yes            Limited  Yes              Yes
XML tags       Excellent      Basic    Basic            Basic
Multi-turn     Native         Native   Token-based      Token-based
Best for       Complex tasks  General  Code, reasoning  Efficiency

Best Practices Summary

Model    Key Practice
Claude   Use XML tags for structure
Titan    Keep instructions clear and simple
Llama    Use the correct special tokens
Mistral  Use [INST] tags properly

Choosing the Right Model

flowchart TB
    A["Task Type"] --> B{"Complex reasoning?"}
    B -->|Yes| C["Claude Opus/Sonnet"]
    B -->|No| D{"Cost priority?"}
    D -->|Yes| E["Claude Haiku / Llama 8B"]
    D -->|No| F{"Need speed?"}
    F -->|Yes| G["Mistral / Titan"]
    F -->|No| H["Claude Sonnet / Llama 70B"]

    style C fill:#8b5cf6,color:#fff
    style E fill:#22c55e,color:#fff

Key Takeaways

  1. Each model has an optimal format - use model-specific prompting when output quality matters
  2. Claude loves XML tags - structure improves output quality
  3. Llama and Mistral need special tokens - format them correctly for best results
  4. The Converse API abstracts formats - use it when model-agnostic prompting is acceptable
  5. Test across models - the same prompt may perform differently
