Prompt Engineering

Prompts that survive production — tested, versioned, measurably better.

We treat prompts like software, with an eval suite, a diff workflow and CI, so your model delivers consistent quality instead of occasional lucky outputs.

Start a prompt audit

Iteration

How a prompt moves from "kind of works" to "ships reliably".

Three versions of the same prompt, three eval runs, three measurable steps. This is what our iterations look like.

Capabilities

Three disciplines that turn a prompt into a production asset.

Prompt engineering is more than word choice. It is test discipline, architecture and knowledge work combined.

Eval suites

We build domain-specific test sets from real examples — auto-graded, with a clear pass criterion per use case.
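As a rough illustration of what "auto-graded with a pass criterion" can mean, here is a minimal Python sketch. The names `EvalCase`, `pass_rate` and `fake_model` are hypothetical, the tickets are invented for the example, and the stand-in model only echoes part of its input. A real suite would call an actual model and use graders written for the use case.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    ticket: str                    # input text, ideally taken from a real example
    grade: Callable[[str], bool]   # auto-grader: True if the output passes


def pass_rate(cases: List[EvalCase], run_prompt: Callable[[str], str]) -> float:
    """Run every case through the prompt and return the fraction that pass."""
    if not cases:
        raise ValueError("eval suite is empty")
    passed = sum(1 for case in cases if case.grade(run_prompt(case.ticket)))
    return passed / len(cases)


def fake_model(ticket: str) -> str:
    """Stand-in for a model call (assumption): echoes the first 20 characters."""
    return "Summary: " + ticket[:20]


cases = [
    EvalCase("App crashes on login", lambda out: out.startswith("Summary:")),
    EvalCase("Refund not received", lambda out: "Refund" in out),
    EvalCase("Scooter will not unlock", lambda out: len(out) < 10),  # fails on purpose
]

rate = pass_rate(cases, fake_model)
print(f"pass rate: {rate:.0%}")
```

Each case carries its own grader, so the pass criterion can differ per use case while the reporting stays uniform.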

System architecture

Roles, output schemas, tool specs and few-shot strategy — cleanly separated, versioned, in the repo.

Enablement

We hand over the craft — playbooks, internal training and review sessions — so your team keeps prompts evolving.

Diff example

What a prompt iteration looks like in the repo.

Every change lives in a pull request, runs the eval suite, and merges into main only when the eval numbers improve.
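One way such a merge gate could be expressed in CI is sketched below. This is an assumption about the rule, not a description of the actual pipeline: the function `should_merge` and the metric names are hypothetical, and the policy shown (no metric may regress, at least one must improve) is only one reasonable reading. The sample values are the v1 and v3 percentages reported on this page.

```python
from typing import Dict

# Hypothetical metric names; True means a higher value is better.
HIGHER_IS_BETTER: Dict[str, bool] = {
    "pass_rate": True,
    "schema_adherence": True,
    "hallucinations": False,
}


def should_merge(baseline: Dict[str, float], candidate: Dict[str, float]) -> bool:
    """Allow the merge only if no metric regresses and at least one improves."""
    improved = False
    for metric, higher_is_better in HIGHER_IS_BETTER.items():
        delta = candidate[metric] - baseline[metric]
        if not higher_is_better:
            delta = -delta  # flip the sign so a positive delta always means "better"
        if delta < 0:
            return False    # any regression blocks the merge
        if delta > 0:
            improved = True
    return improved


# Percentages for v1 and v3 as reported in the eval results on this page.
v1 = {"pass_rate": 42, "schema_adherence": 55, "hallucinations": 18}
v3 = {"pass_rate": 93, "schema_adherence": 99, "hallucinations": 3}

print(should_merge(v1, v3))
```

A stricter gate could add minimum thresholds per metric; the point of the sketch is only that the decision is computed from eval output rather than judged by eye.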

prompts/support_summary.xml
Before (v1)
<task>
- Summarise the ticket.
</task>
- Answer:
After (v3)
<role>support_lead</role>
+ <task>Summarise the ticket in 3 sentences.</task>
+ <output_schema>
+ { summary: string, urgency: "low"|"med"|"high" }
+ </output_schema>
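An explicit output schema makes "schema adherence" mechanically checkable. The following Python sketch assumes the model returns the object as JSON text; the function name `parse_summary` is hypothetical and only the two fields from the schema above are checked.

```python
import json

ALLOWED_URGENCY = {"low", "med", "high"}


def parse_summary(raw: str) -> dict:
    """Parse model output and check it against the summary/urgency schema.

    Raises ValueError if the text is not valid JSON or breaks the schema.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    if set(data) != {"summary", "urgency"}:
        raise ValueError(f"unexpected keys: {sorted(data)}")
    if not isinstance(data["summary"], str):
        raise ValueError("summary must be a string")
    if not isinstance(data["urgency"], str) or data["urgency"] not in ALLOWED_URGENCY:
        raise ValueError("urgency must be one of low, med, high")
    return data


ok = parse_summary('{"summary": "Customer cannot log in.", "urgency": "high"}')
print(ok["urgency"])
```

An output that fails this check counts against schema adherence in the eval run, which is how a free-form "Answer:" prompt and a schema-bound prompt can be compared on the same scale.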

Eval results

What changed measurably between v1 and v3 at a client.

Eval suite of 240 real tickets from the service desk of a mid-sized mobility provider. Same model in both runs; only the prompt changed.

Metric               v1 (Naive)   v3 (Eval-tested)   Change
Pass rate            42%          93%                +51 pp
Schema adherence     55%          99%                +44 pp
Hallucinations       18%          3%                 -15 pp
Tone of voice fit    60%          91%                +31 pp
Cost per 1k calls    62           31                 -31
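The changes above are given in percentage points (pp), meaning the plain difference between two percentages, not the relative change. A short Python sketch of the arithmetic, using the reported pass rate and cost figures (the helper names are hypothetical):

```python
def pp_change(before_pct: float, after_pct: float) -> float:
    """Absolute change between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change(before: float, after: float) -> float:
    """Change relative to the starting value, in percent."""
    return (after - before) / before * 100


# Pass rate: 42% in v1, 93% in v3.
print(pp_change(42, 93))                     # difference in percentage points
print(round(relative_change(42, 93), 1))     # relative improvement in percent

# Cost per 1k calls: 62 in v1, 31 in v3 (unit as reported on the page).
print(relative_change(62, 31))               # relative change in percent
```

Read this way, the pass rate gains 51 percentage points, and the cost per 1k calls is cut in half.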

By the numbers

This is hafencity.dev

15+ Experts

Engineers, designers, and strategists working as one practice from our Hamburg HQ.

50+ Clients

Companies across consumer, healthcare, and B2B trust us with their digital products. Long-term partnerships are the default.

97% Recommendation rate

Repeat engagements and references our buyers actually call. Trust compounds when delivery does.

50+ Launches

Custom mobile and web products shipped from concept to maintenance — owned end-to-end by our team.

100% In-house

Strategy, design, and engineering all live at our Hamburg HQ. One team, one project lead, accountable to you from kickoff to launch.

Since 2023

Helping companies ship digital products — and growing alongside the teams we work with.

Next steps

Let's talk about your project

Book a 30-minute discovery call. We'll review your goals, surface unknowns, and outline how we would run the engagement.

Schedule a call