What risks does AI introduce into software projects?

The five main ones are: privacy (sensitive data in external tools), hallucinations (invented APIs, packages or versions), insecure code, prompt injection in agents with tool access, and missing traceability. Per Veracode, around 45% of AI-generated code contains at least one OWASP Top 10 vulnerability — the risks are real and measurable.

What changes with the EU AI Act in 2026?

The EU AI Act is risk-based: not every system carries the same obligations. With the Digital Omnibus (provisional agreement of 7 May 2026), the obligations for high-risk systems under Annex III move from August 2026 to December 2027, and for AI embedded in regulated products to August 2028. The transparency obligations and the GPAI rules (in force since August 2025) remain.

Should AI-generated code be reviewed differently?

Yes. A review must ask more than "does it run?". It has to check whether the code fits the architecture, whether unnecessary or invented dependencies slipped in, whether secrets are protected and error states handled, and whether permissions and edge cases are tested. Hallucinated packages, which appeared in roughly one in five generated code samples in a study presented at USENIX Security 2025, are a concrete supply-chain risk.

When does local or private AI make sense?

When data must not leave the company, low latency matters, or industry-specific compliance imposes strict requirements. Typical candidates are internal knowledge search, contract and document analysis, or sensitive product data. For many standard cases, EU-hosted models with a data processing agreement and clear data classes are enough.


AI · May 8, 2026 · 6 min read

Managing AI Risk in Software Projects: Governance 2026

What is prompt injection and why is it so dangerous?

Prompt injection means an input — a document, website or user request — contains hidden instructions that push an AI system toward unintended behavior. In the OWASP Top 10 for LLM Applications 2025 it ranks first (LLM01). It becomes especially critical once an AI agent can use tools or trigger actions, because real operations can then be abused.

What does practical AI governance look like?

It rests on five building blocks: an AI policy (allowed tools, data classes, approvals), a use-case assessment before implementation, technical guardrails (roles, permissions, logging, tool boundaries), quality assurance (tests, evaluation sets, human reviews) and defined operations (monitoring, error analysis, ownership). Governance does not have to be bureaucratic — it has to be doable.

AI brings speed to software projects — and shifts the risk to places traditional processes rarely control: 45% of generated code carries a vulnerability per Veracode, and prompt injection tops the OWASP Top 10 for LLMs. We map the five risk categories, the updated EU AI Act timeline, and a governance model that works in daily practice.

Marius Gill

Managing director and software developer with over 10 years of experience

Updated on

June 29, 2026


AI makes software projects faster — and shifts the risk to a place traditional development processes rarely control. A Veracode study across more than 100 language models found that 45% of generated code introduces at least one of the OWASP Top 10 vulnerabilities, and that rate has stagnated for two years despite noticeably better syntax (Veracode GenAI Code Security Report).

The goal is not to avoid AI. In Germany, per Bitkom, 36% of companies now use AI — almost twice as many as a year earlier. The goal is to control AI like any other production dependency: with clear rules, ownership and technical guardrails.

Where the real risks arise — not in the tool

AI risk rarely comes from a single tool; it comes from missing boundaries, missing tests and missing ownership. Hard numbers help calibrate the gut feeling. Veracode had more than 100 models solve 80 coding tasks and checked the results against the OWASP Top 10: 45% of solutions were insecure, Java's failure rate reached 72%, and the failure rates for cross-site scripting and log injection exceeded 85%.

A second risk is more subtle. A study on package hallucinations presented at USENIX Security 2025 generated 2.23 million code samples — 19.7% referenced a package that simply does not exist. Attackers pre-register those invented names on npm or PyPI ("slopsquatting"). Adopt AI suggestions unchecked, and in the worst case you pull a stranger's code into your supply chain.
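One way to catch invented dependencies before they are installed is to compare every suggested package against a list your team has already vetted. The following is a minimal sketch under stated assumptions, not a complete supply-chain control: `APPROVED_PACKAGES`, `unapproved_packages` and the package name `fastjson-utils` are made up for illustration, and the parser only handles simple requirements-style lines.

```python
import re

# Hypothetical allowlist: packages the team has already vetted.
APPROVED_PACKAGES = {"requests", "numpy", "pydantic"}


def normalize(name: str) -> str:
    """Normalize a package name (lowercase, runs of '-', '_' and '.' become '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unapproved_packages(requirement_lines: list[str]) -> list[str]:
    """Return the package names from requirements-style lines that are not allowlisted."""
    approved = {normalize(p) for p in APPROVED_PACKAGES}
    flagged = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keep only the name, i.e. the part before a version specifier, extra or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        if normalize(name) not in approved:
            flagged.append(name)
    return flagged


# "fastjson-utils" is an invented name standing in for a hallucinated package.
suggested = ["requests>=2.31", "numpy", "fastjson-utils==1.0  # suggested by the assistant"]
print(unapproved_packages(suggested))
```

A flagged name is not automatically malicious. It only means a human should confirm that the package exists, is the one intended, and is maintained by who you expect before it enters the lockfile.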

Key metrics on AI code risk: 45 percent of generated code carries a vulnerability, 19.7 percent hallucinated packages, 55 percent stagnant security pass rate, 72 percent failure rate for Java. — AI code risk in numbers. Sources: Veracode GenAI Code Security (2025/26), USENIX Security 2025. As of June 2026, no warranty.

The lesson is not "AI writes bad code" but: AI output is a suggestion, not a finished result. That is exactly what reviews, tests and clear rules are for — just as we do in a code audit anyway.

Five risk categories traditional processes overlook

Most AI incidents map to five categories — and each has a concrete countermeasure. Cover these five fields systematically and diffuse "AI anxiety" turns into a manageable risk profile.

Risk category	What happens	Countermeasure
Privacy	logs, customer data, schemas land in external tools	data classes, anonymization, EU hosting, clear approvals
Hallucinations	invented APIs, packages, wrong versions	reference sources, require tests, compare with docs
Insecure code	broad permissions, weak error handling, weak typing	architecture-aware review, secret protection, edge-case tests
Prompt injection	hidden instructions in content or requests	hard tool boundaries, approvals, input and output filters
Traceability	nobody knows what the system saw and why	audit logs, prompt versioning, monitoring, approval history

Order matters: privacy and traceability are architecture decisions that belong at the start — not something you bolt on shortly before launch.
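To make the traceability row concrete, here is a minimal sketch of an audit record that ties a model call to a prompt version and an approver. All names (`AuditRecord`, `make_record`, the system name `contract-summary`) are hypothetical. Storing a hash instead of the raw input is an assumption chosen to keep personal data out of the log; it lets you verify what the system saw later, but not read it back.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str              # when the call happened (UTC, ISO 8601)
    system: str                 # which AI system handled the request
    prompt_version: str         # version label of the prompt template in use
    input_sha256: str           # fingerprint of the input, not the input itself
    approved_by: Optional[str]  # who approved the action, if anyone


def make_record(system: str, prompt_version: str, model_input: str,
                approved_by: Optional[str] = None) -> AuditRecord:
    """Build one audit entry for a single model call."""
    digest = hashlib.sha256(model_input.encode("utf-8")).hexdigest()
    return AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        system=system,
        prompt_version=prompt_version,
        input_sha256=digest,
        approved_by=approved_by,
    )


record = make_record("contract-summary", "v3", "Summarize the attached contract.")
print(json.dumps(asdict(record)))  # one JSON line per call, ready for a log sink
```

If reviewers need to see the actual input, the raw text would have to be stored separately under the access rules of its data class.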

Prompt injection: the new attack model for AI agents

As soon as an AI system reads external content or uses tools, it gains an attack surface that classical software does not have. A document, website or user request can carry hidden instructions that push the model toward unintended behavior. In the OWASP Top 10 for LLM Applications 2025, prompt injection ranks first (LLM01) — not by accident, since the stochastic nature of the models means there is no foolproof remedy.

In practice: an AI agent must not automatically do everything just because a document says so. What works is defense in depth: input validation, output filtering, restricted permissions and human approval for sensitive operations. The more tool access an agent has, the tighter the guardrails must be. What that looks like for AI agents in the enterprise in production depends directly on which actions are reversible and which are not.
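The reversible-versus-irreversible distinction can be enforced in code rather than in the prompt. The sketch below is illustrative only: the `TOOLS` registry, the tool names and the `authorize` function are assumptions, not part of any agent framework. The key idea is that the decision runs server-side, outside the model, so injected instructions cannot talk their way past it.

```python
from dataclasses import dataclass

# Hypothetical tool registry: which tools exist and whether their effect can be undone.
TOOLS = {
    "search_docs": {"reversible": True},
    "send_email": {"reversible": False},
}


@dataclass
class ToolCall:
    name: str
    arguments: dict


def authorize(call: ToolCall, allowed: set[str], human_approved: bool = False) -> str:
    """Decide whether a model-requested tool call may run.

    Returns "deny", "needs_approval" or "allow".
    """
    # Unknown tools and tools outside the caller's permissions are never executed.
    if call.name not in TOOLS or call.name not in allowed:
        return "deny"
    # Irreversible actions wait for an explicit human approval.
    if not TOOLS[call.name]["reversible"] and not human_approved:
        return "needs_approval"
    return "allow"


permissions = {"search_docs", "send_email"}
print(authorize(ToolCall("search_docs", {"query": "refund policy"}), permissions))
print(authorize(ToolCall("send_email", {"to": "customer@example.com"}), permissions))
```

A gate like this does not stop prompt injection itself. It limits what a successful injection can do, which is the point of defense in depth.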

EU AI Act: risk-based — and the timeline moved in 2026

The EU AI Act follows a risk-based approach: not every AI application carries the same obligations. An internal code-summary tool is assessed differently from a system that screens applicants, prepares credit decisions or supports medical recommendations. And the timeline moved noticeably in 2026: with the Digital Omnibus — a provisional agreement between Council and Parliament of 7 May 2026 — the high-risk obligations are deferred.

EU AI Act timeline: GPAI obligations since August 2025, high-risk under Annex III deferred to December 2027, AI in regulated products to August 2028. — The EU AI Act timeline after the Digital Omnibus (provisional agreement 7 May 2026). Source: Council of the EU. As of June 2026, formal adoption pending.

Concretely: the obligations for stand-alone high-risk systems under Annex III (recruiting, credit scoring, education and more) move from August 2026 to December 2027, and for AI embedded in regulated products under Annex I to August 2028. The GPAI rules have applied since August 2025 and the transparency obligations largely stay on the original schedule. For companies the deferral changes little about the principle. The same questions still need answers: what is the system's purpose, who is affected, are decisions automated, is there human approval? That classification belongs at the start, not shortly before launch.

Governance that is not bureaucratic

AI governance does not have to be heavy — it has to be practical. A good start rests on five building blocks: an AI policy (allowed tools, data classes, approvals), a use-case assessment before implementation, technical guardrails (roles, permissions, logging, tool boundaries), quality assurance (tests, evaluation sets, human reviews) and defined operations (monitoring, error analysis, ownership).

On the technical level, AI follows the same rules as any production system — with a few extras:

no secrets in prompts, server-side tool execution with permission checks
separate development and production environments
input and output validation, rate and cost limits
logging without unnecessary personal data
human approval for irreversible actions
tests against prompt injection and unauthorized tool use
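
As an illustration of the first item in the list above, text can be scrubbed for obvious credentials before it is placed in a prompt. This is a rough sketch, assuming two example patterns; `SECRET_PATTERNS` and `redact` are hypothetical names, and a real setup would more likely rely on a dedicated secret scanner with a far broader rule set.

```python
import re

# Illustrative patterns only; they will miss many real-world secret formats.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)(password|api[_-]?key|token)\s*[:=]\s*\S+"),  # key=value style
]


def redact(text: str) -> tuple[str, int]:
    """Replace matches with a placeholder and report how many were found."""
    total = 0
    for pattern in SECRET_PATTERNS:
        text, count = pattern.subn("[REDACTED]", text)
        total += count
    return text, total


clean, found = redact("db password=hunter2 and key AKIAABCDEFGHIJKLMNOP")
print(clean, found)
```

The returned count is useful as a metric: a non-zero value in production logs shows that secrets are still reaching the prompt layer and that the upstream source needs fixing.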

Local or private models are not required for every project, but make sense when data must not leave the company or industry-specific compliance imposes strict requirements, for example for internal knowledge search, contract analysis or sensitive product data. How we weigh that alongside privacy and operations is described on our AI integration page.

Next steps

Three questions quickly show how large your AI risk really is today:

Data: Do you know which data flows into which tools — and which data must not leave the company?
Code & agents: Is AI output reviewed, tested and bounded by hard tool limits before it reaches production?
Ownership: Does every AI system have a clear classification, audit logs and a responsible person?

Wherever you hesitate, it's worth a closer look. We treat AI projects as software projects with special requirements — pragmatically, with an eye on roadmap and budget. Take a look at our AI integration or book an intro call directly.

Conclusion

AI risk rarely comes from a single tool. It comes from missing boundaries, missing tests and missing ownership. With practical governance — policy, use-case assessment, technical guardrails, reviews and operations — AI becomes usable in production instead of an unpredictable variable.
