Buy vs build agentic AI: a single-workflow decision

Antoni Kozelski

CEO & Co-founder

Bartosz Adam Gonczarek

Chief Transformation Officer and Co-founder of Vstorm

July 3, 2026

TL;DR

Every organisation automating a workflow with agentic AI faces the same fork: buy or build. With Gartner projecting that more than 40% of agentic AI projects will be cancelled by the end of 2027, a wrong call is a written-off investment, not a delayed feature. This commentary gives a repeatable, per-workflow framework: name the four real paths, score the workflow on a difficulty matrix from four to twelve, weigh data sensitivity and model choice, and watch for the lock-in trap that turns a fast start into a costly dependency. The rule: to automate a discrete task, buy; to rethink operations, build.

The fork: one workflow, two paths

An organisation identifies a single workflow worth automating with agentic AI, and immediately hits the fork every operations leader now faces: should we buy an agentic AI product for this process, or build the system ourselves?

The stakes are not abstract. Gartner projects that more than 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear value, and inadequate risk controls. A wrong call here is not a delayed feature, but a substantial financial loss.

This commentary offers a repeatable way to make that call for one workflow at a time, before scaling the pattern across other business processes. This is not a universal verdict, because the right answer changes with the workflow in front of you.

How the buy-vs-build call gets made today

In most organisations, the decision is not made with a framework at all. It is made by decision makers reacting to whichever vendor presented most recently, by board pressure to ship a visible, AI-driven win, or by the platform a team already holds a licence for.

The result is predictable: enterprises now run a mix of bought and built agentic workflows, many of them generative AI pilots that graduated into production, yet almost none of them arrived at that hybrid state deliberately. They accumulated it by accident, one tool here and one custom build there, with no shared observability and no governance across the seams.

That accidental drift is the real cost. The framework below replaces it with a deliberate, per-workflow decision.

The four paths and their trade-offs

There are four realistic routes to process automation with agentic AI, not two. Naming all four avoids the false binary that pushes most teams into workflow automation projects they later regret.

Integrate. Buy a ready system and integrate it (Palantir, Automation Anywhere, UiPath).
Low/no-code. Buy a low or no-code platform and implement it yourself (Botpress, CrewAI).
Build bespoke. Engineer a tailored solution, the path Vstorm delivers through multi-agent system development.
OSS self-build. Take half-ready open-source and build on top of it with an in-house team.

Each path trades the same set of variables against one another:

Dimension	Integrate	Low/no-code	Build bespoke	OSS self-build
Implementation cost	High licence, low engineering	Low to start	Higher upfront	Low licence, high engineering
Ongoing cost	Recurring, vendor-set	Recurring, scales with use	Owned, predictable	Owned, plus upkeep burden
Control	Low	Limited to platform	Full	Full
Speed-to-market	Fast for templated fit	Fastest prototype	Moderate	Slowest
IP ownership	Vendor	Vendor	You	You
Lock-in risk	High	High	Low	Low
Customisation	Bounded by product	Bounded by platform	Unbounded	High, effort-dependent
Accuracy and reliability	Strong on standard cases	Degrades on edge cases	Tuned to your data	Depends on in-house skill

Scoring the workflow: a difficulty matrix

Before choosing a path, score the workflow itself. Four properties determine how hard it is to automate well: the number of steps, the volume of edge cases, the integrations required with other enterprise systems, and whether the work is deterministic or judgment-heavy, which sets how much human oversight it will need. Data preparation, a familiar burden from machine learning projects, compounds the difficulty further; on AI agent projects it can account for up to 80% of total effort.

The difficulty matrix below breaks a workflow into four measurable dimensions. Each dimension gets a score from one to three, and the total tells an operator, before any engineering work begins, whether the workflow sits in off-the-shelf territory, a hybrid zone, or the range where custom agentic engineering earns its cost.

Score the four dimensions

Rate the workflow against each dimension and add the results together for a total between four and twelve.

Dimension	Score 1 (low)	Score 2 (medium)	Score 3 (high)
Number of steps	One to three steps, single system	Four to eight steps, sequential logic	Nine or more steps, branching sub-processes
Edge case frequency	Rare, covered by existing rules	Regular, follows known patterns	Frequent and unpredictable
Required integrations	Single system or none	Two to three systems, standard APIs	Four or more systems, legacy or undocumented
Determinism vs. judgment	Fully rule-based	Mostly rule-based, occasional discretion	Requires contextual reasoning or domain expertise

Read the total

The total places a workflow into one of three zones.

Total score	What it means
4 to 6	An off-the-shelf tool such as Zapier, Make, or n8n likely covers the workflow without custom engineering.
7 to 9	An off-the-shelf tool covers part of the workflow, but reliability gaps appear at the edges. Configuration decides whether the workflow holds up in production.
10 to 12	The workflow needs agentic AI built for the specific process, systems, and exception patterns involved.

Most process failures we encounter trace back to skipping this step. A team adopts a workflow builder because it worked for a simple use case, then applies the same tool to a workflow scoring nine or ten, and the tool cannot hold up once real edge cases and multi-system dependencies show up in production. An agent working across four or more systems rarely survives that mismatch.
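
The scoring step can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the article: the class name `WorkflowScore`, its field names, and the zone labels are assumptions, while the one-to-three scale and the zone boundaries come from the two tables above.

```python
from dataclasses import dataclass

# Zone boundaries taken from the "Read the total" table.
# The short labels are illustrative shorthand, not the article's wording.
ZONES = [
    (4, 6, "off-the-shelf"),
    (7, 9, "hybrid"),
    (10, 12, "custom agentic engineering"),
]


@dataclass(frozen=True)
class WorkflowScore:
    """Hypothetical container for the four difficulty dimensions (each 1 to 3)."""

    steps: int
    edge_cases: int
    integrations: int
    judgment: int

    def total(self) -> int:
        values = (self.steps, self.edge_cases, self.integrations, self.judgment)
        for value in values:
            if value not in (1, 2, 3):
                raise ValueError(f"each dimension must be 1, 2, or 3, got {value}")
        return sum(values)

    def zone(self) -> str:
        total = self.total()
        for low, high, label in ZONES:
            if low <= total <= high:
                return label
        raise AssertionError("unreachable: total is always between 4 and 12")


# Example: few steps, but frequent edge cases and four or more legacy systems.
example = WorkflowScore(steps=2, edge_cases=3, integrations=3, judgment=2)
print(example.total(), example.zone())
```

A workflow like the example lands at ten, which is the kind of case where a tool that handled a simpler workflow well is likely to struggle.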

Two factors the matrix does not capture

Two considerations sit outside the difficulty grid and can override it.

Data sensitivity: This covers how confidential or regulated the data is (PII, financial, health, or trade secrets) and where it is permitted to flow. High sensitivity pushes toward building or self-hosting to keep control over data residency and processing, including the audit trails regulators expect. Low sensitivity widens the buy options.

Model choice: The criteria for selecting an AI model, including capability, cost per call, latency, the large language model’s context window, and hosting terms, apply on both paths. Framing it as an explicit decision surfaces a hidden constraint: a vendor that locks you to a single model removes your ability to switch when prices change or a better model ships. This means keeping the architecture model-agnostic is a critical design choice.
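
One way to picture a model-agnostic design is a thin interface between the workflow logic and whichever model sits behind it. The sketch below is an assumption-laden illustration: `ChatModel`, `EchoModel`, and `WorkflowAgent` are hypothetical names, and `EchoModel` is a stand-in that returns a string instead of calling any real provider SDK.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical minimal interface every model backend must satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend for illustration; a real one would wrap a provider SDK."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class WorkflowAgent:
    """Workflow logic depends only on the ChatModel interface, not on a vendor."""

    def __init__(self, model: ChatModel) -> None:
        self.model = model

    def summarise_order(self, order_text: str) -> str:
        return self.model.complete(f"Summarise this order: {order_text}")


agent = WorkflowAgent(EchoModel("model-a"))
first = agent.summarise_order("500 flyers, A5")

# Switching models is a one-line change; the workflow code is untouched.
agent.model = EchoModel("model-b")
second = agent.summarise_order("500 flyers, A5")
```

The point of the pattern is that a price change or a better model affects one adapter class rather than every workflow that calls it.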

The lock-in trap

The fastest path can quietly become the most expensive one. Agentic AI vendor lock-in is the risk of becoming dependent on a single vendor, model, or proprietary stack that is painful and costly to leave. Time-to-market pressure is what cements it: teams ship fast on an AI platform, then discover the customisation ceiling and the renewal invoice twelve months later, once the dependency is expensive to unwind.

The numbers bear this out. A Zapier survey of enterprise AI adopters found that 81% of enterprise leaders are concerned about AI vendor dependency, yet only 6% believe they could switch providers without material disruption.

The repricing risk is real, not theoretical: in April 2026, Anthropic moved its Claude enterprise edition from fixed to dynamic usage-based pricing, which observers expect may double or triple costs for heavy users.

We have watched clients escape this trap. The AI-powered order recommendation agent we built for Mixam required a migration off a low/no-code platform onto an owned, bespoke system, as documented in our order recommendation case study. Ownership returned both pricing control and customisation depth to the company’s hands.

We keep our clients on open-source architecture for one reason: the freedom to change the model, the vendor, or the direction without rebuilding from scratch.
Wojciech Achtelik PhD(c), AI Engineer Lead, Vstorm

The decision rule: task or operating model

The heuristic that ties everything together is a question of intent, not technology. If the goal is to automate a discrete set of repetitive tasks, buy. If the goal is to rethink how operations run, and the workflow is core to that vision, build.

The economics mirror the rule: practitioner frameworks place the build-versus-buy total-cost crossover near one million agent conversations per year, below which buying wins on speed, and above which an owned system pulls ahead over the long term. Strategic centrality and volume point the same way.
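
The two signals in the rule, intent and volume, can be combined in a small sketch. Everything here is illustrative: the names `Intent` and `recommend` are assumptions, the threshold is the approximate crossover cited above rather than a hard constant, and treating exactly one million as "build" is an arbitrary choice.

```python
from enum import Enum

# Approximate crossover cited in the text; a rough guide, not a fixed constant.
CROSSOVER_CONVERSATIONS_PER_YEAR = 1_000_000


class Intent(Enum):
    AUTOMATE_TASK = "automate a discrete task"
    RETHINK_OPERATIONS = "rethink how operations run"


def recommend(intent: Intent, conversations_per_year: int) -> str:
    """Return 'buy' or 'build' when both signals agree, else flag for review."""
    by_intent = "build" if intent is Intent.RETHINK_OPERATIONS else "buy"
    # Assumption: the boundary value itself is counted as above the crossover.
    above = conversations_per_year >= CROSSOVER_CONVERSATIONS_PER_YEAR
    by_volume = "build" if above else "buy"
    if by_intent == by_volume:
        return by_intent
    return f"review: intent suggests {by_intent}, volume suggests {by_volume}"


low_volume_task = recommend(Intent.AUTOMATE_TASK, 50_000)
core_high_volume = recommend(Intent.RETHINK_OPERATIONS, 3_000_000)
mixed = recommend(Intent.RETHINK_OPERATIONS, 50_000)
```

When the signals disagree, as in the last call, the sketch deliberately returns no verdict; that is the case where the difficulty score and data sensitivity deserve a closer look.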

When the answer is build, the commitment is real. The concerns that follow (the technology stack, a model-agnostic design, architecture, orchestration, governance, and observability) are covered by our TriStorm methodology, demonstrated in the text-to-workflow platform we engineered. This article frames which workflows earn that commitment, and which are better served by an off-the-shelf tool.