Buy vs build agentic AI: a single-workflow decision

Antoni Kozelski
CEO & Co-founder
Bartosz Adam Gonczarek
Vice President, Co-founder
July 3, 2026
TL;DR

Every organisation automating a workflow with agentic AI faces the same fork: buy or build. With Gartner projecting that more than 40% of agentic AI projects will be cancelled by the end of 2027, a wrong call is a written-off investment, not a delayed feature. This commentary gives a repeatable, per-workflow framework: name the four real paths, score the workflow on a difficulty matrix from four to twelve, weigh data sensitivity and model choice, and watch for the lock-in trap that turns a fast start into a costly dependency. The rule: to automate a discrete task, buy; to rethink operations, build.

The fork: one workflow, two paths

An organisation identifies a single workflow worth automating with agentic AI, and immediately hits the fork every operations leader now faces: should we buy an agentic AI system for this process, or build it ourselves?

The stakes are not abstract. Gartner projects that more than 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear value, and inadequate risk controls. A wrong call here is not a delayed feature, but a substantial financial loss.

This commentary offers a repeatable way to make that call for one workflow at a time, before scaling the pattern across other business processes. This is not a universal verdict, because the right answer changes with the workflow in front of you.

How the buy-vs-build call gets made today

In most organisations, the decision is not made with a framework at all. It is made by decision makers reacting to whichever vendor presented most recently, by board pressure to ship a visible, AI-driven win, or by the platform a team already holds a licence for.

The result is predictable: enterprises now run a mix of bought and built agentic workflows, many of them generative AI pilots that graduated into production, yet almost none of them arrived at that hybrid state deliberately. They accumulated it by accident, one tool here and one custom build there, with no shared observability and no governance across the seams.

That accidental drift is the real cost. The framework below replaces it with a deliberate, per-workflow decision.

The four paths and their trade-offs

There are four realistic routes to process automation with agentic AI, not two. Naming all four prevents the false binary that pushes most teams into workflow automation projects they later regret.

  • Integrate. Buy a ready system and integrate it (Palantir, Automation Anywhere, UiPath).
  • Low/no-code. Buy a low or no-code platform and implement it yourself (Botpress, CrewAI).
  • Build bespoke. Engineer a tailored solution, the path Vstorm delivers through multi-agent system development.
  • OSS self-build. Take half-ready open-source and build on top of it with an in-house team.

Each path trades the same set of variables against one another:

| Dimension | Integrate | Low/no-code | Build bespoke | OSS self-build |
| --- | --- | --- | --- | --- |
| Implementation cost | High licence, low engineering | Low to start | Higher upfront | Low licence, high engineering |
| Ongoing cost | Recurring, vendor-set | Recurring, scales with use | Owned, predictable | Owned, plus upkeep burden |
| Control | Low | Limited to platform | Full | Full |
| Speed-to-market | Fast for templated fit | Fastest prototype | Moderate | Slowest |
| IP ownership | Vendor | Vendor | You | You |
| Lock-in risk | High | High | Low | Low |
| Customisation | Bounded by product | Bounded by platform | Unbounded | High, effort-dependent |
| Accuracy and reliability | Strong on standard cases | Degrades on edge cases | Tuned to your data | Depends on in-house skill |

Scoring the workflow: a difficulty matrix

Before choosing a path, score the workflow itself. Four properties determine how hard it is to automate well: the number of steps, the volume of edge cases, the integrations required with other enterprise systems, and whether the work is deterministic or judgment-heavy, which sets how much human oversight it will need. Data preparation, a familiar burden from machine learning projects, compounds the difficulty further; on AI agent projects it can account for up to 80% of total effort.

The difficulty matrix below breaks a workflow into four measurable dimensions. Each dimension gets a score from one to three, and the total tells an operator, before any engineering work begins, whether the workflow sits in off-the-shelf territory, a hybrid zone, or the range where custom agentic engineering earns its cost.

Score the four dimensions

Rate the workflow against each dimension and add the results together for a total between four and twelve.

| Dimension | Score 1 (low) | Score 2 (medium) | Score 3 (high) |
| --- | --- | --- | --- |
| Number of steps | One to three steps, single system | Four to eight steps, sequential logic | Nine or more steps, branching sub-processes |
| Edge case frequency | Rare, covered by existing rules | Regular, follows known patterns | Frequent and unpredictable |
| Required integrations | Single system or none | Two to three systems, standard APIs | Four or more systems, legacy or undocumented |
| Determinism vs. judgment | Fully rule-based | Mostly rule-based, occasional discretion | Requires contextual reasoning or domain expertise |
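The scoring step can be sketched in a few lines of Python. This is a minimal illustration rather than a published tool: the `Workflow` fields and the function names are assumptions made for the example. The step and integration thresholds follow the matrix above, while edge-case frequency and judgment are entered directly as ratings from one to three, because they need a human assessment. The sketch ignores the qualitative qualifiers in the matrix (single system, legacy or undocumented APIs), which an operator would still have to weigh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """Facts about one workflow. Field names are hypothetical."""
    steps: int            # number of steps in the workflow
    edge_case_score: int  # human rating: 1 rare, 2 regular, 3 frequent and unpredictable
    integrations: int     # number of systems the workflow touches
    judgment_score: int   # human rating: 1 rule-based, 2 occasional discretion, 3 contextual reasoning


def score_steps(steps: int) -> int:
    """1 for one to three steps, 2 for four to eight, 3 for nine or more."""
    if steps < 1:
        raise ValueError("a workflow needs at least one step")
    if steps <= 3:
        return 1
    return 2 if steps <= 8 else 3


def score_integrations(systems: int) -> int:
    """1 for a single system or none, 2 for two to three, 3 for four or more."""
    if systems < 0:
        raise ValueError("the number of systems cannot be negative")
    if systems <= 1:
        return 1
    return 2 if systems <= 3 else 3


def check_rating(name: str, value: int) -> int:
    """Reject human ratings outside the one-to-three scale."""
    if value not in (1, 2, 3):
        raise ValueError(f"{name} must be 1, 2, or 3")
    return value


def difficulty_total(workflow: Workflow) -> int:
    """Add the four dimension scores into a total between 4 and 12."""
    return (
        score_steps(workflow.steps)
        + check_rating("edge_case_score", workflow.edge_case_score)
        + score_integrations(workflow.integrations)
        + check_rating("judgment_score", workflow.judgment_score)
    )


# Hypothetical example: six steps, regular edge cases, three systems, occasional discretion.
invoice_matching = Workflow(steps=6, edge_case_score=2, integrations=3, judgment_score=2)
print(difficulty_total(invoice_matching))  # prints 8
```

Keeping the two countable dimensions as functions and the two judgment-based ones as explicit ratings makes it visible which parts of the score are measured and which are opinion.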

Ready to see how agentic AI transforms business workflows?

Meet directly with our founders and PhD AI engineers. We will demonstrate real implementations from 30+ agentic projects and show you the practical steps to integrate them into your specific workflows—no hypotheticals, just proven approaches.

Read the total

The total places a workflow into one of three zones.

| Total score | What it means |
| --- | --- |
| 4 to 6 | An off-the-shelf tool such as Zapier, Make, or n8n likely covers the workflow without custom engineering. |
| 7 to 9 | An off-the-shelf tool covers part of the workflow, but reliability gaps appear at the edges. Configuration decides whether the workflow holds up in production. |
| 10 to 12 | The workflow needs agentic AI built for the specific process, systems, and exception patterns involved. |
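Reading the total is a simple range lookup, sketched here in Python. The function name and the zone labels are assumptions chosen for the example; the boundaries are the three ranges listed above.

```python
def zone_for_total(total: int) -> str:
    """Map a difficulty total (4 to 12) to one of the three zones."""
    if not 4 <= total <= 12:
        raise ValueError("the total must be between 4 and 12")
    if total <= 6:
        return "off-the-shelf"
    if total <= 9:
        return "hybrid"
    return "custom agentic engineering"


# One hypothetical total from each zone.
for total in (5, 8, 11):
    print(total, zone_for_total(total))
```

Rejecting totals outside four to twelve catches a scoring mistake early, before the zone is used to justify a purchase or a build.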

Most process failures we encounter trace back to skipping this step. A team adopts a workflow builder because it worked for a simple use case, then applies the same tool to a workflow scoring nine or ten, and the tool cannot hold up once real edge cases and multi-system dependencies show up in production. An agent working across four or more systems rarely survives that mismatch.

Two factors the matrix does not capture

Two considerations sit outside the difficulty grid and can override it.

Data sensitivity: how confidential or regulated the data is (PII, financial, health, or trade secrets) and where it is permitted to flow. High sensitivity pushes toward building or self-hosting, which keeps control over data residency and processing, including the audit trails regulators expect. Low sensitivity widens the buy options.

Model choice: the criteria for selecting an AI model (capability, cost per call, latency, context window, and hosting terms) apply on both paths. Treating model choice as an explicit decision surfaces a hidden constraint: a vendor that locks you to a single model removes your ability to switch when prices change or a better model ships. Keeping the architecture model-agnostic is therefore a critical design choice.
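One common way to keep an architecture model-agnostic is to have the workflow depend on a small interface rather than on a vendor SDK. The Python sketch below illustrates the idea under stated assumptions: `ChatModel`, `StubModel`, and `WorkflowAgent` are hypothetical names, and the stub stands in for a real provider adapter, which would wrap that provider's own client behind the same method.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only surface the workflow is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class StubModel:
    """Stand-in for a provider adapter; a real one would call a vendor API here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class WorkflowAgent:
    """Workflow logic that never imports a vendor SDK directly."""

    def __init__(self, model: ChatModel) -> None:
        self._model = model

    def swap_model(self, model: ChatModel) -> None:
        # Switching provider is a one-line change, not a rebuild.
        self._model = model

    def run(self, task: str) -> str:
        return self._model.complete(f"Handle task: {task}")


agent = WorkflowAgent(StubModel("vendor-a"))
print(agent.run("classify this invoice"))
agent.swap_model(StubModel("vendor-b"))
print(agent.run("classify this invoice"))
```

The design choice is that prompts, routing, and business rules live in `WorkflowAgent`, so a price change or a better model affects only the adapter behind `ChatModel`.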

The lock-in trap

The fastest path can quietly become the most expensive one. Agentic AI vendor lock-in is the risk of becoming dependent on a single vendor, model, or proprietary stack that is painful and costly to leave. Time-to-market pressure is what cements it: teams ship fast on an AI platform, then discover the customisation ceiling and the renewal invoice twelve months later, once the dependency is irreversible.

The numbers bear this out. A Zapier survey of enterprise AI adopters found that 81% of enterprise leaders are concerned about AI vendor dependency, yet only 6% believe they could switch providers without material disruption.

The repricing risk is real, not theoretical: in April 2026, Anthropic moved its Claude enterprise edition from fixed to dynamic usage-based pricing, which observers expect may double or triple costs for heavy users.

We have watched clients escape this trap. The AI-powered order recommendation agent we built for Mixam required a migration off a low/no-code platform onto an owned, bespoke system, recorded in our order recommendation case study. Ownership returned both pricing control and customisation depth to the company’s hands.

"We keep our clients on open-source architecture for one reason: the freedom to change the model, the vendor, or the direction without rebuilding from scratch."

Wojciech Achtelik PhD(c), AI Engineer Lead, Vstorm

The decision rule: task or operating model

The heuristic that ties everything together is a question of intent, not technology. If the goal is to automate a discrete set of repetitive tasks, buy. If the goal is to rethink how operations run, and the workflow is core to that vision, build.

The economics mirror the rule: practitioner frameworks place the build-versus-buy total-cost crossover near one million agent conversations per year, below which buying wins on speed, and above which an owned system pulls ahead over the long term. Strategic centrality and volume point the same way.
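The crossover logic can be made concrete with a small Python sketch. It assumes a deliberately simple cost model that is not taken from any cited framework: buying costs a flat fee per conversation, while building costs a fixed annual amount plus a smaller marginal cost per conversation. All figures below are placeholders, chosen only so that the crossover lands at the one-million mark mentioned above; they are not real prices.

```python
def annual_cost_buy(conversations: int, fee_per_conversation: float) -> float:
    """Bought system: cost scales linearly with usage."""
    return conversations * fee_per_conversation


def annual_cost_build(
    conversations: int, fixed_annual: float, marginal_per_conversation: float
) -> float:
    """Owned system: fixed engineering and upkeep plus a smaller per-use cost."""
    return fixed_annual + conversations * marginal_per_conversation


def crossover_volume(
    fee_per_conversation: float, fixed_annual: float, marginal_per_conversation: float
) -> float:
    """Yearly volume at which buying and building cost the same."""
    saving = fee_per_conversation - marginal_per_conversation
    if saving <= 0:
        raise ValueError("building never catches up if it costs more per conversation")
    return fixed_annual / saving


# Placeholder inputs (assumptions for illustration, not market data).
FEE = 1.00          # vendor fee per conversation
FIXED = 750_000.0   # annual cost of owning the system
MARGINAL = 0.25     # owned cost per conversation

print(crossover_volume(FEE, FIXED, MARGINAL))  # prints 1000000.0
for volume in (200_000, 2_000_000):
    buy = annual_cost_buy(volume, FEE)
    build = annual_cost_build(volume, FIXED, MARGINAL)
    print(volume, "buy" if buy < build else "build")
```

With these placeholder inputs, buying is cheaper at 200,000 conversations a year and building is cheaper at two million. The real crossover for a given workflow depends on its own fee, fixed cost, and marginal cost.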

When the answer is build, the commitment is real. The concerns that follow (the technology stack, keeping the system model-agnostic, the architecture, orchestration, and governance and observability) are covered by our TriStorm methodology, demonstrated in the text-to-workflow platform we engineered. This article frames which workflows earn that commitment, and which are better served by an off-the-shelf tool.

Last updated: July 3, 2026
