Buy vs build agentic AI: a single-workflow decision

Every organisation automating a workflow with agentic AI faces the same fork: buy or build. With Gartner projecting more than 40% of agentic AI projects cancelled by 2027, a wrong call is a written-off investment, not a delayed feature. This commentary gives a repeatable, per-workflow framework: name the four real paths, score the workflow on a difficulty matrix from four to twelve, weigh data sensitivity and model choice, and watch the lock-in trap that turns a fast start into a costly dependency. The rule: automate a discrete task, buy; rethink operations, build.
The fork: one workflow, two paths
An organisation identifies a single workflow worth automating with agentic AI, and immediately hits the same fork every operations leader now faces: should we buy an agentic AI product for this process, or build it ourselves?
The stakes are not abstract. Gartner projects that more than 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear value, and inadequate risk controls. A wrong call here is not a delayed feature, but a substantial financial loss.
This commentary offers a repeatable way to make that call for one workflow at a time, before scaling the pattern across other business processes. This is not a universal verdict, because the right answer changes with the workflow in front of you.
How the buy-vs-build call gets made today
In most organisations, the decision is not made with a framework at all. It is made by decision makers reacting to whichever vendor presented most recently, by board pressure to ship a visible, AI-driven win, or by the platform a team already holds a licence for.
The result is predictable: enterprises now run a mix of bought and built agentic workflows, many of them generative AI pilots that graduated into production, yet almost none of them arrived at that hybrid state deliberately. They accumulated it by accident, one tool here and one custom build there, with no shared observability and no governance across the seams.
That accidental drift is the real cost. The framework below replaces it with a deliberate, per-workflow decision.
The four paths and their trade-offs
There are four realistic routes to process automation with agentic AI, not two. Naming all four avoids the false binary that pushes teams into workflow automation projects they later regret.
- Integrate. Buy a ready system and integrate it (Palantir, Automation Anywhere, UiPath).
- Low/no-code. Buy a low or no-code platform and implement it yourself (Botpress, CrewAI).
- Build bespoke. Engineer a tailored solution, the path Vstorm delivers through multi-agent system development.
- OSS self-build. Take half-ready open-source and build on top of it with an in-house team.
Each path trades the same set of variables against one another:
| Dimension | Integrate | Low/no-code | Build bespoke | OSS self-build |
| --- | --- | --- | --- | --- |
| Implementation cost | High licence, low engineering | Low to start | Higher upfront | Low licence, high engineering |
| Ongoing cost | Recurring, vendor-set | Recurring, scales with use | Owned, predictable | Owned, plus upkeep burden |
| Control | Low | Limited to platform | Full | Full |
| Speed-to-market | Fast for templated fit | Fastest prototype | Moderate | Slowest |
| IP ownership | Vendor | Vendor | You | You |
| Lock-in risk | High | High | Low | Low |
| Customisation | Bounded by product | Bounded by platform | Unbounded | High, effort-dependent |
| Accuracy and reliability | Strong on standard cases | Degrades on edge cases | Tuned to your data | Depends on in-house skill |
Scoring the workflow: a difficulty matrix
Before choosing a path, score the workflow itself. Four properties determine how hard it is to automate well: the number of steps, the volume of edge cases, the integrations required with other enterprise systems, and whether the work is deterministic or judgment-heavy, which sets how much human oversight it will need. Data preparation, a familiar burden from machine learning projects, compounds the difficulty further; on AI agent projects it can account for up to 80% of total effort.
The difficulty matrix below breaks a workflow into four measurable dimensions. Each dimension gets a score from one to three, and the total tells an operator, before any engineering work begins, whether the workflow sits in off-the-shelf territory, a hybrid zone, or the range where custom agentic engineering earns its cost.
Score the four dimensions
Rate the workflow against each dimension and add the results together for a total between four and twelve.
| Dimension | Score 1 (low) | Score 2 (medium) | Score 3 (high) |
| --- | --- | --- | --- |
| Number of steps | One to three steps, single system | Four to eight steps, sequential logic | Nine or more steps, branching sub-processes |
| Edge case frequency | Rare, covered by existing rules | Regular, follows known patterns | Frequent and unpredictable |
| Required integrations | Single system or none | Two to three systems, standard APIs | Four or more systems, legacy or undocumented |
| Determinism vs. judgment | Fully rule-based | Mostly rule-based, occasional discretion | Requires contextual reasoning or domain expertise |
Ready to see how agentic AI transforms business workflows?
Meet directly with our founders and PhD AI engineers. We will demonstrate real implementations from 30+ agentic projects and show you the practical steps to integrate them into your specific workflows—no hypotheticals, just proven approaches.
Read the total
The total places a workflow into one of three zones.
| Total score | What it means |
| --- | --- |
| 4 to 6 | An off-the-shelf tool such as Zapier, Make, or n8n likely covers the workflow without custom engineering. |
| 7 to 9 | An off-the-shelf tool covers part of the workflow, but reliability gaps appear at the edges. Configuration decides whether the workflow holds up in production. |
| 10 to 12 | The workflow needs agentic AI built for the specific process, systems, and exception patterns involved. |
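The scoring and zoning steps above can be sketched in a few lines of Python. This is a minimal illustration, not a published tool: the names `WorkflowScore` and `zone`, the short zone labels, and the example workflow are assumptions made for this sketch. Only the four dimensions, the one-to-three scale, and the zone boundaries come from the tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowScore:
    """One workflow rated 1 (low) to 3 (high) on each matrix dimension."""
    steps: int         # 1 = one to three steps, 3 = nine or more
    edge_cases: int    # 1 = rare, 3 = frequent and unpredictable
    integrations: int  # 1 = single system or none, 3 = four or more
    judgment: int      # 1 = fully rule-based, 3 = contextual reasoning

    def total(self) -> int:
        scores = (self.steps, self.edge_cases, self.integrations, self.judgment)
        for score in scores:
            if score not in (1, 2, 3):
                raise ValueError(f"each dimension must score 1, 2 or 3, got {score}")
        return sum(scores)


def zone(total: int) -> str:
    """Map a total between 4 and 12 onto the three zones in the table."""
    if not 4 <= total <= 12:
        raise ValueError(f"total must be between 4 and 12, got {total}")
    if total <= 6:
        return "off-the-shelf"
    if total <= 9:
        return "hybrid"
    return "custom agentic build"


# Hypothetical example: a mid-length workflow with unpredictable edge cases
# that touches four or more systems.
invoice_matching = WorkflowScore(steps=2, edge_cases=3, integrations=3, judgment=2)
print(invoice_matching.total(), zone(invoice_matching.total()))  # 10 custom agentic build
```

Scoring each candidate workflow this way before any vendor conversation makes the mismatch described next visible early: a tool chosen for a workflow scoring five is being asked to carry one scoring ten.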
Most process failures we encounter trace back to skipping this step. A team adopts a workflow builder because it worked for a simple use case, then applies the same tool to a workflow scoring nine or ten, and the tool cannot hold up once real edge cases and multi-system dependencies show up in production. An agent working across four or more systems rarely survives that mismatch.
Two factors the matrix does not capture
Two considerations sit outside the difficulty grid and can override it.
Data sensitivity: This covers how confidential or regulated the data is (PII, financial, health, or trade secrets) and where it is permitted to flow. High sensitivity pushes toward building or self-hosting to keep control over data residency and processing, including the audit trails regulators expect. Low sensitivity widens the buy options.
Model choice: The criteria for selecting an AI model (capability, cost per call, latency, context window, and hosting terms) apply on both paths. Framing model choice as an explicit decision surfaces a hidden constraint: a vendor that locks you to a single model removes your ability to switch when prices change or a better model ships. Keeping the architecture model-agnostic is therefore a critical design choice.
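One common way to keep an architecture model-agnostic is to write workflow code against a small interface rather than a vendor SDK. The sketch below assumes that approach; `ChatModel`, `VendorAModel`, `SelfHostedModel`, and `classify_ticket` are hypothetical names, and the two model classes are stand-ins that return canned strings instead of calling a real API.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical minimal interface the workflow code depends on."""

    def complete(self, prompt: str) -> str: ...


class VendorAModel:
    """Stand-in for a hosted model; a real adapter would call the vendor SDK here."""

    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class SelfHostedModel:
    """Stand-in for a self-hosted model behind the same interface."""

    def complete(self, prompt: str) -> str:
        return f"[self-hosted] {prompt}"


def classify_ticket(model: ChatModel, ticket_text: str) -> str:
    """A workflow step written against the interface, not a specific vendor."""
    return model.complete(f"Classify this support ticket: {ticket_text}")


# Swapping the model is a one-line change at the call site;
# the workflow step itself is untouched.
print(classify_ticket(VendorAModel(), "Refund not received"))
print(classify_ticket(SelfHostedModel(), "Refund not received"))
```

In practice the adapters also need to normalise differences in token limits, error handling, and pricing metadata, which this sketch leaves out.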
The lock-in trap
The fastest path can quietly become the most expensive one. Agentic AI vendor lock-in is the risk of becoming dependent on a single vendor, model, or proprietary stack that is painful and costly to leave. Time-to-market pressure is what cements it: teams ship fast on an AI platform, then discover the customisation ceiling and the renewal invoice twelve months later, once the dependency is entrenched.
The numbers bear this out. A Zapier survey of enterprise AI adopters found that 81% of enterprise leaders are concerned about AI vendor dependency, yet only 6% believe they could switch providers without material disruption.
The repricing risk is real, not theoretical: in April 2026, Anthropic moved its Claude enterprise edition from fixed to dynamic usage-based pricing, which observers expect may double or triple costs for heavy users.
We have watched clients escape this trap. The AI-powered order recommendation agent we built for Mixam required a migration off a low/no-code platform onto an owned, bespoke system, recorded in our order recommendation case study. Ownership returned both pricing control and customisation depth to the company’s hands.
“We keep our clients on open-source architecture for one reason: the freedom to change the model, the vendor, or the direction without rebuilding from scratch.”
Wojciech Achtelik PhD(c), AI Engineer Lead, Vstorm
The decision rule: task or operating model
The heuristic that ties everything together is a question of intent, not technology. If the goal is to automate a discrete set of repetitive tasks, buy. If the goal is to rethink how operations run, and the workflow is core to that vision, build.
The economics mirror the rule: practitioner frameworks place the build-versus-buy total-cost crossover near one million agent conversations per year, below which buying wins on speed, and above which an owned system pulls ahead over the long term. Strategic centrality and volume point the same way.
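The crossover logic can be made concrete with a simple linear cost model. Every number below is hypothetical, chosen only so that the arithmetic lands on the one-million-conversation figure quoted above. The function names and the assumption that both paths have a constant cost per conversation are also simplifications for this sketch.

```python
def annual_buy_cost(conversations: int, price_per_conversation: float) -> float:
    """Bought platform: cost scales with usage, no fixed engineering cost assumed."""
    return conversations * price_per_conversation


def annual_build_cost(conversations: int, fixed_per_year: float,
                      cost_per_conversation: float) -> float:
    """Owned system: fixed team and infrastructure cost plus a lower marginal cost."""
    return fixed_per_year + conversations * cost_per_conversation


def crossover_volume(fixed_per_year: float, build_cost_per_conversation: float,
                     buy_price_per_conversation: float) -> float:
    """Annual volume at which building and buying cost the same."""
    saving = buy_price_per_conversation - build_cost_per_conversation
    if saving <= 0:
        raise ValueError("building never catches up unless its marginal cost is lower")
    return fixed_per_year / saving


# Hypothetical inputs: 0.50 per conversation to buy, 0.25 to run an owned
# system, and 250,000 per year in fixed build and upkeep cost.
BUY_PRICE, BUILD_MARGINAL, BUILD_FIXED = 0.50, 0.25, 250_000.0

print(crossover_volume(BUILD_FIXED, BUILD_MARGINAL, BUY_PRICE))  # 1000000.0

for volume in (500_000, 2_000_000):
    buy = annual_buy_cost(volume, BUY_PRICE)
    build = annual_build_cost(volume, BUILD_FIXED, BUILD_MARGINAL)
    print(volume, "buy" if buy < build else "build")
```

With these inputs, buying is cheaper at 500,000 conversations a year and building is cheaper at 2,000,000. Substituting real licence quotes and engineering estimates moves the crossover, so the one-million figure should be treated as a rule of thumb rather than a threshold.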
When the answer is build, the commitment is real. The concerns that follow (the technology stack, keeping the system model-agnostic, the architecture, orchestration, and governance and observability) are covered by our TriStorm methodology, demonstrated in the text-to-workflow platform we engineered. This article frames which workflows earn that commitment, and which are better served by an off-the-shelf tool.



