Stop Waiting for Perfect Data to Start with AI
Somewhere on your roadmap, an AI initiative has been parked for a quarter behind a six- or seven-figure data-modernization program. The pitch sounded right: fix the data first, then the AI. For a narrow set of workflows that pitch is correct. For most of the AI value an operator can capture this quarter, it is not. The right scope is the workflow, not the enterprise. Ask what data this workflow actually needs, and where it already lives.
Published May 6, 2026 by David Suydam
Somewhere on your roadmap, an AI initiative has been parked for a quarter or more, waiting for a data project to clear. The pitch was reasonable when you signed it. Before the AI work could start, the firm explained, the data had to be fixed. A readiness assessment would scope the gap. A modernization program would close it. The AI roadmap would follow. Eighteen months and six or seven figures later, the AI work would begin. You agreed because the firm was credible and the logic sounded right. Now you have a quarterly review next week, a board that wants to see AI in production somewhere, and a roadmap that keeps slipping to the right.
The default consultant pitch in 2026 is that before you can do AI, you have to fix your data. For a real but narrow set of AI work, that pitch is correct. For most of the AI value an operator can capture this quarter, it is not. The difference is which workflow you are talking about, and almost no one making the data-first pitch is being precise about that.
The pattern is mature enough to name
The data-readiness assessment as a precursor to AI is now a recognizable category. Information-architecture and data-modernization consultancies have packaged it as a service. Large advisory firms have productized it. The shape is consistent across providers. A maturity model with several levels. Foundational data readiness as the first level. AI deployment somewhere later, conditional on clearing the levels below. The pattern has spread far enough that operators can name it, which is what makes a piece like this possible. You have either bought a version of it, been pitched a version of it, or watched a peer company budget for one.
Two structural facts make the asymmetry visible.
The price of the workflow tooling has collapsed. Operator-grade AI tools are at $20 to $30 per user per month. Claude Team. ChatGPT Business. Gemini Workspace add-ons. They accept uploaded PDFs, Word files, and spreadsheets and produce structured extraction without a single line of enterprise data integration.
The price of the wait has not. Mid-market data-modernization projects are routinely scoped at six or seven figures over nine to eighteen months. When the practical workflow can ship in two weeks at a $200-a-month team licence, asking whether you should be waiting at all stops being abstract. It is a calendar decision worth a quarter of your AI strategy.
Both numbers are public. The asymmetry is not subtle. What the data-first pitch quietly assumes, and what most operators do not push back on, is that the workflow they are trying to ship needed to wait in the first place.
The right scope is the workflow, not the enterprise
Architech ran this experiment in our own engagement work. The first AI Jumpstart engagement we delivered was scoped the way every major firm scopes AI advisory: enterprise readiness across six pillars (technology, data, IT capability, leadership, culture, governance). That was the industry-default frame at the time. Across subsequent engagements, what we were assessing progressively shrank. Not the methodology, not the rigour. The scope.
What we found was consistent. At the enterprise level, AI readiness is a multi-quarter, multi-pillar question with no clear unlock. At the workflow level, AI readiness is a one-week question with a concrete answer. Same client, same data, same team. Narrow the scope from the company to a single workflow and the readiness barrier either disappears or becomes a small, scoped engineering task. AI Jumpstart sits in the operator's calendar today as a decision accelerator, ending in a decision to redesign a specific workflow or a decision to stop. AI Foundations is the conditional, workflow-specific production-readiness step that follows when a workflow needs it. It is explicitly not a mandatory precondition. Both names are artifacts of that scope shift, not the cause of it.
The shift mattered because it lined up with where the value actually lives. McKinsey's March 2025 State of AI tested 25 organizational attributes against EBIT impact from gen AI.
Workflow redesign had the biggest single effect on EBIT. Only 21% of gen-AI adopters have fundamentally redesigned at least some workflows.
The other roughly 79% have layered AI onto existing processes or are still waiting on the foundation. BCG's 10-20-70 framework, published in September 2025, lands the same point from the effort side. About 10% of AI value comes from algorithms, 20% from data and technology, and 70% from people and processes. Data and technology are real and they matter. They are also one fifth of where the value is. The data-first pitch routinely treats them as the whole.
Two buckets, calibrated
There are two kinds of AI workflow on your roadmap.
Bucket A. Workflows where the data-first wait is misapplied.
Drafting, summarization, and structured extraction from documents the operator uploads. A finance team pulling figures out of vendor invoices. An HR team drafting first-pass policy responses against an employee handbook. The data is operator-controlled and already on a laptop or in a SharePoint folder. There is no enterprise data integration to wait on. A minimal code sketch of what that extraction can look like follows at the end of this bucket.
Analyst-grade synthesis from operator inputs. HubSpot deployed Claude across customer success, marketing, and engineering by connecting it to existing internal services. Customer success managers cut escalation troubleshooting from three to five days down to under an hour. The unlock was a connector to data the firm already controlled, not enterprise data modernization.
Role-emulation and thinking-partner agents. Zapier deployed more than 800 internal Claude-driven agents across engineering, marketing, and customer success. Internal task volume completed via Claude grew tenfold year over year. None of it required an enterprise-wide data overhaul to ship.
Decision-staging on inputs the operator already controls. A pricing model fed by the operator's spreadsheet. A sales-call coaching agent fed by the operator's CRM exports. The data does not have to be made enterprise-ready to be used. It has to be made workflow-ready, which is a smaller question.
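To make the first Bucket A pattern concrete, here is a minimal sketch of invoice extraction against a hosted LLM API. It assumes the Anthropic Python SDK; the file path, field names, and model ID are illustrative, and the chat tools named earlier do the equivalent through their upload interfaces with no code at all.

```python
# Minimal sketch: structured extraction from a vendor invoice PDF via a hosted
# LLM API. The file path, field names, and model ID are illustrative only.
import base64
import json

import anthropic  # pip install anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A document the operator already controls; no enterprise integration needed.
with open("invoices/acme-2026-03.pdf", "rb") as f:  # hypothetical path
    pdf_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

prompt = (
    "Extract vendor_name, invoice_number, invoice_date, currency, and "
    "total_amount from this invoice. Reply with one JSON object and nothing else."
)

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model ID
    max_tokens=500,
    messages=[{
        "role": "user",
        "content": [
            {"type": "document",
             "source": {"type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_b64}},
            {"type": "text", "text": prompt},
        ],
    }],
)

fields = json.loads(response.content[0].text)
print(fields)  # e.g. {"vendor_name": "...", "total_amount": "..."}
```

The same shape covers the pricing-spreadsheet and CRM-export cases: the input is a file the operator already has, and the output is a structured object a person reviews before it goes anywhere.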
Bucket B. Workflows where data-first really is the right call.
Customer-360, regulated decisioning, fraud risk, and AI-as-system-of-record. When the AI output becomes the audited record of an entity (customer, claim, asset, payment), the foundation has to hold up under audit. A unified, governed, queryable data layer is not optional. The chorus of vendor and analyst voices on this point is consistent, and the position is well defended.
Agentic AI at scale across heterogeneous systems. When agents reason across many enterprise systems on behalf of the operator, the data-context problem is real. As a16z's Cui and Li put it, data and analytics agents are essentially useless without the right context. The data-readiness work is real and serious here.
The argument is not that the data-first pitch is wrong. The argument is that it is being applied to workflows where it does not belong.
The operating example
We worked with a professional services firm to redesign how they produced sales proposals at scale. The data the redesigned workflow needed was past proposals and a service catalog. Both were already on disk. The firm's legacy ERP system carried substantial data-quality issues and was not in scope for this workflow. Had readiness been scoped at the enterprise level, the ERP issues would have blocked the AI work for quarters. Scoped at the workflow level, the ERP was simply not part of this workflow's data surface.
The redesigned workflow cut effort to produce each proposal by approximately 80% and shifted non-billable senior-staff time to billable, generating multi-million-dollar annualized ROI. The data foundation the data-first pitch would have insisted on was not the bottleneck. The workflow design was.
The same logic plays out in our own operation. The pipeline shipping the post you are reading runs on git, role-separated agents, an external LLM, an external image model, and an external CMS. There is no enterprise data lake. There is no customer-360. There is no data-modernization predicate. We told the longer version of that story in the Customer Zero piece a few weeks ago. It is a second example of the same workflow-scoped pattern.
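For readers who want to picture what role-separated agents mean at the code level, here is a generic sketch, not our actual pipeline: the same hosted model invoked under different system prompts, one role per step, with git and the CMS handled by ordinary non-LLM tooling around it. The prompts, model ID, and brief are illustrative.

```python
# Generic sketch of role-separated agents: one hosted model, one system prompt
# per role, one role per pipeline step. Not Architech's actual pipeline; the
# prompts, model ID, and brief are illustrative.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # illustrative model ID


def run_role(system_prompt: str, user_input: str) -> str:
    """Run one pipeline step as a distinct role with its own instructions."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=2000,
        system=system_prompt,
        messages=[{"role": "user", "content": user_input}],
    )
    return response.content[0].text


brief = "Argue that AI readiness should be scoped per workflow, not per enterprise."
draft = run_role("You are the writer. Turn the brief into a first draft.", brief)
edited = run_role("You are the editor. Tighten the draft and flag unsupported claims.", draft)

# Committing the result to git and pushing it to the CMS are ordinary,
# non-LLM steps (a git commit, a CMS API call) outside this sketch.
print(edited)
```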
The strongest counter-position, taken seriously
The data-first counter-position is not weak, and it is not held by quiet voices.
Through 2026, Gartner expects organizations to abandon 60% of AI projects unsupported by AI-ready data.
Bain frames a robust data strategy and operating model as core enablers of AI value realization, not a nice-to-have. Fortune's coverage of agentic AI at scale puts it in sharper terms still: what looked like a capability problem is revealed to be a data infrastructure problem, a failure of accessible, consistent, and usable data across systems, and no amount of model improvement will solve it. The same article notes that 80% of companies cite data limitations as the primary obstacle to scaling AI.
That position is correct, taken on its own terms. For any AI workflow that becomes the system of record, makes regulated decisions, depends on a unified view of an entity, or operates as an autonomous agent across many enterprise systems, the data foundation has to come first. Skipping it is malpractice. That is the steelman stated at full strength, and within its scope it holds.
The claim here is narrower. The data-first pitch is misapplied when it is applied universally. It is the right call for Bucket B. It is the wrong call for Bucket A, and Bucket A is where most of the operator AI value an enterprise can capture this quarter actually lives.
Practitioners closer to the work say versions of the same thing. One Salesforce SVP, interviewed in CIO, described the posture as reverse-engineering from what you need: put something in production, observe it, scale it, then put in the next one. One head of group technology strategy at a large bank made the case that rather than treating imperfect data as a constraint, organizations can ask how AI might help improve and connect the data they already have. One CIO contributor warned that if ownership is unclear or quality is unknown, the issue is a data governance problem wearing an AI costume, and the diagnostic is workflow-specific, not enterprise-wide. One practitioner essay argued that the AI-first versus data-first debate is fundamentally flawed because it frames the problem as a sequencing decision when in reality it is a systems design problem.
Same answer either way: focus on the workflow, not the enterprise.
The question to take into your week
Pull up the AI initiative at your company that has been parked the longest. Look at why it is parked. If the answer is some version of "we are waiting for the data project," ask your AI lead one question. What data does this workflow actually need, and where does it already live? The answer will be one of two things. Either the workflow lives in Bucket B, the data really is the bottleneck, and the modernization program is the right path, in which case the parked status is correct and the calendar is real. Or the workflow lives in Bucket A, the data it needs is already on disk somewhere your team controls, and the foundation you are paying to build is blocking something that does not need it.
That second answer is more common than the data-first pitch admits. When you find one, you have a workflow you can ship in two weeks at a $200-a-month team licence, not in eighteen months at six or seven figures. The same operator-as-judge posture we argued for in the one-claim-test piece applies here: test us on it. If the answer to the question turns out to be "we do not actually need the assessment for this one," reach out. That is the conversation worth having before next quarter's review.
Ready to apply this to your workflows?
Architech's AI Jumpstart is the structured entry point.