Before you sign: A CIO’s framework for evaluating hotel AI pilots

By Walid Al-Hajj

The demo runs flawlessly. A guest-messaging chatbot answers a tricky multi-part question in two seconds, a revenue-management model lifts a sample property’s projected RevPAR and the room goes quiet. Then the contract lands, integration starts and the pilot is switched off a year later.

I have watched this pattern repeat across banking and PropTech for two decades. An MIT NANDA study reported by Fortune found that 95% of enterprise generative AI pilots deliver no measurable impact on the bottom line, and Gartner predicted that 30% of generative AI projects would be abandoned after proof of concept by the end of 2025. I am not a hotelier. I am a technology executive who spent years putting models in front of auditors, and the rigor that regulated industries apply to new systems carries over to a hotel portfolio.

Why the demo is the wrong evidence

A demo is a controlled environment built to sell. It runs on clean sample data, a narrow set of questions and none of the integration debt that defines a live property. McKinsey’s State of AI survey found that only 39% of organizations attribute any EBIT impact to AI, and most of those report less than 5%. The gap between the demo and the income statement is where pilots die.

Hospitality has a sharper version of this problem. Skift Research found that 63% of hotel tech budgets still go toward maintaining legacy systems, many never designed to share data cleanly. A model is only as good as the property management system, channel manager and CRM feeding it, so evaluate the plumbing, not the polish.

The total cost of ownership lens

The license fee is the smallest line on the bill. Total cost of ownership counts everything required to keep a system useful, and one analysis puts a custom enterprise AI build at $500,000 to $2 million upfront, plus 30% to 40% of that every year to run it. Build the estimate across six buckets:

  • Integration: Connecting the tool to your PMS, POS and booking channels, including the hours your vendor will not absorb
  • Data plumbing: Cleaning, mapping and maintaining the feeds the model relies on
  • Staff retraining: Front-desk and revenue teams need time to learn the workflow, and that time is payroll
  • Model monitoring: Someone has to watch for drift when pricing or messaging quietly degrades
  • Ongoing licensing: Per-seat, per-room or usage-based fees that scale as you roll out across properties
  • Change management: Pushing adoption above the threshold where the investment pays back

A pilot that looks cheap on the license and expensive on every other line is the norm. Price all six.
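To make the six-bucket estimate concrete, here is a minimal Python sketch. Every figure is a placeholder invented for illustration, not a benchmark, and the names (TcoBucket, total_cost_of_ownership) are hypothetical. It assumes each bucket can be split into a one-time cost and a recurring annual cost over a three-year horizon.

```python
from dataclasses import dataclass


@dataclass
class TcoBucket:
    name: str
    upfront: int  # one-time cost, in dollars
    annual: int   # recurring cost per year, in dollars


def total_cost_of_ownership(buckets, years):
    """Sum one-time costs plus recurring costs over the planning horizon."""
    return sum(b.upfront + b.annual * years for b in buckets)


# Placeholder figures for illustration only; replace with your own quotes.
buckets = [
    TcoBucket("Integration", 120_000, 15_000),
    TcoBucket("Data plumbing", 60_000, 25_000),
    TcoBucket("Staff retraining", 30_000, 10_000),
    TcoBucket("Model monitoring", 0, 40_000),
    TcoBucket("Ongoing licensing", 0, 12_000),
    TcoBucket("Change management", 25_000, 10_000),
]

YEARS = 3
tco = total_cost_of_ownership(buckets, YEARS)
license_cost = sum(b.annual * YEARS for b in buckets if b.name == "Ongoing licensing")
license_share = license_cost / tco

print(f"{YEARS}-year TCO: ${tco:,}")
print(f"Licensing as a share of TCO: {license_share:.1%}")
```

With these made-up inputs, licensing is a single-digit share of the total. The point is the shape of the calculation, not the numbers: the license line only means something once the other five sit beside it.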

Risk-adjusted ROI, not headline ROI

A vendor’s ROI projection assumes the project works. Risk-adjusted ROI discounts that return by the probability it does not. If a guest-messaging deployment promises a strong return but carries a real chance of stalling on integration, low adoption or model drift, your expected value is the return multiplied by the probability of success, minus the cost you incur either way.

McKinsey reports that 51% of organizations have experienced at least one AI-related incident, so the chance of something going wrong is not theoretical. Build a base case, a downside case and a failure case, then size the pilot so the failure is survivable.
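A rough sketch of that arithmetic in Python follows. The probabilities, returns and pilot cost are invented for illustration, and the function name expected_net_value is hypothetical. It assumes the pilot cost is incurred in full under every scenario and that the failure case returns nothing.

```python
def expected_net_value(scenarios, cost):
    """Probability-weighted return minus the cost incurred in every scenario.

    scenarios maps a label to (probability, gross return in dollars).
    """
    total_p = sum(p for p, _ in scenarios.values())
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    expected_return = sum(p * r for p, r in scenarios.values())
    return expected_return - cost


# Illustrative assumptions only, not benchmarks.
scenarios = {
    "base": (0.50, 900_000),      # pilot works roughly as the vendor projects
    "downside": (0.30, 300_000),  # partial adoption or integration delays
    "failure": (0.20, 0),         # pilot is switched off
}
pilot_cost = 400_000

headline_net = scenarios["base"][1] - pilot_cost
risk_adjusted_net = expected_net_value(scenarios, pilot_cost)
worst_case_loss = pilot_cost - scenarios["failure"][1]

print(f"Headline net return:      ${headline_net:,.0f}")
print(f"Risk-adjusted net return: ${risk_adjusted_net:,.0f}")
print(f"Loss if the pilot fails:  ${worst_case_loss:,.0f}")
```

The gap between the headline and risk-adjusted figures is the discount the vendor deck leaves out. The last line is the number to size against: if the portfolio cannot absorb that loss, the pilot is too large.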

KPIs to set before the pilot starts

Define success before the vendor does. A pilot without pre-set metrics will always be declared a success by whoever championed it. Set targets in writing, tied to metrics your GMs already track:

  • Revenue effect: Movement in RevPAR, ADR or GOPPAR against a matched control property, not a vendor baseline
  • Guest experience: Change in guest satisfaction scores (GSS) and review sentiment over the pilot window
  • Operational load: Staff hours saved or reallocated, with the calculation documented
  • Adoption rate: The share of eligible staff or guest interactions actually using the tool, since unused systems return nothing
  • Accuracy and escalation: For a chatbot, the rate of correct resolutions and of handoffs to a human
  • Unit economics: Cost per booking, message or occupied room, tracked against the TCO estimate

Exit criteria: Decide now when to stop

The hardest discipline is shutting down a pilot that key people want to keep. Set kill criteria before launch, beside the KPIs, so the decision is mechanical, not political. Shut it down if the tool misses its primary KPI by a defined margin past a defined date, or if adoption stays below the payback threshold once training ends. Pull the pilot when integration or data-quality costs exceed the TCO ceiling by a set percentage, when model drift produces pricing or guest-facing errors above an agreed tolerance, or when the vendor cannot meet a security or data-handling requirement. A clean exit is a successful outcome. It protects capital and produces evidence for the next decision.
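One way to keep the decision mechanical is to write the kill criteria down as explicit checks. The Python sketch below is an assumption-laden illustration: the field names, thresholds and the triggered_exits helper are all hypothetical, and real criteria would come from your own KPI and TCO documents.

```python
from dataclasses import dataclass


@dataclass
class KillCriteria:
    kpi_miss_margin: float     # e.g. 0.20 means 20% below target triggers exit
    adoption_threshold: float  # minimum adoption rate needed for payback
    tco_ceiling: float         # budgeted integration and data cost, in dollars
    cost_overrun_pct: float    # allowed overrun above the ceiling
    drift_tolerance: float     # maximum acceptable error rate from drift


@dataclass
class PilotStatus:
    kpi_actual: float
    kpi_target: float
    past_review_date: bool
    training_complete: bool
    adoption_rate: float
    integration_and_data_cost: float
    drift_error_rate: float
    vendor_meets_security: bool


def triggered_exits(s, c):
    """Return the list of kill criteria the pilot currently trips."""
    reasons = []
    if s.past_review_date and s.kpi_actual < s.kpi_target * (1 - c.kpi_miss_margin):
        reasons.append("primary KPI missed by more than the defined margin")
    if s.training_complete and s.adoption_rate < c.adoption_threshold:
        reasons.append("adoption below payback threshold after training")
    if s.integration_and_data_cost > c.tco_ceiling * (1 + c.cost_overrun_pct):
        reasons.append("integration or data costs exceed the TCO ceiling")
    if s.drift_error_rate > c.drift_tolerance:
        reasons.append("drift errors above agreed tolerance")
    if not s.vendor_meets_security:
        reasons.append("vendor fails a security or data-handling requirement")
    return reasons


# Illustrative values only.
criteria = KillCriteria(0.20, 0.60, 200_000, 0.25, 0.02)
status = PilotStatus(
    kpi_actual=2.0, kpi_target=4.0, past_review_date=True,
    training_complete=True, adoption_rate=0.72,
    integration_and_data_cost=180_000, drift_error_rate=0.01,
    vendor_meets_security=True,
)

reasons = triggered_exits(status, criteria)
print("Shut down" if reasons else "Continue", reasons)
```

In this made-up case only the KPI check trips, which is enough. Agreeing on the thresholds before launch means the output is a report on the pilot, not a verdict on whoever championed it.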

The compliance overlay most groups skip

Most hotel groups treat AI as an operations purchase and meet the compliance questions after signing. In a regulated industry, you do the reverse. The U.K. Information Commissioner’s Office fined Marriott £18.4 million ($21.4 million) over a breach tied to 339 million guest records, a reminder that the data controller, the hotel, carries the liability even when the failure sits in a vendor’s cloud.

Run five checks on every AI pilot:

  • Guest PII: Know exactly what personal data the model ingests, where it is stored and how long it is kept
  • Vendor data handling: Confirm in the contract whether your guest data trains the vendor’s models and who can access it
  • Algorithmic fairness: A model that sets prices or allocates upgrades can disadvantage protected groups, a documented risk in algorithmic decisioning for hospitality, so test for it
  • Security review: Treat the AI vendor as part of your attack surface and review it accordingly
  • Auditability: Require that the system can explain a given decision after the fact, because regulators and guests will eventually ask

These checks are not obstacles to innovation. They are the conditions under which it scales.

The framework is simple to state and demanding to apply: price the full cost, discount the return by the risk, fix the metrics and the exit before you start, and run compliance first. Hotels that do will place better bets and walk away from the rest early.

Walid Al-Hajj is managing director of Technium Consulting and cofounder of Buttonwood Property Management. A member of the Fast Company Executive Board, he has spent more than 20 years leading enterprise technology and operations across banking and consulting, including senior technology and risk-management roles at Scotiabank and RBC, and now advises on Fintech, PropTech and AI adoption in regulated industries.
