Sabre with Mindtrip and PayPal is shipping end-to-end agentic booking in Q2 2026. Google AI Mode is booking Marriott directly. Amadeus Cytric Easy lives inside Microsoft Teams. Your board is asking what your AI travel strategy is. We build the deterministic, vendor-neutral alternative for TMCs and OTAs who cannot afford a hallucinated hotel or an Air Canada-style liability ruling.
0.6% · GPT-4 success rate on the TravelPlanner benchmark (OSU NLP Group, arXiv 2402.01622)
$812.02 · Air Canada ordered to pay after chatbot invented a bereavement fare policy (Moffatt v. Air Canada, BC CRT, Feb 2024)
Aug 2, 2026 · EU AI Act transparency rules go live for deployers (Kennedys, AI Act Article 50)
Who this page is for. Travel management company operations leads, OTA product leads, and corporate travel directors who need an agentic AI layer that respects their existing GDS contracts, NDC servicing reality, corporate policy obligations, and EU exposure. It also works if you have already piloted a vendor chatbot and watched it confidently confirm a room that was never in inventory.
The classic failure, so you know we are talking about the same thing: a family asks a travel agency's AI planner for a luxury eco-lodge in Costa Rica under $200. The LLM blends Tabacon Resort and Nayara Springs into a fictional Tabacon Springs Eco-Lodge. The description is gorgeous. The booking confirmation is generated. The family flies in. The property does not exist.
This is not a quality problem to iterate on. It is a legal problem, a duty-of-care problem, and a margin problem all at once.
On Feb 14, 2024 the British Columbia Civil Resolution Tribunal ordered Air Canada to pay Jake Moffatt $812.02 after its chatbot invented a retroactive bereavement fare policy that contradicted the airline's actual fare rules. Air Canada argued the chatbot was a separate legal entity. The tribunal rejected that defense in plain language: the company is responsible for every statement on its surfaces, whether it comes from a static page or a model. Every travel-tech counsel memo written since cites this case. It is the one precedent your legal team is worried about.
In 2025 a pair of tourists trekked up to 4,000 meters in the Peruvian Andes looking for the Sacred Canyon of Humantay, a destination an AI planner had invented wholesale. A Malaysian couple drove 400 kilometers to ride a non-existent Kuak Skyride after an AI-generated video convinced them it was real. A Tasmanian village of 33 residents started receiving hotel calls about thermal springs that do not exist. ISO 31030 makes traveler safety the deployer's obligation. These are exactly the incidents it is written to prevent, and your insurance carrier is already asking about your AI posture.
A realistic flight booking is roughly ten sequential steps: extract intent, search, filter, price, hold, policy-check, passenger details, payment hand-off, PNR commit, ticketing. If every step is a probabilistic LLM call at 90% reliability, your end-to-end success rate is 0.9^10, about 35%. The OSU NLP Group's TravelPlanner benchmark found that GPT-4 with ReAct completes real multi-day itineraries only 0.6% of the time. You cannot prompt your way out of compounding stochastic failure. You have to remove the LLM from the control flow.
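The compounding arithmetic is easy to check. The sketch below assumes every step succeeds independently and with the same probability, which is a simplification; the function name is ours, for illustration only.

```python
def end_to_end_success(step_reliability: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return step_reliability ** steps

# Ten sequential steps at 90% reliability each.
rate_90 = end_to_end_success(0.90, 10)
print(f"{rate_90:.1%}")  # 34.9%

# Even at 99% per step, roughly one booking in ten still fails somewhere.
rate_99 = end_to_end_success(0.99, 10)
print(f"{rate_99:.1%}")  # 90.4%
```

The second figure is the one worth showing a steering committee: improving the model helps, but only taking steps out of the probabilistic path changes the shape of the curve.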
GDS providers charge per segment, typically $3 to $3.50 plus commission, and they enforce look-to-book ratios that penalize speculative searches. Lufthansa Group hiked GDS booking fees again effective Jan 1, 2026 across Amadeus, Sabre, and Travelport. An agent that happily runs four exploratory searches per user turn will burn through an OTA's 3 to 5% merchant-model margin inside a quarter. This is the single most overlooked number in agentic travel pitches, and it is why most vendor demos never survive production.
The uncomfortable truth
Fluent is not the same as correct. The current generation of travel LLM wrappers fails exactly where travel buyers cannot afford a failure: at the seam between probability and inventory. A human travel agent who guesses availability is fired. An AI that guesses availability is praised for its tone until a customer arrives at an airport.
Every option below is a reasonable choice for some buyer. We are a consultancy, not a platform vendor, so the gaps column here is written the way we would write it for a client evaluation, including the gaps on our own offering.
| Option | What they actually ship | Where they fit | Real gap |
|---|---|---|---|
| Sabre + PayPal + Mindtrip | End-to-end agentic booking on Sabre Mosaic, 420+ airlines, 2M hotels, Mindtrip's 6.5M POI knowledge base, PayPal checkout | Consumer and leisure OTAs ready to distribute Sabre inventory on Sabre rails | Sabre-locked supply, no corporate policy layer, no NDC servicing story, no ISO 31030 traveler-safety instrumentation |
| Amadeus Cytric Easy + Microsoft Teams | Generative AI assistant inside Teams for Cytric customers, Accenture-built integrations, Microsoft is the reference deployment | Microsoft-native enterprises already on Cytric and Concur | Only reaches the Teams surface, only serves Cytric-contracted customers, thin for non-Microsoft business units |
| Google AI Mode + hotel brands | Direct-to-supplier booking via Gemini inside Search. Partners include Marriott, IHG, Booking.com, Expedia, Choice, Wyndham | Large hotel chains that want to skip OTAs and own the guest relationship | Disintermediates the OTA channel entirely. Not a path for TMCs or for OTAs protecting their own funnel |
| Navan (TripActions) | AI-native corporate travel platform, reports 73% touchless expense and policy violations cut from 35% to under 5% | Mid-market to enterprise buyers willing to rip-and-replace their TMC | Platform lock-in, enterprise pricing, limited flexibility for bespoke policy logic or non-standard GDS contracts |
| Kayak AI, Expedia Romie, Booking.com Smart Messenger | Consumer-facing chat concierges on their own inventory, iMessage and WhatsApp surfaces | Leisure consumers inside each brand's owned funnel | B2C only, not addressable for TMCs building their own corporate agent |
| Big 4 and global SIs (Accenture, Deloitte, Capgemini) | Advisory plus implementation on a platform partner stack, typically $2M to $10M multi-year engagements | Enterprises that need a single-throat-to-choke and the brand weight for a board deck | Platform allegiance skews the recommendation, senior expertise sits in the sales cycle, implementation staffed with junior consultants |
| Build in-house on LangGraph + Amadeus Self-Service | Open-source state machine framework, free tier GDS APIs, 10+ engineer team, 12 to 18 month effort | Companies with deep AI engineering benches and a tolerance for the learning curve | Self-Service Production specifically excludes Flight Create Orders, IATA or ARC ticketing still needed separately, no pre-built error recovery library |
| Veriprajna custom build | Deterministic state-machine agent with GDS + NDC dual-pipe, verification loop, corporate policy enforcement, EU AI Act transparency layer, PCI-scoped payment handoff | TMCs and mid-market OTAs that need to ship an agent without surrendering inventory strategy, buyer relationships, or regulatory posture | Not a managed SaaS (we build, you operate), not IATA-accredited (ticketing goes through your host), cannot fix ambiguous corporate travel policy docs |
Sources: Sabre press release Feb 12 2026, Skift Feb 11 2026 (Marriott), Amadeus and Accenture newsroom, navan.com 2026, developers.amadeus.com, OSU NLP Group TravelPlanner.
Three capability clusters, not one product. Most engagements combine two of them. We do not ship the same page-deck to every buyer, and the stack we reach for depends on your existing GDS contracts and your engineering bench, not on a platform preference.
01 · CORE BUILD
A LangGraph state machine with a Pydantic-typed state schema. The LLM handles natural-language extraction and formatting only. Every GDS call, every policy check, every payment handoff is hard-coded Python. We reach for LangGraph as the default because its checkpointing and time-travel debugging are mature, but if your stack already lives on AWS Bedrock AgentCore or Vertex AI Agent Builder we use those instead.
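To make the split concrete, here is a minimal sketch that uses only the Python standard library, not LangGraph or Pydantic, so it runs anywhere. Every name in it (`BookingState`, `extract_intent`, `NODES`) is hypothetical, the LLM call is stubbed with a fixed parse, and the GDS call is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class BookingState:
    user_text: str
    origin: Optional[str] = None
    destination: Optional[str] = None
    offers: list = field(default_factory=list)
    step: str = "extract"

def extract_intent(state: BookingState) -> BookingState:
    # The only place an LLM would be called. Stubbed here with a fixed parse.
    state.origin, state.destination = "YVR", "YYZ"
    state.step = "search"
    return state

def search(state: BookingState) -> BookingState:
    # Deterministic code path; a real build would call a GDS client here.
    if not (state.origin and state.destination):
        raise ValueError("search requires a validated origin and destination")
    state.offers = [{"id": "OFFER-1", "route": f"{state.origin}-{state.destination}"}]
    state.step = "done"
    return state

# Transitions live in a fixed table: the model never chooses the next node.
NODES: dict[str, Callable[[BookingState], BookingState]] = {
    "extract": extract_intent,
    "search": search,
}

def run(state: BookingState) -> BookingState:
    while state.step != "done":
        state = NODES[state.step](state)
    return state

final = run(BookingState(user_text="Vancouver to Toronto next Friday"))
print(final.offers)
```

The design choice to notice is that the LLM can only fill typed fields on the state. It has no way to skip a node, reorder nodes, or invent an offer.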
02 · GUARDRAIL
If you already shipped a vendor chatbot and your legal team just emailed you the Air Canada ruling, you do not need a rebuild. You need a guardrail. We ship a standalone verification API that sits between your current LLM wrapper and your surfaces. Before any hotel, rate, or PNR is shown to a user, the verification call confirms it against real inventory. No HK status, no surface.
03 · COMPLIANCE
Corporate travel policy as enforceable code, not as LLM prompt tricks. We ingest your policy doc, compile it into rule predicates, and make the agent physically incapable of presenting out-of-policy options. We also ship the EU AI Act transparency layer you need for the Aug 2, 2026 deadline.
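A sketch of what "policy as rule predicates" can look like. The two rules here (a fare cap and a minimum flight duration for business class) are invented for illustration and are not taken from any client policy; the type and function names are ours.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class FlightOption:
    cabin: str
    price_usd: float
    duration_hours: float

Rule = Callable[[FlightOption], bool]

# Hypothetical rules a policy document might compile to.
def max_fare(limit_usd: float) -> Rule:
    return lambda opt: opt.price_usd <= limit_usd

def business_only_over(hours: float) -> Rule:
    return lambda opt: opt.cabin != "business" or opt.duration_hours > hours

POLICY: list[Rule] = [max_fare(1500.0), business_only_over(6.0)]

def in_policy(options: list[FlightOption]) -> list[FlightOption]:
    """Filter before presentation, so out-of-policy options never reach the LLM."""
    return [o for o in options if all(rule(o) for rule in POLICY)]

options = [
    FlightOption("economy", 420.0, 5.0),
    FlightOption("business", 1400.0, 5.0),  # business on a short flight
    FlightOption("business", 1450.0, 9.0),
    FlightOption("economy", 1800.0, 9.0),   # over the fare cap
]
allowed = in_policy(options)
print(allowed)
```

Because the filter runs before the formatting step, the model has nothing out-of-policy to present, whatever the prompt says.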
These are the specific technical traps we see on every engagement. They are the reason demos look great and production burns money.
The IATA NDC Offer and Order specs are in relatively good shape. Post-Order servicing (exchanges, refunds, and schedule-change rebooking) is still messy. Many NDC-booked tickets cannot be exchanged through the same NDC pipe that created them and have to route back to GDS infrastructure or to a human queue. This is the gap that turns an elegant NDC demo into a $500 per-disruption operations bill when irregular operations hit your customer base.
Our production graphs separate the Offer-and-Order pipe from the servicing pipe explicitly. The agent knows, per carrier and per fare family, which actions it can attempt via NDC and which it must route to the GDS mid-office or escalate to a human. The routing table is code, not a prompt. When a carrier closes its GDS content pull, we update the table. When a new NDC aggregator ships better servicing coverage (Duffel and Verteil are in an arms race right now), we update the table. Your agent does not need to relearn anything.
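A routing table of this kind can be as plain as a dictionary. The sketch below is illustrative only: the carrier codes `XX` and `YY`, the fare families, and the entries themselves are placeholders, not real servicing coverage.

```python
from enum import Enum

class Pipe(Enum):
    NDC = "ndc"
    GDS = "gds"
    HUMAN = "human_queue"

# Keyed on (carrier, fare_family, action). Entries are invented for illustration.
ROUTING: dict[tuple[str, str, str], Pipe] = {
    ("XX", "basic", "order"): Pipe.NDC,
    ("XX", "basic", "refund"): Pipe.NDC,
    ("XX", "basic", "exchange"): Pipe.GDS,
    ("YY", "flex", "order"): Pipe.NDC,
}

def route(carrier: str, fare_family: str, action: str) -> Pipe:
    """Look up the pipe; anything not explicitly listed escalates to a human."""
    return ROUTING.get((carrier, fare_family, action), Pipe.HUMAN)

print(route("XX", "basic", "exchange"))  # Pipe.GDS
print(route("YY", "flex", "exchange"))   # Pipe.HUMAN
```

Defaulting unknown combinations to the human queue is the conservative choice: a missing row costs an agent touch, not a failed exchange.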
Reference: Business Travel News NDC coverage, IATA NDC Implementation Guide, Duffel and Verteil servicing matrices.
A chatty agent running four speculative searches per user turn against a $3.00 per-segment GDS will add $12 of search cost to every user turn, most of which never converts. On a 3 to 5% merchant-model margin, this is a direct hit to P&L. Lufthansa Group raised GDS booking fees again effective Jan 1, 2026. The economics are tightening, not loosening.
Three mechanisms fix this in production, and they must all be there. First, an in-memory result cache keyed on normalized origin-destination-date-pax tuples, with a TTL tuned to carrier volatility (international long-haul tolerates longer TTLs than low-cost domestic). Second, deferred search: the agent does not run a GDS query until the user has confirmed the filters it needs, even if that means one extra turn of dialogue. Third, a pre-ticketing re-verify call, because cached data will occasionally cause stale-price confirmations that become chargebacks. These are cheap mechanisms engineering-wise and they are the difference between an agent that works in production and one that gets pulled after the first quarterly budget review.
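The first mechanism, sketched under simplifying assumptions: a single-process in-memory cache, one TTL for all routes, and an injectable clock so the example is deterministic. The class and function names are ours. Deferred search and the pre-ticketing re-verify call are not shown.

```python
import time
from typing import Callable, Optional

SearchKey = tuple[str, str, str, int]

def normalize(origin: str, dest: str, date: str, pax: int) -> SearchKey:
    """Collapse cosmetic differences so equivalent searches share one key."""
    return (origin.strip().upper(), dest.strip().upper(), date.strip(), pax)

class SearchCache:
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: dict[SearchKey, tuple[float, list]] = {}

    def get(self, key: SearchKey) -> Optional[list]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, results = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: force a fresh GDS search
            return None
        return results

    def put(self, key: SearchKey, results: list) -> None:
        self._store[key] = (self.clock(), results)

# Fake clock so the example does not depend on wall time.
now = [0.0]
cache = SearchCache(ttl_seconds=300, clock=lambda: now[0])
key = normalize(" yvr", "yyz ", "2026-06-12", 2)
cache.put(key, [{"fare": 412.0}])
hit = cache.get(normalize("YVR", "YYZ", "2026-06-12", 2))  # served from cache
now[0] = 301.0
miss = cache.get(key)                                       # past the TTL
```

In a real build the TTL would vary by route or carrier, as described above, and a cache hit would still be re-priced before ticketing.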
Reference: Travel Market Report Jan 2026 (Lufthansa fee hike), D-EDGE 2026 GDS Consortia Guide, AltexSoft distribution costs analysis.
Here is the concrete failure: the agent successfully tickets the outbound flight via ARC settlement, then the hotel booking fails because the cached rate expired and the traveler has a hard check-in deadline. The demo version of an agent does not handle this. It either pretends the hotel succeeded or it leaves the user with a flight and no place to sleep. Both are Air Canada precedents waiting to happen.
The production answer is the Saga pattern: every forward step has a compensating action registered at the time it executes. If step N fails, the graph runs the compensating actions for steps 1 through N-1 in reverse order. For a flight-plus-hotel booking that means voiding the ticket within the 24-hour void window, or filing a refund request via ARC if a void is unavailable, plus a cancellation on any held hotel inventory, plus a user-facing explanation and an offer of alternate options. LangGraph's checkpointing makes this tractable because you can replay the compensating path as cleanly as the forward path. This is a mature pattern in distributed transactions. It is not well known in the travel AI community yet, and it is the single most important thing to get right before you put an agent in front of a customer.
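A minimal Saga executor, to show the shape of the pattern. It is a sketch, not our production graph: the step names are illustrative, the actions and compensations are no-op stubs, and it assumes compensations themselves never fail, which a real system cannot assume.

```python
from typing import Callable

# (name, forward action, compensating action)
Step = tuple[str, Callable[[], None], Callable[[], None]]

def run_saga(steps: list[Step], log: list[str]) -> bool:
    """Run steps in order; on failure, undo the completed ones in reverse."""
    done: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"ok:{name}")
            done.append(step)
        except Exception as exc:
            log.append(f"failed:{name}:{exc}")
            for prev_name, _, compensate in reversed(done):
                compensate()
                log.append(f"compensated:{prev_name}")
            return False
    return True

def fail_hotel() -> None:
    raise RuntimeError("cached rate expired")

log: list[str] = []
ok = run_saga(
    [
        ("hold_hotel", lambda: None, lambda: None),     # compensation: release the hold
        ("ticket_flight", lambda: None, lambda: None),  # compensation: void the ticket
        ("confirm_hotel", fail_hotel, lambda: None),
    ],
    log,
)
print(ok, log)
```

The log shows the ticket being voided before the hotel hold is released, the reverse of the order in which they were created.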
Reference: LangGraph Time Travel documentation, Temporal and Dagster Saga pattern literature, airline void-window rules (typically 24 hours from ticketing).
We are small. Engagements are staffed with senior engineers who stay on the work from discovery to handover. There is no junior-consultant layer.
PHASE 1 · DISCOVERY
We map your current surfaces, contracts, GDS mix, NDC exposure, IATA/ARC status, payment rails, and EU/UK regulatory footprint. Output is a written posture memo your legal team can take to the steering committee. Two to three weeks.
PHASE 2 · ARCHITECTURE
We design the specific graph for your use case (corporate booking, OTA leisure, IROPS rebooking, or a guardrail retrofit). Every node, every compensating action, every error path is written down before a single line of LangGraph is committed. Three to four weeks.
PHASE 3 · BUILD
We build against your GDS sandbox, wire the NDC pipes, implement the policy compiler, and exercise the graph through adversarial test scenarios (hallucinated entity tests, Saga rollback drills, L2B ratio stress tests). Eight to twelve weeks depending on scope.
PHASE 4 · HANDOVER
Your team operates the agent after we leave. We document the graph, the error mappings, the policy-rule grammar, and the escalation runbooks. We train your engineers on LangSmith observability and the replay workflow. Two to three weeks.
A typical guardrail-only engagement (Capability 02) runs four to six weeks. A full core build with NDC dual-pipe runs four to six months. Numbers you see on Big 4 proposals (12 to 24 months) reflect the overhead of a different delivery model, not the work itself.
Seven questions, one honest answer. Scores your current posture against the prerequisites for an agentic deployment and gives you specific next actions, whether or not you ever call us. Use it as a conversation tool with your legal team or your steering committee.
1. What percentage of your bookings currently go through a GDS (Amadeus, Sabre, Travelport) versus NDC direct versus supplier direct?
2. What is your current touchless booking rate (bookings completed without human agent intervention)?
3. Who is the merchant of record on bookings, and how is payment handled today?
4. IATA or ARC accreditation status?
5. Current AI layer in production?
6. Corporate travel policy: is it a living document with enforceable rules?
7. EU or UK exposure (travelers, offices, or end customers)?
This is a diagnostic tool. It is not a lead-gen form. Your answers stay in your browser.
Pulled verbatim from pre-engagement calls with TMC operations leads and OTA product leads in 2025 and 2026. Answers add depth beyond what is in the main sections.
The only reliable fix is architectural. Wrap the LLM in a verification loop that refuses to surface any property, price, or PNR unless it has been confirmed against real inventory with a holding-confirmed status code. Concretely: the LLM parses intent and formats output, but never invents supply. Every hotel name, rate, and availability statement routes through a deterministic call to Amadeus Hospitality, Sabre CSL, or a direct hotel CRS, and the result must match on property ID plus rate code before the agent is allowed to say it out loud. If the verification call fails, the agent returns an honest I-could-not-confirm response instead of a fabrication. This is not prompt engineering. It is a hard-coded guardrail around a probabilistic component.
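In code, the guardrail reduces to a lookup that must succeed before anything is rendered. The sketch below assumes a hypothetical in-memory `INVENTORY` table standing in for a live call to a hotel source; the property IDs, rate code, and function names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Candidate:
    property_id: str
    rate_code: str
    name: str

# Stand-in for a live inventory lookup; keys are (property_id, rate_code).
INVENTORY = {("PROP-001", "BAR"): "HK"}

def verify(candidate: Candidate) -> Optional[Candidate]:
    """Return the candidate only if inventory confirms it with status HK."""
    status = INVENTORY.get((candidate.property_id, candidate.rate_code))
    return candidate if status == "HK" else None

def surface(candidate: Candidate) -> str:
    confirmed = verify(candidate)
    if confirmed is None:
        return "I could not confirm that property and rate against live inventory."
    return f"Confirmed: {confirmed.name} ({confirmed.property_id}, {confirmed.rate_code})"

print(surface(Candidate("PROP-001", "BAR", "Example Lodge")))
print(surface(Candidate("PROP-999", "BAR", "Invented Eco-Lodge")))
```

A blended, fictional property has no property ID in inventory, so it fails the lookup and never reaches the user.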
It depends on what you are. If you are an OTA with no GDS lock-in preference and your strategy is to distribute Sabre inventory on Sabre rails, then Sabre plus Mindtrip is probably the right answer and we will tell you so. If you are a TMC with corporate policy obligations, multi-GDS supply, NDC exposure, and an existing mid/back office on Concur or Cytric, the Sabre plus Mindtrip stack does not fit. It is consumer-first, Sabre-locked, and has no corporate policy layer or ISO 31030 duty-of-care instrumentation. Our build gives you the same agentic front-end without surrendering your inventory strategy or your buyer relationship.
Moffatt v. Air Canada, decided Feb 14, 2024 at the BC Civil Resolution Tribunal, held the airline responsible for a bereavement-fare policy its chatbot invented. Air Canada argued the chatbot was a separate legal entity. The tribunal rejected that defense outright. The practical consequence for any TMC or OTA deploying a customer-facing travel agent: the company bears the full legal weight of every statement the agent makes, whether it came from a vendor LLM, a fine-tuned model, or a wrapper your team shipped last week. The defense of it-was-the-AI does not work. This is why our engagements always start with a liability-posture review before any line of code is written, and why the verification loop and audit trail are non-negotiable in the architecture.
The honest answer is that you cannot fully close it yet, and any vendor who says otherwise is selling you something. The IATA NDC Offer and Order specs are in better shape than the post-Order servicing flows, which is why exchanges, refunds, and irregular-operations rebooking still leak back to GDS infrastructure or to agent queues. What we build is a dual-pipe agent: Offer and Order go through Verteil or Duffel for NDC content; servicing routes are wired to your GDS host for the EDIFACT fallback. The agent knows which pipe to use per carrier, logs every handoff, and escalates cleanly to a human queue for the carriers with the worst servicing coverage. You do not get a perfect solution. You get a solution that degrades gracefully and does not leave travelers stranded during IROPS.
Probably not, and we will say so in the first call. Cytric Easy embedded in Microsoft Teams with the Accenture-built Copilot integrations is a sensible default if your corporate travel already runs on Cytric and your workforce lives in Teams. Where we help is the gap cases: you have non-Microsoft business units, you need policy enforcement that Cytric does not cover, you have a second GDS or direct-supplier relationships Cytric does not reach, or you have regulated markets where the Aug 2, 2026 EU AI Act transparency obligations require documentation Cytric does not yet emit. If none of those apply, buy Cytric Easy and skip the consulting engagement.
It gets worse unless you design around it. Agentic workflows do multi-turn refinement, which means multiple speculative searches per booking. GDS providers charge for searches, not just bookings, and they enforce look-to-book ratios (commonly a warning above 250:1 and commercial penalties above 1000:1 depending on your contract). Lufthansa Group hiked GDS booking fees again effective Jan 1, 2026. The unit economics will break if you do not cache. Our production graphs use three mechanisms: an in-memory result cache keyed on normalized origin-destination-date-pax tuples with a TTL tuned to carrier volatility; deferred search for filters the user has not confirmed yet; and a pre-ticketing re-verify call so the cache does not cause stale-price confirmations. Without these, a chatty agent will burn through your GDS budget in a quarter.
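A look-to-book monitor is a few lines. The thresholds below are the commonly quoted 250:1 and 1000:1 figures mentioned above; your contract may differ, and the function names are ours.

```python
def look_to_book(searches: int, bookings: int) -> float:
    """Searches per booking; infinite when nothing has converted yet."""
    return float("inf") if bookings == 0 else searches / bookings

def l2b_status(searches: int, bookings: int,
               warn_at: float = 250.0, penalty_at: float = 1000.0) -> str:
    ratio = look_to_book(searches, bookings)
    if ratio > penalty_at:
        return "penalty"
    if ratio > warn_at:
        return "warning"
    return "ok"

print(l2b_status(12_000, 60))  # 200:1 -> ok
print(l2b_status(30_000, 60))  # 500:1 -> warning
print(l2b_status(90_000, 60))  # 1500:1 -> penalty
```

Wiring this into the graph as a pre-search check lets the agent fall back to cached results or ask a clarifying question before it crosses a contractual threshold.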
We do not hold IATA or ARC accreditation. Ticketing authority is a heavy regulatory burden and we have no interest in becoming a travel agency. The architecture we ship integrates with your accredited host for ticket issuance: the agent prepares the PNR, validates fare rules and policy, routes payment to your PSP, and then hands off to your existing ARC or IATA settlement pipe. If you do not yet have accreditation and you need it, ARC runs about 25 days after prerequisites; full IATA can run 6 to 12 months. We will tell you that in week one of discovery so it does not ambush you in month four.
The chat surface never touches card data. This is a hard rule in every graph we ship. When the agent reaches the payment step, it hands off to a PCI-scoped component: your existing PSP, or a tokenization vault like Very Good Security or Checkout.com, depending on your stack. The agent receives a token back, attaches it to the PNR, and the human authorizes via a conventional payment button or 3DS2 flow. This keeps the LLM and the conversational state store entirely out of PCI scope, which is both a compliance win and a chargeback-dispute win. Agentic commerce protocols are still maturing here and we track OpenAI, Stripe, and Adyen updates quarterly because the right pattern this month may not be the right pattern next quarter.
Three things, minimum. First, Article 50 disclosure in the agent UX: users must be told they are interacting with an AI system, in language they can understand, before any substantive exchange. Not buried in a privacy policy. Second, a log-and-explain audit trail: for any decision the agent makes that affects a traveler, you need a retrievable record of the inputs, the reasoning path, and the outputs. LLM-generated text alone is not an audit trail. Third, a high-risk self-assessment against Article 6 guidance published Feb 2, 2026. Most travel booking agents will land outside Annex III high-risk classification, but if your agent touches employment, creditworthiness, or critical infrastructure adjacencies, the answer changes. We build the disclosure UX, the audit event schema, and the self-assessment documentation as standard deliverables in any EU-exposed engagement.
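One possible shape for an audit event, as a sketch. The field names are our assumptions for illustration, not a schema prescribed by the AI Act; the point is that the record captures the inputs, the deterministic rules that fired, and the outputs, not LLM-generated prose.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class AuditEvent:
    session_id: str
    node: str                  # which graph node made the decision
    inputs: dict
    rule_ids: list             # deterministic rules that fired, in order
    outputs: dict
    ai_disclosure_shown: bool  # whether the user had been told this is an AI system
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

def to_log_line(event: AuditEvent) -> str:
    """Serialize to one JSON line for an append-only log."""
    return json.dumps(asdict(event), sort_keys=True)

event = AuditEvent(
    session_id="s-123",
    node="policy_check",
    inputs={"fare_usd": 1800, "cabin": "economy"},
    rule_ids=["max_fare_1500"],
    outputs={"allowed": False},
    ai_disclosure_shown=True,
)
line = to_log_line(event)
print(line)
```

Because each event names the rule that produced the outcome, "why was this option not shown?" can be answered from the log without re-running the model.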
You can, and if you want a single-throat-to-choke for a $2M to $10M multi-year program, a Big 4 or global SI is the traditional answer. Two practical differences. First, most global SIs have platform partnerships: Accenture built the Cytric Easy Copilot integration with Amadeus, so an Accenture engagement will gravitate toward an Amadeus-centric answer regardless of whether that is optimal for your inventory mix. We have no platform allegiance and will recommend the stack that fits your buyer and your margins. Second, these engagements typically staff junior consultants for implementation; the senior expertise is in the sales cycle and the steering committee. We staff the same senior engineer across the engagement because the team is small. You get depth faster and pay less. What you give up is the weight of a brand name in your board deck.
The interactive whitepapers below are the long-form research this page is built on. Both are from the Veriprajna travel series.
Why LLM wrappers fail travel logistics, the Orchestrator-Worker pattern, and the specific GDS integration patterns (Amadeus, Sabre) that make verified bookings possible.
The control-flow argument: why LangGraph state machines beat prompt chains, the TravelPlanner benchmark breakdown, and a node-by-node walkthrough of a production flight-booking graph.
Q2 2026 is when the major platforms go live. The window to build a differentiated agentic travel layer without the Sabre-Mindtrip or Navan lock-in is open, and it closes fast.
Start with a two-week liability and readiness review. You walk away with a written posture memo whether or not we move forward together.