Technical Deep Dive

100% Cloudflare-native.
Zero origin servers.

No AWS. No Vercel. No containers. Every component — inference, storage, sessions, queues — runs on Cloudflare's global edge across 300+ points of presence. Here's exactly how it works.

Platform Architecture

Every request enters through Cloudflare Workers and stays on the edge. No round-trips to a centralised cloud region.

Ingress Channels
Twilio Voice (voice calls) · Resend / Email Routing (inbound email) · Twilio SMS (text messages) · Web Chat (browser widget)
↓ Webhooks
Worker Layer
Channel Router (Hono middleware) · AI Gateway (cache / rate-limit / fallback) · Agent Orchestrator (Durable Object)
↓ Function Calling
AI + Tool Chain
Kimi K2.6 (LLM inference) · Whisper v3 (STT) · ElevenLabs (TTS) · check_availability, quote_price, book_slip (tools)
↓ Read / Write
Storage Layer
D1 (relational, SQLite) · KV (config & rate cards) · R2 (contracts & recordings) · Vectorize (RAG over policies) · Queues (async tasks)
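
As a rough sketch of the Channel Router in Hono, each ingress channel gets its own webhook route on the Worker (route paths, the Media Stream URL, and the stub responses below are illustrative assumptions, not the production implementation):

import { Hono } from 'hono'

const app = new Hono()

// Voice: answer the call and ask Twilio to open a Media Stream
// WebSocket back to this Worker (URL is a placeholder).
app.post('/webhooks/twilio/voice', (c) => {
  const twiml =
    '<Response><Connect><Stream url="wss://example.invalid/voice/stream" /></Connect></Response>'
  return c.text(twiml, 200, { 'Content-Type': 'text/xml' })
})

// SMS, email, and web chat land on their own routes and are handed
// to the same Agent Orchestrator downstream.
app.post('/webhooks/twilio/sms', (c) => c.text('ok'))
app.post('/webhooks/email', (c) => c.text('ok'))
app.post('/api/chat', (c) => c.json({ reply: 'pending' }))

export default app
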
Workers AI
Inference

Kimi K2.6 inference — MoE model with native function-calling. Runs on Cloudflare GPUs at the edge with zero cold starts.

D1 Database
Relational

SQLite at the edge. 7 tables with multi-tenant indexes, foreign keys, and CHECK constraints. Global replication.

Durable Objects
Sessions

Stateful agent sessions per (marina_id, conversation_id). Maintains context across multi-turn voice conversations.
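
A minimal sketch of what such a session object could look like (class name, binding, and stored shape are assumptions for illustration):

// One Durable Object instance per conversation; the Workers runtime
// routes every turn for that conversation to the same instance.
export class AgentSession {
  constructor(private state: DurableObjectState) {}

  async fetch(request: Request): Promise<Response> {
    // Append the new turn to the stored history and return the context.
    const turns = (await this.state.storage.get<string[]>('turns')) ?? []
    const body = (await request.json()) as { text: string }
    turns.push(body.text)
    await this.state.storage.put('turns', turns)
    return Response.json({ turns })
  }
}

// Caller side: deriving the object ID from (marina_id, conversation_id)
// guarantees the same conversation always reaches the same state.
// const id = env.AGENT_SESSION.idFromName(`${marinaId}:${conversationId}`)
// const stub = env.AGENT_SESSION.get(id)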

Workers KV
Config

Globally replicated key-value store for hot-path data: rate cards, agent configs. Sub-ms reads from 300+ PoPs.

R2 Storage
Files

S3-compatible object storage for contracts (PDF), call recordings, and email attachments. Zero egress fees.

Vectorize
RAG

Vector database for RAG over marina policies, FAQ docs, and historical interactions. Enables policy citation.

AI Gateway
Gateway

Sits in front of all LLM calls. Response caching, rate limiting, cost tracking, and automatic fallback routing.

Queues
Async

Async task processing: DockMaster sync, email dispatch, Slack notifications, contract generation.
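
A hedged sketch of one such task flowing through Queues (the binding, message shape, and handler names are assumptions):

// Producer: enqueue the DockMaster sync instead of blocking the
// conversation on a third-party API call.
type SyncMessage = { kind: 'dockmaster_sync'; marinaId: string; bookingId: string }

export async function enqueuePmsSync(queue: Queue<SyncMessage>, marinaId: string, bookingId: string) {
  await queue.send({ kind: 'dockmaster_sync', marinaId, bookingId })
}

// Consumer: the Worker's queue handler processes batches out of band,
// acking successes and retrying failures individually.
export const consumer = {
  async queue(batch: MessageBatch<SyncMessage>) {
    for (const msg of batch.messages) {
      try {
        // ...call DockMaster, flip dm_synced in D1, post to Slack...
        msg.ack()
      } catch {
        msg.retry()
      }
    }
  },
}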

AI Stack

Three models working together: one thinks, one listens, one speaks.

Kimi K2.6

@cf/moonshotai/kimi-k2.6

Primary reasoning model. MoE architecture activates only relevant expert sub-networks per token, keeping inference cost low at high volume.

Architecture: MoE — 1T total, ~32B active
Context: 128K tokens
Function calling: Native (not prompt-injected)
Temperature: 0.2 (low creativity for reliability)
Tool chain: 5 tools in booking sequence
Fallback: Llama 3.3 70B via AI Gateway
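
In Worker terms, the primary/fallback call might look roughly like this, routed through AI Gateway (the binding is typed loosely here, and the gateway name and fallback model ID are assumptions):

// Loosely typed Workers AI binding so the sketch stands alone.
type AiBinding = {
  run(model: string, inputs: Record<string, unknown>, options?: Record<string, unknown>): Promise<unknown>
}

type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string }

export async function reason(ai: AiBinding, messages: ChatMessage[]) {
  // Route through AI Gateway for caching, rate limits, and cost tracking.
  const options = { gateway: { id: 'harbourmaster-gateway' } }

  try {
    // Primary: Kimi K2.6 at low temperature for reliable tool use.
    return await ai.run('@cf/moonshotai/kimi-k2.6', { messages, temperature: 0.2 }, options)
  } catch {
    // Fallback: Llama 3.3 70B when the primary model is unavailable.
    return await ai.run('@cf/meta/llama-3.3-70b-instruct-fp8-fast', { messages, temperature: 0.2 }, options)
  }
}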

Whisper Large v3 Turbo

@cf/openai/whisper-large-v3-turbo

Speech-to-text for all voice interactions. Processes Twilio Media Stream audio chunks in near real-time with speaker diarisation.

Latency: < 300ms per chunk
Languages: 99+ (English primary)
Input: Twilio Media Streams (mulaw/8kHz)
Features: Timestamps, confidence scores
Runtime: Cloudflare Workers AI GPUs

ElevenLabs TTS

Primary + Workers AI fallback

Text-to-speech for voice responses. ElevenLabs for production-quality voices; Workers AI MeloTTS as a low-latency fallback that stays on the edge.

Primary: ElevenLabs (configurable voice)
Fallback: Workers AI MeloTTS
Latency: < 200ms first byte
Output: PCM/mulaw streamed to Twilio
Voice IDs: Per-marina configurable

5-Tool Booking Chain

Kimi K2.6 calls these tools via native function-calling. The model receives JSON-schema definitions and returns structured tool_call objects.

1. check_availability
Queries D1 for matching slips with date-overlap exclusion and vessel dimension filtering.

2. quote_price
Deterministic pricing engine: base × season × DOW × occupancy × events. Never LLM-generated.

3. draft_contract
Generates PDF rental agreement from template, stores in R2, returns DocuSign e-sign link.

4. take_payment
Creates Stripe Checkout session with booking amount. Returns payment link to guest.

5. book_slip
Writes confirmed booking to D1, syncs to DockMaster PMS via API, logs audit event.

Escape hatch: escalate_to_human

A 6th tool the agent can call at any point to route the conversation to a human. Triggered automatically when confidence drops below threshold, dollar cap is exceeded, or max turns is reached.
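
As a sketch of the shapes involved, a couple of the tool definitions and the Worker-side dispatch might look like this (parameter schemas are abbreviated and the dispatch helper is an assumption):

// JSON-schema style definitions the model receives. Two of the six
// tools shown; parameters trimmed for brevity.
const tools = [
  {
    name: 'check_availability',
    description: 'Find open slips for a date range and vessel size',
    parameters: {
      type: 'object',
      properties: {
        start_date: { type: 'string' },
        end_date: { type: 'string' },
        length_ft: { type: 'number' },
      },
      required: ['start_date', 'end_date', 'length_ft'],
    },
  },
  {
    name: 'escalate_to_human',
    description: 'Hand the conversation to marina staff',
    parameters: { type: 'object', properties: { reason: { type: 'string' } } },
  },
]

// The model returns structured tool calls; the Worker executes them
// and feeds the results back into the next turn.
type ToolCall = { name: string; arguments: Record<string, unknown> }

async function dispatch(call: ToolCall): Promise<unknown> {
  switch (call.name) {
    case 'check_availability':
      return { slips: [] } // would query D1 here
    case 'escalate_to_human':
      return { escalated: true } // would notify staff via Slack
    default:
      throw new Error(`Unknown tool: ${call.name}`)
  }
}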

Data Layer — D1 Schema

7 tables, all scoped by marina_id for strict multi-tenant isolation.

marinas: Tenant root table. One row per marina property.
Columns: id TEXT PK, name TEXT, timezone TEXT, address TEXT, lat REAL / lng REAL, total_slips INTEGER, dm_api_endpoint TEXT

slips: Physical slip inventory with dimensions and amenities.
Columns: id TEXT PK, marina_id TEXT FK, slip_no TEXT, dock_section TEXT, length_ft / beam_ft / depth_ft, has_power_30a / has_power_50a, has_water / has_wifi, status ENUM

rate_cards: Pricing configuration with JSON curve definitions.
Columns: id TEXT PK, marina_id TEXT FK, base_rate_json, season_curve_json, dow_curve_json, event_premiums_json, occupancy_curve_json, cancellation_policy_json

agent_configs: Per-marina AI agent personality and guardrails.
Columns: marina_id TEXT FK, system_prompt TEXT, greeting_message TEXT, voice_id TEXT, dollar_cap_per_booking INT, confidence_threshold REAL, max_turns_before_escalation INT, escalation_rules_json

inquiries: Every inbound interaction across all channels.
Columns: id TEXT PK, marina_id TEXT FK, channel ENUM, caller_info TEXT, transcript_text TEXT, confidence_score REAL, status ENUM, assigned_to TEXT

bookings: Confirmed reservations with PMS sync status.
Columns: id TEXT PK, marina_id TEXT FK, slip_id TEXT FK, inquiry_id TEXT FK, guest_name / guest_email / vessel_name, start_ts / end_ts DATETIME, price_cents INT, dm_synced BOOLEAN, agent_attributed BOOLEAN

events: Full audit trail — every action the agent takes.
Columns: id INTEGER PK AUTOINCREMENT, marina_id TEXT FK, inquiry_id TEXT FK, event_type TEXT, actor TEXT, detail_json TEXT, ts DATETIME

Multi-Tenancy Rule

Every D1 table, KV key prefix, Vectorize namespace, R2 bucket prefix, and Durable Object ID includes marina_id as a scoping dimension. Zero cross-tenant data leakage by design.
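
For example, a check_availability-style lookup would bind marina_id on every table it touches and exclude overlapping bookings (column names follow the schema above; the status value and exact query are illustrative assumptions):

// A slip qualifies only if no booking for the same marina and slip
// overlaps the requested [start, end) window.
export async function findOpenSlips(
  db: D1Database,
  marinaId: string,
  startTs: string,
  endTs: string,
  lengthFt: number
) {
  return db
    .prepare(
      `SELECT s.id, s.slip_no, s.dock_section
         FROM slips s
        WHERE s.marina_id = ?
          AND s.status = 'available'
          AND s.length_ft >= ?
          AND NOT EXISTS (
                SELECT 1
                  FROM bookings b
                 WHERE b.marina_id = ?
                   AND b.slip_id = s.id
                   AND b.start_ts < ?  -- existing stay begins before the requested end
                   AND b.end_ts > ?    -- and ends after the requested start: overlap
              )`
    )
    .bind(marinaId, lengthFt, marinaId, endTs, startTs)
    .all()
}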

Deterministic Pricing Engine

The LLM never generates prices. Every dollar amount comes from this formula, executed deterministically on the Worker.

// Final price calculation
total = base_rate × vessel_length × nights
        × season_multiplier
        × avg(dow_multipliers)
        × occupancy_multiplier
        × event_premium
        + add_ons
Season Curve: Peak (Dec-Mar) 1.55× · Shoulder (Apr-May, Oct-Nov) 1.20× · Off-Peak (Jun-Sep) 0.80×
Day of Week: Mon–Wed 1.00× · Thu 1.05× · Fri 1.15× · Sat 1.25× · Sun 1.10×
Occupancy: > 90% 1.35× · > 80% 1.15× · > 70% 1.00× · < 60% 0.85× (fill incentive)
Events: FLIBS (Oct) 2.00× · Winterfest Parade (Dec) 1.50× · July 4th 1.40×
Base Rates: Nightly $2.75/ft · Weekly $16.50/ft · Monthly $45.00/ft
Add-Ons: 30A power $15/night · 50A power $25/night · WiFi $8 · Pump-out $50

Why deterministic?

LLMs are great at conversation but unreliable at arithmetic. A hallucinated price creates legal liability and erodes guest trust. By running pricing as a pure function on the Worker, the agent can confidently quote exact rates that match your published rate card.
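
A condensed sketch of that pure function, using the multipliers listed above (the input shape is an assumption; real rate-card values come from KV/D1):

type QuoteInput = {
  baseRatePerFt: number        // e.g. 2.75 for nightly
  vesselLengthFt: number
  nights: number
  seasonMultiplier: number     // 0.80 to 1.55 from the season curve
  dowMultipliers: number[]     // one entry per night, 1.00 to 1.25
  occupancyMultiplier: number  // 0.85 to 1.35
  eventPremium: number         // 1.00 when no event applies
  addOns: number               // power, WiFi, pump-out, in dollars
}

// Pure function: the same inputs always produce the same price, so a
// quoted rate always matches the published rate card.
export function quotePrice(q: QuoteInput): number {
  const avgDow = q.dowMultipliers.reduce((sum, m) => sum + m, 0) / q.dowMultipliers.length
  const total =
    q.baseRatePerFt * q.vesselLengthFt * q.nights *
    q.seasonMultiplier * avgDow * q.occupancyMultiplier * q.eventPremium +
    q.addOns
  return Math.round(total * 100) / 100 // round to cents
}

// 40 ft vessel, 3 peak-season weeknights, ~75% occupancy, no event:
// 2.75 × 40 × 3 × 1.55 × 1.00 × 1.00 × 1.00 + 0 = $511.50
quotePrice({
  baseRatePerFt: 2.75, vesselLengthFt: 40, nights: 3,
  seasonMultiplier: 1.55, dowMultipliers: [1.0, 1.0, 1.0],
  occupancyMultiplier: 1.0, eventPremium: 1.0, addOns: 0,
})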

Guardrails & Security

Production AI needs more than vibes. These are hard constraints, not suggestions.

Agent Guardrails

Dollar cap — Max booking value before auto-escalation. Default: $15,000.
Confidence threshold — Below this score, the agent escalates. Default: 0.75.
Max turns — Turn limit before forcing human handoff. Default: 20.
Deterministic pricing — Prices always from rate card function, never LLM-generated.
Double-booking lock — D1 query checks date overlaps before any reservation.
Out-of-policy detection — Liveaboards, groups, insurance — auto-routed to staff.
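
A minimal sketch of how these caps could be checked before any reply reaches the guest (the config mirrors the agent_configs fields and defaults above; the function itself is an assumption):

type GuardrailConfig = {
  dollarCapPerBooking: number      // default 15_000 (USD)
  confidenceThreshold: number      // default 0.75
  maxTurnsBeforeEscalation: number // default 20
}

type TurnState = { quoteCents: number; confidence: number; turns: number }

// Returns the reason to escalate, or null if the agent may proceed.
export function shouldEscalate(cfg: GuardrailConfig, s: TurnState): string | null {
  if (s.quoteCents > cfg.dollarCapPerBooking * 100) return 'dollar_cap_exceeded'
  if (s.confidence < cfg.confidenceThreshold) return 'low_confidence'
  if (s.turns >= cfg.maxTurnsBeforeEscalation) return 'max_turns_reached'
  return null
}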

Infrastructure Security

Google OAuth SSO — OAuth 2.0 sign-in with JWT session cookies on all dashboard routes.
API tokens as secrets — Stripe, Twilio, ElevenLabs keys stored as Cloudflare Secrets.
Tenant isolation — All data paths include marina_id. No shared-namespace leaks.
AI Gateway — Rate-limits LLM calls per tenant. Prevents runaway token spend.
Audit trail — Every agent action logged with actor, timestamp, and detail JSON.
Data residency — D1 replication stays within configured jurisdiction.

Voice Pipeline — End to End

From phone ring to spoken response in under 1.5 seconds.

Twilio (~100ms): Guest calls the marina number. Twilio opens a Media Stream WebSocket.
Worker (~5ms): Channel Router receives audio chunks via WebSocket on the Cloudflare edge.
Whisper (~300ms): Audio chunks transcribed to text in near real-time. Partial results streamed.
Kimi K2.6 (~600ms): Transcript → agent reasoning → tool calls → response text.
ElevenLabs (~200ms): Response text → natural speech audio. First byte in < 200ms.
Twilio (~100ms): Audio streamed back to the caller via Media Stream.
Total roundtrip: < 1.5 seconds (vs. 8-15s for typical IVR + hold queue)
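
On the Worker side, the hot path is roughly: accept Twilio's Media Stream WebSocket, buffer inbound audio frames for transcription, and stream synthesized audio back over the same socket. A heavily simplified sketch (frame handling and function names are assumptions):

// Upgrade Twilio's Media Stream request to a WebSocket pair at the edge.
export function handleMediaStream(request: Request): Response {
  const pair = new WebSocketPair()
  const client = pair[0]
  const server = pair[1]
  server.accept()

  const inboundAudio: string[] = [] // base64 mulaw payloads from the caller

  server.addEventListener('message', (event) => {
    const frame = JSON.parse(event.data as string)

    if (frame.event === 'media') {
      // Caller audio chunk: buffer it for Whisper transcription.
      inboundAudio.push(frame.media.payload)
    }

    if (frame.event === 'stop') {
      server.close()
    }
  })

  // Outbound direction (not shown): TTS audio is written back as
  // { event: 'media', media: { payload: <base64 mulaw> } } frames.

  return new Response(null, { status: 101, webSocket: client })
}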

Full Stack Reference

Everything that powers Harbourmaster AI, in one table.

Layer · Technology · Purpose
Framework · Hono 4 · Lightweight, fast web framework for Workers
Build · Vite + @hono/vite-build · SSR bundle for Cloudflare Pages
Runtime · Cloudflare Workers · V8 isolates at 300+ global PoPs
LLM · Kimi K2.6 (MoE) · Reasoning + native function-calling
STT · Whisper Large v3 Turbo · Real-time speech transcription
TTS · ElevenLabs / MeloTTS · Natural voice synthesis
Database · Cloudflare D1 (SQLite) · Relational data, multi-tenant
KV Store · Cloudflare Workers KV · Config, rate cards, session cache
Object Storage · Cloudflare R2 · Contracts, recordings, attachments
Vector DB · Cloudflare Vectorize · RAG over marina policies
Sessions · Durable Objects · Stateful multi-turn agent sessions
Gateway · Cloudflare AI Gateway · LLM caching, rate limits, fallback
Queues · Cloudflare Queues · Async PMS sync, notifications
Auth · Google OAuth 2.0 + JWT · SSO for dashboard with session cookies
Voice · Twilio Media Streams · Telephony ingress/egress
Email · Resend + CF Email Routing · Inbound/outbound email
SMS · Twilio Messaging · Text message channel
Payments · Stripe Checkout · Guest payment collection
Contracts · DocuSign · E-signature for rental agreements
PMS · DockMaster API · Property management sync
Alerts · Slack API · Staff notifications & escalations
Frontend · Tailwind CSS + Space Grotesk · Utility-first styling, Abyssal Intelligence theme
TypeScript · ES2022 target · Type-safe Workers code

Want to kick the tyres?

Open the dashboard, try the live chat, explore the API. Everything's running.