Teaching my chatbot what's actually true

Jacob · 2 min read
  • Chatbot
  • Hallucinations
  • Fact Cards
  • Builder Log

The chatbot on Thetasimplified was articulate, confident, and wrong. A few things it told users:

  • Block time is "around 2 seconds" (it's ~6 seconds)
  • "50% of TFUEL paid to Edge Network nodes is burned" (it's 25%)
  • THETA is inflationary alongside TFUEL (it's hard-capped, never minted)
  • Specific EdgeCloud pricing like "850 TFUEL/hour for a regular node" (operator-set, no fixed rates exist)
  • Made-up Main Chain Activity Index weights ("40% transactions, 30% TFUEL, 20% nodes")
  • POGS excluded from the Metachain Index "because it's a community-issued token" (actually because the chain has been offline >30 days)
  • Absorption rate computed as "burned tokens / minted tokens from on-chain logs" (real method is supply-delta with rolling averages)

Each hallucination sounded right. None of them were.

The fix: a fact card

Correcting individual answers is whack-a-mole. I built a single TypeScript file — theta-facts.ts — with every protocol constant the bot must never get wrong: block time, block reward, daily TFUEL issuance, burn percentage, THETA total supply, validator architecture, node tier stake requirements, index weights, methodology details. Each entry has a value, a source, and an explicit "DO NOT" rule next to it (DO NOT describe THETA as inflationary, DO NOT quote a 50% burn rate).
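Roughly, each entry looks like this. This is a trimmed sketch of the shape rather than the actual file, and the field names are a reconstruction:

```ts
// theta-facts.ts (sketch) — protocol constants the bot must never get wrong.

export interface ProtocolFact {
  value: string;   // the canonical value, stated the way the bot should say it
  source: string;  // where the value comes from
  doNot: string[]; // explicit rules paired with the fact
}

export const THETA_FACTS: Record<string, ProtocolFact> = {
  blockTime: {
    value: "~6 seconds per block",
    source: "Theta protocol docs",
    doNot: ["DO NOT quote a ~2 second block time"],
  },
  tfuelBurn: {
    value: "25% of TFUEL paid to Edge Network nodes is burned",
    source: "Theta protocol docs",
    doNot: ["DO NOT quote a 50% burn rate"],
  },
  thetaSupply: {
    value: "THETA supply is hard-capped; no new THETA is ever minted",
    source: "Theta protocol docs",
    doNot: ["DO NOT describe THETA as inflationary"],
  },
};
```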

The fact card gets injected into the bot's system prompt on every request. It doesn't replace the model — it gives the model something solid to anchor to before it generates.
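The injection itself is plain string assembly. Something along these lines, where the function names are illustrative:

```ts
import { THETA_FACTS } from "./theta-facts";

// Flatten the fact card into prompt text: one line per fact,
// with its source and its DO NOT rules attached.
function renderFactCard(): string {
  return Object.values(THETA_FACTS)
    .map((f) => `- ${f.value} (source: ${f.source}). ${f.doNot.join("; ")}.`)
    .join("\n");
}

// Prepend the card to the base system prompt so every request
// carries the ground truth, whatever the user asks.
function buildSystemPrompt(basePrompt: string): string {
  return `${basePrompt}\n\nFACTS YOU MUST NOT CONTRADICT:\n${renderFactCard()}`;
}
```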

Cite our own data

When a user asked about index methodology, the bot would explain "blockchain activity scoring" generically and link to thetatoken.org. But our methodology is on /methodology, with the exact weights I spent days getting right. New rule: when we have authoritative data on the site, cite that. Don't punt to external sources when the answer is right here.
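One way to encode that rule is a small topic-to-page map appended to the same system prompt. A sketch, with hypothetical map entries and helper name:

```ts
// Topics the site covers authoritatively, mapped to internal pages.
const INTERNAL_SOURCES: Record<string, string> = {
  "index methodology": "/methodology",
  "EdgeCloud usage": "/use-edgecloud",
};

// Render the citation rule as prompt text alongside the fact card.
function citationRules(): string {
  const lines = Object.entries(INTERNAL_SOURCES)
    .map(([topic, path]) => `- For ${topic}, cite ${path}.`)
    .join("\n");
  return `When the site has authoritative data, cite it:\n${lines}\nDo not link external sources for topics listed above.`;
}
```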

Site-wide

The floating chat was scoped to /use-edgecloud while I shook out the failure modes. Once the fact card covered protocol mechanics, index formulas, EdgeCloud pricing, the POGS exclusion rule, and the "cite our own data" instruction — and testing stopped surfacing new categories of mistakes — I unlocked it across the whole site.

Lessons

Fluent isn't correct. A confident answer from an LLM tells you nothing about whether it's right. The model has no internal signal for "I'm guessing now."

Structural guardrails beat corrections. A fact card in the system prompt scales. Patching individual answers doesn't.

The site is the source of truth, not the bot. The bot's job is to translate the site into a conversation, not substitute its own answers for what's already there.

The work is never finished. Every new methodology page is another potential hallucination. The fact card grows with the site.

— Jacob