AI-Powered · Shipped to Prod · Concept to Code · Agent Native · Startups

Grantsville

TurboTax for grants — an org knowledge vault with AI grant writing on top.

Role

Design Engineer

Type

B2B SaaS web app

Year

2026

Status

In production · grantsville.co ↗

The dashboard the Accidental Grant Writer opens.

Vault readiness — 57% of the 97 atoms a funder-ready org needs filled. Active applications with their six-axis scores. The operator review queue — every draft passes through here before anything goes out.

The argument

The vault is the moat — not the model.

Every competitor is going to have access to the same foundation models. Generic AI on org documents produces grant slop funders catch immediately — not because the model is bad, but because it has nothing specific to draw on.

The defensible product is the structured knowledge layer underneath: language from proposals that won, financials that match this funder, impact data buried two years deep. Grantsville's job is to ingest the org's documents and turn them into a queryable, versioned source of truth that any model can be pointed at.

Portable

Applies to any AI-on-domain-documents product.

Defensible

Structured extraction is the work; the model is interchangeable.

Counter-conventional

Most AI products lead with the model. The data layer is the answer.

The user

The user the $5,000 consultant doesn’t serve.

An executive director (ED) at a sub-$500K nonprofit who never wanted to write grants. They need grants to survive, can't afford the $5,000 consultant, and can't afford to send a generic AI draft a funder will throw away in ten seconds. The Accidental Grant Writer.

The asset isn't missing — past proposals, financials, impact reports, board decks are all sitting in Google Drive folders. They've just never been connected to the next application.

$5K

Average grant consultant fee. Out of reach for a sub-$500K org.

27

Document types the vault recognizes and processes.

97

Structured atoms a funder-ready org needs filled.

“I run a $400K family-services nonprofit. I write the grants because there's no one else. I have one full-time afternoon a week to do it. The ChatGPT drafts read like ChatGPT drafts.”

— Composite of four early user interviews

What it does

A four-step path from upload to submission.

Each step exists so the next one can be grounded in real org context, not a prompt template.

Step 01 — Vault

Upload, classify, extract.

The org's documents go in. The vault classifies each file, extracts structured data across 27 document types, and surfaces a readiness gauge — what's filled, what's missing, what to chase down next. The vault has to populate itself, or it never gets populated.
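A minimal sketch of how a readiness gauge like this could be computed, assuming each atom is a named slot that is either filled or empty. The `Atom` shape, the key names, and the `readiness` function are illustrative assumptions, not Grantsville's actual schema.

```typescript
// Hypothetical atom shape: one structured fact the vault tracks.
interface Atom {
  key: string;      // e.g. "prog.outcomes"
  category: string; // illustrative grouping, e.g. "financials"
  filled: boolean;  // true once an extractor has populated the slot
}

interface Readiness {
  percent: number;   // filled / total, rounded to a whole percent
  missing: string[]; // keys still empty: what to chase down next
}

function readiness(atoms: Atom[]): Readiness {
  if (atoms.length === 0) return { percent: 0, missing: [] };
  const missing = atoms.filter((a) => !a.filled).map((a) => a.key);
  const filledCount = atoms.length - missing.length;
  return {
    percent: Math.round((filledCount / atoms.length) * 100),
    missing,
  };
}

// Example: with 97 atoms, 55 filled is one count that rounds to 57%.
const atoms: Atom[] = [];
for (let i = 0; i < 97; i++) {
  atoms.push({ key: "atom." + i, category: "demo", filled: i < 55 });
}
const gauge = readiness(atoms);
```

Returning the missing keys alongside the percentage is what lets the gauge say what to chase down next, not just how far along the vault is.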

Step 02 — Funder intake

Paste a guidelines PDF, get a brief.

The system pulls funder requirements, deadlines, evaluation criteria, and compliance rules out of the guidelines doc. You confirm the parsed brief before generation starts — wrong-on-paper beats wrong-in-a-draft.
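One way to make the confirm-before-generation rule hard to bypass is to carry the confirmation on the brief itself. This is a sketch under assumptions: the `FunderBrief` fields mirror the four things the parser is described as pulling out, while the field names, `confirmBrief`, and `startGeneration` are hypothetical.

```typescript
// Hypothetical shape of a brief parsed from a funder's guidelines document.
interface FunderBrief {
  funder: string;
  deadline: string;             // ISO date string
  requirements: string[];
  evaluationCriteria: string[];
  complianceRules: string[];
  confirmedByUser: boolean;     // stays false until a human reviews the parse
}

// Returns a confirmed copy; the parsed original is left untouched.
function confirmBrief(brief: FunderBrief): FunderBrief {
  return { ...brief, confirmedByUser: true };
}

// Generation refuses to start from an unconfirmed brief.
function startGeneration(brief: FunderBrief): string {
  if (!brief.confirmedByUser) {
    throw new Error("Brief must be confirmed before drafting starts");
  }
  return "drafting for " + brief.funder;
}
```

A parsing mistake caught at this step costs one correction to the brief; the same mistake caught after drafting costs a regenerated proposal.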

Step 03 — Drafting

Submission-ready proposals, grounded.

The agent pre-fetches the full vault context, maps funder requirements to your deliverables, builds budget narratives from your actual financials, and assembles language from proposals that already won. Every sentence has to be defensible against a citation in the vault.
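The "every sentence has to be defensible" rule can be checked mechanically if each drafted sentence carries the vault keys it relies on. A minimal sketch, assuming that representation; `DraftSentence`, `unsupportedSentences`, and every key except `prog.outcomes` are made up for illustration.

```typescript
interface DraftSentence {
  text: string;
  citations: string[]; // vault atom keys this sentence relies on
}

// A sentence is unsupported if it cites nothing, or cites a key
// that does not exist in the vault.
function unsupportedSentences(
  draft: DraftSentence[],
  vaultKeys: string[]
): DraftSentence[] {
  return draft.filter(
    (s) =>
      s.citations.length === 0 ||
      s.citations.some((key) => vaultKeys.indexOf(key) === -1)
  );
}

const vaultKeys = ["prog.outcomes", "fin.budget"];
const draft: DraftSentence[] = [
  { text: "The program served families last year.", citations: ["prog.outcomes"] },
  { text: "We are the leading provider in the region.", citations: [] },
  { text: "Our reserves cover six months.", citations: ["fin.reserves"] },
];
const flagged = unsupportedSentences(draft, vaultKeys);
```

Here the second sentence is flagged for citing nothing and the third for citing a key the vault does not hold.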

Step 04 — Scoring & review

Six-axis quality gate before you see it.

Every deliverable is scored on NOFO Coverage, Budget Coherence, Voice Match, Compliance, Funder Alignment, and Org Readiness — then it sits in an operator review gate before anything goes out the door. The weakest dimension has to be unskippable, or the gate is decoration.
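To see why the weakest dimension matters, compare an averaged gate with a per-axis gate on the same scores. This is a sketch: the axis names come from the list above, while the 0 to 100 scale, the single threshold of 75, and the sample scores are assumptions.

```typescript
type Axis =
  | "NOFO Coverage"
  | "Budget Coherence"
  | "Voice Match"
  | "Compliance"
  | "Funder Alignment"
  | "Org Readiness";

interface AxisScore {
  axis: Axis;
  score: number; // assumed 0 to 100
}

const THRESHOLD = 75; // hypothetical; one bar for every axis

// Averaged gate: a strong axis can hide a weak one.
function passesOnAverage(scores: AxisScore[]): boolean {
  const total = scores.reduce((sum, s) => sum + s.score, 0);
  return total / scores.length >= THRESHOLD;
}

// Assumes at least one score is present.
function weakestAxis(scores: AxisScore[]): AxisScore {
  return scores.reduce((low, s) => (s.score < low.score ? s : low));
}

// Per-axis gate: the draft is only as good as its weakest dimension.
function passesPerAxis(scores: AxisScore[]): boolean {
  return weakestAxis(scores).score >= THRESHOLD;
}

const draft: AxisScore[] = [
  { axis: "NOFO Coverage", score: 92 },
  { axis: "Budget Coherence", score: 88 },
  { axis: "Voice Match", score: 61 },
  { axis: "Compliance", score: 90 },
  { axis: "Funder Alignment", score: 85 },
  { axis: "Org Readiness", score: 82 },
];
```

With these sample scores the average is 83, so the averaged gate passes, while the per-axis gate fails on Voice Match at 61.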

Defending the thesis

Four decisions, all defending the same claim.

If the vault is the moat, then the product’s hardest problems are: how to populate it, how to query it, and how to charge for what comes out of it. Every meaningful architectural call traces back to one of those three.

01 · The central claim

Vault as moat, not the model.

The org’s documents are ingested into a queryable knowledge base — versioned, with funder-specific variants and a full audit trail. The model is a renderer over the vault, not the source of truth.

Why this, not that

A prompt-only approach produces output any competitor reproduces in a week. The org’s structured context is the part that doesn’t commodify.
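A versioned atom with funder-specific variants and an audit trail could be modeled roughly as below. This is a sketch under assumptions: `VaultAtom`, `AtomVersion`, `writeAtom`, and `readAtom` are hypothetical names, and treating the append-only version list as the audit trail is one possible design, not a description of the production schema.

```typescript
interface AtomVersion {
  version: number;
  value: string;
  sourceDocument: string; // file the value was extracted from
  recordedAt: string;     // ISO timestamp
}

interface VaultAtom {
  key: string;
  versions: AtomVersion[];                // append-only, so it doubles as the audit trail
  funderVariants: Record<string, string>; // funder id -> tailored wording
}

// Writing never overwrites: it appends a new version and returns a new atom.
function writeAtom(
  atom: VaultAtom,
  value: string,
  sourceDocument: string,
  recordedAt: string
): VaultAtom {
  const next: AtomVersion = {
    version: atom.versions.length + 1,
    value,
    sourceDocument,
    recordedAt,
  };
  return { ...atom, versions: atom.versions.concat([next]) };
}

// Reading prefers a funder-specific variant, then falls back to the latest version.
function readAtom(atom: VaultAtom, funderId?: string): string | undefined {
  if (
    funderId !== undefined &&
    Object.prototype.hasOwnProperty.call(atom.funderVariants, funderId)
  ) {
    return atom.funderVariants[funderId];
  }
  const latest = atom.versions[atom.versions.length - 1];
  return latest ? latest.value : undefined;
}
```

Because the model only reads through something like `readAtom`, swapping the model leaves the stored history untouched.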

02 · How the vault gets populated

Document intelligence over manual tagging.

Uploads run through a multi-step pipeline that classifies the file (990, audit, proposal, board minutes…), routes it to the right extractor, and pulls the relevant atoms — all without asking the user to label anything.

Why this, not that

Nonprofits don’t know what type each document is, and won’t tag 200 files to find out. If the vault requires manual setup, it never gets populated, and the moat doesn’t exist.
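The classify, route, extract shape of that pipeline can be sketched as a lookup from document type to extractor. Everything here is a stand-in: the keyword classifier takes the place of a model call, the four types are the ones named above rather than the full 27, and the atom keys are invented.

```typescript
type DocType = "990" | "audit" | "proposal" | "board_minutes";

interface UploadedFile {
  name: string;
  text: string;
}

type Atoms = Record<string, string>;
type Extractor = (file: UploadedFile) => Atoms;

// Stand-in classifier: keyword rules where a real pipeline would call a model.
// It reads content, not the file name, so no user labeling is needed.
function classify(file: UploadedFile): DocType | null {
  const text = file.text.toLowerCase();
  if (text.indexOf("form 990") !== -1) return "990";
  if (text.indexOf("independent auditor") !== -1) return "audit";
  if (text.indexOf("board") !== -1 && text.indexOf("minutes") !== -1) return "board_minutes";
  if (text.indexOf("proposal") !== -1) return "proposal";
  return null;
}

// One extractor per document type; the router is just this table.
const extractors: Record<DocType, Extractor> = {
  "990": (file) => {
    const m = /total revenue:\s*\$?([\d,]+)/i.exec(file.text);
    return m && m[1] ? { "fin.total_revenue": m[1] } : {};
  },
  audit: (file) => ({ "fin.audit_source": file.name }),
  proposal: (file) => ({ "proposal.source": file.name }),
  board_minutes: (file) => ({ "gov.minutes_source": file.name }),
};

function ingest(file: UploadedFile): { docType: DocType | null; atoms: Atoms } {
  const docType = classify(file);
  if (docType === null) return { docType, atoms: {} };
  return { docType, atoms: extractors[docType](file) };
}
```

Typing the table as `Record<DocType, Extractor>` means a newly added document type fails to compile until it has an extractor.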

03 · How the vault gets queried

Agent-native architecture.

AI capabilities are markdown prompt files plus a small set of tools — read/write vault, process documents, draft sections, manage workflow state. There are no hardcoded generation chains.

Why this, not that

A tool-based agent’s capabilities grow with prompt-file edits, not deploys. When a new funder format shows up or a scoring axis needs a tweak, the change is text, not a release.
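In outline, "prompt files plus tools" means a small fixed tool registry and a growing set of markdown prompts. The sketch below uses in-memory stand-ins for both the vault and the prompt directory; the tool names, the file path, and the `dispatch` and `addCapability` helpers are assumptions, and only two of the tool families mentioned above are shown.

```typescript
interface ToolArgs {
  key: string;
  value?: string;
}

// Hypothetical tool contract: a name the model can call, plus a handler.
interface Tool {
  name: string;
  run: (args: ToolArgs) => string;
}

// In-memory stand-ins for the vault and for prompt files on disk.
const vault: Record<string, string> = {};
const promptFiles: Record<string, string> = {
  "prompts/statement-of-need.md":
    "# Statement of Need\nGround every claim with read_vault.",
};

const tools: Tool[] = [
  {
    name: "read_vault",
    run: (args) => {
      const value = vault[args.key];
      return value === undefined ? "" : value;
    },
  },
  {
    name: "write_vault",
    run: (args) => {
      vault[args.key] = args.value === undefined ? "" : args.value;
      return "ok";
    },
  },
];

// The only code path: look up the tool the model asked for and run it.
function dispatch(toolName: string, args: ToolArgs): string {
  const tool = tools.filter((t) => t.name === toolName)[0];
  if (!tool) throw new Error("Unknown tool: " + toolName);
  return tool.run(args);
}

// A new capability is a new prompt file; the tool list does not change.
function addCapability(path: string, markdown: string): string[] {
  promptFiles[path] = markdown;
  return Object.keys(promptFiles);
}
```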

04 · How the vault makes money

Per-deliverable pricing, not per-seat access.

$99 per Letter of Inquiry, with Stripe at the end of the workflow. First project is free. There is no monthly access fee — you pay when the vault produces something you ship.

Why this, not that

Per-deliverable pricing is what keeps the moat sharp. A monthly subscription pays the same whether the vault produces a winning LOI or a generic one, which removes the architectural pressure to invest in extraction, scoring, and review. Charging only when the vault produces something a nonprofit will ship forces every layer above the model to actually earn its keep.
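The pricing rule is small enough to write down directly. A minimal sketch, assuming "first project is free" means the first deliverable on an account is not charged; the `Account` shape and `amountDueCents` are hypothetical, and no payment-provider calls are shown.

```typescript
type DeliverableKind = "letter_of_inquiry";

// $99 per Letter of Inquiry, held in cents to avoid floating-point money.
const PRICE_CENTS: Record<DeliverableKind, number> = {
  letter_of_inquiry: 9900,
};

interface Account {
  completedProjects: number; // deliverables shipped so far
}

// No monthly fee: the amount depends only on what is being shipped.
function amountDueCents(account: Account, kind: DeliverableKind): number {
  if (account.completedProjects === 0) return 0; // assumed reading of "first project is free"
  return PRICE_CENTS[kind];
}
```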

A builder’s account

What broke in production, and what fixed it.

Three production failures that revised the architecture. Each one was the vault thesis pushing back on a shortcut.

Failure 01

Document classification failed quietly on the first real org.

The classifier worked fine on clean, well-named test files. The first nonprofit's Google Drive was a thirteen-year archive: scanned PDFs of board minutes labeled “FINAL_v2_USE_THIS.pdf”, merged 990s containing three years' filings, a “Proposals” folder with rejection letters mixed in.

Half the files came in unclassified. The vault sat at 38% readiness with no signal about why.

The fix

Two-pass classification with explicit “needs human” as a first-class outcome — plus a small inline correction UI for the few files the model couldn’t place.
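Making "needs human" a first-class outcome is mostly a typing decision: the classifier returns a union, and the correction UI is whatever consumes the second branch. The sketch below assumes a file-name pass followed by a content pass and a confidence floor of 0.8; the source does not say what the two passes actually are, so both stand-in passes and the cutoff are illustrative.

```typescript
interface UploadedFile {
  name: string;
  text: string;
}

interface Guess {
  docType: string;
  confidence: number; // assumed 0 to 1
}

type Pass = (file: UploadedFile) => Guess | null;

type Outcome =
  | { status: "classified"; docType: string; pass: number }
  | { status: "needs_human"; fileName: string };

const CONFIDENCE_FLOOR = 0.8; // hypothetical cutoff

function classifyTwoPass(file: UploadedFile, first: Pass, second: Pass): Outcome {
  const a = first(file);
  if (a !== null && a.confidence >= CONFIDENCE_FLOOR) {
    return { status: "classified", docType: a.docType, pass: 1 };
  }
  const b = second(file);
  if (b !== null && b.confidence >= CONFIDENCE_FLOOR) {
    return { status: "classified", docType: b.docType, pass: 2 };
  }
  // Neither pass was confident: say so instead of guessing or staying silent.
  return { status: "needs_human", fileName: file.name };
}

// Stand-in passes; a real system would call a model in at least one of them.
const byFileName: Pass = (file) =>
  /990/.test(file.name) ? { docType: "990", confidence: 0.9 } : null;
const byContent: Pass = (file) =>
  file.text.toLowerCase().indexOf("minutes") !== -1
    ? { docType: "board_minutes", confidence: 0.85 }
    : null;

// The inline correction UI would render exactly this list.
function needsHumanQueue(files: UploadedFile[]): string[] {
  const queue: string[] = [];
  for (const file of files) {
    const outcome = classifyTwoPass(file, byFileName, byContent);
    if (outcome.status === "needs_human") queue.push(outcome.fileName);
  }
  return queue;
}
```

A file like “FINAL_v2_USE_THIS.pdf” tells the first pass nothing, which is the case the second pass and the human queue exist for.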

What it taught

“Classification accuracy” on real-world inputs is a UX problem, not just a model problem. Surfacing the unknowns is half the work.

Failure 02

The first agent was a hardcoded generation chain. It didn’t survive the second funder.

The earliest version had each section — narrative, budget, evaluation plan — as a fixed sequence of model calls, glued together with TypeScript. Adding a funder that wanted a logic-model section in the middle meant a code change, a deploy, and a new test pass.

The product wasn't going to scale to forty funders that way.

The fix

Refactored to a markdown-prompt-files-plus-tools agent. Sections became prompts. Funder formats became data. Architecture changes became text edits.
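"Funder formats became data" can be pictured as an ordered list of section names that maps onto prompt files. A sketch under assumptions: the `FunderFormat` shape, the `prompts/<section>.md` naming, and both helper functions are hypothetical.

```typescript
interface FunderFormat {
  funder: string;
  sections: string[]; // ordered; each entry names a prompt file
}

// Returns a new format with `section` placed right after `after`
// (or appended if `after` is not present).
function insertSectionAfter(
  format: FunderFormat,
  after: string,
  section: string
): FunderFormat {
  const i = format.sections.indexOf(after);
  const at = i === -1 ? format.sections.length : i + 1;
  const sections = format.sections
    .slice(0, at)
    .concat([section], format.sections.slice(at));
  return { funder: format.funder, sections };
}

// The agent's work plan is derived from data, not from a fixed call sequence.
function promptPlan(format: FunderFormat): string[] {
  return format.sections.map((s) => "prompts/" + s + ".md");
}

const standard: FunderFormat = {
  funder: "default",
  sections: ["narrative", "budget", "evaluation-plan"],
};

// The funder that wanted a logic model in the middle becomes a one-line change.
const withLogicModel = insertSectionAfter(
  { funder: "second-funder", sections: standard.sections },
  "narrative",
  "logic-model"
);
```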

What it taught

If the agent’s capabilities can’t be edited as prose, you’ve built a slow product. Tool-based agents aren’t a stylistic choice — they’re a velocity choice.

Failure 03

The operator review gate was passing drafts that shouldn’t have passed.

The original review gate showed a single green checkmark once the six scoring dimensions cleared the threshold in aggregate. In practice, a draft could clear that bar while one specific axis, usually Voice Match, was quietly off, and the operator approved on vibes.

One submission went out reading like a tech-startup pitch deck for a faith-based food pantry. It came back rejected.

The fix

Per-axis review with the lowest-scoring dimension hoisted to the top and an explicit “I’ve read the Voice Match section” interaction before approval is possible.
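The fixed gate reduces to two rules: sort the axes so the weakest comes first, and refuse approval until that axis has been explicitly acknowledged. A minimal sketch; `ReviewState`, `openReview`, `acknowledge`, and `canApprove` are hypothetical names, and the scores are made up.

```typescript
interface AxisScore {
  axis: string;
  score: number;
}

interface ReviewState {
  ordered: AxisScore[];   // lowest score first, so the weakest axis is on top
  acknowledged: string[]; // axes the operator has confirmed reading
}

function openReview(scores: AxisScore[]): ReviewState {
  // slice() keeps the caller's array untouched before sorting.
  const ordered = scores.slice().sort((a, b) => a.score - b.score);
  return { ordered, acknowledged: [] };
}

function acknowledge(state: ReviewState, axis: string): ReviewState {
  if (state.acknowledged.indexOf(axis) !== -1) return state;
  return { ordered: state.ordered, acknowledged: state.acknowledged.concat([axis]) };
}

// Approval stays locked until the weakest axis specifically has been read.
function canApprove(state: ReviewState): boolean {
  const weakest = state.ordered[0];
  return weakest !== undefined && state.acknowledged.indexOf(weakest.axis) !== -1;
}
```

Acknowledging a strong axis does nothing for approval here; only the one at the top of the list unlocks it.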

What it taught

Aggregate scores let humans skip. The review gate has to make the weakest dimension unskippable, or the gate is decoration.

What shipped

A working vault, generating real proposals.

Four surfaces — dashboard, vault, project workspace, generation flow — running against live nonprofit documents and Stripe-billed deliverables. The dashboard is in the hero. These are the other three.

Vault: what the org has

Screenshot of the Grantsville vault: a source documents view listing real uploaded files (LOIs, board docs, financial budgets, program descriptions, strategic plans) with classification badges and processing status. Vault readiness is 57%, with a per-category breakdown.

Generation: the agent at work

Screenshot of the generation flow: live proposal drafting with a progress checklist on the right (Retrieving Vault atoms, Applying Funder Policy Engine, Drafting Statement of Need, Building program narrative, Constructing budget summary, Running compliance check, Calculating quality scores). The header shows the United Way Tier 1 community foundation context.

Workspace: where the gate makes you look

Screenshot of the project workspace: a three-pane editor for a specific grant (Greater Hartford Gives Foundation, Tutoring & Mentoring Program). Left pane: outline plus six-axis health checks (NOFO coverage, Technical compliance, Funder alignment, Voice authenticity, Internal coherence, Budget coherence) with a Certification Gate flagging 3 blocking issues (NOFO 83% requires 100%, Technical Compliance FAIL, Voice Authenticity 66 requires 75). Center pane: proposal text with a citation flag reading “Citation needed — Bare ref against structured atom prog.outcomes — pick a selector.” Right pane: AI Copilot with a Review Roadmap (Resolve bare-reference citations, Polish voice authenticity) and a chat interface.

What I’m watching for

Proposal acceptance rate — the only metric tied to a real user outcome, and the one that tells you whether the vault thesis actually held in production.

What’s running

Payments and multi-tenancy on Stripe. Document classification across 27 types. The agent on markdown prompt files. The six-axis review gate with per-axis sign-off. First paying customers walking through the four-surface flow end-to-end.
