Shop by vibe, not by database query
Secondhand platforms force you to search like a database, not like a person. Nobody shops by typing "blue jeans size 32." They shop by vibe: "y2k denim under $80" or "quiet luxury with a worn-in feel." The gap between how people talk about fashion and how resale platforms let you search for it is the entire product opportunity.
This is not AI-assisted search. The AI is the entire shopping experience. Google Gemini runs in an agent loop where each user message triggers a sequence of tool calls until the agent decides to respond.
The agent uses hybrid model routing: Flash handles tool selection (fast, at roughly a tenth of Pro's cost), while Pro handles synthesis and complex reasoning (quality). Automatic fallback with rate-limit quarantine ensures the experience degrades gracefully, not catastrophically.
Complex behaviors emerge from simple tools — "compare the cheapest and most expensive" triggers search, sort, add-to-compare, and get-comparison in sequence. The agent self-evaluates confidence before acting: low confidence triggers a clarifying question, not a guess.
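A minimal sketch of how that composition works: tools are plain functions in a registry, and the "compare the cheapest and most expensive" behavior falls out of four simple calls. Tool names, the catalog, and the data shapes here are illustrative, not the production API.

```typescript
// Illustrative tool registry. Each tool does one small thing; the agent
// composes them into behaviors no single tool implements.
type Product = { id: string; title: string; price: number };

const catalog: Product[] = [
  { id: "a", title: "y2k low-rise jeans", price: 45 },
  { id: "b", title: "archive denim jacket", price: 220 },
  { id: "c", title: "baggy carpenter jeans", price: 72 },
];

const compareTray: Product[] = [];

const tools = {
  search: (query: string): Product[] =>
    catalog.filter((p) => p.title.includes(query)),
  sortByPrice: (items: Product[]): Product[] =>
    [...items].sort((x, y) => x.price - y.price),
  addToCompare: (item: Product): void => {
    compareTray.push(item);
  },
  getComparison: (): Product[] => [...compareTray],
};

// "Compare the cheapest and most expensive" emerges from the sequence:
const results = tools.sortByPrice(tools.search("jeans"));
tools.addToCompare(results[0]);                   // cheapest match
tools.addToCompare(results[results.length - 1]);  // most expensive match
const comparison = tools.getComparison();
```

The agent never has a "compare extremes" tool; the behavior exists only in the sequence it chooses.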
Every filter tracks its source through a Glass Box UI: was it inferred by the agent, set by the user, or negotiated collaboratively? This is not cosmetic — it determines clearing behavior. User Supremacy means agent-cleared filters behave differently from user-cleared filters. The user can always see what the agent decided and why.
“The agent decides what to do. The rules decide what it is allowed to do.”
Deterministic logic evaluates whether the agent should act or ask for clarification — not a second model call assessing its own confidence.
A model evaluating its own confidence is circular and unauditable. Deterministic rules are debuggable, testable, and add zero latency.
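A sketch of what deterministic act-vs-ask rules can look like. The specific triggers and the ambiguous-term list are assumptions for illustration; the point is that every branch is a testable predicate, not a model judgment.

```typescript
// Deterministic confidence gate: plain rules decide act vs. ask.
// No second model call, no self-assessment, zero added latency.
type ParsedQuery = { filters: Record<string, string>; ambiguousTerms: string[] };

const AMBIGUOUS = new Set(["vintage", "designer", "affordable"]);

function shouldClarify(q: ParsedQuery, resultCount: number): boolean {
  // Vague vocabulary means the agent is guessing at intent.
  if (q.ambiguousTerms.some((t) => AMBIGUOUS.has(t))) return true;
  // Nothing inferred at all: there is no basis for acting.
  if (Object.keys(q.filters).length === 0) return true;
  // A dead-end result set is worth a question, not an empty page.
  if (resultCount === 0) return true;
  return false;
}
```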
The agent can clear filters it set, but cannot override filters the user set. Every filter tracks provenance (agent | user | collaborative) so this rule is enforceable at the system level.
An agent that overrides your choices is not helpful — it is adversarial. Filter provenance makes this a system guarantee, not a prompt instruction.
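In code, the guarantee is a guard at the filter layer rather than a prompt instruction. This is a sketch under assumed type names; the enforcement pattern is the point.

```typescript
// Provenance-aware clearing: the agent can only remove what it set itself.
type Provenance = "agent" | "user" | "collaborative";
type Filter = { key: string; value: string; source: Provenance };

function clearFilters(filters: Filter[], actor: Provenance): Filter[] {
  if (actor === "agent") {
    // User Supremacy: agent-initiated clears leave user and
    // collaborative filters untouched.
    return filters.filter((f) => f.source !== "agent");
  }
  // The user can clear everything, including agent-set filters.
  return [];
}
```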
Any cart addition over $300 requires explicit user confirmation. The agent cannot spend freely above this threshold.
High-value actions require human approval. The threshold balances convenience (most resale items are under $300) with risk. This is the line between autonomy and oversight.
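The guardrail reduces to a few lines: over-threshold additions return a confirmation request instead of executing. The result shape here is an assumption; the $300 threshold is the one stated above.

```typescript
// Spend guardrail: high-value cart actions need explicit user approval.
const CONFIRMATION_THRESHOLD_USD = 300;

type CartResult =
  | { status: "added"; itemId: string }
  | { status: "needs_confirmation"; itemId: string; price: number };

function addToCart(itemId: string, price: number, userConfirmed = false): CartResult {
  if (price > CONFIRMATION_THRESHOLD_USD && !userConfirmed) {
    // Surface the decision to the human instead of spending autonomously.
    return { status: "needs_confirmation", itemId, price };
  }
  return { status: "added", itemId };
}
```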
Flash for tool selection (cheap, fast). Pro for synthesis and complex reasoning (quality). Automatic fallback with rate-limit quarantine prevents cascading failures.
Pro for everything costs 10x more and adds latency on tool calls where quality does not matter. Flash for everything degrades synthesis quality. The hybrid approach optimizes cost and quality per task type.
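A sketch of the routing-plus-quarantine shape. The model labels, quarantine window, and two-task taxonomy are illustrative assumptions, not the production configuration.

```typescript
// Hybrid routing with rate-limit quarantine: pick the model per task type,
// and temporarily route around a model that just rate-limited.
type Task = "tool_selection" | "synthesis";

const quarantinedUntil: Record<string, number> = {};

function routeModel(task: Task, now = Date.now()): string {
  const preferred = task === "tool_selection" ? "gemini-flash" : "gemini-pro";
  const fallback = preferred === "gemini-pro" ? "gemini-flash" : "gemini-pro";
  // Fall back while the preferred model sits in quarantine.
  if ((quarantinedUntil[preferred] ?? 0) > now) return fallback;
  return preferred;
}

function reportRateLimit(model: string, now = Date.now()): void {
  // Quarantine for 60s so one 429 degrades a request, not the whole loop.
  quarantinedUntil[model] = now + 60_000;
}
```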
No runaway loops possible. The agent must resolve within 10 tool calls or surface what it has.
Unbounded agent loops are unpredictable and expensive. A cap forces efficient tool use and guarantees response time.
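The cap can be sketched as a bounded loop around a step function (the step function stands in for one tool-call iteration; the real loop carries richer state):

```typescript
// Bounded agent loop: hard cap on iterations, partial results on expiry.
const MAX_TOOL_CALLS = 10;

type Step = { done: boolean; output: string };

function runAgentLoop(step: (i: number) => Step): { output: string; calls: number } {
  let last: Step = { done: false, output: "" };
  for (let i = 0; i < MAX_TOOL_CALLS; i++) {
    last = step(i);
    // The agent decided it has enough to respond.
    if (last.done) return { output: last.output, calls: i + 1 };
  }
  // Cap reached: surface what we have instead of looping forever.
  return { output: last.output, calls: MAX_TOOL_CALLS };
}
```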
25 tests validate behaviors that emerge from tool composition — not individual tool correctness. Tests like "a vague query should trigger clarification" verify system-level outcomes.
Unit tests verify tools work in isolation. Emergence tests verify the system produces useful outcomes when tools interact — the gap where real bugs hide in agent systems.
Shipped and live. The platform runs a 41-tool agent system across 25,295 lines of TypeScript backend code and 28 React components. It indexes 4,000+ resale products with zero manual curation.
The agent handles natural language queries end-to-end: parsing intent, selecting tools via hybrid Flash/Pro routing, applying filters across 19 semantic dimensions, evaluating confidence through deterministic rules, and synthesizing responses with source attribution. Ambiguous terms like "vintage," "designer," and "affordable" trigger clarification rather than a guess.
The system includes durable checkpointing (state persists after every tool call), SSE streaming with heartbeat keepalive, and a Glass Box UI that shows users exactly which filters the agent inferred versus which they set themselves.
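The checkpointing pattern, sketched with an in-memory map standing in for the durable store and an assumed state shape:

```typescript
// Durable checkpointing sketch: serialize agent state after every tool
// call so a crashed run resumes mid-conversation instead of restarting.
type AgentState = { messageId: string; toolCallsDone: number; scratch: string[] };

const store = new Map<string, string>();

function checkpoint(state: AgentState): void {
  // Called after each tool call; the real system writes to durable storage.
  store.set(state.messageId, JSON.stringify(state));
}

function resume(messageId: string): AgentState | null {
  const raw = store.get(messageId);
  return raw ? (JSON.parse(raw) as AgentState) : null;
}

checkpoint({ messageId: "m1", toolCallsDone: 3, scratch: ["searched", "sorted"] });
const restored = resume("m1");
```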
What we track to know the agent is working.
How often users override agent-inferred filters. A declining rate means the agent interprets vibes more accurately over time.
How often confidence drops low enough to ask versus act. Too high means the agent is timid; too low means it is guessing.
Breakdown of agent-set vs user-set vs collaborative filters. Reveals whether the agent is doing useful work or just adding noise.
Hybrid routing effectiveness — what percentage of tool calls use Flash versus Pro, and whether synthesis quality holds.
Queries that return nothing reveal vocabulary gaps in the 19-dimension filter system.
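The first of these metrics reduces to a simple ratio over filter events. The event shape here is an assumption for illustration:

```typescript
// Filter override rate: share of agent-inferred filters the user later
// changed or removed. Declining over time means better vibe interpretation.
type FilterEvent = {
  filterKey: string;
  source: "agent" | "user" | "collaborative";
  overriddenByUser: boolean;
};

function overrideRate(events: FilterEvent[]): number {
  const agentSet = events.filter((e) => e.source === "agent");
  if (agentSet.length === 0) return 0; // nothing inferred, nothing to override
  return agentSet.filter((e) => e.overriddenByUser).length / agentSet.length;
}
```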