Every GEO tool on the market measures the same four things in slightly different ways.
Profound, BrandRadar, Otterly, Siftly, HubSpot AEO, Conductor AgentStack. They differentiate on dashboard UX, alerting workflows, and which integrations they ship. They do not differentiate on the underlying data. The data is the same: did my brand show up in an AI-generated answer, where in the answer, was my URL cited, and what was the sentiment.
That data layer is the hard part. Most of those tools are building it badly, in-house, with bugs they will not admit. We checked and found the API/UI gap reaches 96% of queries on ChatGPT, which means most GEO dashboards are measuring the wrong half of reality.
This post walks through building a GEO tool that gets the data right. The atoms-and-surfaces framework, the API calls, the rollup, the dashboard, the alerts. About 200 lines of code. One weekend.
The four atoms of GEO
Every GEO measurement reduces to one of these four:
Presence. Was my brand mentioned in this answer? Boolean. Roll it up across queries to get presence_rate per surface.
Prominence. If yes, where? Position 1 (named first) or position 7 (buried)? Integer rank.
Citation rate. Was my URL cited as a source for this answer? Different from "mentioned" because mentions can happen without citing my domain.
Sentiment. If mentioned, was the framing positive, neutral, or negative? "Linear is the keyboard-first alternative to Jira" is positive. "Linear had outages last week" is negative.
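Concretely, one (query, surface) measurement reduces to a small record holding the four atoms. A sketch — the field names here are illustrative, not any particular API's schema:

```javascript
// One measurement cell: the four GEO atoms for a single (query, surface) pair.
// Field names are illustrative, not tied to a specific API.
function makeCell({ mentioned, rank = null, cited = false, sentiment = null }) {
  return {
    presence: Boolean(mentioned),   // atom 1: was the brand mentioned at all?
    rank: mentioned ? rank : null,  // atom 2: position among named brands (1 = first)
    cited,                          // atom 3: was the brand's own URL cited as a source?
    sentiment,                      // atom 4: -1 negative, 0 neutral, +1 positive
  };
}

const cell = makeCell({ mentioned: true, rank: 2, cited: false, sentiment: 1 });
```

Everything downstream — rollups, dashboards, alerts — is aggregation over cells shaped like this.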
The six surfaces (and why all of them matter)
ChatGPT, Claude, Gemini, Perplexity, Google AI Overviews, Bing Copilot. Each surface has a different audience and different citation behavior. Skipping any of them means missing real visibility.
ChatGPT is the volume leader (largest user base). API and UI diverge on 96% of queries.
Claude has the most engineering-heavy audience. Lowest citation count per answer (often 0-2).
Gemini has the most aggressive query fan-out (3-7 sub-queries). UI scraping is mandatory because the API misses the fan-out completely.
Perplexity is the most citation-heavy: 6-10 cited URLs per answer typically. Lowest API/UI gap (42%).
Google AI Overviews trigger on 48% of tracked queries. Source URLs there function as the new top three blue links.
Bing Copilot is the smallest by volume but disproportionately important for Microsoft and enterprise audiences. No public API exists.
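The per-surface characteristics above are worth keeping as data rather than prose, so the rollup and dashboard can read from one place. A sketch using the figures quoted in this post (`api_ui_gap` is the fraction of queries where API and UI answers diverge; `null` where the post gives no figure or no public API exists):

```javascript
// Per-surface reference data, using the figures quoted above.
const SURFACES = {
  chatgpt:      { api_ui_gap: 0.96, notes: "largest user base" },
  claude:       { api_ui_gap: null, notes: "engineering-heavy audience; 0-2 citations per answer" },
  gemini:       { api_ui_gap: null, notes: "aggressive fan-out (3-7 sub-queries); UI scraping mandatory" },
  perplexity:   { api_ui_gap: 0.42, notes: "citation-heavy; 6-10 cited URLs per answer" },
  ai_overview:  { api_ui_gap: null, notes: "triggers on ~48% of tracked queries" },
  bing_copilot: { api_ui_gap: null, notes: "no public API; UI data only" },
};
```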
The picks-and-shovels framing
Every GEO tool needs the same data. Most are building it themselves, badly. The data layer is a separately fundable wedge.
The build
Aim for 30 queries split evenly across three buckets: category-defining, comparison, use-case. Track yourself plus 3-5 competitors.
{
"brand": "Pipedrive",
"competitors": ["HubSpot", "Salesforce", "Zoho", "Close"],
"queries": [
"best CRM for small business",
"best CRM for sales teams",
"lightweight CRM for consultants",
"Pipedrive vs HubSpot",
"alternatives to Salesforce",
"Pipedrive vs Salesforce for SMB",
"CRM for a 5-person startup",
"what is the cheapest CRM with email automation",
"best CRM with kanban view",
"CRM with WhatsApp integration"
]
}One call per query, with mode: all_live. The response includes a providers map covering all six surfaces. Each cell has the four atoms.
import config from "./config.json" with { type: "json" }; // Node 20.10+; older Node versions use `assert` instead of `with`
import fs from "node:fs/promises";
async function checkOne(query) {
const res = await fetch("https://api.mentionsapi.com/v1/check", {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.MENTIONSAPI_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
mode: "all_live",
query,
track_brands: [config.brand, ...config.competitors],
}),
});
return res.json();
}
const today = new Date().toISOString().slice(0, 10);
const snapshot = {};
for (const q of config.queries) {
snapshot[q] = await checkOne(q);
}
await fs.mkdir("snapshots", { recursive: true }); // ensure the output directory exists
await fs.writeFile(`snapshots/${today}.json`, JSON.stringify(snapshot, null, 2));
Walk every (query, surface) cell. Aggregate per surface. The output is your dashboard data: 6 surfaces × 4 atoms = 24 numbers that summarize your GEO state.
function rollup(snapshot, brand) {
const SURFACES = ["chatgpt", "claude", "gemini", "perplexity", "ai_overview", "bing_copilot"];
const result = {};
for (const surface of SURFACES) {
let mentions = 0, total = 0, ranks = [], citationHits = 0, sentScores = [];
for (const query of Object.keys(snapshot)) {
const cell = snapshot[query].providers[surface];
if (!cell) continue;
total++;
const me = cell.brands?.find(b => b.name === brand);
if (me?.mentioned) {
mentions++;
if (me.rank !== null) ranks.push(me.rank);
if (me.sentiment !== undefined) sentScores.push(me.sentiment);
}
const cited = cell.citations?.some(c => c.url.includes(brand.toLowerCase() + ".com"));
if (cited) citationHits++;
}
result[surface] = {
presence_rate: total > 0 ? mentions / total : 0,
avg_rank: ranks.length > 0 ? ranks.reduce((a, b) => a + b, 0) / ranks.length : null,
citation_rate: total > 0 ? citationHits / total : 0,
sentiment_score: sentScores.length > 0 ? sentScores.reduce((a, b) => a + b, 0) / sentScores.length : null,
};
}
return result;
}
console.log(rollup(snapshot, "Pipedrive"));
Run the snapshot pipeline daily via cron, GitHub Actions scheduled workflows, Cloudflare Cron Triggers, or any equivalent. Save snapshots keyed by date so you can compute deltas later.
# Run daily at 09:00 UTC
0 9 * * * cd /path/to/geo-tool && node snapshot.mjs && node rollup.mjs && node alert.mjs
For the dashboard, the simplest path is a static page rendered from the snapshot data with Recharts or similar. Plot the four atoms per surface, plus the overall GEO score over time.
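A sketch of the data prep for that page: read the dated rollup files the pipeline writes and flatten them into rows a charting library can consume directly (the `rollups/YYYY-MM-DD.json` layout is assumed from the pipeline above):

```javascript
import fs from "node:fs/promises";
import path from "node:path";

// Flatten dated rollup files (rollups/YYYY-MM-DD.json) into chart-ready rows:
// one row per date, one column per surface metric,
// e.g. { date: "2025-01-01", chatgpt_presence_rate: 0.5, ... }.
async function chartRows(dir = "rollups", metric = "presence_rate") {
  const files = (await fs.readdir(dir)).filter(f => f.endsWith(".json")).sort();
  const rows = [];
  for (const file of files) {
    const rollup = JSON.parse(await fs.readFile(path.join(dir, file), "utf8"));
    const row = { date: file.replace(".json", "") };
    for (const [surface, atoms] of Object.entries(rollup)) {
      row[`${surface}_${metric}`] = atoms[metric];
    }
    rows.push(row);
  }
  return rows; // feed straight into a time-series chart component
}
```

One call per atom gives you the four trend charts; the date-keyed filenames double as the x-axis.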
For alerts, diff today's rollup against yesterday's. Send to Slack on meaningful changes: rank drops of 2+, presence drops of 10%+, new competitor entries, citation gaps that opened up.
import fs from "node:fs/promises";
const today = new Date().toISOString().slice(0, 10);
const yesterday = new Date(Date.now() - 86400000).toISOString().slice(0, 10);
const t = JSON.parse(await fs.readFile(`rollups/${today}.json`));
const y = JSON.parse(await fs.readFile(`rollups/${yesterday}.json`));
const alerts = [];
for (const surface of Object.keys(t)) {
if (!y[surface]) continue; // skip surfaces missing from yesterday's rollup
const presenceDelta = t[surface].presence_rate - y[surface].presence_rate;
if (Math.abs(presenceDelta) >= 0.10) {
alerts.push(`${surface}: presence ${presenceDelta > 0 ? "up" : "down"} by ${Math.round(Math.abs(presenceDelta) * 100)}pp`);
}
}
if (alerts.length > 0) {
await fetch(process.env.SLACK_WEBHOOK_URL, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ text: `*GEO changes detected*\n` + alerts.join("\n") }),
});
}
What this costs
30 queries × $0.50 per live check × 30 days = $450 worst case. With caching, the real cost drops to roughly $80-$150 a month. Cost scales linearly as you add competitors and queries.
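The arithmetic as a helper you can re-run as the query set grows. The $0.50 per-check price is taken from the figures above; the cache hit rate is the assumption to edit (a 70% hit rate, for example, lands inside the quoted monthly range):

```javascript
// Estimated monthly cost: queries/day × days × price per live check,
// discounted by the fraction of checks served from cache.
function monthlyCost({ queries, pricePerCheck = 0.5, days = 30, cacheHitRate = 0 }) {
  const liveChecks = queries * days * (1 - cacheHitRate);
  return liveChecks * pricePerCheck;
}

monthlyCost({ queries: 30 });                     // worst case, no cache: $450
monthlyCost({ queries: 30, cacheHitRate: 0.7 });  // ≈ $135 at a 70% hit rate
```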
What separates a good GEO tool from a mediocre one
Three things.
UI scraping, not API-only. The 96% API/UI divergence on ChatGPT means API-only tools measure something almost completely different from what real users see. Insist on mode: all_live.
The four atoms separately. Most mediocre tools collapse them into one "AI visibility score." That number is a vibe, not a measurement. The atoms separately are what drives action.
Longitudinal storage from week zero. The score on day one is meaningless. The score on day one minus day thirty is the whole point. Build snapshot storage before you build the dashboard.
Frequently asked questions
What is GEO and what does a GEO tool do?
How is a GEO tool different from a traditional SEO tool?
How long does it take to build a GEO tool from scratch?
Do I need to ship UI scraping or is API data enough?
What does a GEO tool cost to run?
Can I run this on a serverless or edge platform?
Ship it this weekend
Day one: write the snapshot pipeline, get the data flowing into JSON files. Day two: write the rollup and the dashboard. Day three: schedule the cron and start collecting trend data.
By the end of week four you have a complete GEO tool that beats most $200-a-month SaaS dashboards on data quality alone, and gives you full ownership of the data, the alerts, and the workflows.
The shovel is the API. The pickaxe is the four atoms. The mine is whatever category you compete in.