The experiment
The 200 top-mentioned brands are taken from our open AI agent captures dataset: 1M+ ground-truth captures (released under MIT at commerce-agentic/ai-visibility-metrics). For each brand domain, we tried to fetch the standard Shopify public catalog endpoint:
GET https://{brand}.com/products.json?limit=1
Shopify exposes this endpoint by default on every store, no auth required. It's how shopping comparison sites, scrapers, and audit tools (including ours) read public catalog data. It's also the endpoint any AI agent could call to verify a recommendation.
Forty-six of the 200 brands answered with a valid catalog. The other 154 didn't.
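For readers who want to reproduce the probe, here is a minimal sketch using only the Python standard library. It is not our production crawler: the function names, the User-Agent string, and the 10-second timeout are hypothetical choices made for illustration, and the helper takes a full domain rather than a brand name.

```python
import urllib.error
import urllib.request


def build_probe_url(domain: str) -> str:
    """Return the public catalog URL for a bare domain such as 'example.com'."""
    return f"https://{domain}/products.json?limit=1"


def fetch_probe(domain: str, timeout: float = 10.0):
    """Fetch the probe URL.

    Returns (status, content_type, body) for any HTTP response,
    or None when the request times out or fails at the network level.
    """
    request = urllib.request.Request(
        build_probe_url(domain),
        headers={"User-Agent": "catalog-probe-sketch/0.1"},  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return response.status, content_type, response.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth recording.
        return err.code, err.headers.get("Content-Type", ""), err.read()
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failures, refused connections, and timeouts land here.
        return None
```

Calling `fetch_probe("example.com")` performs one GET request. Pace requests politely if you loop this over a list of domains.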
What the 154 failures looked like
| Outcome | Count | Share | What it means |
|---|---|---|---|
| 200 OK · valid catalog | 46 | 23% | Public products.json returns real product data |
| 404 Not Found | 75 | 37.5% | Brand isn't on Shopify, OR the public feed is disabled |
| 403 Forbidden | 34 | 17% | On Shopify but bot/scraper blocking is enabled |
| HTML instead of JSON | 20 | 10% | Auth wall or different platform serves a landing page |
| Timeout / abort | 13 | 6.5% | Site too slow or rate-limit aborted us |
| Other (410, 451, errors) | 12 | 6% | Gone, geo-blocked, or upstream errors |
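The buckets in the table can be approximated with a small classifier. This is a sketch under stated assumptions, not our exact rules: the bucket labels are hypothetical, a "valid catalog" is taken to mean a JSON object with a non-empty `products` list, and any 200 response whose body does not parse as JSON is counted as "HTML instead of JSON".

```python
import json


def classify(status, body):
    """Map one probe result to a bucket label.

    status: HTTP status code, or None if the request timed out or aborted.
    body:   raw response body (bytes or str); ignored unless status is 200.
    """
    if status is None:
        return "timeout"
    if status == 404:
        return "not_found"
    if status == 403:
        return "forbidden"
    if status == 200:
        try:
            payload = json.loads(body)
        except (ValueError, TypeError):
            # A 200 whose body is not JSON: landing page, redirect target, or wall.
            return "html"
        if isinstance(payload, dict) and payload.get("products"):
            return "open"
        # JSON came back, but without product data (assumption: counted as other).
        return "other"
    # 410, 451, 5xx, and anything else.
    return "other"
```

For example, `classify(200, b'{"products": [{"id": 1}]}')` returns `"open"`, while `classify(200, b"<html></html>")` returns `"html"`.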
Concrete examples in each bucket:
- 23% open: Everlane, Allbirds, Reebok, Eddie Bauer, Smartwool, Alo Yoga, Tentree, Outdoor Research, Girlfriend Collective. Many of these are DTC-native Shopify brands without aggressive scraper protection.
- 37.5% returning 404: Target, Walmart, Amazon, Columbia, Nike, Under Armour, Nordstrom, L.L. Bean. Most of these are not on Shopify at all; they run on Salesforce, custom stacks, or marketplace platforms. Their /products.json doesn't exist by default.
- 17% returning 403: REI, Adidas, The North Face, Sephora, New Balance, Brooks Running, Zara, Carhartt. These are mostly on Shopify Plus (or comparable enterprise platforms) and have explicit bot/scraper blocking enabled. The endpoint exists; we're just not allowed to read it.
- 10% serving HTML: Patagonia, Gap, Dr. Martens, Backcountry, Logitech. The URL returns an HTML page instead of JSON, which points to a different platform, a redirect, or a "membership-required" wall.
What this means
1. AI agents see what audit tools can't
The single most interesting implication: the brands AI agents recommend most are largely brands you can't easily verify externally. Adidas, Sephora, Patagonia, REI: all heavily blocked, all heavily recommended. AI agents got their catalog data from somewhere (training corpora, commerce partnerships, first-party integrations like the OpenAI / Shopify deal), but a third-party auditor doesn't have the same access.
This creates a two-tier ecosystem:
- Open-catalog brands (the 23%) can be audited, benchmarked, and optimized by any third-party tool, including ours.
- Closed-catalog brands (the 77%) can only be audited by tools the merchant installs themselves, which is exactly the install-based path our Shopify app takes.
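The 23% / 77% split follows directly from the outcome counts reported earlier. A quick arithmetic check (the dictionary keys are hypothetical labels; the counts are the ones from the results table):

```python
# Outcome counts from the results table.
counts = {
    "open": 46,        # 200 OK, valid catalog
    "not_found": 75,   # 404
    "forbidden": 34,   # 403
    "html": 20,        # HTML instead of JSON
    "timeout": 13,     # timeout / abort
    "other": 12,       # 410, 451, errors
}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

open_share = shares["open"]
closed_count = total - counts["open"]
closed_share = 100 * closed_count / total

print(f"total brands probed: {total}")
print(f"open: {counts['open']} ({open_share:g}%)")
print(f"closed: {closed_count} ({closed_share:g}%)")
```

Note that "closed" here lumps together every non-open outcome, including brands that are simply not on Shopify.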
2. "Public catalog audit" is a self-selecting sample
Every benchmark, ranking, or "state of e-commerce" study built on public catalog data is, inevitably, a study of brands that chose to leave their data accessible. That's not the full population. Our quarterly State of AI Commerce report is honest about this: the brand ranking is from the captures dataset (what AI cites), but the audit-score leaderboard is from public catalogs only.
3. The audit market is bifurcated
The 23% open / 77% closed split predicts the structure of the AI catalog optimization market:
- For open-catalog brands, there's room for free / freemium tools (like our public audit) that work without install: high reach, low conversion.
- For closed-catalog brands, the only viable path is install-based optimization: lower reach, higher conversion, and the only place real optimization (writing fixes back to the store) happens anyway.
Anyone trying to build "an AI catalog SaaS" without an install path is, by definition, locked out of the most-recommended brands. We learned this the hard way and pivoted to install-first early.
4. If you're a merchant reading this
Two questions worth asking:
- Did you intentionally block your /products.json? Many merchants do via Shopify Plus / Cloudflare scraper rules. The cost is real: every third-party tool that could surface your brand (price comparison, audit, AI training scrapes that complement first-party integrations) is locked out. The benefit is real too: you control who can build on your catalog. There's no universally right answer, but the trade-off should be deliberate.
- If your catalog is open, is it actually optimized? The same tools AI agents would use to verify you are the tools we audit with. Run a free audit and see what they see.
Run your own audit
Free, no install, no signup. See what AI agents would see in your public catalog.