The first three makers are onboarded with a hand-rolled structured-data schema across furniture, pantry goods, and small-batch beauty. The two-year hypothesis test begins: a factory-direct schema for makers, owned by the seller, can outperform marketplace surfacing on per-effort discoverability for describably-distinct categories.
What this is actually testing
SPF predates the publication and its framework, but the test it runs is one the framework would have proposed. The structural argument is general: any system whose value flows through a closed legibility layer will, over time, extract rent from the participants on both sides of the layer. Marketplaces are one specific instance. They are not unique.
For independent retailers, the closed legibility layer is the marketplace itself — Amazon, Etsy, Shopify’s surfaces. The marketplace reduces a seller’s catalog into a schema it controls, then surfaces that catalog through ranking the seller does not control. Two layers of control, both held by an entity whose interests are not aligned with the seller’s.
The structural alternative is to make a seller’s data publicly legible at the source, in formats every reader — search engines, AI agents, comparison tools, third-party guides, the next generation of agentic commerce — can parse without paying the marketplace. The data sits at a URL the seller owns. There is no platform; therefore no rent.
How the test is designed
Independent makers
62 makers currently active, in three product categories: handmade furniture, specialty pantry goods, and small-batch beauty. All operate independently and have a marketplace presence to baseline against.
Owned product feeds
Each maker maintains a structured-data feed at their own domain, conforming to a public schema designed to be read by search, AI agents, and third-party tools without authentication.
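As a rough illustration of what one entry in such a feed could look like, the sketch below builds a product record and serializes it to JSON. SPF's own schema is not reproduced in this piece, so the sketch borrows the public schema.org `Product` vocabulary as a stand-in. The function name `build_feed_entry`, the maker, the SKU, the price, and the attribute names are all hypothetical.

```python
import json


def build_feed_entry(sku, name, maker, price, currency, attributes):
    """Return one product record as a JSON-serializable dict.

    Assumption: schema.org Product/Offer/PropertyValue types stand in
    for SPF's schema, which is not shown in the text.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "sku": sku,
        "name": name,
        "brand": {"@type": "Brand", "name": maker},
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
        },
        # Each distinguishing attribute becomes a name/value pair that a
        # search engine, agent, or comparison tool can read directly.
        "additionalProperty": [
            {"@type": "PropertyValue", "name": key, "value": value}
            for key, value in attributes.items()
        ],
    }


# Hypothetical pantry-goods example; none of these values come from SPF.
entry = build_feed_entry(
    sku="HONEY-001",
    name="Wildflower Honey",
    maker="Example Apiary",
    price="14.00",
    currency="USD",
    attributes={"origin": "single apiary", "harvest": "2025", "processing": "raw"},
)
feed_json = json.dumps([entry], indent=2)
```

Under this sketch, the maker would publish `feed_json` at a URL on their own domain, readable without authentication.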
The same products elsewhere
Identical SKUs listed on at least one marketplace, with identical pricing and copy. Discovery and conversion are measured side by side over rolling 12-week windows.
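One way the side-by-side comparison over rolling 12-week windows could be computed is sketched below. How SPF actually aggregates its data is not described here, so the function name `rolling_conversion`, the weekly `(discoveries, purchases)` input shape, and the sample counts are assumptions made for illustration.

```python
from collections import deque


def rolling_conversion(weekly_counts, window=12):
    """Conversion rate (purchases / discoveries) for each full rolling window.

    weekly_counts: iterable of (discoveries, purchases) tuples, one per week.
    Returns one rate per complete window; None where a window saw no discoveries.
    """
    buffer = deque(maxlen=window)  # holds only the most recent `window` weeks
    rates = []
    for discoveries, purchases in weekly_counts:
        buffer.append((discoveries, purchases))
        if len(buffer) == window:
            total_discoveries = sum(d for d, _ in buffer)
            total_purchases = sum(p for _, p in buffer)
            rates.append(
                total_purchases / total_discoveries if total_discoveries else None
            )
    return rates


# Invented weekly counts for the same SKU in both places, with a short
# window so the example stays readable.
owned_feed = [(100, 5), (100, 5), (100, 5), (100, 8)]
marketplace = [(400, 8), (400, 8), (400, 8), (400, 8)]
owned_rates = rolling_conversion(owned_feed, window=3)
marketplace_rates = rolling_conversion(marketplace, window=3)
```

Pooling counts across the window, rather than averaging weekly rates, keeps a low-traffic week from swinging the result.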
The design is intentionally narrow. SPF doesn’t claim that every seller benefits from leaving a marketplace; it claims that describably-distinct sellers — those whose differentiation can be expressed in attributes rather than imagery alone — can outperform on a per-unit-effort basis. The test holds that variable constant by limiting participation to sellers in that population.
What is being measured
Four measurements form the test:
- Discovery-to-purchase conversion, for identical products published in both places.
- Time-from-publish to first agent reference, where an agent is any external system (search engine, AI assistant, third-party tool) that cites the product.
- Per-seller take-home margin on the same product, after marketplace fees on one side and SPF’s flat per-seller cost on the other.
- Time-to-first-sale on a new SKU, which proxies how quickly the data routes new inventory to the right buyers.
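The take-home margin comparison reduces to simple arithmetic, sketched below. Every number is invented for illustration: the price, unit cost, 15% fee, and flat cost are not SPF's figures, and the function names are hypothetical. The sketch also assumes the same number of units sells on both sides, which the experiment itself does not assume. Amounts are integer cents to avoid floating-point rounding.

```python
def marketplace_take_home(price_cents, unit_cost_cents, fee_bps, units):
    """Margin after a percentage marketplace fee (fee_bps = basis points of price)."""
    fee_cents = price_cents * fee_bps // 10_000
    return (price_cents - unit_cost_cents - fee_cents) * units


def owned_feed_take_home(price_cents, unit_cost_cents, flat_cost_cents, units):
    """Margin after a flat per-seller cost for the period, independent of volume."""
    return (price_cents - unit_cost_cents) * units - flat_cost_cents


def break_even_units(price_cents, fee_bps, flat_cost_cents):
    """Smallest unit count at which cumulative marketplace fees cover the flat cost."""
    fee_cents = price_cents * fee_bps // 10_000
    if fee_cents <= 0:
        raise ValueError("fee per unit must be positive to break even")
    return -(-flat_cost_cents // fee_cents)  # ceiling division


# Illustrative inputs: $40.00 item, $15.00 unit cost, 15% fee, $50.00 flat cost.
on_marketplace = marketplace_take_home(4000, 1500, 1500, units=20)  # 38000 cents
on_owned_feed = owned_feed_take_home(4000, 1500, 5000, units=20)    # 45000 cents
units_needed = break_even_units(4000, 1500, 5000)                   # 9 units
```

The break-even count shows why volume matters: below it, a flat cost leaves the seller worse off than a percentage fee would.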
None of the measurements is a vanity metric — none is “traffic” or “engagement” or “follower count.” Each is tied directly to whether the seller ended up with more money, with the same effort.
The results so far come from eight months of data across the 62 active makers. They are real, but the sample is narrow: 62 makers across three categories is enough to spot a signal, not enough to claim that the signal generalizes. The next year of the experiment is mostly about widening the test.
Field notes
When the experiment started, the assumed primary reader of an SPF feed was search engines. Search is still important. The surprise is how quickly AI assistants (Claude, ChatGPT, Perplexity) have become substantial referrers. The feeds were already structured the way those assistants want to read; we did not redesign for them.
I expected the strongest results in categories where the seller has the most to say about each product, meaning furniture, with its long copy and material distinctions. Instead, pantry goods, where the copy is shorter, have outperformed furniture. Hypothesis: structured data favors items where attributes are discriminating, not items where they are merely numerous.
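One possible way to make "discriminating" measurable is to ask how much an attribute's values vary across a catalog. The sketch below uses Shannon entropy as a proxy: an attribute that takes the same value on every product scores zero bits, while one that splits the catalog evenly scores high. This is an assumption made for illustration, not a measure the experiment is known to use, and the catalog, attribute names, and function names are hypothetical.

```python
from collections import Counter
from math import log2


def attribute_entropy(products, attribute):
    """Shannon entropy, in bits, of one attribute's values across a catalog.

    products: list of dicts mapping attribute name to a hashable value.
    Products that lack the attribute are ignored.
    """
    values = [p[attribute] for p in products if attribute in p]
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return sum((c / total) * log2(total / c) for c in counts.values())


def discriminating_attributes(products, min_bits=1.0):
    """Attributes whose entropy meets the threshold, most discriminating first."""
    names = {name for p in products for name in p}
    scored = [(name, attribute_entropy(products, name)) for name in names]
    return sorted(
        [(name, bits) for name, bits in scored if bits >= min_bits],
        key=lambda pair: (-pair[1], pair[0]),
    )


# Invented four-product catalog: "origin" separates every product,
# "roast" splits them in half, "organic" never varies.
catalog = [
    {"origin": "Ethiopia", "roast": "light", "organic": True},
    {"origin": "Colombia", "roast": "light", "organic": True},
    {"origin": "Sumatra", "roast": "dark", "organic": True},
    {"origin": "Kenya", "roast": "dark", "organic": True},
]
ranked = discriminating_attributes(catalog, min_bits=1.0)
```

Under this proxy, an attribute that is present on every product but never varies contributes nothing, which matches the intuition that numerous attributes are not the same as discriminating ones.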
One maker has dropped out. Not because SPF didn’t work, but because the marketplace they were leaving sued them, and the cost of fighting the suit was higher than the SPF margin uplift. Filed under: the closed layer extracts rent even on the way out. Predictable in retrospect; should have been planned for.
Three months in, two large marketplaces began offering “structured catalog export” features that look unmistakably like SPF’s schema, in a clear attempt to make leaving the marketplace less attractive. This is, in a perverse way, the strongest signal yet — they are now treating the absence of structured exports as a competitive vulnerability.
Three makers, three categories, a hand-rolled schema, and an unreasonably long Saturday. The hypothesis is on paper. The test will run for at least two years.
What this is teaching
The findings below are provisional. The experiment is not concluded. Each of the lessons here is what I currently believe; each is also something I would revise without much resistance if a year of additional data disagreed.
- The structural critique generalizes. Marketplaces are not the only domain where a closed legibility layer extracts rent on top of value created elsewhere. The same shape recurs in publishing, in talent matching, in finance, in employment platforms. SPF is the small, cheap, falsifiable version of a much larger claim.
- Structured data favors discriminating attributes, not numerous ones. The category that performs best is the one where each product has 3–5 attributes that materially distinguish it. Where every product has 30 attributes and only one of them matters, the structure is wasted.
- Agents are now the primary reader, faster than expected. A market for products that are legible to agentic systems is forming earlier than the experiment assumed. The right way to design for “search” in 2026 is to design for AI assistants first and search engines as a secondary reader.
- Exit costs are part of the test, not a separate problem. The maker’s departure wasn’t a failure of the SPF hypothesis; it was a failure of the experiment to model the cost of leaving a closed system. The cost of departure is itself one of the rents the closed system extracts. Future iterations will factor it in.
Each of these lessons connects to a forthcoming essay. The cross-references are not coincidence; they are the publication doing its job.
What's next
Three moves are queued, in this order. First: widen the categories. The current three are too narrow to support generalization. The plan is to add two more in the second half of 2026, leaning toward apparel and small-format consumer electronics, both of which have notably distinct attribute structures.
Second: build the exit-cost model. The maker who left taught us that leaving a marketplace is itself a separate cost the experiment should model. Future intake will price that cost into the SPF onboarding decision rather than discover it after the fact.
Third: publish the schema. SPF’s structured-data format has been refined through eight months of use. It is ready to be opened to anyone, including makers who never join the program, on the theory that the schema’s value is in being adopted, not in being controlled.
The last move — opening the schema — is the structural test of whether we are willing to live by the same critique we make of marketplaces.