Data Leverage Digital Garden

This site hosts longer newsletter-style posts and short “focused idea” and “reaction” posts. All of them relate to data leverage: how people contribute to AI systems, how that dependence creates power, and which institutions could return some of the value.

Start here

About

The Long Posts section collects longer blog posts, including pieces already published on Substack and early drafts. I plan to continue cross-posting all longer posts to Substack, Leaflet, and this site. Focus Posts are relatively short explorations of a single concept that I think is important. Short Reactions mainly respond to a news hook. Finally, Meta Notes are short posts, close to “notes to self”, about my blogging pipeline and the goals of blogging.

The Extras tab has some fun, mostly vibe-coded analytics on words, phrases, and outbound links. I also keep a registry of other cross-posted copies there.

Long Posts

37 total

AI Dividends Without Taxing Compute, Automation, or Equity: A Presumptive Commons-Rent Tax Based on Capabilities and Data Dependence
2026-06-07 Long Posts Leaflet copy

A proposal for updating data dividend ideas around capability measurement, data provenance, and AI auditing.

The AI "Evaluation Crisis" Is an Opportunity to Get Data Flow Right
2026-04-30 Long Posts Leaflet copy

Why the AI evaluation crisis could force a reckoning on dataset provenance, attribution, and consent.

Attestation across the AI Supply Chain
2026-04-08 Long Posts Leaflet copy

A proposal for interoperable attestation objects that connect training data, evaluation labor, and AI-generated outputs across the AI supply chain.

AI is driving the cost of polish down; some musings on fancy versus terse artifacts
2026-04-01 Long Posts Leaflet copy

AI progress means the "polish" of a figure or website no longer proxies for quality. Can we try to turn this into a good thing for curation, attention allocation, and even AI progress itself?

A Short Guide to Data Strikes and Conscious Data Contribution in the Context of 2026 Frontier AI
2026-03-03 Long Posts Leaflet copy

Back to the basics of data leverage.

The Paradox of Reuse in 2026: A Case of Quasi-Enclosure, or "Subsidized Club Goods that Sort of Look Like Public Goods"
2026-02-17 Long Posts Leaflet copy

How we can understand, and react to, the complicated impacts of AI systems on online communities and knowledge commons.

The Coding Agent Data Deal
2026-01-12 Long Posts Leaflet copy

On user data control, coding agents as retrievers, and the value of your coding transcripts.

Coding agents are (1) a big deal, (2) very relevant to data leverage, and (3) able to help build tools that support data leverage!
2026-01-05 Long Posts Leaflet copy

Sharing an early reaction to recent coding agent discourse and two relevant projects.

Focus Posts

4 total

Short notes built around one claim I want to make precisely.

Using AI kind of involves interacting with other people
2026-07-08 Focus Posts Public copy

A focused note on generative AI use as mediated interaction with upstream human contributors.

N-gram search as posterior updating for data attribution
2026-05-27 Focus Posts Leaflet copy

A short argument that model outputs and training-data priors can improve rough approximations of data attribution.

Augmentation is a data flow problem
2026-05-26 Focus Posts Leaflet copy

A short argument that "augment, do not replace" is about data control.

AI progress as quasi-public good production
2025-05-30 Focus Posts Leaflet copy

A short argument that AI systems pool diffuse human work into privately governed, public-good-like model weights.

Short Reactions

5 total

Responses to a paper, event, or claim while it is still live.

Data labeling work should be dignified, not dismissed
2026-06-20 Short Reactions Public copy

Reacting to the Meta data labeling reassignment story as a data bargain problem rather than only a status story.

"People First" Policy Ideas that Complement Each Other (through better data flow)
2026-04-06 Short Reactions Leaflet copy

Reacting to a wide-ranging set of policy ideas from OpenAI.

Two natural allies of a "Data Transparency" agenda: capabilities forecasters and social simulators
2026-03-09 Short Reactions Leaflet copy

Making an "if you like X, you might want to support Y" argument for data-focused policy.

[microblog] One book is worth "0.06%" benchmark points to AI; is "no different from noise". What gives?
2025-04-21 Short Reactions Leaflet copy

Commenting on recent coverage of, and discussion about, Meta's arguments about training data value quantification.

Perplexity CEO's Interaction with Striking New York Times Workers Does Not Reflect Well on the AI Industry
2024-11-09 Short Reactions Leaflet copy

The idea that data-dependent AI systems are ready and willing to crush any leverage from knowledge workers is unlikely to make the AI industry look good to the public.

Meta Notes (About the Blogging Pipeline Itself)

2 total

How I am publishing right now
2026-05-28 Meta Notes Leaflet copy

A rough map of my current local Markdown to Leaflet, Substack, and social-post workflow.

April 2026 small points
2026-04-28 Meta Notes Leaflet copy

A running collection of short ideas about sovereign AI, augmentation, evaluation labor, data pipelines, and quasi-enclosure.