26+ years across marketing and engineering. I build custom, production-grade datasets for teams that are too busy to build them in-house. Common Crawl extraction, SERP analysis, competitive intelligence — whatever you need, delivered ready to use.
Your team is busy. I take on the complex data work — web scraping, parsing, enrichment, analysis — and deliver production-ready datasets.
Working with Me →
The complete Common Crawl web graph history in SQLite. Every domain and hostname, with normalized PageRank and Harmonic Centrality scores.
Buy the Database →
No API key. No registration. Just hit the endpoint. 100 free lookups per day — send hostnames or domains, get historical metrics back instantly.
Search Now →
Who This Is For
You know the data exists. You know it can be extracted. You just don't have the time or the team to do it yourself.
SEO, analytics, competitive intelligence
Your engineers are busy. Your analysts are buried. You need someone who understands both marketing and engineering — and can take a data project from concept to deliverable without hand-holding.
SEO, marketing, digital, PR
Your client needs data you can't build in-house. I build it, you deliver it. Clean handoff, no drama. You set your own margin.
Case Studies
Every project is different. Here are the kinds of work that come through the door most often.
Parse and analyze pages across Common Crawl releases at scale. Extract structured data from HTML, URLs, tags, and page elements across billions of records.
Run hundreds of thousands of search queries, downloading and analyzing every ranked URL. Contact info, partnership opportunities, competitive gaps — extracted and structured.
Track competitors across web properties, search results, and market signals. Build databases your team can search and act on immediately.
Monitor search results, news, RSS feeds, and web mentions for brand terms, product names, executives, and competitors. Near-real-time, highly thorough.
Build searchable databases of sponsorship opportunities, link prospects, local partnerships, and outreach targets — nationwide or by specific location.
Phone numbers, emails, social accounts, named entities — extracted, validated, and delivered in your preferred format. CSV, JSON, SQLite, Excel.
13+ years in marketing. 13+ years in engineering. 26+ years of combined experience building datasets, managing teams, and solving hard data problems for companies of every size.
Most marketers can't engineer complex data systems. Most engineers don't understand marketing well enough to build the right thing. I've spent over a decade in each discipline, and that combination is rare.
On the marketing side, I've directed teams of 70–80 people responsible for over 1,400 SEO client accounts. I've led international SEO campaigns spanning 30–40 countries for companies including a major home improvement retailer. I was the weekly point of contact for Fortune 500 accounts. I grew a company from zero to $140K+/month in revenue in under a year as VP of Operations. I've run SEO and PPC campaigns for small businesses, managed agency teams, and built marketing strategy at every scale.
On the engineering side, I've built large-scale web scraping and indexing systems that process billions of records. I wrote a marketing SaaS platform from scratch in pure C that could download and parse up to 400 million URLs per day. I've built custom databases, data pipelines, sharded storage systems, firmware for embedded devices, cross-platform networking libraries, and the complete Common Crawl web graph history — over 850 billion records compiled into queryable SQLite databases.
When you describe a data problem to me, I don't just understand the technical requirements — I understand the marketing objective behind it. I know what you're trying to accomplish, why it matters, and how the data needs to be structured so your team can actually use it. That's what 26 years across both disciplines gives you.
Every project starts with a conversation about what you're trying to accomplish — not a requirements document or a feature list. I want to understand the business objective. Once I understand the goal, I scope the work, define exactly what you'll receive, and quote a fixed price. No hourly billing surprises. No scope creep. You know what you're getting and what it costs before anything starts.
From there, I build. You get regular updates with working deliverables — not status reports, not slide decks. Actual data you can look at. If something needs to change mid-project, we talk about it and adjust. I'm direct about what I can and can't do. If I think something won't work, I'll tell you before I waste your time or money on it.
Deliverables are production-grade. You get the dataset in whatever format your team needs — CSV, JSON, Excel, SQLite — along with schema definitions, field-level documentation, notes on assumptions and edge cases, and QA methodology. Everything is clean, documented, and ready to plug into your workflows.
Here's something I've learned from years of doing this work: once people see their data for the first time, they almost always want it differently than they originally described. That's not a failure of scoping — it's how data projects actually work. You don't fully know what you need until you see what's possible.
Iterations and revisions are baked into every project I take on. The price I quote assumes we're going to go back and forth. I expect it. I'll refine the deliverable until it's right. If you ask for something that's outside the scope, I'll tell you — but within the scope of the project, I'm not going to nickel-and-dime you on changes. The goal is a deliverable your team actually uses, not one that technically meets a spec but sits in a folder.
Schedule a Call
FAQ
Whatever works for your team. CSV, JSON, Excel, SQLite — I deliver in your preferred format with field-level documentation and schema definitions. The web graph databases are SQLite.
Almost always fixed price. I scope the project, define clear deliverables, and quote a number before work begins. Revisions and iterations are included — we go back and forth until it's right.
It depends entirely on the project. There's no single answer. Some are a few weeks, some are longer. I'll give you a realistic timeline during the scoping conversation.
That's expected. Once people see their data, they almost always want adjustments. Iterations are baked into every project. If you need something I can't provide, I'll tell you upfront.
The complete Common Crawl web graph history, compiled into SQLite databases. Search any domain or hostname and get full historical metrics including custom normalized PageRank and Harmonic Centrality scores (0–100).
Yes. No API key, no registration, no credit card. 100 hostnames per day, up to 10 per request. Just hit the endpoint and get data back. Resets every 24 hours.
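The per-request and per-day limits above are easy to respect with a little batching. Here's a minimal sketch in Python; the endpoint URL and the `hosts` query parameter are placeholders for illustration, not the real API address or parameter names.

```python
# Sketch: split a hostname list into GET requests of at most 10 hostnames
# each, matching the free tier's 10-per-request limit.
# NOTE: API_URL and the "hosts" parameter are hypothetical placeholders.
from urllib.parse import urlencode

API_URL = "https://example.com/lookup"  # placeholder, not the real endpoint

def build_requests(hostnames, batch_size=10):
    """Return one request URL per batch of at most `batch_size` hostnames."""
    urls = []
    for i in range(0, len(hostnames), batch_size):
        batch = hostnames[i:i + batch_size]
        urls.append(f"{API_URL}?{urlencode({'hosts': ','.join(batch)})}")
    return urls

# 25 hostnames -> 3 requests (10 + 10 + 5), well under the daily 100-lookup cap
urls = build_requests([f"site{n}.example" for n in range(25)])
```

Looping over the returned URLs with any HTTP client then stays inside both limits by construction.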
Trusted by
Tell me what you're working on. I'll let you know if I can help and what it would cost. No pitch, no pressure — just a direct conversation about your project.
Contact form coming soon. For now, send me an email with a brief description of your project.
Web Graph Database
The raw Common Crawl web graph data is a collection of tab-separated value files you have to download and compile yourself. No database. No search. Just raw data and a lot of work.
I've done that work for you. Every release, every domain, every hostname — compiled into SQLite databases with instant key-value lookup. Search any domain or hostname and get its complete history across every metric, every crawl.
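To make the compile step concrete, here's a toy sketch of the idea: tab-separated web-graph rows loaded into SQLite with the hostname as the primary key, so lookups are instant. The column names and file layout here are illustrative assumptions, not the product's actual schema.

```python
# Illustrative only: two fake TSV rows standing in for files that,
# in the real dataset, hold billions of records.
import sqlite3

tsv = "example.com\t4.2e-06\t21.7\ncommoncrawl.org\t9.8e-06\t24.3"

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE hosts (
    hostname TEXT PRIMARY KEY,  -- indexed key gives key-value-style lookup
    pagerank REAL,
    harmonic REAL)""")
for line in tsv.splitlines():
    host, pr, hc = line.split("\t")
    db.execute("INSERT INTO hosts VALUES (?, ?, ?)",
               (host, float(pr), float(hc)))

# Constant-time lookup by hostname via the primary-key index
row = db.execute("SELECT harmonic FROM hosts WHERE hostname = ?",
                 ("commoncrawl.org",)).fetchone()
```

The primary-key index is what turns a pile of flat TSV files into something you can query by name without scanning anything.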
Yearly subscriptions available — updated with each new Common Crawl release.
View Full Pricing
Updates
I continuously process new Common Crawl releases and improve the dataset.
Latest Common Crawl release processed and added to all database products. API updated.
Refined 0-100 scoring algorithm for better distribution across the range.
All API plans now include 2x the previous rate limits at no additional cost.
Reduced file sizes by 15% through improved compression without data loss.