List-Building Audit: What We're Overthinking

Top-to-bottom review of the signal engine and target-list approach. Written 2026-04-17 after a day of signal design, dry-runs, and bouncing questions.


1. The single most important insight

We are trying to invent signals that detect companies the industry has already pre-qualified.

Manufacturers, trade associations, and state agencies already publish lists that filter for revenue, reputation, and operating scale. Starting from those lists is 10–100× cheaper than scoring the whole SMB universe and guessing.

For Design Precast specifically, the NPCA producer member directory is the list. Everything else is a re-derivation.


2. What we are making too hard

| Overthinking | What we should do instead |
| --- | --- |
| Building custom gatherers for every signal | Start with pre-qualified lists, only score inside them |
| Scoring every company on fit | Assume fit; score only on sell-intent |
| Inventing sell-intent proxies from web scrapes | Use real trigger events: SBA loan defaults, probate, license lapses |
| Verifying $15M–$50M revenue from weak proxies | Use dealer/association membership as a revenue floor |
| Paid vs. free signal debates | Budget-first; let Claude pick from a menu |
| Building the signal registry from scratch | Import PCI / NPCA / ENR / state DOT lists as "seed signals" |

3. Published marquee lists we are not using

General construction / trades (cross-vertical)

Precast / concrete (Design Precast's own world)

Manufacturer authorized dealer / certified contractor programs

(Inclusion usually implies a revenue floor, credit check, insurance minimums, and training investment, which pre-qualifies the business.)

Trade association member directories

State & federal public data

Sell-intent triggers (real, not inferred)


4. Cheap paid data that skips half our work

| Source | Roughly | What it gives you |
| --- | --- | --- |
| Data Axle / InfoUSA | ~$1–2k flat for 10k companies | NAICS + revenue + employees + owner name + phone. One-shot lists. |
| Reference Solutions (public library card) | Free via many libraries | Same data as Data Axle, free for residents. Most under-used asset we have. |
| IBISWorld industry report | ~$900/report | Names the top 5–10 companies in each industry with revenue. |
| BizMiner vertical report | ~$200 | Same idea, deeper financial benchmarks. |
| Apollo.io | $49–99/user/mo | Contact data, revenue ranges, tech stack. |
| Crunchbase Pro | $49/mo | PE ownership flag (solves one of the Y/N questions outright). |
| LinkedIn Sales Navigator | $99/mo | Headcount, hiring velocity, owner tenure. |
| D&B Hoovers | $75–200/mo | Revenue ranges for private SMB. |
| Blue Book Network (construction) | varies | Named specialty contractors by region, with scope & history. |
| SafeGraph POI | free sample tier | Foot traffic for retail-ish verticals. |
| OpenCorporates | free → paid API | Entity graph, officers, filings; great for PE roll-up detection. |

The punchline: a $2,000 Data Axle pull for "precast concrete manufacturers, USA, 20+ employees, independently owned" probably beats 100 hours of gatherer-building.


5. What's missing from our current thinking

  1. Owner demographics > firmographics for sell-intent. Age of owner, years in business, LinkedIn tenure: these predict sale readiness far better than web signals. Voter registration records give age for free in many states.
  2. PE/search-fund exclusion is a data pull, not a signal. Crunchbase + PitchBook Basic answers this deterministically. Stop trying to infer it.
  3. The "marquee" shortcut. If a company shows up on any national trade award list, assume it is >$15M. Period. Save the signal spend.
  4. Vertical partner supplies the seed list. Cress (CII) already knows every precast producer worth a letter. The engine's job is to rank and enrich his list, not discover one.
  5. The DNC list is a data model, not an afterthought. One table, globally enforced. Every tool writes to it, every query reads from it.
  6. Distributor relationships. Ferguson, HD Supply, Winsupply, White Cap, Hajoca: their top-tier customer lists are gold, and sometimes accessible through a vendor partnership or rep-firm introduction. Not scrapable; relational.
  7. Surety bonding capacity. Bonded capacity is public for government-work contractors and scales directly with revenue. Surety agents publish client lists as marketing.
  8. Franchise FDDs. Every franchise system files public FDDs listing every franchisee with contact info. For restoration, pest, lawn, auto-glass, etc., the list already exists.
  9. Buying groups and co-ops. PRO Group (outdoor power), Distribution America, Affiliated Distributors β€” membership = scale threshold.
  10. Apprenticeship program participants. Listed by state DOL. If a company has registered apprentices, they have real employees and a multi-year horizon.
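
Item 5 above (the DNC list as a data model) could look roughly like the sketch below: one table, one write path, and one check that every query goes through. It uses SQLite only for illustration. The table name `dnc`, the helper names, and the choice to key on a normalized company name are all assumptions, not an existing schema.

```python
import sqlite3

def normalize_key(name: str) -> str:
    """Hypothetical normalizer: lowercase, keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def open_dnc(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the single DNC table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dnc ("
        " company_key TEXT PRIMARY KEY,"
        " reason TEXT NOT NULL,"
        " added_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn

def add_to_dnc(conn: sqlite3.Connection, name: str, reason: str) -> None:
    # INSERT OR IGNORE keeps the first reason if a tool re-adds the same company.
    conn.execute(
        "INSERT OR IGNORE INTO dnc (company_key, reason) VALUES (?, ?)",
        (normalize_key(name), reason),
    )
    conn.commit()

def is_blocked(conn: sqlite3.Connection, name: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM dnc WHERE company_key = ?", (normalize_key(name),)
    ).fetchone()
    return row is not None

def filter_allowed(conn: sqlite3.Connection, names: list) -> list:
    """Every outbound list passes through here before anything is sent."""
    return [n for n in names if not is_blocked(conn, n)]
```

Keying on a normalized name is the weakest part of this sketch; a domain or a state registration number would be a sturdier key if one is available.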

6. Reshape of the approach (proposed)

Phase 1 - Seed (per vertical, one-time):

  1. Buy / pull the one canonical association directory (NPCA for precast, NRCA for roofing, etc.).
  2. Cross-reference with state contractor license DB + state DOT pre-qualified list.
  3. De-dupe, geocode, attach NAICS. This is the base universe. ~500–5,000 companies per vertical.
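
The cross-reference and de-dupe steps above might be sketched as follows. This is a minimal illustration, assuming each source is a list of records with `name` and `state` fields; the suffix list and the `match_key` helper are hypothetical, and geocoding and NAICS attachment are left out.

```python
import re

# Assumed set of legal suffixes to ignore when matching company names.
_SUFFIXES = {"inc", "llc", "co", "corp", "company", "ltd"}

def match_key(name: str, state: str) -> tuple:
    """Build a comparison key: lowercase words minus legal suffixes, plus state."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    core = [w for w in words if w not in _SUFFIXES]
    return (" ".join(core), state.upper())

def merge_sources(*sources) -> list:
    """Merge (label, rows) pairs into one list, recording which sources saw each company."""
    merged = {}
    for label, rows in sources:
        for row in rows:
            key = match_key(row["name"], row["state"])
            entry = merged.setdefault(
                key,
                {"name": row["name"], "state": row["state"].upper(), "sources": []},
            )
            if label not in entry["sources"]:
                entry["sources"].append(label)
    return list(merged.values())
```

A company that appears in more than one source ends up with several labels in `sources`, which is itself a cheap confidence signal that it belongs in the base universe.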

Phase 2 - Exclude (cheap, deterministic):

  1. PE-backed → Crunchbase lookup, auto-exclude.
  2. DNC list → auto-exclude.
  3. Recently acquired (last 24 mo) → Crunchbase / Axial / PitchBook, auto-exclude.
  4. Public company → skip.
  5. <$15M revenue proxy (employees < 15) → skip or defer.
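
Because these exclusions are deterministic, they fit in a single function that returns the first rule a company trips. The sketch below assumes the lookups have already been done and stored on each record. The `Company` fields are hypothetical names, and 730 days is used as an approximation of 24 months.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Company:
    name: str
    pe_backed: bool = False              # from a Crunchbase-style lookup
    on_dnc: bool = False                 # from the DNC list
    acquired_on: Optional[date] = None   # last known acquisition date, if any
    is_public: bool = False
    employees: Optional[int] = None      # None means unknown

def exclusion_reason(c: Company, today: date) -> Optional[str]:
    """Return the first exclusion rule that applies, or None to keep the company."""
    if c.pe_backed:
        return "pe_backed"
    if c.on_dnc:
        return "dnc"
    if c.acquired_on is not None and (today - c.acquired_on).days <= 730:
        return "recently_acquired"
    if c.is_public:
        return "public"
    # The memo's revenue proxy: fewer than 15 employees means skip or defer.
    if c.employees is not None and c.employees < 15:
        return "below_revenue_proxy"
    return None

def apply_exclusions(companies: list, today: date) -> tuple:
    """Split companies into (kept, dropped), where dropped holds (company, reason) pairs."""
    kept, dropped = [], []
    for c in companies:
        reason = exclusion_reason(c, today)
        if reason is None:
            kept.append(c)
        else:
            dropped.append((c, reason))
    return kept, dropped
```

Recording the reason alongside each dropped company keeps the "skip or defer" cases easy to revisit later. Companies with an unknown headcount are kept rather than dropped, which is a design choice in this sketch.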

Phase 3 - Rank (signal engine, but on a small universe):

  1. Owner age / tenure (sell-intent).
  2. Multi-generational flag.
  3. Geographic adjacency to acquirer.
  4. Trade/product adjacency.
  5. Trigger events (probate, distress, license lapse).
  6. Marquee status (award lists) → fit confidence booster.
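
One simple way to combine the six factors above is a weighted sum over pre-scaled values. The sketch below is only an illustration. The memo names the factors but not their relative importance, so the weights are placeholders, and it assumes each factor has already been scaled to 0..1 with higher meaning more attractive.

```python
# Hypothetical weights; the memo lists the factors but does not weight them.
WEIGHTS = {
    "owner_age_tenure": 0.30,
    "multi_generational": 0.10,
    "geo_adjacency": 0.20,
    "trade_adjacency": 0.15,
    "trigger_event": 0.20,
    "marquee": 0.05,
}

def score(features: dict) -> float:
    """Weighted sum of factor values, each clamped to the range 0..1."""
    total = 0.0
    for factor, weight in WEIGHTS.items():
        value = min(max(features.get(factor, 0.0), 0.0), 1.0)
        total += weight * value
    return total

def rank(companies: list, top_n: int = 50) -> list:
    """companies is a list of (name, features) pairs; returns the top_n names, best first."""
    ordered = sorted(companies, key=lambda item: score(item[1]), reverse=True)
    return [name for name, _ in ordered[:top_n]]
```

Missing factors count as zero here, so a company with no owner data is ranked low rather than dropped. Whether that is the right default is a judgment call for the team.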

Phase 4 - Enrich only the top N: Only after the list is ranked and sliced to ~50, run expensive enrichment (Exa, LinkedIn scraping, phone append). Stops us spending $3.50 × 5,000 companies.
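
The arithmetic behind that saving, using the memo's own figures ($3.50 per company, a 5,000-company universe, a top-50 slice), works out as below. The function name is illustrative only.

```python
COST_PER_COMPANY = 3.50  # per-company enrichment cost quoted in the memo

def enrichment_cost(n_companies: int, cost_each: float = COST_PER_COMPANY) -> float:
    """Total enrichment spend for a list of n_companies."""
    return n_companies * cost_each

full = enrichment_cost(5_000)  # enrich the whole seeded universe
top = enrichment_cost(50)      # enrich only the ranked top 50

print(f"full universe: ${full:,.2f}")        # full universe: $17,500.00
print(f"top 50 only:   ${top:,.2f}")         # top 50 only:   $175.00
print(f"saved:         ${full - top:,.2f}")  # saved:         $17,325.00
```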


7. The single cheapest thing we can do this week

Pull the NPCA producer member directory and the state DOT pre-qualified precast supplier lists for MS, LA, AL, GA, SC, NC, TN, FL. De-dupe. That is the Design Precast universe, roughly 200–400 companies by hand count, and the engine's first real job is just ranking that list.

No $3.50/company gatherers needed. No 33-signal battery. Just: which 30 of these 300 are most likely to sell in the next 18 months, and which are closest to Design Precast geographically and operationally?

That is a solvable problem with the data already in hand plus one afternoon of public-list fetching.


8. Questions back to Mark and Ewing

  1. Is there budget (~$2k) for a Data Axle / Reference Solutions pull per active vertical? It likely replaces 80% of the gatherer work.
  2. Do we have (or can Cress provide) the NPCA producer list for precast right now?
  3. Should the engine's v1 output be a ranked version of Cress's seed list, not an independently-generated target set?
  4. Do we want the DNC list + PE-backed exclusion list as the first two database tables built this week, before any more signals?
  5. Which vertical partner (after CII) has a canonical directory we can seed from: HR.com for HR services? A roofing partner? This shapes the pattern.
