How We Built a B2B Lead Generation Machine

There is a painful irony at the heart of most B2B sales operations. The people you've hired to sell — to build relationships, run demos, negotiate deals — spend the majority of their time not selling at all. They spend it finding people to sell to.

Widely cited industry surveys suggest that sales representatives spend as little as 30% of their working week on actual selling activities. Much of the rest is prospecting: manually searching for companies that fit the target profile, hunting down the right contact names and email addresses, checking whether the information is current, and entering it all into a system. This work is mechanical, repetitive and, critically, largely automatable.

This is the problem a B2B SaaS company came to us with. Their product served operations managers at mid-market logistics and supply chain businesses in the UK and Europe. They had a well-defined ideal customer profile, a strong product, and a sales team that was genuinely good at closing — when they got in front of the right people. The problem was getting there. Their sales reps were spending an estimated 60% of their week on prospecting activities that weren't generating enough pipeline to meet targets.

We built them a lead generation pipeline that delivered 10,000 verified, enriched contacts — matched to their exact ideal customer profile — in six weeks. This post is a full account of how it worked.

💡 The scale of the problem: Widely cited figures suggest B2B sales reps spend as little as 30% of their week on actual selling, with the rest going to admin and prospecting. Meanwhile, by some estimates around 80% of B2B leads generated by volume-focused tactics never convert, so the quality of the list matters as much as its size. The pipeline we built was designed around quality first.

Step 1: Defining the Target Before Building Anything

The most common mistake in lead generation projects — automated or manual — is starting with data collection before the target is precisely defined. A vague ideal customer profile produces a large list of poorly matched contacts. A precise one produces a smaller list of contacts that actually convert.

We spent the first week not writing code, but working with the client's sales team to sharpen their ideal customer profile into something specific enough to act on programmatically.

The starting definition was something like: "operations managers at logistics companies in the UK and Europe." Useful as a direction, but not specific enough to build a pipeline from. By the end of the week, we had something we could actually target:

Company size: 50–500 employees — large enough to have budget, small enough to not require a six-month procurement process
Industry: Third-party logistics providers, freight forwarders, last-mile delivery companies, and supply chain consultancies
Geography: UK, Germany, Netherlands, France — the markets where the product had existing traction and localised support
Signals of fit: Companies actively hiring in operations roles (a reliable indicator of growth and operational complexity), companies that had recently expanded their warehouse footprint, companies that had raised funding in the past 18 months
Contact persona: Operations Director, Head of Operations, Supply Chain Manager, Logistics Manager — decision-makers or strong influencers, not junior coordinators

This precision matters more than it might seem. The difference between targeting "logistics companies" and targeting "third-party logistics providers with 50–500 employees that are actively hiring in operations roles" is the difference between a list that wastes your sales team's time and one that generates conversations.
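To make "specific enough to act on programmatically" concrete, here is a minimal sketch of how a profile like the one above could be written down as data. It is an illustration rather than the client's actual configuration: the class name, field names, country codes and exact industry strings are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdealCustomerProfile:
    """Hypothetical encoding of the profile described above."""
    min_employees: int = 50
    max_employees: int = 500
    # Industry labels are illustrative; a real pipeline would map
    # source-specific categories onto a list like this.
    industries: frozenset = frozenset({
        "third-party logistics",
        "freight forwarding",
        "last-mile delivery",
        "supply chain consultancy",
    })
    # UK, Germany, Netherlands, France (codes are an assumption).
    countries: frozenset = frozenset({"UK", "DE", "NL", "FR"})

    def matches_company(self, employees: int, industry: str, country: str) -> bool:
        """True when a company falls inside the size, industry and geography bounds."""
        return (
            self.min_employees <= employees <= self.max_employees
            and industry.lower() in self.industries
            and country.upper() in self.countries
        )


icp = IdealCustomerProfile()
print(icp.matches_company(120, "Freight Forwarding", "de"))  # inside all three bounds
print(icp.matches_company(20, "freight forwarding", "DE"))   # too small
```

Writing the profile down this way forces every criterion to be testable, which is the point of spending the first week on definition.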

Step 2: Identifying the Right Data Sources

With a precise target definition in hand, the next question is: where does the data live? For B2B lead generation, publicly available data is spread across several source types, each with different strengths and different technical characteristics.

For this project, we identified four primary source types:

Business directories and company databases: Industry-specific directories — freight association member lists, logistics trade body directories, sector-specific business registries — are gold mines for well-targeted company lists. Many of these are publicly accessible and updated regularly by the organisations that maintain them. We identified 14 relevant directories across the four target geographies.

Job posting platforms: Job postings are one of the most underused signals in B2B prospecting. A company posting three operations manager roles is telling you something important: they're growing, they have budget, and they have operational complexity. We scraped relevant job postings weekly to identify companies that were actively hiring in target roles, giving us a timely signal of fit that a static database can't provide.

Company websites: For contact information, company websites remain one of the most reliable sources. "About" pages, team pages, and leadership pages frequently contain names, titles, and sometimes direct contact details for the people we were targeting.

Professional networking platforms: Publicly available professional profile data provides the most reliable source of current job titles and employment status — critical for ensuring that the contact you're reaching out to actually still holds the role you're targeting. People change jobs. A static database goes stale quickly; real-time scraping of public professional data keeps the list current.

Step 3: Building the Pipeline

The pipeline has four stages: collection, extraction, enrichment, and verification. Each runs independently, stores its output before passing it downstream, and fails loudly if something goes wrong.
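The "store, then pass downstream, and fail loudly" pattern can be sketched in a few lines. This is a simplified illustration, not the production orchestration: the function name, the JSON-per-stage storage and the rule that an empty result counts as a failure are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable

Record = dict
Stage = Callable[[list[Record]], list[Record]]


def run_pipeline(seed: list[Record], stages: list[tuple[str, Stage]], out_dir: Path) -> list[Record]:
    """Run stages in order, persisting each stage's output before the next one starts."""
    records = seed
    for name, stage in stages:
        # Exceptions are deliberately not caught: a broken stage stops the run.
        records = stage(records)
        if not records:
            # Assumed rule: an empty result is treated as an error, not a success.
            raise RuntimeError(f"stage {name!r} produced no records")
        # Persist the intermediate result so a later stage can be re-run in isolation.
        (out_dir / f"{name}.json").write_text(json.dumps(records, indent=2))
    return records
```

Because each stage's output is on disk before the next stage runs, a failure in enrichment or verification does not force a fresh round of collection.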

Collection

We built individual collection modules for each source type — one per directory, a shared module for job posting sources, and a crawl-based module for company websites. Each module is responsible for fetching raw data from its source and storing it in a structured intermediate format. Nothing downstream has any knowledge of how the data was collected — this separation is what makes the pipeline maintainable as sources change over time.
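One way to get that separation is a shared interface that every collection module implements. The sketch below is hypothetical: `RawRecord`, `Collector` and the toy `StaticDirectoryCollector` are names invented for illustration, and the toy collector reads from memory instead of fetching anything.

```python
from dataclasses import asdict, dataclass
from typing import Iterable, Iterator, Protocol


@dataclass
class RawRecord:
    """Assumed intermediate format: where a record came from plus its raw fields."""
    source_id: str
    url: str
    payload: dict


class Collector(Protocol):
    """Every source module exposes the same method; downstream code sees only RawRecord."""
    source_id: str

    def collect(self) -> Iterator[RawRecord]: ...


class StaticDirectoryCollector:
    """Toy collector that yields in-memory rows; a real one would fetch and parse pages."""

    def __init__(self, source_id: str, rows: list[dict]):
        self.source_id = source_id
        self.rows = rows

    def collect(self) -> Iterator[RawRecord]:
        for row in self.rows:
            yield RawRecord(self.source_id, row["url"], row)


def collect_all(collectors: Iterable[Collector]) -> list[dict]:
    """Flatten every collector's output into plain dicts ready to be stored."""
    return [asdict(record) for c in collectors for record in c.collect()]
```

When a directory changes its layout, only that directory's module needs to change; the intermediate format and everything after it stay the same.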

For HTML-based sources, we used a combination of a standard HTTP library for straightforward pages and a headless browser for JavaScript-rendered content. For sources with accessible API endpoints, we used those directly — API data is faster, more structured, and more reliable than scraped HTML.

Rate limiting and request pacing were built in from the start. Every collection module respects the source's robots.txt, enforces delays between requests, and backs off gracefully when it encounters rate limit responses. We collect only publicly available data, only at a pace the source can accommodate without disruption.
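The pacing rules above can be sketched with Python's standard `urllib.robotparser`. This is an illustration under stated assumptions, not our production fetcher: the user agent string, the two-second base delay, the doubling back-off and the injected `fetch` callable (which returns a status code and body) are all choices made for the example.

```python
import time
from typing import Callable, Optional
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-leadgen-bot"  # hypothetical user agent


def polite_get(
    url: str,
    robots: RobotFileParser,
    fetch: Callable[[str], tuple[int, str]],
    min_delay: float = 2.0,
    max_retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Fetch a URL only if robots.txt allows it, pausing before every request."""
    if not robots.can_fetch(USER_AGENT, url):
        return None  # disallowed paths are skipped, never retried
    delay = min_delay
    for _ in range(max_retries):
        sleep(delay)  # enforce spacing between requests
        status, body = fetch(url)
        if status in (429, 503):
            delay *= 2  # back off when the source signals overload
            continue
        return body if status == 200 else None
    return None  # give up after repeated rate-limit responses


robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])
```

Passing `fetch` and `sleep` in as parameters keeps the pacing logic testable without touching the network.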

Extraction

Raw collected data is messy. Company names are inconsistently formatted. Titles vary widely — "Head of Ops", "Operations Director", "Director, Operations", and "VP Operations" might all refer to the same level of seniority at different companies. Addresses come in different formats across different countries.

The extraction stage normalises all of this into a consistent schema: company name (normalised), company size (banded: 50–99, 100–249, 250–500), industry classification (mapped to our target categories), country, contact name, contact title (normalised against a seniority taxonomy), and source identifier.

Records that can't be normalised reliably — ambiguous titles, incomplete company information, records missing required fields — are quarantined for manual review rather than passed downstream with low-confidence data.
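A stripped-down version of the normalise-or-quarantine logic might look like the following. The seniority patterns, required fields and band boundaries here are assumptions for illustration; a real taxonomy would be considerably larger and built with the sales team.

```python
import re
from typing import Optional

# Hypothetical taxonomy: the title must mention a target function and a known level.
FUNCTION_RE = re.compile(r"\b(ops|operations|supply chain|logistics)\b", re.I)
SENIORITY_PATTERNS = [
    (re.compile(r"\b(vp|vice president|director|head)\b", re.I), "director"),
    (re.compile(r"\bmanager\b", re.I), "manager"),
]
SIZE_BANDS = [(50, 99, "50-99"), (100, 249, "100-249"), (250, 500, "250-500")]
REQUIRED = ("company", "employees", "country", "name", "title")


def normalise_title(title: str) -> Optional[str]:
    """Map a free-text title to a seniority level, or None if it is not recognised."""
    if not FUNCTION_RE.search(title):
        return None
    for pattern, level in SENIORITY_PATTERNS:
        if pattern.search(title):
            return level
    return None


def size_band(employees: int) -> Optional[str]:
    """Return the band label for an employee count, or None if out of range."""
    for low, high, label in SIZE_BANDS:
        if low <= employees <= high:
            return label
    return None


def split_records(raws: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (clean, quarantined); anything uncertain goes to manual review."""
    clean, quarantined = [], []
    for raw in raws:
        missing = [f for f in REQUIRED if not raw.get(f)]
        level = normalise_title(raw.get("title") or "")
        band = size_band(raw["employees"]) if isinstance(raw.get("employees"), int) else None
        if missing or level is None or band is None:
            quarantined.append(raw)
            continue
        company = " ".join(raw["company"].split())  # collapse stray whitespace
        clean.append({**raw, "company": company, "seniority": level, "size_band": band})
    return clean, quarantined
```

With these patterns, "Head of Ops", "Operations Director", "Director, Operations" and "VP Operations" all land on the same level, while a title the taxonomy does not recognise is quarantined instead of guessed at.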

Enrichment

The enrichment stage takes the normalised company and contact records and adds the signals that make the list genuinely useful for sales outreach.

For companies, we added: estimated employee count (cross-referenced across multiple sources to improve accuracy), recent hiring activity (number of relevant job postings in the past 90 days), recent news mentions (funding announcements, expansion announcements, new service launches), and technology signals where publicly available (what logistics software they reference on their website, integrations they mention in job postings).
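The hiring-activity signal is essentially a windowed count. As a rough sketch, assuming each posting has already been reduced to a company name, a title and a posting date (the field names and keyword list are illustrative):

```python
from collections import Counter
from datetime import date, timedelta

# Assumed keywords for "relevant" roles; the real list would follow the target personas.
RELEVANT_KEYWORDS = ("operations", "supply chain", "logistics")


def hiring_activity(postings: list[dict], today: date, window_days: int = 90) -> Counter:
    """Count relevant postings per company within the trailing window."""
    cutoff = today - timedelta(days=window_days)
    counts: Counter = Counter()
    for posting in postings:
        title = posting["title"].lower()
        in_window = cutoff <= posting["posted"] <= today
        if in_window and any(keyword in title for keyword in RELEVANT_KEYWORDS):
            counts[posting["company"]] += 1
    return counts
```

Because the count is recomputed on every run, a company that stops hiring drifts back towards zero without any manual clean-up.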

For contacts, we added: current employment verification (confirming the person still holds the stated role at the stated company), seniority classification, and a fit score — a simple numerical score from 0 to 10 that weights company size, industry match, hiring signals, and contact seniority against the client's ideal customer profile.

Contacts with a fit score below 6 were filtered out before delivery. The client received a list of just over 10,000 contacts, but the pipeline evaluated more than 40,000 before the filtering stage. Quality, not volume, was the design principle.
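To show the shape of such a score, here is a minimal weighted sum over the four inputs named above. The weights, the cap of three postings and the seniority values are invented for this example and are not the figures used on the project; only the 0 to 10 range and the cut-off of 6 come from the description above.

```python
# Hypothetical weights; they sum to 10 so the score lands on a 0 to 10 scale.
WEIGHTS = {"size": 2.0, "industry": 3.0, "hiring": 2.0, "seniority": 3.0}
SENIORITY_VALUE = {"director": 1.0, "manager": 0.7}  # assumed relative values
MIN_SCORE = 6  # contacts scoring below this are filtered out


def fit_score(size_in_range: bool, industry_match: bool, postings_90d: int, seniority: str) -> float:
    """Weighted 0 to 10 score over company size, industry, hiring signal and seniority."""
    hiring = min(postings_90d, 3) / 3  # saturates at three relevant postings
    senior = SENIORITY_VALUE.get(seniority, 0.0)
    score = (
        WEIGHTS["size"] * size_in_range
        + WEIGHTS["industry"] * industry_match
        + WEIGHTS["hiring"] * hiring
        + WEIGHTS["seniority"] * senior
    )
    return round(score, 1)


def passes_filter(score: float) -> bool:
    """Apply the delivery cut-off."""
    return score >= MIN_SCORE
```

A simple additive score is easy to explain to a sales team, which matters when they ask why a particular contact did or did not make the list.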

Verification

Contact data goes stale fast. People change jobs. Companies restructure. An email address that was valid six months ago may now bounce — damaging your sender reputation and wasting your sales team's time.

Every contact email in the final list went through a three-step verification process: syntax validation, domain verification (confirming the domain exists and accepts email), and mailbox verification (confirming the specific address is active). Only contacts that passed all three steps were included in the final output.
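The three checks run in order, and a contact is dropped at the first failure. In the sketch below only the syntax check is implemented; the domain and mailbox checks are passed in as callables because they depend on DNS lookups and mail server responses that are outside the scope of an example. The regular expression is deliberately simple and is not a full address grammar.

```python
import re
from typing import Callable

# Simplified pattern for illustration; real address syntax is more permissive.
SYNTAX_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def verify_email(
    email: str,
    domain_accepts_mail: Callable[[str], bool],
    mailbox_is_active: Callable[[str], bool],
) -> str:
    """Return 'ok' or the name of the first check that failed."""
    if not SYNTAX_RE.match(email):
        return "bad_syntax"
    domain = email.rsplit("@", 1)[1]
    if not domain_accepts_mail(domain):  # stand-in for a mail-exchanger lookup
        return "bad_domain"
    if not mailbox_is_active(email):  # stand-in for a mailbox-level check
        return "bad_mailbox"
    return "ok"


def verified_only(
    emails: list[str],
    domain_accepts_mail: Callable[[str], bool],
    mailbox_is_active: Callable[[str], bool],
) -> list[str]:
    """Keep only addresses that pass all three steps."""
    return [e for e in emails if verify_email(e, domain_accepts_mail, mailbox_is_active) == "ok"]
```

Ordering the checks from cheapest to most expensive means most bad addresses are rejected before any network call is made.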

The target we set — and achieved — was a verified deliverability rate above 92% across the full list. Industry benchmarks suggest scraped and enriched lists should achieve 90%+ after verification; lists below 85% indicate source or extraction problems worth investigating.

What the Pipeline Delivered

Six weeks from project start to delivery of the first full list. Here's what the numbers looked like:

10,247 verified contacts delivered, all matching the refined ideal customer profile
92.4% email deliverability rate on the initial send, above both our 92% target and the 90% industry benchmark
85% reduction in prospecting time — sales reps went from spending an estimated 60% of their week on prospecting to under 10%
Within 8 weeks of the list being handed over, the sales team had generated enough qualified opportunities to represent a meaningful increase in forecast against the previous quarter
Ongoing refresh built in — the pipeline runs weekly, adding new companies that match the profile and updating records where signals have changed. The list doesn't go stale
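The weekly refresh boils down to an upsert: new contacts are added and existing ones are updated where something has changed. A minimal sketch, assuming records are keyed on a lower-cased work email (the key choice and field names are assumptions for illustration):

```python
def refresh(existing: dict[str, dict], incoming: list[dict]) -> tuple[list[str], list[str]]:
    """Merge a new batch into the stored list; return (added_keys, updated_keys)."""
    added, updated = [], []
    for record in incoming:
        key = record["email"].strip().lower()
        record = {**record, "email": key}  # store the normalised key on the record
        if key not in existing:
            existing[key] = record
            added.append(key)
        elif existing[key] != record:
            # Newer values win; fields absent from the new record are kept.
            existing[key] = {**existing[key], **record}
            updated.append(key)
    return added, updated
```

Returning the added and updated keys separately makes it straightforward to tell the sales team what changed in a given week.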

The qualitative feedback from the sales team was consistent: they were having more conversations with people who already fit the profile before the call started. Less time explaining what the product does and why it's relevant; more time qualifying genuine interest and discussing specifics. The quality of pipeline, not just the quantity, improved.

The Compliance Question

This section exists because it's the first question serious buyers ask — and it should be.

B2B outreach using publicly sourced contact data sits within a legal framework that varies by geography. In the UK and EU, the GDPR and UK GDPR, together with national e-privacy rules on electronic marketing, govern how personal data can be collected, stored, and used for marketing purposes. The framework is nuanced and differs in its details from country to country, but for B2B prospecting, the key principles are well-established.

Collecting publicly available professional information — names, job titles, work email addresses — for legitimate B2B marketing purposes is generally permissible under the legitimate interests basis, provided the data subjects are informed of the use and given a clear mechanism to opt out. Professional contact information that a person has made publicly available in a professional context carries different privacy expectations than personal information.

Every list we build is designed with compliance in mind from the start:

We collect only publicly available data — no private networks, no authenticated sessions, no data that requires bypassing access controls
We do not collect personal contact information such as personal email addresses or personal phone numbers — only professional contact details
All lists include clear guidance on required opt-out mechanisms and consent language for outreach sequences
Data is stored securely and not retained beyond the agreed project scope

If you are operating in a regulated industry or have specific compliance requirements, we recommend taking independent legal advice on your outreach programme. The technical pipeline we build is designed to be compliant; the specific legal obligations of your outreach depend on your industry and geography.

Build Custom vs. Use a Data Provider

The obvious question: why not just buy a contact list from a data provider?

Data providers have their place. For broad, generic targeting — all marketing managers at companies above 500 employees in a given country — a purchased list can be a reasonable starting point. Fast to acquire, no build time required.

The limitations become significant when your ideal customer profile is specific:

Stale data: Most commercial databases update their records quarterly or annually. A significant percentage of any purchased list — estimates range from 20% to 30% — will have stale or inaccurate information by the time you use it
No real-time signals: A purchased database can tell you that a company is in a given industry and size band. It can't tell you that they started hiring operations managers aggressively three weeks ago — which is exactly the kind of signal that makes outreach timely rather than random
Generic coverage: Industry-specific directories, niche trade body membership lists, and sector-specific sources are rarely well-represented in general-purpose commercial databases. If your target market is specific, generic data providers often have thin coverage of exactly the segment you care most about
No ongoing refresh: A purchased list is a snapshot. A custom pipeline is a live feed — continuously updated as new companies enter the market, existing companies grow or shrink, and contacts change roles

For the client in this case study, a purchased list would have given them a broad set of logistics companies with low targeting precision. The custom pipeline gave them a continuously refreshed set of companies matched to their specific profile, with real-time hiring signals, at a deliverability rate they couldn't have achieved with a generic purchase.

The Takeaway

B2B lead generation has shifted. The organisations generating the most efficient pipeline in 2026 are not the ones with the largest lists — they're the ones with the most precisely targeted data, the most current signals, and the fastest follow-up.

A sales team that spends most of its week selling will generally outperform one that spends 60% of its week prospecting, regardless of how good the individual salespeople are. The pipeline we built didn't make the client's sales team better at closing. It gave them the time and the data to do what they were already good at, and the results followed.

If your sales team is spending more time finding prospects than talking to them, the problem is almost certainly solvable — and the solution is almost certainly faster to build than you'd expect.

Talk to us about your lead generation pipeline →
