Enrich Your HubSpot Contact Data at Scale: Python Code Tutorial

Your CRM is only as powerful as the data inside it. And let’s face it — most HubSpot databases are filled with contacts missing critical information. A name without an email. A company with no phone number. It’s not just messy — it slows your team down.
This tutorial shows you how to solve that with a simple bit of code. Using Surfe’s Contact Enrichment API and a Python script, you’ll fetch contacts from HubSpot, enrich them with verified emails and phone numbers, then push the updated data straight back into your CRM. From there, the possibilities are endless!
Run the script, fill the gaps, and equip your team to reach out with confidence.
By the end of this tutorial, you’ll have a fully functioning Python script that:
- Fetches contacts directly from your HubSpot CRM
- Extracts key information for enrichment
- Sends bulk enrichment requests to Surfe’s Enrichment API
- Retrieves enriched contact data (email addresses and phone numbers)
- Updates your HubSpot contacts with the enriched data
Ready? Let’s get started.
- Prerequisites
- Setting Up Your Environment
- Fetching Contacts from HubSpot
- Preparing Data for Surfe Enrichment
- Starting the Enrichment Process
- Polling for Enrichment Results
- Processing Enriched Contact Data
- Updating HubSpot Contacts
- Putting It All Together
- Running the Script
- Final Notes: Credits, Quotas, and Rate Limiting
Want the full script? Jump to the bottom for a quick copy-paste or view it on GitHub!
Prerequisites
- Python 3.x Installation
Most modern operating systems come with Python 3 pre-installed. To check if Python is installed on your system:
- Windows: Open Command Prompt (Win + R, type cmd, press Enter) and run:
py --version
- macOS/Linux: Open Terminal and run:
python3 --version
- If Python is not installed, download it from the official Python website and follow the installation instructions for your OS.
- Basic Python Programming Knowledge
- HubSpot Account with API Access
To fetch and update contacts, you’ll need a HubSpot account with API access.
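- Create a private app in your HubSpot account settings (see HubSpot’s private apps documentation) and grant it the crm.objects.contacts.read and crm.objects.contacts.write scopes.
- Navigate to the Auth tab in your private app settings and copy the access token. This is the value the script sends as a Bearer token; the Client Secret is not needed.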
- Surfe Account and API Key
To use Surfe’s API, you’ll need to create an account and obtain an API key. You can find the API documentation and instructions for generating your API key in the Surfe Developer Docs.
Step 1: Setting Up Your Environment
Let’s begin by setting up your development environment. We’ll start by creating a virtual environment.
1.1 Creating a Virtual Environment (Optional but Recommended)
Creating a virtual environment is recommended to keep your project dependencies organized:
# macOS/Linux
python3 -m venv env
source env/bin/activate
# Windows
py -m venv env
env\Scripts\activate
1.2 Installing Required Packages
Make sure you have the following Python packages installed:
- requests (for API calls)
- python-dotenv (for storing API keys securely)
# macOS/Linux
python3 -m pip install requests python-dotenv
# Windows
py -m pip install requests python-dotenv
1.3 Storing Your API Keys Securely
It’s best to avoid hardcoding API keys in your script. Instead, store them as environment variables.
- Create a file named .env in your project’s root directory:
HUBSPOT_ACCESS_TOKEN=your_hubspot_access_token
SURFE_API_KEY=your_surfe_api_key
- Create a Python script file named main.py, add the necessary package imports, and load the API keys:
import os
import time
import requests
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
hubspot_api_key = os.getenv("HUBSPOT_ACCESS_TOKEN")
surfe_api_key = os.getenv("SURFE_API_KEY")
Step 2: Fetching Contacts from HubSpot
We will now build our enrichment script by examining each component function. The first step is to retrieve contacts from HubSpot using the contacts list endpoint (GET /crm/v3/objects/contacts).
This function sends a GET request to HubSpot’s contacts API endpoint. We are specifically requesting key properties needed for enrichment, such as names, company information, and existing contact data. The function returns a list of contact objects from your HubSpot CRM.
Here is what is happening within this function:
- We construct the API endpoint URL for HubSpot’s contacts.
- We set up the authentication headers using the Bearer token.
- We specify which contact properties we want to retrieve.
- We set a limit of 100 contacts per request, the maximum this endpoint allows. The function therefore returns only the first page of contacts.
- We make the request and return the results after validating the response.
def get_hubspot_contacts(api_key):
    """
    Fetch contacts from HubSpot that need enrichment
    """
    url = "https://api.hubapi.com/crm/v3/objects/contacts"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    params = {
        "properties": ["firstname", "lastname", "company", "hs_email_domain", "email", "phone", "jobtitle", "hs_linkedin_url"],
        "limit": 100
    }
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()["results"]
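If your portal holds more than 100 contacts, the function above only sees the first page. HubSpot's list responses include a paging.next.after cursor when more results exist, so one way to collect every page is sketched below. The function name fetch_all_contacts, the max_pages safety cap, and the injected session argument (a requests.Session or anything with a compatible get method) are assumptions made for illustration, not part of the original script.

```python
def fetch_all_contacts(session, access_token, properties, page_size=100, max_pages=50):
    """Collect contacts across pages by following HubSpot's paging cursor.

    `session` is assumed to be a requests.Session (or a compatible object).
    """
    url = "https://api.hubapi.com/crm/v3/objects/contacts"
    headers = {"Authorization": f"Bearer {access_token}"}
    # The endpoint accepts a comma-separated list of property names.
    params = {"properties": ",".join(properties), "limit": page_size}
    contacts = []
    for _ in range(max_pages):  # hard cap so a bad cursor cannot loop forever
        response = session.get(url, headers=headers, params=params)
        response.raise_for_status()
        data = response.json()
        contacts.extend(data.get("results", []))
        # The cursor is only present when another page exists.
        after = data.get("paging", {}).get("next", {}).get("after")
        if not after:
            break
        params["after"] = after
    return contacts
```

In the script, this could be called as fetch_all_contacts(requests.Session(), hubspot_api_key, ["firstname", "lastname", "company"]). Keep in mind that more contacts means more enrichment credits consumed.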
Step 3: Preparing Data for Surfe Enrichment
After retrieving the contacts from HubSpot, we need to format them for Surfe’s API.
This function transforms the HubSpot contact data into the format required by Surfe’s API. It is important to note that we are using the HubSpot contact ID as Surfe’s externalID, which allows us to match the enriched data back to the correct HubSpot contacts later.
def prepare_surfe_payload(hubspot_contacts):
    """
    Prepare contacts for Surfe enrichment API
    """
    people = []
    for contact in hubspot_contacts:
        properties = contact.get("properties", {})
        # HubSpot returns null for empty properties, so fall back to ""
        person = {
            "externalID": contact["id"],  # Use HubSpot ID as external ID
            "firstName": properties.get("firstname") or "",
            "lastName": properties.get("lastname") or "",
            "companyName": properties.get("company") or "",
            "companyWebsite": properties.get("hs_email_domain") or "",
            "linkedinUrl": properties.get("hs_linkedin_url") or ""
        }
        # Only add if we have enough data to enrich
        if (person["linkedinUrl"] or
                (person["firstName"] and person["lastName"] and person["companyName"]) or
                (person["firstName"] and person["lastName"] and person["companyWebsite"])):
            people.append(person)
    return {
        "enrichmentType": "emailAndMobile",
        "listName": f"HubSpot Enrichment {time.strftime('%Y-%m-%d %H:%M:%S')}",
        "people": people
    }
The function also performs data validation to ensure we only send contacts that have sufficient information for enrichment. Surfe’s enrichment API requires either:
- A LinkedIn URL, or
- First Name + Last Name + Company Name, or
- First Name + Last Name + Company Website
By filtering out contacts that do not meet these requirements, we avoid wasting API calls on contacts that cannot be enriched.
Step 4: Starting the Enrichment Process
With our data prepared, we can start the enrichment process by sending a request to Surfe’s bulk enrichment API.
def start_surfe_enrichment(api_key, payload):
    """
    Start the bulk enrichment process with Surfe API
    """
    url = "https://api.surfe.com/v1/people/enrichments/bulk"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()["id"]
This function sends our contacts to Surfe’s bulk enrichment API to initiate an asynchronous enrichment process. The API returns an enrichment ID, which we will use to track progress and retrieve results.
Key aspects of this function include:
- We are using Bearer token authentication with the Surfe API.
- We are making a POST request with the formatted payload.
- The API response contains an ID for the enrichment job.
- The function handles API errors using `raise_for_status()`.
Step 5: Polling for Enrichment Results
Surfe’s bulk enrichment process operates asynchronously, meaning we need to poll the API to check when the process is completed and retrieve the results.
def poll_enrichment_status(api_key, enrichment_id, max_attempts=60, delay=5):
    """
    Poll the enrichment status until it's complete
    """
    url = f"https://api.surfe.com/v1/people/enrichments/bulk/{enrichment_id}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    attempts = 0
    while attempts < max_attempts:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        status = data.get("status")
        if status == "COMPLETED":
            return data
        elif status == "FAILED":
            raise Exception(f"Enrichment failed: {data.get('error', 'Unknown error')}")
        print(f"Enrichment status: {status}. Waiting {delay} seconds...")
        time.sleep(delay)
        attempts += 1
    raise Exception("Enrichment timed out")
This function repeatedly checks the status of the enrichment job until it is complete or fails. We have set default values for the polling frequency (5 seconds) and maximum attempts (60), allowing the enrichment process up to 5 minutes to complete.
The function handles different status responses:
- If the status is “COMPLETED”, it returns the enriched data
- If the status is “FAILED”, it raises an exception with the error message
- For any other status (such as “IN_PROGRESS”), it waits and checks again
If the maximum number of attempts is reached, it raises a timeout exception.
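A fixed 5-second interval works, but large lists may finish more slowly, and polling less often as time passes keeps the request count down. The variant below is a minimal sketch of the same loop with a wall-clock deadline and exponential backoff. The name poll_until_done and the get_status callable (any function returning the parsed status JSON) are hypothetical, and the delay values are illustrative rather than recommendations from Surfe.

```python
import time


def poll_until_done(get_status, timeout=300.0, initial_delay=2.0, max_delay=30.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until the job completes, fails, or the deadline passes."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        data = get_status()
        status = data.get("status")
        if status == "COMPLETED":
            return data
        if status == "FAILED":
            raise RuntimeError(f"Enrichment failed: {data.get('error', 'Unknown error')}")
        # Give up if the next wait would take us past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"Enrichment still {status!r} after {timeout} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # back off: 2s, 4s, 8s, ... capped at max_delay
```

Here get_status could be a small lambda that performs the same GET request as poll_enrichment_status and returns response.json(). The sleep and clock parameters exist only so the loop can be tested without waiting.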
Step 6: Processing Enriched Contact Data
After receiving the enriched data from Surfe, we need to format it for updating the contacts in HubSpot.
def prepare_hubspot_update_data(enriched_data):
    """
    Prepare data for HubSpot update from the enriched data
    """
    update_data = []
    for person in enriched_data.get("people", []):
        # Extract the best email and phone
        email = None
        if person.get("emails"):
            # Sort by validation status and take the first one
            sorted_emails = sorted(
                person["emails"],
                key=lambda x: 0 if x.get("validationStatus") == "VALID" else 1
            )
            if sorted_emails:
                email = sorted_emails[0].get("email")
        mobile_phone = None
        if person.get("mobilePhones"):
            # Sort by confidence score and take the highest
            sorted_phones = sorted(
                person["mobilePhones"],
                key=lambda x: x.get("confidenceScore", 0),
                reverse=True
            )
            if sorted_phones:
                mobile_phone = sorted_phones[0].get("mobilePhone")
        # Only include properties that have a value
        properties = {}
        if email:
            properties["email"] = email
        if mobile_phone:
            properties["phone"] = mobile_phone
        if person.get("jobTitle"):
            properties["jobtitle"] = person["jobTitle"]
        if person.get("linkedinUrl"):
            properties["hs_linkedin_url"] = person["linkedinUrl"]
        # Only add to update list if we have properties to update
        if properties:
            update_data.append({
                "id": person["externalID"],
                "properties": properties
            })
    return update_data
This function processes the enriched data and prepares it for updating contacts in HubSpot. It handles several key tasks:
- Email selection: Surfe may return multiple email addresses for a contact. We sort these addresses by validation status and select the most reliable one.
- Phone selection: Similarly, we sort mobile phone numbers by their confidence score and select the one with the highest score.
- Data filtering: We only include properties that have values, thus avoiding unnecessary updates.
- ID mapping: We use the externalID (which contains the HubSpot ID) to ensure the enriched data is mapped back to the correct contact.
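By preferring validated emails and the highest-confidence phone numbers, we update HubSpot with the most reliable data Surfe returned. Note that if none of a contact’s emails is marked VALID, the first unvalidated one is still used.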
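As written, the function overwrites whatever is already stored in HubSpot for each enriched property. If you would rather fill gaps only, one option is to compare the update list against the contacts fetched in Step 2 and drop any property that already has a value. The helper below is a sketch of that idea; keep_only_missing is a hypothetical name and is not part of the original script.

```python
def keep_only_missing(update_data, hubspot_contacts):
    """Drop enriched values for properties that already have a value in HubSpot."""
    # Map each HubSpot contact ID to its current properties (may contain nulls).
    existing = {c["id"]: c.get("properties") or {} for c in hubspot_contacts}
    filtered = []
    for item in update_data:
        current = existing.get(item["id"], {})
        new_props = {
            name: value
            for name, value in item["properties"].items()
            if not current.get(name)  # keep only fields that are empty or null today
        }
        if new_props:
            filtered.append({"id": item["id"], "properties": new_props})
    return filtered
```

In main(), this would sit between the prepare and update steps, for example update_data = keep_only_missing(update_data, hubspot_contacts).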
Step 7: Updating HubSpot Contacts
The final step is updating the contacts in HubSpot with the enriched data.
def update_hubspot_contacts(api_key, update_data):
    """
    Update contacts in HubSpot with enriched data
    """
    if not update_data:
        print("No contacts to update")
        return {"status": "skipped", "message": "No contacts to update"}
    url = "https://api.hubapi.com/crm/v3/objects/contacts/batch/update"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    batch_size = 100
    results = []
    for i in range(0, len(update_data), batch_size):
        batch = update_data[i:i + batch_size]
        payload = {"inputs": batch}
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        results.append(response.json())
    return {
        "status": "success",
        "updated_contacts": len(update_data),
        "results": results
    }
This function sends the enriched data to HubSpot using their bulk contact update API, updating the contacts with their new information. It includes several optimizations:
- Batch processing: HubSpot limits batch updates to 100 contacts at a time, so we split our updates into batches if necessary.
- Empty check: If there is no data to update, we skip the API call and return a status message.
- Error handling: The function handles errors with raise_for_status() and collects all the API responses for tracking.
- Result summary: The function returns a summary that includes the number of contacts updated and detailed results.
Step 8: Putting It All Together
Now we will combine all the functions into a cohesive script that orchestrates the entire enrichment process.
The `main()` function orchestrates the entire enrichment process from start to finish. It includes the following:
- Environment setup: Loading API keys from the `.env` file
- Error handling: Checking for missing API keys and catching exceptions
- Process flow: Executing each step in sequence with status updates
- Early returns: Stopping the process if no contacts are found or if there is insufficient data

def main():
    """
    Main function to orchestrate the enrichment process
    """
    load_dotenv()
    hubspot_api_key = os.getenv("HUBSPOT_ACCESS_TOKEN")
    surfe_api_key = os.getenv("SURFE_API_KEY")
    if not hubspot_api_key or not surfe_api_key:
        print("Error: Missing API keys. Please check your .env file.")
        return
    try:
        # Step 1: Get contacts from HubSpot
        print("Fetching contacts from HubSpot...")
        hubspot_contacts = get_hubspot_contacts(hubspot_api_key)
        print(f"Found {len(hubspot_contacts)} contacts to enrich")
        if not hubspot_contacts:
            print("No contacts found that need enrichment")
            return
        # Step 2: Prepare payload for Surfe API
        print("Preparing data for Surfe enrichment...")
        surfe_payload = prepare_surfe_payload(hubspot_contacts)
        if not surfe_payload["people"]:
            print("No contacts with sufficient data for enrichment")
            return
        # Step 3: Start enrichment process
        print("Starting Surfe enrichment process...")
        enrichment_id = start_surfe_enrichment(surfe_api_key, surfe_payload)
        print(f"Enrichment started with ID: {enrichment_id}")
        # Step 4: Poll for results
        print("Polling for enrichment results...")
        enriched_data = poll_enrichment_status(surfe_api_key, enrichment_id)
        print("Enrichment completed successfully")
        # Step 5: Prepare data for HubSpot update
        print("Preparing data for HubSpot update...")
        update_data = prepare_hubspot_update_data(enriched_data)
        print(f"Prepared {len(update_data)} contacts for update")
        # Step 6: Update HubSpot contacts
        print("Updating contacts in HubSpot...")
        update_result = update_hubspot_contacts(hubspot_api_key, update_data)
        print(f"Update completed: {update_result['status']}")
        print(f"Updated {update_result.get('updated_contacts', 0)} contacts")
    except Exception as e:
        print(f"Error: {str(e)}")


if __name__ == "__main__":
    main()
This function provides a clear overview of the overall workflow and makes it easy to understand how all the component functions work together.
Step 9: Running the Script
Now that you have your complete script, it is time to run it and watch the magic happen! Here’s how to execute the script and what to expect during its operation.
- Ensure your .env file contains your Surfe API key and your HubSpot private app access token:
HUBSPOT_ACCESS_TOKEN=your_hubspot_access_token
SURFE_API_KEY=your_surfe_api_key_here
- Open your terminal or command prompt, navigate to the directory containing your script, and run:
python main.py
Expected Output
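When you run the script, you will see a series of status messages in the console that help you track its progress. They start with “Fetching contacts from HubSpot...”, report the enrichment ID and each polling attempt, and end with “Update completed: success” followed by the number of contacts updated.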

Complete Code for Easy Integration
import os
import time
import requests
from dotenv import load_dotenv


def get_hubspot_contacts(api_key):
    """
    Fetch contacts from HubSpot that need enrichment
    """
    url = "https://api.hubapi.com/crm/v3/objects/contacts"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    params = {
        "properties": ["firstname", "lastname", "company", "hs_email_domain", "email", "phone", "jobtitle", "hs_linkedin_url"],
        "limit": 100
    }
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()["results"]


def prepare_surfe_payload(hubspot_contacts):
    """
    Prepare contacts for Surfe enrichment API
    """
    people = []
    for contact in hubspot_contacts:
        properties = contact.get("properties", {})
        # HubSpot returns null for empty properties, so fall back to ""
        person = {
            "externalID": contact["id"],  # Use HubSpot ID as external ID
            "firstName": properties.get("firstname") or "",
            "lastName": properties.get("lastname") or "",
            "companyName": properties.get("company") or "",
            "companyWebsite": properties.get("hs_email_domain") or "",
            "linkedinUrl": properties.get("hs_linkedin_url") or ""
        }
        # Only add if we have enough data to enrich
        if (person["linkedinUrl"] or
                (person["firstName"] and person["lastName"] and person["companyName"]) or
                (person["firstName"] and person["lastName"] and person["companyWebsite"])):
            people.append(person)
    return {
        "enrichmentType": "emailAndMobile",
        "listName": f"HubSpot Enrichment {time.strftime('%Y-%m-%d %H:%M:%S')}",
        "people": people
    }


def start_surfe_enrichment(api_key, payload):
    """
    Start the bulk enrichment process with Surfe API
    """
    url = "https://api.surfe.com/v1/people/enrichments/bulk"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()["id"]


def poll_enrichment_status(api_key, enrichment_id, max_attempts=60, delay=5):
    """
    Poll the enrichment status until it's complete
    """
    url = f"https://api.surfe.com/v1/people/enrichments/bulk/{enrichment_id}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    attempts = 0
    while attempts < max_attempts:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        status = data.get("status")
        if status == "COMPLETED":
            return data
        elif status == "FAILED":
            raise Exception(f"Enrichment failed: {data.get('error', 'Unknown error')}")
        print(f"Enrichment status: {status}. Waiting {delay} seconds...")
        time.sleep(delay)
        attempts += 1
    raise Exception("Enrichment timed out")


def prepare_hubspot_update_data(enriched_data):
    """
    Prepare data for HubSpot update from the enriched data
    """
    update_data = []
    for person in enriched_data.get("people", []):
        # Extract the best email and phone
        email = None
        if person.get("emails"):
            # Sort by validation status and take the first one
            sorted_emails = sorted(
                person["emails"],
                key=lambda x: 0 if x.get("validationStatus") == "VALID" else 1
            )
            if sorted_emails:
                email = sorted_emails[0].get("email")
        mobile_phone = None
        if person.get("mobilePhones"):
            # Sort by confidence score and take the highest
            sorted_phones = sorted(
                person["mobilePhones"],
                key=lambda x: x.get("confidenceScore", 0),
                reverse=True
            )
            if sorted_phones:
                mobile_phone = sorted_phones[0].get("mobilePhone")
        # Only include properties that have a value
        properties = {}
        if email:
            properties["email"] = email
        if mobile_phone:
            properties["phone"] = mobile_phone
        if person.get("jobTitle"):
            properties["jobtitle"] = person["jobTitle"]
        if person.get("linkedinUrl"):
            properties["hs_linkedin_url"] = person["linkedinUrl"]
        # Only add to update list if we have properties to update
        if properties:
            update_data.append({
                "id": person["externalID"],
                "properties": properties
            })
    return update_data


def update_hubspot_contacts(api_key, update_data):
    """
    Update contacts in HubSpot with enriched data
    """
    if not update_data:
        print("No contacts to update")
        return {"status": "skipped", "message": "No contacts to update"}
    url = "https://api.hubapi.com/crm/v3/objects/contacts/batch/update"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    batch_size = 100
    results = []
    for i in range(0, len(update_data), batch_size):
        batch = update_data[i:i + batch_size]
        payload = {"inputs": batch}
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        results.append(response.json())
    return {
        "status": "success",
        "updated_contacts": len(update_data),
        "results": results
    }


def main():
    """
    Main function to orchestrate the enrichment process
    """
    load_dotenv()
    hubspot_api_key = os.getenv("HUBSPOT_ACCESS_TOKEN")
    surfe_api_key = os.getenv("SURFE_API_KEY")
    if not hubspot_api_key or not surfe_api_key:
        print("Error: Missing API keys. Please check your .env file.")
        return
    try:
        # Step 1: Get contacts from HubSpot
        print("Fetching contacts from HubSpot...")
        hubspot_contacts = get_hubspot_contacts(hubspot_api_key)
        print(f"Found {len(hubspot_contacts)} contacts to enrich")
        if not hubspot_contacts:
            print("No contacts found that need enrichment")
            return
        # Step 2: Prepare payload for Surfe API
        print("Preparing data for Surfe enrichment...")
        surfe_payload = prepare_surfe_payload(hubspot_contacts)
        if not surfe_payload["people"]:
            print("No contacts with sufficient data for enrichment")
            return
        # Step 3: Start enrichment process
        print("Starting Surfe enrichment process...")
        enrichment_id = start_surfe_enrichment(surfe_api_key, surfe_payload)
        print(f"Enrichment started with ID: {enrichment_id}")
        # Step 4: Poll for results
        print("Polling for enrichment results...")
        enriched_data = poll_enrichment_status(surfe_api_key, enrichment_id)
        print("Enrichment completed successfully")
        # Step 5: Prepare data for HubSpot update
        print("Preparing data for HubSpot update...")
        update_data = prepare_hubspot_update_data(enriched_data)
        print(f"Prepared {len(update_data)} contacts for update")
        # Step 6: Update HubSpot contacts
        print("Updating contacts in HubSpot...")
        update_result = update_hubspot_contacts(hubspot_api_key, update_data)
        print(f"Update completed: {update_result['status']}")
        print(f"Updated {update_result.get('updated_contacts', 0)} contacts")
    except Exception as e:
        print(f"Error: {str(e)}")


if __name__ == "__main__":
    main()
Final Notes: Credits, Quotas, and Rate Limiting
Credits & Quotas
Surfe’s API uses a credit system for people enrichment. Retrieving email, landline, and job details consumes email credits, while retrieving mobile phone numbers consumes mobile credits. There are also daily quotas, such as 2,000 people enrichments per day and 200 organization look-alike searches per day. For more information on credits and quotas, please speak to a Surfe representative to discuss a tailored plan that works for you and your business needs. Quotas reset at midnight (local time), and additional credits can be purchased if needed. For full details, refer to the Credits & Quotas documentation.
Rate Limiting
Surfe enforces rate limits to ensure fair API usage. Users can make up to 10 requests per second, with short bursts of up to 20 requests allowed. The limit resets every minute. Exceeding this results in a 429 Too Many Requests error, so it’s recommended to implement retries in case of rate limit issues. Learn more in the Rate Limits documentation.
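The script above does not retry on its own, so a 429 response would surface as an exception from raise_for_status(). A minimal sketch of a retry wrapper is shown below. The name request_with_retry and the send callable (any zero-argument function returning a requests-style response) are hypothetical, and honoring a Retry-After header is an assumption: check the Rate Limits documentation to see whether Surfe sends one.

```python
import time


def request_with_retry(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns HTTP 429."""
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 429:
            response.raise_for_status()  # other errors are not retried
            return response
        if attempt == max_retries:
            break
        # Assumption: Retry-After, when present, holds a number of seconds.
        retry_after = response.headers.get("Retry-After")
        try:
            delay = float(retry_after)
        except (TypeError, ValueError):
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
        sleep(delay)
    raise RuntimeError(f"Still rate limited after {max_retries} retries")
```

For example, the POST in start_surfe_enrichment could be wrapped as request_with_retry(lambda: requests.post(url, headers=headers, json=payload)).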

Ready to enrich your contacts and accelerate your sales process?
Give Surfe a go and make sure your team never wastes time on bad data again.
Contact Enrichment API FAQs
What Is a Contact Enrichment API?
A contact enrichment API is a tool that fills in the gaps in your CRM data by adding missing contact details — like job title, seniority, company website, email address, and phone number. For sales teams, this means no more guessing who to reach out to or wasting time manually researching leads. Instead, you get complete, up-to-date contact profiles delivered straight into your CRM.
Why Do Sales Teams Need a Contact Enrichment API?
Sales teams move fast — but incomplete data slows everything down. A contact enrichment API helps you quickly identify high-value leads, prioritize outreach, and personalize your messaging. Instead of cold emails to “info@” addresses, you get verified contact info that helps you land in the right inbox, every time.
What’s the Difference Between a Contact Enrichment API and Other Data Tools?
Most traditional data tools require manual CSV uploads, long setup times, or separate platforms to search for leads. A contact enrichment API plugs directly into your workflow. It works with your CRM (like HubSpot) and enriches contact data programmatically, so you don’t need to copy, paste, or switch tabs to get the info you need.
How Does Surfe’s Contact Enrichment API Work with HubSpot?
Surfe’s Contact Enrichment API doesn’t connect to HubSpot out of the box, but that’s what this tutorial is for. With a custom Python script, you can pull contacts from HubSpot, send them to Surfe for enrichment, and update your CRM with enriched data like email addresses, phone numbers, and job titles, all in one smooth workflow.
What Data Can I Enrich with Surfe’s API?
Surfe’s API can return:
- Verified email addresses (work and personal)
- Mobile and landline numbers
- Job titles and seniority
- Company name and website
- Location and department
- Social profiles (LinkedIn, Meta, X, etc.)
The more input data you provide (like full name + company), the better the results.
Do I Need a Developer to Use the API?
Not necessarily. If you’re comfortable running basic Python scripts, you can follow this guide and set it up yourself — no full dev team required. And if you get stuck, tools like ChatGPT can help generate or fix the code.
How Often Can I Run the Enrichment Script?
You can run it as often as your workflow requires: once a day, once a week, or whenever new contacts are added to HubSpot. Just be mindful of your API usage and credit limits depending on your Surfe plan.
Can I Combine Surfe’s API with Other Tools?
Yes! Once you’ve enriched your contacts, you can push that data into your email platform, sales engagement tool, or even trigger workflows using tools like Zapier. It’s flexible, scalable, and works with whatever your stack looks like.