HubSpot Source

The HubSpot source extracts CRM data including contacts, companies, deals, and other objects from the HubSpot API.

pip install bizon[hubspot]

Example pipeline configuration:

name: hubspot-pipeline

source:
  name: hubspot
  stream: contacts
  sync_mode: full_refresh
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

destination:
  name: bigquery
  config:
    project_id: my-project
    dataset_id: crm
    gcs_buffer_bucket: my-bucket

Check available streams:

bizon stream list hubspot

Common streams include:

  • contacts - CRM contacts
  • companies - CRM companies
  • deals - Sales deals
  • owners - HubSpot users/owners

HubSpot supports API key (private app token) authentication:

source:
  name: hubspot
  stream: contacts
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

To get your token:

  1. Go to HubSpot Settings > Integrations > Private Apps
  2. Create a new private app
  3. Grant required scopes (crm.objects.contacts.read, etc.)
  4. Copy the access token
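
Before wiring the token into a pipeline, it can help to confirm that HubSpot accepts it. The following is a minimal sketch, not part of Bizon: it assumes the private app has the crm.objects.contacts.read scope and calls HubSpot's CRM v3 contacts endpoint with the token as a Bearer header. The function names are hypothetical.

```python
import json
import urllib.request

# HubSpot CRM v3 endpoint; limit=1 keeps the check cheap.
HUBSPOT_CONTACTS_URL = "https://api.hubapi.com/crm/v3/objects/contacts?limit=1"


def build_contacts_request(token: str) -> urllib.request.Request:
    """Build a GET request for one contact, authenticated with the private app token."""
    return urllib.request.Request(
        HUBSPOT_CONTACTS_URL,
        headers={"Authorization": f"Bearer {token}"},
    )


def check_token(token: str) -> int:
    """Call HubSpot and return how many contacts came back (0 or 1).

    Raises urllib.error.HTTPError (for example 401) if the token is rejected.
    """
    with urllib.request.urlopen(build_contacts_request(token)) as response:
        payload = json.load(response)
    return len(payload.get("results", []))
```

Calling check_token with the copied token should return without raising if the token and scopes are valid.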

Full refresh syncs all records from scratch on every run:

source:
  name: hubspot
  stream: contacts
  sync_mode: full_refresh

Incremental syncs only records created or updated since the last sync:

source:
  name: hubspot
  stream: contacts
  sync_mode: incremental
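
To illustrate the idea behind incremental mode, here is a conceptual sketch of cursor-based selection. It is not Bizon's implementation: the function names and the use of updatedAt as the cursor field are assumptions made for illustration.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2024-01-15T10:30:00.000Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_updated(records, cursor):
    """Return records modified strictly after the cursor, plus the advanced cursor."""
    fresh = [r for r in records if parse_ts(r["updatedAt"]) > cursor]
    # If nothing changed, the cursor stays where it was.
    new_cursor = max((parse_ts(r["updatedAt"]) for r in fresh), default=cursor)
    return fresh, new_cursor


# Hypothetical records for illustration only.
records = [
    {"id": "1", "updatedAt": "2024-01-01T00:00:00.000Z"},
    {"id": "2", "updatedAt": "2024-01-03T12:00:00.000Z"},
    {"id": "3", "updatedAt": "2024-01-05T08:30:00.000Z"},
]
last_sync = parse_ts("2024-01-02T00:00:00.000Z")
fresh, next_cursor = select_updated(records, last_sync)
```

Here only records 2 and 3 are selected, and the cursor advances to the timestamp of record 3, so the next run would skip all three.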

Check incremental support:

bizon stream list hubspot
# [Supports incremental] - contacts
# [Full refresh only] - owners

Example: incremental contacts sync to BigQuery with an explicit record schema:

name: hubspot-contacts

source:
  name: hubspot
  stream: contacts
  sync_mode: incremental
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

destination:
  name: bigquery
  config:
    project_id: my-project
    dataset_id: crm
    dataset_location: US
    gcs_buffer_bucket: my-staging-bucket
    unnest: true
    record_schemas:
      - destination_id: my-project.crm.contacts
        record_schema:
          - name: id
            type: STRING
            mode: REQUIRED
          - name: email
            type: STRING
            mode: NULLABLE
          - name: firstname
            type: STRING
            mode: NULLABLE
          - name: lastname
            type: STRING
            mode: NULLABLE
          - name: createdate
            type: TIMESTAMP
            mode: NULLABLE
          - name: lastmodifieddate
            type: TIMESTAMP
            mode: NULLABLE

Example: incremental companies sync to BigQuery:

name: hubspot-companies

source:
  name: hubspot
  stream: companies
  sync_mode: incremental
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

destination:
  name: bigquery
  config:
    project_id: my-project
    dataset_id: crm
    gcs_buffer_bucket: my-staging-bucket

Run separate pipelines for each object type:

# contacts.yml
bizon run contacts.yml
# companies.yml
bizon run companies.yml
# deals.yml
bizon run deals.yml
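
If the pipelines should run back to back from a single script, a small wrapper around the bizon run command shown above is one option. This is a sketch: the helper names are hypothetical, and it assumes the three config files sit in the current directory and that bizon is on the PATH.

```python
import subprocess

CONFIG_FILES = ["contacts.yml", "companies.yml", "deals.yml"]


def build_commands(config_files):
    """Return one 'bizon run <file>' command per pipeline config."""
    return [["bizon", "run", path] for path in config_files]


def run_all(config_files):
    """Run each pipeline in order, stopping at the first failure."""
    for command in build_commands(config_files):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling run_all(CONFIG_FILES) runs contacts, then companies, then deals; a failed pipeline stops the sequence so the error is not buried under later output.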

HubSpot enforces API rate limits. Bizon retries rate-limited requests automatically; retry_limit sets the maximum number of retries:

source:
  name: hubspot
  stream: contacts
  api_config:
    retry_limit: 10  # Max retries on rate limit
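
To make the role of retry_limit concrete, here is a conceptual sketch of a bounded retry loop. It is not Bizon's actual retry logic: the RateLimited exception, the exponential backoff, and the function name are assumptions made for illustration.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 (Too Many Requests) response."""


def call_with_retries(request, retry_limit=10, base_delay=1.0, sleep=time.sleep):
    """Call request(), retrying up to retry_limit times when rate limited."""
    for attempt in range(retry_limit + 1):
        try:
            return request()
        except RateLimited:
            if attempt == retry_limit:
                # Retries exhausted: surface the error to the caller.
                raise
            # Assumed backoff: wait 1s, 2s, 4s, ... between attempts.
            sleep(base_delay * 2 ** attempt)
```

With retry_limit set to 10, a request is attempted at most 11 times (one initial call plus ten retries) before the error propagates.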

HubSpot records include:

Field        Description
id           HubSpot object ID
properties   Object properties (email, name, etc.)
createdAt    Record creation timestamp
updatedAt    Last modification timestamp
archived     Whether record is archived

Use a transform to flatten the nested properties object into top-level fields:

transforms:
  - label: flatten-properties
    python: |
      props = data.get('properties', {})
      data = {
          'id': data.get('id'),
          'email': props.get('email'),
          'firstname': props.get('firstname'),
          'lastname': props.get('lastname'),
          'company': props.get('company'),
          'created_at': data.get('createdAt'),
          'updated_at': data.get('updatedAt'),
      }
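
The flattening logic can be tried outside a pipeline by wrapping it in a plain function. The sketch below mirrors the transform body; the function name and the sample record are made up for illustration and follow the record fields listed above.

```python
def flatten_contact(data: dict) -> dict:
    """Apply the same flattening as the transform to a single record."""
    props = data.get("properties", {})
    return {
        "id": data.get("id"),
        "email": props.get("email"),
        "firstname": props.get("firstname"),
        "lastname": props.get("lastname"),
        "company": props.get("company"),
        "created_at": data.get("createdAt"),
        "updated_at": data.get("updatedAt"),
    }


# Hypothetical sample record; 'company' is deliberately missing.
sample = {
    "id": "101",
    "properties": {
        "email": "ada@example.com",
        "firstname": "Ada",
        "lastname": "Lovelace",
    },
    "createdAt": "2024-01-01T00:00:00.000Z",
    "updatedAt": "2024-01-05T08:30:00.000Z",
    "archived": False,
}
flat = flatten_contact(sample)
```

Properties missing from a record come through as None rather than raising, and fields not listed in the mapping (such as archived) are dropped from the output.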