HubSpot Source

The HubSpot source extracts CRM data including contacts, companies, deals, and other objects from the HubSpot API.

Installation

pip install bizon[hubspot]

Quick Start

name: hubspot-pipeline

source:
  name: hubspot
  stream: contacts
  sync_mode: full_refresh
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

destination:
  name: bigquery
  config:
    project_id: my-project
    dataset_id: crm
    gcs_buffer_bucket: my-bucket

Available Streams

Check available streams:

bizon stream list hubspot

Common streams include:

contacts - CRM contacts
companies - CRM companies
deals - Sales deals
owners - HubSpot users/owners

Configuration

Authentication

The HubSpot source authenticates with a private app access token, supplied through the api_key authentication type:

source:
  name: hubspot
  stream: contacts
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

To get your token:

1. Go to HubSpot Settings > Integrations > Private Apps
2. Create a new private app
3. Grant the required scopes (crm.objects.contacts.read, etc.)
4. Copy the access token
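Before putting the token into a pipeline, it can help to confirm that HubSpot accepts it. The sketch below builds a request against the CRM v3 contacts endpoint and sends the token as a Bearer header. The helper names build_contacts_request and token_is_accepted are illustrative and not part of Bizon.

```python
import urllib.error
import urllib.request

# HubSpot CRM v3 endpoint for listing contacts.
HUBSPOT_CONTACTS_URL = "https://api.hubapi.com/crm/v3/objects/contacts"


def build_contacts_request(token: str, limit: int = 1) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists up to `limit` contacts."""
    return urllib.request.Request(
        f"{HUBSPOT_CONTACTS_URL}?limit={limit}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def token_is_accepted(token: str) -> bool:
    """Send the request; True on HTTP 200, False on any HTTP error response."""
    try:
        with urllib.request.urlopen(build_contacts_request(token), timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Raised for non-2xx responses, e.g. a rejected token or a missing scope.
        return False
```

Calling token_is_accepted with the copied token should return True once the private app has the contacts read scope. Network failures raise urllib.error.URLError and are deliberately not swallowed.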

Sync Modes

Full Refresh

Re-extracts every record from scratch on each run:

source:
  name: hubspot
  stream: contacts
  sync_mode: full_refresh

Incremental

Syncs only records created or updated since the last sync:

source:
  name: hubspot
  stream: contacts
  sync_mode: incremental

Not every stream supports incremental sync. The stream listing shows which mode each stream supports:

bizon stream list hubspot
# [Supports incremental] - contacts
# [Full refresh only] - owners

Example Configurations

Contacts to BigQuery

name: hubspot-contacts

source:
  name: hubspot
  stream: contacts
  sync_mode: incremental
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

destination:
  name: bigquery
  config:
    project_id: my-project
    dataset_id: crm
    dataset_location: US
    gcs_buffer_bucket: my-staging-bucket
    unnest: true
    record_schemas:
      - destination_id: my-project.crm.contacts
        record_schema:
          - name: id
            type: STRING
            mode: REQUIRED
          - name: email
            type: STRING
            mode: NULLABLE
          - name: firstname
            type: STRING
            mode: NULLABLE
          - name: lastname
            type: STRING
            mode: NULLABLE
          - name: createdate
            type: TIMESTAMP
            mode: NULLABLE
          - name: lastmodifieddate
            type: TIMESTAMP
            mode: NULLABLE

Companies to BigQuery

name: hubspot-companies

source:
  name: hubspot
  stream: companies
  sync_mode: incremental
  authentication:
    type: api_key
    params:
      token: BIZON_ENV_HUBSPOT_TOKEN

destination:
  name: bigquery
  config:
    project_id: my-project
    dataset_id: crm
    gcs_buffer_bucket: my-staging-bucket

Multiple Objects

A pipeline configuration names a single stream, so use one configuration file per object type and run each in turn:

# contacts.yml
bizon run contacts.yml

# companies.yml
bizon run companies.yml

# deals.yml
bizon run deals.yml
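If the pipelines should run back to back from a script, a small wrapper around the bizon run command shown above is enough. This is a minimal sketch: run_pipelines is a hypothetical helper, and the injectable runner argument exists only so the function can be exercised without Bizon installed.

```python
import subprocess
from typing import Callable, Sequence


def run_pipelines(
    configs: Sequence[str],
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> list[str]:
    """Run `bizon run <config>` for each file in order; return the configs that failed."""
    failed = []
    for config in configs:
        # check=False so one failing pipeline does not stop the others.
        result = runner(["bizon", "run", config], check=False)
        if result.returncode != 0:
            failed.append(config)
    return failed
```

For example, run_pipelines(["contacts.yml", "companies.yml", "deals.yml"]) returns an empty list when all three pipelines exit with status 0.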

Rate Limiting

HubSpot enforces API rate limits. Bizon retries rate-limited requests automatically; set retry_limit under api_config to cap the number of retries:

source:
  name: hubspot
  stream: contacts
  api_config:
    retry_limit: 10  # Max retries on rate limit
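To make the meaning of a retry limit concrete, the sketch below retries a call while it returns HTTP 429 (the status HubSpot uses for rate limiting) and gives up after retry_limit retries. The exponential backoff schedule and the call_with_retries helper are assumptions for illustration; Bizon's actual retry strategy may differ.

```python
import time
from typing import Callable, Tuple


def call_with_retries(
    send: Callable[[], Tuple[int, dict]],
    retry_limit: int = 10,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `send` until it stops returning 429, retrying at most `retry_limit` times."""
    for attempt in range(retry_limit + 1):
        status, body = send()
        if status != 429:
            return body
        if attempt < retry_limit:
            # Exponential backoff: 1s, 2s, 4s, ... capped at 60s.
            sleep(min(base_delay * 2 ** attempt, 60.0))
    raise RuntimeError(f"still rate limited after {retry_limit} retries")
```

Here send stands in for any function that performs one request and returns a (status, body) pair. A higher retry_limit tolerates longer bursts of throttling at the cost of a slower failure when the limit is never lifted.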

Data Structure

HubSpot records include:

| Field | Description |
|-------|-------------|
| `id` | HubSpot object ID |
| `properties` | Object properties (email, name, etc.) |
| `createdAt` | Record creation timestamp |
| `updatedAt` | Last modification timestamp |
| `archived` | Whether the record is archived |

Transforms

HubSpot nests object fields under properties. A Python transform can flatten them into top-level fields:

transforms:
  - label: flatten-properties
    python: |
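      # Fall back to an empty dict if 'properties' is missing or null
      props = data.get('properties') or {}
      # Replace the nested record with a flat one
      data = {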
        'id': data.get('id'),
        'email': props.get('email'),
        'firstname': props.get('firstname'),
        'lastname': props.get('lastname'),
        'company': props.get('company'),
        'created_at': data.get('createdAt'),
        'updated_at': data.get('updatedAt')
      }
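The mapping can be tried outside a pipeline by wrapping it in a plain function. The sketch below applies the same field mapping as the transform above to a made-up record; flatten_contact and all sample values are hypothetical.

```python
def flatten_contact(data: dict) -> dict:
    """Apply the same field mapping as the transform above to one record."""
    props = data.get("properties") or {}
    return {
        "id": data.get("id"),
        "email": props.get("email"),
        "firstname": props.get("firstname"),
        "lastname": props.get("lastname"),
        "company": props.get("company"),
        "created_at": data.get("createdAt"),
        "updated_at": data.get("updatedAt"),
    }


# Made-up record following the shape described under Data Structure.
sample = {
    "id": "101",
    "properties": {"email": "ada@example.com", "firstname": "Ada", "lastname": "Lovelace"},
    "createdAt": "2024-01-01T00:00:00.000Z",
    "updatedAt": "2024-01-03T12:00:00.000Z",
    "archived": False,
}

flat = flatten_contact(sample)
print(flat)
```

Properties absent from the record, such as company in this sample, come through as None, so the destination columns for them should be nullable.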

Next Steps

Sources Overview - Learn about source connectors
Sync Modes - Understand incremental sync
Authentication - Configure API keys