singer-io/tap-segment

tap-segment

This is a Singer tap that produces JSON-formatted data following the Singer spec.

This tap:

Streams

sources

  • Data Key = sources
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

destinations

  • Data Key = destinations
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

warehouses

  • Data Key = warehouses
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

source_connected_destinations

  • Data Key = destinations
  • Primary keys: ['sourceId']
  • Replication strategy: FULL_TABLE

source_connected_warehouses

  • Data Key = warehouses
  • Primary keys: ['sourceId']
  • Replication strategy: FULL_TABLE

destination_subscriptions

  • Data Key = subscriptions
  • Primary keys: ['destinationId']
  • Replication strategy: FULL_TABLE

catalog_sources

  • Data Key = sourcesCatalog
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

catalog_destinations

  • Data Key = destinationsCatalog
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

catalog_warehouses

  • Data Key = warehousesCatalog
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

usage_api_calls_per_source_daily

  • Data Key = dailyPerSourceAPICallsUsage
  • Primary keys: ['sourceId']
  • Replication strategy: INCREMENTAL

usage_api_calls_workspace_daily

  • Data Key = dailyWorkspaceAPICallsUsage
  • Primary keys: ['timestamp']
  • Replication strategy: INCREMENTAL

usage_mtu_per_source_daily

  • Data Key = dailyPerSourceMTUUsage
  • Primary keys: ['sourceId']
  • Replication strategy: INCREMENTAL

usage_mtu_workspace_daily

  • Data Key = dailyWorkspaceMTUUsage
  • Primary keys: ['timestamp']
  • Replication strategy: INCREMENTAL

destination_delivery_metrics_summary

  • Data Key = deliveryMetricsSummary
  • Primary keys: ['destinationId']
  • Replication strategy: INCREMENTAL

audit_events

  • Data Key = events
  • Primary keys: ['id']
  • Replication strategy: INCREMENTAL

users

  • Data Key = users
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

groups

  • Data Key = userGroups
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE

transformations

  • Data Key = transformations
  • Primary keys: ['id']
  • Replication strategy: FULL_TABLE
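
The two replication strategies listed above differ in whether a bookmark is consulted. The following is a minimal sketch of that difference, not the tap's actual code. The function names, the `timestamp` replication key, and the strict "newer than" comparison are all assumptions made for illustration.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an RFC 3339 timestamp such as 2019-01-01T00:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_records(records, strategy, bookmark, replication_key="timestamp"):
    """Return (records to emit, bookmark to save) for one stream.

    `replication_key` is a hypothetical field name used for illustration.
    """
    if strategy == "FULL_TABLE":
        # Full-table streams re-emit every record and leave the bookmark alone.
        return list(records), bookmark

    # Incremental streams emit only records newer than the bookmark
    # (strictly newer here, which is an assumption).
    cutoff = parse_ts(bookmark)
    emitted = [r for r in records if parse_ts(r[replication_key]) > cutoff]
    newest = max([bookmark] + [r[replication_key] for r in emitted], key=parse_ts)
    return emitted, newest
```

With this sketch, a FULL_TABLE stream returns every record on each run, while an INCREMENTAL stream returns only the records after the saved bookmark and advances the bookmark to the newest record seen.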

Authentication

Quick Start

  1. Install

    Clone this repository, and then install using setup.py. We recommend using a virtualenv:

    > virtualenv -p python3 venv
    > source venv/bin/activate
    > python setup.py install
    OR
    > cd .../tap-segment
    > pip install -e .
  2. Dependent libraries. Install the following dependent libraries:

    > pip install singer-python
    > pip install target-stitch
    > pip install target-json
    
  3. Create your tap's config.json file (the commands in step 5 refer to it as tap_config.json). The tap config file for this tap should include these entries:

    • start_date (string, RFC 3339 date): The default value to use if no bookmark exists for an endpoint.
    • user_agent (string, optional): Process and email for API logging purposes. Example: tap-segment <api_user_email@your_company.com>
    • request_timeout (integer, optional): Maximum time, in seconds, that a request waits for a response. Defaults to 300.
    {
        "start_date": "2019-01-01T00:00:00Z",
        "user_agent": "tap-segment <api_user_email@your_company.com>",
        "request_timeout": 300
    }
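
As a rough illustration of how the entries above fit together, the sketch below loads a config, checks `start_date`, and applies the 300-second default for `request_timeout`. It is not the tap's own loader. The `load_config` helper is hypothetical, and treating `start_date` as required is an assumption based on the other two entries being optional or having defaults.

```python
import json
from datetime import datetime

REQUIRED_KEYS = ["start_date"]   # assumption: the only entry without a default
DEFAULT_REQUEST_TIMEOUT = 300    # seconds, the default stated above


def load_config(text):
    """Parse config JSON text, validate it, and fill in defaults."""
    config = json.loads(text)

    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing required keys: {missing}")

    # Raises ValueError if start_date is not a parseable date string.
    datetime.fromisoformat(config["start_date"].replace("Z", "+00:00"))

    # Fall back to the default when request_timeout is absent or empty.
    config["request_timeout"] = int(
        config.get("request_timeout") or DEFAULT_REQUEST_TIMEOUT
    )
    return config
```

Calling `load_config(open("config.json").read())` would return the parsed dictionary with `request_timeout` always present.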

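    Optionally, also create a state.json file. currently_syncing is an optional attribute that identifies the stream being synced when a job is interrupted mid-stream, so that the next run begins where the last job left off. Note that the stream names in the example below (engage, export, funnels, revenue) are not among the streams listed above; substitute the names of the streams you sync.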

    {
        "currently_syncing": "engage",
        "bookmarks": {
            "export": "2019-09-27T22:34:39.000000Z",
            "funnels": "2019-09-28T15:30:26.000000Z",
            "revenue": "2019-09-28T18:23:53Z"
        }
    }
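
One way to picture how such a state file could be used on the next run is sketched below. This is an illustrative assumption, not the tap's implementation: the helper names are hypothetical, and rotating the stream list so the interrupted stream runs first is only one possible resume policy.

```python
def get_bookmark(state, stream_name, start_date):
    """Return the saved bookmark for a stream, or start_date if none exists."""
    return state.get("bookmarks", {}).get(stream_name, start_date)


def order_streams(stream_names, state):
    """Put the interrupted stream first, keeping the relative order of the rest."""
    current = state.get("currently_syncing")
    if current in stream_names:
        index = stream_names.index(current)
        # Rotate the list so the sync resumes at the interrupted stream.
        return stream_names[index:] + stream_names[:index]
    return list(stream_names)
```

For example, with `currently_syncing` set to `audit_events`, the stream order `sources, audit_events, users` would become `audit_events, users, sources`, and a stream without a bookmark would fall back to `start_date`.
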
  4. Run the Tap in Discovery Mode. This creates a catalog.json for selecting objects/fields to integrate:

    tap-segment --config config.json --discover > catalog.json

    See the Singer docs on discovery mode for details.
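
Selecting objects in catalog.json is usually done by setting `selected` in each stream's top-level metadata entry. The sketch below assumes the generated catalog follows the common Singer layout (a `streams` list whose items carry `tap_stream_id` and a `metadata` list with `breadcrumb` entries); the `select_streams` helper is hypothetical and not part of this tap.

```python
import json


def select_streams(catalog, wanted):
    """Mark the streams named in `wanted` as selected and all others as not."""
    for stream in catalog.get("streams", []):
        selected = stream.get("tap_stream_id") in wanted
        for entry in stream.get("metadata", []):
            # The entry with an empty breadcrumb describes the stream itself.
            if entry.get("breadcrumb") == []:
                entry.setdefault("metadata", {})["selected"] = selected
                break
        else:
            # No stream-level entry exists yet, so add one.
            stream.setdefault("metadata", []).append(
                {"breadcrumb": [], "metadata": {"selected": selected}}
            )
    return catalog
```

A typical use would be to load catalog.json with `json.load`, call `select_streams(catalog, {"sources", "users"})`, and write the result back with `json.dump` before running sync mode.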

  5. Run the Tap in Sync Mode (with catalog) and write out to state file

    For Sync mode:

    > tap-segment --config tap_config.json --catalog catalog.json > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json

    To load to json files to verify outputs:

    > tap-segment --config tap_config.json --catalog catalog.json | target-json > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json

    To pseudo-load to Stitch Import API with dry run:

    > tap-segment --config tap_config.json --catalog catalog.json | target-stitch --config target_config.json --dry-run > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
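
The `tail -1` step in the commands above keeps only the last line written to state.json. A hedged Python equivalent for raw tap output is sketched below. It assumes each output line is a JSON message with a `type` field, and that STATE messages carry their payload under `value`, as in the Singer message format; the `last_state` helper and the sample values used with it are hypothetical.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in a stream of Singer messages."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = json.loads(line)
        if message.get("type") == "STATE":
            # Later STATE messages supersede earlier ones.
            state = message.get("value")
    return state
```

Unlike `tail -1`, this sketch does not depend on the STATE message being the final line of output, and it returns `None` when no STATE message was emitted.
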
  6. Test the Tap

    While developing the Segment tap, the following utilities were run in accordance with Singer.io best practices:

    Pylint to improve code quality:

    > pylint tap_segment -d missing-docstring -d logging-format-interpolation -d too-many-locals -d too-many-arguments

    Pylint test resulted in the following score:

    Your code has been rated at 9.67/10

    To check the tap and verify working:

    > tap-segment --config tap_config.json --catalog catalog.json | singer-check-tap > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json

    Unit Tests

    Unit tests may be run with the following.

    python -m pytest --verbose
    

    Note, you may need to install test dependencies.

    pip install -e .'[dev]'
    

Copyright © 2019 Stitch

A Singer tap for extracting data from the Segment API
