This is a Singer tap that produces JSON-formatted data following the Singer spec.
This tap:
- Pulls raw data from the Harvest API.
- Extracts the resources listed below.
- Outputs the schema for each resource.
- Incrementally pulls data based on the input state.

- Data Key = projects
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = clients
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = contacts
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = estimate_item_categories
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = estimate_line_items
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = estimate_messages
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = estimates
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = expense_categories
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = expenses
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = external_reference
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = invoice_item_categories
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = invoice_line_items
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = invoice_messages
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = invoice_payments
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = invoices
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = task_assignments
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = project_users
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = roles
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = tasks
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = time_entries
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = time_entry_external_reference
- Primary keys: ['time_entry_id', 'external_reference_id']
- Replication strategy: INCREMENTAL
- Data Key = user_project_tasks
- Primary keys: ['user_id', 'project_task_id']
- Replication strategy: INCREMENTAL
- Data Key = project_assignments
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
- Data Key = user_roles
- Primary keys: ['role_id', 'user_id']
- Replication strategy: INCREMENTAL
- Data Key = users
- Primary keys: ['id']
- Replication strategy: INCREMENTAL
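
As a rough illustration of how the primary keys above identify records, the sketch below builds a key tuple for a record and uses it to drop superseded copies. STREAM_KEYS is a hand-copied subset of the list above, and record_key and dedupe are hypothetical helper names for this example, not part of the tap.

```python
# Hypothetical subset of the stream list above: stream name -> primary key fields.
STREAM_KEYS = {
    "projects": ["id"],
    "user_roles": ["role_id", "user_id"],
    "time_entry_external_reference": ["time_entry_id", "external_reference_id"],
}


def record_key(stream, record):
    """Return the tuple of primary key values that identifies a record."""
    return tuple(record[field] for field in STREAM_KEYS[stream])


def dedupe(stream, records):
    """Keep the last record seen for each primary key, in first-seen key order."""
    latest = {}
    for record in records:
        latest[record_key(stream, record)] = record
    return list(latest.values())
```

Streams with a composite key, such as user_roles, need every listed field to tell two records apart, which is why the key is a tuple rather than a single value.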

- Install
Clone this repository, and then install using setup.py. We recommend using a virtualenv:
> virtualenv -p python3 venv
> source venv/bin/activate
> python setup.py install
OR
> cd .../tap-harvest
> pip install -e .

- Dependent libraries
The following dependent libraries were installed:
> pip install singer-python
> pip install target-stitch
> pip install target-json

- Create your tap's config.json file
The tap config file for this tap should include these entries:
  - start_date: the default value to use if no bookmark exists for an endpoint (RFC 3339 date string)
  - user_agent (string, optional): Process and email for API logging purposes. Example: tap-harvest <api_user_email@your_company.com>
  - request_timeout (integer, optional): Maximum time, in seconds, that a request waits for a response. Defaults to 300.

{
  "start_date": "2019-01-01T00:00:00Z",
  "user_agent": "tap-harvest <api_user_email@your_company.com>",
  "request_timeout": 300,
  ...
}

Optionally, also create a state.json file. currently_syncing is an optional attribute used for identifying the last object to be synced in case the job is interrupted mid-stream. The next run would begin where the last job left off.

{
  "currently_syncing": "engage",
  "bookmarks": {
    "export": "2019-09-27T22:34:39.000000Z",
    "funnels": "2019-09-28T15:30:26.000000Z",
    "revenue": "2019-09-28T18:23:53Z"
  }
}
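
How start_date, request_timeout, and the state file interact can be sketched as follows. This is a minimal illustration, not the tap's actual code: load_config, parse_rfc3339, and get_bookmark are hypothetical helpers, and the sketch assumes bookmarks are stored as one RFC 3339 string per stream, as in the state example above.

```python
import json
from datetime import datetime

DEFAULT_REQUEST_TIMEOUT = 300  # seconds, the default stated for request_timeout


def parse_rfc3339(value):
    """Parse a timestamp such as 2019-01-01T00:00:00Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def load_config(text):
    """Parse config JSON, check start_date, and apply the default timeout."""
    config = json.loads(text)
    if "start_date" not in config:
        raise ValueError("config is missing start_date")
    parse_rfc3339(config["start_date"])  # raises ValueError if malformed
    config.setdefault("request_timeout", DEFAULT_REQUEST_TIMEOUT)
    return config


def get_bookmark(state, stream, config):
    """Return the stream's bookmark, falling back to start_date (assumed layout)."""
    return state.get("bookmarks", {}).get(stream, config["start_date"])
```

With an empty state, every stream would start from start_date; once a bookmark exists for a stream, only that stream's starting point changes.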

- Run the Tap in Discovery Mode
This creates a catalog.json for selecting objects/fields to integrate:
> tap-harvest --config config.json --discover > catalog.json
See the Singer docs on discovery mode for details.
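
Selecting objects usually means editing the metadata in catalog.json before running a sync. The sketch below assumes the common Singer catalog layout, where each stream carries a metadata list and the entry with an empty breadcrumb holds stream-level settings such as selected. select_streams is a hypothetical helper, and the exact catalog this tap emits may differ.

```python
def select_streams(catalog, wanted):
    """Mark the named streams as selected in a Singer-style catalog dict."""
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") not in wanted:
            continue
        for entry in stream.setdefault("metadata", []):
            if entry.get("breadcrumb") == []:
                # Stream-level metadata entry found: flag it as selected.
                entry.setdefault("metadata", {})["selected"] = True
                break
        else:
            # No stream-level entry yet, so add one.
            stream["metadata"].append(
                {"breadcrumb": [], "metadata": {"selected": True}}
            )
    return catalog
```

A typical use would be to load catalog.json with json.load, call select_streams(catalog, {"projects", "time_entries"}), and write the result back with json.dump.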

- Run the Tap in Sync Mode (with catalog) and write out to state file
For Sync mode:
> tap-harvest --config config.json --catalog catalog.json > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
To load to JSON files to verify outputs:
> tap-harvest --config config.json --catalog catalog.json | target-json > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
To pseudo-load to Stitch Import API with dry run:
> tap-harvest --config config.json --catalog catalog.json | target-stitch --config target_config.json --dry-run > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
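
The tail -1 step keeps only the final line of output as the new state. A rough Python counterpart for raw tap output is sketched below. It assumes the standard Singer message format, in which state is emitted as JSON lines of the form {"type": "STATE", "value": {...}}. last_state is a hypothetical helper, not something the tap provides.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in Singer output lines."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = json.loads(line)
        if message.get("type") == "STATE":
            state = message.get("value")
    return state
```

Unlike tail -1, this returns the value of the last STATE message rather than the last line verbatim, so it still works if the final line of output is not a state message.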

- Test the Tap
While developing the Harvest tap, the following utilities were run in accordance with Singer.io best practices:
Pylint to improve code quality:
> pylint tap_harvest -d missing-docstring -d logging-format-interpolation -d too-many-locals -d too-many-arguments
Pylint test resulted in the following score:
Your code has been rated at 9.67/10
To check the tap's output with singer-check-tap:
> tap-harvest --config config.json --catalog catalog.json | singer-check-tap > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
Unit tests may be run with the following:
> python -m pytest --verbose
Note, you may need to install test dependencies:
> pip install -e .'[dev]'

Copyright © 2025 Stitch