
Mechanism for creating and populating tables #9

@jthandy


There are multiple cases where we want to provide supplemental data to be joined with the data used for analysis:

  • Mapping data to decode status codes received from services. For example, Pardot visitor_activities has type and type_name, which we're decoding and then mapping to event_action.
  • Creating calendar tables. There isn't a standard way to do this in Redshift and, after much investigation, the best approach really is to join against a table that contains all relevant dates.
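As a rough illustration of the calendar-table case, here is a minimal sketch that generates one row per date and writes the rows as CSV, ready to be loaded into a table. The function names and the column names (`date_day`, `year`, `month`, `day_of_week`) are hypothetical, chosen only for this example; they are not part of any existing code.

```python
import csv
from datetime import date, timedelta


def calendar_rows(start, end):
    """Yield one dict per day from start to end, inclusive."""
    day = start
    while day <= end:
        yield {
            "date_day": day.isoformat(),
            "year": day.year,
            "month": day.month,
            "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
        }
        day += timedelta(days=1)


def write_calendar_csv(start, end, out):
    """Write the calendar rows to a file-like object; return the row count."""
    # Hypothetical column names, used only for illustration.
    fieldnames = ["date_day", "year", "month", "day_of_week"]
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    count = 0
    for row in calendar_rows(start, end):
        writer.writerow(row)
        count += 1
    return count
```

Analysis queries could then join against the loaded table on the date column to get a row for every day, including days with no activity.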

To deal with this, we need a procedural way to build and tear down datasets that is fully integrated into the way we deploy models. This should likely be a Python script, integrated into runner.py, that runs multiple Python files, each responsible for building or tearing down a particular table. Some tables will be best built via code and some will be best loaded from a CSV, so it should be flexible enough to handle multiple methods of data population.
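One possible shape for this, as a sketch only: each table is an object with `build` and `teardown` methods, and a small driver runs them in order. Everything here is an assumption made for illustration: the class names `CsvTable` and `CodeTable`, the `run_all` function, and the use of `sqlite3` as a local stand-in for Redshift so the example can run anywhere. None of it reflects how runner.py is actually structured.

```python
import csv
import io
import sqlite3


class CsvTable:
    """Hypothetical builder: loads a table from CSV text with a header row."""

    def __init__(self, name, csv_text):
        self.name = name
        self.csv_text = csv_text

    def build(self, conn):
        rows = list(csv.reader(io.StringIO(self.csv_text)))
        header, data = rows[0], rows[1:]
        columns = ", ".join('"{}" TEXT'.format(c) for c in header)
        marks = ", ".join("?" for _ in header)
        conn.execute('CREATE TABLE "{}" ({})'.format(self.name, columns))
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(self.name, marks), data
        )

    def teardown(self, conn):
        conn.execute('DROP TABLE IF EXISTS "{}"'.format(self.name))


class CodeTable(CsvTable):
    """Hypothetical builder: populates a table from rows produced by code."""

    def __init__(self, name, columns, make_rows):
        self.name = name
        self.columns = columns
        self.make_rows = make_rows  # callable returning an iterable of tuples

    def build(self, conn):
        columns = ", ".join('"{}" TEXT'.format(c) for c in self.columns)
        marks = ", ".join("?" for _ in self.columns)
        conn.execute('CREATE TABLE "{}" ({})'.format(self.name, columns))
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(self.name, marks),
            self.make_rows(),
        )


def run_all(conn, tables, action):
    """Tear down every table; rebuild it as well when action is 'build'."""
    if action not in ("build", "teardown"):
        raise ValueError("unknown action: {}".format(action))
    for table in tables:
        table.teardown(conn)  # dropping first makes 'build' safe to re-run
        if action == "build":
            table.build(conn)
    conn.commit()
```

Dropping before building keeps `build` idempotent, which matters if it runs on every deploy. A real version would swap the `sqlite3` connection for a Redshift one and would likely declare proper column types rather than `TEXT` everywhere.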
