Schema resources POC #529

Draft
beevital wants to merge 8 commits into main from schema-resource-poc

@beevital (Collaborator) commented Feb 26, 2026

Summary

Introduces a modular, hierarchical approach to managing Fivetran connection schema configuration as an alternative to the monolithic fivetran_connector_schema_config resource. The new resources allow independent management of schemas, tables, and columns with proper dependency ordering, concurrency safety, and drift detection.

New Resources

| Resource | Purpose |
|---|---|
| fivetran_connection_schemas_config | Manages schema_change_handling policy and schema-level enable/disable |
| fivetran_connection_schema_tables_config | Manages table enable/disable and sync modes within a schema |
| fivetran_connection_table_columns_config | Manages column enable/disable, hashing, and primary keys within a table |

New Action

| Action | Purpose |
|---|---|
| fivetran_connection_schema_reload | Triggers schema discovery from source with configurable timeout and polling |

Key Design Decisions

  • Two mutually exclusive lists per level (disabled_* / enabled_*): the list you choose defines the complete desired state — listed items get one state, everything else gets the opposite. This eliminates ambiguity about unmanaged items.
  • Policy-independent lists: either list can be used with any schema_change_handling policy. Changing the policy in connection_schemas_config does not break downstream table/column resources.
  • Selective tracking: sync_mode, hashed_columns, and primary_key_columns are only tracked when explicitly configured. When omitted, external changes to these settings are ignored.
  • Per-connection mutex (core.SchemaLocks): serializes all schema-modifying API calls for the same connection_id to prevent optimistic lock conflicts.
  • 409 Conflict retry: automatic retry with exponential backoff (up to 5 attempts) for external conflicts outside Terraform's control.
  • PATCH response reuse: eliminates redundant GET calls after PATCH by using the PATCH response directly as the new state.
  • enabled_patch_settings validation: checks that system tables/columns can be modified before sending the PATCH request, reporting all blocked items in a single error.
  • Column list API: connection_table_columns_config uses the dedicated column list endpoint (GET .../columns) since schema details only returns previously configured columns.
  • FastStringSetType: custom attribute type backed by tftypes.List with ListSemanticEquals for O(n) set comparison, solving the O(n²) performance problem with types.Set at scale.
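
The per-connection mutex listed above can be pictured with a short sketch. This is not the PR's `core.SchemaLocks` code; the type `schemaLocks` and the helpers `lockFor`, `withLock`, and `maxConcurrent` are hypothetical names chosen for illustration. It assumes one `sync.Mutex` per `connection_id`, created lazily, and uses atomic counters to check that only one caller is ever inside the critical section for a given connection.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// schemaLocks hands out one mutex per connection ID.
// Hypothetical sketch, not the PR's core.SchemaLocks implementation.
type schemaLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newSchemaLocks() *schemaLocks {
	return &schemaLocks{locks: make(map[string]*sync.Mutex)}
}

// lockFor returns the mutex for a connection, creating it on first use.
func (s *schemaLocks) lockFor(connectionID string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	l, ok := s.locks[connectionID]
	if !ok {
		l = &sync.Mutex{}
		s.locks[connectionID] = l
	}
	return l
}

// withLock runs fn while holding the mutex of the given connection.
func (s *schemaLocks) withLock(connectionID string, fn func()) {
	l := s.lockFor(connectionID)
	l.Lock()
	defer l.Unlock()
	fn()
}

// maxConcurrent starts n goroutines against one connection and returns the
// highest number of callers observed inside the critical section at once.
func maxConcurrent(s *schemaLocks, connectionID string, n int) int {
	var wg sync.WaitGroup
	var inFlight, peak int32
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.withLock(connectionID, func() {
				cur := atomic.AddInt32(&inFlight, 1)
				for {
					old := atomic.LoadInt32(&peak)
					if cur <= old || atomic.CompareAndSwapInt32(&peak, old, cur) {
						break
					}
				}
				atomic.AddInt32(&inFlight, -1)
			})
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt32(&peak))
}

func main() {
	locks := newSchemaLocks()
	fmt.Println(maxConcurrent(locks, "conn_a", 50)) // prints 1
}
```

Calls for different connection IDs take different mutexes in this sketch, so they are not serialized against each other; only calls sharing a `connection_id` queue up.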

Test Coverage

  • 60+ mock tests across all resources covering: CRUD, import, drift detection (managed items changed externally, new items appearing, items dropped from source), error handling (schema not loaded, connection deleted, table not found, system columns blocked), 409 conflict retry, concurrent access (mutex serialization verified with atomic counters), duplicate validation, large-scale performance (10k schemas), and enabled_patch_settings validation.
  • Full-flow integration test (TestSchemaManagementFullFlow): exercises the complete lifecycle — connection creation → schema reload via lifecycle action → 3 schemas × 3 tables configuration with locals-driven config → column-level settings with hashing and PKs → unpause via connector_schedule. Stateful mock verifies every mutation.
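
The 409 conflict retry covered by the mock tests can be sketched as follows. This is an assumption-laden illustration rather than the provider's code: `errConflict`, `retryOnConflict`, `conflictNTimes`, and `noSleep` are hypothetical names, the base delay is arbitrary, and the sleep function is injected so a test can run without waiting. The loop retries only on conflict, doubles the delay after each failed attempt, and gives up after the attempt limit.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConflict stands in for an HTTP 409 response (hypothetical sentinel).
var errConflict = errors.New("409 conflict")

// retryOnConflict calls op up to maxAttempts times. It retries only when op
// returns errConflict, doubling the delay after each failed attempt.
// It returns the number of attempts made and the last error.
func retryOnConflict(maxAttempts int, baseDelay time.Duration,
	sleep func(time.Duration), op func() error) (int, error) {
	delay := baseDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil || !errors.Is(err, errConflict) {
			return attempt, err // success or a non-retryable error
		}
		if attempt < maxAttempts {
			sleep(delay)
			delay *= 2 // exponential backoff
		}
	}
	return maxAttempts, err
}

// conflictNTimes returns an operation that fails with a conflict for the
// first n calls and succeeds afterwards, like a simple stateful mock.
func conflictNTimes(n int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= n {
			return errConflict
		}
		return nil
	}
}

// noSleep lets tests skip the real backoff delay.
func noSleep(time.Duration) {}

func main() {
	attempts, err := retryOnConflict(5, 10*time.Millisecond, time.Sleep, conflictNTimes(2))
	fmt.Println(attempts, err) // prints "3 <nil>"
}
```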

Documentation

  • Doc templates for all 4 new resources/actions with subcategory: "Preview"
  • Modular Schema Management guide covering the full setup pattern with for_each examples
  • Generated docs via tfplugindocs

Files

New:

  • fivetran/framework/resources/connection_schemas_config.go
  • fivetran/framework/resources/connection_schema_tables_config.go
  • fivetran/framework/resources/connection_table_columns_config.go
  • fivetran/framework/actions/connection_schema_reload.go
  • fivetran/framework/core/schema_locks.go
  • fivetran/framework/core/fivetrantypes/fast_string_set.go
  • Tests and doc templates

Modified:

  • fivetran/framework/provider.go — registered new resources and action

@beevital self-assigned this Feb 26, 2026

ModifyPlan on all three schema config resources now warns when modifying
a connection that has already synced data, as changes may trigger expensive
resyncs. Connection sync status is cached per connection_id (1min TTL)
to avoid redundant API calls when multiple resources share a connection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
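
The per-connection caching described in this commit message could look roughly like the sketch below. All names here (`syncStatusCache`, `cacheEntry`, `get`, `countFetches`) are hypothetical, and the clock is injected so the expiry logic can be exercised without waiting; the real provider code may be structured differently. The idea is that several resources sharing a `connection_id` hit the API once per TTL window.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheEntry records a fetched sync status and when it was fetched.
type cacheEntry struct {
	synced    bool
	fetchedAt time.Time
}

// syncStatusCache keeps one entry per connection ID for the given TTL.
// Hypothetical sketch, not the provider's actual cache.
type syncStatusCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time
	entries map[string]cacheEntry
}

func newSyncStatusCache(ttl time.Duration, now func() time.Time) *syncStatusCache {
	return &syncStatusCache{ttl: ttl, now: now, entries: make(map[string]cacheEntry)}
}

// get returns the cached status while it is fresh; otherwise it calls fetch
// and stores the result. The lock is held during fetch to keep the sketch short.
func (c *syncStatusCache) get(id string, fetch func(string) (bool, error)) (bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[id]; ok && c.now().Sub(e.fetchedAt) < c.ttl {
		return e.synced, nil
	}
	v, err := fetch(id)
	if err != nil {
		return false, err
	}
	c.entries[id] = cacheEntry{synced: v, fetchedAt: c.now()}
	return v, nil
}

// countFetches performs `lookups` reads spaced stepSec apart on a fake clock
// and returns how many of them reached the fetch function.
func countFetches(lookups, stepSec, ttlSec int) int {
	clock := time.Unix(0, 0)
	cache := newSyncStatusCache(time.Duration(ttlSec)*time.Second,
		func() time.Time { return clock })
	fetches := 0
	for i := 0; i < lookups; i++ {
		cache.get("conn_a", func(string) (bool, error) {
			fetches++
			return true, nil
		})
		clock = clock.Add(time.Duration(stepSec) * time.Second)
	}
	return fetches
}

func main() {
	// Three lookups 10s apart fit in one 60s window; 40s apart they do not.
	fmt.Println(countFetches(3, 10, 60), countFetches(3, 40, 60)) // prints "1 2"
}
```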

@fivetran-renat-sh (Contributor) left a comment

Looks good overall. Thank you!

It would be good to add an E2E test that checks against the real-world API.

},
"exclude_mode": actionSchema.StringAttribute{
Optional: true,
Description: "The exclude mode for the schema reload. Accepted values: PRESERVE, EXCLUDE. Default: PRESERVE.",
Contributor

It would be good to describe the difference between PRESERVE and EXCLUDE, and the scenarios in which to use each of them.

Comment on lines +64 to +69
- `is_historical_sync` (Boolean) The boolean specifying whether the connector should be triggered to re-sync all historical data. If you set this parameter to TRUE, the next scheduled sync will be historical. If the value is FALSE or not specified, the connector will not re-sync historical data. NOTE: When the value is TRUE, only the next scheduled sync will be historical, all subsequent ones will be incremental. This parameter is set to FALSE once the historical sync is completed.
- `setup_state` (String) The current setup state of the connector. The available values are: <br /> - incomplete - the setup config is incomplete, the setup tests never succeeded `connected` - the connector is properly set up, `broken` - the connector setup config is broken.
- `sync_state` (String) The current sync state of the connector. The available values are: `scheduled` - the sync is waiting to be run, `syncing` - the sync is currently running, `paused` - the sync is currently paused, `rescheduled` - the sync is waiting until more API calls are available in the source service.
- `tasks` (Attributes Set) The collection of tasks for the connector. (see [below for nested schema](#nestedatt--status--tasks))
- `update_state` (String) The current data update state of the connector. The available values are: `on_schedule` - the sync is running smoothly, no delays, `delayed` - the data is delayed for a longer time than expected for the update.
- `warnings` (Attributes Set) The collection of warnings for the connector. (see [below for nested schema](#nestedatt--status--warnings))
Contributor

Please revert the connection -> connector change (maybe revert connection.md, connections.md, destination.md, and destinations.md entirely). The descriptions should be fixed, but that is unrelated to this PR.

# All schemas are enabled except these:
disabled_schemas = ["staging", "temp"]

depends_on = [fivetran_connector.pg]
@fivetran-renat-sh (Contributor) commented Feb 27, 2026

Is this supposed to depend on the schema reload instead? For example:

Suggested change
- depends_on = [fivetran_connector.pg]
+ depends_on = [fivetran_connection_schema_reload.reload]

As written, this is the same as the old approach, where fivetran_connector_schema_config depends on fivetran_connector.


The modular resources detect drift at every level:

- A schema disabled externally → `disabled_schemas` grows, plan shows the change
Contributor

Suggested change
- - A schema disabled externally → `disabled_schemas` grows, plan shows the change
+ - A schema is disabled externally → `disabled_schemas` grows, plan shows the change

- A schema disabled externally → `disabled_schemas` grows, plan shows the change
- A new table appears disabled → `disabled_tables` grows, plan shows the change
- A column gets hashed externally (when `hashed_columns` is managed) → drift detected
- A managed item is dropped from the source → it disappears from the list, plan shows the change
Contributor

Suggested change
- - A managed item is dropped from the source → it disappears from the list, plan shows the change
+ - Schema/table/column is dropped from the source → it disappears from the list, plan shows the change


- **`disabled_tables`** — listed tables are **disabled**, all other tables are **enabled**
- **`enabled_tables`** — listed tables are **enabled**, all other tables are **disabled**

Contributor

It looks better to emphasize that the choice strictly depends on the connection's schema_change_handling: disabled_tables is used with ALLOW_ALL, and enabled_tables is used with BLOCK_ALL or ALLOW_COLUMNS.

Collaborator, Author

No, it actually isn't.
You can define the exact set of disabled or enabled tables/columns/schemas without referring to the policy.

The policy regulates how Fivetran handles new elements that come from the source.

It's up to you as a TF user which list you want to use (maybe whichever is shorter).
The behavior does differ a bit, though. For example, if you specify enabled_tables with ALLOW_ALL and a new table appears in the schema, it will produce drift on the next plan. If you use disabled_tables with ALLOW_ALL, there is no drift. So it's fully up to you.
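
The drift difference between the two lists can be shown with a small model. This is a simplified sketch, not the provider's refresh logic: `observedList` and `hasDrift` are hypothetical helpers, and it assumes that under ALLOW_ALL a newly discovered table arrives enabled. The configured list is compared with the list rebuilt from upstream state, which is what a plan would effectively diff.

```go
package main

import (
	"fmt"
	"sort"
)

// observedList rebuilds a list from upstream state: the sorted names of all
// tables whose enabled flag equals wantEnabled.
func observedList(upstream map[string]bool, wantEnabled bool) []string {
	names := []string{}
	for name, enabled := range upstream {
		if enabled == wantEnabled {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// hasDrift reports whether the configured list differs from the list rebuilt
// from upstream. listIsEnabled selects enabled_tables (true) or
// disabled_tables (false) semantics.
func hasDrift(configured []string, upstream map[string]bool, listIsEnabled bool) bool {
	observed := observedList(upstream, listIsEnabled)
	want := append([]string(nil), configured...) // copy before sorting
	sort.Strings(want)
	if len(want) != len(observed) {
		return true
	}
	for i := range want {
		if want[i] != observed[i] {
			return true
		}
	}
	return false
}

func main() {
	// Upstream after a new table "events" appears, enabled (ALLOW_ALL assumed).
	upstream := map[string]bool{"orders": true, "users": true, "tmp": false, "events": true}

	// enabled_tables = ["orders", "users"]: the new enabled table is unlisted.
	fmt.Println(hasDrift([]string{"orders", "users"}, upstream, true)) // prints true

	// disabled_tables = ["tmp"]: the new enabled table does not affect the list.
	fmt.Println(hasDrift([]string{"tmp"}, upstream, false)) // prints false
}
```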

Do(ctx)
if detailsErr == nil && detailsResponse.Code != "NotFound_SchemaConfig" {
return
}
Contributor

If the API returns a 500, we should return immediately as well.

Comment on lines +228 to +229
data connectionSchemaTablesConfigModel,
tables map[string]*connections.ConnectionSchemaConfigTableResponse,
Contributor

Suggested change
- data connectionSchemaTablesConfigModel,
- tables map[string]*connections.ConnectionSchemaConfigTableResponse,
+ localData connectionSchemaTablesConfigModel,
+ upstreamTables map[string]*connections.ConnectionSchemaConfigTableResponse,

d.EnabledTables = buildOrderedSet(collectTableNames(tables, true), d.EnabledTables)
} else {
// Import: populate based on API policy
if schemaResp.Data.SchemaChangeHandling == "ALLOW_ALL" || schemaResp.Data.SchemaChangeHandling == "ALLOW_COLUMNS" {
Contributor

From docs/resources/connection_schema_tables_config.md :

  • disabled_tables (List of String) Set of table names to disable. Use when the connection's schema_change_handling is ALLOW_ALL.
  • enabled_tables (List of String) Set of table names to enable. Use when the connection's schema_change_handling is BLOCK_ALL or ALLOW_COLUMNS.

So it is probably expected to be:

Suggested change
- if schemaResp.Data.SchemaChangeHandling == "ALLOW_ALL" || schemaResp.Data.SchemaChangeHandling == "ALLOW_COLUMNS" {
+ if schemaResp.Data.SchemaChangeHandling == "ALLOW_ALL" {


- **`disabled_columns`** — listed columns are **disabled**, all other columns are **enabled**
- **`enabled_columns`** — listed columns are **enabled**, all other columns are **disabled**

Contributor

We need to mention that terraform import populates disabled_columns and does not populate enabled_columns.
