This repository was archived by the owner on Sep 6, 2023. It is now read-only.

Conversation

@HenriSchulte-MS
Contributor

Currently, data is not deliberately partitioned in the dataflow. Partitioning based on a unique identifier (systemid + company) can reduce data shuffling between worker nodes and reduce execution time.
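To illustrate why partitioning on the unique identifier helps, here is a minimal sketch in plain Python. It is not the dataflow's own implementation: the field names `systemId`, `company`, and `version`, the helper functions, and the sample rows are all hypothetical. The point it shows is that once rows are hash-partitioned on (systemid + company), every version of a record sits in the same partition, so a per-record step such as keeping only the latest version can run inside each partition without moving rows between workers.

```python
import hashlib
from collections import defaultdict


def partition_index(system_id: str, company: str, num_partitions: int) -> int:
    """Map a (systemId, company) key to a partition with a stable hash."""
    key = f"{system_id}|{company}".encode("utf-8")
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def hash_partition(rows, num_partitions):
    """Group rows so that all rows sharing a key land in one partition."""
    partitions = defaultdict(list)
    for row in rows:
        idx = partition_index(row["systemId"], row["company"], num_partitions)
        partitions[idx].append(row)
    return partitions


def latest_per_key(partition):
    """Keep the highest-version row per key, using only local rows."""
    latest = {}
    for row in partition:
        key = (row["systemId"], row["company"])
        if key not in latest or row["version"] > latest[key]["version"]:
            latest[key] = row
    return list(latest.values())


# Hypothetical sample data: several versions of the same records.
rows = [
    {"systemId": "a1", "company": "CRONUS", "version": 1},
    {"systemId": "a1", "company": "CRONUS", "version": 2},
    {"systemId": "a1", "company": "Contoso", "version": 1},
    {"systemId": "b2", "company": "CRONUS", "version": 1},
    {"systemId": "b2", "company": "CRONUS", "version": 3},
]

partitions = hash_partition(rows, num_partitions=4)

# Each partition is processed independently; no cross-partition lookups.
consolidated = [r for part in partitions.values() for r in latest_per_key(part)]
for row in sorted(consolidated, key=lambda r: (r["systemId"], r["company"])):
    print(row)
```

Without key-based partitioning, versions of the same record may be spread over several workers, and the engine has to shuffle them together before any per-record step can run.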

HenriSchulte-MS and others added 3 commits January 11, 2023 09:41
* Removing tracked deleted records should be treated as separate from the export process (#79)

* First draft

* Adding notable change

* update app.json

* adding tooltip help

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* TryFunction should not make DB calls (#82)

* first draft

* Further changes

* Correcting the telemetry IDs

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* Separate changelog (#85)

Added separate changelog to shorten readme

* Added ADLS Run API page (#90)

* Merge branch 'main' of https://github.com/microsoft/bc2adls

* Adjusted version

* Improvements to logging (#89)

* Improvements to logging

* LockTable in Try function

---------

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* Access denied issue on spark notebook (#92)

* Added step

* minor

* Update SharedMetadataTables.md

Clarified instructions reg. naming of the managed identity and reason for adding the permissions

---------

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>
Co-authored-by: Henri Schulte <77101781+HenriSchulte-MS@users.noreply.github.com>

* Warn user before making schema changes if data already exported. (#96)

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* Internal Fields cannot be exported (#98)

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* Only start export for Enabled tables (#97)

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* Skip global trigger event subscriber on missing license or permissionset (#100)

* Skip event subscribers when no license or permissions

* Increase version

---------

Co-authored-by: Ron Koppelaar <Ron.Koppelaar@cegeka-dsa.nl>

* Update Execution.md

Adding link to Microsoft documentation to consume ADLS Gen 2 resources

* Allow telemetry to be logged at all outputs. (#102)

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

* Adding the file path to the telemetry

* Add the testimonials received (#103)

* Add the testimonials received

* remove logos

---------

Co-authored-by: Soumya Dutta <soudutta@microsoft.com>

---------

Co-authored-by: Soumya Dutta <38040179+DuttaSoumya@users.noreply.github.com>
Co-authored-by: Soumya Dutta <soudutta@microsoft.com>
Co-authored-by: Bert Verbeek <71499421+Bertverbeek4PS@users.noreply.github.com>
Co-authored-by: Ron Koppelaar <33791875+RonKoppelaar@users.noreply.github.com>
Co-authored-by: Ron Koppelaar <Ron.Koppelaar@cegeka-dsa.nl>
@Arthurvdv

This sounds promising, may I ask a question about this?

"..If you plan on using non-equality comparisons in your custom expression, you should utilize the 'Fixed' broadcast setting and specify a minimum of 1 stream to be broadcast. If broadcasting, ensure that your Integration Runtime is sized appropriately.."

Is this a warning we should consider? And if so, what would be the best setting for the Broadcast options?


@HenriSchulte-MS
Contributor Author

@Arthurvdv The custom expression in the "Remove Deleted" step does not involve any non-equality comparisons, so I have not paid any mind to this warning.

@Arthurvdv

@HenriSchulte-MS, thank you for sharing. I'll update our pipeline ahead of the merge of this PR.
