This repository was archived by the owner on Oct 21, 2024. It is now read-only.

📝 [Docs] - Guides to use Spark Job #44

@Taekyoon

Description


dataverse version checks

  • I have checked that the issue still exists on the latest version of dataverse.

Location of the documentation

Setting Configuration

Documentation problem

When developers run a Spark job, the executor and driver settings matter a great deal: cost and execution time depend on how many executors are used and how much memory each consumes.
For deduplication especially, the number of executors and their memory allocation are critical when processing a huge dataset.

Suggestion

The docs need to explicitly show how developers can control executor resources, and how much the default settings will cost.
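For example, the guide could include a minimal sketch along these lines, assuming a plain PySpark entry point (the actual dataverse configuration interface and its default values are not shown in this issue; all numbers below are illustrative, not dataverse's defaults):

```python
from pyspark.sql import SparkSession

# Illustrative values only; dataverse's real defaults should be
# documented explicitly, as this issue requests.
spark = (
    SparkSession.builder
    .appName("dedup-job")
    # Number of executors (ignored if dynamic allocation is enabled)
    .config("spark.executor.instances", "4")
    # Memory per executor: the dominant cost/runtime factor when
    # deduplicating a huge dataset
    .config("spark.executor.memory", "8g")
    # Cores per executor
    .config("spark.executor.cores", "4")
    # Driver memory: must hold job metadata and any collected results
    .config("spark.driver.memory", "4g")
    .getOrCreate()
)
```

Fewer executors with more memory each, or more executors with less, trade cost against wall-clock time, so the guide should also state which trade-off the default setting makes.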

Labels: Docs (Improvements or additions to documentation)