Description
Tess can process all files in a given directory. The limitation is that they must all have the exact same schema if fields are required during the transforms.
New intrinsics have been added to clean up field names and add missing columns, but when applied to a set of files, Tess must be run on each file individually with a final Tess pass to merge them all.
Tess should accept multiple sources with the intent of merging them before the sink. That is, an identical branch will be created for each source, but each branch will accommodate the unique field names of its file. The requirement is that, by the time of the merge before the sink, all branches have the same declared fields.
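The branch-per-source idea can be sketched in plain Python. This is only a model of the behaviour described here, not Tess or Cascading code: `run_branch`, `merge_branches`, the sample field names, and the choice to fill missing columns with `None` are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[str]]
Branch = Tuple[Tuple[str, ...], List[Row]]  # (declared fields, rows)


def run_branch(rows: Sequence[Row], renames: Dict[str, str],
               declared: Sequence[str]) -> Branch:
    """Hypothetical per-source branch: clean up field names, add missing columns."""
    fields = tuple(declared)
    out: List[Row] = []
    for row in rows:
        # Map source-specific names onto the shared names.
        renamed = {renames.get(name, name): value for name, value in row.items()}
        # Project onto the declared fields; absent columns become None (assumption).
        out.append({name: renamed.get(name) for name in fields})
    return fields, out


def merge_branches(branches: Sequence[Branch]) -> List[Row]:
    """Merge before the sink; every branch must declare the same fields."""
    if not branches:
        return []
    expected = branches[0][0]
    merged: List[Row] = []
    for fields, rows in branches:
        if fields != expected:
            raise ValueError(f"branch declares {fields}, expected {expected}")
        merged.extend(rows)
    return merged


declared = ("id", "name", "email")
file_a = [{"ID": "1", "Full Name": "Ada"}]  # different names, no email column
file_b = [{"id": "2", "name": "Grace", "email": "grace@example.com"}]

branch_a = run_branch(file_a, {"ID": "id", "Full Name": "name"}, declared)
branch_b = run_branch(file_b, {}, declared)
merged = merge_branches([branch_a, branch_b])
```

The check in `merge_branches` compares declared fields in order, which is stricter than the issue requires; it is there only to show where the "same declared fields" rule would be enforced.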
Cascading supports this natively, but the PipelineDef model will need to be updated. Alternatively, and preferably, any named source not declared as a join file will be treated as a source for the main branch.
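The second option, inferring main-branch sources from the existing declarations, could look roughly like the following. The function name `classify_sources` and the shape of its inputs are hypothetical; the actual PipelineDef model is not shown in this issue.

```python
from typing import List, Sequence, Tuple


def classify_sources(named_sources: Sequence[str],
                     join_files: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split named sources into main-branch sources and join files.

    Any named source that is not declared as a join file is treated as a
    source for the main branch (the rule proposed in this issue).
    """
    join_names = set(join_files)
    unknown = join_names - set(named_sources)
    if unknown:
        # A join file that was never declared as a source is a definition error.
        raise ValueError(f"join files not declared as sources: {sorted(unknown)}")
    main = [name for name in named_sources if name not in join_names]
    joins = [name for name in named_sources if name in join_names]
    return main, joins


# Hypothetical source names for illustration only.
main_sources, join_sources = classify_sources(
    ["customers_2019", "customers_2020", "regions"],
    ["regions"],
)
```

With this rule, existing single-source definitions keep working unchanged, since a lone undeclared source is simply the only main-branch source.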