
Implementing logBERT for Spark data #56

@mshariful

Description


Dear All,
I would like to implement logBERT for Spark data. I have a question. Our Spark log data folder contains many subfolders, each of these subfolders
"application_1448006111297_0137", "application_1448006111297_0138" etc has many .log files. Do I need to join all these .log files into a single file, say "SPARK.log" and then use it in logBERT's "data_process.py" script where I can set log_file = "SPARK.log"?

I understand that I also need to make changes to the hdfs_sampling function so that it becomes a suitable spark_sampling function.
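As a rough illustration of that adaptation (a sketch only, assuming the merged SPARK.log lines carry an application-ID prefix as above): where hdfs_sampling groups entries by HDFS block ID, a spark_sampling step could group them by Spark application ID instead. The function name and line format here are assumptions, not logBERT's actual API:

```python
import re
from collections import defaultdict

# Hypothetical spark_sampling: group log lines into per-application
# sessions, analogous to hdfs_sampling's grouping by block ID.
def spark_sampling(log_lines):
    sessions = defaultdict(list)
    app_pat = re.compile(r"^(application_\d+_\d+)\s+(.*)$")
    for line in log_lines:
        m = app_pat.match(line)
        if m:
            # key = application ID, value = the log content for that app
            sessions[m.group(1)].append(m.group(2))
    return sessions

# Example: lines from two applications end up in two separate sessions.
lines = [
    "application_1448006111297_0137 INFO Executor started",
    "application_1448006111297_0138 INFO Task finished",
]
groups = spark_sampling(lines)
```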

Could you please suggest how I should best proceed?

Thank you very much for your kind attention.

Regards,
Shariful
