
Implementing logBERT for Spark data #56

@mshariful

Description


Dear All,
I would like to implement logBERT for Spark data. I have a question. Our Spark log data folder contains many subfolders, each of these subfolders
"application_1448006111297_0137", "application_1448006111297_0138" etc has many .log files. Do I need to join all these .log files into a single file, say "SPARK.log" and then use it in logBERT's "data_process.py" script where I can set log_file = "SPARK.log"?

I understand that I also need to make changes to the hdfs_sampling function so that it becomes a suitable spark_sampling function.
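As a rough illustration of that adaptation (a sketch only, assuming the merged SPARK.log lines carry an application-ID prefix as above): where hdfs_sampling groups entries by HDFS block ID, a spark_sampling step could group them by Spark application ID instead. The function name and line format here are assumptions, not logBERT's actual API:

```python
import re
from collections import defaultdict

# Hypothetical spark_sampling: group log lines into per-application
# sessions, analogous to hdfs_sampling's grouping by block ID.
def spark_sampling(log_lines):
    sessions = defaultdict(list)
    app_pat = re.compile(r"^(application_\d+_\d+)\s+(.*)$")
    for line in log_lines:
        m = app_pat.match(line)
        if m:
            # key = application ID, value = the log content for that app
            sessions[m.group(1)].append(m.group(2))
    return sessions

# Example: lines from two applications end up in two separate sessions.
lines = [
    "application_1448006111297_0137 INFO Executor started",
    "application_1448006111297_0138 INFO Task finished",
]
groups = spark_sampling(lines)
```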

Could you please suggest how I should best proceed?

Thank you very much for your kind attention.

Regards,
Shariful
