Ignore files starting with "_" #13

@ghost

Description

Dear devs,

First, I wanted to thank you for this piece of software, it is really great!

I have one request I would like to raise with you, if possible. Could you please change the code so that files starting with "_" are ignored? The use case is as follows:
I have a data source that is quite slow. I use Apache Flume to store that data into HDFS. Because the data velocity is low, I set up Flume to roll to a new file every 10 minutes. This creates a lot of small files, which your crusher handles just perfectly.
Now the issue is that Flume's temp files (i.e. files that are not closed yet) start with "_" and have ".tmp" appended. If Flume closes such a file while the crusher is running, well... the file is not found. I would also like to avoid errors on Flume's side, so the crusher should not touch those files at all.

The request is thus to either have a new option to ignore files starting with "_" or just ignore them by default.
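
To make the request concrete, here is a rough sketch of the filtering I have in mind. It is written in Python only for illustration and is not the crusher's actual code: the function names `should_ignore` and `select_crushable`, the `ignore_prefix` parameter, and the sample paths are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def should_ignore(path: str, ignore_prefix: str = "_") -> bool:
    """Return True when the file name itself starts with ignore_prefix.

    Only the last path component is checked, so a directory whose name
    starts with "_" does not hide the files inside it. An empty prefix
    disables the check (hypothetical way to model "option turned off").
    """
    name = PurePosixPath(path).name
    return bool(ignore_prefix) and name.startswith(ignore_prefix)


def select_crushable(paths: Iterable[str], ignore_prefix: str = "_") -> List[str]:
    """Keep only the paths that are safe to hand to the crusher."""
    return [p for p in paths if not should_ignore(p, ignore_prefix)]


# Made-up example paths: one closed file, one Flume in-progress file,
# and one closed file inside a directory whose name starts with "_".
candidates = [
    "/flume/events/FlumeData.1400000000000",
    "/flume/events/_FlumeData.1400000000001.tmp",
    "/flume/_staging/FlumeData.1400000000002",
]

print(select_crushable(candidates))
```

With the default prefix, only the second path (the file still being written by Flume) is skipped. Passing `ignore_prefix=""` would keep the current behaviour, which is how I imagine the "new option" variant of the request.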

Thanks a lot!
