Hi, with the code below it takes more than 6 hours to write ImageNet-1k to a .beton file, which is a nontrivial cost.
from ffcv.writer import DatasetWriter
from ffcv.fields import RGBImageField, IntField
# Your dataset (`torch.utils.data.Dataset`) of (image, label) pairs
my_dataset = make_my_dataset()
write_path = '/output/path/for/converted/ds.beton'
# Pass a type for each data field
writer = DatasetWriter(write_path, {
    # Tune options to optimize dataset size, throughput at train-time
    'image': RGBImageField(
        max_resolution=256
    ),
    'label': IntField()
})
# Write dataset
writer.from_indexed_dataset(my_dataset)
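For scale, here is a rough sketch of the write rate that time implies. It assumes the ImageNet-1k training split (1,281,167 images), since the split is not stated above. The names `NUM_IMAGES`, `WRITE_HOURS` and `images_per_second` are made up for this estimate and are not part of FFCV.

```python
# Back-of-the-envelope write rate implied by the reported conversion time.
# Assumption: the ImageNet-1k training split (1,281,167 images); the split
# actually written is not stated in the post.
NUM_IMAGES = 1_281_167
WRITE_HOURS = 6  # reported lower bound ("> 6 hours")


def images_per_second(num_images: int, hours: float) -> float:
    """Average number of images written per second."""
    return num_images / (hours * 3600)


rate = images_per_second(NUM_IMAGES, WRITE_HOURS)
# The write took longer than 6 hours, so this is an upper bound on the rate.
print(f"at most ~{rate:.0f} images/s")  # prints "at most ~59 images/s"
```

Under that assumption the writer averages fewer than about 60 images per second.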
Is there a solution, or are there any tricks to speed this up?
Thank you!