
mini-imageNet #515

@HeYiyang2

Because the full ImageNet dataset is very large, I am using the MiniImageNet dataset instead. I modified the YAML config file accordingly:
datasets:
  _target_: flava.definitions.TrainingDatasetsInfo
  selected:
    - image
    - vl
    - text
  image:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: mini_train
        subset: default
        data_dir: >-
          /home/liumaofu/hyy/multimodal/examples/flava/mini/ok/train/
    val:
      - _target_: flava.definitions.HFDatasetInfo
        key: mini_val
        subset: default
        data_dir: >-
          /home/liumaofu/hyy/multimodal/examples/flava/mini/ok/val/
I also modified the examples/flava/data/utils.py file:
from typing import List

from datasets import concatenate_datasets, load_from_disk
from flava.definitions import HFDatasetInfo


def build_datasets_from_info(dataset_infos: List[HFDatasetInfo], split: str = "train"):
    # Note: `split` is currently unused; each entry is read from its own data_dir.
    dataset_list = []
    for dataset_info in dataset_infos:
        print(f"Loading dataset from {dataset_info.data_dir}")

        # load_from_disk only reads directories written by Dataset.save_to_disk.
        current_dataset = load_from_disk(dataset_info.data_dir)

        if dataset_info.remove_columns is not None:
            current_dataset = current_dataset.remove_columns(dataset_info.remove_columns)
        if dataset_info.rename_columns is not None:
            for rename in dataset_info.rename_columns:
                current_dataset = current_dataset.rename_column(rename[0], rename[1])

        dataset_list.append(current_dataset)

    return concatenate_datasets(dataset_list)

However, when I run the command:
python -m flava.train config=flava/configs/pretraining/debug.yaml
the following error is reported:
Directory /home/liumaofu/hyy/multimodal/examples/flava/mini/ok/train/ is neither a dataset directory nor a dataset dict directory.
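A possible way to narrow this down: as far as I know, load_from_disk only accepts a directory that was previously written by save_to_disk, and it recognizes such a directory by a few marker files. The sketch below checks for those markers using only the standard library. The file names are taken from the Hugging Face datasets source as I understand it and may differ between versions, and describe_dataset_dir is a hypothetical helper name, not part of any library.

```python
from pathlib import Path

# Marker files that `datasets.load_from_disk` is assumed to look for.
# These names are an assumption about the installed `datasets` version.
DATASET_FILES = ("dataset_info.json", "state.json")
DATASET_DICT_FILE = "dataset_dict.json"


def describe_dataset_dir(path):
    """Classify a directory the way load_from_disk is assumed to (hypothetical helper)."""
    root = Path(path)
    if not root.is_dir():
        return "missing"
    if all((root / name).is_file() for name in DATASET_FILES):
        return "dataset"        # written by Dataset.save_to_disk
    if (root / DATASET_DICT_FILE).is_file():
        return "dataset_dict"   # written by DatasetDict.save_to_disk
    return "unrecognized"       # e.g. a plain folder of class subdirectories
```

If this returns "unrecognized" for the train directory, the path itself is fine but the directory is a raw image folder rather than a saved dataset, which would match the error message.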
The structure of my MiniImageNet dataset is as follows:
miniImagenet
|-- train
| |-- class1
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- class2
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- ...
|-- val
| |-- class1
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- class2
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- ...
|-- test
| |-- class1
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- class2
| | |-- image1.jpg
| | |-- image2.jpg
| | |-- ...
| |-- ...
I have verified that the storage paths are correct. Why is this error reported, and what should I do to fix it?
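One option that may work with the class-per-folder layout above is to read each split with the "imagefolder" builder of the datasets library and save the result once with save_to_disk, so that load_from_disk can open it afterwards. This is a minimal sketch under assumptions, not a confirmed fix for FLAVA: count_images_per_class and convert_image_folder are hypothetical helper names, output_dir is a location you choose yourself, and the resulting columns (expected to be image and label) may still need renaming to match what the FLAVA data module expects.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for a class-per-folder split directory."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def convert_image_folder(split_dir, output_dir):
    """Load a class-per-folder directory and save it in the on-disk dataset format."""
    # Imported lazily so the sanity check above runs without `datasets` installed.
    from datasets import load_dataset

    # With no split subfolders inside split_dir, all images land in the "train" split.
    dataset = load_dataset("imagefolder", data_dir=str(split_dir), split="train")
    dataset.save_to_disk(str(output_dir))
    return dataset
```

Running count_images_per_class on the train and val directories first is a cheap check that the folders contain what is expected. After converting each split once, data_dir in the YAML would point at the chosen output_dir instead of the raw image folder.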
