Script parameters:

python scripts/generate_activations.py \
--model-name "allenai/OLMo-2-0425-1B-Instruct" \
--mlp-input-template "model.layers.{}.mlp.input" \
--mlp-output-template "model.layers.{}.mlp.output" \
--dataset-path "allenai/olmo-mix-1124" \
--context-size 4096 \
--inference-batch-size 16 \
--prepend-bos \
--target-total-tokens 1000000 \
--activation-dir $DATA_DIR/activations \
--compression "gzip" \
--chunk-token-threshold 100000 \
--activation-dtype "bfloat16" \
--compute-norm-stats

Output:
INFO:clt.activation_generation.generator:Layers=16 d_model=2048 dtype=torch.float32
WARNING:clt.activation_generation.generator:Storing bfloat16 as uint16 in HDF5. Ensure client handles conversion.
ERROR:clt.activation_generation.generator:Error writing layer 0 to HDF5 chunk 0: Can't broadcast (119397, 4096) -> (119397, 2048)
ERROR:clt.activation_generation.generator:Error writing layer 4 to HDF5 chunk 0: Can't broadcast (119397, 4096) -> (119397, 2048)
ERROR:clt.activation_generation.generator:Error writing layer 5 to HDF5 chunk 0: Can't broadcast (119397, 4096) -> (119397, 2048)
...
ERROR:clt.activation_generation.generator:Failed HDF5 write for layer 13 in chunk 0: Can't broadcast (119397, 4096) -> (119397, 2048)
ERROR:clt.activation_generation.generator:Failed HDF5 write for layer 4 in chunk 0: Can't broadcast (119397, 4096) -> (119397, 2048)
ERROR:clt.activation_generation.generator:Failed HDF5 write for layer 5 in chunk 0: Can't broadcast (119397, 4096) -> (119397, 2048)
...
I'm currently running into an error when calling scripts/generate_activations.py. The script generates the activations successfully, but it fails in the step that writes the chunks to disk. I think it's related to the conversion from float32 to bfloat16. I've included the script parameters and the relevant parts of the log above. Thanks so much.
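For what it's worth, the doubled last dimension in the error (4096 = 2 × 2048) looks like what you'd get from reinterpreting the raw float32 bytes as uint16 without casting to bfloat16 first. A minimal sketch of that hypothesis (hypothetical, not the actual generator code):

```python
import numpy as np
import torch

# A float32 activation chunk: 4 tokens x d_model=2048.
acts = torch.randn(4, 2048, dtype=torch.float32)

# Suspected bug: viewing raw float32 bytes (4 bytes/elem) as uint16
# (2 bytes/elem) doubles the last dimension, matching the
# "Can't broadcast (..., 4096) -> (..., 2048)" error.
wrong = acts.numpy().view(np.uint16)
print(wrong.shape)  # (4, 4096)

# Correct path: cast to bfloat16 first, then bitwise-reinterpret the
# 2-byte elements as uint16 for HDF5 storage; shape is preserved.
right = acts.to(torch.bfloat16).view(torch.int16).numpy().view(np.uint16)
print(right.shape)  # (4, 2048)
```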