
Inference entropix#11

Open
tomoqt wants to merge 31 commits into cleaned_tokenized from inference_entropix

Conversation

@tomoqt
Owner

@tomoqt tomoqt commented Mar 5, 2025

Adding:
- support for the Muon optimizer
- a simple Gradio app
- simple inference with beam search, nucleus sampling, and Entropix-like decoding
- GRPO fine-tuning of a saved model
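Of the sampling strategies above, nucleus (top-p) sampling is the easiest to show in isolation. A minimal sketch (not the repository's actual implementation; `nucleus_sample` and its signature are illustrative):

```python
import math
import random

def nucleus_sample(logits, top_p=0.9, temperature=1.0, rng=random):
    """Sample one token id, keeping only the smallest set of tokens
    whose cumulative probability reaches top_p (nucleus sampling)."""
    # Temperature-scaled softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk token ids in descending probability until mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the nucleus and sample.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

With a sharply peaked distribution and a small `top_p`, the nucleus collapses to the argmax token, so sampling becomes deterministic.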

- Introduced new functions `load_raw_spectrum_tokens` and `load_raw_ir` for direct spectrum file processing
- Added command-line arguments in `search_and_infer.py` to support raw spectrum inference
- Implemented alternative IR encoder (`Regular1DCNNEncoder`) alongside ConvNeXt
- Updated configuration and model to support configurable IR encoder type
- Added new dependencies: gradio, py3Dmol, gradio_molecule3d
- Enhanced spectral encoder flexibility with `ir_encoder_type` parameter
- Added random sample inference for Gradio app when in test mode
- Simplified SMILES visualization and error handling
- Integrated test dataset sampling for random molecule generation
- Updated model inference to support flexible decoding strategies
- Improved error handling and visualization for predicted molecules
- Implemented Entropix decoding strategy in `inference.py` with entropy-based decision making
- Added new decoding method `entropix_decode` with dynamic layer looping for uncertain predictions
- Updated `INFERENCE_README.md` with comprehensive documentation on Entropix strategy
- Enhanced `test_inference.py` to support Entropix evaluation and metrics comparison
- Added new command-line arguments for Entropix configuration (entropy/varentropy thresholds, max loops)
- Updated model and decoder to support flexible layer looping during inference
- Integrated Entropix into existing decoding strategies with minimal code changes
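The entropy/varentropy gate behind the Entropix bullets can be sketched as follows. This is an assumed shape, not the code in `inference.py`: the function names, the use of natural-log entropy, and the threshold defaults are all illustrative; only the idea (loop extra decoder layers when the next-token distribution is uncertain) comes from the PR.

```python
import math

def entropy_varentropy(probs):
    """Shannon entropy H and varentropy (variance of -log p under p)."""
    ps = [p for p in probs if p > 0]
    logps = [math.log(p) for p in ps]
    h = -sum(p * lp for p, lp in zip(ps, logps))
    v = sum(p * (-lp - h) ** 2 for p, lp in zip(ps, logps))
    return h, v

def should_loop(probs, entropy_threshold=2.0, varentropy_threshold=1.0):
    """Entropix-style decision: trigger extra layer loops only when the
    next-token distribution is uncertain (high entropy or varentropy)."""
    h, v = entropy_varentropy(probs)
    return h > entropy_threshold or v > varentropy_threshold
```

A near-uniform distribution (high entropy) triggers looping; a confidently peaked one does not, so the extra compute is spent only on hard positions.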
…t directory support

- Updated `test_inference.py` to canonicalize SMILES for exact match comparison
- Added `--output_dir` argument to save inference results in a specified directory
- Introduced `canonicalize_smiles()` helper function in `train_autoregressive.py`
- Modified `greedy_decode()` to support optional temperature-based sampling
- Updated `run.sh` to use torchrun with a test configuration
- Improved SMILES decoding and evaluation with canonical SMILES handling
- Increased batch size from 32 to 64 in real_config.yaml
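The "optional temperature-based sampling" added to `greedy_decode()` amounts to a per-step token choice like the sketch below (a hypothetical standalone helper, not the repository's function; the `temperature == 0` convention for pure greedy is an assumption):

```python
import math
import random

def greedy_or_sample(logits, temperature=0.0, rng=random):
    """Pick the next token id: pure argmax when temperature <= 0,
    otherwise sample from the temperature-scaled softmax."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    m = max(logits)
    exps = [math.exp((l - m) / temperature) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```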
- Implemented mixed precision training with configurable precision (fp32, fp16, bf16)
- Added precision configuration option in training config
- Integrated torch.cuda.amp.autocast for automatic mixed precision
- Added GradScaler for FP16 numerical stability
- Updated greedy decoding and validation to support mixed precision
- Enhanced training loop to handle different precision modes
- Added precision-aware logging and device compatibility checks
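The mixed-precision plumbing described above typically reduces to mapping the `precision` config string to an autocast context plus a `GradScaler`. A minimal sketch under that assumption (`amp_setup` is a hypothetical helper; only fp16 on CUDA actually needs loss scaling):

```python
import contextlib
import torch

def amp_setup(precision: str, device_type: str = "cpu"):
    """Map a precision string ("fp32" | "fp16" | "bf16") to an autocast
    context and a GradScaler for the training loop."""
    dtypes = {"fp16": torch.float16, "bf16": torch.bfloat16}
    if precision in dtypes:
        ctx = torch.autocast(device_type=device_type, dtype=dtypes[precision])
    else:
        ctx = contextlib.nullcontext()  # fp32: run without autocast
    # GradScaler is a no-op unless fp16 on CUDA, where it prevents
    # gradient underflow.
    scaler = torch.cuda.amp.GradScaler(
        enabled=(precision == "fp16" and device_type == "cuda"))
    return ctx, scaler
```

The forward pass and loss then run under `ctx`, and `scaler.scale(loss).backward()` / `scaler.step(optimizer)` / `scaler.update()` wrap the optimizer step; with the scaler disabled those calls degrade gracefully to the unscaled path.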
- Introduced a new decoding strategy 'GREEDY_LOOP' in inference.py, allowing for layer looping during greedy decoding.
- Updated ModelInference class to support the new strategy with a dedicated method for greedy decoding with loops.
- Modified test_inference.py to include 'greedy_loop' in the list of strategies and added testing logic for varying loop counts.
- Enhanced muon optimizer in muon.py with additional parameters for orthogonalization.
- Updated training scripts to accommodate new configuration options for GRPO and adjusted default values in YAML config files.
- Ensured backward compatibility by maintaining existing configurations while adding new features.
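The orthogonalization that the Muon changes parameterize is, at its core, a Newton-Schulz iteration on the gradient matrix. The sketch below uses the classic cubic variant for clarity; the actual `muon.py` (following the public Muon optimizer) uses a tuned quintic polynomial, and the extra parameters mentioned above presumably control the step count and coefficients.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=30):
    """Drive a matrix toward its nearest orthogonal factor with the
    cubic Newton-Schulz iteration X <- 1.5*X - 0.5*(X X^T) X."""
    # Normalize by the Frobenius norm so all singular values are <= 1,
    # which guarantees convergence of the iteration.
    X = G / (np.linalg.norm(G) + 1e-7)
    for _ in range(steps):
        A = X @ X.T
        X = 1.5 * X - 0.5 * A @ X
    return X
```

Each step pushes every singular value toward 1 while preserving the singular vectors, so the update direction keeps the gradient's "shape" but equalizes its scale across directions.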
- Increased embed_dim to 2048, num_heads to 16, and adjusted num_layers to 8 in real_config.yaml for improved model capacity.
- Changed training precision to bf16 in real_config.yaml for enhanced performance.
- Updated greedy_decode_frequency to 1000 in real_config.yaml to optimize decoding strategy.
- Modified training precision to fp16 in test_config.yaml for consistency with mixed precision training.
- Changed optimizer type to AdamW in test_config.yaml for better performance with vector parameters.
- Increased batch_size from 32 to 64 in local_config.yaml, real_config_mup.yaml, and real_config.yaml to enhance training efficiency.
- Reduced batch_size from 64 to 32 in real_config.yaml for optimized training.
- Changed precision from fp16 to fp32 and adjusted batch_size from 32 to 20 in test_config.yaml for improved consistency and performance.
- Increased learning_rate from 1.0e-4 to 1.0e-3 and updated min_learning_rate from 1.0e-5 to 1.0e-4 in test_config.yaml to enhance training dynamics.
- Adjusted learning rate calculation in muon.py for improved performance.
- Increased max_samples from 25 to 500 in test_looping.py to allow for more extensive testing.
- Updated batch_size from 16 to 32 in real_config.yaml for enhanced training efficiency.
- Modified loop_range and max_loops in test_config.yaml to support more iterations during testing.
- Changed precision to fp16 and updated batch_size to 32 in test_config.yaml for consistency with training settings.
- Updated optimizer type to muon_mix in test_config.yaml for better handling of matrix parameters.
- Reduced max_samples from 500 to 50 in test_looping.py for more controlled testing.
- Changed use_stablemax from True to False in real_config.yaml to modify model behavior.
- Updated loop_range and max_loops to [0, 0] and 0 respectively in test_config.yaml for limited iterations.
- Added default parameters (learning_rate, epsilon, beta, temperature, log_wandb) in test_config.yaml to enhance training configuration.
…ling

- Changed optimizer type from AdamW to muon_mix to enhance performance with matrix parameters.
- Increased weight_decay from 0.1 to 0.2 for improved regularization.
- Adjusted learning rate for muon optimizer from 1.0e-4 to 2.5e-4 to enhance training dynamics.
- Changed embed_dim from 1600 to 512 and num_heads from 16 to 4 for reduced model complexity.
- Updated training batch_size from 64 to 1024 and learning_rate from 1.0e-4 to 1.0e-3 for improved training dynamics.
- Modified optimizer type to muon_mix and adjusted weight_decay to 0.1 for better parameter handling.
- Added new parameters for GRPO including learning_rate, epsilon, beta, temperature, and logging options.
- Updated save_model_frequency and greedy_decode_frequency for enhanced training control.
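Pulling the GRPO-related bullets together, the new config block plausibly looks like the fragment below. This layout is a guess: only the parameter names come from this PR, the values are illustrative, and the comments reflect the standard meanings of these symbols in GRPO (clipping range, KL weight), not confirmed semantics of this codebase.

```yaml
# Hypothetical layout; parameter names from this PR, values illustrative.
grpo:
  learning_rate: 1.0e-5
  epsilon: 0.2        # clipping range for the policy ratio (assumed)
  beta: 0.04          # KL penalty weight vs. the reference model (assumed)
  temperature: 1.0    # sampling temperature for rollouts (assumed)
  log_wandb: true
```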
- Increased batch_size from 512 to 768 for improved training throughput.
- Adjusted weight_decay from 0.1 to 0.2 to enhance regularization.