[BUG] Evaluation metrics of the YouTubeDNN model are worse after the refactoring of model.compile() #496

### Bug description
After the refactoring that moved the loss and metrics to `model.compile()`, the loss and evaluation metrics are worse for YouTubeDNN retrieval models. For example, on the LastFM dataset (using the [retrieval experiments script](https://github.com/NVIDIA-Merlin/research/blob/main/retrieval_exp/scripts/retrieval_train_eval.py)), the Recall@100-final dropped from `0.08148` to `0.01429`.

The results can be reproduced with the retrieval experiments script, using the following arguments on the LastFM dataset:

```bash
python scripts/retrieval_train_eval.py --dataset lastfmB --wandb_exp_group lastfmB_youtubednn_xe_sampledsofmax_v07.2 --model_type youtubednn --two_tower_activation selu --epochs 20 --lr_decay_steps 50 --output_path /results --data_path /data --log_to_wandb --optimizer adam --eval_batch_size 2048 --train_metrics_steps 100 --topk_metrics_cutoffs 100 --max_seq_length 20 --youtubednn_sampled_softmax True --fail_if_recall_at_100_higher_than 0.5 --xe_label_smoothing 0.0 --two_tower_mlp_layers None --two_tower_dropout 0.3 --two_tower_embedding_sizes_multiplier 5.0 --logits_temperature 0.8 --lr 0.02238982864512884 --lr_decay_rate 0.9400000000000001 --l2_reg 1.1472035643715902e-05 --embeddings_l2_reg 1.01774316277423e-06 --train_batch_size 4096 --item_id_emb_size 512 --youtubednn_sampled_softmax_n_candidates 500
```
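
For readers comparing the two numbers, here is a minimal sketch of how a Recall@K value of this kind can be computed. It assumes one positive (next) item per example, which is an assumption made for illustration. The function names `recall_at_k` and `mean_recall_at_k` are hypothetical and this is not the implementation used by the experiments script or by Merlin Models.

```python
def recall_at_k(scores, positive_item, k=100):
    """Return 1.0 if positive_item is among the k highest-scored items, else 0.0.

    scores: list of floats, one score per candidate item (index = item id).
    positive_item: index of the single relevant item (assumed: one positive per example).
    """
    # Rank item indices by score, highest first, and keep the top k.
    top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return 1.0 if positive_item in top_k else 0.0


def mean_recall_at_k(batch_scores, positive_items, k=100):
    """Average the per-example Recall@K over a batch."""
    hits = [recall_at_k(s, p, k) for s, p in zip(batch_scores, positive_items)]
    return sum(hits) / len(hits)


# Toy usage with 3 candidate items and k=2 (illustrative values only).
batch_scores = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.5]]
positive_items = [1, 1]
print(mean_recall_at_k(batch_scores, positive_items, k=2))
```

Under this reading, a drop from `0.08148` to `0.01429` means the positive item lands in the top 100 for far fewer evaluation examples after the refactoring.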

