Bug description
After the refactoring that moved the loss and metrics to model.compile(), the loss and evaluation metrics are worse for YouTubeDNN retrieval models. For example, on the LastFM dataset (using the retrieval experiments script), Recall@100-final dropped from 0.08148 to 0.01429, a relative drop of about 82%.
The results can be reproduced by running the retrieval experiments script on the LastFM dataset with the following arguments:
python scripts/retrieval_train_eval.py --dataset lastfmB --wandb_exp_group lastfmB_youtubednn_xe_sampledsofmax_v07.2 --model_type youtubednn --two_tower_activation selu --epochs 20 --lr_decay_steps 50 --output_path /results --data_path /data --log_to_wandb --optimizer adam --eval_batch_size 2048 --train_metrics_steps 100 --topk_metrics_cutoffs 100 --max_seq_length 20 --youtubednn_sampled_softmax True --fail_if_recall_at_100_higher_than 0.5 --xe_label_smoothing 0.0 --two_tower_mlp_layers None --two_tower_dropout 0.3 --two_tower_embedding_sizes_multiplier 5.0 --logits_temperature 0.8 --lr 0.02238982864512884 --lr_decay_rate 0.9400000000000001 --l2_reg 1.1472035643715902e-05 --embeddings_l2_reg 1.01774316277423e-06 --train_batch_size 4096 --item_id_emb_size 512 --youtubednn_sampled_softmax_n_candidates 500
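For anyone comparing numbers before and after the refactoring, it may help to recompute Recall@100 independently of the training loop. The snippet below is only a minimal sketch of the metric, assuming each example has exactly one positive item and that a dense score matrix over all candidate items is available. The function and argument names (`recall_at_k`, `scores`, `positive_ids`) are hypothetical and are not taken from the library or from the script above.

```python
import numpy as np


def recall_at_k(scores: np.ndarray, positive_ids: np.ndarray, k: int = 100) -> float:
    """Fraction of rows whose positive item is among the k highest-scored items.

    Assumptions (illustrative only, not the library's metric implementation):
      scores       -- shape (n_examples, n_items), higher means more relevant
      positive_ids -- shape (n_examples,), index of the single positive item per row
    """
    k = min(k, scores.shape[1])
    # Indices of the k largest scores per row; order inside the top k does not matter.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    # A row is a hit when its positive item index appears in its top-k set.
    hits = (top_k == positive_ids[:, None]).any(axis=1)
    return float(hits.mean())


# Tiny usage example with 2 examples and 3 candidate items.
example_scores = np.array([[0.1, 0.9, 0.3],
                           [0.8, 0.2, 0.5]])
example_positives = np.array([1, 1])
print(recall_at_k(example_scores, example_positives, k=1))
```

Running the same check on predictions exported from both the old and the new code path would show whether the gap comes from the model itself or from how the metric is computed after the move to model.compile().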