Your network has two outputs: semantic predictions and embeddings. You then obtain instances by clustering the embeddings. However, this does not seem to give a confidence score for each instance. So how do you build the precision-recall (PR) curve and calculate the AP metric?