NewCRM_Prediction.py fails to predict the simplest textbook Diels-Alder reaction #1

@yaroslavsobolev

In your preprint, you mention that:

"In addition, the inclusion of Diels–Alder cycloaddition reactions brings a desirable mechanistic diversity in the form of pericyclic reactions."

So your training set is described as containing Diels-Alder cycloadditions. However, NewCRM_Prediction.py fails to predict the classical Diels-Alder reaction between 1,3-butadiene and ethene: the result is an empty list. The expected product is cyclohexene (C1=CCCCC1). Here is the end of the NewCRM_Prediction.py script that I ran:

start_sequence = "C=CC=C.C=C"
results = beam_search_last_element2(start_sequence, model_predict, beam_width=2, max_steps=10, alpha=0.5)
print('RESULTS:')
print(results)

The output is:

RESULTS:
[]

It does not seem to be a problem with my installation because your much more complex example of start_sequence in your repository works correctly for me, as expected:

start_sequence = "Cc1c(Cl)cccc1.CN(c2ccc([P+]([C@]34C[C@H]5C[C@@H](C4)C[C@@H](C3)C5)([C@]67C[C@H]8C[C@@H](C7)C[C@@H](C6)C8)[Pd])cc2)C.CC(C)([O-])C.CCCCN.[Na+]"
results = beam_search_last_element2(start_sequence, model_predict, beam_width=2, max_steps=10, alpha=0.5)
print('RESULTS:')
print(results)

Is the failure to predict the simplest Diels-Alder reaction caused by a simple input-formatting bug that can be fixed, or is it a deeper problem with the entire pipeline?
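To help separate a formatting problem from a deeper one, a quick check would be to retry the same reactants in every component order. The sketch below is only a diagnostic idea: `component_orderings` and `probe` are hypothetical helpers that are not part of the repository, and `predict` stands for any one-argument wrapper around the existing search call. Whether the model is sensitive to component order is an assumption to be tested, not something I have confirmed.

```python
from itertools import permutations


def component_orderings(smiles: str) -> list[str]:
    """Return every distinct ordering of the dot-separated SMILES components."""
    parts = [p for p in smiles.split(".") if p]
    seen: list[str] = []
    for perm in permutations(parts):
        candidate = ".".join(perm)
        if candidate not in seen:  # skip duplicates caused by repeated components
            seen.append(candidate)
    return seen


def probe(smiles: str, predict) -> dict:
    """Call `predict` (hypothetical one-argument wrapper) on each ordering."""
    return {variant: predict(variant) for variant in component_orderings(smiles)}


# Assumed usage inside NewCRM_Prediction.py, reusing the call shown above:
# predict = lambda s: beam_search_last_element2(
#     s, model_predict, beam_width=2, max_steps=10, alpha=0.5)
# for variant, results in probe("C=CC=C.C=C", predict).items():
#     print(variant, results)
```

If every ordering still returns an empty list, the cause is probably not the order of the reactants in the input string.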

P.S.: In the current state of the repository, the unused function calculate_top_k_accuracies should be removed from the import statement on line 19 of NewCRM_Prediction.py; otherwise it causes an import error.
