RFD3 output cif contains duplicates in id column #148

@TristynAlxander

Description

I apologize that I don't know the setting to make rfd3 fully deterministic; however, I believe this should be fairly reproducible.

Run Command:

rfd3 design out_dir=/path/to/output inputs=/path/to/glob.json dump_trajectories=False prevalidate_inputs=True diffusion_batch_size=1 seed=42

with glob.json:

{
  "glob": {
    "length": "100-100",
    "is_non_loopy": true
  }
}

My output json:

{
    "diffused_index_map": {},
    "metrics": {
        "max_ca_deviation": 0.08025217056274414,
        "n_chainbreaks": 0,
        "n_clashing.interresidue_clashes_w_sidechain": 0,
        "n_clashing.interresidue_clashes_w_backbone": 0,
        "non_loop_fraction": 0.87,
        "loop_fraction": 0.13,
        "helix_fraction": 0.87,
        "sheet_fraction": 0.0,
        "num_ss_elements": 4,
        "radius_of_gyration": 13.636231638805786,
        "alanine_content": 0.14,
        "glycine_content": 0.01,
        "num_residues": 100,
        "diffused_com": [
            1.206663727760315,
            0.28238099813461304,
            -2.9324238300323486
        ]
    },
    "specification": {
        "length": "100-100",
        "extra": {
            "example": "glob",
            "task_name": "glob_glob",
            "sampled_contig": "100",
            "num_tokens_in": 100,
            "num_residues_in": 100,
            "num_chains": 1,
            "num_atoms": 500,
            "num_residues": 101,
            "example_id": "glob_glob_0"
        },
        "is_non_loopy": true
    },
    "ckpt_path": "/home/tristin/downloads/large_language_models/foundry/rfd3_latest.ckpt",
    "seed": 42
}

Now the output structure (after gunzip) seems to vary very slightly, but I'm pretty consistently seeing duplicate values of 19 in the id column:

...
ATOM C CB  . ASP A 0 4   . 4   ASP A CB  19  3.6175138405647456e-12 1.0 ? False -10.776768   -2.932267    4.327731     1
ATOM C CG  . ASP A 0 4   . 4   ASP A CG  19  3.6175138405647456e-12 1.0 ? False -11.970276   -2.7417085   5.2360744    1
ATOM O OD1 . ASP A 0 4   . 4   ASP A OD1 19  3.6175138405647456e-12 1.0 ? False -12.903551   -2.0365722   4.8578053    1
ATOM O OD2 . ASP A 0 4   . 4   ASP A OD2 19  3.6175138405647456e-12 1.0 ? False -11.906276   -3.3094902   6.332546     1
...
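
For anyone who wants to check their own outputs, here is a minimal sketch of a duplicate check. It assumes a simple whitespace-delimited `_atom_site` loop with no spaces inside quoted values, and rows that start with `ATOM` or `HETATM` as in the excerpt above. The function name `duplicate_atom_ids` is made up for illustration and is not part of rfd3.

```python
from collections import Counter


def duplicate_atom_ids(cif_text):
    """Return {id: count} for _atom_site.id values that occur more than once.

    Assumption: rows are plain whitespace-separated tokens, one atom per line.
    """
    columns = []  # _atom_site column names, in header order
    ids = []
    for line in cif_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("_atom_site."):
            # Header line of the atom_site loop: remember the column name.
            columns.append(tokens[0])
        elif (
            columns
            and len(tokens) == len(columns)
            and tokens[0] in ("ATOM", "HETATM")
        ):
            # Data row: pick out the value in the _atom_site.id column.
            ids.append(tokens[columns.index("_atom_site.id")])
    return {atom_id: n for atom_id, n in Counter(ids).items() if n > 1}
```

To run it on a real output, read the file as text first, for example with `gzip.open(path, "rt").read()`, where `path` stands for whichever `.cif.gz` file rfd3 wrote. An empty dict means no repeated ids were found.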

While PyMOL isn't bothered by this, cif2pdb doesn't seem to like it; I guess it's not proper CIF format. I can renumber easily enough, but I figured I should note the issue in case you think it's something rfd3 should be doing.
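
As a stopgap, the renumbering could look roughly like the sketch below. The mmCIF dictionary treats `_atom_site.id` as the key of the `atom_site` category, so downstream tools may reasonably expect it to be unique. This is only an illustration under stated assumptions: a whitespace-delimited `_atom_site` loop, no spaces inside quoted values, rows starting with `ATOM` or `HETATM`, and ids restarted from 1. `renumber_atom_ids` is a hypothetical helper, not an rfd3 function, and it does not update any other category that might refer to atom ids.

```python
import re


def renumber_atom_ids(cif_text):
    """Rewrite _atom_site.id as 1, 2, 3, ... and leave everything else as-is."""
    columns = []  # _atom_site column names, in header order
    out = []
    next_id = 1
    for line in cif_text.splitlines(keepends=True):
        # Token matches keep their character spans, so only the id token
        # is replaced and the rest of the line is preserved verbatim.
        tokens = list(re.finditer(r"\S+", line))
        if tokens and tokens[0].group().startswith("_atom_site."):
            columns.append(tokens[0].group())
        elif (
            columns
            and len(tokens) == len(columns)
            and tokens[0].group() in ("ATOM", "HETATM")
        ):
            id_token = tokens[columns.index("_atom_site.id")]
            line = line[: id_token.start()] + str(next_id) + line[id_token.end():]
            next_id += 1
        out.append(line)
    return "".join(out)
```

Column alignment may shift slightly when the new id has a different width than the old one, which should not matter for whitespace-delimited parsing.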

Labels: bug, good first issue
