Skip to content

Commit ff85180

Browse files
authored
NERDSS Converter (#178)
* basic read of PDB data as spheres, convert to simularium file format * add starting functionality for drawing bonds between molecules * code cleanup * adding more comments * lint fixes * Add jupyter notebook for NERDSS PDB data, plus a few test files * add DisplayData for bond fibers to example * clean up code for fiber agents * simplify test files * add test for nerdss converter * nerdss converter code cleanup * clean up nerdss tests * revert changes to unit_data * revert changes to unit_data * formatting fixes * incorporate PR feedback * lint fixes
1 parent c017633 commit ff85180

9 files changed

Lines changed: 963 additions & 0 deletions

File tree

examples/Tutorial_nerdss.ipynb

Lines changed: 180 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,180 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Simularium Conversion Tutorial : NERDSS PDB Data"
8+
]
9+
},
10+
{
11+
"cell_type": "code",
12+
"execution_count": 1,
13+
"metadata": {},
14+
"outputs": [],
15+
"source": [
16+
"from IPython.display import Image\n",
17+
"\n",
18+
"import numpy as np\n",
19+
"\n",
20+
"from simulariumio.nerdss import NerdssConverter, NerdssData\n",
21+
"from simulariumio import MetaData, DisplayData, DISPLAY_TYPE, CameraData, UnitData\n",
22+
"from simulariumio.filters import TranslateFilter\n",
23+
"from simulariumio.writers import JsonWriter\n",
24+
"\n"
25+
]
26+
},
27+
{
28+
"cell_type": "markdown",
29+
"metadata": {},
30+
"source": [
31+
"This notebook provides example python code for converting your own simulation trajectories into the format consumed by the Simularium Viewer. It creates a .simularium file which you can drag and drop onto the viewer like this:"
32+
]
33+
},
34+
{
35+
"cell_type": "markdown",
36+
"metadata": {},
37+
"source": [
38+
"![title](img/drag_drop.gif)"
39+
]
40+
},
41+
{
42+
"cell_type": "markdown",
43+
"metadata": {},
44+
"source": [
45+
"# _Note:_\n",
46+
"To install simulariumio with all depencies needed for Nerdss, use `pip install simulariumio[nerdss]`"
47+
]
48+
},
49+
{
50+
"cell_type": "markdown",
51+
"metadata": {},
52+
"source": [
53+
"***\n",
54+
"## Prepare your spatial data"
55+
]
56+
},
57+
{
58+
"cell_type": "markdown",
59+
"metadata": {},
60+
"source": [
61+
"The Simularium `NerdssConverter` consumes spatiotemporal data from NERDSS outputs using the MDAnalysis Python package. \n",
62+
"\n",
63+
"The converter requires a `NerdssData` object as a parameter ([see documentation](https://simularium.github.io/simulariumio/simulariumio.nerdss.html#simulariumio.nerdss.nerdss_data.NerdssData))\n",
64+
"\n",
65+
"If you'd like to specify radii or color for rendering an agent type, add a `DisplayData` object for that agent type, as shown below ([see documentation](https://simularium.github.io/simulariumio/simulariumio.data_objects.html#module-simulariumio.data_objects.display_data)).\n"
66+
]
67+
},
68+
{
69+
"cell_type": "code",
70+
"execution_count": 2,
71+
"metadata": {},
72+
"outputs": [],
73+
"source": [
74+
"box_size = 150.\n",
75+
"\n",
76+
"example_data = NerdssData(\n",
77+
" path_to_pdb_files=\"../simulariumio/tests/data/nerdss/virus_pdb\",\n",
78+
" meta_data=MetaData(\n",
79+
" box_size=np.array([box_size, box_size, box_size]),\n",
80+
" trajectory_title=\"Some parameter set\",\n",
81+
" camera_defaults=CameraData(position=np.array([0, 0, 200]))\n",
82+
" ),\n",
83+
" display_data={\n",
84+
" \"gag#COM\": DisplayData(\n",
85+
" name=\"GAG - Center of Mass\",\n",
86+
" display_type=DISPLAY_TYPE.SPHERE,\n",
87+
" color=\"#0000FF\",\n",
88+
" ),\n",
89+
" \"pol#COM\": DisplayData(\n",
90+
" name=\"POL - Center of Mass\",\n",
91+
" display_type=DISPLAY_TYPE.SPHERE,\n",
92+
" color=\"#FF00FF\",\n",
93+
" ),\n",
94+
" \"bonds\": DisplayData(\n",
95+
" name=\"Bond\",\n",
96+
" display_type=DISPLAY_TYPE.FIBER,\n",
97+
" ),\n",
98+
" },\n",
99+
" time_units=UnitData(\"us\", 0.2),\n",
100+
")"
101+
]
102+
},
103+
{
104+
"cell_type": "markdown",
105+
"metadata": {},
106+
"source": [
107+
"## Convert and save as .simularium file"
108+
]
109+
},
110+
{
111+
"cell_type": "markdown",
112+
"metadata": {},
113+
"source": [
114+
"Once your data is shaped like in the `example_data` object, you can use the converter to generate the file at the given path:\n",
115+
"\n",
116+
"(since this model's coordinates are all positive, use a `TranslateFilter` to center the data in the viewer.)"
117+
]
118+
},
119+
{
120+
"cell_type": "code",
121+
"execution_count": null,
122+
"metadata": {},
123+
"outputs": [],
124+
"source": [
125+
"converter = NerdssConverter(example_data)\n",
126+
"# this _filter is just roughly centering the data in the box\n",
127+
"_filter = TranslateFilter(default_translation=np.array([-80.0, -80.0, -80.0]))\n",
128+
"filtered_data = converter.filter_data([_filter])\n",
129+
"JsonWriter.save(filtered_data, \"example_virus\", False)"
130+
]
131+
},
132+
{
133+
"cell_type": "markdown",
134+
"metadata": {},
135+
"source": [
136+
"## Visualize in the Simularium viewer"
137+
]
138+
},
139+
{
140+
"cell_type": "markdown",
141+
"metadata": {},
142+
"source": [
143+
"In a supported web-browser (Firefox or Chrome), navigate to https://simularium.allencell.org/ and import your file into the view."
144+
]
145+
},
146+
{
147+
"cell_type": "code",
148+
"execution_count": null,
149+
"metadata": {},
150+
"outputs": [],
151+
"source": []
152+
}
153+
],
154+
"metadata": {
155+
"kernelspec": {
156+
"display_name": "Python 3",
157+
"language": "python",
158+
"name": "python3"
159+
},
160+
"language_info": {
161+
"codemirror_mode": {
162+
"name": "ipython",
163+
"version": 3
164+
},
165+
"file_extension": ".py",
166+
"mimetype": "text/x-python",
167+
"name": "python",
168+
"nbconvert_exporter": "python",
169+
"pygments_lexer": "ipython3",
170+
"version": "3.9.16"
171+
},
172+
"vscode": {
173+
"interpreter": {
174+
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
175+
}
176+
}
177+
},
178+
"nbformat": 4,
179+
"nbformat_minor": 4
180+
}

pyproject.toml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,9 @@ md = [
5454
cellpack = [
5555
"cellpack==1.0.3",
5656
]
57+
nerdss = [
58+
"MDAnalysis>=2.0.0",
59+
]
5760
tutorial = [
5861
"jupyter",
5962
"scipy>=1.5.2",

simulariumio/nerdss/__init__.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
#!/usr/bin/env python
2+
# -*- coding: utf-8 -*-
3+
4+
from .nerdss_converter import NerdssConverter # noqa: F401
5+
from .nerdss_data import NerdssData # noqa: F401
Lines changed: 173 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,173 @@
1+
from MDAnalysis import Universe
2+
import os
3+
4+
from ..trajectory_converter import TrajectoryConverter
5+
from ..data_objects import (
6+
AgentData,
7+
TrajectoryData,
8+
DimensionData,
9+
UnitData,
10+
DisplayData,
11+
)
12+
from ..constants import DISPLAY_TYPE, VALUES_PER_3D_POINT, VIZ_TYPE
13+
from ..exceptions import InputDataError
14+
from .nerdss_data import NerdssData
15+
16+
17+
class NerdssConverter(TrajectoryConverter):
18+
def __init__(
19+
self,
20+
input_data: NerdssData,
21+
):
22+
"""
23+
Parameters
24+
----------
25+
input_data : NerdssData
26+
An object containing info for reading
27+
NERDSS simulation trajectory outputs and plot data
28+
"""
29+
self._data = self._read(input_data)
30+
31+
def _read_pdb_files(self, input_data: NerdssData) -> AgentData:
32+
file_list = os.listdir(input_data.path_to_pdb_files)
33+
file_list.sort()
34+
n_timesteps = len(file_list)
35+
dimensions = DimensionData(
36+
total_steps=n_timesteps, max_agents=0, max_subpoints=VALUES_PER_3D_POINT * 2
37+
)
38+
time_steps = []
39+
for file in file_list:
40+
u = Universe(os.path.join(input_data.path_to_pdb_files, file))
41+
n_agents = len(u.atoms.positions)
42+
if n_agents * 2 > dimensions.max_agents:
43+
# each "atom" will be represented by 1 sphere agent and up
44+
# to 1 fiber agent, representing a bond
45+
dimensions.max_agents = n_agents * 2
46+
file_name = os.path.splitext(file)[0]
47+
if not file_name.isdigit():
48+
raise InputDataError(
49+
f"File name is expected to be the timestep, {file_name} is invalid"
50+
)
51+
time_steps.append(file_name)
52+
time_steps.sort(key=int)
53+
agent_data = AgentData.from_dimensions(dimensions)
54+
agent_data.n_timesteps = n_timesteps
55+
56+
# keep track of fiber positions for bonds as we go
57+
fiber_positions = [[] for i in range(n_timesteps)]
58+
59+
for time_index in range(n_timesteps):
60+
# we are assuming the time is the file name
61+
universe = Universe(
62+
os.path.join(
63+
input_data.path_to_pdb_files, time_steps[time_index] + ".pdb"
64+
)
65+
)
66+
atoms = universe.atoms
67+
temp_n_agents = len(atoms)
68+
agent_data.positions[time_index][0:temp_n_agents] = universe.atoms.positions
69+
70+
agent_data.times[time_index] = (
71+
float(time_steps[time_index]) * input_data.time_units.magnitude
72+
)
73+
74+
agent_data.n_agents[time_index] = len(atoms)
75+
if input_data.display_data.get("bonds") is None:
76+
input_data.display_data["bonds"] = DisplayData(
77+
name="bonds", display_type=DISPLAY_TYPE.FIBER, radius=0.5
78+
)
79+
80+
for residue in universe.residues:
81+
com_pos = None
82+
bond_site_pos = []
83+
resname = residue.resname
84+
85+
for atom in residue.atoms:
86+
name = atom.name
87+
atom_index = atom.index
88+
full_name = resname + "#" + name
89+
position = atom.position
90+
agent_data.types[time_index].append(
91+
TrajectoryConverter._get_display_type_name_from_raw(
92+
full_name, input_data.display_data
93+
)
94+
)
95+
agent_data.unique_ids[time_index][atom_index] = atom.id
96+
97+
# Get the user provided display data for this raw_type_name
98+
input_display_data = (
99+
TrajectoryConverter._get_display_data_for_agent(
100+
full_name, input_data.display_data
101+
)
102+
)
103+
104+
agent_data.radii[time_index][atom_index] = (
105+
input_display_data.radius
106+
if input_display_data and input_display_data.radius is not None
107+
else 1.0
108+
)
109+
110+
# Draw intra-molecular bonds as a fiber between COM (center of
111+
# mass) and binding sites each residue
112+
if name != "ref":
113+
if name == "COM":
114+
# Found the center of mass!
115+
com_pos = list(position)
116+
for bond_site in bond_site_pos:
117+
# Add a fiber to connect COM with bond sites
118+
bond_coords = com_pos + bond_site
119+
fiber_positions[time_index].append(bond_coords)
120+
elif com_pos:
121+
# Already have COM, so add fiber to connect it to this site
122+
bond_coords = com_pos + list(position)
123+
fiber_positions[time_index].append(bond_coords)
124+
else:
125+
# Haven't found COM, so keep track of this bond site to
126+
# connect to COM when we find it
127+
bond_site_pos.append(list(position))
128+
129+
# Add bond data into agent_data
130+
bonds_display_data = TrajectoryConverter._get_display_data_for_agent(
131+
"bonds", input_data.display_data
132+
)
133+
next_uid = agent_data.unique_ids.max() + 1
134+
for timestep in range(n_timesteps):
135+
n_fibers = len(fiber_positions[timestep])
136+
n_atoms = int(agent_data.n_agents[timestep])
137+
agent_data.n_agents[timestep] = n_fibers + n_atoms
138+
for agent_index in range(n_fibers):
139+
agent_data.subpoints[timestep][agent_index + n_atoms] = fiber_positions[
140+
timestep
141+
][agent_index]
142+
agent_data.n_subpoints[timestep][agent_index + n_atoms] = (
143+
VALUES_PER_3D_POINT * 2
144+
)
145+
agent_data.viz_types[timestep][agent_index + n_atoms] = VIZ_TYPE.FIBER
146+
agent_data.radii[timestep][
147+
agent_index + n_atoms
148+
] = bonds_display_data.radius
149+
agent_data.types[timestep].append(bonds_display_data.name)
150+
agent_data.unique_ids[timestep][agent_index + n_atoms] = next_uid
151+
next_uid += 1
152+
return agent_data
153+
154+
def _read(self, input_data: NerdssData) -> TrajectoryData:
155+
"""
156+
Return a TrajectoryData object containing the NERDSS data
157+
"""
158+
print("Reading PDB Data -------------")
159+
agent_data = self._read_pdb_files(input_data)
160+
# get display data (geometry and color)
161+
for tid in input_data.display_data:
162+
display_data = input_data.display_data[tid]
163+
agent_data.display_data[display_data.name] = display_data
164+
input_data.meta_data.scale_factor = 1.0
165+
input_data.meta_data._set_box_size()
166+
result = TrajectoryData(
167+
meta_data=input_data.meta_data,
168+
agent_data=agent_data,
169+
time_units=UnitData(name=input_data.time_units.name),
170+
spatial_units=input_data.spatial_units,
171+
plots=input_data.plots,
172+
)
173+
return result

0 commit comments

Comments
 (0)