Skip to content
13 changes: 13 additions & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,19 @@ Change Log
`GSD <https://github.com/glotzerlab/gsd>`_ releases follow `semantic versioning
<https://semver.org/>`_.

5.x
---

5.0.0 (Not yet released)
------------------------

*Added:*

* ``gsd.hoomd`` schema now accepts float64 in validate, ``gsd.hoomd.open`` will take precision arguments
(`#495 <https://github.com/glotzerlab/gsd/pull/495>`__).
* ``gsd.fl.write_chunk`` will take precision arguments, rounding any floats to 32 or 64 precision with ``double`` or ``single``
(`#495 <https://github.com/glotzerlab/gsd/pull/495>`__).

4.x
---

Expand Down
1 change: 1 addition & 0 deletions doc/credits.rst
Original file line number Diff line number Diff line change
Expand Up @@ -20,3 +20,4 @@ The following people contributed to GSD.
* Charlotte Shiqi Zhao, University of Michigan
* Tim Moore, University of Michigan
* Joseph Burkhart, University of Michigan
* Nicholas Craven, Vanderbilt University
56 changes: 47 additions & 9 deletions gsd/hoomd.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I leave it up to you whether you want to lazily cast or cast on validate.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's preferred to lazily cast. An important use case would be to: write out frames in a trajectory in 32 bit precision, then write out a final restart Frame in 64 bit precision. However, we lose the precision if we cast on validate when we write out the trajectory as we go. See test_write_multiple_precision as an example of writing the same Frame with different precisions.

At commit df776ed, we now expectedly cause these tests to fail, since the data objects are modified to represent the correct float_type.

FAILED gsd/test/test_hoomd.py::test_fallback[(r,w)] - AssertionError:
FAILED gsd/test/test_hoomd.py::test_fallback[(a,x)] - AssertionError:
FAILED gsd/test/test_hoomd.py::test_fallback[(r,a)] - AssertionError:
FAILED gsd/test/test_hoomd.py::test_fallback_to_frame0[(r,w)] - AssertionError:
FAILED gsd/test/test_hoomd.py::test_fallback_to_frame0[(a,x)] - AssertionError:
FAILED gsd/test/test_hoomd.py::test_fallback_to_frame0[(r,a)] - AssertionError:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I see that your current implementation casts both on validate and on write. Is that intentional? To lazily cast, you could remove the ascontiguousarray and reshape calls from validate. You would also need to reinstate the reshape calls in append.

This change would defer any validation error messages to the time that append is called, but at the same time would allow users to keep their unmodified raw input data in a Frame.

Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,15 @@ def __init__(self):
self.image = None
self.type_shapes = None

def validate_precision(self, data):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect callers to use validate_precision? If not, rename to _validate_precision to make it private.

"""Maintain floats in numpy arrays."""
outFloat = numpy.float32
if isinstance(data, list):
outFloat = numpy.float64
elif data.dtype == numpy.float64:
outFloat = numpy.float64
return outFloat

def validate(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modify ConfigurationData.valid and ConstraintData.valid similarly.

"""Validate all attributes.

Expand All @@ -207,31 +216,39 @@ def validate(self):
replaced with contiguous numpy arrays of the appropriate type.
"""
if self.position is not None:
self.position = numpy.ascontiguousarray(self.position, dtype=numpy.float32)
self.position = numpy.ascontiguousarray(
self.position, dtype=self.validate_precision(self.position)
)
self.position = self.position.reshape([self.N, 3])
if self.orientation is not None:
self.orientation = numpy.ascontiguousarray(
self.orientation, dtype=numpy.float32
self.orientation, dtype=self.validate_precision(self.orientation)
)
self.orientation = self.orientation.reshape([self.N, 4])
if self.typeid is not None:
self.typeid = numpy.ascontiguousarray(self.typeid, dtype=numpy.uint32)
self.typeid = self.typeid.reshape([self.N])
if self.mass is not None:
self.mass = numpy.ascontiguousarray(self.mass, dtype=numpy.float32)
self.mass = numpy.ascontiguousarray(
self.mass, dtype=self.validate_precision(self.mass)
)
self.mass = self.mass.reshape([self.N])
if self.charge is not None:
self.charge = numpy.ascontiguousarray(self.charge, dtype=numpy.float32)
self.charge = numpy.ascontiguousarray(
self.charge, dtype=self.validate_precision(self.charge)
)
self.charge = self.charge.reshape([self.N])
if self.diameter is not None:
self.diameter = numpy.ascontiguousarray(self.diameter, dtype=numpy.float32)
self.diameter = numpy.ascontiguousarray(
self.diameter, dtype=self.validate_precision(self.diameter)
)
self.diameter = self.diameter.reshape([self.N])
if self.body is not None:
self.body = numpy.ascontiguousarray(self.body, dtype=numpy.int32)
self.body = self.body.reshape([self.N])
if self.moment_inertia is not None:
self.moment_inertia = numpy.ascontiguousarray(
self.moment_inertia, dtype=numpy.float32
self.moment_inertia, dtype=self.validate_precision(self.moment_inertia)
)
self.moment_inertia = self.moment_inertia.reshape([self.N, 3])
if self.velocity is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use self.validate_precision in velocity and angmom.

Expand Down Expand Up @@ -674,17 +691,19 @@ class HOOMDTrajectory:

Args:
file (`gsd.fl.GSDFile`): File to access.
precision (str): Precision to use for floats on write.

Open hoomd GSD files with `open`.
"""

def __init__(self, file):
def __init__(self, file, precision='single'):
if file.mode == 'ab':
msg = 'Append mode not yet supported'
raise ValueError(msg)

self._file = file
self._initial_frame = None
self._precision = precision
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below, if version < (2, 0) and version >= (1, 0): should be if version < (3, 0) and version >= (1, 0):


# Used to cache positive results when chunks exist in frame 0.
self._chunk_exists_frame_0 = {}
Expand Down Expand Up @@ -712,6 +731,11 @@ def file(self):
""":class:`gsd.fl.GSDFile`: The file handle."""
return self._file

@property
def precision(self):
""":class:str: The object write precision."""
return self._precision

def __len__(self):
"""The number of frames in the trajectory."""
return self.file.nframes
Expand All @@ -737,6 +761,8 @@ def append(self, frame):
if self._initial_frame is None and len(self) > 0:
self._read_frame(0)

float_type = numpy.float64 if self.precision == 'double' else numpy.float32

for path in [
'configuration',
'particles',
Expand Down Expand Up @@ -764,6 +790,17 @@ def append(self, frame):
wid = max(len(w) for w in data) + 1
b = numpy.array(data, dtype=numpy.dtype((bytes, wid)))
data = b.view(dtype=numpy.int8).reshape(len(b), wid)
if name in [
'position',
'orientation',
'velocity',
'angmom',
'charge',
'box',
'value',
Comment on lines +793 to +800
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add missing float fields.

Suggested change
if name in [
'position',
'orientation',
'velocity',
'angmom',
'charge',
'box',
'value',
if name in [
'diameter',
'mass',
'moment_inertia',
'position',
'orientation',
'velocity',
'angmom',
'charge',
'box',
'value',

]:
# convert using float specified precision
data = numpy.ascontiguousarray(data, dtype=float_type)

self.file.write_chunk(path + '/' + name, data)

Expand Down Expand Up @@ -1062,14 +1099,15 @@ def flush(self):
self._file.flush()


def open(name, mode='r'): # noqa: A001 - allow shadowing builtin open
def open(name, mode='r', precision='single'): # noqa: A001 - allow shadowing builtin open
"""Open a hoomd schema GSD file.

The return value of `open` can be used as a context manager.

Args:
name (str): File name to open.
mode (str): File open mode.
precision (str): Float precision to expect in File. Can be 'single' or 'double'.

Returns:
`HOOMDTrajectory` instance that accesses the file **name** with the
Expand Down Expand Up @@ -1114,7 +1152,7 @@ def open(name, mode='r'): # noqa: A001 - allow shadowing builtin open
schema_version=[1, 4],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
schema_version=[1, 4],
schema_version=[2, 0],

)

return HOOMDTrajectory(gsdfileobj)
return HOOMDTrajectory(gsdfileobj, precision=precision)


def read_log(name: str, scalar_only=False, glob_pattern='*'):
Expand Down
133 changes: 130 additions & 3 deletions gsd/test/test_hoomd.py
Original file line number Diff line number Diff line change
Expand Up @@ -355,7 +355,7 @@ def test_fallback(tmp_path, open_mode):
frame2.pairs.N = 7

with gsd.hoomd.open(
name=tmp_path / 'test_fallback.gsd', mode=open_mode.write
name=tmp_path / 'test_fallback.gsd', mode=open_mode.write, precision='double'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should fail. It requests a double precision file, then asserts that the values are float32. Add parameters to the test so that it checks single with float32 and double with float64. Then adjust the implementation of defaults so that the test passes both cases.

) as hf:
hf.extend([frame0, frame1, frame2])

Expand Down Expand Up @@ -516,7 +516,7 @@ def test_fallback_to_frame0(tmp_path, open_mode):
frame1.pairs.N = None

with gsd.hoomd.open(
name=tmp_path / 'test_fallback2.gsd', mode=open_mode.write
name=tmp_path / 'test_fallback2.gsd', mode=open_mode.write, precision='double'
) as hf:
hf.extend([frame0, frame1])

Expand Down Expand Up @@ -987,7 +987,9 @@ def test_read_log_warning(tmp_path):
def test_initial_frame_copy(tmp_path, open_mode):
"""Ensure that the user does not unintentionally modify _initial_frame."""
with gsd.hoomd.open(
name=tmp_path / 'test_initial_frame_copy.gsd', mode=open_mode.write
name=tmp_path / 'test_initial_frame_copy.gsd',
mode=open_mode.write,
precision='double',
) as hf:
frame = make_nondefault_frame()

Expand Down Expand Up @@ -1075,3 +1077,128 @@ def test_initial_frame_copy(tmp_path, open_mode):
for key in frame_1.log.keys():
assert frame_1.log[key] is initial.log[key]
assert not frame_1.log[key].flags.writeable


def test_float64(tmp_path, open_mode):
"""Test that charges can be optionally set to float32 or float64."""
frame = gsd.hoomd.Frame()
frame.particles.N = 4
charges32 = numpy.array([-0.1, 0.1 / 3, 0.1 / 3, 0.1 / 3], dtype=numpy.float32)
charges64 = numpy.array([-0.1, 0.1 / 3, 0.1 / 3, 0.1 / 3], dtype=numpy.float64)
frame.particles.charge = charges64
numpy.testing.assert_array_equal(frame.particles.charge, charges64)

with gsd.hoomd.open(
name=tmp_path / 'test_defaults.gsd', mode=open_mode.write, precision='double'
) as hf:
hf.append(frame)

with gsd.hoomd.open(name=tmp_path / 'test_defaults.gsd', mode=open_mode.read) as hf:
s = hf[0]

numpy.testing.assert_array_equal(s.particles.charge, charges64)
numpy.testing.assert_raises(
AssertionError,
numpy.testing.assert_array_equal,
s.particles.charge,
charges32,
)


def test_precision(tmp_path, open_mode):
"""Test that single and double precision files can be stored."""
frame = gsd.hoomd.Frame()
frame.particles.N = 4
charges32 = numpy.array([-0.1, 0.1 / 3, 0.1 / 3, 0.1 / 3], dtype=numpy.float32)
charges64 = numpy.array([-0.1, 0.1 / 3, 0.1 / 3, 0.1 / 3], dtype=numpy.float64)
frame.particles.charge = charges64
numpy.testing.assert_array_equal(frame.particles.charge, charges64)

with gsd.hoomd.open(
name=tmp_path / 'double.gsd', mode=open_mode.write, precision='double'
) as hf:
hf.append(frame)

with gsd.hoomd.open(
name=tmp_path / 'single.gsd', mode=open_mode.write, precision='single'
) as hf:
hf.append(frame)

with gsd.hoomd.open(name=tmp_path / 'double.gsd', mode=open_mode.read) as hf:
s = hf[0]

numpy.testing.assert_array_equal(s.particles.charge, charges64)

with gsd.hoomd.open(name=tmp_path / 'single.gsd', mode=open_mode.read) as hf:
s = hf[0]
numpy.testing.assert_array_equal(s.particles.charge, charges32)


def test_all_precision(tmp_path, open_mode):
"""Test both single and double precision on default Frame."""
frame0 = make_nondefault_frame()
with gsd.hoomd.open(
name=tmp_path / 'double.gsd', mode=open_mode.write, precision='double'
) as hf:
hf.append(frame0)

with gsd.hoomd.open(
name=tmp_path / 'single.gsd', mode=open_mode.write, precision='single'
) as hf:
hf.append(frame0)

with gsd.hoomd.open(
name=tmp_path / 'double.gsd', mode=open_mode.read, precision='double'
) as hf:
s = hf[0]
for name in ['position', 'orientation', 'velocity', 'angmom', 'charge']:
numpy.testing.assert_array_equal(
getattr(frame0.particles, name), getattr(s.particles, name)
)
assert getattr(s.particles, name).dtype == numpy.float64

with gsd.hoomd.open(name=tmp_path / 'single.gsd', mode=open_mode.read) as hf:
s = hf[0]
for name in ['position', 'orientation', 'velocity', 'angmom', 'charge']:
numpy.testing.assert_array_equal(
numpy.ascontiguousarray(
getattr(frame0.particles, name), dtype=numpy.float32
),
getattr(s.particles, name),
)
assert getattr(s.particles, name).dtype == numpy.float32


def test_write_multiple_precision(tmp_path):
"""Test single, then double precision writing on the particles position."""
frame = gsd.hoomd.Frame()
frame.particles.N = 1
frame.particles.position = [[0.1, 0.2, 0.3]] # is a list of list of floats
with gsd.hoomd.open( # converts from float64 to float32
name=tmp_path / 'single.gsd', mode='w', precision='single'
) as hf:
hf.append(frame)
with gsd.hoomd.open( # converts from python float to dtype float64
name=tmp_path / 'double.gsd', mode='w', precision='double'
) as hf:
hf.append(frame)

# frame is now fixed single precision, due to the conversion in hf.append(Frame)
with gsd.hoomd.open(
name=tmp_path / 'single.gsd', mode='r', precision='single'
) as hf:
s = hf[0] # read as a single precision
numpy.testing.assert_array_equal(
numpy.ascontiguousarray(frame.particles.position, dtype=numpy.float32),
s.particles.position,
)
assert s.particles.position.dtype == numpy.float32
with gsd.hoomd.open(
name=tmp_path / 'double.gsd', mode='r', precision='double'
) as hf:
s = hf[0] # read as double precision
numpy.testing.assert_array_equal(
frame.particles.position,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation, validate mutates the values inside frame, so this check will always succeed because the single precision file is written first (therefore the double precision file writes values that have already been rounded to single precision).

This test should validate that the positions are [[0.1, 0.2, 0.3]]. Lazy casting will be needed for both assert checks to pass.

s.particles.position,
)
assert s.particles.position.dtype == numpy.float64
Loading