How to use a Union type array? #151

wuerflts · 2024-05-11T16:43:59Z

wuerflts
May 11, 2024

Say I have a union like "UnsignedInteger: [uint8, uint16, uint32, uint64]" and want to use it as the element type of an array, "data: UnsignedInteger[y,x]". How can I create a numpy array with the correct dtype?
I tried calling "get_dtype(UnsignedInteger.uint8)" and specifying the values manually, but could not make it work.

Answered by johnstairs

johnstairs · 2024-05-13T15:10:50Z

johnstairs
May 13, 2024
Collaborator

To get the dtype, pass in the generated union type, e.g. sandbox.get_dtype(sandbox.UnsignedInteger). For arrays of unions, the dtype will always be "object" (np.object_). BTW, this kind of array is not going to be efficient, because each array element is a heap-allocated Python object. If you can, you would be better off normalizing to int64 or using a union of arrays of different types:

UnsignedIntegerArray: !union
  uint8: uint8[y,x]
  uint16: uint16[y,x]
  uint32: uint32[y,x]
  uint64: uint64[y,x]
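
To make the efficiency point concrete, here is a minimal numpy-only sketch comparing an object array with a fixed-width array. It does not use any yardl-generated code; plain Python ints stand in for union values, which is an assumption made purely for illustration.

```python
import numpy as np

values = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]

# Object array: each slot holds a reference to a separate Python object.
# Plain ints stand in here for generated union values (an assumption).
as_objects = np.array(values, dtype=np.object_)

# Fixed-width array: elements are packed contiguously, 2 bytes each.
as_uint16 = np.array(values, dtype=np.uint16)

print(as_objects.dtype, type(as_objects[0, 0]))
print(as_uint16.dtype, as_uint16.itemsize, as_uint16.nbytes)
```

The object array only stores references, so the element data lives elsewhere on the heap, while the uint16 array holds all nine values in 18 contiguous bytes.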

0 replies

wuerflts · 2024-05-17T09:04:58Z

wuerflts
May 17, 2024
Author

OK I followed your suggestion:

UnsignedIntegerArray: !union
  uint8: uint8[y,x]
  uint16: uint16[y,x]
  uint32: uint32[y,x]
  uint64: uint64[y,x]

Sample: !record
  fields:
    timestamp: datetime
    data: UnsignedIntegerArray
    projectionmatrix: float64[y:3,x:4]?

But I struggle to use it:

def generate_projections():
    myDtype = get_dtype(UnsignedIntegerArray.Uint16)
    yield Sample(timestamp=DateTime.now(), data=np.array([[0,1,0],[0,1,0],[0,1,0]], dtype=myDtype))

yields:

File "d:\src\ComputedTomographyRawData\python\test.py", line 5, in generate_projections
    myDtype = get_dtype(UnsignedIntegerArray.Uint16)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\src\ComputedTomographyRawData\python\ctrd\_dtypes.py", line 89, in <lambda>
    return lambda t: get_dtype_impl(dtype_map, t)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\src\ComputedTomographyRawData\python\ctrd\_dtypes.py", line 78, in get_dtype_impl
    raise RuntimeError(f"Cannot find dtype for {t}")
RuntimeError: Cannot find dtype for <class 'abc.UnsignedIntegerArray.Uint16'>

Can you provide more examples of how to use the get_dtype function?

1 reply

johnstairs May 17, 2024
Collaborator

Oh, good catch, the get_dtype function does not work with union cases! Opened #152

wuerflts · 2024-05-17T12:01:12Z

wuerflts
May 17, 2024
Author

I realized what I wanted to do is better achieved by using generics:

ProjectionUint8: Projection<uint8>
ProjectionUint16: Projection<uint16>
ProjectionUint32: Projection<uint32>
ProjectionUint64: Projection<uint64>

StreamItem: [ProjectionUint8, ProjectionUint16, ProjectionUint32, ProjectionUint64]

MyProtocol: !protocol
  sequence:
    header: Header
    projections: !stream
      items: StreamItem

TrajectoryType: !enum
  values:
    - circle
    - line
    - helix
    - projectionmatrices
    - compound

# Header defines the trajectory and general information about the measurement
Header: !record
  fields:
    id: string
    trajectory: TrajectoryType

ProjectionData<T>: !array
  items: T
  dimensions:
    y:
    x:

Projection<T>: !record
  fields:
    timestamp: datetime
    data: ProjectionData<T>
    projectionmatrix: float64[y:3,x:4]?

That way I can use it like this:

import numpy as np

import ctrd  # the Python package generated from the model above

def generate_projections():
    myDtype = np.uint8
    data = np.array([[0,1,0],[0,1,0],[0,1,0]], dtype=myDtype)
    projection = ctrd.Projection[myDtype](timestamp=ctrd.DateTime.now(), data=data)
    for _ in range(5):
        # Wrap each record in the union case that matches its element type
        yield ctrd.StreamItem.ProjectionUint8(projection)

path = "ctrd.ndjson"

# Write the header first, then the stream of projections
with ctrd.BinaryMyProtocolWriter(path) as w:
    w.write_header(ctrd.Header(id="uid", trajectory=ctrd.TrajectoryType.PROJECTIONMATRICES))
    w.write_projections(generate_projections())

# Read everything back in the same order
with ctrd.BinaryMyProtocolReader(path) as r:
    print(r.read_header())
    for sample in r.read_projections():
        print(sample)
        # dtype of the array carried by the union case
        print(sample.value.data.dtype)

Maybe the yardl examples could include an explicit example of how to deal with raw data samples of different bit depths. I believe this is a fairly common use case, because different ADCs might send samples with different bit depths, and it seems wasteful to go all the way to 64 bits from 12 bits. However, that also raises the question of how one could handle 12-bit ADCs explicitly without wasting space in transmission.
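
As a rough illustration of the bit-depth question, the sketch below picks the narrowest unsigned numpy dtype for a given ADC bit depth. The helper names dtype_for_bit_depth and union_case_name are hypothetical and not part of yardl or the generated code; the second one only mirrors the ProjectionUint8 ... ProjectionUint64 alias names from the model above.

```python
import numpy as np

def dtype_for_bit_depth(bits: int) -> np.dtype:
    """Return the narrowest of uint8/16/32/64 that can hold `bits`-bit samples."""
    if bits < 1:
        raise ValueError("bit depth must be positive")
    for width in (8, 16, 32, 64):
        if bits <= width:
            return np.dtype(f"uint{width}")
    raise ValueError(f"no unsigned dtype for {bits}-bit samples")

def union_case_name(dtype: np.dtype) -> str:
    # Hypothetical helper: mirrors the alias names used in the model above.
    return f"ProjectionUint{dtype.itemsize * 8}"

# A 12-bit ADC does not fit in uint8, so uint16 is the narrowest choice.
dtype = dtype_for_bit_depth(12)
samples = np.array([[0, 4095], [2048, 1]], dtype=dtype)
print(dtype, union_case_name(dtype), samples.nbytes)
```

With this approach a 12-bit sample occupies 16 bits in memory, so 4 bits per sample go unused there.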

2 replies

hansenms May 17, 2024
Collaborator

I may be misreading this, but since Yardl uses variable-length integer encoding when serializing, there is not much transmission benefit in the approach above. You might as well just use the uint64 approach. If there are only 12 bits of information, a value will take up only 2 bytes on the wire. In memory it would be 64 bits, though.
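
The 2-byte figure can be checked with a generic variable-length encoder. This is a minimal sketch that assumes a LEB128-style scheme (7 payload bits per byte plus a continuation bit); the thread does not spell out Yardl's exact wire format, and encode_uvarint is an illustrative name, not a Yardl function.

```python
def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative int with 7 payload bits per byte (LEB128-style)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)

max_12_bit = (1 << 12) - 1
print(len(encode_uvarint(max_12_bit)))   # 2
print(len(encode_uvarint(2**64 - 1)))    # 10
```

Under this assumption the encoded size depends on the magnitude of each value rather than on the declared integer width, which is why a uint64 field holding 12-bit samples still costs 2 bytes per sample.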

wuerflts May 17, 2024
Author

Point taken - was not aware of that!
