Fixed length strings #18

@synapticarbors

Using Julia 0.3.2 on OS X 10.9.5

I'm moving a discussion that started on the julia-users list to this issue, since it has shifted from general strategies for getting something to work in Julia to possible issues with StrPack. The original thread is at:

https://groups.google.com/forum/?fromgroups=#!topic/julia-users/Q_-xJOvF_vY

The general issue is that I have some raw byte data produced outside of Julia that I want to read into Julia. The program that generates the data does not pad the records, so a standard immutable type is not sufficient: its padded layout does not match the data. StrPack seems to work when I use only numeric types, but when I add a string field to my struct, the first "row" of data is read without an error but with the string truncated, and the read then fails on the second row with the following message:

ERROR: invalid ASCII sequence
 in convert at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in unpack at /Users/lev/.julia/v0.3/StrPack/src/StrPack.jl:162
 in anonymous at no file:20
 in include at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in include_from_node1 at loading.jl:128
 in process_options at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in _start at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib (repeats 2 times)
while loading /Users/lev/Projects/julia_sandbox/packed_struct.jl, in expression starting on line 19

My test code is in the following gist:

https://gist.github.com/synapticarbors/87105316f599f05c3c04

python makes_data.py

will generate the test file. It basically writes a numpy recarray to a file as raw bytes, and the idea is that I want to read the same file into Julia. Numpy recarrays are represented in memory as an array of packed structs. The script also prints the item size of each struct member as well as the values (column-wise per struct member rather than row-wise per record). The test data contains 5 records:

Data:
a: [ 0.02319369  0.36053342  0.72232703 -1.00344146 -0.23721758]
b: [ 1 -1  2  0 -1]
c: [ 0.30077654 -0.67934191 -0.67194921 -1.14133692  1.32868588]
d: [-1  0  0  0  0]
e: ['abcd' 'abcd' 'abcd' 'abcd' 'abcd']
Itemsizes:
a: 8
b: 4
c: 4
d: 2
e: 4

If you can't run the Python script yourself, the file corresponding to the data displayed above is available at the following link:

https://www.dropbox.com/s/tdyv60otzdq6spx/test1.dat
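
For reference, here is a minimal sketch of how packed records with the item sizes listed above could be decoded using only Python's standard `struct` module. It assumes little-endian byte order and the field types float64, int32, float32, int16 and a 4-byte string (matching the item sizes and the Julia type in this issue). The name `read_records` and the sample values are hypothetical; this is not the gist's script and it does not read test1.dat.

```python
import struct

# Assumed layout: little-endian ("<"), which also disables padding.
# d = float64 (8), i = int32 (4), f = float32 (4), h = int16 (2), 4s = 4 bytes
# Total: 8 + 4 + 4 + 2 + 4 = 22 bytes per record.
RECORD = struct.Struct("<difh4s")


def read_records(raw):
    """Decode a bytes object holding back-to-back packed records."""
    if len(raw) % RECORD.size != 0:
        raise ValueError("buffer length is not a multiple of the record size")
    return [RECORD.unpack_from(raw, offset)
            for offset in range(0, len(raw), RECORD.size)]


# Hypothetical sample values, chosen to be exactly representable as float32.
sample = (RECORD.pack(0.5, 1, 0.25, -1, b"abcd")
          + RECORD.pack(-1.5, -1, 2.0, 0, b"abcd"))
records = read_records(sample)
```

Under these assumptions a file with 5 records would be 5 * 22 = 110 bytes long, and the second record would start at byte 22 rather than at byte 24.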

My StrPack-enhanced Julia immutable composite type is:

@struct immutable TestType
    a::Float64
    b::Int32
    c::Float32
    d::Int16
    e::ASCIIString(4)
end

The result of show_struct_layout(TestType) is:

0x0000 [Float64-----------------------------------------------------------------------]
0x0008 [Int32---------------------------------][Float32-------------------------------]
0x0010 [Int16-------------][ASCIIStr][ASCIIStr][ASCIIStr][ASCIIStr][PadByte-][PadByte-]

If I specify some of the option fields:

show_struct_layout(TestType, {"a"=>8, "b"=>4,"c"=>4, "d"=>2, "e"=>4}, align_packed, 8, 10), I get:

0x0000 [Float64-----------------------------------------------------------------------]
0x0008 [Int32---------------------------------][Float32-------------------------------]
0x0010 [Int16-------------][ASCIIStr]

You can see that instead of just removing the padding, it also stripped out 3 character positions. Changing ASCIIString(4) to ASCIIString(8) or some other value still only displays a single ASCIIStr byte. Looks like a bug maybe?
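
To make the expected sizes concrete, the sketch below computes field offsets for a padded and a packed layout from the item sizes given earlier. The alignment rule used here (each field aligned to its element size, string bytes aligned to 1, total size rounded up to the largest alignment) is an assumption modelled on typical C struct layout, not StrPack's actual algorithm, and `layout` is a hypothetical helper.

```python
# (name, size in bytes, assumed alignment in bytes)
FIELDS = [
    ("a", 8, 8),  # Float64
    ("b", 4, 4),  # Int32
    ("c", 4, 4),  # Float32
    ("d", 2, 2),  # Int16
    ("e", 4, 1),  # 4 one-byte characters
]


def round_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def layout(fields, packed):
    """Return ({field name: offset}, total size) for a packed or padded layout."""
    offsets, pos, max_align = {}, 0, 1
    for name, size, align in fields:
        if not packed:
            pos = round_up(pos, align)
            max_align = max(max_align, align)
        offsets[name] = pos
        pos += size
    if not packed:
        pos = round_up(pos, max_align)  # trailing padding
    return offsets, pos


padded_offsets, padded_size = layout(FIELDS, packed=False)
packed_offsets, packed_size = layout(FIELDS, packed=True)
```

Under these assumptions the padded layout is 24 bytes, which agrees with the first diagram (two trailing PadByte entries), and the packed layout should be 22 bytes with the string occupying offsets 18 through 21. The second diagram shows only 8 + 4 + 4 + 2 + 1 = 19 bytes, which is 3 bytes short of that.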
