9 changes: 7 additions & 2 deletions AGENTS.md
@@ -41,7 +41,8 @@ hl7/
field.go # Field and Repetition value types
segment.go # Segment type with MSH special-case handling
message.go # Message type, ParseMessage(), segment iterators
accessor.go # Terser-style location parser, Location type, Get()/GetBytes()
accessor.go # Terser-style location parser, Location type, Value type, Get()
charset.go # ValueDecoder type; DecodeString on Field/Repetition/Component/Subcomponent/Value
reader.go # io.Reader wrapper with MLLP and raw mode support
writer.go # io.Writer wrapper with MLLP and raw mode support
ack.go # ACK message generation (AckCode, AckOption, Message.Ack, WithErrors)
@@ -153,7 +154,11 @@ Schema types are defined in `schema.go`:

### Location Type

`Location` in `accessor.go` represents a specific position in an HL7 message hierarchy. `ParseLocation` parses terser-style strings (e.g., `"PID-3[1].4.2"`) into a `Location`. `Location.String()` implements `fmt.Stringer` and produces the inverse terser representation. Both are used by the accessor (`Get`/`GetBytes`), transform, and builder subsystems.
`Location` in `accessor.go` represents a specific position in an HL7 message hierarchy. `ParseLocation` parses terser-style strings (e.g., `"PID-3[1].4.2"`) into a `Location`. `Location.String()` implements `fmt.Stringer` and produces the inverse terser representation. Both are used by the accessor (`Get`), transform, and builder subsystems.
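
To make the terser grammar concrete, here is a minimal, self-contained sketch of parsing and re-rendering a location string. It is not the code in `accessor.go`: the `location` struct, its field names, `parseLocation`, and the regular expression are all assumptions for illustration. It follows the conventions visible elsewhere in this PR (0-based segment occurrence and repetition, and `0` meaning "component/subcomponent not specified").

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// location is a hypothetical stand-in for the library's Location type.
// Field names are assumptions chosen for this sketch.
type location struct {
	Segment      string
	SegmentIndex int // 0-based occurrence, from "(n)"
	Field        int
	Repetition   int // 0-based, from "[n]"
	Component    int // 0 means "not specified"
	SubComponent int // 0 means "not specified"
}

// terserRE matches SEG, optional (occurrence), -Field, optional [rep],
// then optional .Component and .SubComponent.
var terserRE = regexp.MustCompile(`^([A-Z][A-Z0-9]{2})(?:\((\d+)\))?-(\d+)(?:\[(\d+)\])?(?:\.(\d+))?(?:\.(\d+))?$`)

// parseLocation turns a terser-style string into a location.
func parseLocation(s string) (location, error) {
	m := terserRE.FindStringSubmatch(s)
	if m == nil {
		return location{}, fmt.Errorf("invalid location %q", s)
	}
	loc := location{Segment: m[1]}
	// Capture groups 2..6 map onto the numeric fields in order.
	targets := []*int{&loc.SegmentIndex, &loc.Field, &loc.Repetition, &loc.Component, &loc.SubComponent}
	for i, t := range targets {
		if m[i+2] == "" {
			continue // optional part omitted; keep the zero default
		}
		n, err := strconv.Atoi(m[i+2])
		if err != nil {
			return location{}, err
		}
		*t = n
	}
	return loc, nil
}

// String renders the location back to terser form, omitting parts
// that are at their zero default.
func (l location) String() string {
	s := l.Segment
	if l.SegmentIndex > 0 {
		s += fmt.Sprintf("(%d)", l.SegmentIndex)
	}
	s += fmt.Sprintf("-%d", l.Field)
	if l.Repetition > 0 {
		s += fmt.Sprintf("[%d]", l.Repetition)
	}
	if l.Component > 0 {
		s += fmt.Sprintf(".%d", l.Component)
		if l.SubComponent > 0 {
			s += fmt.Sprintf(".%d", l.SubComponent)
		}
	}
	return s
}

func main() {
	loc, err := parseLocation("PID-3[1].4.2")
	if err != nil {
		panic(err)
	}
	fmt.Println(loc.String())
}
```

Because zero defaults are omitted when rendering, this sketch normalizes `OBX(0)-5` to `OBX-5`; whether the real `Location.String()` does the same is not something this diff shows.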

`Value` in `accessor.go` is the return type of `Get()`. It is a lightweight value type (`raw []byte` + `delims Delimiters`) with `String()`, `Bytes()`, `IsEmpty()`, `IsNull()`, and `HasValue()`, the same method set as `Field`, `Repetition`, `Component`, and `Subcomponent`. A zero `Value` (nil raw bytes) is returned for invalid or not-found locations.

`ValueDecoder` in `charset.go` is a `func([]byte) ([]byte, error)` that converts post-unescape bytes to a target encoding (typically UTF-8). `DecodeString(ValueDecoder)` is defined on `Value`, `Field`, `Repetition`, `Component`, and `Subcomponent`. When the decoder is nil, `DecodeString` is equivalent to `String()` with no extra allocation.

## HL7v2 Specification Decisions

63 changes: 47 additions & 16 deletions README.md
@@ -7,8 +7,8 @@ Zero external dependencies. Requires Go 1.23+.
```go
msg, _ := hl7.ParseMessage(rawBytes)

fmt.Println(msg.Get("MSH-9.1")) // "ADT"
fmt.Println(msg.Get("PID-5.1")) // "Smith"
fmt.Println(msg.Get("MSH-9.1").String()) // "ADT"
fmt.Println(msg.Get("PID-5.1").String()) // "Smith"
```

## When to use this library
@@ -42,27 +42,30 @@ if err != nil {

### Terser-style access

The `Get` method accepts location strings in the format `SEG-Field.Component.SubComponent`:
The `Get` method accepts location strings in the format `SEG-Field.Component.SubComponent`
and returns a `Value` — a lightweight value type holding the raw bytes and delimiters:

```go
msg.Get("MSH-9") // "ADT^A01" (full field, unescaped)
msg.Get("MSH-9.1") // "ADT" (first component)
msg.Get("MSH-9.2") // "A01" (second component)
msg.Get("PID-5.1") // "Smith" (family name)
msg.Get("PID-3.1") // "12345" (patient ID)
msg.Get("PID-3.1.1") // "12345" (first subcomponent)
msg.Get("MSH-9").String() // "ADT^A01" (full field, unescaped)
msg.Get("MSH-9.1").String() // "ADT" (first component)
msg.Get("MSH-9.2").String() // "A01" (second component)
msg.Get("PID-5.1").String() // "Smith" (family name)
msg.Get("PID-3.1").String() // "12345" (patient ID)
msg.Get("PID-3.1.1").String() // "12345" (first subcomponent)
msg.Get("PID-3.1").Bytes() // raw bytes without unescaping
```

Segment occurrence and repetition indices are supported:

```go
msg.Get("OBX(0)-5") // first OBX, observation value
msg.Get("OBX(1)-5") // second OBX, observation value
msg.Get("PID-3[0].1") // first repetition of PID-3, component 1
msg.Get("PID-3[1].1") // second repetition of PID-3, component 1
msg.Get("OBX(0)-5").String() // first OBX, observation value
msg.Get("OBX(1)-5").String() // second OBX, observation value
msg.Get("PID-3[0].1").String() // first repetition of PID-3, component 1
msg.Get("PID-3[1].1").String() // second repetition of PID-3, component 1
```

Missing values return an empty string — no error checking needed for chained reads.
Missing values return a zero `Value` — `String()` returns `""` and `Bytes()` returns `nil`.
No error checking is needed for chained reads.

### Location parsing

@@ -102,6 +105,34 @@ f.IsNull() // true if field is the HL7 null value ""
f.HasValue() // true if neither empty nor null
```

`Value` (returned by `Get`) has the same `IsEmpty`, `IsNull`, and `HasValue` methods.
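
The three states matter because HL7 v2 treats an absent value and an explicit null (`""`) differently: absent conventionally means "no change", while null asks the receiver to clear the stored value. The sketch below shows one way a caller might branch on them. The `value` type is a reduced stand-in for `hl7.Value` (raw bytes only, predicate bodies mirroring the ones in this PR), and `describe` is a hypothetical helper, not part of the library.

```go
package main

import "fmt"

// value is a stand-in for hl7.Value, reduced to the raw bytes.
type value struct{ raw []byte }

// IsEmpty reports whether there are no bytes at all (absent value).
func (v value) IsEmpty() bool { return len(v.raw) == 0 }

// IsNull reports whether the raw bytes are exactly the two characters "".
func (v value) IsNull() bool {
	return len(v.raw) == 2 && v.raw[0] == '"' && v.raw[1] == '"'
}

// HasValue reports whether the value is neither empty nor null.
func (v value) HasValue() bool { return !v.IsEmpty() && !v.IsNull() }

// describe maps each state to the action an update handler might take.
func describe(v value) string {
	switch {
	case v.IsNull():
		return "clear"
	case v.IsEmpty():
		return "keep existing"
	default:
		return "set"
	}
}

func main() {
	for _, v := range []value{
		{},                    // absent
		{raw: []byte(`""`)},   // explicit HL7 null
		{raw: []byte("Smith")}, // ordinary content
	} {
		fmt.Printf("%q -> %s\n", v.raw, describe(v))
	}
}
```

With the real library, the same switch would apply directly to the result of `msg.Get(...)`, since `Value` exposes these three predicates.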

### Character set decoding

For messages that declare a non-UTF-8 encoding in MSH-18, use `DecodeString` with a
`ValueDecoder` to convert bytes after unescaping. `DecodeString` is available on `Value`,
`Field`, `Repetition`, `Component`, and `Subcomponent`:

```go
// A ValueDecoder is func([]byte) ([]byte, error) — wrap e.g. golang.org/x/text.
var decode hl7.ValueDecoder
switch msg.Get("MSH-18").String() {
case "8859/1":
decode = latin1ToUTF8 // caller-provided
}

// Terser-style: decode a specific field value.
name, err := msg.Get("PID-5.1").DecodeString(decode)

// Hierarchical: decode a component.
family, err := seg.Field(5).Rep(0).Component(1).DecodeString(decode)
```

When `decode` is `nil`, `DecodeString` is equivalent to `String()` with no extra allocation.
Unescape always runs before the decoder, so the decoder receives resolved bytes. The `\C..\`
and `\M..\` charset escape sequences are passed through verbatim; a sophisticated decoder
may interpret them, but a simple byte-level decoder will treat them as-is.

## Transforming

`Transform` applies changes to a message and returns a new `*Message`. The original is never modified.
@@ -333,7 +364,7 @@ schema.Segments["PID"] = &hl7.SegmentDef{
```go
schema.Checks = []hl7.MessageCheckFunc{
func(msg *hl7.Message) []hl7.Issue {
if msg.Get("MSH-9.1") == "ORU" && msg.Get("OBX-1") == "" {
if msg.Get("MSH-9.1").String() == "ORU" && msg.Get("OBX-1").String() == "" {
return []hl7.Issue{{
Severity: hl7.SeverityError, Location: "OBX",
Code: "BUSINESS_RULE",
@@ -468,7 +499,7 @@ os.WriteFile("schema.json", data, 0644)
reader := hl7.NewReader(conn, hl7.WithMode(hl7.ModeMLLP))

err := reader.EachMessage(func(msg *hl7.Message) error {
msgType := msg.Get("MSH-9.1")
msgType := msg.Get("MSH-9.1").String()
fmt.Println("received", msgType)
return nil
})
112 changes: 63 additions & 49 deletions accessor.go
@@ -196,40 +196,71 @@ func ParseLocation(s string) (Location, error) {
return loc, nil
}

// Get retrieves the unescaped string value at the given terser-style location.
// Value holds the raw bytes at a terser-style location, returned by Get.
// It is a lightweight value type (raw []byte + Delimiters), consistent with
// Field, Repetition, Component, and Subcomponent.
//
// Examples:
//
// msg.Get("MSH-9") // Message type field (full value)
// msg.Get("MSH-9.1") // Message code (e.g., "ADT")
// msg.Get("MSH-9.2") // Trigger event (e.g., "A01")
// msg.Get("PID-3.1") // Patient ID
// msg.Get("PID-5.1") // Family name
// msg.Get("OBX(0)-5") // First OBX segment, field 5
//
// Returns an empty string if the location is invalid or the value is not present.
func (m *Message) Get(location string) string {
loc, err := ParseLocation(location)
if err != nil {
return ""
}
result := m.getByLocation(loc)
return result.String()
// A zero Value (nil raw bytes) is returned when the location is invalid or
// the addressed element is not present in the message. A zero Value is empty:
// IsEmpty() returns true, String() returns "", and Bytes() returns nil.
type Value struct {
raw []byte
delims Delimiters
}

// String returns the unescaped string value. Returns an empty string for a
// zero (not-found) Value.
func (v Value) String() string {
return string(Unescape(v.raw, v.delims))
}

// Bytes returns the raw bytes without escape processing. Returns nil for a
// zero (not-found) Value.
func (v Value) Bytes() []byte {
return v.raw
}

// IsEmpty returns true if the value has no bytes (nil or zero-length raw),
// which includes a zero (not-found) Value.
func (v Value) IsEmpty() bool {
return len(v.raw) == 0
}

// IsNull returns true if the value is the HL7 explicit null, represented by
// two double-quote characters ("").
func (v Value) IsNull() bool {
return len(v.raw) == 2 && v.raw[0] == '"' && v.raw[1] == '"'
}

// GetBytes retrieves the raw bytes at the given terser-style location.
// Returns nil if the location is invalid or the value is not present.
func (m *Message) GetBytes(location string) []byte {
// HasValue returns true if the value is neither empty nor null.
func (v Value) HasValue() bool {
return !v.IsEmpty() && !v.IsNull()
}

// Get retrieves the value at the given terser-style location.
//
// Returns a zero Value if the location string is invalid or the addressed
// element is not present — consistent with how Field(n), Rep(n), Component(n),
// and SubComponent(n) return zero values for out-of-range indices.
//
// Examples:
//
// msg.Get("MSH-9").String() // Message type field (full value, unescaped)
// msg.Get("MSH-9.1").String() // Message code (e.g., "ADT")
// msg.Get("MSH-9.2").String() // Trigger event (e.g., "A01")
// msg.Get("PID-3.1").String() // Patient ID
// msg.Get("PID-5.1").String() // Family name
// msg.Get("OBX(0)-5").String() // First OBX segment, field 5
// msg.Get("PID-3.1").Bytes() // raw bytes without unescaping
func (m *Message) Get(location string) Value {
loc, err := ParseLocation(location)
if err != nil {
return nil
return Value{}
}
result := m.getByLocation(loc)
return result.Bytes()
return m.getByLocation(loc)
}

// getByLocation navigates the message hierarchy to the specified location.
func (m *Message) getByLocation(loc Location) componentOrField {
func (m *Message) getByLocation(loc Location) Value {
// Find the matching segment.
matchIdx := 0
var seg *Segment
@@ -243,51 +274,34 @@ func (m *Message) getByLocation(loc Location) componentOrField {
}
}
if seg == nil {
return componentOrField{}
return Value{}
}

field := seg.Field(loc.Field)
if field.IsEmpty() {
return componentOrField{}
return Value{}
}

rep := field.Rep(loc.Repetition)
if rep.IsEmpty() {
return componentOrField{}
return Value{}
}

// If no component specified, return the repetition.
if loc.Component == 0 {
return componentOrField{fieldBytes: rep.raw, delims: rep.delims}
return rep.Value
}

comp := rep.Component(loc.Component)
if comp.IsEmpty() && loc.SubComponent == 0 {
return componentOrField{}
return Value{}
}

// If no subcomponent specified, return the component.
if loc.SubComponent == 0 {
return componentOrField{fieldBytes: comp.raw, delims: comp.delims}
return comp.Value
}

sub := comp.SubComponent(loc.SubComponent)
return componentOrField{fieldBytes: sub.raw, delims: sub.delims}
}

// componentOrField is a helper to unify the return type from getByLocation.
type componentOrField struct {
fieldBytes []byte
delims Delimiters
}

func (c componentOrField) String() string {
if len(c.fieldBytes) == 0 {
return ""
}
return string(Unescape(c.fieldBytes, c.delims))
}

func (c componentOrField) Bytes() []byte {
return c.fieldBytes
return sub.Value
}