@AdamIsrael
Owner

Summary

This PR adds a comprehensive relationship and search API to gedcom-rs, providing powerful tools for genealogical analysis and queries.

Key Features

Search Functions

  • find_individual_by_xref() - Fast XRef lookups (~3.3 ns)
  • find_individuals_by_name() - Name-based search with partial matching
  • find_family_by_xref() - Family record lookups
  • find_individuals_by_event_date() - Search by event type and date pattern

Basic Relationship Functions

  • get_parents() - Find all parents (handles multiple families)
  • get_children() - Find all children
  • get_spouses() - Find all spouses
  • get_siblings() - Find all siblings (full and half)
  • get_full_siblings() - Find only full siblings (same parents)
  • get_half_siblings() - Find only half-siblings (one shared parent)
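
For illustration, the full/half distinction can be sketched on a toy model. This is not the gedcom-rs implementation: the `Parents` map, `SiblingKind`, and `classify_siblings` are hypothetical names, and the rule that exactly one shared parent means half-sibling is an assumption taken from the descriptions above.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical toy model: maps a person's id to the set of their parents' ids.
type Parents = HashMap<&'static str, BTreeSet<&'static str>>;

#[derive(Debug, PartialEq)]
enum SiblingKind {
    Full,
    Half,
    NotSiblings,
}

/// Classify two people by how many known parents they share.
fn classify_siblings(parents: &Parents, a: &str, b: &str) -> SiblingKind {
    // A person is not their own sibling.
    if a == b {
        return SiblingKind::NotSiblings;
    }
    // Anyone without recorded parents cannot be matched.
    let (Some(pa), Some(pb)) = (parents.get(a), parents.get(b)) else {
        return SiblingKind::NotSiblings;
    };
    match pa.intersection(pb).count() {
        0 => SiblingKind::NotSiblings,
        1 => SiblingKind::Half, // assumption: one shared parent means half-sibling
        _ => SiblingKind::Full,
    }
}
```

Because the comparison works on parent sets rather than on family records, two children who appear in different family records but share one parent are still classified as half-siblings.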

Advanced Relationship Functions

  • get_ancestors() - Traverse family tree upward with generation limit
  • get_descendants() - Traverse family tree downward with generation limit
  • find_relationship_path() - Find connection path between individuals
  • find_relationship() - Determine genealogical relationship with MRCA
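
A minimal sketch of upward traversal with a generation limit, assuming a plain child-to-parents map. The `ancestors_within` function and its map argument are hypothetical and do not reflect the actual `get_ancestors()` signature or the gedcom-rs data model.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collect ancestors of `start`, paired with their generation distance,
/// climbing at most `max_generations` levels.
/// `parents` is a hypothetical child -> parents map.
fn ancestors_within(
    parents: &HashMap<String, Vec<String>>,
    start: &str,
    max_generations: usize,
) -> Vec<(String, usize)> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    let mut found = Vec::new();

    seen.insert(start.to_string());
    queue.push_back((start.to_string(), 0));

    while let Some((person, depth)) = queue.pop_front() {
        if depth == max_generations {
            continue; // generation limit reached; do not climb further
        }
        for parent in parents.get(&person).into_iter().flatten() {
            // `insert` returns false for people already visited, which
            // guards against duplicates when lines of descent converge.
            if seen.insert(parent.clone()) {
                found.push((parent.clone(), depth + 1));
                queue.push_back((parent.clone(), depth + 1));
            }
        }
    }
    found
}
```

The same loop run over a parent-to-children map would give the downward (descendant) traversal.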

New Types

RelationshipResult

Comprehensive relationship information including:

  • Human-readable description (e.g., "1st Cousin 2x Removed")
  • Most Recent Common Ancestor(s) (MRCA)
  • Generational distances
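
As a rough sketch of how generational distances map onto a description such as "1st Cousin 2x Removed": under the usual genealogical convention, the cousin degree comes from the shorter distance to the MRCA and the removal from the difference between the two distances. The `Cousinship` struct and `cousinship` function are hypothetical names for illustration, not part of this PR's API.

```rust
/// Degree and removal of a cousin relationship, derived from each person's
/// generational distance to the most recent common ancestor (MRCA).
#[derive(Debug, PartialEq)]
struct Cousinship {
    degree: usize,  // 1 = first cousins, 2 = second cousins, ...
    removed: usize, // number of generations separating the two people
}

/// Returns `None` when either person is fewer than two generations below
/// the MRCA; those cases are direct-line, sibling, or aunt/uncle
/// relationships rather than cousins.
fn cousinship(generations1: usize, generations2: usize) -> Option<Cousinship> {
    if generations1 < 2 || generations2 < 2 {
        return None;
    }
    Some(Cousinship {
        degree: generations1.min(generations2) - 1,
        removed: generations1.abs_diff(generations2),
    })
}
```

For example, distances of 2 and 4 give degree 1 and removal 2, which corresponds to "1st Cousin 2x Removed".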

EventType Enum

Supports 20+ event types: Birth, Death, Christening, Baptism, BarMitzvah, BasMitzvah, Blessing, Burial, Census, Confirmation, FirstCommunion, Cremation, Adoption, Emigration, Graduation, Immigration, Naturalization, Probate, Retirement, Will, ChristeningAdult

Technical Highlights

  • Half-sibling detection: Properly identifies half-siblings even when they're in different family records
  • Relationship descriptions: Concise format (e.g., "3rd Cousin 2x Removed" instead of "Third Cousin Twice Removed")
  • Performance optimized: HashMap-based XRef lookups, BFS for relationship paths
  • Comprehensive benchmarks: All functions benchmarked using Criterion
  • Well documented: Extensive doc comments with examples
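
The BFS approach to relationship paths can be sketched as follows. This is an illustrative version under stated assumptions, not the code in this PR: `shortest_path` and the `links` adjacency map (parent, child, and spouse edges stored in both directions) are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Breadth-first search for a shortest connection between two people.
/// `links` is a hypothetical undirected adjacency map keyed by person id.
fn shortest_path(
    links: &HashMap<String, Vec<String>>,
    from: &str,
    to: &str,
) -> Option<Vec<String>> {
    // Maps each visited person to the person they were reached from.
    let mut came_from: HashMap<String, Option<String>> = HashMap::new();
    let mut queue = VecDeque::new();
    came_from.insert(from.to_string(), None);
    queue.push_back(from.to_string());

    while let Some(current) = queue.pop_front() {
        if current == to {
            // Walk the predecessor chain back to the start, then reverse it.
            let mut path = vec![current.clone()];
            let mut cursor = &current;
            while let Some(Some(prev)) = came_from.get(cursor) {
                path.push(prev.clone());
                cursor = prev;
            }
            path.reverse();
            return Some(path);
        }
        for next in links.get(&current).into_iter().flatten() {
            if !came_from.contains_key(next) {
                came_from.insert(next.clone(), Some(current.clone()));
                queue.push_back(next.clone());
            }
        }
    }
    None
}
```

Because BFS explores people in order of distance, the first time the target is dequeued the recovered path has the fewest possible links.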

Examples Added

  1. search_individuals.rs - Demonstrates search functions
  2. basic_relationships.rs - Shows parent/child/spouse/sibling queries
  3. advanced_relationships.rs - Demonstrates ancestor/descendant traversal
  4. find_relationship.rs - Shows relationship detection between individuals
  5. search_by_date.rs - Event date search with CLI interface

Testing

  • All 271 tests passing (200 + 8 + 25 + 21 + 17)
  • All 5 examples tested and working
  • Benchmarks compile and run successfully
  • Code formatted with cargo fmt

Performance

  • XRef lookups: ~3.3 ns
  • Name searches: ~795 ns
  • All relationship functions optimized for production use

- Add search functions: find_individual_by_xref, find_individuals_by_name, find_family_by_xref, find_individuals_by_event_date
- Add basic relationship functions: get_parents, get_children, get_spouses, get_siblings, get_full_siblings, get_half_siblings
- Add advanced relationship functions: get_ancestors, get_descendants, find_relationship_path, find_relationship
- Add RelationshipResult type with MRCA tracking and human-readable descriptions
- Add EventType enum supporting 20+ life events (Birth, Death, Christening, etc.)
- Implement relationship detection: Parent/Child, Sibling/Half-Sibling, Cousins (1st, 2nd, etc.), Removed cousins (1x, 2x, etc.)
- Fix sibling detection to properly identify half-siblings across different family records
- Add comprehensive benchmarks for all new functions
- Add 5 example programs demonstrating the API
- All 271 tests passing
@AdamIsrael AdamIsrael requested a review from Copilot December 7, 2025 00:06
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a comprehensive relationship and search API to gedcom-rs, providing powerful tools for genealogical analysis and queries. The implementation includes search functions for finding individuals and families, basic relationship queries (parents, children, spouses, siblings), and advanced relationship analysis (ancestors, descendants, relationship paths, and genealogical relationship descriptions).

Key changes:

  • Introduces RelationshipResult and EventType types for representing relationships and event categories
  • Adds 14 new methods to the Gedcom struct covering search and relationship functionality
  • Includes extensive documentation, examples, and comprehensive benchmarks

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

File                                Description
src/types/mod.rs                    Core implementation of relationship and search APIs with new types and 14 methods on Gedcom
examples/search_individuals.rs      Demonstrates search functions (by xref, name, and family)
examples/search_by_date.rs          Shows event-based search with CLI interface
examples/find_relationship.rs       Demonstrates relationship determination between individuals
examples/basic_relationships.rs     Shows parent/child/spouse/sibling queries
examples/advanced_relationships.rs  Demonstrates ancestor/descendant traversal and path finding
benches/parse_gedcom.rs             Adds comprehensive benchmarks for all new functions

src/types/mod.rs Outdated
1 => "Parent".to_string(),
2 => "Grandparent".to_string(),
3 => "Great-Grandparent".to_string(),
n => format!("{}Great-Grandparent", "Great-".repeat(n - 3)),

Copilot AI Dec 7, 2025

This pattern creates invalid strings for n=4 (should be '2x Great-Grandparent' but generates 'Great-Great-Grandparent'). The multiplier-based pattern (e.g., '2x Great-Grandparent') is inconsistent with the literal pattern used for n=3. Consider using a consistent approach such as format!("{}x Great-Grandparent", n - 2) for n >= 4.

Copilot uses AI. Check for mistakes.
src/types/mod.rs Outdated
1 => "Child".to_string(),
2 => "Grandchild".to_string(),
3 => "Great-Grandchild".to_string(),
n => format!("{}Great-Grandchild", "Great-".repeat(n - 3)),

Copilot AI Dec 7, 2025

Same issue as with Great-Grandparent: for n=4, this generates 'Great-Great-Grandchild' instead of a multiplier-based format. Consider using format!("{}x Great-Grandchild", n - 2) for n >= 4 to maintain consistency.

Suggested change
- n => format!("{}Great-Grandchild", "Great-".repeat(n - 3)),
+ n => format!("{}x Great-Grandchild", n - 2),

src/types/mod.rs Outdated
Comment on lines 1311 to 1312
let greats = "Great-".repeat(generations2 - 3);
return format!("{}Grand-Niece/Grand-Nephew", greats);

Copilot AI Dec 7, 2025

For generations2=4, this produces 'Great-Grand-Niece/Grand-Nephew', but should likely use a multiplier format like '2x Grand-Niece/Grand-Nephew' for consistency with cousin relationship descriptions. Consider using a multiplier-based format for generations2 >= 4.

Suggested change
- let greats = "Great-".repeat(generations2 - 3);
- return format!("{}Grand-Niece/Grand-Nephew", greats);
+ if generations2 == 3 {
+     return "Grand-Niece/Grand-Nephew".to_string();
+ } else {
+     return format!("{}x Grand-Niece/Grand-Nephew", generations2 - 2);
+ }

src/types/mod.rs Outdated
Comment on lines 1315 to 1316
let greats = "Great-".repeat(generations1 - 3);
return format!("{}Grand-Aunt/Grand-Uncle", greats);

Copilot AI Dec 7, 2025

For generations1=4, this produces 'Great-Grand-Aunt/Grand-Uncle', but should likely use a multiplier format like '2x Grand-Aunt/Grand-Uncle' for consistency. Consider using a multiplier-based format for generations1 >= 4.

Suggested change
- let greats = "Great-".repeat(generations1 - 3);
- return format!("{}Grand-Aunt/Grand-Uncle", greats);
+ if generations1 == 3 {
+     return "Grand-Aunt/Grand-Uncle".to_string();
+ } else {
+     return format!("{}x Grand-Aunt/Grand-Uncle", generations1 - 2);
+ }

src/types/mod.rs Outdated
Comment on lines 1335 to 1337
n if n % 10 == 1 && n % 100 != 11 => format!("{}st", n),
n if n % 10 == 2 && n % 100 != 12 => format!("{}nd", n),
n if n % 10 == 3 && n % 100 != 13 => format!("{}rd", n),

Copilot AI Dec 7, 2025

[nitpick] The ordinal number formatting logic is duplicated in the relationship description. Consider extracting this into a helper function like fn format_ordinal(n: usize) -> String to improve code maintainability.
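
One possible shape for such a helper, using the name and signature proposed in the comment above. The body is a sketch that mirrors the `n % 10` / `n % 100` guards in the reviewed lines; it is not the code that was eventually merged.

```rust
/// Format a non-negative integer with its English ordinal suffix.
fn format_ordinal(n: usize) -> String {
    let suffix = match (n % 10, n % 100) {
        (_, 11..=13) => "th", // 11th, 12th, 13th are irregular
        (1, _) => "st",
        (2, _) => "nd",
        (3, _) => "rd",
        _ => "th",
    };
    format!("{n}{suffix}")
}
```

Checking the 11..=13 case first keeps the three regular arms free of the `!= 11`, `!= 12`, `!= 13` guards.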

- Collapse nested if statements in find_relationship
- Replace unwrap() calls with proper error handling in find_relationship_via_mrca
- All clippy warnings resolved with -D warnings
- All 271 tests still passing
The base case (Great-Grandparent, Grand-Niece/Nephew) represents the implied
'1st' occurrence, so numbering should start at 2nd:

- 4 generations: '2nd Great-Grandparent' (was '1st Great-Grandparent')
- 5 generations: '3rd Great-Grandparent' (was '2nd Great-Grandparent')
- Similar corrections for Great-Grandchild, Grand-Aunt/Uncle, Grand-Niece/Nephew

This aligns with the convention that 'Great-Grandparent' implicitly means
'1st Great-Grandparent', just as 'Grandparent' means '1st generation up from parent'.
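
The numbering described in this commit message can be sketched as below. `ancestor_label` and `ordinal` are hypothetical names, and the "Self" label for zero generations is an assumption added so the match is total; only the labels from 1 generation upward come from the text above.

```rust
/// English ordinal suffix helper (1st, 2nd, 3rd, 4th, 11th, ...).
fn ordinal(n: usize) -> String {
    let suffix = match (n % 10, n % 100) {
        (_, 11..=13) => "th",
        (1, _) => "st",
        (2, _) => "nd",
        (3, _) => "rd",
        _ => "th",
    };
    format!("{n}{suffix}")
}

/// Label a direct ancestor by the number of generations above the subject.
fn ancestor_label(generations: usize) -> String {
    match generations {
        0 => "Self".to_string(), // assumption: not covered by the commit message
        1 => "Parent".to_string(),
        2 => "Grandparent".to_string(),
        3 => "Great-Grandparent".to_string(), // the implied "1st"
        // Numbering starts at 2nd for four generations, hence `n - 2`.
        n => format!("{} Great-Grandparent", ordinal(n - 2)),
    }
}
```

With this offset, four generations yields "2nd Great-Grandparent" and five yields "3rd Great-Grandparent", matching the corrected values listed above.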
Use the same Great-Grand pattern as Grandparents for consistency:

Aunt/Uncle progression:
- Aunt/Uncle → Grand-Aunt/Grand-Uncle → Great-Grand-Aunt/Grand-Uncle →
  2nd Great-Grand-Aunt/Grand-Uncle

Niece/Nephew progression:
- Niece/Nephew → Grand-Niece/Grand-Nephew → Great-Grand-Niece/Grand-Nephew →
  2nd Great-Grand-Niece/Grand-Nephew

This matches the Grandparent pattern:
- Grandparent → Great-Grandparent → 2nd Great-Grandparent
@AdamIsrael AdamIsrael requested a review from Copilot December 7, 2025 00:30
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


src/types/mod.rs Outdated
Comment on lines 1112 to 1113
return RelationshipResult {
description: "Son".to_string(),

Copilot AI Dec 7, 2025

The relationship description 'Son' is returned regardless of the actual gender of person1. This should check person1.gender and return 'Daughter' for female individuals.

Suggested change
- return RelationshipResult {
-     description: "Son".to_string(),
+ let description = match &person1.gender {
+     Some(g) if g == "M" => "Son".to_string(),
+     Some(g) if g == "F" => "Daughter".to_string(),
+     _ => "Child".to_string(),
+ };
+ return RelationshipResult {
+     description,

src/types/mod.rs Outdated
Comment on lines 1120 to 1121
return RelationshipResult {
description: "Daughter".to_string(),

Copilot AI Dec 7, 2025

The relationship description 'Daughter' is returned regardless of the actual gender of person1. This should check person1.gender and return 'Son' for male individuals.

Suggested change
- return RelationshipResult {
-     description: "Daughter".to_string(),
+ let desc = match person1.gender.as_deref() {
+     Some("M") => "Son",
+     Some("F") => "Daughter",
+     _ => "Child",
+ };
+ return RelationshipResult {
+     description: desc.to_string(),

src/types/mod.rs Outdated
Comment on lines 1326 to 1328
4 => "Great-Grand-Niece/Grand-Nephew".to_string(),
n => format!(
"{} Great-Grand-Niece/Grand-Nephew",

Copilot AI Dec 7, 2025

Inconsistent naming: should be 'Great-Grand-Niece/Great-Grand-Nephew' to match the pattern used in lines 1335-1336.

Suggested change
- 4 => "Great-Grand-Niece/Grand-Nephew".to_string(),
- n => format!(
-     "{} Great-Grand-Niece/Grand-Nephew",
+ 4 => "Great-Grand-Niece/Great-Grand-Nephew".to_string(),
+ n => format!(
+     "{} Great-Grand-Niece/Great-Grand-Nephew",

…g fix

- Check child's gender to return 'Son' or 'Daughter' instead of always
  using generic 'Child' (fixes both parent checks)
- Fix naming inconsistency: 'Great-Grand-Niece/Nephew' →
  'Great-Grand-Niece/Great-Grand-Nephew' for consistency with Aunt/Uncle pattern
- Update test expectations to match corrected naming
@AdamIsrael AdamIsrael requested a review from Copilot December 7, 2025 00:34
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


src/types/mod.rs Outdated
assert_eq!(gedcom.describe_relationship(3, 1), "Grand-Aunt/Grand-Uncle");
assert_eq!(
gedcom.describe_relationship(4, 1),
"Great-Grand-Aunt/Grand-Uncle"

Copilot AI Dec 7, 2025

Corrected 'Great-Grand-Aunt/Grand-Uncle' to 'Great-Grand-Aunt/Great-Grand-Uncle' for consistency with the pattern on line 1337.

Suggested change
- "Great-Grand-Aunt/Grand-Uncle"
+ "Great-Grand-Aunt/Great-Grand-Uncle"

src/types/mod.rs Outdated
Comment on lines 1346 to 1348
4 => "Great-Grand-Aunt/Grand-Uncle".to_string(),
n => format!(
"{} Great-Grand-Aunt/Grand-Uncle",

Copilot AI Dec 7, 2025

Corrected 'Great-Grand-Aunt/Grand-Uncle' to 'Great-Grand-Aunt/Great-Grand-Uncle' for consistency with the pattern established in the code.

Suggested change
- 4 => "Great-Grand-Aunt/Grand-Uncle".to_string(),
- n => format!(
-     "{} Great-Grand-Aunt/Grand-Uncle",
+ 4 => "Great-Grand-Aunt/Great-Grand-Uncle".to_string(),
+ n => format!(
+     "{} Great-Grand-Aunt/Great-Grand-Uncle",

Correct the naming to use 'Great-Grand-Aunt/Great-Grand-Uncle' instead of
'Great-Grand-Aunt/Grand-Uncle' for 4+ generation relationships.

This ensures both parts of the relationship name receive the full prefix,
matching the pattern used for Niece/Nephew relationships.
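
One way to make this class of mistake harder to reintroduce is to build paired names from a single prefix. The `prefixed_pair` helper below is hypothetical and only illustrates the idea; the merged code may spell the strings out literally.

```rust
/// Apply the same prefix to both halves of a paired relationship name,
/// so the second half can never be left with a shorter prefix.
fn prefixed_pair(prefix: &str, first: &str, second: &str) -> String {
    format!("{prefix}{first}/{prefix}{second}")
}
```

For instance, `prefixed_pair("Great-Grand-", "Aunt", "Uncle")` produces "Great-Grand-Aunt/Great-Grand-Uncle", and an empty prefix gives the base "Aunt/Uncle".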
@AdamIsrael AdamIsrael requested a review from Copilot December 7, 2025 00:38
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


src/types/mod.rs Outdated
Comment on lines 271 to 275
e.event
.as_ref()
.and_then(|ev| ev.date.as_ref())
.map(|d| d.to_lowercase().contains(&pattern_lower))
.unwrap_or(false)

Copilot AI Dec 7, 2025

The Death event matching logic duplicates the pattern from event_date_matches instead of using it. Consider using Self::event_date_matches(&e.event.as_ref().and_then(|ev| ev.date.clone()), &pattern_lower) or extracting the event's date first to maintain consistency with other event types.

Suggested change
- e.event
-     .as_ref()
-     .and_then(|ev| ev.date.as_ref())
-     .map(|d| d.to_lowercase().contains(&pattern_lower))
-     .unwrap_or(false)
+ Self::event_date_matches(
+     &e.event.as_ref().and_then(|ev| ev.date.clone()),
+     &pattern_lower,
+ )

//! Usage:
//! cargo run --example search_by_date <gedcom-file> <event-type> <date-pattern>
//!
//! Event types: Birth, Death, Christening, Marriage

Copilot AI Dec 7, 2025

The documentation mentions 'Marriage' as an event type, but it's not included in the parse_event_type function below (lines 34-47) or in the EventType enum. Either add Marriage support or remove it from this documentation.
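
To illustrate the mismatch, here is a sketch of a parser that accepts only the names it knows and rejects everything else. The three-variant `EventType` below is a stand-in for the real enum, and this `parse_event_type` body (including its case-insensitive matching) is an assumption, not the function from the example file.

```rust
/// Stand-in for the library's EventType enum; only three variants are shown.
#[derive(Debug, PartialEq)]
enum EventType {
    Birth,
    Death,
    Christening,
}

/// Map a command-line argument to an event type.
/// Returns `None` for any name without a matching variant.
fn parse_event_type(name: &str) -> Option<EventType> {
    match name.to_ascii_lowercase().as_str() {
        "birth" => Some(EventType::Birth),
        "death" => Some(EventType::Death),
        "christening" => Some(EventType::Christening),
        _ => None,
    }
}
```

With a parser of this shape, a user who follows the usage text and passes "Marriage" gets `None`, which is why the documentation and the accepted names need to agree.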

Replace duplicated date matching logic with call to event_date_matches()
helper function for consistency with other event types.

This improves code maintainability by centralizing the date matching logic.
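
For reference, a helper with this role might look like the sketch below. The parameter types are inferred from the call shown in the suggested change (`&Option<String>` and a pre-lowercased pattern); the body is an assumption based on the duplicated logic that was replaced, not a copy of the merged function.

```rust
/// Case-insensitive substring match against an optional date string.
/// `pattern_lower` is expected to be lowercased by the caller already.
fn event_date_matches(date: &Option<String>, pattern_lower: &str) -> bool {
    date.as_ref()
        .map(|d| d.to_lowercase().contains(pattern_lower))
        .unwrap_or(false) // events without a date never match
}
```

Lowercasing the pattern once in the caller avoids repeating that work for every event that is checked.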
@AdamIsrael AdamIsrael merged commit 88d61b1 into main Dec 7, 2025
8 checks passed
@AdamIsrael AdamIsrael deleted the query-api branch December 7, 2025 00:45