Commit aceed17

updates to user guide
1 parent 9129ff0 commit aceed17

8 files changed

Lines changed: 1761 additions & 615 deletions

docs/guides/features/data/current_load.csv

Lines changed: 3 additions & 3 deletions
@@ -4,9 +4,9 @@ TXN-002,2024-03-01T09:15:00.000000,S1,R1,Bread,1,2.8,0.0,2.8,
 TXN-003,2024-03-01T10:02:00.000000,S1,R2,Eggs,1,3.2,0.0,3.1999999999,LC-1003
 TXN-004,2024-03-01T10:30:00.000000,S1,R2,Butter,3,1.9,0.1,5.6,LC-9999
 TXN-005,2024-03-01T11:00:00.000000,S1,R1,Cheese,1,4.5,0.0,4.5,
-TXN-006,2024-03-01T11:20:00.000000,S2,R4,Apples,4,1.5,0.2,5.8,LC-2001
-TXN-007,2024-03-01T11:45:00.000000,S2,R4,Chicken,2,10.8,1.0,20.6,LC-2002
-TXN-008,2024-03-01T12:10:00.000000,S2,R4,Rice,1,4.2,0.1,4.1,
+TXN-006,2024-03-01T11:20:02.000000,S2,R4,Apples,4,1.5,0.2,5.8,LC-2001
+TXN-007,2024-03-01T11:45:03.000000,S2,R4,Chicken,2,10.8,1.0,20.6,LC-2002
+TXN-008,2024-03-01T12:10:02.000000,S2,R4,Rice,1,1.5,0.2,1.3,
 TXN-009,2024-03-01T13:00:00.000000,S1,R1,Yogurt,2,1.2,0.0,2.4000000001,LC-1006
 TXN-010,2024-03-01T13:30:00.000000,S1,R2,Juice,3,3.0,0.0,9.0,
 TXN-013,2024-03-01T14:00:00.000000,S1,R1,Coffee,1,6.5,0.0,6.5,LC-1008

docs/guides/features/data/previous_load.csv

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ TXN-004,2024-03-01T10:30:00.000000,S1,R2,Butter,3,1.9,0.1,5.6,LC-1004
 TXN-005,2024-03-01T11:00:00.000000,S1,R1,Cheese,1,4.5,0.0,4.5,
 TXN-006,2024-03-01T11:20:00.000000,S2,R4,Apples,4,0.75,0.0,3.0,LC-2001
 TXN-007,2024-03-01T11:45:00.000000,S2,R4,Chicken,2,5.4,0.5,10.3,LC-2002
-TXN-008,2024-03-01T12:10:00.000000,S2,R4,Rice,1,2.1,0.0,2.1,
+TXN-008,2024-03-01T12:10:00.000000,S2,R4,Rice,1,0.75,0.0,0.75,
 TXN-009,2024-03-01T13:00:00.000000,S1,R1,Yogurt,2,1.2,0.0,2.4,LC-1006
 TXN-010,2024-03-01T13:30:00.000000,S1,R2,Juice,3,3.0,0.0,9.0,
 TXN-011,2024-03-01T08:00:00.000000,S2,R3,Soap,1,2.5,0.0,2.5,LC-2004
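
The two CSV fixtures differ in small ways that look intended for tolerance-based comparison: TXN-009 carries `2.4` in the previous load and `2.4000000001` in the current one, and the TXN-006 timestamp moves by two seconds. The following is a minimal sketch of what such a comparison means, using only the Python standard library rather than `diffly` itself. The helper names and the tolerance values are hypothetical and chosen for illustration only.

```python
import math
from datetime import datetime, timedelta

# Values copied from the fixture rows above (ninth field of TXN-009,
# second field of TXN-006). Variable names are illustrative only.
previous_value, current_value = 2.4, 2.4000000001
previous_ts = datetime.fromisoformat("2024-03-01T11:20:00.000000")
current_ts = datetime.fromisoformat("2024-03-01T11:20:02.000000")


def floats_match(left, right, abs_tol=1e-8, rel_tol=1e-5):
    """Hypothetical helper: treat floats as equal within an absolute or relative tolerance."""
    return math.isclose(left, right, rel_tol=rel_tol, abs_tol=abs_tol)


def timestamps_match(left, right, tolerance=timedelta(seconds=5)):
    """Hypothetical helper: treat timestamps as equal if they differ by at most `tolerance`."""
    return abs(left - right) <= tolerance


print(previous_value == current_value)                # exact comparison
print(floats_match(previous_value, current_value))    # tolerant comparison
print(timestamps_match(previous_ts, current_ts))      # 2 s apart, 5 s allowed
```

With exact comparison both rows would count as changed; with the tolerances assumed here they would count as equal.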

docs/guides/features/investigating.ipynb

Lines changed: 565 additions & 188 deletions
Large diffs are not rendered by default.

docs/guides/features/summary.ipynb

Lines changed: 609 additions & 134 deletions
Large diffs are not rendered by default.

docs/guides/features/testing.md

Lines changed: 2 additions & 2 deletions
@@ -1,10 +1,10 @@
 # Testing Utilities
 
-`diffly` provides assertion functions that print detailed summaries when DataFrames differ, making test failures easier to debug.
+`diffly` provides assertion functions that print detailed summaries when data frames differ, making test failures easier to debug.
 
 ## Asserting equality of frames
 
-Use {func}`~diffly.testing.assert_frame_equal` to compare two Polars DataFrames or LazyFrames in your tests:
+Use {func}`~diffly.testing.assert_frame_equal` to compare two Polars data frames or lazy frames in your tests:
 
 ```python
 from diffly.testing import assert_frame_equal

docs/guides/features/tolerances.ipynb

Lines changed: 380 additions & 196 deletions
Large diffs are not rendered by default.

docs/guides/quickstart.ipynb

Lines changed: 154 additions & 77 deletions
Large diffs are not rendered by default.

docs/index.md

Lines changed: 47 additions & 14 deletions
@@ -1,6 +1,6 @@
 # Diffly
 
-A utility package for comparing Polars DataFrames.
+A utility package for comparing `polars` data frames.
 
 ```{toctree}
 :maxdepth: 2
@@ -12,31 +12,31 @@ API Reference <api/modules>
 
 ## What is Diffly?
 
-Diffly is a utility package for comparing Polars DataFrames and LazyFrames with detailed analysis capabilities. It identifies differences between datasets including:
+Diffly is a utility package for comparing `polars` data frames and lazy frames with detailed analysis capabilities. It identifies differences between datasets including:
 
-- **Schema differences**: Columns that exist only in one DataFrame
-- **Row-level mismatches**: Rows that are different between DataFrames
-- **Missing rows**: Rows that exist only in one DataFrame
+- **Schema differences**: Columns that exist only in one data frame
+- **Row-level mismatches**: Rows that are different between data frames
+- **Missing rows**: Rows that exist only in one data frame
 - **Column value changes**: Detailed analysis of which columns differ and by how much
 
 ## Key Features
 
-- **Primary key-based comparison**: Join DataFrames on specified primary keys for row-by-row comparison
+- **Primary key-based comparison**: Join data frames on specified primary keys for row-by-row comparison
 - **Rich summaries**: Generate detailed, visually formatted comparison reports
-- **Lazy evaluation**: Uses Polars LazyFrames internally for efficient computation
 - **Tolerance-based equality**: Configure absolute and relative tolerances for floating point comparisons
+- **Lazy evaluation**: Uses `polars` lazy frames internally for efficient computation
 - **Temporal tolerance**: Support for comparing temporal types (dates, datetimes) with configurable tolerances
 - **Per-column tolerances**: Fine-grained control over comparison tolerances for each column
 - **Method caching**: Automatically caches comparison results to avoid recomputation
-- **Testing utilities**: Built-in assertion functions for DataFrame and Collection equality in tests
+- **Testing utilities**: Built-in assertion functions for data frame and collection equality in tests
 
 ## Quick Example
 
 ```python
 import polars as pl
 from diffly import compare_frames
 
-# Create two DataFrames to compare
+# Create two data frames to compare
 left = pl.DataFrame({
     "id": ["a", "b", "c"],
     "value": [1.0, 2.0, 3.0],
@@ -47,22 +47,55 @@ right = pl.DataFrame({
     "value": [1.0, 2.5, 4.0],
 })
 
-# Compare the DataFrames
+# Compare the data frames
 comparison = compare_frames(left, right, primary_key="id")
 
 # Check if they're equal
 if not comparison.equal():
     # Display a detailed summary
     summary = comparison.summary(
         top_k_column_changes=1,
-        show_sample_primary_key_per_change=True
+        show_sample_primary_key_per_change=True,
     )
     print(summary)
 ```
 
+This prints a rich summary showing schema differences, row counts, match rates, and top value changes:
+
+```
+┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
+┃ Diffly Summary ┃
+┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
+Primary key: id
+
+Schemas
+▔▔▔▔▔▔▔
+Schemas match exactly (column count: 2).
+
+Rows
+▔▔▔▔
+Left count Right count
+3 (no change) 3
+
+┏━┯━┯━┯━┯━┓
+┃-│-│-│-│-┃ 1 left only (33.33%)
+┠─┼─┼─┼─┼─┨╌╌╌┏━┯━┯━┯━┯━┓╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╮
+┃ │ │ │ │ ┃ = ┃ │ │ │ │ ┃ 1 equal (50.00%) │
+┠─┼─┼─┼─┼─┨╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌├╴ 2 joined
+┃ │ │ │ │ ┃ ≠ ┃ │ │ │ │ ┃ 1 unequal (50.00%) │
+┗━┷━┷━┷━┷━┛╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯
+┃+│+│+│+│+┃ 1 right only (33.33%)
+┗━┷━┷━┷━┷━┛
+
+Columns
+▔▔▔▔▔▔▔
+┌───────┬────────┬───────────────────────────┐
+│ value │ 50.00% │ 2.0 -> 2.5 (1x, e.g. "b") │
+└───────┴────────┴───────────────────────────┘
+```
+
 ## Next Steps
 
-- Follow the [Quickstart Guide](guides/quickstart.ipynb) for a comprehensive introduction
-- Explore [Examples](guides/examples/index.md) for common use cases
-- Learn about advanced [Features](guides/features/index.md) like tolerances and custom summaries
+- Follow the [Quickstart Guide](guides/quickstart.ipynb) for a hands-on introduction
+- Learn about [Features](guides/features/index.md) like summaries, tolerances, and investigation tools
 - Check the [API Reference](api/modules.rst) for detailed function documentation
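
The row counts in the summary added to `docs/index.md` (1 left only, 1 right only, 2 joined, of which 1 equal and 1 unequal) follow from a join on the primary key. The sketch below reproduces that bookkeeping with plain dictionaries instead of `polars` or `diffly`, so it illustrates the idea rather than the library's implementation. The `id` values of the right frame are elided in the diff; `["a", "b", "d"]` is an assumption chosen to be consistent with the printed summary.

```python
# Plain-dict stand-ins for the two frames: primary key "id" -> "value".
left = {"a": 1.0, "b": 2.0, "c": 3.0}
# Assumption: the right frame's ids are not visible in the diff.
right = {"a": 1.0, "b": 2.5, "d": 4.0}

# Rows whose key appears on only one side.
left_only = sorted(left.keys() - right.keys())
right_only = sorted(right.keys() - left.keys())

# Rows whose key appears on both sides, split by whether the values agree.
joined = sorted(left.keys() & right.keys())
equal = [key for key in joined if left[key] == right[key]]
unequal = [key for key in joined if left[key] != right[key]]

print("left only:", left_only)
print("right only:", right_only)
print("joined:", joined, "equal:", equal, "unequal:", unequal)
```

Under these assumed ids, the only joined row that differs is `"b"`, which matches the `2.0 -> 2.5` change reported in the Columns section of the summary.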
