Mark up amounts and fees in SFS as data tags #56
Open: marcarl wants to merge 9 commits into main from claude/tag-swedish-amounts-aUZmN
+1,153
−0
…centages
Implements a function to identify and wrap Swedish currency amounts (kronor, kr, SEK) and percentages (%, procent) with semantic <data> tags containing:
- id: context-aware slug (e.g., "avgift-1500-kr", "ranta-8-5-procent")
- type: "amount" or "percentage"
- value: normalized numeric value
The function:
- Handles Swedish number formats (space separators, decimal comma)
- Supports multipliers (miljoner, miljarder, tusen)
- Extracts context words for descriptive slugs
- Skips markdown headers and XML/HTML tags
- Includes 48 unit tests
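The number handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `AMOUNT_RE`, `MULTIPLIERS`, `normalize` and `tag_amounts` are hypothetical names, the `id` attribute is left out, and the skipping of markdown headers and XML/HTML tags is not covered.

```python
import re
from decimal import Decimal

# Hypothetical multiplier table for the words named in the commit message.
MULTIPLIERS = {"tusen": 1_000, "miljoner": 1_000_000, "miljarder": 1_000_000_000}

# A number with optional space (or no-break space) thousand separators and an
# optional decimal comma, an optional multiplier word, then a unit.
AMOUNT_RE = re.compile(
    r"(?<!\d)(?P<num>\d{1,3}(?:[ \xa0]\d{3})+|\d+)(?P<dec>,\d+)?"
    r"(?:\s+(?P<mult>tusen|miljoner|miljarder))?"
    r"\s*(?P<unit>kronor|kr|SEK|procent|%)(?!\w)"
)


def normalize(num, dec, mult):
    """Turn a Swedish-formatted number into a plain decimal string."""
    digits = re.sub(r"[ \xa0]", "", num) + (dec or "").replace(",", ".")
    value = Decimal(digits) * MULTIPLIERS.get(mult, 1)
    text = format(value, "f")
    # Drop trailing zeros only when there is a fractional part.
    return text.rstrip("0").rstrip(".") if "." in text else text


def tag_amounts(text):
    """Wrap each matched amount or percentage in a <data> tag (no id here)."""
    def wrap(m):
        kind = "percentage" if m["unit"] in ("procent", "%") else "amount"
        value = normalize(m["num"], m["dec"], m["mult"])
        return f'<data type="{kind}" value="{value}">{m[0]}</data>'

    return AMOUNT_RE.sub(wrap, text)
```

Using `Decimal` rather than `float` is a choice made for this sketch so that values such as 8,5 are normalized without floating-point rounding noise.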
Remove numeric value and unit from id attribute, keeping only the context-derived identifier (e.g., "avgift", "ranta", "moms"). This allows tracking the same data point across law amendments, since the id stays constant while only the value changes.
Before: id="avgift-1500-kr"
After: id="avgift"
Replace context-based slug generation with positional ids that can be
mapped to descriptive slugs via a reference table.
Changes:
- Generate positional ids like "kap5.2-belopp-1" based on section + type + position
- Add reference table support (data/amount-references.json) for custom slugs
- Section tags in the text automatically set the current section id
- Counters reset when entering a new section
This approach allows:
- Consistent ids across law amendments (same position = same id)
- Human/LLM curation of descriptive slugs like "riksbankens-referensranta"
- Tracking value changes over time using the stable id
Example with reference table:
{"kap5.2-belopp-1": "tillstandsavgift"}
Output:
<data id="tillstandsavgift" type="amount" value="1500">1 500 kronor</data>
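The positional id scheme and the reference table lookup could look roughly like this. It is a sketch under assumptions: the class `PositionalIds` and its method names are hypothetical, and only the "belopp" type label appears in the commit message, so the caller supplies the label.

```python
from collections import defaultdict


class PositionalIds:
    """Hands out ids of the form '<section>-<kind>-<n>' (hypothetical helper)."""

    def __init__(self, references=None):
        # Maps positional ids to curated slugs, e.g. loaded from the reference table.
        self.references = references or {}
        self.section = None
        self.counters = defaultdict(int)

    def enter_section(self, section_id):
        # Counters reset when entering a new section.
        self.section = section_id
        self.counters.clear()

    def next_id(self, kind):
        # Count per type within the current section, then resolve to a slug
        # if the reference table has one; otherwise keep the positional id.
        self.counters[kind] += 1
        positional = f"{self.section}-{kind}-{self.counters[kind]}"
        return self.references.get(positional, positional)


ids = PositionalIds({"kap5.2-belopp-1": "tillstandsavgift"})
ids.enter_section("kap5.2")
first = ids.next_id("belopp")   # resolved through the reference table
second = ids.next_id("belopp")  # no entry, so the positional id is kept
```

Falling back to the positional id when no slug is curated means every tag still gets a usable id before the reference table is complete.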
Include SFS designation (e.g., "2024:123") in positional ids to enable:
- Unique identification across different laws
- Tracking value changes when same slug maps to multiple SFS versions
New id format: sfs-2024-123-kap5.2-belopp-1
Reference table now supports tracking changes over time:
{
"sfs-2020-100-kap5.2-belopp-1": "tillstandsavgift",
"sfs-2024-123-kap5.2-belopp-1": "tillstandsavgift"
}
Both resolve to id="tillstandsavgift" but with different values,
allowing comparison of the same data point across amendments.
Also extracts SFS id from <article selex:id="lag-2024-123"> tags.
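Deriving the SFS prefix from the article tag might be done with a small regular expression. The names `ARTICLE_RE` and `sfs_prefix` are hypothetical, and the sketch assumes the `selex:id` value always has the form `lag-<year>-<number>` shown above.

```python
import re

# Assumes the attribute looks like selex:id="lag-2024-123".
ARTICLE_RE = re.compile(r'<article\b[^>]*\bselex:id="lag-(\d{4})-(\d+)"')


def sfs_prefix(markup):
    """Return e.g. 'sfs-2024-123' for SFS 2024:123, or None if no article tag is found."""
    m = ARTICLE_RE.search(markup)
    return f"sfs-{m[1]}-{m[2]}" if m else None
```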
Change positional id format
From: sfs-2024-123-kap5.2-belopp-1
To: sfs-2024-123/kap5.2-belopp-1
The "/" creates clearer visual hierarchy:
- Before slash: the law (SFS designation)
- After slash: position within the document
Added test for reference table slug resolution.
New function to find amounts/percentages that need slugs in the reference table.
Returns list of dicts with:
- positional_id: the id that needs mapping
- type: "amount" or "percentage"
- value: normalized numeric value
- matched_text: original text matched
- context: surrounding text for understanding
Useful for batch curation of slugs with LLM assistance.
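The filtering step behind such a function can be illustrated as below. The function name `find_unmapped` and the idea that matches are already available as dicts are assumptions made for this sketch; the PR's function may well scan the text itself.

```python
def find_unmapped(matches, references):
    """Return the match dicts whose positional_id has no slug in the reference table."""
    return [m for m in matches if m["positional_id"] not in references]


# Illustrative data shaped like the dicts described above.
matches = [
    {
        "positional_id": "sfs-2024-123/kap5.2-belopp-1",
        "type": "amount",
        "value": "1500",
        "matched_text": "1 500 kronor",
        "context": "Avgiften är 1 500 kronor.",
    },
    {
        "positional_id": "sfs-2024-123/kap5.2-belopp-2",
        "type": "amount",
        "value": "300",
        "matched_text": "300 kr",
        "context": "En tilläggsavgift på 300 kr tas ut.",
    },
]
references = {"sfs-2024-123/kap5.2-belopp-1": "tillstandsavgift"}
todo = find_unmapped(matches, references)
```

The `context` field is what a human or LLM curator would read when choosing a descriptive slug for each remaining positional id.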
Add comprehensive reference table entries covering:
- Socialtjänstlagen (2025:400) - sanktionsavgifter
- Inkomstskattelagen (1999:1229) - basbelopp, avdrag, skattesatser
- Socialförsäkringsbalken (2010:110) - sjukpenning, föräldrapenning
- Brottsbalken (1962:700) - straffbestämmelser
- Aktiebolagslagen (2005:551) - kapitalkrav
- Räntelagen (1975:635) - referensränta
- And 27 more Swedish laws
This enables tracking of amount changes across law amendments using stable descriptive slugs like "prisbasbelopp", "referensranta", etc.
YAML supports inline comments, making it easier to document and organize the reference table with section headers and annotations.
Changes:
- Convert data/amount-references.json to data/amount-references.yaml
- Update load_reference_table() to use yaml.safe_load()
- Replace json import with yaml import
Each entry now includes an inline comment with:
- The actual value (e.g., "57 300 kr", "80%")
- A text excerpt showing context from the law
Example:
"sfs-1999-1229/kap2.1-belopp-1": prisbasbelopp # 57 300 kr - "prisbasbeloppet enligt 2 kap."
This makes it easier to understand and verify each mapping.