[THIS-1095] 🤕 Handle integer string #1594

Merged
doctrino merged 7 commits into main from fix-reading-int-version
Mar 31, 2026
Conversation

@doctrino doctrino commented Mar 29, 2026

Description

I discovered this while running Neat in Toolkit. It is fairly important for Neat as a plugin in Toolkit, since it currently blocks Neat from being applied to many models.

Problem

Consider reading the following YAML:

space: my_space
externalId: MyModel
version: 1_0_0
...

Parsing this with the YAML parser returns:

{"space": "my_space", "externalId": "MyModel", "version": 100}

Thus 1_0_0, which should have been a string, becomes the number 100.
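
For illustration, the conversion can be mimicked with a simplified stand-in for the YAML 1.1 decimal integer rule, which allows underscores as digit separators. This is only a sketch: `resolve_plain_scalar` and `_YAML11_DECIMAL_INT` are hypothetical names, not part of Neat or any YAML library, and a real parser also handles hex, octal, and other forms.

```python
import re
from typing import Union

# Simplified stand-in for the YAML 1.1 decimal-integer rule: an optional sign,
# then either "0" or a non-zero digit followed by digits and underscores.
_YAML11_DECIMAL_INT = re.compile(r"^[-+]?(?:0|[1-9][0-9_]*)$")


def resolve_plain_scalar(text: str) -> Union[int, str]:
    """Return an int if the unquoted scalar looks like a decimal integer."""
    if _YAML11_DECIMAL_INT.match(text):
        # Underscores are treated as separators and dropped before conversion.
        return int(text.replace("_", ""))
    return text


print(resolve_plain_scalar("1_0_0"))  # 100
print(resolve_plain_scalar("v1"))     # v1
```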

Solution

Technically this is not a bug but a user mistake. The user can quote the value:

space: my_space
externalId: MyModel
version: '1_0_0'
...

However, instead of requiring every user to know the details of YAML parsing, we automatically apply this fix to all version fields. This is the same thing we do in Toolkit, and it has been very successful.

Bump

  • Patch
  • Skip

Changelog

Improved

  • When reading data models from YAML, Neat now handles integer values in the version field of views and data models.

github-actions bot commented Mar 29, 2026

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
| --- | --- | --- | --- | --- |
| 7433 | 6867 | 92% | 90% | 🟢 |

New Files

No new covered files...

Modified Files

| File | Coverage | Status |
| --- | --- | --- |
| cognite/neat/_data_model/importers/_api_importer.py | 94% | 🟢 |
| cognite/neat/_utils/text.py | 91% | 🟢 |
| TOTAL | 92% | 🟢 |

updated for commit: e62a54b by action🐍

codecov bot commented Mar 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.83%. Comparing base (9a95a2e) to head (e62a54b).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1594      +/-   ##
==========================================
- Coverage   91.85%   91.83%   -0.02%     
==========================================
  Files         126      130       +4     
  Lines        7468     7573     +105     
==========================================
+ Hits         6860     6955      +95     
- Misses        608      618      +10     
Files with missing lines Coverage Δ
...ognite/neat/_data_model/importers/_api_importer.py 93.14% <100.00%> (+0.20%) ⬆️
cognite/neat/_utils/text.py 91.30% <100.00%> (+1.83%) ⬆️

@doctrino doctrino marked this pull request as ready for review March 29, 2026 10:14
@doctrino doctrino requested a review from a team as a code owner March 29, 2026 10:14
@doctrino

/gemini review

@doctrino doctrino enabled auto-merge (squash) March 29, 2026 10:15
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds a utility to quote integer values for specific keys in YAML files, ensuring fields like 'version' are correctly parsed as strings. Feedback indicates that the current regular expression is not robust enough to handle trailing whitespace or comments, and a more reliable regex pattern was suggested.

Comment on lines +77 to +78
pattern = rf"^(\s*-?\s*)?{key}:\s*(?!.*['\":])([\d_]+)$"
replacement = rf'\1{key}: "\2"'
high

The current regular expression is not robust. It will fail to match lines that contain comments or trailing whitespace. For example, version: 1_0_0 # My comment would not be matched, and the version would be incorrectly parsed as an integer, which is the problem this change is intended to solve.

Additionally, the negative lookahead (?!.*['\":]) is overly restrictive. For example, it would fail if a comment contained a quote: version: 1 # My "quoted" comment.

A simpler and more robust regex can achieve the desired outcome while correctly handling these cases.

Suggested change
- pattern = rf"^(\s*-?\s*)?{key}:\s*(?!.*['\":])([\d_]+)$"
- replacement = rf'\1{key}: "\2"'
+ pattern = rf"^(\s*-?\s*{key}:\s*)([\d_]+)(.*)$"
+ replacement = rf'\1"\2"\3'
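
To make the difference concrete, the two patterns from this suggestion can be run side by side. This sketch assumes they are applied with `re.sub` and `re.MULTILINE`, which the thread does not show; the `apply` helper and the `ORIGINAL`/`SUGGESTED` names are hypothetical.

```python
import re

KEY = "version"

# (pattern, replacement) pairs copied from the suggestion above.
ORIGINAL = (rf"^(\s*-?\s*)?{KEY}:\s*(?!.*['\":])([\d_]+)$", rf'\1{KEY}: "\2"')
SUGGESTED = (rf"^(\s*-?\s*{KEY}:\s*)([\d_]+)(.*)$", r'\1"\2"\3')


def apply(rule: tuple, line: str) -> str:
    """Apply one (pattern, replacement) pair to the given text."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, line, flags=re.MULTILINE)


commented = "version: 1_0_0 # My comment"
print(apply(ORIGINAL, commented))   # unchanged: the trailing comment blocks the match
print(apply(SUGGESTED, commented))  # version: "1_0_0" # My comment
```

Both patterns leave an already quoted value such as `version: '1_0_0'` untouched, since the value group requires a digit or underscore immediately after the whitespace.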

@doctrino doctrino merged commit 9d02ae0 into main Mar 31, 2026
10 checks passed
@doctrino doctrino deleted the fix-reading-int-version branch March 31, 2026 11:16
def quote_int_value_by_key_in_yaml(content: str, key: str) -> str:
    """Quote a value in a yaml string"""
    # This pattern will match the key if it is not already quoted
    pattern = rf"^(\s*-?\s*{key}:\s*)([\d_]+)(.*)$"
It doesn't seem like we successfully handle floats with this regex. The image below shows what happens when I load a YAML file with version: 2.1.1

Image
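
The behaviour can be reproduced in isolation with the merged pattern. This is a minimal sketch that assumes the pattern is applied through `re.sub` with `re.MULTILINE`; the `quote` helper is a hypothetical name used only here.

```python
import re

# Merged pattern with the key fixed to "version".
pattern = r"^(\s*-?\s*version:\s*)([\d_]+)(.*)$"
replacement = r'\1"\2"\3'


def quote(line: str) -> str:
    """Apply the merged pattern to a single line of YAML text."""
    return re.sub(pattern, replacement, line, flags=re.MULTILINE)


# Only the leading digit run is captured, so the quotes land mid-value.
print(quote("version: 2.1.1"))  # version: "2".1.1
```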

Magssch added a commit that referenced this pull request Mar 31, 2026
# Description

Tighten `quote_int_value_by_key_in_yaml` so it no longer uses
`([\d_]+)(.*)$`, which only quoted the leading digit run and broke lines
like `version: 1.0` by formatting them as `version: "1".0`. This PR
builds upon #1594, which was in turn based on the existing handling in
Toolkit. It retains the same-line `#` comment handling that was added
there while no longer mangling floats. Extra unit tests ensure that
floats, semver-like text, scientific notation, and empty `version:`
lines are left unchanged.

Currently, the user would see the following error if the version
contains periods:

<img width="2228" height="762" alt="image"
src="https://github.com/user-attachments/assets/b398bd59-34e9-4589-ae3e-8606af20af99"
/>

## Bump

- [x] Patch
- [ ] Skip

## Changelog
### Fixed

- Fixes a crashing issue when reading Toolkit-format data models with
version strings containing periods (e.g. `1.0.0`).
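
One possible shape for such a tightened pattern is sketched below. It is not the code from the referenced commit, which is not shown in this thread: the function name `quote_int_value_sketch` is hypothetical, and the choice to allow only trailing whitespace or a whitespace-separated `#` comment after the digits is an assumption.

```python
import re


def quote_int_value_sketch(content: str, key: str) -> str:
    """Quote values for `key` that consist only of digits and underscores.

    Hypothetical sketch, not the Neat implementation.
    """
    pattern = (
        rf"^([ \t]*-?[ \t]*{re.escape(key)}:[ \t]*)"  # indentation, optional list dash, key
        r"([\d_]+)"                                   # the integer-like value
        r"((?:[ \t]+#.*)?[ \t]*)$"                    # optional comment, trailing whitespace
    )
    return re.sub(pattern, r'\1"\2"\3', content, flags=re.MULTILINE)


doc = "space: my_space\nexternalId: MyModel\nversion: 1_0_0 # release\n"
print(quote_int_value_sketch(doc, "version"))
```

Because the value group must be followed directly by end of line, whitespace, or a comment, values such as `1.0`, `2.1.1`, and `1e3` no longer match and are left as written. Using `[ \t]` instead of `\s` keeps the match on a single line, so an empty `version:` line cannot pick up digits from the next line.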