Add validation and capture of invalid values in data_availability_status #1252
Creates the DATA_AVAILABILITY_STATUS_INVALID constant and the DATA_AVAILABILITY_STATUS_VALID_VALUES list to validate values coming from the XML, preserving invalid information for later analysis.
Creates the invalid_data_availability_status field to preserve invalid values coming from the XML when they are not in the list of choices.
Validates whether the value extracted from the XML is in the list of valid choices. Invalid values are preserved in invalid_data_availability_status and the status is marked as "invalid".
Creates a field on the Article model to store invalid data availability values received from the XML.
Pull request overview
This PR implements validation for data_availability_status values extracted from XML files, preserving invalid values for analysis. When an XML contains a value that doesn't match the expected choices, the system stores the original value in a new field and marks the status as "invalid" to prevent data loss.
Key changes:
- Added validation logic to detect and handle invalid data_availability_status values from XML
- Introduced a new database field to preserve unrecognized values for later analysis
- Updated the choices list to include an "invalid" status option
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| article/choices.py | Added new constant DATA_AVAILABILITY_STATUS_INVALID and created validation list DATA_AVAILABILITY_STATUS_VALID_VALUES |
| article/models.py | Added invalid_data_availability_status field to store unrecognized values from XML; whitespace cleanup |
| article/sources/xmlsps.py | Implemented validation logic in add_data_availability_status() to check values against valid choices and preserve invalid ones; whitespace cleanup |
| article/migrations/0045_article_invalid_data_availability_status_and_more.py | Database migration adding the new field and updating choices for data_availability_status |
| article.invalid_data_availability_status = status | ||
| article.data_availability_status = choices.DATA_AVAILABILITY_STATUS_INVALID | ||
| else: | ||
| article.invalid_data_availability_status = None |
Copilot
AI
Jan 6, 2026
The validation logic has a subtle issue: when the status from XML is None (absent), it correctly sets the status to DATA_AVAILABILITY_STATUS_ABSENT, but it also sets invalid_data_availability_status to None. However, when a valid status is provided from the XML, invalid_data_availability_status is also set to None. This means that if an article previously had an invalid status and is then updated with a valid status or no status, the original invalid value will be lost. Consider whether the invalid_data_availability_status field should only be cleared when explicitly appropriate, or if historical tracking of invalid values is needed.
| article.invalid_data_availability_status = None | |
| # Do not overwrite `invalid_data_availability_status` when the status is valid or absent, | |
| # preserving any previously stored invalid value. |
OK
article/choices.py
Outdated
| ) | ||
|
|
||
| # List of valid values used for validation | ||
| DATA_AVAILABILITY_STATUS_VALID_VALUES = [status[0] for status in DATA_AVAILABILITY_STATUS] |
@Rossi-Luciano the valid values are in the SPS documentation. This list is the control list, not necessarily the list of acceptable values.
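To illustrate the distinction raised here, a minimal sketch follows. It assumes the control list also carries internal markers such as "absent" and "invalid"; the name `SPS_ACCEPTED_VALUES` and the values inside it are hypothetical placeholders, and the real values must come from the SPS documentation.

```python
# Hypothetical sketch: the values below are placeholders, not the official SPS list.
DATA_AVAILABILITY_STATUS_ABSENT = "absent"
DATA_AVAILABILITY_STATUS_INVALID = "invalid"

# Values an XML may legitimately carry (placeholders; take the real ones from the SPS docs).
SPS_ACCEPTED_VALUES = ("data-available", "data-not-available", "uninformed")

# Control list for the model field: accepted values plus internal markers.
DATA_AVAILABILITY_STATUS = [(value, value) for value in SPS_ACCEPTED_VALUES] + [
    (DATA_AVAILABILITY_STATUS_ABSENT, "absent"),
    (DATA_AVAILABILITY_STATUS_INVALID, "invalid"),
]

# Deriving the valid values from the control list lets the internal markers through.
derived_from_choices = [choice[0] for choice in DATA_AVAILABILITY_STATUS]


def is_accepted_from_xml(value):
    """Return True only for values the XML is allowed to carry."""
    return value in SPS_ACCEPTED_VALUES
```

With this split, an XML that literally contains "invalid" or "absent" is rejected, whereas a list derived from the control choices would accept it.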
OK
article/sources/xmlsps.py
Outdated
| items.append({"language": lang, "text": text}) | ||
|
|
||
| article.data_availability_status = status or choices.DATA_AVAILABILITY_STATUS_ABSENT | ||
| # Validates whether the status is in the list of accepted values |
choices.DATA_AVAILABILITY_STATUS_ABSENT is used to indicate that this data is not present in the XML, because the guidance requiring this element in the XML is more recent. So adjust the code to take this into account.
OK
robertatakenaka
left a comment
@Rossi-Luciano please make the corrections
Fixes preservation of history in invalid_data_availability_status
Fixes data availability validation to use the official SPS values
| if status is None: | ||
| # Value absent from the XML (the SPS guidance on this element is more recent) | ||
| # Does not change invalid_data_availability_status (preserves history) | ||
| article.data_availability_status = choices.DATA_AVAILABILITY_STATUS_ABSENT |
@Rossi-Luciano I think it is better to add article.invalid_data_availability_status = None
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| # Validates the status extracted from the XML | ||
| if status is None: | ||
| # Value absent from the XML (the SPS guidance on this element is more recent) | ||
| # Does not change invalid_data_availability_status (preserves history) | ||
| article.data_availability_status = choices.DATA_AVAILABILITY_STATUS_ABSENT | ||
| elif status not in choices.DATA_AVAILABILITY_STATUS_VALID_VALUES: | ||
| # Invalid value found in the XML | ||
| article.invalid_data_availability_status = status | ||
| article.data_availability_status = choices.DATA_AVAILABILITY_STATUS_INVALID | ||
| else: | ||
| # Explicit valid value present | ||
| article.invalid_data_availability_status = None | ||
| article.data_availability_status = status |
Copilot
AI
Jan 6, 2026
The new validation logic for data availability status values lacks test coverage. Consider adding tests to verify the three validation paths: (1) invalid values are stored in invalid_data_availability_status and status is set to "invalid", (2) valid values clear invalid_data_availability_status and set the correct status, and (3) absent values preserve the existing invalid_data_availability_status without changes. This is particularly important since this logic handles data preservation and could impact data integrity if it behaves incorrectly.
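As a rough sketch of what such tests could look like, the example below exercises the three paths against a stand-in function. Everything here is an assumption for illustration: `set_status` mirrors the branches of the reviewed diff rather than calling the project's `add_data_availability_status()`, a `SimpleNamespace` replaces the Article model, and the values in `VALID_VALUES` are placeholders.

```python
from types import SimpleNamespace

# Placeholder constants; the real ones live in article/choices.py.
ABSENT = "absent"
INVALID = "invalid"
VALID_VALUES = ["data-available", "data-not-available"]  # hypothetical values


def set_status(article, status):
    """Stand-in mirroring the three branches of the reviewed diff."""
    if status is None:
        # Absent: the invalid field is left untouched in this revision.
        article.data_availability_status = ABSENT
    elif status not in VALID_VALUES:
        article.invalid_data_availability_status = status
        article.data_availability_status = INVALID
    else:
        article.invalid_data_availability_status = None
        article.data_availability_status = status


def make_article(invalid=None):
    """Build a minimal object with the two fields under test."""
    return SimpleNamespace(
        data_availability_status=None,
        invalid_data_availability_status=invalid,
    )


def test_invalid_value_is_preserved():
    article = make_article()
    set_status(article, "not-a-real-status")
    assert article.data_availability_status == INVALID
    assert article.invalid_data_availability_status == "not-a-real-status"


def test_valid_value_clears_invalid_field():
    article = make_article(invalid="old-bad-value")
    set_status(article, "data-available")
    assert article.data_availability_status == "data-available"
    assert article.invalid_data_availability_status is None


def test_absent_value_keeps_previous_invalid_field():
    article = make_article(invalid="old-bad-value")
    set_status(article, None)
    assert article.data_availability_status == ABSENT
    assert article.invalid_data_availability_status == "old-bad-value"
```

In the project itself, these cases would presumably run against the real function and an Article instance inside a Django test case.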
article/sources/xmlsps.py
Outdated
| # Validates the status extracted from the XML | ||
| if status is None: | ||
| # Value absent from the XML (the SPS guidance on this element is more recent) | ||
| # Does not change invalid_data_availability_status (preserves history) |
Copilot
AI
Jan 6, 2026
The logic preserves the invalid_data_availability_status field when status is absent (None), which creates potential inconsistency. If an article is reprocessed with a new XML that lacks a data availability section, the status becomes "absent" but the old invalid value remains in invalid_data_availability_status. This means the fields can show conflicting information (status="absent" with invalid_data_availability_status="some-old-value"). Consider clearing invalid_data_availability_status when status is None to maintain consistency, or document why this historical preservation is desired despite the inconsistency.
| # Does not change invalid_data_availability_status (preserves history) | |
| # Clears invalid_data_availability_status to avoid an inconsistent state | |
| article.invalid_data_availability_status = None |
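The effect of this suggestion can be sketched with a small reprocessing scenario. This is an illustration under assumptions, not the project's code: `set_status_clearing` is a hypothetical stand-in for the suggested variant, the constants are placeholders, and a `SimpleNamespace` replaces the Article model.

```python
from types import SimpleNamespace

# Placeholder constants; the real ones live in article/choices.py.
ABSENT = "absent"
INVALID = "invalid"
VALID_VALUES = ["data-available", "data-not-available"]  # hypothetical values


def set_status_clearing(article, status):
    """Stand-in for the suggested variant: the absent branch also clears the invalid field."""
    if status is None:
        article.invalid_data_availability_status = None
        article.data_availability_status = ABSENT
    elif status not in VALID_VALUES:
        article.invalid_data_availability_status = status
        article.data_availability_status = INVALID
    else:
        article.invalid_data_availability_status = None
        article.data_availability_status = status


article = SimpleNamespace(
    data_availability_status=None,
    invalid_data_availability_status=None,
)

# First load: the XML carries an unrecognized value.
set_status_clearing(article, "some-old-value")
first_pass = (article.data_availability_status, article.invalid_data_availability_status)

# Reprocessing: the new XML has no data availability section.
set_status_clearing(article, None)
second_pass = (article.data_availability_status, article.invalid_data_availability_status)
```

After the second pass the two fields agree again: the status is "absent" and no stale invalid value remains. The trade-off is that the earlier invalid value is no longer available for later analysis.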
robertatakenaka
left a comment
@Rossi-Luciano please check the last comment and fix it
What does this PR do?
This PR implements validation of the data_availability_status values extracted from the XML, preserving invalid values for later analysis. When the XML contains a value that is not in the list of valid choices, the system:
- stores the original value in invalid_data_availability_status
- marks data_availability_status as "invalid"
Problem solved: XMLs are arriving with specific-use values that do not match the expected choices, causing loss of information.
Where could the review start?
- article/choices.py - Check the new constants and the validation list
- article/models.py - Check the new invalid_data_availability_status field
- article/sources/xmlsps.py - Check the validation logic in the add_data_availability_status function
- article/migrations/0045_article_invalid_data_availability_status_and_more.py - Check the generated migration
How could this be tested manually?
Test via Django Shell:
Check in the database:
Any context you would like to provide?
N.A.
What are the relevant tickets?
TK #1249
References
N.A.