Copilot AI commented Jan 8, 2026

The library was rejecting domains with any TXT record containing invalid UTF-8, even when a valid SPF record existed. Per RFC 7208 § 4.5, only records beginning with v=spf1 should be processed—non-SPF records should be silently discarded.

Changes

  • query_spf_record(): Check for undecodable characters after filtering for SPF records. Add warning and skip instead of raising UndecodableCharactersInTXTRecord.

  • get_spf_record(): Preserve and merge warnings from query_spf_record() with those from parse_spf_record(). Previously warnings were being discarded.

Example

import checkdmarc

# Domain with non-SPF TXT record containing invalid UTF-8
# Before: raises UndecodableCharactersInTXTRecord
# After: returns valid SPF record with warning
result = checkdmarc.check_spf("fossgranlund.no")
# Returns: {"record": "v=spf1 include:...", "warnings": ["A TXT record with undecodable characters was skipped."]}
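
For the warning to reach the caller, warnings from the lookup step and the parse step have to be combined rather than overwritten. A minimal sketch of that merge follows. The function name `merge_lookup_and_parse`, the injected `query` and `parse` callables, and the dictionary shapes are all hypothetical stand-ins, not checkdmarc's actual signatures.

```python
def merge_lookup_and_parse(domain, query, parse):
    """Combine warnings from a lookup step and a parse step.

    `query` and `parse` are hypothetical callables that each return a
    dict which may contain a "warnings" list.
    """
    lookup = query(domain)
    parsed = parse(lookup["record"])
    # Keep lookup warnings first, then append parse warnings, so that
    # neither stage's warnings are discarded.
    warnings = list(lookup.get("warnings", [])) + list(parsed.get("warnings", []))
    return {"record": lookup["record"], "warnings": warnings}


def fake_query(domain):
    # Stand-in for a DNS lookup that skipped one undecodable TXT record.
    return {
        "record": "v=spf1 -all",
        "warnings": ["A TXT record with undecodable characters was skipped."],
    }


def fake_parse(record):
    # Stand-in for a parser that produced no warnings of its own.
    return {"warnings": []}


print(merge_lookup_and_parse("example.test", fake_query, fake_parse))
```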

If an actual SPF record (starting with v=spf1) had undecodable characters, the behavior would be undefined since we cannot parse it—but this scenario is exceedingly rare and represents genuine DNS corruption rather than adjacent TXT records with different encodings.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • vmc.digicert.com
    • Triggering command: /usr/bin/python python -m unittest discover -s . -p tests.py -v (dns block)

Original prompt

This section details the original issue you should resolve

<issue_title>Incorrect "Undecodable characters" error for non-SPF TXT records</issue_title>
<issue_description># Incorrect "Undecodable characters" error for non-SPF TXT records
Hey, just wanna say thanks for a really great library. I happened to stumble over a problem when checking a Norwegian domain for SPF records: it returned an "A TXT record contains undecodable characters." error. I verified the SPF records with mxtoolbox and other tools to check that they are actually valid, and added this debug output to the code:

{'record': None, 'valid': False, 'void_dns_lookups': None, 'error': 'A TXT record contains undecodable characters for domain fossgranlund.no. Record index: 1, TXT record: "SKI OG SJ�FLY AS", Hex representation: 22534b49204f4720534ad8464c5920415322'}
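
The hex representation in the output above can be inspected directly to see why decoding fails. The short sketch below uses the hex string exactly as reported. Reading the bytes as Latin-1 is an assumption made for illustration; the error output does not state the original encoding.

```python
raw = bytes.fromhex("22534b49204f4720534ad8464c5920415322")

# Strict UTF-8 decoding fails: 0xd8 starts a two-byte sequence, but the
# next byte (0x46, "F") is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Lenient decoding substitutes U+FFFD for the offending byte.
replaced = raw.decode("utf-8", errors="replace")

# Assumption: the record was written in a single-byte encoding such as
# Latin-1, where 0xd8 is the letter "Ø".
latin1 = raw.decode("latin-1")

print(utf8_ok)
print(replaced)
print(latin1)
```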

Problem

The library incorrectly raises UndecodableCharactersInTXTRecord when any TXT record contains binary/invalid UTF-8 data, even when that record is not an SPF record.

Reproduction

import checkdmarc

# This domain has a non-SPF TXT record with invalid UTF-8
result = checkdmarc.check_spf("fossgranlund.no")

Error:

UndecodableCharactersInTXTRecord: A TXT record contains undecodable characters.

TXT Records for this domain:

  1. "SKI OG SJØFLY AS" - Contains invalid UTF-8 (NOT an SPF record)
  2. "v=spf1 include:spf.smtp2go.com include:spf.protection.outlook.com -all" - Valid SPF record
  3. "MS=ms31112654" - Valid TXT record

Root Cause

In spf.py lines 439-467, the code checks all TXT records for encoding errors before filtering for SPF records:

for record in answers:
    # Checks ALL records first
    if record.startswith("Undecodable characters"):
        raise UndecodableCharactersInTXTRecord(...)
    
    # Only then filters for SPF records
    if record.strip('"').startswith("v=spf1"):
        spf_txt_records.append(record)

Per RFC 7208 § 4.5, only records beginning with "v=spf1" should be processed. Non-SPF TXT records should be discarded regardless of their content.

Fix

Check whether a record is an SPF record before validating its encoding.

Expected Behavior After Fix

  • fossgranlund.no should return a valid SPF record
  • Non-SPF TXT records with binary data are ignored
  • SPF records with encoding issues still raise appropriate errors
  • Compliant with RFC 7208 § 4.5</issue_description>

<agent_instructions>Consider a fix by treating UndecodableCharactersInTXTRecord as a warning when it is not in an SPF record.</agent_instructions>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits January 8, 2026 15:33
- Modified spf.py to check for undecodable characters and add warning instead of raising exception
- Fixed get_spf_record to preserve warnings from query_spf_record
- Added test case for undecodable characters in non-SPF records

Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>

Changed warning message from assuming it's a non-SPF record to acknowledging
that we cannot determine the record type due to encoding issues.

Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix incorrect undecodable characters error for non-SPF TXT records" to "Fix: Skip undecodable TXT records instead of raising exception during SPF lookup" Jan 8, 2026
Copilot AI requested a review from seanthegeek January 8, 2026 15:40
@seanthegeek seanthegeek marked this pull request as ready for review January 8, 2026 17:14
@seanthegeek

@copilot this PR is causing a unittest to fail, and I'm not quite sure why.


Copilot AI commented Jan 8, 2026

@copilot this PR is causing a unittest to fail, and I'm not quite sure why.

I've investigated the failing testTooManySPFDNSLookups test. This test was already failing on commit a470569 (before my PR changes).

The test is network-dependent and expects >10 DNS lookups, but currently only 9 are being counted. This is because the test relies on actual DNS queries to resolve SPF includes, and the DNS records or network conditions have changed since the test was written.

I verified this by:

  1. Testing on commit a470569 (pre-PR): ❌ FAIL
  2. Testing on commit 34ca8d7 (current): ❌ FAIL

All other tests pass, including my new testUndecodableCharactersInNonSPFRecord test. The failure is unrelated to the changes in this PR.
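
To illustrate why a lookup count that depends on live DNS can drift, here is a rough sketch that counts lookup-triggering SPF terms against a fixed dictionary instead of the network. The function name `count_dns_lookups`, the fixture domains, and the simplified term handling are assumptions for this example and do not reflect how checkdmarc counts lookups. The sketch has no loop protection and ignores macros.

```python
def count_dns_lookups(domain, records):
    """Count SPF terms that trigger DNS lookups, following include/redirect.

    `records` maps a domain name to its SPF record string and stands in
    for real DNS, so the result cannot change when live records change.
    """
    count = 0
    for term in records[domain].split()[1:]:  # skip the "v=spf1" version
        term = term.lstrip("+-~?")  # drop any qualifier
        if term.lower().startswith("redirect="):
            count += 1 + count_dns_lookups(term[len("redirect="):], records)
            continue
        name, _, target = term.partition(":")
        name = name.split("/")[0].lower()  # drop a CIDR suffix such as "a/24"
        if name == "include":
            count += 1 + count_dns_lookups(target, records)
        elif name in ("a", "mx", "ptr", "exists"):
            count += 1
    return count


# Hypothetical fixture data using reserved example names and addresses.
fixture = {
    "example.test": "v=spf1 include:a.test include:b.test mx -all",
    "a.test": "v=spf1 a mx ip4:192.0.2.0/24 -all",
    "b.test": "v=spf1 exists:x.test redirect=c.test",
    "c.test": "v=spf1 ip4:198.51.100.1 -all",
}
print(count_dns_lookups("example.test", fixture))
```

With live DNS, any included domain that trims its record changes the total, which is consistent with the count dropping from more than 10 to 9 as described above.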

@seanthegeek seanthegeek merged commit fe381a3 into main Jan 8, 2026
1 of 2 checks passed
seanthegeek added a commit that referenced this pull request Jan 8, 2026
- Skip undecodable TXT records instead of raising exception during SPF lookup (PR #232 fixes issue #231)