Unit Testing Phase 2.4: Query_Processor Rule Matching Unit Tests #5476

@renecannao

Parent Issue

Part of #5472 — Unit Testing Framework: Milestone 2

Depends On

Why Query Processor

This is the highest-value target for unit testing. Query routing rules are combinatorial — there are 20+ fields per rule (username, schemaname, client_addr, digest, match_pattern, flagIN/flagOUT, etc.) and rules interact via chaining. Testing all meaningful combinations through E2E tests is impractical because each test requires a full proxy + backend setup with specific user/schema/hostgroup configurations.

Unit tests can verify rule matching logic exhaustively, catching subtle bugs in:

  • Regex matching edge cases
  • Rule chaining via flagIN/flagOUT
  • Priority and ordering
  • Negation logic
  • Query rewriting

Scope

Test File

test/tap/tests/unit/query_processor_unit-t.cpp

Technical Approach

process_query() takes a MySQL_Session* parameter for query context. Two possible approaches:

  1. Minimal session stub: Create a lightweight MySQL_Session with just enough state (username, schemaname, client_addr) for rule matching. This may require test_init_minimal() + thread context setup.

  2. Extract matching logic (if approach 1 is too heavy): Create a wrapper that provides the needed context without a full session object.

Phase 2.1's smoke test will inform which approach is viable.
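As a rough illustration of approach 2, the sketch below keeps only the three session fields named above in a plain struct, so the filter check can be exercised without a MySQL_Session. `MatchContext`, `RuleFilter`, and `context_matches` are hypothetical names invented for this sketch and are not part of ProxySQL; the "empty string means no filter" convention is also an assumption.

```cpp
#include <string>

// Hypothetical: the subset of session state that rule matching reads.
struct MatchContext {
    std::string username;
    std::string schemaname;
    std::string client_addr;
};

// Hypothetical rule-side filters. Assumption: an empty string means
// "no filter on this field".
struct RuleFilter {
    std::string username;
    std::string schemaname;
    std::string client_addr;
};

// Returns true when every non-empty filter equals the corresponding
// context field.
inline bool context_matches(const RuleFilter& f, const MatchContext& c) {
    auto field_ok = [](const std::string& want, const std::string& got) {
        return want.empty() || want == got;
    };
    return field_ok(f.username, c.username) &&
           field_ok(f.schemaname, c.schemaname) &&
           field_ok(f.client_addr, c.client_addr);
}
```

A test would build a `MatchContext` by hand for each case (for example user `app`, schema `db1`, address `10.0.0.1`) and check that a filter on `username` alone matches while a filter on a different schema does not.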

Test Cases

Basic rule matching:

  • Create a rule with match_digest → query matching that digest is routed to destination_hostgroup
  • Create a rule with match_pattern → full query text matching
  • Rule with username filter → matches only for that user
  • Rule with schemaname filter → matches only for that schema
  • Rule with client_addr filter → matches only from that IP

Rule fields:

  • active=false → rule is skipped
  • negate_match_pattern=true → matches queries that do NOT match the pattern
  • replace_pattern → verify query is rewritten correctly
  • error_msg → verify error is returned instead of routing
  • OK_msg → verify OK response is returned
  • cache_ttl → verify TTL is set in output
  • cache_empty_result → verify flag propagation
  • reconnect, timeout, retries, delay → verify values in output
  • sticky_conn → verify flag in output
  • multiplex → verify multiplexing control in output
  • log → verify logging flag in output
  • mirror_hostgroup / mirror_flagOUT → verify mirroring config in output
  • comment and attributes fields → verify they don't affect matching
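The negate_match_pattern and replace_pattern cases above could be modelled roughly as follows. This is a simplified stand-in, not ProxySQL code: it uses `std::regex` in place of RE2/PCRE (so group-reference syntax in replacements may differ), `RewriteRule`, `rule_matches`, and `apply_rewrite` are hypothetical names, and replacing only the first occurrence is an assumption made for the sketch.

```cpp
#include <regex>
#include <string>

// Hypothetical, reduced view of a rule: only the fields this sketch needs.
struct RewriteRule {
    std::string match_pattern;
    bool negate_match_pattern;
    std::string replace_pattern;  // assumption: empty means "no rewrite"
};

// True when the rule applies to the query, honouring negation.
inline bool rule_matches(const RewriteRule& r, const std::string& query) {
    const bool hit = std::regex_search(query, std::regex(r.match_pattern));
    return r.negate_match_pattern ? !hit : hit;
}

// Returns the rewritten query, or the original when the rule does not
// apply or has no replace_pattern. Only the first occurrence is replaced.
inline std::string apply_rewrite(const RewriteRule& r, const std::string& query) {
    if (r.replace_pattern.empty() || !rule_matches(r, query)) {
        return query;
    }
    return std::regex_replace(query, std::regex(r.match_pattern),
                              r.replace_pattern,
                              std::regex_constants::format_first_only);
}
```

In the real tests the same assertions would be made against the output of process_query() rather than against these helpers; the sketch only shows the shape of the checks (input query, expected match decision, expected rewritten text).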

Rule evaluation order and chaining:

  • Multiple rules → first match with apply=true stops evaluation
  • apply=false → continues to next matching rule (accumulates effects)
  • flagIN/flagOUT chaining → rule with flagOUT=X feeds into rule with flagIN=X
  • Multi-hop chains: rule A → rule B → rule C via flag chaining
  • Circular chain detection (if any) — flagOUT pointing back to earlier flagIN
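To make the chaining cases concrete, here is a deliberately simplified model of flag-based evaluation. It assumes a single forward pass over rules sorted by rule_id, a starting flag of 0, and that a negative flagOUT leaves the current flag unchanged; the real Query_Processor may differ (for instance in how it handles re-iteration), so `Rule`, `Result`, and `evaluate` here are hypothetical and only illustrate what a multi-hop test would assert.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical, reduced rule: only the fields needed to model chaining.
struct Rule {
    int rule_id;
    int flagIN;
    int flagOUT;                // assumption: -1 keeps the current flag
    bool apply;
    std::string match_pattern;  // assumption: empty matches every query
    int destination_hostgroup;  // assumption: -1 means no override
};

struct Result {
    int destination_hostgroup;
    std::vector<int> matched_rule_ids;
};

// Single forward pass; `rules` is assumed to be sorted by rule_id.
inline Result evaluate(const std::vector<Rule>& rules, const std::string& query) {
    Result out{-1, {}};
    int flag = 0;
    for (const Rule& r : rules) {
        if (r.flagIN != flag) continue;
        if (!r.match_pattern.empty() &&
            !std::regex_search(query, std::regex(r.match_pattern))) continue;
        out.matched_rule_ids.push_back(r.rule_id);
        // apply=false rules still accumulate their effects.
        if (r.destination_hostgroup >= 0) out.destination_hostgroup = r.destination_hostgroup;
        if (r.flagOUT >= 0) flag = r.flagOUT;
        if (r.apply) break;  // first matching rule with apply=true stops evaluation
    }
    return out;
}
```

With rules 1 (flagIN 0, flagOUT 10), 2 (flagIN 10, flagOUT 20), and 3 (flagIN 20, apply=true), a query that matches all three should report the hostgroup of rule 3 and the matched ids 1, 2, 3, which covers the three-level chain. Because this model never revisits earlier rules, a flagOUT that points back to an earlier flagIN cannot loop here; whether the real implementation behaves the same way is exactly what the circular-chain test should establish.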

Regex behavior:

  • RE2 regex syntax
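  • PCRE regex syntax (if PCRE is the configured engine; engine selection is believed to come from the mysql-query_processor_regex variable, while re_modifiers carries per-rule options such as CASELESS)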
  • Case-sensitive vs case-insensitive matching
  • Complex regex patterns (alternation, groups, anchors)
  • Invalid regex → verify graceful handling
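The case-sensitivity and invalid-regex cases can be sketched as below. `std::regex` is used purely as a stand-in for RE2/PCRE, and `try_compile` and `safe_match` are hypothetical helpers; treating an uncompilable pattern as "no match" is one possible definition of graceful handling, not necessarily the one ProxySQL uses. Requires C++17 for `std::optional`.

```cpp
#include <optional>
#include <regex>
#include <string>

// Compiles a pattern; returns std::nullopt instead of throwing when the
// syntax is invalid.
inline std::optional<std::regex> try_compile(const std::string& pattern, bool caseless) {
    std::regex::flag_type flags = std::regex::ECMAScript;
    if (caseless) flags = flags | std::regex::icase;
    try {
        return std::regex(pattern, flags);
    } catch (const std::regex_error&) {
        return std::nullopt;
    }
}

// Assumption for this sketch: a pattern that fails to compile never matches.
inline bool safe_match(const std::string& pattern, const std::string& query, bool caseless) {
    const auto re = try_compile(pattern, caseless);
    return re && std::regex_search(query, *re);
}
```

The test for invalid input then reduces to checking that a pattern such as `(unclosed` produces a defined result (no crash, no exception escaping) for any query.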

Edge cases:

  • Empty ruleset → default behavior (no routing override)
  • 1000+ rules → verify matching stays fast enough to fit within the suite's 5-second budget
  • Rule with all fields set simultaneously
  • Rule with no match criteria (matches everything)
  • Query with special characters (quotes, semicolons, unicode)
  • Very long query text
  • NULL/empty fields in rules
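For the 1000+ rules case, one way to structure the test is to generate the ruleset programmatically, check correctness at both ends of the list, and time a worst-case lookup. The sketch below is only a model: `CompiledRule`, `make_rules`, `first_match`, and `elapsed_ms` are hypothetical, `std::regex` stands in for the real engine, and any concrete time threshold would need to be chosen from measurements rather than from this example.

```cpp
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical pre-compiled rule used only by this sketch.
struct CompiledRule {
    int rule_id;
    std::regex re;
    int destination_hostgroup;
};

// Builds n rules; rule i matches SELECTs whose text ends in "table_<i>".
inline std::vector<CompiledRule> make_rules(int n) {
    std::vector<CompiledRule> rules;
    rules.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        rules.push_back({i + 1,
                         std::regex("^SELECT .* FROM table_" + std::to_string(i) + "$"),
                         i});
    }
    return rules;
}

// Returns the hostgroup of the first matching rule, or -1 if none match.
inline int first_match(const std::vector<CompiledRule>& rules, const std::string& query) {
    for (const auto& r : rules) {
        if (std::regex_search(query, r.re)) return r.destination_hostgroup;
    }
    return -1;
}

// Wall-clock duration of fn() in milliseconds.
template <typename F>
double elapsed_ms(F&& fn) {
    const auto t0 = std::chrono::steady_clock::now();
    fn();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Looking up a query that only the last rule matches exercises the full scan, so timing that call with `elapsed_ms` gives a worst-case figure to compare against whatever budget the test adopts.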

PgSQL equivalents:

  • Replicate the key test cases above for PgSQL_Query_Processor

Acceptance Criteria

  • All test cases pass without a running ProxySQL instance
  • Tests complete in under 5 seconds
  • No memory leaks under ASAN
  • Rule chaining (flagIN/flagOUT) is tested with at least 3 levels
  • Both RE2 and PCRE regex matching are tested
  • Performance test with 1000+ rules completes without degradation
  • Tests cover both MySQL_Query_Processor and PgSQL_Query_Processor
  • Query rewriting (replace_pattern) is verified end-to-end
