feat: add Tornado checker by Ris-1kd · Pull Request #92 · antgroup/YASA-Engine

Ris-1kd · 2025-12-08T09:06:43Z

Summary by Sourcery

Introduce a Python taint analysis checker for Tornado web handlers and supporting utilities to detect and register request-derived data as taint sources and HTTP handler methods as entrypoints.

New Features:

Add a Tornado-specific Python taint checker that discovers Tornado routes, registers handler methods as entrypoints, and tracks request data as tainted sources across files.
Introduce shared Tornado utilities for resolving imports, parsing route definitions, and extracting handler parameters used by the taint checker.
Wire the new Tornado taint checker into the checker configuration so it can be invoked as part of the existing analysis pipeline.

Note

Medium Risk
Adds a new framework-specific checker and expands the Python analyzer to emit checkAtMemberAccess callbacks, which can change taint results and slightly impact analysis performance across Python scans.

Overview
Adds a new Tornado-specific Python taint checker (taint_flow_python_tornado_input) that discovers routes (e.g., Application/RuleRouter/add_handlers and Rule/URLSpec patterns), registers HTTP method handlers (get/post/etc.) as entrypoints, and marks common Tornado request accessors/attributes as PYTHON_INPUT taint sources.

Wires the checker into checker-config.json, both Python checker packs, and the example rule_config_python.json. Updates python-analyzer.ts to invoke CheckerManager.checkAtMemberAccess during processMemberAccess, enabling checkers to react to member-access expressions.

^{Written by Cursor Bugbot for commit 3ec065b. This will update automatically on new commits. Configure here.}

…logic

sourcery-ai · 2025-12-08T09:06:51Z

Reviewer's Guide

Adds a new Tornado-specific Python taint checker and utilities, wires it into the checker configuration, and implements route/handler discovery plus taint source propagation for Tornado request handlers.

Sequence diagram for Tornado route discovery and entrypoint registration

sequenceDiagram
  participant Analyzer
  participant TornadoTaintChecker
  participant AstUtil

  Analyzer->>TornadoTaintChecker: triggerAtCompileUnit(analyzer, scope, node, state, info)
  TornadoTaintChecker->>AstUtil: visit(node, visitors)
  AstUtil-->>TornadoTaintChecker: AssignmentExpression / VariableDeclaration / ClassDefinition
  TornadoTaintChecker->>TornadoTaintChecker: update fileCache for sourcefile
  Analyzer-->>TornadoTaintChecker: return

  Analyzer->>TornadoTaintChecker: triggerAtFuncCallSyntax(analyzer, scope, node, state, info)
  TornadoTaintChecker->>tornado_util: isTornadoCall(node, 'Application' or 'add_handlers')
  tornado_util-->>TornadoTaintChecker: boolean
  alt Tornado Application or add_handlers
    TornadoTaintChecker->>TornadoTaintChecker: collectTornadoEntrypointAndSource(analyzer, scope, state, routeList, fileName)
    TornadoTaintChecker->>TornadoTaintChecker: normalizeRoutes(routeList, fileName)
    loop each route
      TornadoTaintChecker->>tornado_util: parseRoutePair(routeNode)
      tornado_util-->>TornadoTaintChecker: RoutePair
      TornadoTaintChecker->>TornadoTaintChecker: resolveSymbol(handlerName, file)
      alt handler class found
        TornadoTaintChecker->>Analyzer: processInstruction(scope, classAst, state)
        Analyzer-->>TornadoTaintChecker: handlerSymVal (class)
        TornadoTaintChecker->>TornadoTaintChecker: emitHandlerEntrypoints(analyzer, handlerSymVal, urlPattern, classAst, scope, state)
        loop each http method (get, post, ...)
          TornadoTaintChecker->>Analyzer: processInstruction(scope, fdef, state)
          Analyzer-->>TornadoTaintChecker: fclos
          TornadoTaintChecker->>tornado_util: extractParamsFromAst(fdef)
          tornado_util-->>TornadoTaintChecker: ParamMeta[]
          TornadoTaintChecker->>TornadoTaintChecker: register ParamMeta as PYTHON_INPUT in sourceScope
          TornadoTaintChecker->>completeEntryPoint: completeEntryPoint(fclos)
          completeEntryPoint-->>TornadoTaintChecker: entryPoint
          TornadoTaintChecker->>Analyzer: push entryPoint to entryPoints
        end
      end
    end
  end

Class diagram for TornadoTaintChecker and Tornado utility types

classDiagram
  class PythonTaintAbstractChecker {
    <<abstract>>
    +checkerRuleConfigContent any
    +sourceScope any
    +getCheckerId() string
    +addSourceTagForSourceScope(kind string, sourceScope any) void
    +addSourceTagForcheckerRuleConfigContent(kind string, checkerRuleConfigContent any) void
    +triggerAtIdentifier(analyzer any, scope any, node any, state any, info any) void
    +triggerAtFunctionCallAfter(analyzer any, scope any, node any, state any, info any) void
    +checkByNameMatch(node any, fclos any, argvalues any) void
  }

  class TornadoTaintChecker {
    -fileCache Map~string, FileCache~
    +TornadoTaintChecker(resultManager any)
    -markAsTainted(value any) void
    +triggerAtStartOfAnalyze(analyzer any, scope any, node any, state any, info any) void
    +triggerAtCompileUnit(analyzer any, scope any, node any, state any, info any) bool
    +triggerAtFuncCallSyntax(analyzer any, scope any, node any, state any, info any) bool
    +triggerAtIdentifier(analyzer any, scope any, node any, state any, info any) void
    +checkByNameMatch(node any, fclos any, argvalues any) void
    +triggerAtFunctionCallAfter(analyzer any, scope any, node any, state any, info any) void
    +triggerAtMemberAccess(analyzer any, scope any, node any, state any, info any) void
    -resolveSymbol(name string, currentFile string) any
    -normalizeRoutes(node any, currentFile string) RoutePair[]
    -collectTornadoEntrypointAndSource(analyzer any, scope any, state any, routeList any, currentFile string) void
    -emitHandlerEntrypoints(analyzer any, handlerSymVal any, urlPattern string, classAst any, scope any, state any) void
    -buildClassSymbol(classNode any) any
  }

  class ImportSymbol {
    +file string
    +originalName string
  }

  class RoutePair {
    +path string
    +handlerName string
    +file string
  }

  class FileCache {
    +vars Map~string, any~
    +classes Map~string, any~
    +importedSymbols Map~string, ImportSymbol~
  }

  class ParamMeta {
    +name string
    +locStart number or 'all'
    +locEnd number or 'all'
  }

  class tornado_util {
    +tornadoSourceAPIs Set~string~
    +passthroughFuncs Set~string~
    +isRequestAttributeAccess(node any) boolean
    +isRequestAttributeExpression(expr any) boolean
    +isTornadoCall(node any, targetName string) boolean
    +parseRoutePair(route any) RoutePair
    +resolveImportPath(modulePath string, currentFile string) string
    +extractImportEntries(stmt any) Array~any~
    +extractParamsFromAst(funcNode any) ParamMeta[]
  }

  TornadoTaintChecker --|> PythonTaintAbstractChecker

  TornadoTaintChecker ..> FileCache : uses
  TornadoTaintChecker ..> RoutePair : uses
  TornadoTaintChecker ..> tornado_util : uses helpers
  TornadoTaintChecker ..> ParamMeta : uses
  FileCache ..> ImportSymbol : contains

File-Level Changes

Change	Details	Files
Introduce Tornado-specific Python taint checker implementing Tornado route discovery and entrypoint/source registration.	Create TornadoTaintChecker class extending PythonTaintAbstractChecker with checker id taint_flow_python_tornado_input. Implement trigger hooks (start, compile unit, function call, identifier, member access) to reload rule config, collect routes from Application/add_handlers calls, and tag Tornado request data as PYTHON_INPUT sources. Build per-file caches of variables, classes, and imported symbols to resolve handler classes across files and normalize complex route lists into RoutePair structures. Emit entrypoints for HTTP methods on handler classes and register their non-self parameters as PYTHON_INPUT sources, integrating with completeEntryPoint and analyzer.entryPoints. Extend sink matching to support partial function-name matches when applying FuncCallTaintSink rules.	`src/checker/taint/python/tornado-taint-checker.ts`
Add shared Tornado utility module for route parsing, import resolution, and parameter metadata extraction.	Define RoutePair, FileCache, ImportSymbol, and ParamMeta interfaces for Tornado analysis. Implement helpers to recognize Tornado calls, parse route tuples/url() helpers into RoutePair, and detect self.request.* attribute/expressions. Provide resolveImportPath and extractImportEntries helpers to resolve relative module paths and imported symbols. Implement extractParamsFromAst to normalize function parameter metadata with location info.	`src/checker/taint/python/tornado-util.ts`
Wire Tornado checker into existing checker configuration.	Register the new Tornado taint checker in checker-config.json and checker-pack-config.json so it can be enabled via rules. Ensure ruleConfig reloading and checkerIds matching can pick up taint_flow_python_tornado_input rules at analysis start.	`resource/checker/checker-config.json` `resource/checker/checker-pack-config.json`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

gemini-code-assist · 2025-12-08T09:07:01Z

Summary of Changes

Hello @Ris-1kd, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new taint analysis checker specifically designed for Python applications built with the Tornado web framework. The primary goal is to enhance the security analysis capabilities for Tornado projects by accurately identifying application entry points and marking various forms of user-controlled input as potential sources of taint. It also incorporates advanced features like cross-file symbol resolution and intelligent taint propagation through common utility functions, leading to more precise and comprehensive security vulnerability detection.

Highlights

New Tornado Taint Checker: A dedicated taint analysis checker has been added for Python applications utilizing the Tornado web framework, enabling specialized security analysis for these projects.
Automated Entry Point Detection: The checker automatically identifies Tornado route handlers, such as those defined via Application() and add_handlers(), as entry points for analysis.
Comprehensive Source Identification: Various forms of user input in Tornado, including request arguments (get_argument, get_query_argument), request body, query parameters, headers, and cookies (self.request.body, self.request.query), are now marked as PYTHON_INPUT sources.
Cross-File Symbol Resolution: A new file caching mechanism is implemented to resolve variables, classes, and imported symbols across different Python files, significantly improving the accuracy of taint flow analysis.
Taint Propagation through Passthrough Functions: The checker includes logic to correctly propagate taint through common string manipulation and utility functions (e.g., decode, strip), ensuring that sanitized inputs are not prematurely untainted.
Configuration Integration: The new Tornado checker has been integrated into the existing checker configuration and relevant checker packs, making it readily available for use.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

The logic that detects self.request.<attr> as a source is duplicated between triggerAtMemberAccess and isRequestAttributeAccess/isRequestAttributeExpression in tornado-util.ts; consider reusing the utility helpers in the checker to keep behavior consistent and reduce maintenance overhead.
In triggerAtStartOfAnalyze you re-parse process.argv and reload the rule config on every analyze-start; if this hook is called frequently, it may be worth caching the resolved ruleConfigFile and loaded rule data to avoid repeated file I/O and argument parsing.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The logic that detects `self.request.<attr>` as a source is duplicated between `triggerAtMemberAccess` and `isRequestAttributeAccess`/`isRequestAttributeExpression` in `tornado-util.ts`; consider reusing the utility helpers in the checker to keep behavior consistent and reduce maintenance overhead.
- In `triggerAtStartOfAnalyze` you re-parse `process.argv` and reload the rule config on every analyze-start; if this hook is called frequently, it may be worth caching the resolved `ruleConfigFile` and loaded rule data to avoid repeated file I/O and argument parsing.

## Individual Comments

### Comment 1
<location> `src/checker/taint/python/tornado-taint-checker.ts:360-361` </location>
<code_context>
+    }
+
+    // 检查是否是 tornado source API 调用（如 get_argument）
+    if (funcName && tornadoSourceAPIs.has(funcName)) {
+      this.markAsTainted(ret)
+    }
+
</code_context>

<issue_to_address>
**issue (bug_risk):** Treating any get_argument-like call as a source ignores receiver context and may create false positives.

The check currently only matches on `funcName`, so calls like `some_other_obj.get_argument()` (or any monkey‑patched function named `get_argument`) will be treated as taint sources even when unrelated to Tornado handlers. To avoid these false positives, also validate the receiver (e.g., ensure it’s `self`, a known handler type, or otherwise consistent with Tornado request handlers) before marking the return value as tainted.
</issue_to_address>

### Comment 2
<location> `src/checker/taint/python/tornado-taint-checker.ts:371-361` </location>
<code_context>
+      // node.callee.object.type = 'MemberAccess' (body)
+      // node.callee.object.object.type = 'MemberAccess' (request)
+      // node.callee.object.object.object.name = 'self'
+      if (node.callee?.type === 'MemberAccess' && node.callee.object) {
+        const bodyNode = node.callee.object
+        if (
+          bodyNode.type === 'MemberAccess' &&
+          bodyNode.property?.name === 'body' &&
+          bodyNode.object?.type === 'MemberAccess' &&
+          bodyNode.object.property?.name === 'request' &&
+          bodyNode.object.object?.name === 'self'
+        ) {
+          // 直接标记返回值为 source（因为 self.request.body 是 source）
+          this.markAsTainted(ret)
+          return // 已经标记，不需要再检查 receiver
+        }
</code_context>

<issue_to_address>
**suggestion:** The special-case for self.request.body duplicates the logic in isRequestAttributeExpression and is brittle.

Since `tornado-util.ts` already provides `isRequestAttributeExpression` for recognizing Tornado request attributes (including call-expression forms), prefer using that helper here instead of duplicating the AST pattern for `self.request.body`. Centralizing this logic will keep it consistent and easier to evolve.
</issue_to_address>

### Comment 3
<location> `src/checker/taint/python/tornado-taint-checker.ts:268-269` </location>
<code_context>
+            valEnd === 'all' &&
+            val.scopeFile !== 'all' &&
+            val.scopeFunc === 'all' &&
+            typeof node.loc?.sourcefile === 'string' &&
+            node.loc.sourcefile.includes(val.scopeFile)
+          ) {
+            shouldMark = true
</code_context>

<issue_to_address>
**issue (bug_risk):** Using substring matching for scopeFile may accidentally match unrelated files.

Because `node.loc.sourcefile.includes(val.scopeFile)` does a raw substring match, a rule scoped to `user.py` will also trigger for `old_user.py`, `user.py.bak`, or any path containing that text. If `scopeFile` represents a file path, consider normalizing both paths and comparing full path segments (or using a suffix match that respects path separators) to avoid unintended matches.
</issue_to_address>

### Comment 4
<location> `src/checker/taint/python/tornado-util.ts:120-121` </location>
<code_context>
+    const { callee } = route
+    const isUrlHelper =
+      (callee.type === 'Identifier' && callee.name === 'url') ||
+      (callee.type === 'MemberAccess' &&
+        AstUtil.prettyPrint(callee).includes('url'))
+    if (isUrlHelper && Array.isArray(route.arguments)) {
+      const [first, second] = route.arguments
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Detecting url-helper calls via prettyPrint string search is fragile and potentially slow.

Using `AstUtil.prettyPrint(callee).includes('url')` risks false positives (any member chain whose printed form contains "url") and adds unnecessary work in large, route-heavy files. Prefer matching specific AST node shapes and names (e.g., `Identifier`/`MemberAccess` paths for known helpers like `some_module.url` or `tornado.web.url`) instead of searching the pretty-printed text.

Suggested implementation:

```typescript
  } else if (route.type === 'CallExpression' && route.callee) {
    const { callee } = route

    const isIdentifierUrlHelper =
      callee.type === 'Identifier' && callee.name === 'url'

    const isMemberAccessUrlHelper =
      callee.type === 'MemberAccess' &&
      (
        // e.g. something.url(...)
        (callee.member &&
          callee.member.type === 'Identifier' &&
          callee.member.name === 'url') ||
        // or if the AST uses a `property` field instead of `member`
        (callee.property &&
          callee.property.type === 'Identifier' &&
          callee.property.name === 'url')
      )

    const isUrlHelper = isIdentifierUrlHelper || isMemberAccessUrlHelper

    if (isUrlHelper && Array.isArray(route.arguments)) {
      const [first, second] = route.arguments
      pathExpr = first
      handlerNode = second
    }
  }

```

If your AST representation for member access uses different field names or nesting (for example, `callee.attr` or nested `MemberAccess` chains for `tornado.web.url`), you should adjust the `isMemberAccessUrlHelper` condition accordingly to:
1. Refer to the correct field(s) holding the final attribute/identifier node.
2. Optionally check the full member chain (e.g., that the object path matches known helpers such as `tornado.web`), if you want stricter matching.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-12-08T09:08:33Z

src/checker/taint/python/tornado-taint-checker.ts

+
+    // 检查是否是 tornado source API 调用（如 get_argument）
+    if (funcName && tornadoSourceAPIs.has(funcName)) {
+      this.markAsTainted(ret)


suggestion: The special-case for self.request.body duplicates the logic in isRequestAttributeExpression and is brittle.

Since tornado-util.ts already provides isRequestAttributeExpression for recognizing Tornado request attributes (including call-expression forms), prefer using that helper here instead of duplicating the AST pattern for self.request.body. Centralizing this logic will keep it consistent and easier to evolve.

sourcery-ai · 2025-12-08T09:08:33Z

src/checker/taint/python/tornado-taint-checker.ts

+            typeof node.loc?.sourcefile === 'string' &&
+            node.loc.sourcefile.includes(val.scopeFile)


issue (bug_risk): Using substring matching for scopeFile may accidentally match unrelated files.

Because node.loc.sourcefile.includes(val.scopeFile) does a raw substring match, a rule scoped to user.py will also trigger for old_user.py, user.py.bak, or any path containing that text. If scopeFile represents a file path, consider normalizing both paths and comparing full path segments (or using a suffix match that respects path separators) to avoid unintended matches.

sourcery-ai · 2025-12-08T09:08:33Z

src/checker/taint/python/tornado-util.ts

+      (callee.type === 'MemberAccess' &&
+        AstUtil.prettyPrint(callee).includes('url'))


suggestion (bug_risk): Detecting url-helper calls via prettyPrint string search is fragile and potentially slow.

Using AstUtil.prettyPrint(callee).includes('url') risks false positives (any member chain whose printed form contains "url") and adds unnecessary work in large, route-heavy files. Prefer matching specific AST node shapes and names (e.g., Identifier/MemberAccess paths for known helpers like some_module.url or tornado.web.url) instead of searching the pretty-printed text.

Suggested implementation:

} else if (route.type === 'CallExpression' && route.callee) { const { callee } = route const isIdentifierUrlHelper = callee.type === 'Identifier' && callee.name === 'url' const isMemberAccessUrlHelper = callee.type === 'MemberAccess' && ( // e.g. something.url(...) (callee.member && callee.member.type === 'Identifier' && callee.member.name === 'url') || // or if the AST uses a `property` field instead of `member` (callee.property && callee.property.type === 'Identifier' && callee.property.name === 'url') ) const isUrlHelper = isIdentifierUrlHelper || isMemberAccessUrlHelper if (isUrlHelper && Array.isArray(route.arguments)) { const [first, second] = route.arguments pathExpr = first handlerNode = second } }

If your AST representation for member access uses different field names or nesting (for example, callee.attr or nested MemberAccess chains for tornado.web.url), you should adjust the isMemberAccessUrlHelper condition accordingly to:

Refer to the correct field(s) holding the final attribute/identifier node.

Optionally check the full member chain (e.g., that the object path matches known helpers such as tornado.web), if you want stricter matching.

gemini-code-assist

Code Review

This pull request introduces a new taint analysis checker for the Python Tornado framework. The implementation is comprehensive, covering route discovery, entry point registration, and taint source tracking from request data. My review focuses on improving cross-file symbol resolution, code structure, and configuration handling. I've identified a high-severity issue with how absolute imports are resolved, which could impact the checker's effectiveness. I've also included a couple of medium-severity suggestions to improve maintainability and performance.

gemini-code-assist · 2025-12-08T09:09:59Z

src/checker/taint/python/tornado-util.ts

+export function resolveImportPath(
+  modulePath: string,
+  currentFile: string,
+): string | null {
+  if (!modulePath) return null
+  const currentDir = path.dirname(currentFile)
+  const leadingDots = modulePath.match(/^\.+/)?.[0] ?? ''
+  let baseDir = currentDir
+  if (leadingDots.length > 0) {
+    baseDir = path.resolve(currentDir, '../'.repeat(leadingDots.length - 1))
+  }
+  const remainder = modulePath.slice(leadingDots.length)
+  const normalized = remainder ? remainder.split('.').join(path.sep) : ''
+  const resolved = normalized ? path.resolve(baseDir, normalized) : baseDir
+  return `${resolved}.py`
+}


The current implementation of resolveImportPath does not correctly handle absolute Python imports. It treats them as relative to the current file's directory, which is incorrect. Absolute imports should be resolved from the project's source root(s). This can lead to failures in resolving symbols and incomplete taint analysis.

I suggest modifying the function to accept an optional mainDir (project root) and use it as the base for absolute imports.

Additionally, the current implementation doesn't handle package imports (directories with __init__.py). While the suggested change doesn't fully solve that, it's a step in the right direction. You might consider enhancing it further to check for __init__.py files.

You will also need to update the call site in tornado-taint-checker.ts to pass Config.maindir.
For example, at src/checker/taint/python/tornado-taint-checker.ts:148:

const resolved = resolveImportPath(modulePath, fileName, Config.maindir);

export function resolveImportPath( modulePath: string, currentFile: string, mainDir?: string, ): string | null { if (!modulePath) return null const currentDir = path.dirname(currentFile) const leadingDots = modulePath.match(/^\.+/)?.[0] ?? '' let baseDir: string if (leadingDots.length > 0) { baseDir = path.resolve(currentDir, '../'.repeat(leadingDots.length - 1)) } else if (mainDir) { baseDir = mainDir } else { // Fallback for absolute imports when mainDir is not provided. // This is the original behavior and is likely incorrect. baseDir = currentDir } const remainder = modulePath.slice(leadingDots.length) const normalized = remainder ? remainder.split('.').join(path.sep) : '' const resolved = normalized ? path.resolve(baseDir, normalized) : baseDir return `${resolved}.py` }

gemini-code-assist · 2025-12-08T09:09:59Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!ruleConfigFile || ruleConfigFile === '') {
+      const args = process.argv
+      const ruleConfigIndex = args.indexOf('--ruleConfigFile')
+      if (ruleConfigIndex >= 0 && ruleConfigIndex < args.length - 1) {
+        ruleConfigFile = args[ruleConfigIndex + 1]
+        const path = require('path')
+        ruleConfigFile = path.isAbsolute(ruleConfigFile)
+          ? ruleConfigFile
+          : path.resolve(process.cwd(), ruleConfigFile)
+      }
+    }


The logic to load the ruleConfigFile by parsing process.argv is fragile and considered a code smell. It indicates a potential issue with configuration management or initialization order, as the comment on line 62 suggests. This logic appears to be duplicated across different checkers and makes them less reliable and harder to test.

It would be better to refactor the configuration handling to ensure that Config.ruleConfigFile is fully resolved and available before the checker is instantiated, removing the need for this manual argument parsing.

gemini-code-assist · 2025-12-08T09:09:59Z

src/checker/taint/python/tornado-taint-checker.ts

+        }
+        // 确保 finalEp 有 filePath
+        if (!finalEp.filePath && finalEp.fdef?.loc?.sourcefile) {
+          const FileUtil = require('../../../util/file-util')


There are several require calls inside methods (e.g., triggerAtStartOfAnalyze, emitHandlerEntrypoints). This is inefficient as the module will be re-evaluated on every call (or at least checked in the cache), and it's generally better for readability and maintainability to have all imports at the top of the file.

Please move all require calls to the top of the file. This applies to:

require('../../common/rules-basic-handler') on line 63

require('path') on line 71

require('../../../util/file-util') on line 80 and 683

require('../../../util/common-util') on line 94

CrackTC · 2025-12-09T10:46:13Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!value) return
+    if (!value._tags) {
+      value._tags = new Set()
+    }
+    value._tags.add('PYTHON_INPUT')
+    value.hasTagRec = true


Suggested change

if (!value) return

if (!value._tags) {

value._tags = new Set()

}

value._tags.add('PYTHON_INPUT')

value.hasTagRec = true

IntroduceTaint.setTaint(value, 'PYTHON_INPUT')

This logic already exists in source-util.ts

CrackTC · 2025-12-09T11:00:55Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param info
+   */
+  triggerAtStartOfAnalyze(analyzer: any, scope: any, node: any, state: any, info: any): void {
+    // 重新加载规则配置（因为可能在构造函数时还没有设置 ruleConfigFile）


I think this assert is not true.
Since TornadoTaintChecker extends Checker, it will load rule config to this.checkerRuleConfigContent in constructor

YASA-Engine/src/checker/common/checker.ts

Lines 22 to 28 in 48be5b2

constructor(resultManager: any, checkerId: any) {

this.checkerId = checkerId

this.resultManager = resultManager

this.checkerRuleConfigContent = {}

initRules()

this.loadRuleConfig(this)

}

At the point where triggerAtStartOfAnalyze is called, the constructor must have been called, so IMO the code loading ruleConfig again in this function is redundant

CrackTC · 2025-12-09T11:16:01Z

src/checker/taint/python/tornado-taint-checker.ts

+    routeList: any,
+    currentFile: string
+  ) {
+    const processed = new Set<string>()


this should be a member of class instead of a local variable of a function i think

CrackTC · 2025-12-09T11:30:42Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param state
+   * @param _info
+   */
+  triggerAtFuncCallSyntax(analyzer: any, scope: any, node: any, state: any, _info: any): boolean | undefined {


I would favor a Symbol-Interpret approach instead of AST-based approach. Here we can use triggerAtFunctionCallBefore and get the symbol value of arguments (especially the routeList) without any performance tradeoff, and with the symbol value of routeList, further processing will be more accurate. For example, when the routeList itself is actually retrieved from certain function call, current normalizeRoutes implementation cannot handle this situation properly as it doesn't take CallExpression into consideration, but with symbol value of routeList, we can get the "real" route object regardless of how it is retrieved from whatever expression type.

CrackTC · 2025-12-09T11:34:44Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param _info
+   */
+  triggerAtCompileUnit(analyzer: any, scope: any, node: any, _state: any, _info: any): boolean | undefined {
+    const fileName = node.loc?.sourcefile


Suggested change

const fileName = node.loc?.sourcefile

if (config.entryPointMode === 'ONLY_CUSTOM') return

const fileName = node.loc?.sourcefile

CrackTC · 2025-12-09T11:45:27Z

src/checker/taint/python/tornado-taint-checker.ts

+            if (!res._tags) {
+              res._tags = new Set()
+            }
+            res._tags.add('PYTHON_INPUT')
+            res.hasTagRec = true


Suggested change

if (!res._tags) {

res._tags = new Set()

}

res._tags.add('PYTHON_INPUT')

res.hasTagRec = true

this.markAsTainted(res)

CrackTC · 2025-12-09T11:48:37Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param state
+   * @param info
+   */
+  triggerAtIdentifier(analyzer: any, scope: any, node: any, state: any, info: any): void {


Can you clarify what does this hook do?

CrackTC · 2025-12-09T11:51:24Z

src/checker/taint/python/tornado-taint-checker.ts

+   */
+  triggerAtFunctionCallAfter(analyzer: any, scope: any, node: any, state: any, info: any): void {
+    // 先调用基类方法处理规则配置中的 source
+    super.triggerAtFunctionCallAfter(analyzer, scope, node, state, info)


Suggested change

super.triggerAtFunctionCallAfter(analyzer, scope, node, state, info)

super.triggerAtFunctionCallAfter(analyzer, scope, node, state, info)

if (config.entryPointMode === 'ONLY_CUSTOM') return

CrackTC · 2025-12-09T11:51:52Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param info
+   */
+  triggerAtMemberAccess(analyzer: any, scope: any, node: any, state: any, info: any): void {
+    const { res } = info


Suggested change

const { res } = info

if (config.entryPointMode === 'ONLY_CUSTOM') return

const { res } = info

CrackTC · 2025-12-09T11:56:51Z

src/checker/taint/python/tornado-util.ts

+  'get_argument',
+  'get_query_argument',
+  'get_body_argument',
+  'get_query_arguments',
+  'get_body_arguments',
+  'get_cookie',
+  'get_secure_cookie',


Suggested change

'get_argument',

'get_query_argument',

'get_body_argument',

'get_query_arguments',

'get_body_arguments',

'get_cookie',

'get_secure_cookie',

'get_argument',

'get_query_argument',

'get_body_argument',

'get_query_arguments',

'get_body_arguments',

'get_cookie',

'get_secure_cookie',

'get_arguments',

'get_json_body'

This is not resolved yet

src/checker/common/rules-basic-handler.ts

src/engine/analyzer/common/analyzer.ts

src/checker/taint/python/tornado-taint-checker.ts

src/checker/common/rules-basic-handler.ts

CrackTC · 2025-12-22T05:34:01Z

src/checker/taint/python/tornado-util.ts

+  'get_argument',
+  'get_query_argument',
+  'get_body_argument',
+  'get_query_arguments',
+  'get_body_arguments',
+  'get_cookie',
+  'get_secure_cookie',


This is not resolved yet

src/engine/analyzer/common/analyzer.ts

CrackTC · 2025-12-22T05:53:08Z

src/engine/analyzer/python/common/python-analyzer.ts

            entryPoint.entryPointSymVal?.parent
          )
        } catch (e) {
+          console.error(`[DEBUG] Error executing entrypoint [${i}]: ${e}`)


We should inspect the error log for error details instead of printing the error directly onto the console

CrackTC · 2025-12-22T06:26:22Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param _info
+   */
+  triggerAtFuncCallSyntax(analyzer: any, scope: any, node: any, state: any, _info: any): boolean | undefined {
+    const fileName = node.loc?.sourcefile


not resolved yet

CrackTC · 2025-12-22T06:28:02Z

src/checker/taint/python/tornado-taint-checker.ts

+
+    if (isApp) {
+      // Application(...) -> first arg is routes
+      ;[routeListArgValue] = argvalues


redundant ;

CrackTC · 2025-12-22T06:28:14Z

src/checker/taint/python/tornado-taint-checker.ts

+      ;[routeListArgValue] = argvalues
+    } else if (isAddHandlers) {
+      // add_handlers(host, routes) -> second arg is routes
+      ;[, routeListArgValue] = argvalues


redundant ;

CrackTC · 2025-12-22T06:37:24Z

src/checker/taint/python/tornado-taint-checker.ts

-   * Flatten route lists (handles BinaryExpression +)
-   * @param node
-   * @param currentFile
+   * Extract route pairs from resolved symbol values (from argvalues)


the logic is too complex, I think several member access should be enough...

cursor · 2025-12-22T11:03:59Z

src/checker/taint/python/tornado-taint-checker.ts

+      if (receiver && (receiver.taint || receiver.hasTagRec || receiver._tags?.has('PYTHON_INPUT'))) {
+        this.markAsTainted(ret)
+      }
+    }


Missing ONLY_CUSTOM check in triggerAtFunctionCallAfter

The triggerAtFunctionCallAfter method is missing the Config.entryPointMode === 'ONLY_CUSTOM' check that exists in both triggerAtCompileUnit (line 164) and triggerAtFunctionCallBefore (line 236). This inconsistency means that when entryPointMode is set to ONLY_CUSTOM, the Tornado-specific source API detection (like get_argument, get_cookie) and passthrough function handling will still execute, which is likely unintended behavior. The check should be added after the super call but before processing the fclos and ret values.

cursor · 2025-12-22T11:03:59Z

src/checker/taint/python/tornado-taint-checker.ts

+    const httpMethods = new Set(['get', 'post', 'put', 'delete', 'patch', 'head', 'options'])
+    const entrypoints = Object.entries(handlerSymVal.value)
+      .filter(([key, value]: [string, any]) => httpMethods.has(key) && value.vtype === 'fclos')
+      .map(([, value]: [string, any]) => value)


Potential crash when handlerSymVal.value is undefined

The emitHandlerEntrypoints method calls Object.entries(handlerSymVal.value) after only checking that handlerSymVal.vtype === 'class'. When pair.handlerSymVal with vtype === 'class' is used directly (lines 315-317) without going through processHandlerClass or buildClassSymbol, the value property may be undefined, causing Object.entries(undefined) to throw a TypeError. The code ensures handlerSymVal.field exists (line 341-343) but not handlerSymVal.value.

Additional Locations (1)

src/checker/taint/python/tornado-taint-checker.ts#L314-L317

cursor · 2025-12-22T11:03:59Z

src/checker/taint/python/tornado-taint-checker.ts

+    // 如果有匹配的规则，调用基类方法处理
+    if (matchedRule) {
+      super.checkByNameMatch(node, fclos, argvalues)
+    }


Partial matching in checkByNameMatch is ineffective

The checkByNameMatch override is documented to "support partial matching (e.g., os.system matches syslib_from.os.system)" but this feature doesn't work. The method uses string-based partial matching (callFull.endsWith('.os.system')) to find a matching rule, then delegates to super.checkByNameMatch which uses AST-based matchField that requires exact path matching. When a partial match is found, super's AST matching will fail because it doesn't support partial matching - for a call like mymodule.os.system with rule fsig: "os.system", the AST matcher walks the full path and fails when encountering the extra prefix. This results in partial matches never producing findings, defeating the stated purpose of the override.

cursor · 2025-12-22T11:03:59Z

src/engine/analyzer/python/common/python-analyzer.ts

+          }
+        }
+        // Include parent class name in key to distinguish handlers with same method name
+        const parentName = entryPoint?.entryPointSymVal?.parent?.id || entryPoint?.entryPointSymVal?.parent?.name || ''


parentName may serialize to [object Object] in key

The parentName extraction uses parent?.id || parent?.name directly, but in the AST these properties are often objects with a name sub-property (e.g., {type: 'Identifier', name: 'MyClass'}), not strings. When parent.id is an object, it's truthy so used as-is, and when concatenated into entryKey it becomes [object Object]. This causes different handler classes to produce identical keys, making the deduplication logic skip valid entrypoints. The params serialization nearby correctly uses p?.id?.name || p?.id pattern, but parentName doesn't follow this pattern.

cursor · 2025-12-22T11:26:51Z

src/checker/taint/python/tornado-taint-checker.ts

+        this.markAsTainted(ret)
+      }
+    }
+  }


Missing ONLY_CUSTOM mode check in triggerAtFunctionCallAfter

The triggerAtFunctionCallAfter method lacks the if (Config.entryPointMode === 'ONLY_CUSTOM') return check that exists in triggerAtCompileUnit, triggerAtFunctionCallBefore, and triggerAtMemberAccess. This inconsistency means tornado-specific taint marking (for APIs like get_argument and passthrough functions) will still run in ONLY_CUSTOM mode, while all other tornado-specific behaviors are disabled. The PR discussion explicitly flagged this as "not resolved yet".

cursor · 2025-12-27T06:50:55Z

src/checker/taint/python/tornado-taint-checker.ts

+    const { fclos, ret } = info
+    if (!fclos || !ret) {
+      return
+    }


Missing ONLY_CUSTOM mode check in triggerAtFunctionCallAfter

The triggerAtFunctionCallAfter method is missing the Config.entryPointMode === 'ONLY_CUSTOM' guard that exists in triggerAtCompileUnit, triggerAtFunctionCallBefore, and triggerAtMemberAccess. This inconsistency means when using ONLY_CUSTOM mode, the custom taint source marking logic (for tornadoSourceAPIs and passthroughFuncs) will still execute, which may cause unintended behavior. The PR reviewer noted this suggestion was "not resolved yet".

cursor · 2025-12-27T06:50:55Z

src/engine/analyzer/python/common/python-analyzer.ts

+          }
+        }
+        // Include parent class name in key to distinguish handlers with same method name
+        const parentName = entryPoint?.entryPointSymVal?.parent?.id || entryPoint?.entryPointSymVal?.parent?.name || ''


Parent name extraction may produce [object Object] in key

The parentName extraction accesses parent.id or parent.name directly, but these are typically AST node objects (with nested .name properties), not strings. If parent.id is an object, it will be truthy and used as parentName, becoming [object Object] in the template string at line 149. This defeats the purpose of distinguishing handlers by class name in the deduplication key, potentially causing different handlers from different classes to be incorrectly deduplicated. The code pattern at line 215 shows the correct approach: ast?.id?.name. This should use parent?._sid or parent?._qid or drill down to parent?.ast?.id?.name.

cursor · 2026-01-13T02:43:46Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param path
+   */
+  private registerEntryPoints(analyzer: any, cls: any, path: string) {
+    const methods = ['get', 'post', 'put', 'delete', 'patch']


Missing HTTP methods compared to Django checker

Low Severity

The methods array is missing head and options HTTP methods, which are included in the analogous Django checker (['get', 'post', 'put', 'delete', 'patch', 'head', 'options']). Tornado's RequestHandler supports these methods, and if a handler implements a custom head or options method that processes user input, it won't be registered as an entry point. This could result in missed taint flows for handlers using these less common but valid HTTP methods.

cursor · 2026-01-13T02:43:46Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!h) return
+
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') {


Crash when union handler has empty value array

Medium Severity

In finishRoute, after the assignment h = h.value[0], if the union's value array is empty, h becomes undefined. The immediately following line accesses h.vtype without re-checking if h is defined, which would throw a TypeError: Cannot read property 'vtype' of undefined. The early return check if (!h) return only happens before the reassignment, not after.

cursor · 2026-01-13T03:41:06Z

src/checker/taint/python/tornado-taint-checker.ts

+  private finishRoute(analyzer: any, scope: any, state: any, h: any, path: string) {
+    if (!h) return
+
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]


Union handler only processes first element, missing others

Medium Severity

When a handler is a union type (e.g., from conditional assignment like handler = A if cond else B), finishRoute only processes the first element via h = h.value[0], discarding all other potential handlers. This causes two problems: entry points from other handlers in the union are never registered, and if union element ordering varies between analysis runs, different handlers get processed, leading to non-deterministic results where different bugs are discovered on subsequent runs.

cursor · 2026-01-13T03:41:06Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) {
+      markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' })
+    }
+  }


Missing null check causes crash on unresolved member access

Medium Severity

The triggerAtMemberAccess method calls markTaintSource(info.res, ...) without checking if info.res is defined. When member access resolution fails (common in static analysis for unresolved references), info.res is undefined. The markTaintSource function then calls setTaint which attempts res._tags = res._tags || new Set(), throwing a TypeError. Unlike triggerAtFunctionCallAfter which guards with if (!ret) return, this method lacks the equivalent if (!info.res) return check.

cursor · 2026-01-13T09:36:01Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) {
+      markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' })
+    }
+  }


Missing null check before marking taint source

Medium Severity

The triggerAtMemberAccess method calls markTaintSource(info.res, ...) without checking if info.res is defined. Looking at markTaintSource in source-util.ts, it calls setTaint(unit, kind) which attempts to set res._tags = res._tags || new Set(). If info.res is undefined (when getMemberValue returns undefined), this will throw a TypeError. The pattern in triggerAtFunctionCallAfter correctly guards against this with if (!ret) return, but this guard is missing here.

cursor · 2026-01-13T09:36:01Z

src/checker/taint/python/tornado-taint-checker.ts

+        if (rules) {
+          this.processRoutes(analyzer, scope, state, rules)
+          return
+        }


Rule objects incorrectly processed as route containers

Medium Severity

The code treats Rule and RuleRouter objects identically when extracting routes at lines 92-98. For RuleRouter(rules), extracting val.value['0'] correctly gets the rules list. However, for Rule(pattern, handler), val.value['0'] is the URL pattern, not a list of routes. When a Rule object is processed, the pattern is incorrectly passed to processRoutes recursively, and the function returns early without ever extracting the handler from val.value['1']. This causes routes defined via Rule objects to fail registration because the handler is never processed.

cursor · 2026-01-13T09:36:01Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!h) return
+
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') {


Missing null check after extracting from union array

Medium Severity

After line 193 extracts h.value[0] from a union with an array value, h becomes undefined if the array is empty. The subsequent line 194 immediately accesses h.vtype without re-checking if h is defined, which would throw a TypeError. The guard at line 202 (if (path && h)) comes too late - the crash occurs before reaching it. This would cause analysis to fail when encountering a union type with an empty value array.

cursor · 2026-01-13T09:36:01Z

src/checker/taint/python/tornado-util.ts

+  const names = [targetName]
+  if (targetName === 'Application') {
+    names.push('RuleRouter', 'Rule')
+  }


isTornadoCall incorrectly matches Rule when checking for Application

Medium Severity

The isTornadoCall function adds 'Rule' to the checked names when targetName is 'Application'. This causes isTornadoCall(node, 'Application') to return true for standalone Rule(pattern, handler) calls. In triggerAtFunctionCallBefore, when isApp is true, the code sets routes = argvalues[0], expecting a list of routes. However, for Rule calls, argvalues[0] is the pattern (not a routes list) and argvalues[1] is the handler. This mismatch causes routes defined via standalone Rule calls to fail registration, potentially missing entry points for taint analysis.

Additional Locations (1)

src/checker/taint/python/tornado-taint-checker.ts#L45-L50

cursor · 2026-01-22T03:13:48Z

src/checker/taint/python/tornado-taint-checker.ts

+  private finishRoute(analyzer: any, scope: any, state: any, h: any, path: string) {
+    if (!h) return
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') {


Missing null check after union value extraction

Medium Severity

In finishRoute, when h is a union type with an empty value array, the expression h = h.value[0] sets h to undefined. The next line then accesses h.vtype, which throws a TypeError. A null check is needed after extracting from the union array, or the condition needs to verify the array has elements before accessing h.value[0].

cursor · 2026-01-22T03:13:48Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param path
+   */
+  private registerEntryPoints(analyzer: any, cls: any, path: string) {
+    const methods = ['get', 'post', 'put', 'delete', 'patch']


Missing HTTP methods in Tornado entry point detection

Medium Severity

The methods array only includes ['get', 'post', 'put', 'delete', 'patch'], but is missing head and options which are valid HTTP methods that Tornado RequestHandler supports. The Django checker correctly includes all seven methods. Handlers implementing head() or options() won't be registered as entry points, potentially causing missed taint flows and security vulnerabilities to go undetected.

cursor · 2026-01-28T10:41:45Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!h) return
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    // 1. Check for recorded nested routes (Application/Router instances)
+    const innerRoutes = h.tornadoRoutes || h.value?.tornadoRoutes || h.field?.tornadoRoutes


Missing null check after union value reassignment

Medium Severity

In finishRoute, when h is a union type with an empty value array, the line h = h.value[0] reassigns h to undefined. The subsequent line then attempts to access h.tornadoRoutes, which throws a TypeError. The initial null guard if (!h) return only protects the original h parameter, not the reassigned value after extracting from the union array.

cursor · 2026-01-28T10:41:45Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param path
+   */
+  private registerEntryPoints(analyzer: any, cls: any, path: string) {
+    const methods = ['get', 'post', 'put', 'delete', 'patch']


Missing HTTP methods in entry point registration

Low Severity

The methods array is missing head and options HTTP methods that Tornado handlers can implement. The Django checker at the same codebase includes these methods in its httpMethods set. Tornado handlers implementing head() or options() methods won't be registered as entry points, meaning taint flows through those handlers won't be analyzed for vulnerabilities.

cursor · 2026-01-28T10:41:45Z

src/checker/taint/python/tornado-taint-checker.ts

+    // 4. Fallback for raw tuple (path, handler)
+    const isTuple =
+      (Array.isArray(val.value) && val.value.length >= 2) ||
+      (val.vtype === 'object' && val.value && val.value['0'] && val.value['1'])


Unreachable condition due to earlier return statement

Low Severity

The second part of the isTuple condition (val.vtype === 'object' && val.value && val.value['0'] && val.value['1']) is unreachable. Section 3 above (lines 97-105) checks for objects with numeric keys via Object.keys(val.value).some((k) => /^\d+$/.test(k)) and returns if found. Since any object with val.value['0'] would have '0' in its keys matching the regex, section 3 always handles and returns before reaching this condition.

cursor · 2026-01-28T10:41:46Z

src/checker/taint/python/tornado-taint-checker.ts

+      (val.vtype === 'object' && val.value && val.value['0'] && val.value['1'])
+    if (isTuple) {
+      const pathArg = val.value['0'] || (Array.isArray(val.value) ? val.value[0] : null)
+      const handler = val.value['1'] || (Array.isArray(val.value) ? val.value[1] : null)


Redundant array fallback expressions on tuple extraction

Low Severity

The fallback expressions (Array.isArray(val.value) ? val.value[0] : null) and (Array.isArray(val.value) ? val.value[1] : null) are redundant. In JavaScript, array['0'] is equivalent to array[0] for arrays, so val.value['0'] already returns the correct value when val.value is an array. The ternary fallback never provides a different result than the primary accessor.

cursor · 2026-01-28T11:04:22Z

src/checker/taint/python/tornado-taint-checker.ts

+   */
+  private finishRoute(analyzer: any, scope: any, state: any, h: any, path: string) {
+    if (!h) return
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]


Empty union array access causes undefined reference error

Medium Severity

In finishRoute, when h is a union type with an array value, the code accesses h.value[0] without checking if the array is empty. If h.value is [], then h becomes undefined, and the subsequent access to h.tornadoRoutes on line 130 will throw a "Cannot read property of undefined" error. The UnionValue class can create unions with empty value arrays (initialized as [] by default), making this a realistic edge case.

cursor · 2026-01-28T11:04:23Z

src/checker/taint/python/tornado-taint-checker.ts

+   */
+  triggerAtMemberAccess(analyzer: any, scope: any, node: any, state: any, info: any): void {
+    if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) {
+      markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' })


Missing null check before marking taint on undefined result

Medium Severity

In triggerAtMemberAccess, markTaintSource(info.res, ...) is called without verifying that info.res is defined. The getMemberValue function in the analyzer can return undefined when a property cannot be resolved. When this happens, setTaint inside markTaintSource will crash trying to access res._tags on undefined. Unlike other similar checkers that use introduceTaintAtMemberAccess (which has additional guards), this code calls markTaintSource directly whenever isRequestAttributeAccess returns true.

cursor · 2026-01-28T11:04:23Z

src/checker/taint/python/tornado-taint-checker.ts

+        cls = analyzer.processInstruction(scope, cls.ast, state) || this.buildClassSymbol(cls.ast)
+      } catch (e) {
+        cls = this.buildClassSymbol(cls.ast)
+      }


Silent exception swallowing without logging

Low Severity

The catch block silently swallows the exception without logging. Other Python taint checkers in the codebase use handleException(e, ...) from exception-handler to properly log errors, but this catch block completely ignores the error. This makes debugging difficult because failures during AST processing will be hidden. The PR discussion also notes this: "We should inspect the error log for error details."

cursor · 2026-01-28T11:04:23Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (isApp || isRouter || isAdd) {
+      let routes: any = null
+      if (isApp || isRouter) {
+        const isInit = ['__init__', '_CTOR_'].includes(node.callee?.property?.name || node.callee?.name)


Duplicated isInit detection logic in same file

Low Severity

The same isInit check pattern ['__init__', '_CTOR_'].includes(node.callee?.property?.name || node.callee?.name) appears at line 50 in triggerAtFunctionCallBefore and effectively the same check at line 242 in triggerAtFunctionCallAfter (using the name variable which has the same value). This duplicated logic could be extracted to a small helper function in tornado-util.ts for consistency and maintainability.

Additional Locations (1)

src/checker/taint/python/tornado-taint-checker.ts#L241-L242

cursor · 2026-01-28T11:59:29Z

src/checker/taint/python/tornado-taint-checker.ts

+    const isRouter = isTornadoCall(node, 'RuleRouter')
+    if (!isInit && (isApp || isRouter)) {
+      ret.tornadoRoutes = argvalues[0]
+    }


Missing null check for argvalues before array access

Medium Severity

The code at line 263 accesses argvalues[0] without checking if argvalues exists or has elements. Unlike triggerAtFunctionCallBefore which guards with !argvalues in its early return (line 43), triggerAtFunctionCallAfter omits this check (line 231). All other argvalues accesses in the same method (lines 235, 242, 250) properly guard with argvalues && argvalues.length >= N, but this one does not. This could cause a TypeError if argvalues is undefined, or set tornadoRoutes to undefined if argvalues is empty.

cursor · 2026-01-28T11:59:29Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!h) return
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    // 1. Check for recorded nested routes (Application/Router instances)
+    const innerRoutes = h.tornadoRoutes || h.value?.tornadoRoutes || h.field?.tornadoRoutes


Missing null check after union unwrap can cause TypeError

Medium Severity

In finishRoute, line 127 reassigns h to h.value[0] when h is a union type with an array value. If h.value is an empty array, h becomes undefined. The subsequent line 129 then accesses h.tornadoRoutes without re-checking if h is defined, which would throw a TypeError. The initial null check on line 126 becomes ineffective after the reassignment.

cursor · 2026-01-28T11:59:29Z

src/checker/taint/python/tornado-taint-checker.ts

+                scopeFunc: 'all',
+                locStart: 'all',
+                locEnd: 'all',
+              })


Taint sources scoped globally instead of to specific parameters

High Severity

The source scope entries use scopeFile: 'all', scopeFunc: 'all', locStart: 'all', locEnd: 'all' which causes ANY variable matching the parameter name to be marked as tainted across the entire codebase. According to source-util.ts, when all scope values are 'all', the taint source matches globally by name only. The Django checker correctly uses specific file paths, function names, and line ranges from param.loc and ep.fdef. This bug causes false positives where unrelated variables with common names like id or name are incorrectly marked as user input.

cursor · 2026-01-28T11:59:29Z

src/checker/taint/python/tornado-taint-checker.ts

+        cls = analyzer.processInstruction(scope, cls.ast, state) || this.buildClassSymbol(cls.ast)
+      } catch (e) {
+        cls = this.buildClassSymbol(cls.ast)
+      }


Caught exception silently swallowed without logging

Low Severity

The catch block captures exception e but ignores it entirely. Other Python taint checkers in this codebase use handleException(e, ...) from the common exception handler to properly log errors. Silent error suppression makes debugging difficult since there's no indication when analyzer.processInstruction fails, and the fallback to buildClassSymbol happens invisibly. This was also flagged in the PR discussion but not resolved.

cursor · 2026-02-03T08:36:02Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (ast?.type === 'CallExpression') {
+      const name = ast.callee?.property?.name || ast.callee?.name
+      if (isTornadoCall(ast, 'Rule') || isTornadoCall(ast, 'URLSpec') || name === 'url') {
+        const args = val.ast.arguments


Wrong variable used for accessing arguments property

High Severity

The code computes ast = val.ast || val.node on line 99 to handle cases where either property exists, but then uses val.ast.arguments instead of ast.arguments on line 103. When val.ast is undefined but val.node exists, this will throw a TypeError since val.ast is undefined.

cursor · 2026-02-03T08:36:02Z

src/checker/taint/python/tornado-taint-checker.ts

+      if (key) {
+        this.instanceRoutes.set(key, argvalues[0])
+      }
+    }


Missing null check before accessing argvalues array

Medium Severity

The code accesses argvalues[0] without checking if argvalues is defined. Unlike line 320 which properly guards with if (isInit && argvalues && argvalues.length >= 2), this block at line 347 has no such guard and will throw a TypeError if argvalues is undefined.

cursor · 2026-02-03T08:36:02Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!h) return
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    let innerRoutes: any = null
+    const hAst = h.ast || h.node


Empty array causes undefined access after reassignment

Medium Severity

When h is a union type with an empty value array, h.value[0] returns undefined, causing h to become undefined. The subsequent access to h.ast on line 192 will then throw a TypeError. The initial null check at line 189 doesn't protect against this since h is reassigned after that check.

cursor · 2026-02-03T08:36:02Z

src/checker/taint/python/tornado-taint-checker.ts

+  triggerAtMemberAccess(analyzer: any, scope: any, node: any, state: any, info: any): void {
+    if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) {
+      markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' })
+    }


Missing null check before marking taint source

Medium Severity

The triggerAtMemberAccess method calls markTaintSource(info.res, ...) without checking if info.res is defined. The isRequestAttributeAccess(node) function only validates the AST structure, not whether the member resolution succeeded. If getMemberValue returns undefined (e.g., when resolving self.request.body fails), the call to markTaintSource will crash when setTaint tries to access properties on undefined.

cursor · 2026-02-03T08:47:10Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (ast?.type === 'CallExpression') {
+      const name = ast.callee?.property?.name || ast.callee?.name
+      if (isTornadoCall(ast, 'Rule') || isTornadoCall(ast, 'URLSpec') || name === 'url') {
+        const args = val.ast.arguments


Wrong variable used causing potential null access error

High Severity

The code accesses val.ast.arguments when it should use ast.arguments. The variable ast is defined on line 99 as val.ast || val.node, meaning when val.ast is undefined but val.node exists, ast will be val.node. Accessing val.ast.arguments in this case throws a TypeError ("Cannot read property 'arguments' of undefined"). The code correctly uses ast in lines 101-103 but incorrectly reverts to val.ast on line 104.

cursor · 2026-02-03T08:47:10Z

src/checker/taint/python/tornado-taint-checker.ts

+      if (key) {
+        this.instanceRoutes.set(key, argvalues[0])
+      }
+    }


Missing null check for argvalues may cause crash

Medium Severity

The code accesses argvalues[0] on line 348 without checking if argvalues is defined. In triggerAtFunctionCallBefore (line 57), there's a proper guard !argvalues before using it. Similarly, line 321 in the same function checks argvalues && argvalues.length >= 2. However, the block at lines 345-350 directly accesses argvalues[0] without any null check. When !isInit && (isApp || isRouter) is true but argvalues is undefined, this causes a TypeError crash.

cursor · 2026-02-03T08:47:10Z

src/checker/taint/python/tornado-taint-checker.ts

+   * @param path
+   */
+  private registerEntryPoints(analyzer: any, cls: any, path: string) {
+    const methods = ['get', 'post', 'put', 'delete', 'patch']


Missing HTTP methods head and options from handlers

Medium Severity

The methods array only includes ['get', 'post', 'put', 'delete', 'patch'] but is missing 'head' and 'options'. The Django taint checker includes all seven methods: ['get', 'post', 'put', 'delete', 'patch', 'head', 'options']. Tornado's RequestHandler class supports HEAD and OPTIONS methods, so handlers implementing these methods won't be registered as entry points, causing missed taint flow detection.

cursor · 2026-02-03T08:47:10Z

src/checker/taint/python/tornado-taint-checker.ts

+    if (!h) return
+    if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0]
+    let innerRoutes: any = null
+    const hAst = h.ast || h.node


Null check bypassed after union value extraction

Medium Severity

The null check on line 190 (if (!h) return) is invalidated by line 191, which can reassign h to undefined. When h is a union type with an empty value array, h.value[0] returns undefined, making h undefined. Line 193 then accesses h.ast without rechecking, causing a "Cannot read property 'ast' of undefined" crash. The null check happens before the mutation but the code uses h after without validating it again.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

^{Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.}

cursor · 2026-02-04T09:41:45Z

src/checker/taint/python/tornado-taint-checker.ts

+    const isRouter = isTornadoCall(node, 'RuleRouter')
+    if (!isInit && (isApp || isRouter)) {
+      tornadoRoutesMap.set(ret, argvalues[0])
+    }


Missing length check before accessing argvalues[0]

Low Severity

The code accesses argvalues[0] without verifying that argvalues contains any elements. When Application() or RuleRouter() is called without arguments, argvalues is an empty array and argvalues[0] evaluates to undefined. This causes tornadoRoutesMap to store undefined as routes, which may lead to unexpected behavior when the routes are later retrieved and processed. Other similar accesses in this file (lines 240, 247, 255) properly check argvalues.length before accessing array elements.

cursor · 2026-02-04T09:41:45Z

src/checker/taint/python/tornado-taint-checker.ts

+                locEnd: p.loc?.end?.line,
+              })
+            }
+          })


Source scopes registered even for duplicate entry points

Medium Severity

The code that registers source scopes (lines 180-198) executes regardless of whether the entry point is a duplicate. The duplicate check at lines 171-178 only guards the analyzer.entryPoints.push(ep) call, but the parameter processing and this.sourceScope.value.push() calls happen unconditionally within the if (ep) block. When the same route handler is encountered multiple times, duplicate taint source entries are registered, which could cause performance degradation and potentially duplicate taint findings.

Ris-1kd and others added 7 commits November 26, 2025 03:43

fix: update entrypoint collector and enhance FastAPI route detection …

699a0b6

…logic

fix: update fastapi entrypoint

86f3473

fix: update fastapi

812b3a0

fix: resolve FastAPI route parsing issues

9d11215

fix:FastAPI entrypoint

30fdbdc

Merge branch 'antgroup:main' into main

d46215b

feat: add Tornado checker

e61be5f

Ris-1kd requested review from AntJiuFo, Arielwyy and alipaydeshui as code owners December 8, 2025 09:06

sourcery-ai bot reviewed Dec 8, 2025

View reviewed changes

gemini-code-assist bot reviewed Dec 8, 2025

View reviewed changes

feat: Tornado checker

f32f9fb

This comment was marked as outdated.

Sign in to view

CrackTC suggested changes Dec 9, 2025

View reviewed changes

Fix: update Python analyzer and Tornado taint checker rules

2067880

cursor bot reviewed Dec 20, 2025

View reviewed changes

src/checker/common/rules-basic-handler.ts Outdated Show resolved Hide resolved

src/engine/analyzer/common/analyzer.ts Outdated Show resolved Hide resolved

src/checker/taint/python/tornado-taint-checker.ts Outdated Show resolved Hide resolved

CrackTC reviewed Dec 22, 2025

View reviewed changes

Fix: update tornado framework

b08bead

cursor bot reviewed Dec 22, 2025

View reviewed changes

Ris-1kd added 3 commits December 22, 2025 03:07

Fix: update tornado framework

d23f87b

Fix: update tornado

9f58641

Fix: update tornado

fbd5978

cursor bot reviewed Dec 22, 2025

View reviewed changes

update tornado

6c0fcd4

cursor bot reviewed Dec 27, 2025

View reviewed changes

Fix: update tornado

4aad909

This comment was marked as outdated.

Sign in to view

Fix: update tornado

8d19c17

cursor bot reviewed Jan 13, 2026

View reviewed changes

Fix: update tornado-framework

61c8b5f

cursor bot reviewed Jan 13, 2026

View reviewed changes

Fix:update tornado

f84b50f

cursor bot reviewed Jan 13, 2026

View reviewed changes

Fix: update tornado framework

15da4c1

This comment was marked as outdated.

Sign in to view

Fix: update tornado-framework

2682e6a

This comment was marked as outdated.

Sign in to view

Fix: update tornado framework

fee55b1

This comment was marked as outdated.

Sign in to view

Fix: update tornado framework

a1721c2

This comment was marked as outdated.

Sign in to view

Fix: update tornado framework

f4428da

This comment was marked as outdated.

Sign in to view

Fix: update tornado

0d63baf

cursor bot reviewed Jan 22, 2026

View reviewed changes

Fix: update tornado-framework

e20920a

cursor bot reviewed Jan 28, 2026

View reviewed changes

Fix: update tornado-framework

8244780

cursor bot reviewed Jan 28, 2026

View reviewed changes

Fix: update tornado framework

73ae7ef

cursor bot reviewed Jan 28, 2026

View reviewed changes

Fix: update tornado framework

6f50c2a

cursor bot reviewed Feb 3, 2026

View reviewed changes

Fix: update tornado framework

d3ac17a

cursor bot reviewed Feb 3, 2026

View reviewed changes

Fix: update tornado-framework

3ec065b

cursor bot reviewed Feb 4, 2026

View reviewed changes

		typeof node.loc?.sourcefile === 'string' &&
		node.loc.sourcefile.includes(val.scopeFile)

		(callee.type === 'MemberAccess' &&
		AstUtil.prettyPrint(callee).includes('url'))

	constructor(resultManager: any, checkerId: any) {
	this.checkerId = checkerId
	this.resultManager = resultManager
	this.checkerRuleConfigContent = {}
	initRules()
	this.loadRuleConfig(this)
	}

	const fileName = node.loc?.sourcefile
	if (config.entryPointMode === 'ONLY_CUSTOM') return
	const fileName = node.loc?.sourcefile

Conversation

Ris-1kd commented Dec 8, 2025 • edited by cursor bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Uh oh!

sourcery-ai bot commented Dec 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for Tornado route discovery and entrypoint registration

Class diagram for TornadoTaintChecker and Tornado utility types

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

gemini-code-assist bot commented Dec 8, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ris-1kd commented Dec 8, 2025 •

edited by cursor bot

Loading

sourcery-ai bot commented Dec 8, 2025 •

edited

Loading

Parent name extraction may produce `[object Object]` in key