PFOCR "Copy to PathVisio" sometimes maps wrong genes #20

@khanspers

Description

Discussed in wikipathways/wikipathways-help#165

Originally posted by eweitz May 3, 2025
While creating a pathway diagram, I opted to try the convenient "Copy to PathVisio" feature in PFOCR. It seems 17 of 18 gene nodes were correct.

One gene node was subtly but significantly incorrect. The pro-inflammatory "IL-23" was mapped to the anti-inflammatory "IL37".

To reproduce:

  1. Go to https://pfocr.wikipathways.org/figures/PMC8901701__in-22-e7-g001.html
  2. In the "Gene mentions" header, click the "Copy to PathVisio" icon
  3. Open a text editor and paste
  4. Find "IL37"
  5. Note hit for: <DataNode TextLabel="IL37" GraphId="a1001" Type="GeneProduct"><Comment>OCR lexicon match: IL-23</Comment>
  6. Go to https://en.wikipedia.org/wiki/Interleukin_37
  7. Note IL-37 is an anti-inflammatory cytokine
  8. Go to https://en.wikipedia.org/wiki/Interleukin_23
  9. Note "IL-23 is an inflammatory cytokine"
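Steps 3 to 5 above can be automated as a rough check. The sketch below is not part of PFOCR or PathVisio. It assumes the pasted text is well-formed XML whose `DataNode` elements carry a `Comment` starting with "OCR lexicon match:", as in the hit shown in step 5. The `<Pathway>` wrapper, the second `DataNode`, and the function names are hypothetical and used only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Comment prefix as it appears in the pasted DataNode from step 5.
PREFIX = "OCR lexicon match:"


def local_name(tag):
    """Drop any XML namespace, e.g. '{ns}DataNode' -> 'DataNode'."""
    return tag.rsplit("}", 1)[-1]


def normalize(symbol):
    """Uppercase and strip punctuation so 'IL-23' and 'IL23' compare equal."""
    return re.sub(r"[^A-Z0-9]", "", symbol.upper())


def find_label_mismatches(gpml_text):
    """Return (TextLabel, OCR text) pairs that differ after normalization."""
    root = ET.fromstring(gpml_text)
    mismatches = []
    for node in root.iter():
        if local_name(node.tag) != "DataNode":
            continue
        label = node.get("TextLabel", "")
        for child in node:
            text = (child.text or "").strip()
            if local_name(child.tag) == "Comment" and text.startswith(PREFIX):
                ocr_text = text[len(PREFIX):].strip()
                if normalize(ocr_text) != normalize(label):
                    mismatches.append((label, ocr_text))
    return mismatches


# Hypothetical sample: the first node is the one reported in this issue,
# the second is an invented well-matched node for contrast.
sample = """<Pathway>
  <DataNode TextLabel="IL37" GraphId="a1001" Type="GeneProduct">
    <Comment>OCR lexicon match: IL-23</Comment>
  </DataNode>
  <DataNode TextLabel="IL17A" GraphId="a1002" Type="GeneProduct">
    <Comment>OCR lexicon match: IL-17A</Comment>
  </DataNode>
</Pathway>"""

print(find_label_mismatches(sample))  # [('IL37', 'IL-23')]
```

A flagged pair is only a candidate for manual review: a legitimate alias mapping, where the figure text and the official gene symbol differ, would also be reported by this comparison.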

Interestingly, "IL37" is not found in the "Gene mentions" HTML table that is visible at https://pfocr.wikipathways.org/figures/PMC8901701__in-22-e7-g001.html.

This issue could plausibly affect other pathway diagrams made with PFOCR. I detected the problem when I could not find the "IL37" node in the source pathway image, then did a cursory search on "IL37" and "IL-23" and found that their functions differ substantially. For reference, the new pathway diagram I'm creating from this PFOCR entry is "Immunotherapies for psoriasis" (WP5537).

PFOCR is a fantastic tool, and its "Copy to PathVisio" functionality is overall quite useful! Hopefully refining its accuracy will speed up pathway creation even more.
