Issues with some special characters in annotatorplus api #31

@dilshans2k

Request:

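from typing import Any, Dict
from urllib.parse import quote_plus

import requests

text = "Parkinson Disease % Pneumonia"  # sample input from the "%" case in this report
encoded_text = quote_plus(text)
apikey = ""
ontologies_to_search = [
    "MONDO"
    ]
format = "json"
# Note: params (which holds encoded_text) is built but never passed to requests.get.
params: Dict[str, Any] = {
    "apikey": apikey,
    "format": format,
    "ontologies": ontologies_to_search,
    "mappings": True,
    "longest_only": True,
    "exclude_synonyms": False,
    "expand_class_hierarchy": False,
    "class_hierarchy_max_level": 0,
    "text": encoded_text
}
url = "http://services.data.bioontology.org/annotatorplus"
# The query string is assembled by hand and interpolates the raw, unencoded text.
url = url + f"?apikey={apikey}&format={format}&ontologies={ontologies_to_search[0]}&mappings={True}&longest_only={True}&exclude_synonyms={False}&class_hierarchy_max_level={0}&text={text}"
r = requests.get(url=url)
r.raise_for_status()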

Issue with %

If the input text contains a % surrounded by whitespace, the API returns a 500 Internal Server Error.

Sample input:
text=Parkinson Disease % Pneumonia

Server response:

<body>
    <h1>HTTP Status 500 – Internal Server Error</h1>
    <hr class="line" />
    <p><b>Type</b> Exception Report</p>
    <p><b>Message</b> Unexpected end of input at 1:1</p>
    <p><b>Description</b> The server encountered an unexpected condition that prevented it from fulfilling the request.
    </p>
    <p><b>Exception</b></p>
    <pre>com.eclipsesource.json.ParseException: Unexpected end of input at 1:1
	com.eclipsesource.json.JsonParser.error(JsonParser.java:490)
	com.eclipsesource.json.JsonParser.expected(JsonParser.java:484)
	com.eclipsesource.json.JsonParser.readValue(JsonParser.java:193)
	com.eclipsesource.json.JsonParser.parse(JsonParser.java:152)
	com.eclipsesource.json.JsonParser.parse(JsonParser.java:91)
	com.eclipsesource.json.Json.parse(Json.java:295)
	org.sifrproject.annotations.input.BioPortalJSONAnnotationParser.parseAnnotations(BioPortalJSONAnnotationParser.java:65)
	org.sifrproject.servlet.AnnotatorServlet.doPost(AnnotatorServlet.java:177)
	org.sifrproject.servlet.AnnotatorServlet.doGet(AnnotatorServlet.java:118)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:655)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
	org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	org.sifrproject.util.CharacterSetFilter.doFilter(CharacterSetFilter.java:24)
</pre>
    <p><b>Note</b> The full stack trace of the root cause is available in the server logs.</p>
    <hr class="line" />
    <h3>Apache Tomcat/9.0.62</h3>
</body>
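
For comparison, here is a minimal sketch of how the same sample text looks once percent-encoded. It assumes (the server response does not confirm this) that the failure is related to the bare % arriving in the query string, where "% " is not a valid percent-escape. Only the standard library is used; raw_query and encoded_query are illustrative names, not part of the original request.

```python
from urllib.parse import quote_plus, urlencode

text = "Parkinson Disease % Pneumonia"

# Raw interpolation, as in the request in this report: the '%' is left as-is.
raw_query = f"text={text}"

# Percent-encoded: '%' becomes '%25' and each space becomes '+'.
encoded_text = quote_plus(text)
encoded_query = urlencode({"text": text})

print(raw_query)
print(encoded_text)   # Parkinson+Disease+%25+Pneumonia
print(encoded_query)  # text=Parkinson+Disease+%25+Pneumonia
```

Sending encoded_text in the text parameter, instead of the raw text, would show whether the 500 persists when the % is escaped.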

Issue with ;

  1. If the text starts with ;, the API returns 200 OK, but the response body contains an error

    Sample input:
    text: ;Disease

    Sample output:

    [
        {
            "error": "{"errors":["A text to be annotated must be supplied using the argument text=<text to be annotated>"],"status":400}
    "}]
    
  2. If the text contains ; elsewhere, only entities before the ; are annotated.
    Sample input1:
    text: PARKINSON DISEASE PARKINSON's DISEASE

Sample output1:

[
    {
        "annotatedClass": {
            "definition": [
                "A progressive degenerative disorder of the central nervous system characterized by loss of dopamine producing neurons in the substantia nigra and the presence of Lewy bodies in the substantia nigra and locus coeruleus. Signs and symptoms include tremor which is most pronounced during rest, muscle rigidity, slowing of the voluntary movements, a tendency to fall back, and a mask-like facial expression."
            ],
            "prefLabel": "Parkinson disease",
            "synonym": [
                "paralysis agitans",
                "Parkinson disease",
                "Parkinson's disease"
            ],
..........................
        "hierarchy": [],
        "annotations": [
            {
                "from": 1,
                "to": 17,
                "matchType": "PREF",
                "text": "PARKINSON DISEASE"
            },
            {
                "from": 19,
                "to": 37,
                "matchType": "SYN",
                "text": "PARKINSON'S DISEASE"
            }
        ],
        "mappings": []
    }
]

Sample input2:
text = PARKINSON DISEASE; PARKINSON's DISEASE

Sample output2:

    [
        {
            "annotatedClass": {
                "definition": [
                    "A progressive degenerative disorder of the central nervous system characterized by loss of dopamine producing neurons in the substantia nigra and the presence of Lewy bodies in the substantia nigra and locus coeruleus. Signs and symptoms include tremor which is most pronounced during rest, muscle rigidity, slowing of the voluntary movements, a tendency to fall back, and a mask-like facial expression."
                ],
                "prefLabel": "Parkinson disease",
                "synonym": [
                    "paralysis agitans",
                    "Parkinson disease",
                    "Parkinson's disease"
                ],
    ............................
            "annotations": [
                {
                    "from": 1,
                    "to": 17,
                    "matchType": "PREF",
                    "text": "PARKINSON DISEASE"
                }
            ],
            "mappings": []
        }
    ]

As shown, only the first mention, PARKINSON DISEASE, was annotated; PARKINSON's DISEASE after the ; was ignored.
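
Both ; symptoms would be consistent with some component treating ; as a query-parameter separator (a legacy convention), although nothing in the responses confirms this. The sketch below uses a hypothetical helper, split_legacy_query, to show how such a parser would see the raw query, and how encoding ; as %3B keeps the text intact. It illustrates the suspected behaviour and is not the server's actual code.

```python
import re
from urllib.parse import quote_plus, unquote_plus


def split_legacy_query(query: str) -> dict:
    """Hypothetical parser that splits pairs on both '&' and ';'."""
    result = {}
    for pair in re.split(r"[&;]", query):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        result[unquote_plus(key)] = unquote_plus(value)
    return result


sample = "PARKINSON DISEASE; PARKINSON's DISEASE"

# Raw text: everything after ';' is split off into a separate, bogus parameter.
raw_parsed = split_legacy_query(f"text={sample}")
print(raw_parsed["text"])

# A leading ';' leaves the text parameter empty.
prefix_parsed = split_legacy_query("text=;Disease")
print(repr(prefix_parsed["text"]))

# Encoded text: ';' becomes '%3B', so the whole value survives the split.
encoded_parsed = split_legacy_query("text=" + quote_plus(sample))
print(encoded_parsed["text"])
```

Under this assumption, the empty text parameter in the ;Disease case would explain the "A text to be annotated must be supplied" error, and the truncated value would explain the missing second annotation.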
