Skip to content

Conversation

@reenberg
Copy link
Contributor

@reenberg reenberg commented Sep 4, 2014

"Major" bug discovered, when parsing XML messages from Maltego.

See commit messages.

Attached is debug versions of safedexml for future reference if such a bug ever resurfaces again.

I still suspect that the safedexml.fields.List has a minor bug in its implementation, as discussed in the commit messages. However I don't have the time nor the knowledge to investigate further. As far as I can see the project is semi dead, so I don't suspect that it will make any difference trying to make bug reports at the safedexml github page.

When parsing the following valid XML from Maltego, the 'entities' field
is actually not parsed but silently ignored, due to the fact
that it was marked optional.

Now the following code will throw an ParseError, which needs to be
fixed. To aid this, I have added a debug version of 'safedexml.py' in
the xmltools folder, which may be copied into your path, and thus output
debug info when calling the 'parse' function. This debug output is also
included at the end.

---------------------------------------------------------------------------

In [1]: import canari.maltego.message as msg

In [2]: mmsg = msg.MaltegoMessage.parse('''
<MaltegoMessage>
   <MaltegoTransformRequestMessage>
      <Entities>
         <Entity Type="Domain">
            <AdditionalFields>
               <Field Name="fqdn" DisplayName="Domain
               Name">paterva.com</Field>
            </AdditionalFields>
            <Weight>0</Weight>
            <Value>paterva.com</Value>
         </Entity>
      </Entities>
      <Limits SoftLimit="255" HardLimit="255"/>
   </MaltegoTransformRequestMessage>
</MaltegoMessage>
''')

---------------------------------------------------------------------------

ParseError                                Traceback (most recent call
last)
<ipython-input-3-346b231349d0> in <module>()
     14    </MaltegoTransformRequestMessage>
     15 </MaltegoMessage>
---> 16 ''')

/usr/local/lib/python2.7/dist-packages/safedexml/__init__.py in
parse(cls, xml)
    343             print nest(), 'self._fields:', str(self._fields)
    344             print nest(), 'fields found:', fields_found
--> 345             self._parse_children_unordered(node, self._fields,
fields_found)
    346         #  Check that all required fields have been found
    347

/usr/local/lib/python2.7/dist-packages/safedexml/__init__.py in
_parse_children_unordered(self, node, fields, fields_found)
    402                     continue
    403                 field = fields[idx]
--> 404                 res = field.parse_child_node(self, child)
    405                 if res is PARSE_DONE:
    406                     done_fields[idx] = True

/usr/local/lib/python2.7/dist-packages/safedexml/fields.pyc in
parse_child_node(self, obj, node)
    809             field.field_name = self.field_name
    810             field.model_class = self.model_class
--> 811             res = field.parse_child_node(obj, node)
    812             if res is dexml.PARSE_MORE:
    813                 raise ValueError("items in a Choice cannot
    return PARSE_MORE")

/usr/local/lib/python2.7/dist-packages/safedexml/fields.pyc in
parse_child_node(self, obj, node)
    474             return dexml.PARSE_SKIP
    475         else:
--> 476             inst = typeclass.parse(node)
    477             self.__set__(obj, inst)
    478             return dexml.PARSE_DONE

/usr/local/lib/python2.7/dist-packages/safedexml/__init__.py in
parse(cls, xml)
    351             if field.required and field not in fields_found:
    352                 err = "required field not found: '%s'"
    % (field.field_name,)
--> 353                 raise ParseError(err)
    354             field.parse_done(self)
    355         #  All done, return the instance so created

ParseError: required field not found: 'entities'

---------------------------------------------------------------------------

For reference, I have added some debug output from safedexml, which may
help to diagnose the issue.

It is seen that when it constructs the class
MaltegoTransformRequestMessage, then it goes straight to creating the
Limits class.

---------------------------------------------------------------------------

 parse(....) ------------------------------------------------------------------------------------
        Constructing class: MaltegoMessage
        Generating node from XML: '\n<MaltegoMessage>\n   <MaltegoTransformRequestMessage>\n      <Entities>\n         <Entity Type="Domain">\n            <AdditionalFields>\n               <Field Name="fqdn" DisplayName="Domain Name">paterva.com</Field>\n            </AdditionalFields>\n            <Weight>0</Weight>\n            <Value>paterva.com</Value>\n         </Entity>\n      </Entities>\n      <Limits SoftLimit="255" HardLimit="255"/>\n   </MaltegoTransformRequestMessage>\n</MaltegoMessage>\n'
        node: <DOM Element: MaltegoMessage at 0x7f156a955290>
        attrs: []
        Current node has the following defined fields:
                field: <safedexml.fields.Choice object at 0x7f156a9580d0> (tagname: None) (attrname: None) (field name: message) (model class: <class 'canari.maltego.message.MaltegoMessage'>)
                         Has unused_attrs: []
        parsing order_sensitive = True
        parse(....) ------------------------------------------------------------------------------------
                Constructing class: MaltegoTransformRequestMessage
                Generating node from XML: <DOM Element: MaltegoTransformRequestMessage at 0x7f156a955368>
                node: <DOM Element: MaltegoTransformRequestMessage at 0x7f156a955368>
                attrs: []
                Current node has the following defined fields:
                        field: <safedexml.fields.List object at 0x7f156a94ea50> (tagname: Entities) (attrname: None) (field name: entities) (model class: <class 'canari.maltego.message.MaltegoTransformRequestMessage'>)
                                 Has unused_attrs: []
                        field: <safedexml.fields.Dict object at 0x7f156a94eed0> (tagname: TransformFields) (attrname: None) (field name: parameters) (model class: <class 'canari.maltego.message.MaltegoTransformRequestMessage'>)
                                 Has unused_attrs: []
                        field: <safedexml.fields.Model object at 0x7f156a94ef50> (tagname: None) (attrname: None) (field name: limits) (model class: <class 'canari.maltego.message.MaltegoTransformRequestMessage'>)
                                 Has unused_attrs: []
                parsing order_sensitive = True
                parse(....) ------------------------------------------------------------------------------------
                        Constructing class: Limits
                        Generating node from XML: <DOM Element: Limits at 0x7f156a955e18>
                        node: <DOM Element: Limits at 0x7f156a955e18>
                        attrs: [<xml.dom.minidom.Attr instance at 0x7f156a955ea8>, <xml.dom.minidom.Attr instance at 0x7f156a955f80>]
                        Current node has the following defined fields:
                                field: <safedexml.fields.Integer object at 0x7f156a940d90> (tagname: None) (attrname: SoftLimit) (field name: soft) (model class: <class 'canari.maltego.message.Limits'>)
                                         Has unused_attrs: [<xml.dom.minidom.Attr instance at 0x7f156a955f80>]
                                         adding field to list of found
                                field: <safedexml.fields.Integer object at 0x7f156a940dd0> (tagname: None) (attrname: HardLimit) (field name: hard) (model class: <class 'canari.maltego.message.Limits'>)
                                         Has unused_attrs: []
                                         adding field to list of found
                        parsing order_sensitive = True
                        parse() check that required fields have been found
                                 looping field: <safedexml.fields.Integer object at 0x7f156a940d90>
                                 looping field: <safedexml.fields.Integer object at 0x7f156a940dd0>
                        Created: Limits
                parse(....) ----------------------------------------- END --------------------------------------
                parse() check that required fields have been found
                         looping field: <safedexml.fields.List object at 0x7f156a94ea50>
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 0))

---------------------------------------------------------------------------
Dead code:

  self._canari_fields = dict([(f.name, f.value) for f in  self.entity.fields.values()])

We can remove this line of code from the constructor of
MaltegoTransformRequestMessage, as it is basically a no op:

 1. Trying to read from self.entity in __init__ will ALWAYS end
    up returning a 'none-entity' (return Entity('')), as the read
    from self.entities will fail due to the fact that it has not
    been populated with anything yet.

    a. This read from self.entities has the side effect of
       reading from the safedexml List field type, which
       initialises it to the empty list.  When we later on tries
       to actually parse anything into the field, it will fail as
       the call:

          val = super(List, self).__get__(obj)

       inside safedexml.fields.List.parse_child_node will return
       '[]' instead of None.  Because of this, the if statement
       will skip the 'parsing logic' and eventually end up
       returning PARSE_SKIP, instead of PARSE_CHILDREN.

    b. This has the effect that the <entities>...</entities> tag
       is not parsed, and thus the entities List field of the
       request message is newer populated.

 2. '_canari_fields' is not references anywhere else in the canari code.
    So even if the call did return anything, it would newer be used.

 3. To the best of my knowledge the code doesn't produce any side
    effects that are needed or even desired.

With the dead code, the following valid XML from Maltego, will
not populate the entities field.  This is an issue, as the
request message is sent to the transform, from where it is not
possible to access any of the information about the entity (most
importantly the value).
Worse is, if the previous bugfix has been implemented, where the
entities is no longer optional.  Then it will fail with an
ParseError due to the required field not being found.

--------------------------------------------------------

In [1]: import canari.maltego.message as msg

In [2]: mmsg = msg.MaltegoMessage.parse('''
<MaltegoMessage>
   <MaltegoTransformRequestMessage>
      <Entities>
         <Entity Type="Domain">
            <AdditionalFields>
               <Field Name="fqdn" DisplayName="Domain
               Name">paterva.com</Field>
            </AdditionalFields>
            <Weight>0</Weight>
            <Value>paterva.com</Value>
         </Entity>
      </Entities>
      <Limits SoftLimit="255" HardLimit="255"/>
   </MaltegoTransformRequestMessage>
</MaltegoMessage>
''')

--------------------------------------------------------

Now if trying to access the entity property of the request
message, we will get a non-entity back as seen by the empty
value and the empty list of entities.

--------------------------------------------------------

In [3]: mmsg.message.entity.value
Out[3]: ''

In [4]: mmsg.message.entities
Out[4]: []

--------------------------------------------------------

This can be furhter documented by adding debug statements inside
the 'entity' property.  However with the dead code eliminated,
the correct information is returned as seen below.

--------------------------------------------------------

In [3]: mmsg.message.entity.value
Out[3]: u'paterva.com'

In [4]: mmsg.message.entities
Out[4]: [<canari.maltego.message._Entity at 0x7f1ef1b1ff50>]

--------------------------------------------------------

All in all, this seem to be a bug in safedexml.  Currently I have
only noticed it on the List field type, however it may or may not
be present on other field types as well.  This isue is solved by
not reading from the entities field before the user has a chance
to call the parse function or if explicitly adding instances into
it (by appendelement) for use when rendering to xml.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line needs to stay in as it provides backwards compatibility to the older Canari API for a MaltegoTransformRequestMessage. If you take a look below, there is a property called fields which provides access to the first input entity's fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wan't to make sure I understand your argument for backwards compatability.

You are saying that someone might be accessing a private/internal field (as indicated by the leading underscore) of the request message class?
In any case, no one should use private/internal fields, besides the library itself.

Currently I can only think of two possible outcomes for setting this value. In the general case, you instantiate the class with no arguments, in which case it gets a non-entity back from self.entity, which results in self._canari_fields being set to the empty dictionary. Aka in this case the field contains no valuable information anyways.
The other, case is that you give a list of entities as argument to the constructor of the request message. However I can't possibly think of an old solution that is using this way of constructing request messages (as the preferred way of adding entities is with appendelement), and also reading the private/internal (which it should not).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants