Incorrect handling of the unicode queries #13

@emorozov

When searching like this:
Body.search.query(u'привет')

This always returns zero results, while the same search from the command line returns hundreds. The cause is double (or even triple) UTF-8 encoding done somewhere in the guts of django-sphinx/sphinxapi.

There are instances of pointless code like unicode(string).encode('utf-8'). The problem is that if string is already a unicode object, this code will create a unicode object containing its UTF-8 representation and encode it as UTF-8 again, producing garbage. I've fixed this spot in the code, but the string is still double-encoded somewhere. :(
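To make the symptom concrete, here is a minimal sketch of what double encoding does to the query bytes. It is not the actual django-sphinx/sphinxapi code path: the helper name `double_encode` is hypothetical, and misreading the bytes as Latin-1 is only an assumed stand-in for whatever wrong decode happens in the library.

```python
# -*- coding: utf-8 -*-

def double_encode(text, wrong_codec="latin-1"):
    """Hypothetical helper: encode to UTF-8, misread the bytes, encode again."""
    once = text.encode("utf-8")          # the correct, single encoding
    misread = once.decode(wrong_codec)   # bytes wrongly treated as text
    twice = misread.encode("utf-8")      # second encoding: garbage
    return once, twice

once, twice = double_encode(u"привет")
print("single: %d bytes, double: %d bytes" % (len(once), len(twice)))
```

The double-encoded bytes no longer match what is stored in the index, which would explain a search that silently returns nothing.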

This code is pointless anyway: even if it worked, it would be a no-op (take a byte string, convert it to unicode, convert it back to a byte string). But instead of being a harmless no-op, it turns unicode input into garbage.
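One way to avoid the problem would be to encode exactly once and leave byte strings alone. The sketch below shows the idea; `to_utf8` is a hypothetical name, not something that exists in django-sphinx or sphinxapi, and it assumes any byte string passed in is already UTF-8.

```python
# -*- coding: utf-8 -*-

def to_utf8(value):
    """Hypothetical helper: return UTF-8 bytes, encoding text at most once."""
    if isinstance(value, bytes):
        # Assumption: byte strings are already UTF-8, so pass them through.
        return value
    return value.encode("utf-8")

query = to_utf8(u"привет")
# Calling it again on the result changes nothing, so it is safe to apply
# at several layers without piling up encodings.
assert to_utf8(query) == query
```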
