
PHP strtolower converts UTF-8 wrong #41

@vtoc

Description


With a UTF-8 locale set, strtolower() corrupts UTF-8 strings and produces invalid byte sequences. With the C locale, multibyte characters are left untouched, which is fine.

<?php
    $t = 'ö';
    $c = strtolower($t);
    print(implode(unpack("H*", $t)) . "\t" . implode(unpack("H*", $c)) . "\n");
?>
LC_ALL="C" php test.php 
c3b6    c3b6
LC_ALL="en_US.UTF-8" php test.php 
c3b6    e3b6

In this case it "converts" an already lowercase character, but uppercase umlauts are affected the same way: Ö (c396) ends up as e396 instead of c3b6.
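One possible explanation, not confirmed here, is that each byte is lowercased on its own as if it were a single-byte (ISO-8859-1 style) character, so the lead byte 0xC3 ("Ã" in Latin-1) becomes 0xE3 ("ã"). The sketch below only emulates that assumed behaviour; `latin1_bytewise_lower` is a hypothetical helper written for illustration, not a PHP or libc function.

```php
<?php
// Hypothetical helper: lowercases every byte as if the input were ISO-8859-1.
// This is an assumption about what might be happening, not PHP's actual code.
function latin1_bytewise_lower($s)
{
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $b = ord($s[$i]);
        // ASCII A-Z, plus the Latin-1 uppercase range 0xC0-0xDE
        // (0xD7 is the multiplication sign and has no lowercase form).
        $isAsciiUpper  = ($b >= 0x41 && $b <= 0x5A);
        $isLatin1Upper = ($b >= 0xC0 && $b <= 0xDE && $b !== 0xD7);
        if ($isAsciiUpper || $isLatin1Upper) {
            $b += 0x20; // lowercase sits 0x20 above uppercase in both ranges
        }
        $out .= chr($b);
    }
    return $out;
}

echo bin2hex(latin1_bytewise_lower("\xC3\xB6")), "\n"; // ö: e3b6
echo bin2hex(latin1_bytewise_lower("\xC3\x96")), "\n"; // Ö: e396
```

Both results match the corrupted bytes reported above, which is consistent with this guess but does not prove where the fault lies.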

The same test works on Linux. I do not yet know whether the problem lies in Illumos or in PHP.

Using mb_strtolower($keywords, 'UTF-8') instead works, however.
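A minimal sketch of that workaround, assuming the mbstring extension is loaded. The variable names are illustrative only, and the input is written as escaped bytes so the result does not depend on the file's encoding.

```php
<?php
// Assumes ext/mbstring is available; mb_strtolower() is given the encoding
// explicitly, so it does not rely on the process locale.
$upper = "\xC3\x96"; // Ö encoded as UTF-8
$lower = mb_strtolower($upper, 'UTF-8');

// Print the bytes in the same hex form as the test script above.
echo bin2hex($upper), "\t", bin2hex($lower), "\n"; // c396	c3b6
```

Passing the encoding on every call avoids depending on the mbstring internal encoding setting, which may differ between installations.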

Tested with:
PHP 5.3.29 (cli) (built: Jan 29 2016 19:08:29)
PHP 7.0.27 (cli) (built: Apr 8 2018 20:20:23) ( NTS )
