본문 바로가기
IT-개발,DB

[개발/PHP] web php unicode 문자열 urldecode

by SB리치퍼슨 2011. 2. 10.



[개발/PHP] web php unicode 문자열 urldecode

문자열 : %uXXXX


<?php

function decode_unicode_url($str)
{
  $res = '';

  $i = 0;
  $max = strlen($str) - 6;
  while ($i <= $max)
  {
    $character = $str[$i];
    if ($character == '%' && $str[$i + 1] == 'u')
    {
      $value = hexdec(substr($str, $i + 2, 4));
      $i += 6;

      if ($value < 0x0080) // 1 byte: 0xxxxxxx
        $character = chr($value);
      else if ($value < 0x0800) // 2 bytes: 110xxxxx 10xxxxxx
        $character =
            chr((($value & 0x07c0) >> 6) | 0xc0)
          . chr(($value & 0x3f) | 0x80);
      else // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        $character =
            chr((($value & 0xf000) >> 12) | 0xe0)
          . chr((($value & 0x0fc0) >> 6) | 0x80)
          . chr(($value & 0x3f) | 0x80);
    }
    else
      $i++;

    $res .= $character;
  }

  return $res . substr($str, $i);
}
?>

Simple test with japanese characters,
combined with urldecode:

<?php

$str = decode_unicode_url('%u65E5%u672C%u8A9E');
print(mb_convert_encoding(urldecode($str), "sjis", "euc-jp, utf-8, sjis") . '<br/>');
?>

반응형

댓글