Programming board - sucks (reposted)
c******n
Posts: 4965
1
[The following is reposted from the SanFrancisco board]
From: creation (working hard on freestyle, 50m/45sec!), Board: SanFrancisco
Subject: sucks
Posted at: BBS 未名空间站 (Tue Dec 9 22:24:58 2008)
Normally the cookbook series is pretty good,
but this one sucks:
"Since both Java char values and Unicode characters are 16 bits in width"
I am reading the Gosling book for the general language-level concepts, but
for a quick hands-on tool reference, does anybody have a good recommendation?
g*****g
Posts: 34805
2

This is correct, what's the problem?

[c******n wrote:]
: [The following is reposted from the SanFrancisco board]
: From: creation (working hard on freestyle, 50m/45sec!), Board: SanFrancisco
: Subject: sucks
: Posted at: BBS 未名空间站 (Tue Dec 9 22:24:58 2008)
: normally the cookbook series is pretty good
: but this one sucks
: "Since both Java char values and Unicode characters are 16 bits in width"
: I am reading the gosling book for the general language level concepts, but
: for quick hands-on tool reference, anybody has a good reco ?

Z****e
Posts: 2999
3
Well, I don't know about the book, but this statement is wrong for sure.
char is indeed 16 bits, but Unicode is not necessarily always 16-bit; for
example, UTF-8 and UTF-16 are both variable-length: a code point takes
8, 16, 24, or 32 bits in UTF-8, and 16 or 32 bits in UTF-16.
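
To make the variable-length point concrete, here is a minimal Java sketch. The class name EncodedLengthDemo and the sample code points are assumptions chosen for illustration, not taken from the book or this thread. It encodes one code point at a time and counts the resulting bytes:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name and the sample code points are illustrative only.
class EncodedLengthDemo {

    // Number of bytes a single code point occupies once encoded as UTF-8.
    static int utf8Bytes(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes a single code point occupies as UTF-16
    // (big-endian variant, which writes no byte-order mark).
    static int utf16Bytes(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // 'A', 'e' with acute accent, a CJK ideograph, and an emoji outside the BMP.
        int[] samples = {0x41, 0xE9, 0x4E2D, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X: UTF-8 = %d bytes, UTF-16 = %d bytes%n",
                    cp, utf8Bytes(cp), utf16Bytes(cp));
        }
    }
}
```

Under these assumptions, U+0041 comes out as 1 byte in UTF-8 and 2 bytes in UTF-16, while U+1F600 needs 4 bytes in both.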

[g*****g wrote:]
:
: This is correct, what's the problem?

g*****g
Posts: 34805
4
Well, Unicode is 16 bits; it's the encoding method that can be different.
This is like a color image being 24 bits per pixel, while the compression
format can be different.

[Z****e wrote:]
: well I don't know about the book but this statement is wrong for sure
: char is indeed 16 bit, but unicode is not necessarily always 16-bit; for
: example, UTF-8 and UTF-16 are both variable length, the former could be 8/16
: /24/32-bit, the latter could be 16-bit or 32-bit

Z****e
Posts: 2999
5
Unicode is a grand index (code points) of all supported characters. In
terms of the number of code points, according to Wikipedia, Unicode passed
the 65535 mark with version 3.1, so 16 bits have been insufficient to index
all code points for a long time.
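
The size of that index can be checked from Java itself. The following is a small sketch (the class name CodePointRangeDemo and its helper methods are hypothetical, added only for illustration) that compares the largest char value with the largest code point, using constants and methods from java.lang.Character:

```java
// Hypothetical demo class; only the java.lang.Character members are real API.
class CodePointRangeDemo {

    // True when the code point lies in the BMP and so fits in one 16-bit char.
    static boolean fitsInOneChar(int codePoint) {
        return Character.isBmpCodePoint(codePoint);
    }

    // How many 16-bit chars Java needs to hold the code point (1 or 2).
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.printf("largest char value: U+%04X%n", (int) Character.MAX_VALUE);
        System.out.printf("largest code point: U+%04X%n", Character.MAX_CODE_POINT);
        System.out.println("U+4E2D fits in one char:  " + fitsInOneChar(0x4E2D));
        System.out.println("U+1F600 fits in one char: " + fitsInOneChar(0x1F600));
        System.out.println("chars needed for U+1F600: " + charsNeeded(0x1F600));
    }
}
```

Character.MAX_VALUE is U+FFFF while Character.MAX_CODE_POINT is U+10FFFF, so a single char cannot index every code point.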

[g*****g wrote:]
: Well, Unicode is 16 bits, it's encoding method that can be different.
: This is like color image is 24 bits per pixel. But compression
: format can be different.

E*V
Posts: 17544
6
As I recall, that's how it is.

[Z****e wrote:]
: Unicode is a grand index (code points) about all supported characters; in
: terms of number of code points, according to wiki, Unicode has passed 65535
: marker since version 3.1, so 16-bit is insufficient to index all code points
: long time ago

s******n
Posts: 876
7
Older versions of Java supported older versions of Unicode.
But who cares about bits and bytes in Java? That's the wrong way to think about it.

[Z****e wrote:]
: Unicode is a grand index (code points) about all supported characters; in
: terms of number of code points, according to wiki, Unicode has passed 65535
: marker since version 3.1, so 16-bit is insufficient to index all code points
: long time ago

c******n
Posts: 4965
8
A Unicode code point value can be larger than 65535.
Anyway, that book made a conceptual error: it's meaningless to talk about
the "length" of Unicode. A code point is just a number; when we talk about
length, it's always the *encoded* length under a charset.
A Java char can only represent a BMP code point or one surrogate code unit;
that's what the book should say.
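
A short Java sketch of what that means in practice: a code point outside the BMP is stored in a String as a surrogate pair, so the char count and the code point count disagree. The class name SurrogateDemo, its helpers, and the sample string are hypothetical, chosen only for illustration:

```java
// Hypothetical demo class; only the String and Character methods are real API.
class SurrogateDemo {

    // Number of 16-bit char units in the string (what String.length() reports).
    static int charUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "A" followed by U+1F600, which lies outside the BMP.
        String s = "A" + new String(Character.toChars(0x1F600));

        System.out.println("char units:  " + charUnits(s));
        System.out.println("code points: " + codePointCount(s));

        // The second and third chars form a surrogate pair for U+1F600.
        System.out.println("high surrogate at index 1: " + Character.isHighSurrogate(s.charAt(1)));
        System.out.println("low surrogate at index 2:  " + Character.isLowSurrogate(s.charAt(2)));

        // Iterating by code point (Java 8+) yields whole characters, not halves of pairs.
        s.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

For this sample string, String.length() reports 3 while codePointCount reports 2, which is why indexing by char is unreliable for text that may contain supplementary characters.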

[g*****g wrote:]
: Well, Unicode is 16 bits, it's encoding method that can be different.
: This is like color image is 24 bits per pixel. But compression
: format can be different.
