c******n Posts: 4965 | 1 【 The following text is reposted from the SanFrancisco board 】
From: creation (working on freestyle 50m/45sec !), Board: SanFrancisco
Subject: sucks
Posted from: BBS 未名空间站 (Tue Dec 9 22:24:58 2008)
normally the cookbook series is pretty good
but this one sucks
"Since both Java char values and Unicode characters are 16 bits in width"
I am reading the Gosling book for the general language-level concepts, but
for a quick hands-on tool reference, does anybody have a good recommendation? | g*****g Posts: 34805 | 2
This is correct, what's the problem?
【 In reply to c******n 】
| Z****e Posts: 2999 | 3 Well, I don't know about the book, but this statement is wrong for sure.
char is indeed 16 bits, but Unicode is not necessarily 16 bits; for
example, UTF-8 and UTF-16 are both variable-length: the former takes 8, 16, 24,
or 32 bits per code point, the latter 16 or 32 bits.
【 In reply to g*****g 】
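A minimal Java sketch of the variable-length point above. It is not from the thread; the class and method names (`EncodedLengthDemo`, `utf8Bytes`, `utf16Bytes`) and the sample code points are made up for illustration. It encodes one code point at a time and counts the resulting bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: counts the bytes one code point occupies
// once it is encoded under a given charset.
final class EncodedLengthDemo {

    // Bytes needed for a single code point in UTF-8 (1 to 4).
    static int utf8Bytes(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed in UTF-16 (2 or 4). UTF_16BE is used so that
    // no byte-order mark is counted.
    static int utf16Bytes(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Sample code points chosen for illustration:
        // 'A', 'é', a CJK ideograph, and an emoji outside the BMP.
        int[] samples = {0x41, 0xE9, 0x4E2D, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X: UTF-8 = %d bytes, UTF-16 = %d bytes%n",
                    cp, utf8Bytes(cp), utf16Bytes(cp));
        }
    }
}
```

The same code point yields different lengths depending on the charset, which is why the length only makes sense once an encoding is chosen.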
| g*****g Posts: 34805 | 4 Well, Unicode is 16 bits; it's the encoding method that can differ.
This is like a color image being 24 bits per pixel while the compression
format can differ.
【 In reply to Z****e 】
| Z****e Posts: 2999 | 5 Unicode is a grand index (code points) of all supported characters. In
terms of the number of code points, according to wiki, Unicode passed the 65535
mark as of version 3.1, so 16 bits stopped being enough to index all code points
a long time ago.
【 In reply to g*****g 】
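A small Java sketch of why 16 bits cannot index every code point. It is not from the thread; `CodePointRangeDemo`, `charsNeeded` and `truncateToChar` are hypothetical names. It relies only on constants and methods of `java.lang.Character`.

```java
// Hypothetical demo class: compares the range of a 16-bit char
// with the range of Unicode code points.
final class CodePointRangeDemo {

    // How many 16-bit char values Java needs to hold this code point (1 or 2).
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    // Forcing a code point into one char keeps only the low 16 bits,
    // so anything above U+FFFF silently becomes a different value.
    static char truncateToChar(int codePoint) {
        return (char) codePoint;
    }

    public static void main(String[] args) {
        System.out.println("Largest char value: " + (int) Character.MAX_VALUE);   // 65535
        System.out.println("Largest code point: " + Character.MAX_CODE_POINT);    // 1114111

        int emoji = 0x1F600; // sample code point outside the BMP
        System.out.println("chars needed for U+1F600: " + charsNeeded(emoji));
        System.out.printf("U+1F600 cast to char: U+%04X%n", (int) truncateToChar(emoji));
    }
}
```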
| E*V Posts: 17544 | 6 As far as I remember, that sounds right.
【 In reply to Z****e 】
| s******n Posts: 876 | 7 Older versions of Java supported older versions of Unicode.
But who cares about bits and bytes in Java? That's the wrong way to look at it.
【 In reply to Z****e 】
| c******n Posts: 4965 | 8 A Unicode code point value can be larger than 65535.
Anyway, that book made a conceptual error: it's meaningless to talk about
the "length" of Unicode. A code point is just a number; when we talk about length,
it's always the *encoded* length under a charset.
A Java char can only hold a BMP code point or one surrogate (half of a surrogate
pair); that's what the book should say.
【 In reply to g*****g 】
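A short Java sketch of the BMP-versus-surrogate point above. It is not from the thread; `SurrogateDemo` and its helper names are hypothetical. It builds a string from one code point outside the BMP and compares the number of `char` units with the number of code points.

```java
// Hypothetical demo class: shows that one code point outside the BMP
// occupies two Java chars (a surrogate pair).
final class SurrogateDemo {

    // Builds a String holding exactly one code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Number of 16-bit char units in the string.
    static int charUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // True if the string is exactly one high surrogate followed by one low surrogate.
    static boolean isSurrogatePair(String s) {
        return s.length() == 2 && Character.isSurrogatePair(s.charAt(0), s.charAt(1));
    }

    public static void main(String[] args) {
        String emoji = fromCodePoint(0x1F600); // sample code point outside the BMP
        System.out.println("char units:  " + charUnits(emoji));
        System.out.println("code points: " + codePoints(emoji));
        System.out.println("surrogate pair: " + isSurrogatePair(emoji));
        System.out.printf("decoded back: U+%X%n", emoji.codePointAt(0));
    }
}
```

When text may contain supplementary characters, iterating by code point (for example with `codePointAt` and `Character.charCount`) avoids splitting a surrogate pair.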