Character literals?

 
jon80



Joined: 28 May 2009
Posts: 24

Posted: Tue Jul 17, 2012 7:01 am    Post subject: Character literals?

Why is the following incorrect syntax?

class CharactersInPlay
{
    public static void main(String[] args)
    {
        char a = 'u\16848'; // I tried escaping, i.e. 'u\\16848', and also other characters
        System.out.println(a);
    }
}

References
Bamum Supplement Unicode chart: http://www.unicode.org/charts/PDF/U16800.pdf
SCJP 1.6 Exam Prep by K. Sierra and B. Bates, p. 189
_________________
Jon
ben_josephs



Joined: 02 Mar 2003
Posts: 2360

Posted: Tue Jul 17, 2012 8:40 am    Post subject:

Java's chars are 16 bits (4 hex digits) wide, enough to hold the characters of the basic multilingual plane.

If you need to use wider ("supplementary") characters you'll have to use surrogate pairs.

Or ints.
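
A minimal sketch of both options, assuming the character wanted is U+16848 from the chart cited in the question. The class name and the toUtf16 helper are made up for illustration; Character.toChars is the standard library call that does the conversion.

```java
class SupplementaryAsInt
{
    // Hypothetical helper: returns the UTF-16 code units for one code point.
    static char[] toUtf16(int codePoint)
    {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args)
    {
        int codePoint = 0x16848;             // too wide for a 16-bit char, so keep it in an int
        char[] units = toUtf16(codePoint);   // high surrogate followed by low surrogate

        System.out.println(units.length);                    // 2
        System.out.println(Integer.toHexString(units[0]));   // d81a
        System.out.println(Integer.toHexString(units[1]));   // dc48

        // A String can hold the pair. Whether the glyph is displayed
        // depends on the console encoding and the installed fonts.
        String s = new String(units);
        System.out.println(s);
    }
}
```

The int form is convenient for passing a single code point around; the char[] or String form is what most text APIs expect.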
jon80



Joined: 28 May 2009
Posts: 24

Posted: Tue Jul 17, 2012 8:43 am    Post subject:

ben_josephs wrote:
Java's chars are 16 bits (4 hex digits) wide, enough to hold the characters of the basic multilingual plane.

If you need to use wider ("supplementary") characters you'll have to use surrogate pairs.

Or ints.


UTF-16 (16-bit Unicode Transformation Format) is a character encoding for Unicode capable of encoding 1,112,064 numbers (called code points) in the Unicode code space from 0 to 0x10FFFF. It produces a variable-length result of either one or two 16-bit code units per code point.
http://en.wikipedia.org/wiki/UTF-16/UCS-2#Code_points_U.2B10000..U.2B10FFFF

A code point is a code value that is associated with a character in an encoding scheme. In the Unicode standard, code points are written in hexadecimal and prefixed with U+, such as U+0041 for the code point of the letter A. Unicode has code points that are grouped into 17 code planes.

The first code plane, called the basic multilingual plane, consists of the classic Unicode characters with code points U+0000 to U+FFFF. Sixteen additional planes, with code points U+10000 to U+10FFFF, hold the supplementary characters.

Core Java Vol. I (p. 43)
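
The plane arithmetic in the passage above can be sketched as follows. This is only an illustration: the class name and the planeOf helper are invented here, and the helper assumes it is given a valid code point (0 to 0x10FFFF). Character.isSupplementaryCodePoint and Character.charCount are standard library methods.

```java
class CodePlanes
{
    // Each plane spans 0x10000 code points, so the plane number
    // is the code point shifted right by 16 bits.
    static int planeOf(int codePoint)
    {
        return codePoint >>> 16;
    }

    public static void main(String[] args)
    {
        int[] samples = { 0x0041, 0xFFFF, 0x10000, 0x16848, 0x10FFFF };
        for (int cp : samples)
        {
            // charCount reports how many 16-bit code units UTF-16 needs.
            System.out.printf("U+%04X plane=%d supplementary=%b codeUnits=%d%n",
                cp, planeOf(cp),
                Character.isSupplementaryCodePoint(cp),
                Character.charCount(cp));
        }
    }
}
```

U+0041 and U+FFFF fall in plane 0 and need one code unit each; U+16848 falls in plane 1 and needs two.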

Can you provide an example?
_________________
Jon
ben_josephs



Joined: 02 Mar 2003
Posts: 2360

Posted: Tue Jul 17, 2012 8:58 am    Post subject:

Indeed. The page you refer to also explains
Code points from the other planes (called Supplementary Planes) are encoded in UTF-16 by pairs of 16-bit code units called a surrogate pair...

Did you read the whole of that page or the Oracle Java page it refers to?
jon80



Joined: 28 May 2009
Posts: 24

Posted: Tue Jul 17, 2012 9:02 am    Post subject:

ben_josephs wrote:
Indeed. The page you refer to also explains
Code points from the other planes (called Supplementary Planes) are encoded in UTF-16 by pairs of 16-bit code units called a surrogate pair...

Did you read the whole of that page or the Oracle Java page it refers to?

Yes, but I am down with coffee today and require an example, if possible.
_________________
Jon
ben_josephs



Joined: 02 Mar 2003
Posts: 2360

Posted: Tue Jul 17, 2012 9:20 am    Post subject:

How you do it depends on the requirements of whatever function you're going to pass it to.

And I've never done it, so I don't have an example.
jon80



Joined: 28 May 2009
Posts: 24

Posted: Tue Jul 17, 2012 9:21 am    Post subject:

ben_josephs wrote:
How you do it depends on the requirements of whatever function you're going to pass it to.

And I've never done it, so I don't have an example.


Oh, I see. Well, I am reading the document at http://www.unicode.org/versions/Unicode6.1.0/ch02.pdf.
_________________
Jon
MudGuard



Joined: 02 Mar 2003
Posts: 1254
Location: Munich, Germany

Posted: Tue Jul 17, 2012 1:16 pm    Post subject: Re: Character literals?

jon80 wrote:
Why is the following incorrect syntax?
char a = 'u\16848'; //I tried escaping i.e. 'u\\16848', and, also


First of all, it is an escaped u that starts a Unicode character:
'\u0064'

Codes above \uFFFF must be encoded in two parts.
For details see
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/

(Whether a Java char or a Character can hold Unicode characters above \uFFFF I don't know - I know they can be used in a String.)
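
A rough sketch of the "two parts" approach, again assuming U+16848 is the target. Since a char is only 16 bits wide, a single char cannot hold a code point above U+FFFF, but a String can hold the pair. The class name and the highPart/lowPart helpers are hypothetical; they just spell out the UTF-16 surrogate arithmetic by hand.

```java
class SurrogatePairLiteral
{
    // High (first) part: 0xD800 plus the top 10 bits of (codePoint - 0x10000).
    static char highPart(int codePoint)
    {
        return (char) (0xD800 + ((codePoint - 0x10000) >> 10));
    }

    // Low (second) part: 0xDC00 plus the bottom 10 bits of (codePoint - 0x10000).
    static char lowPart(int codePoint)
    {
        return (char) (0xDC00 + ((codePoint - 0x10000) & 0x3FF));
    }

    public static void main(String[] args)
    {
        char d = '\u0064';            // a BMP character fits in one char
        String s = "\uD81A\uDC48";    // U+16848 written as two escapes

        System.out.println(d);                                       // d
        System.out.println(s.length());                              // 2
        System.out.println(s.codePointCount(0, s.length()));         // 1
        System.out.println(Integer.toHexString(s.codePointAt(0)));   // 16848
        System.out.println(highPart(0x16848) == s.charAt(0));        // true
        System.out.println(lowPart(0x16848) == s.charAt(1));         // true
    }
}
```

Note that length() counts 16-bit code units, not characters, which is why it reports 2 for a single supplementary character.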
jon80



Joined: 28 May 2009
Posts: 24

Posted: Tue Jul 17, 2012 1:27 pm    Post subject: Re: Character literals?

Okay, I will have a look, thanks.
_________________
Jon
All times are GMT