string constants

tsm
How do we push a character from a string onto the stack?
E.g., if I want to say 'push constant a', how would I write 'a' as a number?
Maybe what I am really asking is how to refer to char constants.

Re: string constants

ybakos
I hesitate to say "you can't," but I believe doing so would require modifying the VM specification.

By design, the only constants you can push onto the stack are numeric values.

What are you trying to accomplish, exactly? Are you working on a Jack program?

(edit: I may have misinterpreted the original question - see Mark's answer below.)

Re: string constants

cadet1620
Administrator
In reply to this post by tsm
tsm wrote
How do we push a character from a string onto the stack?
E.g., if I want to say 'push constant a', how would I write 'a' as a number?
Maybe what I am really asking is how to refer to char constants.
You need to push the numeric character value. For instance, to push an 'A', your compiler would write "push constant 65".

For characters in the range space to '~', this is the standard ANSI character number (32 - 126) used by C, Java, Python, etc.

Characters outside that range are not allowed in Jack strings.

--Mark
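
As a rough illustration of the answer above, here is a minimal Python sketch of what a compiler's code generator might do. The helper names push_char and push_string are hypothetical. The String.new / String.appendChar sequence is an assumption based on the usual Jack OS String API; it is not something stated in this thread.

```python
from typing import List


def push_char(ch: str) -> str:
    """Return the VM command that pushes one character's numeric code."""
    code = ord(ch)
    # Per the post above, only space (32) through '~' (126) are allowed.
    if not 32 <= code <= 126:
        raise ValueError(f"character {ch!r} is outside the range 32-126")
    return f"push constant {code}"


def push_string(text: str) -> List[str]:
    """Hypothetical helper: VM commands that build a string constant.

    Assumes the Jack OS routines String.new(maxLength) and
    String.appendChar(c), called one character at a time.
    """
    commands = [f"push constant {len(text)}", "call String.new 1"]
    for ch in text:
        commands.append(push_char(ch))
        commands.append("call String.appendChar 2")
    return commands


print(push_char("A"))  # push constant 65
```

Calling push_string("Hi") under these assumptions would yield one push/appendChar pair per character, using the codes 72 and 105.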

Re: string constants

ybakos
Doesn't the Jack platform adhere to UTF rather than ANSI? (More than just the ANSI range is allowed?)


Re: string constants

cadet1620
Administrator
ybakos wrote
Doesn't the Jack platform adhere to UTF rather than ANSI? (More than just the ANSI range is allowed?)
The book says "unicode", but I think I'd call it:

    An unspecified 16-bit character set, with only the code points 32 - 129 defined.
    Code points 32-127 are defined as corresponding to ASCII characters in that range.
    Code point 128 is defined as the "new line" control code.
    Code point 129 is defined as the "backspace" control code.

It's not Unicode; new line and backspace are in violation of the Unicode specification.

--Mark
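
The character set described in this post can be sketched as a small lookup in Python. The function name jack_char_code is hypothetical, and translating the host's "\n" and "\b" into 128 and 129 is purely an illustrative choice, not something the book or this thread prescribes.

```python
NEWLINE = 128    # "new line" control code, per the post above
BACKSPACE = 129  # "backspace" control code, per the post above


def jack_char_code(ch: str) -> int:
    """Map a host character to the code point described in this post."""
    if ch == "\n":  # assumption: host newline maps to 128
        return NEWLINE
    if ch == "\b":  # assumption: host backspace maps to 129
        return BACKSPACE
    code = ord(ch)
    # 32-127 are taken to coincide with ASCII, as the post states.
    if 32 <= code <= 127:
        return code
    raise ValueError(f"no code point defined for {ch!r}")


print(jack_char_code("A"), jack_char_code("\n"), jack_char_code("\b"))
```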

Re: string constants

Dennis
@cadet1620 Why do you call the code points 32–127 a subset of ANSI? I think it's better to speak here of good old plain ASCII: first, because this is how Unicode is defined, and second, because the critical differences between the ANSI encodings (ISO 8859-1, Windows-1252, …) are all in how they use the ›upper part of the byte‹ (128–255).

I think it makes more sense to speak of a custom 8-bit (or 16-bit, if you want) charset that mostly agrees with ASCII.

Besides, a single 16-bit memory cell of the Hack computer isn't capable of representing an arbitrary Unicode code point: the Unicode Standard can define up to 1,114,112 different code points (U+0000–U+10FFFF), so one needs at least ceil(log_2(0x10ffff)) = ceil(20.087) = 21 bits.
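
The arithmetic in the last paragraph can be checked with a few lines of Python. This is only a sketch; the constant names are made up for illustration, and HACK_WORD_BITS assumes the 16-bit word size of the Hack platform.

```python
HACK_WORD_BITS = 16        # assumption: width of one Hack memory cell
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point

# Code points run from U+0000 through U+10FFFF inclusive.
code_point_count = MAX_CODE_POINT + 1

# int.bit_length() gives the number of bits needed for the largest value.
bits_needed = MAX_CODE_POINT.bit_length()

print(code_point_count)               # 1114112
print(bits_needed)                    # 21
print(bits_needed <= HACK_WORD_BITS)  # False
```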

Re: string constants

cadet1620
Administrator
Dennis wrote
@cadet1620 Why do you call the code points 32–127 a subset of ANSI?
Insufficiently caffeinated fingers this morning... 8-(

FWIW, quoted from the Unicode spec:
    The first 256 codes follow precisely the arrangement of ISO/IEC 8859-1 (Latin
    1), of which 7-bit ASCII (ISO/IEC 646 IRV) accounts for the first 128 code
    positions.
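
A quick way to see the quoted statement in practice is the following Python sketch, which relies only on the standard "latin-1" and "ascii" codecs. The variable names are arbitrary.

```python
# Decode every possible byte value as ISO/IEC 8859-1 (Latin-1).
latin1_text = bytes(range(256)).decode("latin-1")

# Each byte value should land on the Unicode code point of the same number.
latin1_matches = all(ord(ch) == i for i, ch in enumerate(latin1_text))

# The first 128 positions should coincide with 7-bit ASCII.
ascii_text = bytes(range(128)).decode("ascii")
ascii_matches = ascii_text == latin1_text[:128]

print(latin1_matches, ascii_matches)
```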

(Back when I started with computers, bit 7 was a character parity bit, and character sets varied a bit from vendor to vendor.  Standards make the world a better place these days.  Unless you need to deal with EBCDIC!)

--Mark