Faster Character Drawing

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Faster Character Drawing

cadet1620
Administrator
After finishing the OS, most students run their game with their OS. Most find that their OS is quite a bit slower than the supplied OS when it comes to printing text on the screen.

Many student OSes draw the font one pixel at a time using Screen.setColor() and Screen.drawPixel(). This is quite slow because every call to drawPixel() requires multiplication and division.

It is much faster to write the font data 8 bits at a time. After locating the starting screen word for the character (one multiplication and division), the addresses for the remaining character rows only require addition. The font data for each row is written using two bit masking operations.

The bit masking is required because there are parts of two characters in each screen word. For example, here are the characters "12" in row 6, columns 8 and 9:

Address    Bit number      Hex   Dec
                  111111
        0123456789012345
18500   ..##.....####...   1E0C  3102
18532   .###....##..##..   330E  3635
18564   ####........##..   300F  3888
18596   ..##.......##...   180C  3096
18628   ..##......##....   0C0C  3084
18660   ..##.....##.....   060C  3078
18692   ..##....##......   030C  3075
18724   ..##....##..##..   330C  3123
18756   ######..######..   3F3F 16191
18788   ................
18920   ................
(Remember from writing drawPixel() that the LSB of the screen word is the leftmost pixel.)

Bit Masking

A bit mask is a pattern of 1s or 0s that is used to set or clear a portion of a data word. For example, when drawing characters to the screen, they will need to be written only to the top half or bottom half of a word so that they will not interfere with the character stored in the other half of the word.

Bits can be set in a data word by using the OR operation. OR the data with a mask with 1s in the positions to be set

data  11000110
mask  00011100  set bits 2,3,4
OR    11011110
Bits can be cleared in a data word by using the AND operation. AND the data with a mask with 0s in the positions to be cleared
data  00101110
mask  11100011  clear bits 2,3,4
AND   00100010
It is common to have a single mask value and use its inverse when clearing bits:
bits234 = 0b00011100;
...
foo = foo | bits234;    // set bits
bar = bar & ~bits234;   // clear bits

Direct RAM Access

It is faster to use an Array as a pointer to access RAM than to to call Memory.peek() and Memory.poke(). To access the screen, you can set the Array's address to 16384. That way you will not need to keep adding 16384 to your address calculations:
class Screen {
    static Array screen;
    ...
    function void init()
        let screen = 16384;
        ...
    function void drawChar(...)
        ...
        let ch_addr = ...
        let w = screen[ch_addr];    // Access RAM[16384 + ch_addr]
        ...
This is particularly important if you are translating to ASM. The code for the call and return VM commands is a lot longer than an array access.

Drawing Characters

When drawing a character into an even column, the font data needs to be written into the lower half-word so the clear mask needs to be 1111111100000000, and the font data is the set mask.

It is trickier for an odd column when the character must be drawn into the upper half-word. The clear mask needs to be 0000000011111111, but the font data can not be used directly as the set mask because it is located in the lower half-word. To be used as the set mask, the font data needs to be moved to the upper half-word:

00000000..data..  -->  ..data..00000000
Since Jack does not have any shifting operators, the easiest way to do this is using multiplication. A lot harder to type, but possibly faster would be 8 doublings which would result in 8 repetitions of VM code "push n; push n; add; pop n". Not good; the idea here is to get rid of slow code!

Fortunately, there is an easy way to solve this.

The upper half-word of all the font data is 0, so when the font table is created, that is when to do the multiplication to create the font bits for the upper half-word characters. This multiplication now occurs 11 times during each character's creation instead of 11 times every time a character is drawn.

Assuming that you started with the supplied Output.jack, your Output.create() function is storing the font data as

function void create(int index, int a, ...)
...
let map[0] = a;
...
You need to set the upper bits here too. The obvious way is to use a + a * 256. A little algebra gives the more efficient a*257.

Now that there is data in the upper half-word of the font table, you need to mask the font data appropriately before you OR it into the screen. The mask for the font data is all 1s for the half-word that is required. Here is the odd column example using the augmented font data:

clear_mask = 255;           // 0x00FF
font_mask = ~clear_mask;    // 0xFF00
    // for each font row
    font_data = font[row];
    font_data = font_data & font_mask;
    Screen[addr] = Screen[addr] & clear_mask;
    Screen[addr] = Screen[addr] | font_data;

(The masking code is shown with one operation per line for
clarity. JackCompiler will generate more efficient code if it
is a single expression.)

    
font_data   0x3636
font_mask   0xFF00
       AND  0x3600
       
Screen      0x1E0C
clear_mask  0x00FF
       AND  0x000C
                  
        OR  0x360C

Drawing characters to the lower half-word is identical, except that the values of clear_mask and font_mask are swapped.

Reply | Threaded
Open this post in threaded view
|

Re: Faster Character Drawing

ybakos
Thank you as always for another fantastic write-up. Super useful!