Assembly in Basic II – Simple utility function
As we all know, Assembly programs perform most functions much
faster than Basic programs, because Assembly is translated into machine code
ahead of time while Basic is interpreted, and because it lets us optimize code
closer to the actual hardware.
On the other hand, coding in Assembly can be much more time
consuming, and complex applications can become very long and complicated to
manage.
So in our journey into Commander X16 programming we will
jump into it only halfway. Basic is fast enough for many types of programs,
even games. However, there are functionalities that are impossible to write in
Basic in a way that executes fast enough for smooth gameplay. So for our next
challenge we will write one pretty simple function in Assembly and learn how to
compile it, load it from a Basic program and call it from Basic in order to
improve the overall performance of our program.
An area where Basic is lacking the most, and coincidentally also
pretty easy to demonstrate, is almost anything related to the display. In fact
one of the first issues when writing a game in Basic is how to fill large areas
of the screen with some pattern, or even just how to clear it. So I decided to
write a simple RectFill function to demonstrate such a Basic-Assembly hybrid program.
RectFill Function
The requirements for the function are pretty simple. We
should be able to draw a rectangle filled with any PETSCII character and any
Color Attribute by defining the starting x and y position, the width
and the height. The same function can therefore be used to clear the screen
without using PRINT CHR$(147), and at the same time we can change the foreground
and background color of the screen. We can also change the pattern in just parts
of the screen, or even draw character based vertical or horizontal lines
and stripes of different colors.
To achieve that we will look at how to pass parameters to an Assembly
function from Basic, how to load an Assembly program and how to call it. Oh, and
of course we will also write the function and compile it using the Visual Studio
Code editor and the Retro Assembler we set up in Part I.
It would also be very useful to read the blog post Direct
Vera Access, where we describe how to access video memory from Assembly (or
Basic) directly.
Passing parameters
The simplest and easiest way to pass any kind of data from
Basic to Assembly and back is to use the memory directly. As we know, we can use
the POKE command in Basic to write to any address the CPU has access to, and with
PEEK we can read from any address. Obviously Assembly programs have the
same reach. Of course not all memory is created equal. We will not go into
details about the memory map and how to utilize banking and similar advanced
features. However, there is one area of the memory that is especially interesting.
The 6502 family of processors uses the so called Zero page memory for some very
convenient and fast addressing modes. Zero page means addresses from $00 to $FF
hex, or in other words addresses that can be reached by using a single
byte as an address. Such an address is faster for the CPU to fetch and therefore
faster to execute. In our example that is not really that important but I thought it
is a good place to mention it. You are of course free to use some other memory
location to pass parameters by modifying the sample code.
Based on current documentation, the area that is not used by
Kernal or any other system function and is therefore free for us to use is from
$02 to $7F. So let’s decide on the following addresses:
Address | Value |
---|---|
$02 | X of the start point |
$03 | Y of the start point |
$04 | Width of the rectangular area |
$05 | Height of the rectangular area |
$06 | Character code to be written into Video memory |
$07 | Attribute value to be written into Video memory |
With that decided it is very simple to pass parameters from
Basic. We use POKE commands to set everything up. For example, to draw a
checkered square in the center of the screen in the Default Screen Mode
$02, starting at location 20,10, with a width of 40 and a height of 40 characters,
filled with the checkered character in White ink on a Black background:
POKE $02,20
POKE $03,10
POKE $04,40
POKE $05,40
POKE $06,102
POKE $07,$01
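The parameter block can also be modelled outside the X16 to sanity-check values before typing them in. The following is only a sketch in Python: the function name `poke_lines` and the 80x60 screen size are assumptions made for illustration, and rejecting a zero width or height is a guess based on the count-down loops described later, not something the original listing confirms.

```python
# Hypothetical helper, not part of the original project: it builds the six
# POKE statements for one RectFill call and checks the values first.
FIRST_PARAM = 0x02  # $02..$07 hold x, y, width, height, character, attribute


def poke_lines(x, y, width, height, char, attr, cols=80, rows=60):
    """Return the BASIC POKE statements for one RectFill call.

    cols/rows default to an assumed 80x60 text screen.
    """
    values = [x, y, width, height, char, attr]
    if any(not 0 <= v <= 255 for v in values):
        raise ValueError("every parameter must fit in one byte")
    if width < 1 or height < 1:
        # Assumption: a counter that starts at 0 would wrap around in a
        # decrement-until-zero loop, so 0 is rejected here.
        raise ValueError("width and height must be at least 1")
    if x + width > cols or y + height > rows:
        raise ValueError("rectangle does not fit on the screen")
    return [f"POKE ${FIRST_PARAM + i:02X},{v}" for i, v in enumerate(values)]


for line in poke_lines(20, 10, 40, 40, 102, 1):
    print(line)
```

Called with the values from the example above, it prints the same six POKE statements (with the attribute written as decimal 1 instead of $01).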
Assembly Code
Let’s write the code that will read these parameters and
write to VERA registers that will result in changes in Video Memory. For easier
readability let’s first define some Labels and assign them constant values:
Listing 1 = Labels
The first group of labels is used for addressing VERA
registers. Three registers hold the 20 bit long address. Then in line 4 we have
the location of the Data 0 register and finally also the VERA Control register.
In the second group of labels we define the memory locations
in Zero Page where we will read parameters from. Those are the same locations we
discussed above and for which we already prepared some POKEs to write proper values in.
Before starting to write the code we have to talk some more
about memory map in order to decide where to put our Assembly program. In the
official Commander X16 documentation we see the following map:
Addresses | Description |
---|---|
$0000-$007F | User zero page |
$0080-$00FF | KERNAL and BASIC zero page variables |
$0100-$01FF | CPU stack |
$0200-$03FF | KERNAL and BASIC variables, vectors |
$0400-$07FF | Available for machine code programs or custom data storage |
$0800-$9EFF | BASIC program/variables; available to the user |
As we see, there is a section perfect for our needs. We obviously want to leave the Basic memory free for the Basic part of our program, we definitely don’t want to mess with any of the system programs like Kernal or Basic, and Zero Page is to be used sparingly. That leaves the area from $0400 to $07FF. That is 1024 bytes of space, which is more than enough for such a simple function and is safe from being overwritten by Basic programs.
So we will use the following syntax to tell our compiler
that we want the program to start at $0400:
* = $0400
Let’s look at the whole Assembly program now:
Before we go into details let’s take a quick glance at the
structure. As we see, we have two labels, one for Columns (col) and the other for
Rows (row). So clearly we have (as expected) a double loop to draw the filled
rectangle. The inner loop draws the required number of characters horizontally as
defined in the Width parameter, and the outer loop makes sure we draw the required
number of rows as set by the Height parameter.
Now let’s analyze the code line by line…
Line 19 simply sets the VERA control register to 0 meaning
we don’t want to reset it and we want to use Data Register 0 for data transfer
to Video Memory.
To write the characters to Video memory we have to calculate
the starting address. Remember how in Basic we used the formula Y*256+X*2 with
VPOKE? The same thing has to be done in Assembly. The only difference is
that in Assembly we have to deal with each byte separately since we can’t
manipulate 16 bit values directly. Therefore we reserved two bytes in Zero Page
memory where we will store the calculated 16 bit address inside Video memory.
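To make the byte split concrete, here is a small Python sketch of the same arithmetic. The function name `screen_address` is made up for this illustration; the formula is the Y*256+X*2 one from the text.

```python
def screen_address(x, y):
    """Split the video memory address of text cell (x, y) into two bytes.

    Uses the formula from the text: address = Y*256 + X*2.
    Returns (low_byte, high_byte).
    """
    address = y * 256 + x * 2
    low = address & 0xFF          # bits 0-7
    high = (address >> 8) & 0xFF  # bits 8-15
    return low, high


low, high = screen_address(20, 10)
print(f"low=${low:02X} high=${high:02X}")
```

For the example rectangle starting at 20,10 the address is 10*256+20*2 = 2600 = $0A28, so the low byte is $28 (which is simply X*2) and the high byte is $0A (which is simply Y). As long as X*2 stays below 256, the two bytes never influence each other.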
In lines 21-23 we calculate the Low byte of the address. We transfer
the X parameter value into the Accumulator register, multiply it by two (like in
Basic) and store it into the Low Byte address. Multiplication by two is done
by shifting the bits one position to the left. Because the maximum X is 79,
the highest bit is always 0 and nothing is lost in the shift. The listing uses
the Rotate Left (rol) instruction, which moves the Carry flag into bit 0, so it
only doubles the value when Carry is clear on entry. Arithmetic Shift Left (asl)
always shifts a 0 into bit 0, takes the same number of CPU cycles and needs no
extra instruction to clear Carry, so it is the safer choice here.
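The difference between the two shift instructions is easy to check with a small model. This Python sketch mimics what asl and rol do to an 8 bit Accumulator value; the function names are just illustrative.

```python
def asl(value):
    """Arithmetic Shift Left: bit 0 becomes 0, old bit 7 goes to Carry."""
    result = (value << 1) & 0xFF
    carry_out = (value >> 7) & 1
    return result, carry_out


def rol(value, carry_in):
    """Rotate Left: old Carry goes into bit 0, old bit 7 goes to Carry."""
    result = ((value << 1) | carry_in) & 0xFF
    carry_out = (value >> 7) & 1
    return result, carry_out


print(asl(20))     # doubling, independent of Carry
print(rol(20, 0))  # same result only because Carry was clear
print(rol(20, 1))  # Carry set: the result is off by one
```

With X = 20, asl gives 40 regardless of Carry, while rol gives 40 with Carry clear and 41 with Carry set. That is why asl does not need a preceding clc and rol does, unless the state of Carry is already known.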
In lines 24 and 25 we “calculate” the High address byte. As
you see there is really no calculation required because of how Screen modes 0
and 2 are set up: each line uses 256 bytes, so by writing Y to the High byte we
“automatically” multiply it by 256, which is very convenient indeed.
We will have to keep track of rows and columns, so in lines 27-29
we load Width into register X. Because we only have three registers and will
need two of them later, we store Height in Zero Page location $0A (labeled as
Ycounter).
In lines 31-32 we tell the VERA chip that we will be using
increment 1 when writing to the Data transfer register. It means that the Video
memory pointer will be incremented by one after every write (or read if we
were reading from it). This is because we decided that we will be writing
both the character code and the attribute. We could easily modify this function to
change just the character or just the attribute. In that case the appropriate value
for the increment would be two.
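The effect of the increment setting can be illustrated with a toy model of the data register. This is a Python sketch only: the class `DataPort` and the two helper functions are hypothetical names, and the real VERA has more details than this model covers. It assumes the cell layout used throughout this post, with the character byte first and the attribute byte right after it.

```python
class DataPort:
    """Toy model of a data register whose address advances after each write."""

    def __init__(self, vram, address, increment):
        self.vram = vram
        self.address = address
        self.increment = increment

    def write(self, value):
        self.vram[self.address] = value & 0xFF
        self.address += self.increment  # the auto-increment step


def fill_cells(count, char, attr):
    """Increment 1: write character and attribute for every cell."""
    vram = bytearray(count * 2)
    port = DataPort(vram, 0, 1)
    for _ in range(count):
        port.write(char)
        port.write(attr)
    return vram


def recolor_cells(vram, attr):
    """Increment 2: start at the first attribute byte, skip the characters."""
    port = DataPort(vram, 1, 2)
    for _ in range(len(vram) // 2):
        port.write(attr)
    return vram


cells = fill_cells(3, 102, 1)
print(list(cells))
print(list(recolor_cells(cells, 5)))
```

With increment 1 the writes alternate between character and attribute bytes. With increment 2 and a start address one byte higher, only the attribute bytes are touched and the characters stay as they were.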
Next we start the first loop. In lines 34-37 we transfer the
starting memory address of the first character of the row we write to. In the
beginning it is of course the first row, but the address will later be incremented
for each subsequent row.
Next we have to load the values we will be writing to
screen. We load character code into Accumulator and the Attribute value into
Index Y register. We do that in lines 39-40.
In lines 42-45 we do the inner loop. We simply write the
character code and the attribute to VERA data register 0 and let VERA increment the
address automatically. We just need to make sure we do the correct number of
iterations, and we do that by decrementing the Index X register and returning to the
beginning of the inner loop until it becomes 0.
The remaining code in lines 47-50 prepares for the next
row. We increment the memory location by 256 by incrementing the High byte by 1,
load the Index X register with a fresh Width value, decrement the height counter
and return to the beginning of the outer loop if we haven’t reached 0.
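To see the whole flow in one place, here is a Python model of the double loop as described above. It is a sketch of the algorithm, not a translation of the original listing: the function name `rect_fill` and the plain `bytearray` standing in for Video memory are assumptions for illustration.

```python
ROW_STRIDE = 256  # bytes per text row, as described in the text


def rect_fill(vram, x, y, width, height, char, attr):
    """Fill a rectangle of text cells, mirroring the row/col loops."""
    low = (x * 2) & 0xFF    # Low byte of the start address
    high = y                # High byte: Y is already "multiplied" by 256
    for _ in range(height):             # outer loop: one pass per row
        address = high * ROW_STRIDE + low
        for _ in range(width):          # inner loop: one pass per column
            vram[address] = char        # first write: character code
            vram[address + 1] = attr    # second write: attribute
            address += 2                # two auto-increments of 1
        high += 1                       # next row = address + 256
    return vram


screen = bytearray(60 * ROW_STRIDE)  # assumed 60 rows of 256 bytes
rect_fill(screen, 20, 10, 40, 40, 102, 1)
print(screen.count(102))  # number of cells that received the character
```

For the 40 by 40 example the model writes 1600 cells. The first one lands at 10*256+40 and the last one at 49*256+118, which matches the address formula used earlier.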
Line 52 returns control back to Basic program.
Compiling the code
If you set up the environment as described in the previous post,
compiling should be pretty straightforward. There is one more thing we didn’t
talk about. The Retro Assembler is able to compile Assembly for several types
of CPUs, including 6502 varieties. In our code we used some commands that are
specific to the 65C02, so we have to tell that to the compiler. There are several
ways to do it, but I like the approach of including it in the name by adding the
extension .65C02.asm to the file.
So open Visual Studio Code, copy the Assembly source
code from Github or from above and paste it into the editor. Save the file as
RectFill.65C02.asm and, using the shortcuts you defined (I use F4 and F5), compile
the file. You should get a .prg file that is 57 bytes long, with the first two bytes
containing the address $0400 so the Commander X16 knows where to load it.
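If you want to double check the compiled file on your PC, the two byte header is easy to inspect. The Python sketch below uses a hypothetical helper `prg_info` and a dummy 57 byte buffer in place of the real file, since the actual machine code bytes are not reproduced here. It assumes the usual low byte first order for the load address.

```python
def prg_info(data):
    """Return (load_address, code_size) for the contents of a .prg file.

    The first two bytes are the load address, low byte first;
    everything after them is the program itself.
    """
    if len(data) < 2:
        raise ValueError("a .prg file needs at least the 2 byte header")
    load_address = data[0] | (data[1] << 8)
    return load_address, len(data) - 2


# Dummy stand-in for RectFill.prg: header $00 $04 plus 55 placeholder bytes.
sample = bytes([0x00, 0x04]) + bytes(55)
address, size = prg_info(sample)
print(f"loads at ${address:04X}, {size} bytes of code")
print("fits in $0400-$07FF:", address >= 0x0400 and address + size - 1 <= 0x07FF)
```

To check a real file, read it with `open("RectFill.prg", "rb").read()` and pass the result to `prg_info`. A 57 byte file means 55 bytes of code, which fits comfortably in the 1024 bytes available from $0400.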
If you haven’t set up the Assembly environment yet, I will
provide a link to download the compiled file below.
Loading and calling the Function from Basic
We have several ways to load the Assembly program into the
Commander X16 Emulator. We can start the Emulator with the parameter -prg and it
will automatically load the Assembly program at the desired location. The actual
command line will of course depend on your environment, but if you put the compiled
prg file in the same directory as the Emulator executable the simplest call would
be:
x16emu -prg RectFill.prg
Because we encoded the memory start address in the file
itself, the emulator knows where to load it to.
We can test it by setting the parameters using the 6 POKE
commands from above and calling the Assembly code using:
SYS $0400
And we should see the following screen:
Alternatively we can also load the Assembly program from
Basic directly using a Load command:
LOAD "RECTFILL.PRG",8,1
I haven’t had a chance to go very deep into the LOAD command, so
I am not sure of all the parameters and what they mean. I expected that the default
device should be 1 but that doesn’t work. With the current version of the Emulator
(36) the above version works.
To get a feel for the speed of the function we wrote, let’s
make a simple Basic program that will fill random rectangles on the screen in a
loop. Source code is below:
And the result is as follows:
The speed is clearly a huge improvement that no trickery in
Basic can come close to, and with such a hybrid approach the possibilities for
game development open up significantly. I hope this will encourage you to
experiment some more, and I would like to see what cool Assembly functions you will come up with.
This didn't work for me, I just lose the cursor and the screen stays the same edit screen. I'm using Release 44 of the emulator.
PS D:\Retro Computing\Development\Commander x16\8bit coder> C:\\retroassembler\\retroassembler.exe "d:\Retro Computing\Development\Commander x16\8bit coder\RectFill.asm"
Retro Assembler V2022.1 -- Crafted by Peter Tihanyi with 8-Bit Love
(C) 2017-2023 Engine Designs in West Virginia, USA
Pre-processing the source code file... 41 lines of code loaded.
Compiling, 1st pass... (Processing)
Compiling, 2nd pass... (Finalizing)
Success. Size: 55 bytes ($37) in 0.048s
PS D:\Retro Computing\Development\Commander x16\8bit coder>