VERA Overview
VERA stands for Versatile Embedded Retro Adapter and consists of the following (quoted from the official documentation):
- Video generator featuring:
  - Multiple output formats (VGA, NTSC Composite, NTSC S-Video, RGB video) at a fixed resolution of 640x480@60Hz
  - Support for 2 layers, both supporting either tile or bitmap mode.
  - Support for up to 128 sprites.
  - Embedded video RAM of 128kB.
  - Palette with 256 colors selected from a total range of 4096 colors.
- 16-channel Programmable Sound Generator with multiple waveforms (Pulse, Sawtooth, Triangle, Noise)
- High quality PCM audio playback from a 4kB FIFO buffer featuring up to 48kHz 16-bit stereo sound.
- SPI controller for SecureDigital storage.
In this short introductory post we will discuss how to access VERA and write some very simple examples. We will explain the memory organization, test it from BASIC, and later translate the result into an assembly program. At the end is a breakdown with links to detailed descriptions of many of VERA's features.
When programming in BASIC we have a few handy commands that make access to graphics simple and seamless. For example, using VPOKE and VPEEK we can write to and read from any address in video memory. With VLOAD we can load chunks of data directly into video memory. So why would accessing it from machine code be any different?
To explain, let’s look at the (simplified) memory map of the Commander X16. This illustration describes the most important concept in programming VERA: to do anything with it, we can only go through 32 bytes located at memory locations $9F20 - $9F3F:
The CPU uses a 16-bit address bus and can therefore directly access 64 Kbytes of memory. We call this CPU memory. The Commander X16 actually has more memory, which the CPU can reach through “bank switching”, but it can still only address 64K directly at any given time.
On the other hand we have 128K of video memory (VRAM). That memory is completely separate from CPU memory, so the CPU has no direct access to it. The only way to manipulate the data in video memory, and therefore the content of the display, is through a fairly narrow window in CPU memory.
Currently 32 bytes are reserved for VERA communication, and those 32 bytes expose 36 VERA registers. To fit 36 registers into only 32 addresses, four of the addresses ($9F29 - $9F2C) have dual functionality that is switched by changing the DCSEL field in register 5. As the table below shows, setting DCSEL to 63 additionally exposes four version registers at the same addresses.
Let’s look at the registers that we can use:
Register | Address | Name | Bit layout (bit 7 down to bit 0) |
---|---|---|---|
0 | $9F20 | Address Lo | Bits 7-0: VRAM address bits 0 - 7 |
1 | $9F21 | Address Mid | Bits 7-0: VRAM address bits 8 - 15 |
2 | $9F22 | Address Hi | Bits 7-4: Address Increment; bit 3: Dec; bits 2-1: unused; bit 0: VRAM address bit 16 |
3 | $9F23 | Data 0 | Bits 7-0: Data Register 0 |
4 | $9F24 | Data 1 | Bits 7-0: Data Register 1 |
5 | $9F25 | Control | Bit 7: Reset; bits 6-1: DC Select (DCSEL); bit 0: Address Select |
6 | $9F26 | IEN | Bit 7: IRQ line bit 8; bit 6: Scan line bit 8; bits 5-4: unused; bit 3: AFLOW; bit 2: SPRCOL; bit 1: LINE; bit 0: VSYNC |
7 | $9F27 | ISR | Bits 7-4: Sprite Collisions; bit 3: AFLOW; bit 2: SPRCOL; bit 1: LINE; bit 0: VSYNC |
8 | $9F28 Write | IRQ Line | Bits 7-0: IRQ line bits 0 - 7 |
8 | $9F28 Read | Scan Line | Bits 7-0: Scan line bits 0 - 7 |
9-0 | $9F29 DCSEL=0 | Video | Bit 7: Current Field; bit 6: Sprite Enable; bit 5: Layer 1 Enable; bit 4: Layer 0 Enable; bit 3: NTSC/RGB 240p; bit 2: Chroma Disable; bits 1-0: Output Mode |
10-0 | $9F2A DCSEL=0 | Horizontal Scale | Bits 7-0: Active Display Horizontal Scale |
11-0 | $9F2B DCSEL=0 | Vertical Scale | Bits 7-0: Active Display Vertical Scale |
12-0 | $9F2C DCSEL=0 | Border Color | Bits 7-0: Border Color |
9-1 | $9F29 DCSEL=1 | Horizontal Start | Bits 7-0: Active Display Horizontal Start bits 2 - 9 |
10-1 | $9F2A DCSEL=1 | Horizontal Stop | Bits 7-0: Active Display Horizontal Stop bits 2 - 9 |
11-1 | $9F2B DCSEL=1 | Vertical Start | Bits 7-0: Active Display Vertical Start bits 1 - 8 |
12-1 | $9F2C DCSEL=1 | Vertical Stop | Bits 7-0: Active Display Vertical Stop bits 1 - 8 |
9-63 | $9F29 DCSEL=63 | DC_VER0 | Bits 7-0: Version: character 'V' |
10-63 | $9F2A DCSEL=63 | DC_VER1 | Bits 7-0: Version: major release |
11-63 | $9F2B DCSEL=63 | DC_VER2 | Bits 7-0: Version: minor release |
12-63 | $9F2C DCSEL=63 | DC_VER3 | Bits 7-0: Version: build number |
13 | $9F2D | Layer 0 - Config | Bits 7-6: Map Height; bits 5-4: Map Width; bit 3: T256C; bit 2: Bitmap Mode; bits 1-0: Color Depth |
14 | $9F2E | Layer 0 - Mapbase | Bits 7-0: Map Base Address bits 9 - 16 |
15 | $9F2F | Layer 0 - Tilebase | Bits 7-2: Tile Base Address bits 11 - 16; bit 1: Tile Height; bit 0: Tile Width |
16 | $9F30 | Layer 0 - H Scroll | Bits 7-0: Horizontal Scroll bits 0 - 7 |
17 | $9F31 | Layer 0 - H Scroll | Bits 7-4: unused; bits 3-0: Horizontal Scroll bits 8 - 11 |
18 | $9F32 | Layer 0 - V Scroll | Bits 7-0: Vertical Scroll bits 0 - 7 |
19 | $9F33 | Layer 0 - V Scroll | Bits 7-4: unused; bits 3-0: Vertical Scroll bits 8 - 11 |
20 | $9F34 | Layer 1 - Config | Bits 7-6: Map Height; bits 5-4: Map Width; bit 3: T256C; bit 2: Bitmap Mode; bits 1-0: Color Depth |
21 | $9F35 | Layer 1 - Mapbase | Bits 7-0: Map Base Address bits 9 - 16 |
22 | $9F36 | Layer 1 - Tilebase | Bits 7-2: Tile Base Address bits 11 - 16; bit 1: Tile Height; bit 0: Tile Width |
23 | $9F37 | Layer 1 - H Scroll | Bits 7-0: Horizontal Scroll bits 0 - 7 |
24 | $9F38 | Layer 1 - H Scroll | Bits 7-4: unused; bits 3-0: Horizontal Scroll bits 8 - 11 |
25 | $9F39 | Layer 1 - V Scroll | Bits 7-0: Vertical Scroll bits 0 - 7 |
26 | $9F3A | Layer 1 - V Scroll | Bits 7-4: unused; bits 3-0: Vertical Scroll bits 8 - 11 |
27 | $9F3B | Audio Control | Bit 7: FIFO Full / Reset; bit 6: FIFO Empty; bit 5: 16 Bit; bit 4: Stereo; bits 3-0: PCM Volume |
28 | $9F3C | Audio Rate | Bits 7-0: PCM Sample Rate |
29 | $9F3D | Audio Data | Bits 7-0: Audio FIFO data (write only) |
30 | $9F3E | SPI Data | Bits 7-0: SPI (Serial Peripheral Interface) data |
31 | $9F3F | SPI Control | Bit 7: Busy; bits 6-2: unused; bit 1: Slow Clock; bit 0: Select |
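Because DCSEL shares the Control register with the Address Select bit, the value we write to $9F25 has to combine both. The Python sketch below only models the arithmetic; it assumes DCSEL occupies bits 6-1 and Address Select bit 0 of the register, and the function name `control_value` is made up for illustration.

```python
def control_value(dcsel, addrsel=0):
    """Combine DCSEL and Address Select into one byte for $9F25.

    Assumption: DCSEL sits in bits 6-1 and Address Select in bit 0.
    """
    if not 0 <= dcsel <= 63:
        raise ValueError("DCSEL must fit in 6 bits (0-63)")
    if addrsel not in (0, 1):
        raise ValueError("Address Select is a single bit")
    return (dcsel << 1) | addrsel


# DCSEL=0 and DCSEL=1 switch $9F29-$9F2C between the two display register sets;
# DCSEL=63 exposes the version registers.
print(f"${control_value(0):02X}")   # $00
print(f"${control_value(1):02X}")   # $02
print(f"${control_value(63):02X}")  # $7E
```

From BASIC the equivalent would be a plain POKE of the printed value to $9F25.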
The first three registers are used to set an address in VRAM. We have 17 bits at our disposal (0 – 16), which gives us a range from $00000 to $1FFFF, or 128 Kbytes. Of course it is no coincidence that the VPOKE command has the same range.
Not all VRAM is created equal. As already marked in the picture above, three sections at the end of the memory space have a special role:
Address Range | Description |
---|---|
$00000 - $1F9BF | Video RAM for Maps and Tiles |
$1F9C0 - $1F9FF | Programmable Sound Generator Registers (64 bytes) |
$1FA00 - $1FBFF | Color Palette |
$1FC00 - $1FFFF | Sprite Attributes (1024 bytes, 128 sprites - 8 bytes each) |
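To make the table concrete, here is a small Python sketch that looks up which region a VRAM address falls into and computes where a sprite's attribute block starts. The ranges and the 8-bytes-per-sprite figure come straight from the table; the function names are hypothetical.

```python
# VRAM regions from the table above: (first address, last address, description).
VRAM_REGIONS = [
    (0x00000, 0x1F9BF, "Video RAM for maps and tiles"),
    (0x1F9C0, 0x1F9FF, "PSG registers"),
    (0x1FA00, 0x1FBFF, "Color palette"),
    (0x1FC00, 0x1FFFF, "Sprite attributes"),
]


def vram_region(address):
    """Return the description of the region containing a 17-bit VRAM address."""
    for first, last, name in VRAM_REGIONS:
        if first <= address <= last:
            return name
    raise ValueError(f"${address:05X} is outside the 128K VRAM range")


def sprite_attribute_address(sprite):
    """Start of the 8-byte attribute block of sprite 0-127."""
    if not 0 <= sprite < 128:
        raise ValueError("sprite index must be 0-127")
    return 0x1FC00 + 8 * sprite


print(vram_region(0x1B000))                      # Video RAM for maps and tiles
print(vram_region(0x1FA00))                      # Color palette
print(f"${sprite_attribute_address(127):05X}")   # $1FFF8
```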
We see that video memory spans $00000 to $1FFFF and that from $1F9C0 on there is dedicated space for the Programmable Sound Generator, the palette, and the sprite attributes.
That means that to address all of VRAM we need 17 bits, which is exactly what we have in registers 0-2. Register 0 contains bits 0-7, Register 1 contains bits 8-15, and Register 2 contains bit 16 along with some other settings we will look at later.
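As a quick illustration of that split, the Python sketch below breaks a 17-bit VRAM address into the three pieces that go into registers 0-2. It only models the arithmetic; `split_vram_address` is a made-up helper name.

```python
def split_vram_address(address):
    """Split a 17-bit VRAM address into (low byte, mid byte, bit 16)."""
    if not 0 <= address <= 0x1FFFF:
        raise ValueError("VRAM addresses are 17 bits wide ($00000-$1FFFF)")
    low = address & 0xFF            # bits 0-7  -> Register 0 ($9F20)
    mid = (address >> 8) & 0xFF     # bits 8-15 -> Register 1 ($9F21)
    bit16 = (address >> 16) & 0x01  # bit 16    -> bit 0 of Register 2 ($9F22)
    return low, mid, bit16


low, mid, bit16 = split_vram_address(0x1B000)
print(f"${low:02X} ${mid:02X} {bit16}")  # $00 $B0 1
```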
After setting the address we can use the data registers to write to or read from video memory. Register 3 ($9F23) is Data Register 0 and Register 4 ($9F24) is Data Register 1. Each data register has its own VRAM address, and Bit 0 in Register 5 ($9F25) selects whose address is visible in registers 0-2: value 0 selects Data Register 0 and value 1 selects Data Register 1. Note that Bit 7 in this register resets VERA, but I recommend not using it from BASIC.
Next we have to explain probably the most important concept in using VERA and accessing video memory. Register 2 ($9F22) contains the highest bit of the address, but also the Increment setting and a bit that switches between increment and decrement.
Increment is a setting that defines the automatic increment or decrement of the VERA address in registers 0-2 after every read or write. Since it has 4 bits there are 16 different settings, but what does that actually mean?
The best way to describe how Increment works is to imagine the following example and test it. As you remember, the default video mode of the Commander X16 after start (or reset) is 80-column text mode. The displayed text starts at video memory address $1B000, which we can verify with a VPOKE to that address. The memory is organized so that the first byte contains a character code and the second byte contains its color attribute; the third byte contains the second character displayed, the fourth byte contains the color attribute for that second character, and so on. So if we want to write only characters we have to write to every other address, starting with $1B000, then $1B002, followed by $1B004, and so on. Obviously the address has to be incremented by two for each write to the screen, and that is exactly what the Increment setting does automatically.
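The interleaved layout can be written down as two tiny formulas. The Python sketch below covers only the first text row and uses the $1B000 base from the text; the helper names are invented for this example.

```python
SCREEN_BASE = 0x1B000  # start of the text screen, as given in the text


def character_address(column):
    """VRAM address of the character code in the first row at this column."""
    return SCREEN_BASE + 2 * column


def color_address(column):
    """VRAM address of the color attribute that follows the character code."""
    return character_address(column) + 1


for column in range(3):
    print(f"column {column}: char ${character_address(column):05X}, "
          f"color ${color_address(column):05X}")
# column 0: char $1B000, color $1B001
# column 1: char $1B002, color $1B003
# column 2: char $1B004, color $1B005
```

Stepping from one character address to the next is always a difference of two, which is the step we will ask VERA to apply for us.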
So let’s test this theory and write a simple BASIC program that writes to video memory without VPOKE, using just POKE to the VERA registers, and see if we can recreate the above scenario.
The first step is to decide which data register we will use. Let’s just use Data Register 0. We do that with
POKE $9F25,0
Next we have to set the screen address to $1B000 and set Increment to 2 so that we write to every other address. That means we set the Low byte to $00, the Mid byte to $B0, and the Hi byte to $21.
POKE $9F22,$21
POKE $9F21,$B0
POKE $9F20,$00
If we write some value to Data Register 0 ($9F23) it should
display the character in the top left corner and increment the address by two.
To make it more interesting let’s put all this into a program:
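To see the whole sequence in action without the hardware, here is a toy Python model of the address registers and Data Register 0, driven by the same POKEs. It is a sketch under assumptions: the class `VeraModel` is invented, the loop writing screen codes 1 to 80 is my guess at a natural BASIC loop, the address is assumed to wrap at 17 bits, and the increment amounts are taken from the increment table in this post.

```python
# Increment amounts for settings 0-15, as listed in the increment table of this post.
INCREMENTS = (0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 40, 80, 160, 320, 640)


class VeraModel:
    """Toy model: 128K of VRAM behind registers $9F20-$9F23 (Data Register 0 only)."""

    def __init__(self):
        self.vram = bytearray(0x20000)
        self.regs = {0x9F20: 0, 0x9F21: 0, 0x9F22: 0, 0x9F25: 0}

    def address(self):
        """Current 17-bit VRAM address assembled from registers 0-2."""
        return (((self.regs[0x9F22] & 1) << 16)
                | (self.regs[0x9F21] << 8)
                | self.regs[0x9F20])

    def poke(self, register, value):
        if register != 0x9F23:
            self.regs[register] = value & 0xFF
            return
        # Write through Data Register 0, then step the address automatically.
        address = self.address()
        self.vram[address] = value & 0xFF
        step = INCREMENTS[self.regs[0x9F22] >> 4]
        if self.regs[0x9F22] & 0x08:          # Decrement bit set
            step = -step
        address = (address + step) & 0x1FFFF  # assumed 17-bit wrap-around
        self.regs[0x9F20] = address & 0xFF
        self.regs[0x9F21] = (address >> 8) & 0xFF
        self.regs[0x9F22] = (self.regs[0x9F22] & 0xFE) | (address >> 16)


vera = VeraModel()
vera.poke(0x9F25, 0)       # POKE $9F25,0
vera.poke(0x9F22, 0x21)    # POKE $9F22,$21
vera.poke(0x9F21, 0xB0)    # POKE $9F21,$B0
vera.poke(0x9F20, 0x00)    # POKE $9F20,$00
for code in range(1, 81):  # assumed loop: FOR I=1 TO 80: POKE $9F23,I: NEXT
    vera.poke(0x9F23, code)

print(list(vera.vram[0x1B000:0x1B006]))  # [1, 0, 2, 0, 3, 0]
print(f"${vera.address():05X}")          # $1B0A0
```

In the model the color bytes stay 0 because nothing ever wrote them; on the real screen they would keep whatever attribute was already there.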
The Increment settings are not simply the binary values 0 – 15; to extend the reach of the Increment they map to the following amounts:
Increment Setting | Actual Increment Amount |
---|---|
0 or $0 | 0 |
1 or $1 | 1 |
2 or $2 | 2 |
3 or $3 | 4 |
4 or $4 | 8 |
5 or $5 | 16 |
6 or $6 | 32 |
7 or $7 | 64 |
8 or $8 | 128 |
9 or $9 | 256 |
10 or $A | 512 |
11 or $B | 40 |
12 or $C | 80 |
13 or $D | 160 |
14 or $E | 320 |
15 or $F | 640 |
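Going from a desired step to the byte for Register 2 is a table lookup plus a shift. The Python sketch below does exactly that with the amounts from the table; `address_hi_byte` is a hypothetical helper, and bit 16 defaults to 1 because our screen address $1B000 has it set.

```python
# Increment amounts indexed by setting, copied from the table above.
INCREMENTS = [0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 40, 80, 160, 320, 640]


def address_hi_byte(step, bit16=1):
    """Build the Register 2 ($9F22) value for a given increment amount."""
    setting = INCREMENTS.index(step)  # raises ValueError for unsupported steps
    return (setting << 4) | (bit16 & 1)


print(f"${address_hi_byte(2):02X}")    # $21
print(f"${address_hi_byte(256):02X}")  # $91
print(f"${address_hi_byte(80):02X}")   # $C1
```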
Knowing this we can modify the above program and try to write vertically. Recall that in the default mode each line takes 256 bytes of memory and only part of it is visible: 80 out of 128 characters and their attributes. To write vertically we therefore have to increment the memory pointer by 256 after every write. Based on the table above we have to use value 9 for the increment, and now our code looks like this:
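As a hedged cross-check of the numbers involved, the Python sketch below derives the new value for Register 2 and lists the first few addresses a vertical write would touch. It is arithmetic only, not a BASIC listing; the constant and function names are invented.

```python
SCREEN_BASE = 0x1B000   # top left character, from the text
ROW_STRIDE = 256        # bytes per text line in the default mode
INCREMENT_SETTING = 9   # setting 9 steps the address by 256

# Increment setting in the high nibble, address bit 16 in bit 0.
hi_byte = (INCREMENT_SETTING << 4) | (SCREEN_BASE >> 16)
print(f"POKE $9F22,${hi_byte:02X}")  # POKE $9F22,$91


def column_addresses(rows):
    """Addresses of the first character in each of the first `rows` lines."""
    return [SCREEN_BASE + row * ROW_STRIDE for row in range(rows)]


print(" ".join(f"${a:05X}" for a in column_addresses(4)))
# $1B000 $1B100 $1B200 $1B300
```

So compared with the horizontal version, only the value poked into $9F22 changes, from $21 to $91.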
No real surprises there. We could use what we learned in BASIC, and in special cases we might be able to speed up the code a little, but not in a significant way, especially not in real-life scenarios.
Everything we did above we can also do in reverse by using the Decrement bit (bit 3) in Register 2. If it is set, the address is decremented by the same amount instead of incremented.
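Reading a Register 2 value back into "step and direction" shows how the Decrement bit fits in. The Python sketch below is a decoder written for this post, not part of any VERA tooling; instead of a lookup table it uses the pattern visible in the increment table (powers of two for settings 1-10, then 40 doubled repeatedly for settings 11-15).

```python
def decode_address_hi(value):
    """Return (signed step, address bit 16) for a Register 2 ($9F22) value."""
    setting = value >> 4
    if setting == 0:
        amount = 0
    elif setting <= 10:
        amount = 1 << (setting - 1)    # 1, 2, 4, ... 512
    else:
        amount = 40 << (setting - 11)  # 40, 80, 160, 320, 640
    if value & 0x08:                   # Decrement bit (bit 3)
        amount = -amount
    return amount, value & 0x01


print(decode_address_hi(0x21))  # (2, 1)
print(decode_address_hi(0x29))  # (-2, 1)
print(decode_address_hi(0x99))  # (-256, 1)
```

So writing $29 instead of $21 would walk the same screen line from right to left.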
Assembly
The logical next step is to use what we learned above in assembly. Since the POKE command is the closest we get to the hardware from BASIC, the translation is very simple. Let’s look at the assembly code below and walk through it. The Commander X16 comes with a built-in Monitor, which is very convenient: it lets us write simple assembly programs right on the system, without a complicated setup or external tools.
Just like in BASIC we have to split the starting address of the screen map (also known as Map Base), $1B000, across the VERA registers. So $00 will be the Low byte, $B0 the Middle byte, and $01 the High byte.
The first line writes 0 into register $9F25. The mnemonic STZ (STore Zero) was added with the 65C02, so if you try to use it on a Commodore 64 or any other 6502 computer it will not be recognized.
The second line writes 0 into register $9F20 because, as we saw above, the Low byte of $1B000 is $00.
In lines 3-4 we write $B0 to the Middle address byte.
In lines 5-6 we combine the High byte of the address, which is $01, with the Increment value of 2 in the high nibble, for a value of $21.
We will use the Accumulator for two things: to hold the value of the next character to be written to the screen, and as a counter so that we write exactly 80 characters. Since we start with A, which has screen code 1 (just like in the BASIC program), we initialize it to 1 in line 7.
Lines 8-11 are our loop. In line 8 we write the Accumulator to video memory; in the first iteration that is the value 1 written to $1B000. Then in line 9 we increment the Accumulator by one. In line 10 we compare it with $51 hex, which is 81 in decimal. If we compared with 80, the loop would exit one character too soon (we could use BPL but let’s keep it simple for now). Then in line 11 we check the result of the comparison, and if the values were not equal we jump back to the beginning of the loop in line 8 (address $4012).
During the second iteration we therefore write 2 to address $1B002, then 3 to $1B004, and so on until we have written all 80 characters.
In line 12 we Return from Subroutine and go back to BASIC or wherever we called this routine from.
Like every piece of code this routine could be written in
many different ways but I feel this is the clearest way to do it.
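The walkthrough pins the routine down quite tightly, so it can be laid out byte by byte. The Python sketch below is a reconstruction from the description above, not a copy of the original listing: the mnemonics (in particular the spelling `INC A`) are my assumption, while the opcode bytes are the standard 65C02 encodings. It computes the address of every line so we can check it against the addresses mentioned in the text.

```python
START = 0x4000

# (instruction as described in the walkthrough, machine-code bytes)
PROGRAM = [
    ("STZ $9F25", [0x9C, 0x25, 0x9F]),  # line 1: select Data Register 0
    ("STZ $9F20", [0x9C, 0x20, 0x9F]),  # line 2: address low byte = $00
    ("LDA #$B0",  [0xA9, 0xB0]),        # lines 3-4: address middle byte
    ("STA $9F21", [0x8D, 0x21, 0x9F]),
    ("LDA #$21",  [0xA9, 0x21]),        # lines 5-6: increment 2 + address bit 16
    ("STA $9F22", [0x8D, 0x22, 0x9F]),
    ("LDA #$01",  [0xA9, 0x01]),        # line 7: first screen code
    ("STA $9F23", [0x8D, 0x23, 0x9F]),  # line 8: loop start, write character
    ("INC A",     [0x1A]),              # line 9: next screen code
    ("CMP #$51",  [0xC9, 0x51]),        # line 10: reached 81 yet?
    ("BNE $4012", [0xD0, 0xF8]),        # line 11: offset -8 back to line 8
    ("RTS",       [0x60]),              # line 12
]


def layout(program, start):
    """Return ([(address, text, bytes), ...], first free address after the code)."""
    rows, address = [], start
    for text, code in program:
        rows.append((address, text, code))
        address += len(code)
    return rows, address


rows, end = layout(PROGRAM, START)
for number, (address, text, code) in enumerate(rows, start=1):
    hex_bytes = " ".join(f"{byte:02X}" for byte in code)
    print(f"{number:2}  ${address:04X}  {hex_bytes:<8}  {text}")
print(f"next free address: ${end:04X}")  # next free address: $401B
```

The layout puts line 8 at $4012 and the first free byte at $401B, which matches the addresses quoted in the walkthrough and in the Monitor session.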
Before we wrap up let’s quickly look at how to enter it and test it. At the end of this part our screen should look similar to this:
We start the Monitor by typing the command MON.
Our cursor will be waiting after the prompt, which is a dot.
Without any additional preparation we can start writing code by telling the Monitor that we want to enter assembly and at what address: we use the command A immediately followed by the address, then the mnemonic of the first instruction and its parameter (if any). In our case we just type:
A4000 STZ $9F25
After that the Monitor expands the line and shows the actual bytes stored in memory at locations $4000 - $4002, like we see in the screenshot above, and waits for the next instruction on the next line with the new address $4003. So we just keep typing the whole program, and when we are finished, at memory location $401B, we simply press Enter and are back at the dot prompt. By typing X we can exit the Monitor.
All we have to do now is run the program by calling it from BASIC with:
SYS $4000
And we should see 80 characters magically appear in the first line of the screen.
Have fun experimenting with and changing your first assembly
program.