Wednesday, August 7, 2019

The Lifecycle of a Commercial Software Project

I've been around this cycle enough times that I'm starting to detect a pattern. I thought I'd share my observations and see what other people had in mind. It's kind of a frustrating one for me - I'm a huge fan of continuity - because it seems to imply that all commercial projects are doomed to die.

So in phase one, we have the wild west. The project begins life - sometimes with formal blessing and sometimes without. The developers are the die-hards who believe in the project and put crazy amounts of effort into putting it together. There's minimal oversight at this point - the team trusts each other. The team is usually in constant communication about what they are doing and what needs to be done next, but it's hardly formalized.

In phase two, the efforts of the developers have paid off. The project has crossed a major milestone, been revealed to all, and possibly even filled a hole that the company needed filled. Everyone is very excited, not least because the project seems to have come out of nowhere to meet a need. Now management is involved in taking this generally-still-rough-around-the-edges project and "finishing" it. Sometimes this is also called productization. But the main point is that someone is now in charge of choosing a future for the project. Everyone is still pretty happy and additional developers are often added. Source control becomes formalized. Everyone wants to use the product, although it requires updates to meet their exact needs (which the developers are happy to do).

In phase three, the product is more or less mature. It still has rough edges, since the resources planned for the grand designs in phase two were diverted to unexpected feature requests. New developers are added to increase the head count and try to catch up - costs increase. The product sees massive growth in feature set but little polishing. New features are shipped early in order to get back to the main tasks, and frequently let bugs reach end users - resulting in both a tightening of code quality standards (such as mandatory code review) and a decline in reputation.

In phase four, upper management begins to ask the project management why the product is still so unpolished and why it has so many bugs, despite the time and money put into it. Demands for better tracking of resources and money are put into place. Bug tracking and time tracking become formalized, and statistics are often added to the regular process, increasing the time spent managing the product. Demands begin to be made that the developers focus only on what the manager dictates needs to be done, and not side issues that the developers consider important. Small bugs linger and user satisfaction begins to fall. The developers begin to complain that they need permission to work on the aging architecture and to correct user satisfaction bugs.

Phase four may linger for a long time, but eventually it gives way to phase five. Management decides that the list of things the lead developers have stated need attention is too large to handle in a cost-effective manner. The project is put into bug-fix-only mode while a replacement is designed and built from scratch. Note that although there are often some planning sessions, the replacement project usually starts in phase one.

I've been all the way through this loop a few times, and been at various points of it on various other projects, and I am becoming convinced that it's inevitable. Look, software is expensive and it's very difficult to quantify. It's hard to plan, it's hard to maintain, and it's hard to get right. We need to stop pretending that there must be a silver bullet out there if we just find the right way to manage the project.

The truth is that phase one probably accomplished 80% of everything the product needed for its entire lifespan. And that should really be the statistic we pay the most attention to. No matter what some people who don't write real software for a living may claim, it's not something that you can just plan out to the smallest detail, because unlike disciplines that can be planned that way (I'm thinking architecture here), SOFTWARE DOES NOT EXIST IN THE REAL WORLD. That's right - it's completely virtual. It doesn't obey any laws of nature.

To write software, a person needs to have a firm grasp on how to tell an imaginary concept to do a real world task. They need to be able to abstract thousands of steps, put them in the right order, and then be able to anticipate all the wrong ways that a person will try to interact with those thousands of steps. BUGS ARE INEVITABLE. You can mitigate some of them (see earlier blogs), but by god stop pretending you can prevent them all. Assume that EVERY LINE OF CODE WILL NEED ATTENTION SOMEDAY.

It's also important to keep in mind that if you are not a user of the software and you are not a developer of the software, you are not qualified to determine what the software NEEDS. If you're a manager and you came in to manage the product before you even knew it existed, then get over yourself. You need to buddy up with one of the developers, understand what it does, and work with them when you're making decisions. It's all too easy to focus on the wrong parts of the product, or to choose a forward path that is completely against all of the design to date (been through this one specifically). This will make the product worse, not better. Even better, become a user. Use the software daily. Make it something you HAVE to use. Then you'll better understand why those little usability bugs are a big deal. ;)

I'm pretty convinced that the longer a project survives before phase three, the more successful its life will be. It's good to have some structure - source control is critical. Code reviews are useful. Even a bug tracker is a good thing to have. But when you get into late phase three and phase four - all of these useful things start to be used against the project. Why is there so much code churn? Why are so many people wasting time with code reviews? Why are there so many issues in the bug tracker? Good things become bad because they are seen as wasting money, when in fact they are preventing waste by reducing the issues before they are seen in the field.

What's your experience with the commercial software life cycle?

Sunday, August 4, 2019

What's New

Since finishing off Dragon's Lair, I've actually been sidetracked working on the CollectorVision Phoenix (https://collectorvision.com/shop/colecovision/collectorvision-phoenix/), which is coming out soon. I've done a basic loader for the ColecoVision that will hopefully work well enough for people. We'll see when you get it!

Since that's done, I'm trying to get started on my todo list. It's sobering to look at that list, do the math, and realize I will not likely finish it. Stupid mortality. ;)

You can look too! Then vote on what I should do! http://harmlesslion.com/cgi-bin/walrusshow.cgi

Unrelated to the task list, I just published a new version of Flipterm, in case anyone was using that. Flipterm is a text-based MUCK client. A MUCK is comparable to a MUD, and that probably doesn't help you at all. ;) MUD stands for 'Multi-User Dungeon', and before everything was a first-person shooter, this is what an MMORPG looked like. It was a lot like those old Infocom text adventures, except there were other people in the room as well.

Imagine a modern chat room or Discord server, but without the GIFs.

Anyway, FlipTerm is derived from the old GMud, and I started working on it back in 2002. It's been pretty much stable since then, with only minor fixes, but I was finally asked to make a few improvements, and took it as an opportunity to update the codebase.

Oh my god it's an awful codebase. It started rough, and then I was clearly learning as I went, not only MFC and MDI, but C++ itself. 

Anyway! I upgraded* the code to Visual Studio 2017, which came with a whole new MFC mindset for MDI that took a few days to adapt to. The changes should be only minimally visible, though for free it did bring a new docking paradigm with snazzy-looking controls and nicer-looking tooltips. It also fixed, for free, the docking preview on high-DPI (ie: 4k) monitors, as that was kind of messed up.

(* - haha, "Upgraded". I fixed the parts that didn't work and left the rest as-is. I'm not that crazy!)

From the user point of view, the enhancements are:

- updated docking to modern version - fixes high DPI and provides better(?) user feedback
- removed MCP and the plugins. Nothing important ever used any of them.
- mouse scroll wheel works on the main output window and not just the input window (the one thing that did bug me for the last ten years or so ;) )
- word wrap settings now reformat the view when you resize the window (huzzah!) (unless you are using the fixed 80-column mode, but I'm not sure why anyone would anymore.)
- when you start a log after connecting and then log back to the beginning of the session, the old text is written as-received instead of as-formatted, which makes logs more consistent.

Not much, really, but several nights of deep diving to get it all working with the new system. Hopefully good for another ten years!

Grab it over on Github here: https://github.com/tursilion/FlipTerm

Friday, May 31, 2019

Classic99 Debugger

Recently, I've been rather bubbling over how excited I am to have the Classic99 debugger working for ColecoVision software, and in my last post I promised to talk about that some. So I think I will.

 

While still lacking a number of features I'd really like, it really is my favorite retro-debugger. Since I wrote it, I'm clearly biased, but I also worked to get the features I actually need into it. Some of these features - I'm really baffled that other emulators DON'T have them.

The first feature, and nobody seems to do this, is real-time view. While the system is running, you get a real-time view (updated roughly 10 times per second) of whatever window you are looking at. You can see the disassembly flying by (very useful for seeing tight loops), or you can watch memory in real time (looking for a RAM access pattern - here you go!) All the while, the current system registers are ALWAYS available.

In addition, while less often useful, the commands that let you ALTER memory or registers are available at all times. Usually it's more useful to breakpoint first, but sometimes you just want to end a timed delay early, and you can just write the register directly without freezing execution. Even on the Coleco, which has far fewer registers than the TI, this is useful.


The second feature is flexible breakpoints. Again, there's more to do here, but I can break on PC, or I can break on read or write or either to a particular address in any of the system memories, or I can break on a particular value being written to a target memory. In addition, in case I don't know the exact address, any memory can be a range. I can even look for single bits. Many of these options would be difficult to implement on real hardware, but an emulator doesn't know the difference between real and simulated.

The breakpoint system even allows non-breaking functions: it can count cycles between two addresses for performance monitoring, or even log all writes to a particular address to a disk file! Want to capture a speech pattern or a song? This can do it.
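In case it helps to picture it, the matching logic for that style of breakpoint boils down to something like this. This is a rough sketch with my own names - not Classic99's actual internals:

```c
#include <stdint.h>
#include <stdbool.h>

// Hypothetical breakpoint record - illustrative, not Classic99's real structure.
typedef struct {
    uint16_t addr_lo, addr_hi;  // address range (lo == hi for a single address)
    bool on_read, on_write;     // which access types trigger it
    bool check_value;           // if true, only match a specific written value
    uint8_t value, value_mask;  // value and bit mask (the mask lets you watch single bits)
} Breakpoint;

// Return true if this access should stop (or log, or count) execution.
bool bp_matches(const Breakpoint *bp, uint16_t addr, bool is_write, uint8_t data) {
    if (addr < bp->addr_lo || addr > bp->addr_hi) return false;
    if (is_write && !bp->on_write) return false;
    if (!is_write && !bp->on_read) return false;
    if (bp->check_value && is_write) {
        if ((data & bp->value_mask) != (bp->value & bp->value_mask)) return false;
    }
    return true;
}
```

The emulator just calls this from its memory access path - something real hardware can't easily do, which is exactly the point made above.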


The ability to single-step and immediately see the result of an instruction is invaluable. This one at least, most debuggers do implement. Classic99 of course lets you alter any register while paused or running, including the program counter, which allows you to go back and retry a piece of code with different inputs, or even change the code on the fly (although you need to enter the hex bytes manually.)

While still rough, Classic99 also allows debug of both the host CPU and the F18A GPU at the same time.

Within an hour of having the debugger available to ColecoVision software, I was able to troubleshoot a problem with the emulator's implementation of NMI, and observe the memory access patterns of a number of Coleco games. I even engaged in a little bit of cheating on a favorite game (Antarctic Adventure: time remaining 0x60e3 (16-bit BCD, maximum value C60E3=9999), distance remaining 0x605e (16-bit BCD, finish level now C605E=0100)).

That's really all I have to say. The main deficiency I see in other debuggers is the lack of real-time view, which really slows me down. Stability seems poor on some, as well. At the same time, there are features in other emulators I also plan to implement, like sprite and tile views, audio waveforms, and more.

I use the Classic99 debugger almost all the time, to debug my own code, to reverse engineer others' code, and even to debug the emulator itself. It's an invaluable tool that too many emulators treat as a secondary add-on.


Tuesday, May 14, 2019

Whoooops, been neglecting you!

My... what... two readers? Oh, you only found this through a Google search? Good enough then!

So, Dragon's Lair is finished and done! I still have about 200 PCBs that need a new project, as I can NOT put Dragon's Lair on them (seriously, I will ignore any more questions suggesting I just pretend to forget about the licensing, unless they're accompanied by a dump truck of money parked outside the front door when I go to look).

I did publish my followup as a slideshow - if you want to know about the building of Dragon's Lair, have a peek here:

https://docs.google.com/presentation/d/1u7SOusjQDInq95GrEH2tmIphUAjRg-9YbhR_X25D62A/

I also put up a Ko-Fi, though I probably don't post here enough to be worth it. ;) But have a peek here: https://ko-fi.com/tursilion

With all that out of the way, I wanted to talk some about emulation.

For many years now I've had a TI emulator called Classic99. I started it back in 1996 or so, so I could have my own emulator to play with, since V9T9 was pure x86 assembly and PC99 was closed source. While development has frequently been slow, it has at least been steady, and the emulator has a lot of features that help with debugging software. At first I would breakpoint the source code of the emulator itself, and write C code for whatever I wanted it to dump, but over time most of the reasons I needed to do this were coded into the debugger, and today it's reasonably comprehensive. (It still needs a 'cheat' search and remapper...)

Anyway, I also play around with the ColecoVision a lot. It is, at the basic level, the same machine as the TI except that it has a different CPU. Where the TI has a TMS9900, the ColecoVision has a Z80. There are, of course, some other gotchas - but that's enough to get going.

There are a handful of emulators for the ColecoVision. My favorite is probably BlueMSX, ColEm is probably the best known, and CoolCV seems to be the new hotness. There are many others, including the catch-all MAME. But when I start writing code and I actually need to test it, I need three things:

1) I need a fast way to launch my code
2) I need a debugger that lets me view the system state, single step and observe results, find and set breakpoints, etc.
3) I need a codebase I can modify in case I am using hardware the emulator doesn't support

BlueMSX was my favorite; it came close. True, when I changed the code it broke the debugger for reasons I couldn't guess, and the debugger only works when the emulation is paused, but at least it was in a separate window and let me view memory and set breakpoints.

Last night I loaded up a new project that I was sure worked a couple weeks ago, hit run in BlueMSX, and nothing came up. I loaded the debugger, and a few minutes later I closed the whole thing down. I was tired of dealing with partial debuggers that only let me view some information at a time, only in certain cases. I was tired of code bases that treated complexity like a badge of honor and made building the code a challenge. And I was tired of systems that locked you in to some other system's way of thinking (I run Windows, I want a Windows interface, thanks.)

This isn't unique to the ColecoVision, by the way, I had the same issue with Sega Genesis debuggers (but the code bases were easier to build.)

I pulled Classic99 up. I've been approaching this point for a while, but I finally went for it. All I needed, in theory, was a Z80. As I'm in the middle of a rearchitecture, I decided to just hack it in for now, since the new code base won't be ready for a while. I ended up creating a new working folder.

Then, I hunted down a Z80 core. I found a nice single-file one that was recently updated, but it required this magic include library. This library is apparently great because it doesn't use any of standard C, and is a pure header library. Personally, I think that's a stupid thing to be proud of. First off, the standard C includes are, you know, STANDARD, and have been for nearly half a century. It's okay to use them! Second, this code didn't do anything weird enough to warrant it.

When I saw that one of the purposes of this library was to redefine its own types for char, int, short, etc (and not using the standard terms, but using its own terms so you're locked in), that was when I discarded it. I created my own include file, and in about 15 lines had everything that this code used.

Yeah... 15 typedefs is why I needed to download this whole other library. Kids these days.
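For the curious, the whole replacement shim was on the order of this. The names here are illustrative - the real one just matched whatever identifiers the borrowed core expected:

```c
// Minimal fixed-width type shim - roughly what the borrowed Z80 core wanted
// from its "magic" header library. Standard C has had these for decades.
#include <stdint.h>
#include <stddef.h>

typedef uint8_t   zu8;     // unsigned 8-bit  (illustrative names)
typedef int8_t    zs8;     // signed 8-bit
typedef uint16_t  zu16;
typedef int16_t   zs16;
typedef uint32_t  zu32;
typedef int32_t   zs32;
typedef uint64_t  zu64;
typedef int64_t   zs64;
typedef size_t    zusize;  // object-size type
```

That's the entire dependency the other library existed to provide.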

Unfortunately, there was an entire struct not defined anywhere in the code. There was a script that I guess was meant to generate the missing header file, but, you know, Windows loser. So I actually found another project that already had this missing file, dropped it in, and built. F$#%#$ everything.

Okay, now at least the Z80 core itself WAS as simple as I could have hoped, and the documentation adequately described what you needed to do to set it up. (Except it didn't mention that you had to provide your own copy of the context struct, but I guess that's fair. I would have liked to see it called out.)

The memory read/write and IO in/out functions were easy to link to the existing Classic99 functions. I added basic joystick emulation and handled the NMI in the interrupt code. I put the reset calls in with the reset of the 9900 and F18A cores, and then I hacked in some extra code to load the ColecoVision BIOS and copy the loaded cartridge over to Coleco space, which made the Classic99 load functions work. (I gave the Z80 its own memory since the 9900 memory space is quite different, having hardware overtop of the cartridge space). Finally, after considering it for a bit, I threw a hacky call to run Z80 cycles next to the call to run a 9900 instruction. This wouldn't give correct speed, but it would at least provide SOME throttle. Oh! And I disabled the TI CRU code that read the keyboard and joystick, so it would not get input.
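Stripped way down, the glue for a step like that tends to look something like this. The callback names and the exact memory-map bounds are my sketch (most Z80 cores want some variation of these four hooks), not the actual core's API:

```c
#include <stdint.h>

// Separate 64k space for the Z80, since the 9900 memory map is quite different.
static uint8_t colecoMem[0x10000];

// Callbacks the Z80 core invokes. On the ColecoVision, only the RAM region
// (assumed here as 0x6000-0x7FFF) is writable; BIOS and cartridge are ROM.
uint8_t z80_read_mem(uint16_t addr) {
    return colecoMem[addr];
}

void z80_write_mem(uint16_t addr, uint8_t data) {
    if (addr >= 0x6000 && addr < 0x8000)
        colecoMem[addr] = data;          // silently drop writes to ROM areas
}

uint8_t z80_read_io(uint8_t port) {
    (void)port;                          // VDP status, joysticks, etc. go here
    return 0xFF;
}

void z80_write_io(uint8_t port, uint8_t data) {
    (void)port; (void)data;              // VDP, sound chip, controller select, etc.
}
```

Once these are routed into the existing emulator functions, the core itself doesn't care what machine it's in - which is why the whole hack was feasible in an evening.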

Then I shrugged my shoulders and hit run. To my surprise, I got the TI title screen, but corrupted with the wrong colors and graphics. But this was a good thing - it means that it worked!


What do I mean it worked? Well, both the ColecoVision and the TI started at the same time, sharing the hardware. The TI was slower, so it drew the title screen last, but the ColecoVision set the video registers last. In other words, I knew that the Coleco side did SOMETHING! I also knew my original plan, to just leave the TI on the master title page, wouldn't be enough.

When Classic99 boots, it used to start the CPU before the ROMs were loaded, which sometimes caused a startup crash on the emulation side. Back then, I implemented a tiny boot ROM that spins the 9900 until it's ready to go. Since that was still in there, I just reloaded the boot ROM before the reset - now the 9900 was still spinning (and so providing timing to much of the system), but it wasn't doing anything useful.

When I reset it again, the ColecoVision splash screen came up!


I booted up Donkey Kong, but something was still wrong: after selecting the number of players, it just hung on the main screen:

But now I could already start using tools I didn't have just two hours ago - the debugger. Although I didn't have Z80 debug yet, I did have VDP debug. I was quickly able to determine that interrupt processing, which is central to most Coleco actions, was not happening. After checking a couple of other games and finding that everything had similar issues, I finally noticed a little hack I'd entered in my code. The Z80 core takes the NMI input as a pulse, rather than taking a level (and dealing with the edge itself, like the hardware does). But the flag that I'd used for edge detection was a temporary variable, meaning my code never caught the edge. Once I fixed that...
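The fix amounts to making the edge-detect state persist across calls. Roughly:

```c
#include <stdbool.h>

// NMI edge detection: the Z80 core wants a pulse, but the hardware provides a
// level. The bug was keeping the previous level in a temporary variable, so it
// reset every call and the rising edge was never seen. It has to persist.
static bool nmiLastLevel = false;

// Returns true exactly once per rising edge of the NMI line.
bool nmi_rising_edge(bool nmiLevel) {
    bool edge = nmiLevel && !nmiLastLevel;
    nmiLastLevel = nmiLevel;
    return edge;
}
```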

After roughly three hours, including fighting with the compiler and tracking down foreign libraries, it was running. I had to invert the bits on the joystick, as I'd coded them with the wrong polarity, but that was it. I had a game of Donkey Kong, and got halfway around the world in Antarctic Adventure (a game I'd always wanted to see on the TI, actually...) Everything 32k or less worked, so, I have a functioning base.

I just need to add a disassembly and put up the Z80 registers instead of the 9900 registers, and I can use this for dev. (I also need to add megacart emulation and a couple of other small devices, another evening or two will do).

Sadly, I can't share this code as the GPL licenses conflict with my own, but it's a good learning experiment to see that it DOES come together as expected. Once I get the new architecture out, the multiple CPUs will be a lesser issue and easier to update.

Edit: I guess I forgot to mention why the Classic99 debugger is any better than anyone else's. I'll cover that later, when I update it, but for now, here's an example of the debugger being used to cheat at Alpiner: https://www.youtube.com/watch?v=_qjZN6qf1wk


Sunday, March 24, 2019

Dragon's Lair Status

It's amazing to look back and see that I ordered the PCBs all the way back on Feb 7th. They (of course) came later than expected, and I've been battling them for the last week and a bit.

I wrote the software without emulation support (for everything but the CF card, which I'd already done) only to find it not working on hardware. At least, not reliably. I could always get the CFI data, but attempts to write or erase the flash were random - sometimes working, sometimes not.

This issue consumed me all of last weekend and by Monday I had implemented full flash emulation in Classic99 and fixed a number of minor issues, but it still didn't work on hardware. I decided I would have to give up, ship the last few boards I could build by hand, and call it done. But I went ahead and created a list of possible things I could verify on the new boards. Then, because I can't let things go, I started hitting the easy ones. ;)

One of those was to desolder the flash chip, program it externally, and try it in my test cart. When this worked, I put it back on the board and tried again, where it didn't work. I quickly realized that now that the OE and WE lines were under CPLD control, I would need to set them to valid states (it turns out both were defaulting to pull up, +5. This disabled the flash output). I ended up setting the default to pull down and giving WE a powerup state of high -- this worked.

I decided, just for kicks, to try that on an unmodified board, and found that my programming load for the CPLD had all the new pins set to CMOS drive. Without a pullup on the board itself, this would leave the lines floating when they were supposed to be high (if I follow right). Changing those to LVTTL suddenly made all my tests work.

Since then I've been correcting minor issues and improving the flow of the software. I thought I had a success earlier in the week, and that led to needing to make another critical change to the game software itself.

The board has a flaw on it that write-protects the first 128k of the flash chip in hardware, and disabling this write protect is risky because the trace is under the board instead of out where it would be easy to cut. (When I changed the board I made a specific pass to look for traces like this, but because I hadn't labeled my pins correctly, I thought it was just a power supply ground and left it alone.)

This means I needed to change the software to skip over the first 128k. The game is raw assembly and although I did use offsets and equates for a lot of it - there was some hard coded page wrangling to deal with. Lesson learned - I'm still making sure I got all of it right tonight (even though I've already programmed five boards).
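In terms of the cart's 8k paging, the workaround is just a constant offset. A sketch, assuming 8k pages and the 128k protected region described above (the real fix was in raw assembly with hard-coded page wrangling, as noted):

```c
#include <stdint.h>

#define PAGE_SIZE    0x2000  /* 8k switched pages at the cartridge port */
#define PROTECTED_KB 128     /* first 128k of flash is write-protected by the board flaw */
#define PAGE_OFFSET  (PROTECTED_KB * 1024 / PAGE_SIZE)  /* = 16 pages to skip */

// Map the game's logical page number to the physical flash page,
// skipping over the write-protected area at the start of the chip.
uint16_t flash_page(uint16_t logicalPage) {
    return (uint16_t)(logicalPage + PAGE_OFFSET);
}
```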

Anyway, the first pass of the programming software needed over 4 hours, so today while it was running I decided to speed that up. I first attempted a mod documented over at nouspikel.com/ti99 to remove a couple of wait states from the console. My thinking was to hot-rod my main console, add the hacks needed for programming, and have two machines ready to run in parallel. Unfortunately, I failed and it doesn't boot. I don't have time to see why right now, so I put it aside to double-check later.

I then moved over to the code itself, and moved the three inner loops to the scratchpad. Scratchpad RAM (the TI has a whole 256 bytes of it!) is zero-wait-state RAM, running fully 3 times faster than all other RAM in the system. (After the slow execution time of the instructions themselves, programs usually see a bit under twice the performance). The main loops are reading from the CF card, writing to the flash chip, and verifying the flash chip data... these three loops only needed about 100 bytes. I was also really happy (once again) to see how easy it was for GCC to link into them.

I also found a bug in a little gimmick I had put in last time. It's /always/ the little gimmicks. I put a little progress bar and running man to show it was working at the top of the screen. I figured that it wouldn't hurt anything. But, sure enough, a bug meant the running man updated every single packet, eating cycles. ;)
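The fix is the classic one: only redraw every Nth packet. A sketch (the interval and names are arbitrary, mine rather than the actual code's):

```c
// Progress gimmick fix: update the running man every Nth packet instead of on
// every single packet, which was silently eating cycles in the inner loop.
#define UPDATE_INTERVAL 64   /* arbitrary; any reasonable stride works */

static unsigned packetCount = 0;
static unsigned drawCalls = 0;   /* stands in for the actual screen update */

void packet_done(void) {
    if ((++packetCount % UPDATE_INTERVAL) == 0) {
        ++drawCalls;             /* draw_running_man() in the real code */
    }
}
```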

Testing in emulation suggested that time would be down to about 2.5 hrs, but my local build of Classic99 is a bit broken speed-wise. On hardware, it programmed a cartridge in 90 minutes. Definitely happier with that!

However, the output carts don't work yet. They don't pass checksum, and they had a bug in practice mode. I've fixed the practice mode bug, and it's writing again to see if I have sorted out the checksum (which passes in emulation, so it's bothersome). They DO run and play, so at least I can verify the game, even if the diagnostics were broken by the move ;)

But anyway, long way of saying that I had to learn so much to get this going, and it never stops. I'm grateful for the knowledge but will be glad when the next project doesn't require so much download into my head. ;) Going alllllll the way back, this project brings together:

- image conversion to TI bitmap mode (a process that took many years to get nice)
- video conversion to frames
- audio conversion to TI audio (took some fiddling)
- VHDL for the CPLD
- programming CPLD hardware
- laying out a PCB for manufacture (rather than the simple tools I used before)
- hand soldering fine pitch components
- troubleshooting this particular hardware
- getting a PCB manufactured (done before, but never fully on my own)
- programming these specific flash chips (I've done others before)
- reverse engineering the CF7 interface (this frankly was pretty simple)
- reading and writing Compact Flash cards in 8-bit mode
- understanding how the CF7 stores disk information on the CF card (this took longer)
- learning how to use DD from Windows to build the CF data I need to program the carts

I don't know, something like that; it took a while. I need to go to bed. ;)


Tuesday, March 5, 2019

Counterfeit Flashes?

It's always hard to be sure... but I'm getting a lot of failures on the last tray of flash chips - I'm up to 29 so far (and I only had 110 in the first place!) What bugs me is that the first trays I ran through, the smaller ones, were perfect, as were the first ten or so from the big tray.

I can't find any markings that tell me for sure what's wrong with the chips that fail - they have varying date codes and I don't (yet) have any for-certain authentic chips to compare them against. But the chips that fail all fail the same way - the EPROM programmer reports that the ID command comes back as "0x09090909" (and 0x09 seems to be the ID command), and if I ignore it, both the chip erase and the write attempts fail immediately.
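For reference, the software ID check on AMD/29F-style parallel flash is a fixed unlock sequence - I'm assuming these chips follow that command set, per the datasheet family. A genuine part then answers with its manufacturer and device codes at addresses 0 and 1; these suspect parts just echoed garbage. Sketched against a mock bus:

```c
#include <stdint.h>

// Mock flash data bus, just to make the write sequence visible.
static uint8_t bus[0x10000];

void    flash_write(uint16_t addr, uint8_t d) { bus[addr] = d; }
uint8_t flash_read(uint16_t addr)             { return bus[addr]; }

// Software ID (autoselect) entry for AMD/29F-family flash - the standard
// three-cycle unlock sequence. Addresses assume word-mode addressing.
void flash_enter_id_mode(void) {
    flash_write(0x555, 0xAA);   // unlock cycle 1
    flash_write(0x2AA, 0x55);   // unlock cycle 2
    flash_write(0x555, 0x90);   // autoselect (ID) command
    // A real chip now returns the manufacturer code at address 0
    // and the device code at address 1 until a reset command.
}
```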

This means if I'm going to reach 100 carts for Dragon's Lair, I'm now dependent on the manufactured run working. I haven't started that software yet, hopefully this weekend. (I started to update Classic99 for the flash emulation and that turned into a re-architecture. So hey, good news, I finally started version 4! ;) But I need to put that aside and do something quicker and hackier for testing the software.)

The good news is that although Fry's didn't have any CF cards, I did grab a bunch off Amazon. Only a couple of them worked on the CF7 (there's really something wrong with that thing's timing), but I only need ONE to work, so I can proceed with that for now. They've billed me for them, so fingers crossed.

Saturday, February 9, 2019

CF7 Compact Flash Hacking...

One of the devices I have for my TI is a cheap little compact flash adapter called either the CF7 or the NanoPeb... it changed names at some point. It didn't come with much in the way of documentation, but now that I actually need to get big files into my TI, I decided to see if I could figure it out.

I had a dump of the BIOS already, so I loaded that into my emulator and it didn't take long to map out the startup register accesses. From there it was a quick verification to map that to the CF spec, and I had the data I was interested in.

For anyone curious, the read registers map at >5Exx and the write registers map at >5Fxx (when CRU base >1100 is turned on, of course).

The CF7 deliberately uses 16-bit access but only 8 bits of the bus, but I found turning on 8-bit mode in the CF card worked just fine, so the entire card data is accessible by the hardware, at least. I need to do a little more work on the sector mapping though... I couldn't (quickly) find sectors I wrote on the PC, using LBA addressing. I think that's a sector, but I need to double-check.
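To make the 8-bit mode bit concrete, here's a hedged sketch of the SET FEATURES sequence from the CF spec, written against a mock bus. The register offsets are the standard ATA task file; how they land on >5Exx (reads) and >5Fxx (writes) is the CF7's own mapping:

```c
#include <stdint.h>

// Standard ATA/CF task file register offsets.
enum {
    REG_DATA    = 0,
    REG_FEATURE = 1,   // error register on read
    REG_COUNT   = 2,
    REG_LBA0    = 3,
    REG_LBA1    = 4,
    REG_LBA2    = 5,
    REG_DRIVE   = 6,   // 0xE0 = master drive, LBA mode
    REG_CMD     = 7,   // status register on read
};

static uint8_t lastWrite[8];                      // mock bus, for illustration
void cf_write(int reg, uint8_t d) { lastWrite[reg] = d; }

// CF SET FEATURES: feature 0x01 enables 8-bit data transfers (a CF-specific
// feature - plain ATA drives don't have to support it).
void cf_enable_8bit(void) {
    cf_write(REG_FEATURE, 0x01);
    cf_write(REG_DRIVE,   0xE0);
    cf_write(REG_CMD,     0xEF);   // SET FEATURES command
}
```

After this, each data-register read or write moves one byte instead of one 16-bit word, which is what makes the whole card reachable over the CF7's 8-bit bus.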

CF seems pretty forgiving; I was even able to command reads and writes through Easy Bug. But I did hit the issue with larger cards just not working when I tried my 512MB card. Reading would hang in Easy Bug (on the LED at least), and I would forever get only the first byte from the sector.

Oddly, the startup was able to overwrite my boot sector (grr...), so I guess access is intermittent. It'd be neat to figure out why the cards don't work, but I guess I have bigger fish to fry right now.

My 32MB card works fine, so I guess I'll go try to find some mid-size cards at Fry's tomorrow and see if any of them work. If I can just find one that works, then I can write some code to program Dragon's Lair right from the compact flash card, just using sector reads.

It'd be fun to someday write a replacement DSR that could read a normal card... but add that to the huge stack of things I'll never get to...

Thursday, February 7, 2019

Dragon's Lair Status

You really want to be watching Twitter for the silly raw status, but some big changes happened recently that I figured I'd chat about.

Basically, we got everything up, we got the code finished, we got the cart stable, and I went ahead and opened up orders. Then I started building and realized that building SMT PCBs is a level 9 skill, and I'm at best a mid-level 8. They're coming, very slowly, but the quality is lower than I am happy with.

Of course, if I had more time, I could work at it and level up my skills. But I'm down to about 45 days before the license expires. So, I panicked.

I will keep working on it, but I decided to investigate manufacture, which I should have done in the first place. The reason that I didn't, you see, was because I didn't provide a way for the flash chip to be programmed on the board. In my mind, at the time, I needed a way to get 56 pins off the board and into my EPROM programmer, which seemed hard. And hey, how hard can assembling 120 boards be?

I've worked late the last couple of nights, and tonight I finally got everything going, so I thought I'd chat about the challenges I went through. Cause they were fun. Ish.

In studying the datasheet, I realized that I actually had all the important pins running through the CPLD already, except two: I had locked down both Output Enable (OE) and Write Enable (WE). OE because the gating to the TI bus was performed by the CPLD; WE because I decided I didn't need to support writing the flash.

It's important to note that the CPLD is totally full - I actually can't support flash writes in the standard cart. But it's trivial to reprogram the CPLD, so I realized I could use a second CPLD load and write through the card edge connector, if I just routed OE and WE through the CPLD too. So, I tied up two of the spare lines for that.

The next question was how I was going to use that card edge connector. I eventually realized that I could use my CF7 - this is a compact flash adapter for the TI that simulates 3 floppy drives. It's possible to change the active disk image from software, and I believe there are 256 available disks. If true, at 400k each that gives us 100MB, which is smaller than the flash but big enough for Dragon's Lair.
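The capacity math, and the mapping from a byte offset in the cart image to a CF7 volume and sector, can be sketched in a couple of lines (the layout - image data packed linearly across volumes - is my assumption for illustration):

```python
# Sketch: carving a cart image across CF7 disk volumes.
# 256 volumes of 400KB each = 100MB of addressable data.

VOLUME_BYTES = 400 * 1024          # 409,600 bytes per CF7 disk image
SECTOR_BYTES = 512
TOTAL_BYTES  = 256 * VOLUME_BYTES  # 104,857,600 bytes = 100MB

def locate(offset):
    """Map a byte offset in the image to (volume number, sector in volume)."""
    volume = offset // VOLUME_BYTES
    sector = (offset % VOLUME_BYTES) // SECTOR_BYTES
    return volume, sector

print(TOTAL_BYTES // (1024 * 1024))   # 100
print(locate(VOLUME_BYTES + 512))     # (1, 1)
```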

Now there is an issue with this interface, and that is control. In order to erase and write a flash chip, you have to be able to issue a precise sequence of writes without any intervening reads. Furthermore, since I'm in 8-bit mode, they need to be 8-bit writes. The problem is that the TI is a 16-bit machine: every access is /always/ 16 bits. Furthermore, every write is preceded by a read - the TI always does a read-before-write access.

So, this means, at the 8-bit cartridge port, every 8-bit write is actually four accesses long - two reads and two writes, only one of which we care about.
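Those four accesses can be modeled in a few lines of Python. (The cycle ordering shown - even byte then odd byte for each half - is my assumption for illustration; the point is the count and the read-before-write.)

```python
# Sketch: what a single byte store by TI software looks like on the 8-bit
# cartridge bus. The 9900 always does a 16-bit read before the 16-bit write,
# and each 16-bit access becomes two 8-bit bus cycles.

def bus_cycles(word_addr):
    return [("read",  word_addr), ("read",  word_addr + 1),
            ("write", word_addr), ("write", word_addr + 1)]

cycles = bus_cycles(0x6000)
print(cycles)
# four bus cycles total - only ONE of them carries the byte we meant to write
```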

Now, the way the cart works right now is that we have 13 bits of address space (8k) and 8 bits of data. During a write, we simply capture 12 of those address bits and use them to extend the address bus - this is a pretty simple mapper. Write to ROM, and the address you wrote to is used to change the upper 12 address bits. (Why 12? Because of the 16-bit-to-8-bit issue: since every write always becomes two writes, we can't trust the least significant bit. So we ignore it.)

12 bits doesn't give us enough latch though - 2^12 * 8k gives us 32MB. To get 128MB, we need two more bits of latch -- so we just take them from the data bus.
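The latch arithmetic works out like this - a quick sketch, where the exact placement of the two data-bus bits within the bank number is my assumption, not taken from the hardware:

```python
# Sketch of the bank latch math: an 8K page, 12 usable address bits, and
# two data-bus bits extending the bank number to 14 bits.

PAGE = 8 * 1024

def latch_bank(write_addr, write_data):
    # bit 0 of the address is untrustworthy (every 16-bit write becomes two
    # 8-bit writes), so the low address bit is ignored
    addr_bits = (write_addr >> 1) & 0xFFF   # 12 bits from the address
    data_bits = write_data & 0x3            # 2 more from the data bus
    return (data_bits << 12) | addr_bits    # 14-bit bank number

def physical(bank, offset):
    return bank * PAGE + (offset & (PAGE - 1))

print((1 << 12) * PAGE)  # 33554432  -> 32MB from the address bits alone
print((1 << 14) * PAGE)  # 134217728 -> 128MB with the two data bits added
```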

So, in order to program the entire flash, we still need that latch to work, and I thought about using the GROM side to manage the flash writes. But the GROM side is rather tricky - it doesn't have dedicated address pins and instead needs an onboard address register, plus all the gates needed to sort data writes from address writes. With all of that in place, there's really no room in the CPLD for changes.

After a little consideration, I decided to hack my TI. I disabled the GROM Select line and tied it to one of the memory expansion pins. This basically gives me a different select line already wired into the CPLD - I just access the cartridge port from a different address. This let me preserve all the existing functionality - read data at >6000, and write the latch at >6000. I decided the new address - nicely out of the way at >E000 - would be solely dedicated to writing the flash. Reads from >E000 would be completely ignored.

Unfortunately, GROM was how I managed the reset timing, which is longer than the TI startup. So I decided to make the reset software controlled. I took one of the unused data bits in the latch block, and used it to directly manage the reset line. So now the TI software can explicitly reset and unreset the flash.

So setting the CPLD to relay those writes was easy, but now I come back to the problem - I need to control the writes at a byte level. I decided to use two more bits to control the MSB write and the LSB write. If both are set, then a full 16-bit write is available, which I thought would be helpful to speed up the buffer fill.

When I fired this up, I ran into a new issue that was completely unexpected. Two of the data lines to the flash were input-only - I could not switch them to bidirectional for the writes. This meant a PCB change was going to be required; there was really no way around it. I took the last two spare pins and remapped them as the data pins.

Then I loaded the cart up and tried it. I was able to put the flash into CFI mode and read back the data, but that was easy - it's just a single write. What I couldn't get working was programming.

The first thing I found was that the unlock address in the datasheet was wrong for 8-bit mode. It reported writes to 0xAAA and 0x555, which required me to go back to 0x6000 in between to switch from MSB writes to LSB writes. In 16-bit mode, the addresses are 0x555 and 0x2AA. Noting that the addresses in CFI mode were doubled, I found that the doubled addresses - 0xAAA and 0x554 - were, indeed, what worked. As a bonus, those are all MSB writes. I tried this and was pleased to see that my single byte write worked!
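For the record, that's the classic JEDEC byte-program sequence with the doubled addresses. A sketch, with `flash_write()` as a stand-in for a write relayed through the CPLD (real code would poll the flash for completion afterward):

```python
# Sketch: JEDEC program-byte sequence using the doubled unlock addresses
# (0xAAA / 0x554) that actually worked, rather than the datasheet's
# byte-mode 0xAAA / 0x555.

log = []
def flash_write(addr, data):
    log.append((addr, data))

def program_byte(addr, data):
    flash_write(0xAAA, 0xAA)   # unlock cycle 1
    flash_write(0x554, 0x55)   # unlock cycle 2
    flash_write(0xAAA, 0xA0)   # program command
    flash_write(addr, data)    # the byte itself
    # then poll the flash status until the program completes

program_byte(0x1234, 0x42)
print(log)
```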

However, LSB writes didn't work.

I finally had to spend a lot of time with the scope to see what was going on, and I realized that when I went back to 0x6000 to change the byte mask, the hardware was performing a read before the write, and interfering with the sequence.

Fortunately, I had one setting left - if the two bits of the byte mask were 00, neither write would reach the flash. So all I did was say that if either bit was set, reads to 0x6000 were disabled. Finally, it worked!
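The final gating behaves something like this little model - my reconstruction of the logic, not the actual CPLD equations:

```python
# Sketch modeling the CPLD gating: two latch bits mask which byte of a
# 16-bit write reaches the flash, and while either bit is set, reads to
# >6000 are blocked so the TI's read-before-write can't inject a stray
# flash read mid-sequence.

def gate(access, addr, msb_en, lsb_en, is_msb_byte):
    """Return True if this bus access is allowed through to the flash."""
    if access == "read":
        return not (addr == 0x6000 and (msb_en or lsb_en))
    # writes: pass only the enabled byte half; both set = full 16-bit write,
    # both clear = nothing reaches the flash
    return msb_en if is_msb_byte else lsb_en

# with only the LSB enabled, the MSB half of the write is swallowed:
print(gate("write", 0xE000, False, True, True))   # False
print(gate("write", 0xE000, False, True, False))  # True
# and mid-sequence, the read-before-write to >6000 never reaches the flash:
print(gate("read", 0x6000, False, True, True))    # False
```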

So, I have a lot of software to write now, and I have to prepare a compact flash card. But I've proven that the hardware can do what I need to do, so I have sent off the PCB for quote. If it works, it will set me back 3 weeks but also should give much higher quality results. I hope so anyway!

I dunno, just wanted to talk about that somewhere. Haven't officially announced this cause I'm waiting for feedback from the manufacturer, tomorrow I hope. ;)