I intend for this to be the last release with an ambiguous license. Right now I am constrained by the terms of the M6502 code, which has some "no commercial use" clause in it, preventing me from applying my preferred MIT-style license. I began writing a replacement 6502 core with no strings attached, but that effort has taken a six-month hiatus. Rewriting the CPU core will also get me one step closer to cycle-accuracy - close enough that if I were really serious about it, I'd have to rewrite the video... again. Bleh.
Here are a few changes I do remember:
- Keyboard input now works correctly when there's no joystick connected
- Stripe recording (see http://ahefner.livejournal.com/13165.html)
- "Screenshot movie mode" (toggled by Control-F12)
- FUSE filesystem support, disabled by default (see http://ahefner.livejournal.com/14789.html)
Also, I've started including the git repository in its tarball, tripling its size. I think this is terribly inconsiderate, but I deleted the old git repository on the web server, so I feel like I have to back it up somewhere. I still don't understand these mixed messages from the git enthusiast crowd, where on the one hand they claim that pushing to a central repository isn't really done, but then go off and appear to use github for exactly that purpose. Whatever. They are strange people.
Previously I mentioned my simple music player in CL, Shuffletron. Shuffletron uses save-lisp-and-die on SBCL or save-application on CCL to create a standalone executable, and is launched by a wrapper script that invokes it with rlwrap when available. It depends on several foreign libraries (more than I expected, in fact), and there were an alarming number of problems with the binaries I'd originally posted, all related to these library dependencies. I think I've solved them, though not necessarily in an optimal way.
The first problem, observed and fixed before the publicly released binaries, was an issue where cffi-grovel (on behalf of Osicat) created a shared library with some auto-generated C wrappers for various functions, which the saved executable now expected to reopen at startup. I didn't realize this until my first user reported it dying at startup with an error about not finding /home/hefner/clbuild/source/osicat/posix/w
The next problem was simple - on machines without the libc6 development package installed, the program failed at startup, unable to load librt. This was a simple matter of Osicat not using the best choice of name by which to load the library, a problem which has since been fixed. To be doubly certain, I modified the build script to close this library before saving the executable, because I wasn't using any functions from it anyway.
The most aggravating problems I encountered were with the libmpg123 library. Some users reported "undefined alien function" errors whenever they tried to play a song. Others reported a scary error about stack alignment as soon as the program started. To some extent I haven't solved these (in the sense that Mixalot users and people building from source may still get bitten by them) so much as hacked around them for the sake of the binary release. There were two basic problems:
- On 32-bit Linux, with newer versions of libmpg123 than my Debian-running laptop happens to have, the compile-time choice of large file support breaks binary compatibility with the libmpg123 library, wreaking havoc with the FFI bindings. Specifically, if large file support is enabled, a number of symbols change names, e.g., mpg123_open becomes mpg123_open_64. This is irritating and not how I'd have done it (for instance, I don't think the version number of the shared library changes to indicate the break in compatibility), but I could've easily worked around it in mpg123-ffi by detecting which version is present at the time the library is initialized. I very nearly did so, were it not for the following problem:
- On 32-bit Linux, due to the desire to have properly aligned data when using SSE instructions, recent versions of GCC provide support for aligning stack frames along larger boundaries, such as 16 bytes. The mpg123 developers seem to take this support as sufficient justification to ignore the platform ABI, introducing dubious stack alignment checks at every entry point to the library. This is not something I know how to work around from the CFFI binding in a reasonable fashion, so for the time being I've given up on adapting to such hostile versions of the library. Fortunately, I've absorbed enough trivia eavesdropping on the SBCL hackers on IRC to know that Darwin uses 16 byte alignment for precisely the same reason, so SBCL must support it, and after a few minutes searching I hacked the align-stack-pointer macro in SBCL's x86 backend to enable the 16 byte alignment on Linux as well, which appears to work around the problem (at the cost of my build environment getting even stranger). I also rebuilt the libmpg123 library with --disable-largefile and, just to be thorough, --disable-aligncheck, and set my CFLAGS to -mstackrealign, two measures which I think should've sufficed to solve the issue even without a hacked SBCL.
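The symbol-name detection mentioned in the first point could look roughly like the following at the C level. This is only a sketch of the idea, not what mpg123-ffi does: resolve_first and resolve_mpg123_open are hypothetical helpers, and the only symbol names assumed are the two mentioned above (mpg123_open and mpg123_open_64).

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the address of the first symbol in `names`
 * that the library behind `handle` exports, or NULL if none is present. */
void *resolve_first(void *handle, const char *const *names, size_t count)
{
    assert(handle != NULL && names != NULL);
    for (size_t i = 0; i < count; i++) {
        void *sym = dlsym(handle, names[i]);
        if (sym != NULL)
            return sym;
    }
    return NULL;
}

/* Hypothetical helper: try the large-file name first, then the plain one.
 * `libmpg123` is a handle previously obtained from dlopen(). */
void *resolve_mpg123_open(void *libmpg123)
{
    static const char *const candidates[] = { "mpg123_open_64", "mpg123_open" };
    return resolve_first(libmpg123, candidates, 2);
}
```

Knowing which name resolved only answers half the question. Any binding for a function that takes a file offset would also have to pass an argument of the matching width, which is part of why this is less trivial than it looks.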
The 32-bit Linux/x86 binary now includes a rebuilt version of libmpg123, renamed libmixalot-mpg123, with these changes. The 64-bit binary doesn't have any of these problems, and should Just Work, but you have to provide your own libmpg123 as before. My mistake, of course, was using libmpg123, but I wasn't aware of libmad when I made that choice, and I'm not enthusiastic about rewriting otherwise perfectly working code on account of these issues (although I probably will, sooner or later). My current binaries, in conjunction with Shuffletron 0.0.3 (featuring various minor improvements), should be free of these problems, but still haven't been as widely tested as I'd like. The lesson to me, obvious in retrospect, is to test these things on a more diverse set of configurations than my own machines (particularly when they all run the same release of Debian), even when it seems so simple that nothing could go wrong.
- Music:Gorillaz, "Kids with Guns"
Bordeaux-FFT is a small library for computing the FFT/IFFT on complex data, originally written by Robert Strandh (and/or Sylvain Marchand and Martin Raspaud), with contributions by Paul Khuong and myself. Last summer I did some audio processing experiments, originally using the FFT code from Sapaclisp. Robert helpfully volunteered his FFT implementation, which I cleaned up slightly and have been cheerfully employing in various audio hacks ever since. Several versions have changed hands through email, lisppaste, and my web server, so I've finally come around to collecting the changes, writing a brief manual, and tarring up a release. It's surprisingly fast, particularly with Paul Khuong's recent work on SBCL, although I don't know how it stacks up against some of the highly tuned assembly language implementations out there. I use it along with my fledgling task queuing code to grind out batches of FFTs across four CPU cores.
Shuffletron is a simple music player in CL, with a few interesting features. I've been running this program full time for several weeks now as my preferred music player. It snuck on to the lisp subreddit before I really announced it, and I'm sure folks on IRC are already sick of hearing about it, so I won't say much. I began with plans for a much more ambitious player, with a fancy graphical interface, and wrote the first version of this one Saturday realizing I needed something simple to put the audio code through its paces while I wrote the full player, intending to include this one as an example program with Mixalot. It worked better than expected, so I ended up fleshing out the feature set instead and decided that it was really all I needed. The code is lean and mean.
I've had a number of problems trying to produce redistributable binaries of this which I believe I've finally solved (although new binaries are forthcoming). I hope to write about these later.
Mixalot is the audio back-end of Shuffletron, factored into its own system(s) because it might be useful for other purposes. It includes a mixer which pulls audio from any number of streamer objects and outputs them to ALSA. It also includes FFI definitions for libmpg123 with some helpers for decoding MP3 files and reading ID3 tags, and a streamer class for decoding and playing MP3 files in real time. The libmpg123 portions are usable independently of the audio mixing/output code.
Perhaps eMusic feels that access to the wider catalog of music will offset the increased cost. If they think their existing subscribers will understand, they are gravely mistaken. Given the option, I'd rather Sony Music go out of business, their back catalog be destroyed in a fire, and their CEO choke to death on his breakfast. This article puts it nicely:
Most eMusic fans I've heard from are real music nuts, and are there to sample a wide range of music from relatively unknown cutting-edge acts, not to download music they could find anywhere. Imagine the clerks in High Fidelity suddenly being told that their favorite mail-order distributor is raising prices, but in exchange will now let them order ABBA and Chili Peppers records just like the chain stores in the mall.
In my brief investigations so far, I haven't found another music service that is competitive with eMusic, even (perhaps) at their new price point. I don't consider that an argument for staying with eMusic rather than defecting to another service, though. Given that I consider $0.40/song roughly twice what I'd call a reasonable price, I instead take it as an argument for not buying music online at all. It isn't like there aren't alternatives.
- Music:Au Revoir Simone, "The Lucky One"
I hate it. All I wanted to do was write a simple goddamn music player.
hefner@lightworks:~$ ls -l /tmp/nesfs
total 0
-rw-rw-rw- 1 root root  8192 Dec 31  1969 chr-rom
-rw-rw-rw- 1 root root   256 Dec 31  1969 oam
-rw-rw-rw- 1 root root 32768 Dec 31  1969 prg-rom
-rw-rw-rw- 1 root root  2048 Dec 31  1969 ram
-r--r--r-- 1 root root    22 Dec 31  1969 rom-filename
-r--r--r-- 1 root root    15 Dec 31  1969 rom-hash
-rw-rw-rw- 1 root root  8192 Dec 31  1969 sram
-rw-rw-rw- 1 root root 16384 Dec 31  1969 vram
I suppose I could fix the other fields in the stat structure. This is the first time I've done an ls -l here to notice. The FUSE examples I started from didn't bother. :)
This will appear in the next version of my NES emulator, although I can't say when that will be. Until I feel like spending an unpleasant hour or several learning to do merging with Git, probably.
- Music:MC Hawkings, "The Hawkman Cometh"
Apparently people have downgraded systems from testing to stable using APT, but this sounded tricky. Just changing the sources list and doing dist-upgrade doesn't work, and apparently people who have done this have to pin a bunch of packages and somehow fool it into downgrading. Given that most of the info Google turns up on this is seven years old, I didn't want to risk it. Normally I'd just do another install into a spare partition, but having had such a smooth ride with Debian for so long (I ran the same install for something like five years, until finally moving on to 64-bit), this is the first Linux machine in years I didn't bother making such a partition on.
Instead, I installed Debian stable into a subdirectory (/lenny) using debootstrap, chrooted in to install packages, and prepped the /etc directory. Now I was ready to switch systems. I got the statically linked version of busybox and, while still in X11 with music, Firefox, and IRC going, did the following:
lightworks:/# mkdir squeeze
lightworks:/# mv bin emul/ etc lib lib32 lib64 sbin usr var squeeze/
lightworks:/# cd lenny/
lightworks:/lenny# mv bin emul/ etc lib lib32 lib64 sbin usr var /
lightworks:/lenny#
Now I'm running Debian stable. Everything is working just fine, except I had to restart Firefox and install some xfont packages for Emacs. This is one of those rare instances where I'm genuinely pleased with my operating system. There are some loose ends to tie up, like configuring X11 (I'm still running the one from squeeze; I'll fix it next time I quit X), and some miscellaneous packages to track down, but at this point I think I'll declare this stunt a success.
- Music:Lucasfilm Games, Ballblazer

* Including instructor (at t=0)
I'm fiddling with some C code, experimenting with different approaches to various subtasks, some of which depend on previous tasks. Results are valid only for the duration of the current frame. Performance matters, so I tend to comment out pieces I'm not currently working on unless they're needed. For the time being, the data is often only displayed as debugging output, but sometimes the debug display is turned off, making the computation wasteful. Up to this point I'd been juggling the dependencies manually, calling each computation in order from a central point, but finally I've gotten restless to automate things. Here's some cute C preprocessor fun I bashed out a few minutes ago:
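#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* noinline is not standard C; this definition assumes GCC or Clang. */
#ifndef noinline
#define noinline __attribute__((noinline))
#endif

struct task {
    const char *name;       /* for diagnostics */
    int last_time;          /* frame in which the task last ran */
    int locked;             /* nonzero while running, to catch cycles */
    void (*taskfn) (void);
};

/* Declares the task function, defines its struct, then opens the body. */
#define deftask(name) \
    void task_do_##name (void); \
    struct task name = {#name, -1, 0, task_do_##name}; \
    noinline void task_do_##name (void)

/* Run the task unless it has already run at (or after) the given time. */
static inline void require_task (struct task *task, int time)
{
    assert(task != NULL);
    if (task->last_time < time) {
        if (task->locked) {
            fprintf(stderr, "Fatal: Circular dependency on task \"%s\".\n", task->name);
            exit(1);
        }
        task->locked = 1;
        task->taskfn();
        task->last_time = time;
        task->locked = 0;
    }
}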
I interface this with my code and the global frame_number as follows:
#define using(name) require_task(&name, frame_number)
Next I define various tasks. Some depend on others:
deftask(track_motion)
{
    .. some code here ..
}
deftask(feature_matrix)
{
    .. more code here ..
}
deftask(grid_alignment)
{
    using(track_motion);
    .. yet more code..
}
Code elsewhere also relies on the 'using' macro to ensure computations are up to date. The components communicate through global state. If they didn't, and the style were more functional, I suppose I'd be blogging about how I'd implemented memoization instead. Eventually, I might rig this up to farm the computation out to other threads (though this doesn't seem like such a huge win unless I go back to eagerly computing things in advance of needing them). I'd need a parallel 'using' operator.
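To make the once-per-frame behavior concrete, here is a small standalone sketch. It repeats the definitions (minus noinline) so that it compiles on its own. The counters motion_runs and grid_runs and the run_frame function are made up purely for illustration. They stand in for real work and for whatever drives the main loop.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct task {
    const char *name;
    int last_time;
    int locked;
    void (*taskfn)(void);
};

#define deftask(name) \
    void task_do_##name(void); \
    struct task name = {#name, -1, 0, task_do_##name}; \
    void task_do_##name(void)

static void require_task(struct task *task, int time)
{
    assert(task != NULL);
    if (task->last_time < time) {
        if (task->locked) {
            fprintf(stderr, "Fatal: Circular dependency on task \"%s\".\n", task->name);
            exit(1);
        }
        task->locked = 1;
        task->taskfn();
        task->last_time = time;
        task->locked = 0;
    }
}

static int frame_number = 0;
#define using(name) require_task(&name, frame_number)

/* Illustrative counters standing in for real per-frame computations. */
static int motion_runs = 0;
static int grid_runs = 0;

deftask(track_motion)
{
    motion_runs++;
}

deftask(grid_alignment)
{
    using(track_motion);   /* dependency: runs track_motion if it is stale */
    grid_runs++;
}

/* Hypothetical per-frame driver: both requests below are satisfied by a
 * single run of each task, regardless of the order they are asked for. */
void run_frame(void)
{
    frame_number++;
    using(grid_alignment);
    using(track_motion);
}
```

After two calls to run_frame, each counter reads 2. The second using(track_motion) in each frame does nothing, because grid_alignment already pulled it in.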
- Music:Frank Zappa, "San Ber'dino"
Plan A: Inject massive amounts of money with the aim of creating a new bubble, letting us coast by with another five or six years of illusory prosperity before it all blows up again in an even bigger and more spectacular fashion, leaving it to the next guy to sort out the mess, just like Alan Greenspan did to Bernanke.
Plan B: Destroy the currency! Massive inflation! Double, even triple the price of every imported good. What better way to get more people working than to force them to get second jobs (or first jobs, if they're wealthy or retired). Let the proles claw their way out of the pit while politicians fight to take credit for "solving the crisis," and Bernanke rests on the seventh day after reaffirming that his powers of economic destruction are not completely impotent.
Barring some miracle of science, like the invention of cheap and bountiful fusion power, I don't foresee an economic recovery for the United States. The collusion of corporate interests and a reckless and parasitic government is finally going to kill the host. At best, I can imagine stagnation in the near term. After that, a long and unpleasant decline, punctuated by occasional collapses in the house of cards constructed by government and financial interests, as the delusion of perpetual exponential growth steadily erodes and eats alive both the financial system and (with its impossibly large debts) the US government. All the while, dwindling supplies of petroleum and nuclear fuel will push the chances of recovery and growth further and further out of reach, until civilization as we know it cracks under the pressure and dissolves. For extra fun, toss in the wild card of destructive climate change, if you believe in that sort of thing.
This scenario is horrifying, so convince me otherwise. Central bankers and politicians are playing god, and every move they make seems destined to trade the future for their own short term gain. It isn't necessarily maliciousness, just stupidity, desperation, and shortsightedness. Our leaders are going out screaming and flailing -- in quicksand.

It's a bit rough (and that's one of the better portions), but you can see the path of Mario hopping from point to point. I tried this with several games: Super Mario Bros. 1, Super Mario Bros. 3, Duck Tales, and Life Force. In the latter case the game scrolls at a fixed rate, so you get a sort of shabby automap.
- Music:Matti Raekallio, Prokofiev Piano Sonata No. 1
At a glance:
- $20.5 billion: Agriculture, Rural Development, Food and Drug Administration, and Related Agencies
- $57.7 billion: Commerce, Justice, Science, and Related Agencies
- $33.3 billion: Energy and Water Development
- $22.7 billion: Financial Services and General Government
- $27.6 billion: Interior, Environment, and Related Agencies
- $151.8 billion: Labor, Health and Human Services, Education, and Related Agencies
- $4.40 billion: Legislative Branch
- $36.6 billion: State, Foreign Operations, and Related Programs
- $55.0 billion: Transportation, Housing and Urban Development, and Related Agencies
- $0.1 billion (?): Further Provisions Relating to the Department of Homeland Security and Other Matters
Division A - Agriculture, Rural Development, Food and Drug Administration, and Related Agencies
Total Funding: $20.5 billion
Major items:
- Nutrition for Women, Infants, and Children (WIC): $6.9 billion
- Rural Development: $2.7 billion
- Agricultural Research: $2.3 billion ($1.1 billion for Agricultural Research Service, $1.2 billion for the State Research, Education, and Extension Service)
- Food and Drug Administration: $2 billion
- International Food Aid (P.L. 480): $1.2 billion
- Food Safety and Inspection Service: $972 million
- Conservation Programs: $968 million
- Other: $3.46 billion ($20.5 billion minus sum of the above)
9.1 billion Americans? Perhaps they mean 9.1 million. Just dividing the funding level by 9.1 million people, we get about $758 per participant. I found this great table which sheds a little light on the program.
- FY 2008 Participants include 2,153,250 women, 2,222,533 infants, and 4,328,516 (presumably non-infant) children. Total participants: 8.7 million in 2008. So the 9.1 million estimate for 2009 sounds reasonable.
- FY 2008 Food costs: $4.5 billion, or $521 per participant per year ($43.42 per month, as the table shows).
- FY 2008 Nutrition Service and Administrative (NSA) costs: 1.6 billion. From the link above: "Approximately two-thirds of total costs are used to provide nutrition education, breastfeeding promotion and support, and linkages to health and other client services (e.g.,immunization; drug, alcohol and tobacco education; referrals to family and child health social programs). The remaining third is used for traditional management functions."
Still, I can't help but think you could just mail every woman, infant, and child in the program a check for $700 (or monthly allotments if you prefer) and still keep a few tens of millions of dollars on the side for printing pamphlets and running TV commercials educating people on nutrition and whatnot. Perhaps it isn't politically tenable to redistribute wealth without first sanitizing it through a massive bureaucracy. That's how the middle class gets its cut, I suppose.
Moving on, Rural Development: $2.7 billion. This strikes me as outrageous, but I'm having trouble discerning from the bill exactly what it entails. It seems to be some combination of loans, rental assistance, various development subsidies, and the broadband initiative. However it works out, I'm opposed on the principle that people who choose to live far away from major population centers ought to pay their own way and not receive their infrastructure at a massively subsidized discount. In a time of financial crisis, it is not clear to me how pouring money into rural areas is a sound investment in economic growth. Probably doesn't hurt if you're a representative trying to get reelected, though.
Division B - Commerce, Justice, Science, and Related Agencies
How odd that they lump law enforcement and science together under the same heading.
Major justice programs:
- Federal Bureau of Investigation: $7.1 billion
- Federal Bureau of Prisons: $6.2 billion
- State and Local Law Enforcement and Crime Prevention Grants: $3.2 billion
- Drug Enforcement Administration: $1.9 billion
- Bureau of Alcohol, Tobacco, and Firearms: $1.1 billion
- NASA: $17.8 billion
- National Science Foundation: $6.5 billion
- National Oceanic and Atmospheric Administration: $4.4 billion
- Global Climate Change Research: "Nearly $2 billion"
- NIST Research: $819 million
- Census Bureau: $3.1 billion
- Mood:sick
The textual MML format worked very well for rhythms and experimenting with sound parameters, but I never warmed to it for writing melodies. If I play with NES music again in the future, I think I'll write my own player routine and compiler instead.
Also, note that due to the steady deterioration of the Linux platform, recording the above NSF file to create the mp3 took me over an hour, and then only because I gave up and used Windows (although I understand the trend is to make this sort of thing increasingly painful there too). If Debian still provided me with xmms and the NSF-player plugin, this would have taken less than 30 seconds, and I wouldn't be ranting to whoever will listen about how I want to boil the ALSA developers alive in hot oil.
- Music:Slick Rick, "Mona Lisa"
I get the impression many emulators follow a curve similar to the one described in that document. I tried my test on Nestopia, FCEU, and (of course) my own emulator, and they behave similarly in this respect. On the other hand, Nintendulator seems to mix channels differently, because my original version of the audio test sounds better there. Unfortunately, I haven't tried this on a real NES yet, because it's bigger (256 KB) than I can conveniently program to an EPROM right now, and I'd have to build a new EPROM cart to host it even so (my current one only handles 64 KB). I should probably just buy a PowerPak and/or a CopyNES, but I'm still enjoying desoldering donor carts and playing with chips. =p
If I couldn't try the full music loop on a real NES, at least I could write some simpler test programs and run them from my EPROM cart. It also occurred to me that aside from the nonlinearity due to the channel mixing, there might be some additional nonlinearity inherent in the DAC, significant enough for someone trying to get the best audio they could from the machine, and it'd be interesting to try and measure. For my first attempts at this, I took the approach of generating test signals in software from the NES, recording into my MOTU audio interface from the NES audio output, analyzing the resulting recordings with a little CL code, and finally plotting the results using Octave (shown below).
- "Square rising edge" - a square wave at a few hundred Hz (220? 440? I forget.), increasing in amplitude each cycle from 0 to 127. I measured the height of the rising edge. I let this repeat 29 times and averaged the results. I screwed the program up and only captured every other amplitude level (incrementing the amplitude variable on every clock edge instead of just one).
- "Pulse 1" - Same idea, ramping up the amplitude of a brief pulse. I measured the height of the largest rising (or rather falling) edge in the vicinity of the pulse. I let this loop for 20 or 30 minutes, averaging out 1,440 samples per amplitude level. The individual ramps didn't look right, having a curious moire-like modulation as the energy of the pulse fell differently into samples, but it averaged out into a very nice curve. Unfortunately, it was different enough from the previous curve to bother me, although it shared some common features.
- "Pulse 2" - I thought I'd try lengthening the pulse, hoping I might get cleaner data. Instead I got the opposite - the resulting curve, even averaged over 1,300 samples per amplitude level, is extremely jagged and irregular. I haven't figured out what went wrong here.
I'd really hoped these measurements would agree, so that I could divide out the effect of the mixer equation and have something like a definitive measurement of the DAC linearity which I could build into my emulator (and use to correct audio for playback on the real hardware). As it is, I'll have to try again. It would probably make a lot more sense to measure from the output pin on the NES CPU, rather than measuring after it has passed through the final mixing and filtering circuits, but I haven't tried this yet. Ideally I could program a DC level into the DAC and measure it on the output pin at my leisure. To this end I put together a little DAC test program which lets you select the output level and toggle the signal on and off. This program, along with various other hacks including the signal generators for the three tests above, is on my NES test cart image.
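As a rough illustration of what "dividing out the mixer equation" would mean, here is a sketch under two loudly stated assumptions. First, that the mixer model is the commonly cited nesdev approximation for the triangle/noise/DMC group, reduced to the DMC term alone. Second, that the 0 to 127 test levels were written to the 7-bit DMC DAC. Neither is confirmed by the measurements above, and both function names are hypothetical.

```c
#include <assert.h>

/* Assumed model, not measured: the commonly cited approximation
 *   tnd_out = 159.79 / (1 / (tri/8227 + noise/12241 + dmc/22638) + 100)
 * with triangle and noise silent, leaving only the DMC level (0..127). */
double dmc_mixer_output(int level)
{
    assert(level >= 0 && level <= 127);
    if (level == 0)
        return 0.0;
    return 159.79 / (22638.0 / level + 100.0);
}

/* Hypothetical correction step: normalize a measured edge height against
 * the full-scale measurement, then divide by what the model predicts for
 * that level. A perfectly linear DAC behind this mixer would give 1.0. */
double dac_linearity_ratio(double measured, double measured_full_scale, int level)
{
    assert(level >= 1 && level <= 127);
    assert(measured_full_scale != 0.0);
    double expected = dmc_mixer_output(level) / dmc_mixer_output(127);
    return (measured / measured_full_scale) / expected;
}
```

Whatever is left over after this division (any departure from 1.0) would be the part attributable to the DAC itself, or to an inaccurate mixer model. That is exactly why the measurements needed to agree with each other first.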
- Music:Chrono Trigger OST, "Time Circuits"
Economic stimulus? It's a shell game.
Update 2/10/09: I think the stock market agrees with me!
- Mood:pissed off
- Music:The Presets, The Girl and the Sea
These demos demonstrate the full color palette which the NES is capable of - 410 colors - on the screen simultaneously. This is done via the trickery of changing the palette registers mid-scanline, 14 times, in conjunction with using color emphasis bits every several scanlines. The NES color palette is typically cited as containing 52 unique colors, but the aforementioned color emphasis bits allow tinting of the video output, producing additional colors.

The CPU does not have direct access to the palette registers, which exist in the PPU address space, therefore the CPU must access them through writes to a pair of address and data registers. The address register is also used by the PPU during rendering, therefore background (and supposedly sprite) rendering must be disabled in order to write to the palette mid-scanline. Writes to the data register autoincrement the address by your choice of 1 or 32 bytes.
Since rendering is disabled, you'd expect it necessary to continually change the zeroth palette register at $3F00, which controls the overall background color. If the address register were configured to increment by 1, we'd have to reload the address each time to reset it to $3F00. Adding two absolute store instructions to reset the address would take 8 clock cycles each time. The CPU is very slow compared to the video signal - an NTSC NES PPU outputs 3 pixels during each CPU cycle, so 8 CPU cycles would increase the width of each color cell in the picture above by 24 pixels. They wouldn't fit on the screen! Alternatively, the 32 byte increment mode (really intended for updating vertical stripes of background tiles) saves us this trouble, because the palette registers repeat every 32 bytes between $3F00 and $3FFF, so each increment leaves us pointing at a mirror of the $3F00 palette register ($3F20, $3F40, etc.) This works until the eighth write, when we wrap around to $0000, which isn't a palette register at all. At that time, we'd still have to do our two writes to reset the address to $3F00, introducing a wider stripe in the middle of the screen which clearly doesn't exist in the screenshot above. Or perhaps you could do three writes (the address is latched after the first two), so that you only have to do one write to reset the address, but that's not what's going on here.
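The address arithmetic behind that dead end is easy to check. This sketch models only what the paragraph above describes: a 14-bit address that wraps at $4000, palette registers mirrored every 32 bytes across $3F00-$3FFF, and an autoincrement of 1 or 32. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The address wraps from $3FE0 + 32 to $0000, i.e. it is 14 bits wide. */
#define PPU_ADDR_MASK 0x3FFFu

/* Address the register holds after `writes` data writes from `start`. */
uint16_t ppu_addr_after_writes(uint16_t start, unsigned step, unsigned writes)
{
    assert(step == 1 || step == 32);
    return (uint16_t)((start + step * writes) & PPU_ADDR_MASK);
}

/* True if the address falls in the palette region $3F00-$3FFF. */
int is_palette_addr(uint16_t addr)
{
    return (addr & PPU_ADDR_MASK) >= 0x3F00;
}

/* Which of the 32 mirrored palette entries an address selects. */
unsigned palette_index(uint16_t addr)
{
    return addr & 0x1F;
}
```

Starting at $3F00 with a step of 32, writes one through eight land on $3F00, $3F20, ... $3FE0, all of which are entry 0. After the eighth write, ppu_addr_after_writes(0x3F00, 32, 8) comes out as $0000, which is no longer palette memory.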
So how does it do it? The demo relies on two odd features of the hardware. Most importantly (and perhaps by accident), when rendering is disabled and the address register points at a palette register, that color will be displayed rather than the expected background palette entry (that is, $3F00). This allows the built-in autoincrement by 1 to cycle through which palette register will determine the color at the current raster position. Because this is a post-increment, the display color will always be ahead of the colors we're writing, so it actually displays the palette entries we programmed on the previous scanline, but this isn't a problem. Using this approach, unrolling the two instruction loop "stx $2007; inx" allows us to change the color in 4+2=6 CPU cycles, or every 18 pixels, which corresponds to the width of each color stripe in the screenshot. Notice that the color stripes are not smooth, but rather have a rough edge. To the best of my knowledge it is not possible to synchronize precisely with the previous scanline and straighten this out, because the scanline width in pixels (including the horizontal blank) is not an integral number of CPU cycles.
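The timing arithmetic can be spelled out the same way. The 3 pixels per CPU cycle and the 4+2 cycle count come from the text above. The 341-dot scanline length is my addition (the usual NTSC figure), and the 256-pixel visible width is likewise assumed rather than stated. The function names are hypothetical.

```c
#include <assert.h>

enum {
    PPU_DOTS_PER_CPU_CYCLE = 3,    /* NTSC, from the text */
    PPU_DOTS_PER_SCANLINE  = 341,  /* assumption: standard NTSC scanline length */
    VISIBLE_PIXELS         = 256   /* assumption: visible width of a scanline */
};

/* Width in pixels of a color stripe that takes `cpu_cycles` to change. */
int stripe_width_pixels(int cpu_cycles)
{
    assert(cpu_cycles > 0);
    return cpu_cycles * PPU_DOTS_PER_CPU_CYCLE;
}

/* Number of complete stripes that fit across the visible scanline. */
int full_stripes_per_scanline(int cpu_cycles)
{
    return VISIBLE_PIXELS / stripe_width_pixels(cpu_cycles);
}

/* Dots left over when a scanline is divided into whole CPU cycles. */
int scanline_remainder_dots(void)
{
    return PPU_DOTS_PER_SCANLINE % PPU_DOTS_PER_CPU_CYCLE;
}
```

With the 6-cycle "stx $2007; inx" pair this gives 18-pixel stripes and 14 complete stripes per line, matching the 14 mid-scanline palette changes. The remainder of 2 dots is the reason the stripe edges can't be lined up from one scanline to the next.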
The other peculiar point, which might occur to anyone familiar with the NES hardware, is that the machine really organizes palettes into groups of three colors (plus a background/transparency color) each, four for background tiles, and four for sprites. Despite having four distinct palettes which can be applied to background characters, normally the first entry of each mirrors the overall background color in the first palette, at $3F00. If that held in this case, one in every four of those color stripes would be the wrong color, which isn't occurring. So it seems there are three more palette registers ($3F04, $3F08, $3F0C), but during normal rendering they all defer to $3F00, which seems a tremendous waste (unless, of course, you can explain this in a way which doesn't require these three registers to exist, such as the last written color value getting latched somewhere, but that's not what the guys who've actually done the reverse engineering say). I wonder what the rationale was for designing it that way.
With respect to the emulator, I realized I only needed to add one or two lines of code to emulate this effect. During normal rendering, my background and sprites are rendered instantaneously at the start of the scanline. If the palette were to change in the middle of that, it would indeed require reworking the video rendering so that you could interleave it with the CPU execution, mindful of timing (being called upon to catch up with the CPU after some number of cycles when a control register is about to be changed). My audio code already works in this way, as does the handling of the color attribute and mono bits (which are filled in a buffer parallel to the current line's color buffer and combined by the video output filter before the next line). Then I realized I'd overlooked the obvious - you only do these mid-scanline palette tricks when rendering is disabled anyway, so I'm free to fill over the contents of the color buffer using the same catchup mechanism. So it fell out for free.
- Music:Roni Size, "New Forms"
Every three or four years, I've pulled this code off the shelf and made minor improvements to it. This year over the holidays I really got sucked into it, and whipped the thing into pretty good shape. It can now run much of the commercial and homebrew NES software I've tried with little or no issue, and the audio sounds accurate to my ears. It can run fullscreen, use gamepads, and supports save/restore state, so by my standards it's ready for recreational use, even if (in this era of cycle-accurate emulators like Nintendulator and Nestopia) it doesn't offer anything new. Just what the world needs, another mediocre NES emulator, right? For this reason, I bestow upon it no frou-frou name, dumping it anonymously onto the net.
The renderer is scanline-based, except for color emphasis and the grayscale flag, which are tracked with pixel accuracy. This lets me emulate most of the interesting scanline-based effects, like the wavy water in the CMC Wall demo, the nifty copper bars demo, or (my favorite) the light beam effect when you light an orb in Final Fantasy.

Although it's probably passé in this era of exact NTSC composite video emulation (dot crawl, artifacts, and all), I added a neat interlaced scanlines mode entirely for the sake of seeing how this classic highres interlacing demo might look:

I don't think the effect would work on a real NES. As I understand it, the NES video output is not interlaced (that is, opposite video fields aren't offset, so you'd just see the two images flickering over top of each other). This might be the first thing I try when I finally get some kind of NES devcart put together.
Incidentally, if there were an award for Best Graphic Design in an 8-Bit Videogame Title Screen, I think Darkwing Duck would take it. Too bad the game doesn't live up to the title screen.

Having spent more time than I care to trying to nail the correct timing and behavior of the MMC3 IRQ counter, I'm thoroughly sick of messing with it for now. Plus, this way I'll leave plenty left to do this time next year when I return to it (precise timing, MMC3 quirks, MMC2 and MMC5, etc.). In the meantime, I can play Zelda, Final Fantasy, SMB1-3, and pretty much every NES game I'd actually want to play, perfectly well. I know the hacker machismo of the nesdev community demands you emulate every obscure Japanese game and one-off pirate mapper, but I can think of more interesting things to do. :)
I also used this project as a guinea pig to learn how to use Git, or attempt to - an effort I would classify as a failure and a waste of hours of time. It's really astonishing how unintuitive and opaque Git is, and how completely useless every tutorial I've seen on the web is (which all explain how to create a repository, make a commit or two, then go off rambling about semi-obscure operations without ever explaining the rest of the things you'll need to know every single day or presenting any kind of usable workflow).
All I wanted was a way to push/pull changes back and forth between my desktop and laptop via a copy of the repository on my web server as a backup / public repository. After much pain I think I did figure out the correct magic incantations (pull --rebase, etc.), but with the accumulated frustration, the last straw was noticing that all the web-visible source files were ancient versions, and that a git-status on the server's repository indicated a backward view of the repository (as if it wanted to commit the ancient version over top the newest work). In fact, I still don't know how to fix this, and it's probably easy, but at this point I don't care. I'm nuking the thing from orbit, because practically anything would be better than having to make three backups of my repository and spend half an hour on google every time I want to use the version control system. CVS would be better. Hell, diff, patch, and some duct tape would be better. Or cp -r. Or pretty much anything that doesn't have "detached heads."
Anyway, I dumped the code at http://vintage-digital.com/hefner/hacks/n
- Music: Mastodon, "Shadows That Move"
The principle is the same as all the racing games that used per-scanline scrolling changes to depict a curving track, but turned on its side, with walls instead of roads. Each scanline becomes a vertical slice of the display, and you precompute these for the columns of your wall texture at various scaling distances. Back of the envelope calculations suggest that these prescaled textures would consume a lot of memory, so you'd have to settle for very small/simple textures, store them in a large ROM (or precompute them into large RAM, e.g. 320 XE), or compute some of the intermediate scalings at runtime. This leads me to wonder how much time it would take to determine the visible surfaces each frame. Could be interesting.
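To put rough numbers on that back-of-the-envelope estimate, here is a small Python sketch. Everything in it is an assumption for illustration: the function names, the nearest-neighbour resampling, the choice to store each prescaled slice as one full fixed-length scanline, and the example figures (32 texture columns, 64 scale steps, 40 bytes per line) are hypothetical rather than measurements from any real implementation.

```python
def scale_column(column, wall_height, slice_len, background=0):
    """Resample one texture column to wall_height pixels (nearest neighbour),
    centred in a slice of slice_len pixels and clipped if it is too tall."""
    out = [background] * slice_len
    start = (slice_len - wall_height) // 2
    for i in range(wall_height):
        pos = start + i
        if 0 <= pos < slice_len:
            out[pos] = column[i * len(column) // wall_height]
    return out


def prescaled_bytes(texture_columns, scale_steps, bytes_per_slice):
    """Memory needed if every (column, scale) pair gets its own stored slice."""
    return texture_columns * scale_steps * bytes_per_slice


# Hypothetical figures: one texture, 32 columns, 64 distances, 40-byte lines.
full_table = prescaled_bytes(32, 64, 40)   # 81920 bytes (80 KiB)
half_table = prescaled_bytes(32, 32, 40)   # 40960 bytes if every other scale
                                           # is computed at runtime instead
```

Under those assumptions a single modest texture already overflows a 64 KiB address space, so the options above (tiny textures, a big ROM or banked RAM, or computing intermediate scalings on the fly) aren't optional extras.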

