Ok library gurus (aka @Daniel_Garcia and @Mark_Kriegsman ), am I (finally) hitting the limit of what the WS2801 can handle here?
I can read 144 bytes from SD in an average of 430 usecs. When trying to read from SD -> push to FastSPI -> read next 144 bytes -> push to FastSPI, the resulting image is horribly corrupted. I need to add a pause between each refresh. The way I’m reading the data in is by pushing all 144 bytes straight into the array:
myFile.read((char*)leds, NUM_LEDS * 3);
The images below list the pause between each read -> push loop. So where’s the issue here? (I have some LPD8806 on the way which I will try once I get them, but till I do, I have no idea whether it is the speed of the IC or not.)
What causes the image corruption? Is the problem that you’re writing data out/updating the strips faster than your camera/rig can catch them? (Given that your strip is writing out in about 1.1ms, and your reading takes 430µs, that’s about 1.5ms/frame total, or roughly 666 fps, which I suspect is faster than you want for your POV setup.)
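To make that arithmetic concrete, here is a minimal sketch in plain C++ (nothing Arduino-specific). The 430 µs read and roughly 1.1 ms strip write are the figures quoted in this thread; the function and constant names are made up for illustration. With the unrounded 1.53 ms total the rate comes out a little under the rounded 666 fps figure.

```cpp
#include <cstdint>

// Total time for one column: read it from SD, then push it to the strip.
constexpr uint32_t frameTimeUs(uint32_t readUs, uint32_t writeUs) {
    return readUs + writeUs;
}

// Columns per second for a given per-column time in microseconds.
constexpr double columnsPerSecond(uint32_t frameUs) {
    return 1000000.0 / frameUs;
}

// Figures taken from the thread (both are approximate measurements).
constexpr uint32_t kReadUs  = 430;   // average SD read of 144 bytes
constexpr uint32_t kWriteUs = 1100;  // time to clock 144 bytes out to the strip
constexpr uint32_t kFrameUs = frameTimeUs(kReadUs, kWriteUs);  // 1530 us
```

Any extra pause added between refreshes simply adds to the frame time, so a 1000 µs pause on top of this would drop the rate to a few hundred columns per second.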
From the pictures it looks like some column values are being repeated a few times. To me, that suggests that the new column isn’t being read (properly, or maybe at all) sometimes. I’d investigate whether you’re getting read failures of some sort.
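One way to check for short or failed reads could look like the sketch below. It is written as plain C++ so it can be compiled off the board. `Source` is an assumed stand-in for the SD file object (anything with a `read(buf, n)` that returns the number of bytes read, or a negative value on error), and `FakeSource` is a hypothetical test double, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr size_t NUM_LEDS = 48;
constexpr size_t COLUMN_BYTES = NUM_LEDS * 3;  // 144 bytes per column

// Reads one full column into dst. Returns true only if every byte arrived.
// Source is any type exposing `int read(void* buf, size_t n)`; that shape is
// an assumption made for this sketch.
template <typename Source>
bool readColumn(Source& src, uint8_t* dst) {
    int got = src.read(dst, COLUMN_BYTES);
    return got == static_cast<int>(COLUMN_BYTES);
}

// Hypothetical test double: hands out at most `available` bytes in total.
struct FakeSource {
    size_t available;
    int read(void* buf, size_t n) {
        size_t count = n < available ? n : available;
        std::memset(buf, 0xAB, count);
        available -= count;
        return static_cast<int>(count);
    }
};
```

Comparing the return value is cheap. If the concern is timing, bumping a failure counter inside the loop and printing it only after the image has finished should cost far less than printing on every failure.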
Ok well there may be more involved here.
In the original code (without the SD), it’s being bitbanged on the SPI bus, and it is also reading straight out of PROGMEM. So it flies like a bat out of hell. It can push entire columns out to the string with absolutely no pauses between each column.
Now, I’m not only reading an SD card, but the string is also being bitbanged on non-SPI pins (the SD is on the SPI bus). So I’m trying to figure out whether it is indeed the SD that isn’t reading fast enough, or whether I can’t push data out fast enough on non-SPI pins.
I think adding a check for read failure will definitely slow the process down, at which point it would work again. 
What chipset are you using? I was certainly able to get above 2Mbps with bitbang’d SPI data on the 16MHz AVR.
It’s my test string of WS2801s. My LPD8806s haven’t arrived yet.
I mean what are you driving them off of - Arduino, Teensy 3?
Looking at the original code, they can push this specific image with a delay as small as 273 usecs and it looks perfect. Keep in mind, this is reading straight from PROGMEM and bitbanging the SPI pins. The smaller the delay, the faster the performer needs to spin.
I’m well aware that I can’t get that kind of speed when I’m reading the SD as that will read a 144 byte chunk in an average of 430 usecs (this got bench tested with help from the sdfat library author.) So if I want to be generous, I would think 500 usecs should be achievable. But even when I try 750 usecs, it still doesn’t work. I can’t get anything under 1000 usecs to display right. Doesn’t matter what the image is.
Ok - so change up how you’re reading data a little bit. Instead of reading one line off of the SD card at a time, read the whole image into memory at once (you should have enough memory to have at least two, if not more, images in memory, right?).
Then interleave reading in the next image with moving between the lines of the current image. Read 1/3rd of a line at a time, or something like that.
Then, at some point after the next image has finished loading, and you’re tired of showing the current image - flip images.
This should give you a lot more room to play with.
Erm, how do you propose shoving 11,091 bytes of one image into 1K SRAM? Or even the larger 4K SRAM AVRs?
11091 isn’t a multiple of 144 bytes - how many “columns” is a single image, given that each column is 144 bytes (48 pixels)?
Sorry, the 11,091 is a different image (11,088 + 3 bytes for the header). Each image is different. This specific image has 43 columns, 48 rows (all images have 48 rows). Plus 3 bytes for the header.
So (43 x 48) x 3 = 6,192, plus the 3 bytes for the header and you get 6,195 bytes. Still won’t fit in SRAM.
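For anyone checking the numbers, here is a small plain C++ sketch of the size math. The layout (3-byte header, 48 rows, 3 bytes per pixel) is as described in this thread; the helper names are invented for illustration, and the SRAM sizes are just the 1K and 4K figures mentioned above, ignoring whatever the stack and other variables need.

```cpp
#include <cstdint>

constexpr uint32_t kRows = 48;           // pixels per column (all images)
constexpr uint32_t kBytesPerPixel = 3;   // one byte each for R, G, B
constexpr uint32_t kHeaderBytes = 3;     // per-file header
constexpr uint32_t kColumnBytes = kRows * kBytesPerPixel;  // 144

// File size in bytes for an image with the given number of columns.
constexpr uint32_t imageFileBytes(uint32_t columns) {
    return columns * kColumnBytes + kHeaderBytes;
}

// Number of columns in a file of the given size (header removed first).
constexpr uint32_t columnsInFile(uint32_t fileBytes) {
    return (fileBytes - kHeaderBytes) / kColumnBytes;
}

// True if the whole image would fit in the given amount of SRAM.
constexpr bool fitsInSram(uint32_t imageBytes, uint32_t sramBytes) {
    return imageBytes <= sramBytes;
}

static_assert(imageFileBytes(43) == 6195, "43-column image plus header");
static_assert(columnsInFile(11091) == 77, "the 11,091-byte image has 77 columns");
static_assert(!fitsInSram(imageFileBytes(43), 4096), "too big even for 4K SRAM");
```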
Did some more testing. I’m seeing the same failures at the higher rates on the other code that reads from PROGMEM. Although that code can easily push data out with 750 usecs delay between updates to this same string, going any faster causes problems. And going from 750 usecs to 1,000 usecs isn’t something I’m going to cry about. Sooo, it’s a waiting game for me right now. Last I checked, my shipment is somewhere in California trying to find a turtle coming this way. Once I get the LPDs and build the custom boards, I’ll try higher speeds, see what happens.