So... the Kickstarter has been going so awesome (100% funded in less than 48

(Adam Haile) #1

So… the Kickstarter has been going so awesome (100% funded in less than 48 hours!) I’m working on some stretch goals. First is that I want to update to FastLED 3.x to get the new chipsets. Wasn’t sure if it was going to blow my RAM limitations (and mean less pixel count) but it uses even less RAM, so great work there!

Question about dithering though. Docs say to either call show() or delay() frequently to get it to work best. The way the AllPixel works, however, show() gets called when new data comes to the device. So I don’t currently have any control over the frequency of that. I thought about calling FastLED.delay(0) every time I check for new data (which means it gets called either right after there WAS data or every time there isn’t… it just keeps looping around checking for serial data nonstop), but then I noticed it just calls show() internally. Which makes total sense. But here’s the confusion… is that going to decrease my frame rate performance? I don’t think so, given the tests I’ve done, but I wanted to be sure.

For example, I hooked up the max 680 LEDs (LPD8806), turned on dithering, and let a simple rainbow gradient rip as fast as it could. Without dithering, update (pushing bytes to AllPixel, and FastLED pushing them to the strip) took ~8ms. With dithering turned on it took… ~8ms. So no degradation. Is it really pushing the data out to the strip that fast? Or is it somehow that my setup is making it so dithering isn’t actually happening (I am decreasing the brightness to 64 for these tests)?

Partly, I can’t really see any difference, at least with what I’m testing. So I just want to make sure it’s being done right.


(Daniel Garcia) #2

680 LEDs at 12MHz is about 1.3ms per frame for writing the LPD8806 data. If you’re getting about a 2.4MHz data rate over USB it’ll take you ~6.7ms per frame to get the data onto the device. My guess is that while you are reading you aren’t also calling show - right?
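As a rough check on those two figures, here is a minimal sketch of the arithmetic. The helper name `frameTimeMs` is made up for illustration, and it assumes 24 bits per LED with no latch bytes or USB protocol overhead, so it lands slightly above the rounded numbers quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Milliseconds needed to move one frame of `numLeds` pixels, at
// `bitsPerLed` bits each, over a link running at `bitsPerSecond`.
// Hypothetical helper: it ignores latch bytes and protocol overhead.
inline double frameTimeMs(uint32_t numLeds, uint32_t bitsPerLed, double bitsPerSecond) {
    assert(bitsPerSecond > 0.0);
    const double bits = static_cast<double>(numLeds) * bitsPerLed;
    return bits / bitsPerSecond * 1000.0;
}
```

With these assumptions, `frameTimeMs(680, 24, 12e6)` comes out near 1.36 ms for the LPD8806 write and `frameTimeMs(680, 24, 2.4e6)` near 6.8 ms for the USB transfer.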

(Adam Haile) #3

That is correct… read all the serial bytes into the FastLED buffer and then call show()

(Daniel Garcia) #4

Where’s your code again for the reading? I have a thought on how to make this better for you :slight_smile:

(Adam Haile) #5

I’m all for better :slight_smile: The code that uses FastLED 2.x is here:
getData() line 146

(Daniel Garcia) #6

'k - I need to look at how the 32u4 usb works - because ideally what you would have instead would be a loop that looks something like:

CRGB leds[2][340];
uint8_t dispBuffer = 0;

void loop() {
  // always redraw whichever buffer currently holds a complete frame
  pLeds->show(leds[dispBuffer], numLeds);
}

with an interrupt handler being responsible for copying data from usb as it comes in, and then swapping the draw buffer once a full frame has come in. Or, conversely, adjust getData so that it will copy as many bytes as it has, but then do another show while waiting for the usb buffer to fill back up.

Double buffering like this would halve the number of leds you could drive (maybe make it a configuration option, to allow people to drive fewer leds but have dithering frames written out while the data for the next frame is being uploaded) - but would let you drive the higher frame rate for dithering. Hell, by allowing show and receiving of data to occur simultaneously (vs. right now where it’s show, then a blocking receive of data) you could simply get a higher frame rate.
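The buffer-swapping half of that idea can be sketched in plain C++ without any hardware. This is only an illustration under stated assumptions: `Rgb` stands in for FastLED's CRGB, `DoubleBuffer` and `pushByte` are hypothetical names, and a real 32u4 version fed from an interrupt handler would also need `volatile` and some care around the swap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Rgb { uint8_t r, g, b; };  // stand-in for FastLED's CRGB
static_assert(sizeof(Rgb) == 3, "Rgb must be tightly packed");

// Two frames of N pixels: one is displayed while the other fills up.
template <std::size_t N>
class DoubleBuffer {
public:
    // Feed one received byte into the back buffer. Returns true when
    // this byte completed a frame and the two buffers were swapped.
    bool pushByte(uint8_t value) {
        assert(fill_ < N * sizeof(Rgb));
        uint8_t* raw = reinterpret_cast<uint8_t*>(frames_[front_ ^ 1]);
        raw[fill_++] = value;
        if (fill_ < N * sizeof(Rgb)) return false;
        fill_ = 0;     // start the next frame from the beginning
        front_ ^= 1;   // the freshly filled buffer becomes the front
        return true;
    }

    // The last complete frame, i.e. the one the main loop keeps showing.
    const Rgb* front() const { return frames_[front_]; }

private:
    Rgb frames_[2][N] = {};
    std::size_t fill_ = 0;
    uint8_t front_ = 0;
};
```

The main loop would keep showing `front()` as fast as it can, so dithering keeps running, while incoming bytes only ever touch the other buffer until a full frame has arrived.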

It looks like with the 32u4 you can have multiple 64 byte endpoints (hell, even 1 256 byte endpoint) that can fill up in the background without any cpu interaction.

Of course, WS2811 pixels would throw a monkey wrench into this (though if I can fix the interrupt handling there, it might open some more options for you).

64 bytes at that 2ish MHz rate would take you ~256µs to transfer over usb, and could probably be copied into ram in roughly … (2 clock cycles to read UEDATX from io, 2 cycles to write/post increment) 16µs + overhead - so it would fit in the 30µs window that you have for interrupts if/when I fix the WS2812 interrupt handling for AVR :slight_smile:
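The budget above works out as follows. This is a back-of-the-envelope sketch with hypothetical helper names, and it assumes a 16MHz CPU clock, which is what the 16µs figure implies but is not stated outright.

```cpp
#include <cassert>

// Microseconds for one endpoint's worth of data to arrive over USB.
inline double usbFillTimeUs(unsigned bytes, double bitsPerSecond) {
    assert(bitsPerSecond > 0.0);
    return bytes * 8 / bitsPerSecond * 1e6;
}

// Microseconds to copy that data into RAM at a fixed cost per byte
// (2 cycles to read UEDATX plus 2 cycles to store gives 4 per byte).
inline double copyTimeUs(unsigned bytes, unsigned cyclesPerByte, double cpuHz) {
    assert(cpuHz > 0.0);
    return bytes * cyclesPerByte / cpuHz * 1e6;
}
```

`usbFillTimeUs(64, 2e6)` gives 256µs and `copyTimeUs(64, 4, 16e6)` gives 16µs before overhead, which leaves some headroom inside a 30µs interrupt window.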

Anyway - a lot of thoughts out loud to think about :slight_smile:

(Adam Haile) #7

Interesting… I’ll definitely have to keep that in mind. I like the idea of a user option for a little extra speed at the expense of pixel count. And yeah… that was purely a design decision of valuing pixel count over everything else, while still working with the 32u4 (there’s thoughts of a “Pro” version with a bigger chip though…). And the lack of interrupts was pretty much so that framerate was purely controlled by how fast the data came in. But obviously if I wanted to do dithering I should just have it always pushing frames out no matter what. Since I’ve got to have the WS2811, I’ll hold off for now… let me know if you need a hand testing any of that when you fix the interrupts.

(Daniel Garcia) #8

yeah - the one downside is it takes you about 4 times as long to push a frame to the device as it does to write out an LPD8806 or APA102 frame - that’s 4 dithered frames you could output in the meantime :slight_smile:

(Adam Haile) #9

Well… in that case, what stops me from just calling show() directly after getData()? As you can see, the only thing in loop() is getData(). So it would be doing the dither frames all the while during its time waiting for the next bunch of data. For example, if I set the animation to run at 60fps and it’s only taking 6ms to send and show the data, I should be able to push out at least a couple dithering frames during the extra 10ms before the next frame comes down the pipe… no interrupts required. Right? My main concern here is that it would delay the next “real” frame a few milliseconds if it’s in the middle of a dither show()

(Daniel Garcia) #10

You could certainly do that easily enough. So - from the numbers above, it takes 1.3ms to write an LPD8806 frame, and it takes 6.7ms to read an entire frame (and once you start reading a frame, you read it until you have everything). So, that’s 8ms per frame. If you are generating frames at 60fps, that’s 16ms per frame, give or take.

If you were calling show every time through the loop after getData (in fact, you could probably just get rid of the call to show in getData at that point) you’d get ~7 calls to show in.

So, what you’d have would be 7 calls to show each happening roughly 1.3ms apart, followed by 6.7ms of no updates to the leds while you read in a frame, etc, or a refresh rate of roughly 420Hz for your 60fps. (@Mark_Kriegsman and I have started trying to differentiate between frame rate, aka “how many times a second you change the contents of the led array”, and refresh rate, aka “how many times a second show is called”. For many people, these numbers are the same, but for some of us, there can be a huge difference between the two :slight_smile:)
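To make the frame rate versus refresh rate distinction concrete, here is a small sketch of the estimate used above. The names are hypothetical, and it assumes the host really does deliver frames at the requested rate and that each read blocks for the full transfer time.

```cpp
#include <cassert>
#include <cmath>

struct RefreshEstimate {
    int showsPerFrame;  // calls to show() that fit between two reads
    double refreshHz;   // shows per second at the given frame rate
};

// fps: how often the LED array contents change.
// readMs: blocking time to receive one frame.
// showMs: time for one call to show().
inline RefreshEstimate estimateRefresh(double fps, double readMs, double showMs) {
    assert(fps > 0.0 && showMs > 0.0);
    const double periodMs = 1000.0 / fps;
    const double idleMs = periodMs - readMs;  // time left for show()
    const int shows = idleMs > 0.0 ? static_cast<int>(std::floor(idleMs / showMs)) : 0;
    return RefreshEstimate{shows, shows * fps};
}
```

`estimateRefresh(60, 6.7, 1.3)` gives 7 shows per frame and a 420Hz refresh rate, matching the numbers in the post.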

Now, for laughs, what about WS2811? Well, WS2811 is going to take you 20ms to write a full frame, so in fact your 60fps is right out the window. Instead, you’re looking at a minimum of 26.7ms/frame for 680 WS2811s (vs. 8ms/frame for 680 LPD8806) - which means your frame rate will be capped at around 37fps, which, in this case, would also be your refresh rate.
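The WS2811 cap can be checked the same way. This sketch uses hypothetical helper names and assumes the 800kHz WS2811 mode (1.25µs per bit) with the reset gap ignored, so it gives 20.4ms rather than the rounded 20ms.

```cpp
#include <cassert>

// Milliseconds to clock out one WS2811 frame at 800kHz (1.25us per bit).
inline double ws2811WriteMs(unsigned numLeds) {
    return numLeds * 24 * 1.25e-3;
}

// Highest frame rate when every frame is a blocking read followed by
// a blocking write, with nothing overlapping.
inline double maxFps(double readMs, double writeMs) {
    assert(readMs + writeMs > 0.0);
    return 1000.0 / (readMs + writeMs);
}
```

`maxFps(6.7, ws2811WriteMs(680))` comes out at roughly 37fps, against 125fps for `maxFps(6.7, 1.3)` with the LPD8806.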

(And now you understand why I either want WS2811’s to die in a fire and/or I’ve been working on doing parallel output for WS2811 type leds on arm platforms :slight_smile:)

(Adam Haile) #11

Sounds about right… and yeah, while I understand the draw of the 2811 from form factor… it’s SO slow. Makes sense why there are things like the FadeCandy and OctoWS2811 with their 8 output channels… no other way to get good frame rates. I’m really hoping that the APA102 takes over as the “new hotness” before long. Looks like the best of both worlds. Need to get my hands on some and try it out. I’ll play around with this method and see if it doesn’t totally suck on the slower chips like the 2811 and 2801… playing around with some further updates to the AllPixel firmware :slight_smile: