For folks who raise questions about the overhead of C++ vs. C - check this out. While doing timing testing and tweaking for the CRGBSet stuff, I added support for C++11 style ranged loops - and I discovered that they were faster! For example - this code:

for( CRGB & pixel : leds) { pixel = CRGB::Black; }

runs about 10-20% faster on AVR than this code:

for(int i = 0; i < NUM_LEDS; i++ ) { leds[i] = CRGB::Black; }

That's 10% faster if NUM_LEDS is less than 255, 20% faster if NUM_LEDS is over 255. Bonus points to folks who can tell me why (I already know, and @Mark_Kriegsman, I suspect you do too).
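For anyone wanting to poke at the puzzle, here is a minimal sketch of the two loop shapes side by side. `Pixel`, `clearByIndex` and `clearByPointer` are made-up names for illustration, not FastLED code, and the comments describe what the language asks for, not a confirmed answer to the question above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for FastLED's CRGB (assumption: three 8-bit channels).
struct Pixel { uint8_t r, g, b; };

// Index-based clear: the loop keeps a counter, compares it to count,
// and addresses each element as base + i * sizeof(Pixel)
// (an optimizer may or may not simplify that arithmetic).
void clearByIndex(Pixel* leds, int count) {
    for (int i = 0; i < count; i++) {
        leds[i] = Pixel{0, 0, 0};
    }
}

// Pointer walk: roughly what a ranged loop over contiguous storage
// reduces to. There is no separate counter, just one pointer compared
// against an end pointer and bumped each pass.
void clearByPointer(Pixel* first, Pixel* last) {
    for (Pixel* p = first; p != last; ++p) {
        *p = Pixel{0, 0, 0};
    }
}
```

Note that `int` is 16 bits on AVR, so the counter in the first version is wider than the 8-bit registers the chip works with natively. Whether that is the whole story is a guess on my part.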

Some good stuff coming down the line :slight_smile: (Also, I’ve changed how I’m referring to these things internally to PixelSets, mostly because RGB pixels aren’t going to be the only types of pixels supported before long.)

This is GREAT. I’m loving the new compact, simple syntax for doing the thing that we do ALL the time: loop over pixels… Just great. And the performance increase? Very, very nice!

What a coincidence, I just finished the chapter in C++ primer on iterators and range loops!

No idea why a byte value would run faster though…

That’s a bit (8 to be precise) over my head!

Any tests on an ARM platform?

I don’t understand (yet) how the C++11 style loops work, but I do know that a 10% or 20% speed improvement is a very nice bonus.

Any link suggestions for where to learn the basics of this style loop would be welcome. Because 10%!

So - to be clear, this is 10-20% over what is a mostly empty loop (foo = CRGB::Black basically just copies 3 zeros). If you’re doing a whole lot of work inside of the loop, you probably won’t see much of an increase. (Also, the performance difference has as much to do with me tweaking the internals of CRGBSet and its iterators; it used to be 70% slower than the regular C loop.)

Not everything can use this kind of loop - although C++11 is trying to make it easier. is a pretty good article about it.

I need to be careful with these features in the library. Some things, like putting in support for iterator-based range loops, can be done in a way that folks using pre-C++11 compilers can still use the library. Other things, though (for example, if I were to start using ranged loops inside of FastLED anywhere), would break people who aren’t using C++11/C++14 compilers yet.
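As a rough illustration of that compatibility point: a ranged loop only needs the type to offer begin() and end(), and those are ordinary member functions that older compilers accept too. The sketch below uses a made-up `PixelView` class and `Pixel` struct; it is not how CRGBSet is actually implemented.

```cpp
#include <stdint.h>

// Hypothetical stand-in for CRGB (assumption: three 8-bit channels).
struct Pixel { uint8_t r, g, b; };

// Hypothetical minimal view over a pixel buffer. begin()/end() are plain
// member functions, so this class itself compiles as C++98.
class PixelView {
public:
    PixelView(Pixel* data, int count) : m_data(data), m_count(count) {}
    Pixel* begin() { return m_data; }
    Pixel* end() { return m_data + m_count; }
private:
    Pixel* m_data;
    int m_count;
};

// Works on any compiler: walk the iterators by hand.
void clearOldStyle(PixelView view) {
    for (Pixel* p = view.begin(); p != view.end(); ++p) {
        p->r = 0; p->g = 0; p->b = 0;
    }
}

#if __cplusplus >= 201103L
// Only compiled on C++11 and later: the same begin()/end() pair is all
// the ranged loop needs.
void clearRanged(PixelView view) {
    for (Pixel& pixel : view) {
        pixel.r = 0; pixel.g = 0; pixel.b = 0;
    }
}
#endif
```

The class costs nothing for people on older compilers, while a ranged loop written inside the library itself would fail to compile for them.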

Still, I’m looking forward to spending some time with the spec and new features and seeing what I can pull into FastLED to make things easier for folks :slight_smile:

@marmil for(reference to object of: this){ operation on object }

The compiler determines the elements in the range (this) and runs the code between the braces once for each element in “this”, without needing to manually increment an index the way we do with a traditional C for loop.

Declaring the loop variable as a reference (&) lets us modify the element we are currently at; without it, the loop variable is a copy, so you can “view” the object but any changes won't reach the original.

For example:

int i=0;
for(CRGB pixel: leds){i++;}

Would leave i equal to the number of pixels in leds (NUM_LEDS in the earlier example), which is to say, for each CRGB pixel in leds, we increment i by 1.
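To make the copy-versus-reference difference concrete, here is a small self-contained sketch. `Pixel`, the `NUM_LEDS` value and the three function names are stand-ins invented for this example, so it builds without FastLED.

```cpp
#include <cstdint>

// Hypothetical stand-in for CRGB (assumption: three 8-bit channels).
struct Pixel { uint8_t r, g, b; };

const int NUM_LEDS = 8;  // arbitrary size chosen for the example

// Counts one per element; `pixel` is a copy and is never used.
int countPixels(Pixel (&leds)[NUM_LEDS]) {
    int i = 0;
    for (Pixel pixel : leds) { (void)pixel; i++; }
    return i;
}

// By value: the write lands on the copy, so the array is unchanged.
void tryToSetRedByValue(Pixel (&leds)[NUM_LEDS]) {
    for (Pixel pixel : leds) { pixel.r = 255; }
}

// By reference: the write lands on the real element.
void setRedByReference(Pixel (&leds)[NUM_LEDS]) {
    for (Pixel& pixel : leds) { pixel.r = 255; }
}
```

Calling `tryToSetRedByValue` on a zeroed array leaves every `r` at 0, while `setRedByReference` sets them all to 255.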

If you Google range loop it should yield some insightful reading.

@Jarrod_Wagner right now the range loop is about 10% slower on ARM. Some of this, I suspect, is because a lot of operations that take only a single cycle on ARM can take 2-8 cycles on AVR. As I did with AVR, I’ll spend some time digging through the compiler output to see what I can tweak.

Ok - tracked down. It turned out the C loop on ARM was 10% faster than the ranged loop because the compiler was taking advantage of the loop conditional being a constant and was putting an extra bit of optimization in there. If I changed my test code to pass in a CRGBSet for the range loop test and a CRGB*/int for the C loop test, then the tables are reversed, and the C loop is now 10% slower. For those interested in what the compiler is doing under the hood - here’s a note dump from digging into it -