Question for the community.

clolsonus · March 9, 2018, 11:17pm

Question for the community. I have an application (a fixed wing autopilot) that I run on a PocketBeagle (so the main disk is a micro SD card). My app has a main loop that runs at 100 Hz. This loop writes a bit of data each time through. I am noticing that periodically my loop runs long because the kernel drivers get busy writing data to the SD card. Essentially, the write performance of my SD card is the main bottleneck and the main source of delays in my system.

This isn’t an ultimate show stopper, but I need to buy another micro sd card or two and I’m trying to figure out a good choice.

Are there bottlenecks in the PocketBeagle hardware itself, such that above some threshold it would be pointless to spend more money on a fancier micro SD card? Or is it productive to go crazy and buy the fastest UHS-II card available for high speed/high resolution cameras? Has anyone looked into this in detail?

Thanks!

Andrew_Tridgell · March 10, 2018, 1:22am

Are you running the main loop as a realtime (FIFO scheduled) process? Are you doing the IO writes in a separate (lower priority) thread?

Andrew_Tridgell · March 10, 2018, 1:24am

Also, faster microSD cards are not always best for this application. What you care about is worst case latency, not IO bandwidth. The “fast” cards are optimised for IO bandwidth for cameras. It isn’t unusual for them to have higher worst case latency, because they achieve the bandwidth through larger buffers, and larger buffers can mean worse latency.
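
One rough way to compare cards on worst case latency rather than bandwidth is to time many small writes and look at the slowest one. The sketch below is only an illustration, not something from this thread: `measure_write_latency`, the record size, and the number of writes are all assumptions, and it calls `os.fsync` after every write, which is harsher than what a typical logger does. Point it at a file on the card under test (the system temp directory used in the demo may be a RAM disk on some systems).

```python
import os
import tempfile
import time


def measure_write_latency(path, n_writes=200, record_size=256):
    """Return (mean, worst) latency in seconds for small write+fsync calls."""
    record = b"x" * record_size
    latencies = []
    with open(path, "wb", buffering=0) as f:
        for _ in range(n_writes):
            start = time.perf_counter()
            f.write(record)
            os.fsync(f.fileno())  # ask the kernel to push the data to the device
            latencies.append(time.perf_counter() - start)
    return sum(latencies) / len(latencies), max(latencies)


if __name__ == "__main__":
    # Hypothetical location: replace with a path on the SD card being tested.
    target = os.path.join(tempfile.gettempdir(), "latency_test.bin")
    mean, worst = measure_write_latency(target)
    print(f"mean {mean * 1e3:.3f} ms, worst {worst * 1e3:.3f} ms")
    os.remove(target)
```

A card with a good mean but a poor worst case is the kind that would make a 100 Hz loop run long every so often.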

Andrew_Tridgell · March 10, 2018, 1:27am

This is worth watching: https://www.youtube.com/watch?v=K3zb6p0thQU
Plus everything Bunny has written on microSD, such as this: https://www.bunniestudios.com/blog/?p=2297 (search for his other posts)

clolsonus · March 10, 2018, 1:50am

@Andrew_Tridgell I am intentionally maintaining a single threaded architecture on the linux processor in the name of simplicity. You may be right, but it’s my impression that the delays are happening in kernel driver space, not user process space, and as such affect any user process of any priority similarly.

Watching the video now … finding out far more than I expected (more than I wanted?), but fascinating!

Richard_Parsons · March 10, 2018, 10:29am

Put the timing-critical things in the PRU. Then you can write a driver to feed data to it.
Linux is not designed to be deterministic, there will always be jitter.

Another method is to change the task priorities and scheduler. The chrt command sets a process's real-time scheduling policy and priority level.
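
As a rough illustration of the scheduler change, a process can also request the FIFO policy for itself instead of being launched through `chrt -f`. This is a minimal sketch assuming Linux and Python's standard `os.sched_setscheduler`; the helper name `try_set_fifo` and the priority value 10 are made up for the example. The call normally needs root or the CAP_SYS_NICE capability, so the sketch falls back quietly when it is refused.

```python
import os


def try_set_fifo(priority=10):
    """Try to switch the calling process to SCHED_FIFO; return True on success."""
    if not hasattr(os, "sched_setscheduler"):
        return False  # scheduler API not available on this platform
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except OSError:
        # Typically a permission error when not running with enough privilege
        return False


if __name__ == "__main__":
    if try_set_fifo():
        print("running under SCHED_FIFO")
    else:
        print("could not switch to SCHED_FIFO, staying on the default policy")
```

Be careful with this on a single core board: a FIFO process that never sleeps or blocks can starve everything else at lower priority.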

Andrew_Tridgell · March 10, 2018, 9:04pm

I think you’ll find that doing IO in a separate thread makes a big difference. We don’t get problems with microSD IO in the HAL_Linux port of ArduPilot. We use a separate thread for IO, plus a thread for each sensor bus (eg. if 2 SPI buses and 2 I2C buses then 4 bus threads). We get very good scheduling results even at high loop rates.
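
To make the separate IO thread idea concrete, here is a minimal sketch. It is not the ArduPilot HAL_Linux implementation; the `BackgroundLogger` class, its queue size, and the drop-on-full policy are assumptions for illustration. The main loop only puts a record on an in-memory queue, which does not block, and a worker thread absorbs any time spent waiting on the SD card.

```python
import queue
import threading


class BackgroundLogger:
    """Hands log lines to a worker thread so the caller never waits on disk."""

    def __init__(self, path, max_pending=1000):
        self._queue = queue.Queue(maxsize=max_pending)
        self._file = open(path, "a")
        self.dropped = 0  # records discarded because the queue was full
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, line):
        """Called from the main loop; returns immediately."""
        try:
            self._queue.put_nowait(line)
        except queue.Full:
            self.dropped += 1  # prefer losing a record to stalling the loop

    def _run(self):
        while True:
            line = self._queue.get()
            if line is None:  # sentinel posted by close()
                break
            self._file.write(line + "\n")  # may block on the card; only this thread waits

    def close(self):
        self._queue.put(None)
        self._thread.join()
        self._file.close()
```

A main loop would call `logger.log(...)` once per iteration and `logger.close()` on shutdown. Dropping records when the queue fills is one possible policy; a larger queue is another.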

clolsonus · March 11, 2018, 1:01am

I thought I had explored logging in a separate thread unsuccessfully, but that would have been quite a while ago. I’ll have to take another pass at it. I’m running a PocketBeagle here (single core CPU). About a year ago we had a student plot out the time interval between IMU samples on a PX4 / Pixhawk system (scheduler and messaging using uOrb) and discovered that it was all over the map, with occasionally some hideous delays (like 0.1 - 0.4 seconds) when logging to a slower SD card … but that was running on vanilla Pixhawk hardware. Note: this was the sampling interval as far as we could tell, not just the logging interval. The aircraft still gets around the sky just fine even when the flight controller goes out to lunch for a few tenths of a second, thanks to Newton.
Personal commentary: uOrb is beautiful from a computer science and operating system perspective, but seems to suffer a lot of time jitter issues under normal loads.
You might not expect that given that it is running on nuttx as a real time system … but it’s been my experience that true real-time is a lot harder than it looks and often people get unexpected results unexpectedly … especially with respect to timing intervals.

Andrew_Tridgell · March 11, 2018, 8:46am

uOrb is horrible - we don’t use it in ArduPilot. We’re also moving away from NuttX to ChibiOS on stm32, with vastly better performance results. See my presentation here for details: https://www.youtube.com/watch?v=y2KCB0a3xMg
We can now get a standard deviation on loop times of under 4 microseconds on an STM32F427. That includes logging and running a pile of SPI and I2C sensors.

clolsonus · March 11, 2018, 6:54pm

Hi Tridge +1 on your talk! Very interesting stuff. Really thrilled that you guys are focused on timing precision because I hadn’t seen anyone in the open-source uav community examine that closely before (and obviously there are some issues with some of the systems out there.) And I see you’ve sufficiently way over engineered it so the results are without question. Well done!

clolsonus · March 18, 2018, 5:09pm

Tridge: I took another run at this and posted my results to the http://beagleboard.org group here. Thanks for keeping this at the front of my brain! The SCHED_FIFO features of the kernel (which everyone else seemed to know about except for me) make a huge difference and give me the level of control I need to run my main app ahead of those pesky dumb kernel threads.

Andrew_Tridgell · March 18, 2018, 8:56pm

Glad it helped. Drop by our mumble channel sometime if you want to discuss this stuff in more detail. http://ardupilot.org/dev/docs/ardupilot-mumble-server.html
The key with SCHED_FIFO threads on Linux is to align the threads with the hardware resources. So a thread per bus, plus a thread for filesystem IO, a thread for UARTs, etc.

clolsonus · March 18, 2018, 9:32pm

I outsource all the sensor I/O to a teensy-3.2. With the sensor I/O taken away, the core of the autopilot needs to happen in a step by step sequence (receive sensor data packet(s), compute EKF, run PIDs, send out actuator commands). What’s left over after that is higher level mission code, telemetry / communication, and logging. It makes sense to push logging into a separate process/thread so it can eat any time spent blocking. Outside of that, pretty much everything else works just fine in the single main thread (using non-blocking I/O). That’s just my 2 cents, but I’m not trying to do anything super fancy or support every platform on the market.
I’ve heard of mumble, and my kids mumble whenever I ask them questions.