As we discussed in my previous column, until a couple of weeks ago, the Arduino Uno microcontroller development board we’ve all grown to know and love was the R3 (“Revision 3”). This little rascal is powered by an ATmega328P microcontroller unit (MCU), which proffers an 8-bit data bus, a 16-MHz clock, 2 KB of SRAM (in which to store local variables and any data generated by the user’s program), and 32 KB of flash memory (in which the user’s program is stored). When this board is powered up, it immediately runs whatever program resides in the MCU’s flash memory.
Also of note is the fact that this board’s digital input/outputs (I/Os) switch between 0 V and 5 V, as compared with most of the newer Arduino offerings, which switch between 0 V and 3.3 V. This is important because, thanks to the open-source nature of the Arduino, a huge ecosystem of plug-in boards called “shields” has come into being. These shields, which are equipped with header pins that match the Uno’s footprint, include sensors, displays, motor controllers, relays, Wi-Fi, and myriad other things. Not surprisingly, these shields all expect to see the 5 V signals employed by the Uno R3.
As an aside, even though a 16-MHz clock, 2 KB of SRAM, and 32 KB of flash memory may not strike you as being overwhelming, I’ve happily used a bunch of Uno R3s to drive all sorts of projects over the years, such as my Superior Audio-Reactive Artifact, for example (where the “Superior” is part of its official moniker).
In this case, I started with a small travel suitcase that appears to be an expensive antique crafted out of wood and leather, but that’s really a cheap-and-cheerful imitation intended only for home décor. Next, I took a thin plywood panel, drilled holes in it, and painted it black, after which I used epoxy to attach a bunch of broken and discarded vacuum tubes to this panel.
Next, I attached tricolor light-emitting diodes (LEDs) to the bottoms of the vacuum tubes. These were NeoPixels from Adafruit, which means they can be daisy-chained together. In turn, this means they can all be driven using a single pin on the MCU.
I also mounted a small microphone on the front of the case and used this to feed an 8-pin MSGEQ7 spectrum analyzer device. You can purchase these little scamps as standalone components from SparkFun, or you can buy pre-built breakout boards from SparkFun or on eBay.
The Arduino constantly loops around reading the audio samples and driving the LEDs. When anyone is talking or music is playing, different tubes flicker with different colors to reflect the various frequency components of the sound source. Suffice it to say that this is a real eye-catcher that always attracts attention and positive comments, especially from those of us who were obliged to endure the minimalistic sound-to-light effects available in the 1970s.
The point of these meandering musings is that my Superior Audio-Reactive Artifact is powered by an Arduino Uno R3 with lots of “headroom” to spare. However, having said this, I do have projects that require more “oomph” on the processing front. This is why I was so excited when, just a couple of weeks ago as I pen these words, the guys and gals at Arduino launched two new versions of the Arduino Uno in the form of the R4 Minima and the R4 WiFi (the folks at Arduino assure me that they have no plans to discontinue the R3, for which they foresee strong continued demand). One crucial point is that the R4s provide the same 5 V signals as the R3, thereby allowing us to reuse our existing collection of shields (phew!).
Both R4s are powered by an RA4M1 MCU from Renesas. This little rascal is based on a 32-bit Arm Cortex-M4F core (the ‘F’ means it includes a hardware floating-point unit, which is sadly lacking in the R3). Its 32-bit data bus is, of course, 4X the width of the R3’s 8-bit data bus. The R4’s clock runs at 48 MHz (3X the R3), it’s equipped with 32 KB of SRAM (16X the R3), and it boasts 256 KB of flash memory (8X the R3). The R4 WiFi also boasts an Espressif ESP32-S3 module for Wi-Fi and Bluetooth Low Energy connectivity, but that’s a story for another day.
Some of the early PR I saw for the R4s boasted “3X the Performance!” I don’t know about you, but this was a tad underwhelming to me. This number is, of course, derived from the fact that the R4’s 48-MHz clock is 3X the frequency of the R3’s 16-MHz clock, but there’s much more to performance than this. Take floating-point operations, for example. Although the R3’s MCU doesn’t include a floating-point unit (FPU) in hardware, we can still use floating-point operations in our R3 code because the compiler implements them in software, breaking them down into sequences of operations on 8-bit “chunks.” Of course, since the R4 MCUs do have hardware FPUs, they should execute floating-point operations much faster.
How much faster? Enquiring minds (like mine) want to know. I’m sure that, if we were to delve deep into the data sheets for these MCUs, we could determine how many clock cycles each operation consumes, but (a) who wants to spend time looking through data sheets and (b) where’s the fun in that?
I first decided to explore the relative performance of integer operations. There are three basic sizes of integers: short ints, regular ints, and long ints. How big are these data types? It varies depending on the width of the MCU’s data bus and its underlying internal architecture. All the C/C++ specifications have to say about this is that a short int is at least 16 bits, a long int is at least 32 bits, and a regular int is at least 16 bits and sits somewhere between the two (it’s 16 bits on an R3 and 32 bits on an R4).
As a point of reference, in the case of my integrated development environment (IDE), I’m using the latest-and-greatest Arduino IDE 2. Just to check that everything was as expected, I created a simple test program that determines and prints the size of short ints, regular ints, and long ints (the results are presented in terms of 8-bit bytes).
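
A minimal sketch of such a size check might look like the following. It is written as portable C++ and is not the author’s actual Arduino program; the helper name `describeIntSizes` is hypothetical. On an Arduino, the same `sizeof` expressions could be printed from `setup()` using `Serial.println()`.

```cpp
#include <climits>
#include <string>

// Minimum widths that every conforming C/C++ compiler must honor.
static_assert(sizeof(short) * CHAR_BIT >= 16, "short int is at least 16 bits");
static_assert(sizeof(int)   * CHAR_BIT >= 16, "int is at least 16 bits");
static_assert(sizeof(long)  * CHAR_BIT >= 32, "long int is at least 32 bits");

// Hypothetical helper (not from the original program): builds a short report
// of the sizes, in bytes, of the three basic integer types on this platform.
std::string describeIntSizes() {
    std::string report;
    report += "short int: " + std::to_string(sizeof(short)) + " bytes\n";
    report += "int:       " + std::to_string(sizeof(int))   + " bytes\n";
    report += "long int:  " + std::to_string(sizeof(long))  + " bytes\n";
    return report;
}
```

The values reported depend entirely on the compiler and target, which is the whole point of running the check on each board.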
If you wish, you can download a text version of this program to peruse and ponder. I ran this program on both an R3 and an R4. The results are as shown below.
This is, of course, what we expected to see, but it never hurts to check the basic things before plunging into the fray with gusto and abandon (and aplomb, of course).
My ultimate goal is to be able to present a table comparing all the regular integer operations (+, -, *, /, %, >, etc.) and logical operations (&, |, ^, etc.) on short ints, regular ints, and long ints for both the R3 and R4. I also want to compare floating-point operations (+, -, *, /) on both platforms.
To further this goal, I next created a simple program to evaluate the basic arithmetic operations (+, -, *, /, %) on regular ints. The core of this program is shown below.
In this case, NUM_TESTS is set to 8 and NUM_ITTERATIONS is set to 10,000,000. What are the first three tests supposed to be doing? Well, I vaguely wondered if an if() test for zero consumed fewer clock cycles than an if() test for non-zero values, so I decided to check this out. Also, I want to account for the clock cycles used to perform the if() tests so as to be able to isolate them from the clock cycles used to perform the arithmetic operations.
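
For readers who want a feel for the general shape of such a benchmark, here is a rough sketch of the timing pattern described above. It is an assumption-laden illustration, not the author’s program: the names `timeLoopMicros` and `timeIntAddition` are hypothetical, and `std::chrono` stands in for whatever timer the Arduino code uses so that the sketch also builds on a desktop compiler.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs op() the requested number of times and returns
// the elapsed wall-clock time in microseconds.
template <typename Op>
std::int64_t timeLoopMicros(long iterations, Op op) {
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}

// Hypothetical example test: repeated addition on regular ints.
std::int64_t timeIntAddition(long iterations) {
    int a = 42;
    int b = 17;
    int c = 0;
    const std::int64_t elapsed = timeLoopMicros(iterations, [&]() { c = a + b; });
    (void)c;  // the result itself is not needed, only the elapsed time
    return elapsed;
}
```

Calling `timeIntAddition(10000000L)` would mirror the iteration count used in the text. No expected timings are given here because they depend on the board, the compiler, and its settings.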
Please feel free to download a text version of this program in the hope you can tell me why it fails to work. “What? It fails to work!” I hear you cry. Yes, I think it’s safe to say that the results, which are shown below, are certainly not what I expected to see.
“Oh dear,” I said to myself (or words to that effect). The first thing we observe is that the “Elapsed Time” values for both processors are meaningless (well, obviously they have meaning, but we don’t know what it is). Take just the first if() test on the R3, for example. Assuming this takes 1 clock cycle (and excluding any clock cycles associated with the for() loop), then since we’re performing 10,000,000 iterations on an R3 whose 16-MHz clock delivers 16 cycles per microsecond (µs), we should have an elapsed time of 10,000,000 / 16 = 625,000 µs. Instead, we see an elapsed time of only 4 µs.
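
The back-of-the-envelope estimate above can be written as a tiny helper. This is only a sketch of the arithmetic; the function name `expectedMicros` is hypothetical, and the one-cycle-per-iteration figure is the same simplifying assumption made in the text.

```cpp
#include <cstdint>

// Estimated elapsed time in whole microseconds for a loop whose body costs
// cyclesPerIteration clock cycles. A clock of N MHz delivers N cycles per
// microsecond, so total cycles divided by N gives microseconds.
constexpr std::int64_t expectedMicros(std::int64_t iterations,
                                      std::int64_t cyclesPerIteration,
                                      std::int64_t clockMHz) {
    return (iterations * cyclesPerIteration) / clockMHz;
}

// R3: 10,000,000 one-cycle iterations at 16 MHz.
static_assert(expectedMicros(10000000, 1, 16) == 625000, "R3 estimate");
// R4: the same assumption at 48 MHz (result truncated to whole microseconds).
static_assert(expectedMicros(10000000, 1, 48) == 208333, "R4 estimate");
```

Under the same assumption, the R4 estimate comes out at roughly a third of the R3 figure, which is nowhere near the handful of microseconds reported.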
Whatever is happening here, why is the elapsed time for Test 3 on the R3 0 µs while all the others are 4 µs? And why are the elapsed times for the R4 ~2X those for the R3? And… color me confused.
So, the first high-level question is WTW? (“What the what?”), which became my new favorite expression when my wife (Gina the Gorgeous) and I binge-watched the Shiny Happy People: Duggar Family Secrets documentary on Amazon Prime Video.
The second high-level question is, “Just what is going on here?” I racked my brains staring at my code trying to spot any obvious “gotchas” to no avail. Next, I instigated a video call with my friend Joe Farr in the UK to ask for his thoughts. It wasn’t long before we unearthed the fundamental issue underlying these pear-shaped results. However, addressing this issue has proved to be another matter entirely. We find ourselves in a battle of wits with an unknown and unseen adversary, and I’ve grown to regret referring to myself as “1/2 Man, 1/2 Beast, and 1/2 Wit.”
All will be revealed in my next column on this topic (by which time I hope to have a solution). In the meantime, I welcome your captivating comments, insightful questions, and sagacious suggestions.