FLIRTing with 8-bit MCU OSes

by Dave Armour , TechOnline India - August 11, 2009

FLIRT is for developers who want a small, easy to use, and simple to port operating system for a truly tiny multithreaded system on an eight bit MCU, without sacrificing the ability to multithread and share CPU time effectively.

I come from the 8080/8085 era. I used to hand write the code on paper then type the opcodes into an EPROM programmer. If that didn't work, I figured it out and did it again. I developed some pretty complex code that way. Since then, I've never lost sight of the fact that small is beautiful.

After reading Jack Ganssle's article "Small is Beautiful" (March 9, 2009,, I felt like we were brothers from a different mother. In this world, the constant drumbeat of "more, more, more" is pounded into our heads so much that the concept of "simple is best" is lost. Everyone gets "creeping featuritis," and soon the whole place is infected with features no one wants or even needs. I'm here to tell you that it doesn't have to be that way.

The first time our hardware engineer brought a proposal to me and said it has an 8-bit processor, I said "are you nuts?" Yet, when all was said and done, I have to say I was impressed. The processor our hardware guy proposed was the Freescale MC9S08. Originally he wanted a simple super loop that he could write himself and not get the firmware guys involved. He was concerned about us catching creeping featuritis and turning the super loop into a huge 32-bit machine with way too many bells and whistles. I convinced him that that would not be the case and that I would write the firmware for it myself. My first concern was that the super loop starved other parts of the system, so I started researching what was out there that I could use.

I read Miro Samek and Robert Ward's great article in the July 2006 Embedded Systems Design magazine ( on the simple task switcher because I have always preached that "good developers write good code, great developers steal." I didn't use this task switcher because I was also concerned with starvation. Their system had higher priority tasks run to completion and was based on state machines that are amazingly useful but limited when dealing with time constraints. Events get backed up waiting for other events to run to completion.

I then looked at Helium ( A great little tasker but too much overhead for what I was looking for and it was based on priorities as well. I needed something where every task has the same priority, gets its share of the CPU, and yet can have a different "weight" in the system. When I say weight, I mean a little bit more of the task's share of the CPU is going to that task. This is not to be confused with priority over the other tasks; instead, it's just that one task needs another tick or two, while all the other tasks are still geting their share. More on this later.

So after not finding anything I could (ahem) borrow, I set out to create a small multithreaded tasker that can run on an 8-bit MCU and lets every task have access to the CPU more or less when they needed it. Enter FLIRT (Friendly Little Interrupt Tasker). In the end my implementation consumed a whopping 144 bytes of flash.


I based FLIRT on the basics we were all taught in school: every task has a task block that includes its stack; every time tick causes the next task to run; tasks get created when needed and removed when not needed anymore. After that, anything you add is another ball and chain on the CPU. Don't get me wrong, this system will not answer everyone's needs. It only works on systems that can handle interrupts and don't have a large amount of tasks running. If you need more power, you should rethink your choice of CPU and operating system. FLIRT is for those of us trying to stay as small as possible yet as powerful as possible.

Working on a system small enough to use an 8-bit MCU, you need to understand a fair amount of the architecture and the assembly language. This allows you more control of the system. If you ever don't know how you would implement something in assembler, just write the closest possible code in C, compile it, and disassemble it to get an idea of how to approach your assembly implementation. (The compiler guys are awesome.) For this example, I will substitute assembly code with pseudo code labeled in italic-bold-underline surrounded with brackets like this: [pseudo code].

I'll begin with the simplest form of the system and add on the "weighted" aspects I mentioned earlier as the only extra I see useful. As in any system like this, the task control block is the center of the universe. Each task running is a node in a doubly linked list. I use a doubly linked list because it makes the task removal much easier. Listing 1 shows the layout.

View the full-size image

Next, a few globals to keep around:

TCB *TaskToBeDeleted;
TCB *Head;
TCB *Tail;
TCB *CurrentTask;

One of the first things that you might notice is the finite amount of stack space for each task--a necessary evil in a simple system like this. If you want to make a dynamic stack, you need to add the overhead to do so. I found this finite space worked just fine for my needs on the MC9S08. The second thing to notice is that the task and stack pointers, as well as the stack, are the only items in the task block (Figure 1). For a simple system like this, what else do you need? So, each task block uses four pointers + STACK_DEPTH bytes.

Once you get past your system initialization, you enter the task initialization portion the job of which is to simply get things going. You start off by adding the one and only irremovable task in the system. I call this task the "base" task. It has just as much time on the CPU as any other task, only it never removes itself. This task is where you put your "super loop" of things that are handled in a polled fashion if needed. If you want, you could even use this task as the main scheduler that adds tasks to the list in a system that can't deal with interrupts. The first task is always the "base" task. Next you add whatever tasks you need to run initially and simulate a task switch by loading up the stack pointer with CurrentTask->StackPointer. Finally, enable interrupts and execute a return from interrupt (RTI) to start executing the first task. Quick, fast, no messing around. Why an RTI at the end of your main system function when you're not in an interrupt service routine (ISR)? The reason is that you're acting like you are in an ISR to get the ball rolling. You'll never return here since the "base" task will never let it happen.

View the full-size image {pagebreak}

Listing 2 is the code for the task initialization.

View the full-size image

The TaskCreate function is shown in Listing 3.

View the full-size image

I will talk later about the #define SIZEOF_SWITCH_ STACK_DROP. And then the TaskDestroy is shown in Listing 4.

View the full-size image

So a task gets deleted by simply marking itself as "to be deleted." Something very important I want to point out is that there are no mutexes used to make sure only one task is accessing the task list at a time. This is because I only access the list when interrupts are off--a subtle form of mutex. You must make sure you only access the list when interrupts are off. If your MCU gives you the capability to verify that the interrupts are off, add that to the beginning of TaskCreate and TaskDestroy.

The final part of the system was the most difficult to write and is definitely processor dependent. I'll show you pseudo code you can implement based on your architecture. The ISR handles the timer tick, which is the part that switches tasks. Basically it:

• Disables interrupts.

• Checks to see if TaskToBeDeleted is NULL:

• If it is, push all your registers on the stack, save the stack pointer, and then move to the next TCB in the list.

• If TaskToBeDeleted is not NULL, move to the next TCB in the list and remove the TaskToBeDeleted. Be sure and set it to NULL afterwards (we've all seen that bug before).

• Once we're pointing to the next TCB:

• Load the stack pointer from the new TCB.

• Load any other registers the ISR call did not push in the reverse order they were saved. We are now in the same state we were in when this task last ran.

• Enable interrupts.

• Return from interrupt.

View the full-size image

Listing 5 is the Timer tick ISR with [pseudo code] inserted where you would use assembler.

The last point to explain is what the #define SIZEOF_SWITCH_STACK_DROP is. Whenever a system responds to an interrupt, a set of registers are automatically pushed on the stack. That size (in bytes) is the value SIZEOF_SWITCH_ STACK_DROP is set to. In the MC9S08 that I was using, it is six. Offsetting that amount from the top of the stack during the task creation gives the task the registers it picks up during the first task switch and initializes all the registers to zero. In the MC9S08, register H was not saved on the stack for the Timer_Tick_ISR so I had to push it on the stack at the beginning of the ISR and pop it at the end of the ISR.

In the ISRs you would use to take care of business in your system, you would look for the task you needed to run in the list. If it wasn't there, add it with the TaskCreate function. If it was, do nothing. Since this system is not designed for tens or hundreds of tasks, the time taken to search a list of 10 or so tasks is small.

I ran FLIRT for weeks with a very intense test that created, removed, added, removed, and added as much as 10 tasks: I had no memory leaks or stack overruns.


Once the system was running, I got a lot of inquiries from my peers about adding prioritization but to me the concept of prioritization for this application was wasted and defeated the purpose of the system. (Why is beyond the scope of this article.) When you add prioritization in a multithreaded system, you need to add the concept of blocking, which requires adding another queue of blocked items, among others. That would start adding complexity I didn't want. I wanted simple! I wondered how I could add the concept of one task being more important than another task without starvation or complexity: I came up with the concept of "weighted" importance. If a task has more weight, it requires additional ticks during its share of the CPU time. This can be abused if you are not careful to the point where starvation occurs, so think carefully before adding too much weight to a task. This enhancement changed the task control block structure to the code in Listing 6.

View the full-size image

The two new items are Weight and Count, where Weight is the amount of clock ticks this task will run consecutively before letting the next task run and Count is the number of clock ticks that have run so far. The base task always has a weight of 1.

View the full-size image

Listing 7 is the task initialization with weighting where the second parameter is now the Weight of the task.

The TaskCreate function would set Count to zero and then assign the second parameter to the Weight member of the TCB. Of course, good design states that you have a maximum Weight allowed in the system and then do some error checking.

The timer tick ISR would simply bump the Count member of the task's TCB by one, see if it equals the Weight member. If not, just return. If it does equal the Weight then switch tasks. Listing 8 is the modified portion of the timer tick ISR.

View the full-size image {pagebreak}

I created a flowchart for people like me who like pictures (see Figure 2). It has two modes (if you will), one is when only the base task is running and the other when additional tasks are running. It doesn't reflect the concept of Weight.

View the full-size image

Flirting with simplicity
FLIRT is not for everyone. It's for those developers who want a small, easy to use, and simple to port operating system for a truly tiny multithreaded system. For very small systems, it gets the job done nicely. You shouldn't have to sacrifice the ability to multithread and share CPU time effectively just because you want to (or can) use an 8-bit MCU. FLIRT allows efficient use of the CPU and its resources. The system as I have stated it can easily be a platform where you could add features that enhance its ability, if you feel the need to, and I could have written about how to add many different features, but that isn't the purpose here. The purpose is to keep it simple, clean, and powerful. Enjoy.

Dave Armour is a recently laid off firmware engineer with 15+ years in the embedded world and now lots of time on his hands. He has helped to start two startups and has worked at companies such as Microsoft, Hughes Aircraft, and Avocent. He can be reached at

I would like to thank my friend and very smart guy Jeff Fore for his ear and advice while I was designing FLIRT as well as writing this article. -- Dave Armour

About Author


blog comments powered by Disqus