Code that is part of the kernel, aka the OS. (This means it’s not “part” of any particular Process.) It’s responsible for scheduling Threads: choosing which one to run and for how long.

Things to consider for scheduling algorithms

How does the scheduler decide which threads to run? Some attributes we could focus on as we design the scheduling algorithm:

  • Fairness: every program gets to run
  • Liveness: that “something” will eventually happen
  • Throughput: amount of work that can be completed over an interval of time
  • Wait time: average time a task is alive, but not running
  • Turnaround time: time between a task being ready and completing
  • Response time: time between a task being ready and when it can accept user input
  • Priority: we might designate some threads as being more important to run
  • Users: if a computer has multiple users, how do we divide time between them?

Some common goals are to minimize wait time, minimize latency, maximize throughput, and maximize fairness.

Other considerations include the resources it costs to context switch, and the inherent resources required to run the scheduler itself, like the data structures that implement it.

Priority inversion problem

Assigning priority to tasks comes with the assumption that only higher priority tasks can interrupt lower priority ones, never the reverse; a higher priority task should never have to wait on a lower priority one. However, this can still sometimes occur.

Example

We’re given two tasks H and L, where the priority order is H > L. At some point the lower priority task L runs. Let’s say it obtains a lock on a resource R.

Now H wants to run and also attempts to get a lock on R. However, L has already obtained the lock, causing H to block. This means a higher priority task is waiting on a lower priority task to finish running.

Furthermore, any threads M with a priority between L and H will also delay H. They’ll be run before L, but H is waiting on L to unlock R, so H still can’t run.

Here, the priority of tasks has been inverted.
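
To make the sequence concrete, here’s a compressed sketch of the locking structure using POSIX threads. A real demonstration would also need fixed thread priorities (e.g., SCHED_FIFO), which this sketch omits, so the comments narrate the inversion instead:

```c
#include <pthread.h>

static pthread_mutex_t R = PTHREAD_MUTEX_INITIALIZER; /* the resource R */

static void *low_task(void *arg) {    /* L: low priority */
    pthread_mutex_lock(&R);           /* 1. L runs first and grabs R */
    /* ... long critical section; medium-priority threads M keep
     *     preempting L here, even though they never touch R ... */
    pthread_mutex_unlock(&R);
    return NULL;
}

static void *high_task(void *arg) {   /* H: high priority */
    pthread_mutex_lock(&R);           /* 2. H blocks: L still holds R */
    /* 3. H proceeds only after L finally releases R */
    pthread_mutex_unlock(&R);
    return NULL;
}

int main(void) {
    pthread_t lo, hi;
    pthread_create(&lo, NULL, low_task, NULL);
    pthread_create(&hi, NULL, high_task, NULL);
    pthread_join(lo, NULL);
    pthread_join(hi, NULL);
    return 0;
}
```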

Aside: what about multiple cores?

Even on a modern machine with multiple CPU cores, each core generally has its own run queue and schedules its own tasks. This means that our single-core assumption isn’t too far off and is still pretty relevant.

Furthermore, we’d like to keep threads of the same process running on the same core, since switching between threads of one process avoids swapping out the entire address space. So thinking in terms of a single CPU core remains useful.

There are methods to keep one process “locked” to one core. This is commonly called thread affinity or CPU pinning.
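
As a hedged illustration, on Linux a process can pin itself to a core with sched_setaffinity(2); pinning to core 0 here is an arbitrary choice:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);   /* pin to core 0: an arbitrary choice */

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof(set), &set) == -1) {
        perror("sched_setaffinity");
        return 1;
    }
    /* from here on, the kernel only runs this process on core 0 */
    return 0;
}
```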

Non-preemptive

If a thread is running, it continues to run until it completes or until it voluntarily gives up the CPU.

First Come First Serve (FCFS)

When a thread becomes ready, schedule it to run until it’s finished. We maintain a queue of ready threads and take the thread at the front when the current thread finishes. Newly ready (or newly unblocked) threads go to the back.
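
A minimal sketch of the FCFS ready queue, with made-up struct and field names for illustration:

```c
#include <stddef.h>

typedef struct thread {
    int            tid;
    struct thread *next;
} thread_t;

static thread_t *head = NULL, *tail = NULL;

/* a thread that just became ready (or unblocked) goes to the back */
void enqueue(thread_t *t) {
    t->next = NULL;
    if (tail) tail->next = t;
    else      head = t;
    tail = t;
}

/* when the current thread finishes, take the thread at the front */
thread_t *dequeue(void) {
    thread_t *t = head;
    if (t) {
        head = t->next;
        if (!head) tail = NULL;
    }
    return t;
}
```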

Shortest Job First (SJF)

Variation on FCFS, but maintain a priority queue instead, ordered by each task’s CPU-time requirement. The task with the smallest requirement (at the front of the queue) runs first.
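
A sketch of the SJF selection step, assuming each task carries an estimate of its CPU-time requirement (est_cpu_time is a made-up field); a linear scan stands in for a real priority queue:

```c
/* est_cpu_time is a made-up field holding the task's estimated CPU need */
typedef struct job {
    int         tid;
    int         est_cpu_time;
    struct job *next;
} job_t;

/* a linear scan over the ready list stands in for a real priority queue */
job_t *pick_shortest(job_t *ready) {
    job_t *best = ready;
    for (job_t *j = ready; j != NULL; j = j->next)
        if (j->est_cpu_time < best->est_cpu_time)
            best = j;
    return best;   /* smallest CPU-time requirement runs first */
}
```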

Preemptive

The thread gets interrupted after a given time, or if another thread becomes ready, etc. Most modern, relevant scheduling algorithms are preemptive.

Round Robin (RR)

Sort of a preemptive version of FCFS. Use a thread queue again, but only let each thread run for a fixed amount of time, called the time quantum. If it finishes before its time is up, we pick another thread (from the front of the queue) to run. If time is up, we send the running thread to the back of the queue.
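
A sketch of the dispatch loop, reusing the queue from the FCFS sketch; run_for_quantum is a hypothetical helper that runs a thread for at most one quantum and reports whether it finished:

```c
/* reuses thread_t, enqueue(), dequeue() from the FCFS sketch above;
 * run_for_quantum() is a hypothetical helper that runs a thread for at
 * most `ms` milliseconds and returns nonzero once the thread finishes */
int run_for_quantum(thread_t *t, int ms);

#define QUANTUM_MS 100   /* arbitrary value for illustration */

void round_robin(void) {
    thread_t *t;
    while ((t = dequeue()) != NULL) {       /* take the front thread */
        if (!run_for_quantum(t, QUANTUM_MS))
            enqueue(t);  /* quantum expired: send it to the back */
        /* a finished (or blocked) thread doesn't re-enter the queue */
    }
}
```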

Advantages

  • Still relatively simple
  • Can work for interactive systems (guaranteed that job will come back eventually)

Disadvantages

  • If quantum is too small, we spend too much time context switching
  • If quantum is too large, we end up reimplementing FCFS.

As a rule of thumb, we should pick a quantum such that 80–90% of jobs finish within a single quantum of CPU time. Finding this value requires empirical testing.

Variant: PennOS Scheduler

We introduce the concept of priority to tasks, and expand the number of queues to three instead of one. Each queue has a different priority: 0, 1, or 2.

  • Priority 0 (P0) is scheduled 1.5 times more often than priority 1 (P1)
  • Priority 1 (P1) is scheduled 1.5 times more often than priority 2 (P2)

Within each queue, we perform round robin as before. After we finish running a task, we make a choice whether to run the next task from the P0, P1, or P2 queue. This choice has to be deterministic, but it’s up to us how to make it; one possibility is sketched below.
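
As one concrete (hypothetical) choice: the 1.5× ratios work out to P0 : P1 : P2 = 9 : 6 : 4, so a fixed repeating pattern of 19 quanta satisfies them deterministically. A sketch, where the interleaving within the cycle is an arbitrary design decision:

```c
/* 9:6:4 over a 19-quantum cycle honors both 1.5x ratios; the exact
 * interleaving below is an arbitrary (but deterministic) choice */
static const int pick_order[19] = {
    0, 1, 0, 2, 0, 1, 0, 1, 0, 2,
    0, 1, 0, 2, 0, 1, 0, 1, 2
};

/* which priority queue to pull the next task from */
int next_queue(int tick) {
    return pick_order[tick % 19];
}
```

If the chosen queue happens to be empty, we’d fall through to the next non-empty one.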

We also need to maintain separate data structures to track whether a task is sleeping or blocked, so we don’t run it, etc.

Variant: Priority Round Robin

Introduce multiple thread queues for different priority levels.

  • The scheduler chooses the first item in the highest priority queue to run, and only schedules tasks from lower priority queues if all higher priority queues are empty (see the sketch below).
  • Can cause starvation of lower priority queues.
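
A sketch of this selection rule, assuming an array of ready queues indexed by priority (0 = highest):

```c
#define NUM_LEVELS 3          /* e.g., P0, P1, P2, with P0 highest */

typedef struct thread thread_t;          /* as in the sketches above */
extern thread_t *queues[NUM_LEVELS];     /* head of each ready queue */

thread_t *pick_next(void) {
    /* scan from the highest priority downward; a lower queue only
     * gets CPU time when every higher queue is empty (hence the
     * starvation risk) */
    for (int p = 0; p < NUM_LEVELS; p++)
        if (queues[p] != NULL)
            return queues[p];
    return NULL;              /* nothing is ready */
}
```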

Variant: Multi-level Feedback

Maintain priority queues numbered from priority n (highest) down to 1 (lowest), with the time quantum increasing as the priority number decreases. For example:

  • Queue with priority n gets time quantum 10
  • Queue with priority n − 1 gets time quantum 12
  • …etc.
  • Queue with priority 1 gets time quantum 100

When a thread initially starts, we put it in the highest priority queue (priority n in this case), and start running it. From here, two scenarios can occur:

  1. The thread purposefully gives up the CPU before its quantum is up. This might be because it’s wait()-ing on something, or blocked reading user input. In this case, it’s simply moved to the back of its current queue.
  2. The thread does not finish running before its quantum runs out. In this case, we demote it one level to a lower priority queue.

Furthermore, lower priority queues are not scheduled until all higher priority queues are empty, just like Priority RR.

The intended outcome is that threads with frequent I/O bursts are preferred: they run briefly, then block on user input or events, so they stay in the higher priority queues. Meanwhile, threads that take longer to run slowly sink down to their preferred priority queue, whose longer time quantum accommodates them.
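
A sketch of the feedback rule. For array convenience the levels here are indexed 0 (highest priority, smallest quantum) through 3, and the quanta are illustrative values echoing the example above:

```c
#define LEVELS 4
/* illustrative quanta: smallest for level 0 (highest priority); the
 * intermediate 30 is made up to fill out the progression */
static const int quantum_ms[LEVELS] = {10, 12, 30, 100};

/* feedback rule, applied whenever the thread at `lvl` stops running */
int next_level(int lvl, int used_full_quantum) {
    if (used_full_quantum && lvl < LEVELS - 1)
        return lvl + 1;       /* ran out its whole quantum: demote */
    return lvl;               /* yielded or blocked early: stay put */
}
```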

Remember that minimizing context switching is never a bad idea.


Some ways we can modify this algorithm are to assign tasks different priority levels upon creation, promote a thread back up a priority queue if it consistently runs under the time quantum, etc.
