1759 'Historical Perspective' SEP-20

This is a 'Paper Reading' post for Course ECE1759. The topic is 'Historical Perspective'. This paper list is here:

  • E. W. Dijkstra, The Structure of the 'THE'-Multiprogramming System, Communications of the ACM, Vol. 11, No. 5, May 1968, pp. 341-346.
  • P. B. Hansen, The Nucleus of a Multiprogramming System, Communications of the ACM, Vol. 13, No. 4, April 1970, pp. 238-241, 250.
  • D. G. Bobrow, J. D. Burchfiel, D. L. Murphy, and R. S. Tomlinson, TENEX, a Paged Time Sharing System for the PDP-10, Communications of the ACM, Vol. 15, No. 3, March 1972, pp. 135-143.

Background

  • There is no 'Thread' concept.
  • There is no C language.
  • There is no Unix (Announced outside Bell Labs in October 1973).
  • There is no Cache mechanism.
  • There is no Multi-core processor.
  • File systems already exist.

The Structure of the 'THE'-Multiprogramming System

What is Multiprogramming

Multiprogramming systems separate a job into some processes by some waiting periods like IO. On one CPU, OS will switch to another job when the executing job needs to do waiting like IO. The new job will then compute on the CPU. By switching, the CPU is always computing with the least waiting.

  • The difference between Multiprogramming with Multi-tasking is that there is no time slices in Multiprogramming. (Just like coroutine, but for a different reason.) Multiprogramming is not time-sharing.
  • The difference between Multiprogramming and Time-Sharing is that Time-Sharing has multiple users, and the systems must be interactive. Users can use their consoles to interact with or send commands to the system. The users share the CPU and memory.

Multiprogramming in this paper

Unlike the multiprogramming concept we mentioned before, this paper's multiprogramming looks like be with time slices, as it also has a clock to prevent processor monopolization.

In this system, each period of a process is called a sequential process.

Core Idea

The paper reports their system's current progress and introduces some innovative points in their system, including the Storage Allocation with page&segment, Processor Allocation, Hierarchical Structure, and the use of Semaphore.

Goal

  • a reduction of turn-around time for programs of short duration,
  • economic use of peripheral devices,
  • automatic control Communications of the ACM 341 of backing store to be combined with economic use of the central processor, and
  • the economic feasibility to use the machine for those applications for which only the flexibility of a general purpose computer is needed, but (as a rule) not the capacity nor the processing power.

About Page Segmentation/Storage allocation

  • Drum memory works as a secondary memory or persistent memory.
  • Segments corresponds one or a set of memory page.
    • The number of possible segment ids is much larger than physical memory.
    • The loaded page from drum memory is not necessary to dump back to the original address since it may be allocate to other application. Just getting a new unoccupied address is more efficient than reallocating the original resource.
  • This technology make the programs in the future can run on different machines.

About Hierarchical Structure

Level 0

Processor Scheduling. Introduce a real-time clock for preventing monopolization. Priority is introduced to responding quickly is needed.

Level 1

Segment controller (Pager). Manage virtual address (segment) and physical address (page).

In this level, a processor is owned only by one job from the job perspective.

Level 2

Message Interpreter(Drivers of the console, TTY). Interpret the messages from the terminals and peripherals & communicate with the operator. Since we have a message interpreter, we can use it to simulate many consoles. Each process seems to have one private console. Level 2 is above Level 1 because it needs to use virtual memory.

Level 3

IO peripherals abstraction (Shell). Here IO is abstracted as sequential processes and relies on message communications.

Level 4

User's Programs.

Level 5

Operators.

Advantages

  • Testing and Verification
    • For verification, when we have precondition and code, the postcondition has been decided. Iterative.

Disadvantages

  • The higher layer will always rely on the lower layer. But the lower layer can not have any dependence on the higher layer. So it is not that flexible.

About Design Experience

  • Test one layer immediately when one layer is finished. Make sure to build on top of a correct layer.
  • Without separating the levels, "relevant states" will inflate to a large number.

Two Types of Philosophy

  • The MIT approach: Sacrifice simplicity for correctness
  • New Jersey style: Everything should be simple.

About Semaphore

  • It is actually not Dijkstra that the first one to introduce semaphore. But he is the first one to document this tech.
  • Just like the PV operation we know.
    • P is short for Passeren, which means pass in Dutch.
    • V is short for Vrijgeven, which means release in Dutch.
  • The paper did not mention how to make the operations atomic.
  • Semaphore is used to implement mutual synchronization.
  • Private semaphore
    • Semaphores that other processes can not do P operation but are visible to others.

Advantages

  • Introduce a new programming model: structured programming. Trigger future verification research.
  • The idea of abstraction.
  • A lot of philosophy. Quite insightful.

Disadvantages

  • No specific method to keep semaphore atomic.
  • It's hard to find relative articles with no reference (Maybe because it is too old).
  • No reviews of other systems.
  • No pictures.

Confusion

  • How the peripherals are economically used. It did not point out how it was working before the paper.Console, IO, CPU, memory.
    • Achieved by multiprogramming.
  • How the turn-around time for programs is reduced.
    • Achieved by Priority scheduling

Reference

The Nucleus of a Multiprogramming System

Overview

The resulting system was not a complete operating system but a small kernel providing the mechanisms upon which operating systems for different purposes could be built.

The system applied a new message/answer method to do synchronizing.

RC4000

The RC 4000 Multiprogramming System (also termed Monitor) is historically notable for being the first attempt to break down an operating system into a group of interacting programs communicating via a message passing kernel. RC 4000 was not widely used, but was highly influential, sparking the microkernel concept that dominated operating system research through the 1970s and 1980s.

Core Ideas

About External Process

It is a system that considers peripherals as processes instead of files. So read/write will be considered as sending/receive messages to another process, which simplifies system construction.

About Process Hierarchy

  • It uses an inverted tree to manage processes.
  • The parent process has full control of the child processes.

About Communications

send message (receiver, message, buffer),
wait message (sender, message, buffer),
send answer (result, answer, buffer),
wait answer (result, answer, buffer).

Advantages

  • It fits the modern software requirement——extensibility, with which designers can easily extend or experiment with new techniques.
  • Give a more precise 'process' concept.
  • Give a clear explanation of the message/answer mechanism, which is more efficient to deliver information.

Shared memory V.S. message passing

Shared Memory Message Passing
Hard to verify Easy to verify
High speed Low speed
Now on single machine Now as RPC

Disadvantages

  • For now, we use shared memory instead of message passing to communicate in a system on a single machine. So the fact is message passing lost in the competition of process communication finally.
  • It seems to use the old memory allocation way without pages. So the resources will be limited for each process.
  • It seems that considering the occupation of the pool of the buffers. Should there not be too many processes?
  • Relatively slow when using the message/answer communication.

Confusion

  • What is the black sheep for semaphore? No examples.
    • To be solved.
  • Does it apply a segment-page strategy? Will the process have uneven storage allocation?
    • I guess not.

Reference

TENEX, a Paged Time Sharing System for the PDP-10

Background

  • DEC's PDP-10 was manufactured beginning in 1966.
  • Now we have Unix shell, the V6 shell (though not public outside the Bell Lab).

Goals

  • State of the art virtual machine.
    • Paged virtual address space
    • Multiple process capability
    • File systems: integrated into virtual address space, multilevel directory structure, consistent access
  • Good human engineering throughout system.
    • An executive command language interpreter(Shell)
    • Terminal interface
    • Virtual machine functions
    • Easy to use and safe from unexpected interaction
  • The system must be implementable, maintainable, and modifiable.
    • Modular
    • Debuggable and reliable,
    • Run efficiently and allow dynamic adjustment or reconfiguration.
    • Contain instrumentation to clearly indicate performance

Core Ideas

Hardware

Pager (Just like MMU)

  • Convert virtual memory to physical memory.

New Monitor Call JSYS(System Call).

  • JSYS provides an independent transfer mechanism into the monitor which does not conflict with previous system.
  • The old software (binary) will use the old system calls while the new software on the new OS will call system calls through JSYS.

Virtual Machine

Not like what we means nowadays. More like a process run on the top of virtual memory.

VM structure

  • Virtual memory is viewed 256K words (512 slot * 512 page size) (But the physical memory has about 1024K(1,048,576) words......)
  • A private page is automatically created whenever a process makes a reference to a page for which the map slot is empty. (Unlike now, we use new and delete in C++)

Job structure

  • Tree-like structure, parent processes can manage its child processes.
  • Use command EXEC to run user programs and handle faults, service interrupts (which is considered to be easy to use and interactive for users.)

Pseudo-Interrupts

  • process to receive asynchronous signals from other processes or terminals
  • detect sets of unusual conditions, including illegal memory reference, process overflow, EOF, and data errors.

Tenex File System

  • Relative primitive, has tree of max depth 5
  • With access protection.
  • With file version control, but no details about the implementation.
  • 15 bit (5*3) file permissions.
    • 5 file operations: list, read, write, append, execute.
    • 3 types of locations relative to user or group.
  • Thawed file (Single Write) and unthawed file (Multi Write)

Monitor (Kernel)

Scheduler (for processes)

  • Requirements
    • Provide an equitable distribution of CPU service.
    • Identify and give prompt service to jobs making interactive requests.
    • Make efficient use of core memory to maximize CPU usage.
  • Balance Set Scheduling.
    • Balance set is the set of jobs with highest priority whose total working sets fit in core mem.
  • Setting Process Priorities.
    • Old ways caused uneven distribution compute-bound and interactive jobs. Interactive ones have higher priority.
    • TENEX use long-term average ratio of CPU use to real time as the hints for priority. So more balanced.
    • Priority decreased at constant rate, increased while blocked.
    • short burst of max priority and queue position determined by long term average
      • hopeful to get quick service to very short interactions.
  • Resource Guarantees and Limitations.
    • A person with appropriate administrative access can assign to any job on the system a fraction, F, of guaranteed CPU service. For any job so designated, the scheduler will attempt to ensure that C/T> F, where C is the CPU seconds used by the process, and T is the real time since the process last unblocked.

Core Management

  • Every process has tis own work set of pages.
  • The size of work set of processes are dynamic decided by page fault time.
  • The swaping in a work set for each process use the LRU algorithm.

Advantages

  • Practical and full of technical details.
  • Very similar to modern operating systems.
  • Schedule with priority and try to avoid starvation.

Disadvantages

  • Pseudo Interrupt System provides no details of implementation.
  • Too much introduction of file system and its naming rules.
  • No details of message methods.
  • The file system only supports a five-depth tree, which seems to be a strong constraint.
  • The virtual memory is smaller than physical memory.

Confusion

  • Don't know the formula's meaning.
    • Not solved
  • Did the different version of file stored?
    • Not solved

Reference

Welcome to my other publishing channels