SEG-2106 "Software Construction"  
 
 
  Chapter 3-1: Concurrency 
 
Additional readings: 
  - [Sebesta] chapter 12 (see printed course notes): please read Sebesta 12.1 (Introduction), 12.2 (Introduction to subprogram-level concurrency), 12.3 (Semaphores), 12.4 (Monitors), and 12.7 (Java threads).
Hardware and software concurrency
Physical concurrency
A single CPU of a computer executes one instruction after the
  other. There is no concurrency. If we have several CPUs running
  in parallel (either within the same computer or in separate
  computers) we have some form of concurrency (also called
  parallelism). Concurrency realized at the hardware level is
  called physical concurrency. Physical concurrency is related to
  parallel computer architectures: 
  - A normal computer contains at least one CPU and several peripheral processing units that look after input/output with the external world, such as video display, sound input and output, disk access, etc. The computer clock, too, is a separate hardware unit that runs concurrently with the CPU and is used to control the scheduler of the operating system, which allocates time slots to the different programs running on the computer (see below).
- Various schemes have been
    designed to build hardware architectures with multiple CPUs. The purpose is to 
    efficiently execute various computing tasks in parallel. They
    can be classified as follows:
    
      - SIMD (single instruction - multiple data): These are
        computers where multiple CPUs work on different data, but
        all execute the same sequence of instructions. Vector
        processors belong to this class.
- MIMD (multiple instructions - multiple data): Here the
        different CPUs execute different sequences of
        instructions (and work on different sets of data). They
      can be further classified as    
        
          - shared memory architectures: Here all CPUs
            access the same memory (it is easy to share
            data between the different processes executed
            by the CPUs)
- distributed architectures: Here each CPU has
            its own (local) memory (in a sense, each CPU
            is a more or less complete computer); sharing
            of data between the processes executed by the
            different CPUs is less efficient (usually
            realized over a network connecting the
            different CPUs).
 
 
- Concurrency at the level of instruction execution: This is realized when a single
    CPU, which consists of several components operating in parallel,
    is organized in such a way that several
    instructions are processed in parallel,
    usually in some form of pipeline processing where each
    instruction goes through the different components, each
  doing a specific aspect of the instruction processing, such as (1) fetching the next instruction, (2) fetching the operands, (3) performing an arithmetic operation.
Concurrency in software
The book by Sebesta mentions the following  levels of
  concurrency:
  - Instruction concurrency: this is actually concurrency at the hardware level, called "Concurrency at the level of instruction execution" above.
- Statement concurrency: This means that several statements
    of a program are executed concurrently on some physical
    parallel computer architecture. Usually, one considers a
    conventional (sequential - non-concurrent) programming
    language, such as Fortran or C. Certain optimizing
    compilers can automatically determine which statements
    are independent of one another and can be executed
    concurrently without changing the result of the program
    execution. They can also take the existing parallel
    hardware architecture into account (on which the program
    is supposed to be executed) in order to generate code for
    this hardware architecture which provides for concurrent
    statement execution. Note: The programmer is normally not much involved in this kind of concurrency, since it is handled automatically by the optimizing compiler.
- Unit (or sub-program) concurrency: Here one talks about concurrent
    "units", which are defined by the programming
    language and are intended to be executed concurrently.
    Programming languages with concurrent "units"
    allow the programmer to design programs that include a
    number of concurrent activities; the concurrency
    structure of a program depends very much on the
    application. A typical application area where concurrent
    "units" are very useful is simulation of
    process control or transportation systems, where the
    simulated objects evolve in real time concurrently. This
    level of concurrency is also called logical concurrency.
    In contrast to physical concurrency, it does not imply
    that the logically concurrent "units" are
    really executed in physical concurrency. Often they are
    executed in an interleaved manner (time-sharing, see
    below). Note: Instead of "unit", one often uses
    the words process, thread, task, or simply concurrent
    statements. Note: This kind of concurrency must be understood by the programmer that uses the programming language that supports concurrent threads (e.g. Java). In this course, we are mainly interested in this kind of concurrency.
- Program (or task) concurrency: Often several programs are executed
    concurrently (in an interleaved manner) on a computer
    within a time-sharing operating system. The CPU of the
    computer is shared among the programs by allocating the
    CPU to each program for a short time slice at a time. The
    allocation is done by a scheduler (see Sebesta's book).
    The same technique of time sharing is usually also used
    for realizing concurrent "units" within a given
    program (see above).
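As an illustration of unit-level (logical) concurrency, here is a minimal Java sketch (the class name TwoUnits and the printed messages are just illustrative): two threads are started, and their printed lines may be interleaved differently from one run to the next.

    // Minimal sketch: two concurrent "units" realized as Java threads.
    // The interleaving of the printed lines is not determined in advance.
    public class TwoUnits {
        public static void main(String[] args) throws InterruptedException {
            Runnable unit = () -> {
                for (int i = 0; i < 3; i++) {
                    System.out.println(Thread.currentThread().getName() + " step " + i);
                }
            };
            Thread a = new Thread(unit, "unit-A");
            Thread b = new Thread(unit, "unit-B");
            a.start();   // from here on, the two units run logically concurrently
            b.start();
            a.join();    // the main thread waits for both units to terminate
            b.join();
        }
    }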
The sharing of a CPU among several concurrent threads or programs: Each operating system has a scheduler which allocates short time periods of the CPU to the different active programs running on the computer. At the end of such a time period, it determines which program should obtain the CPU in the next time period. A similar scheduler is included in each program that supports several concurrent threads; this thread scheduler determines how the time period allocated by the operating system scheduler to the program is shared among the threads within that program. For instance, the Java Virtual Machine (JVM) running a Java program has a scheduler to share the CPU among the threads of the Java program (if it contains threads in addition to the main thread). 
Note that the behavior of a scheduler is not always defined in all details; e.g. for the thread scheduler in a Java virtual machine, not all aspects are defined. Normally, one distinguishes the following states for a process (thread, task, ...) that is scheduled by the scheduler (see diagram below; a small Java sketch showing some of these states follows the list):
- New: the process was just created 
- Running: the process is allocated to a processor (e.g. CPU) that executes the instructions of the process 
- Ready: the process is ready to be executed, but no processor is available. When a processor terminates the execution of a process (either because the time slice has elapsed, or because the process has to wait for some input/output operation to terminate), the processor will select another process from the "ready queue" to be executed. If there are several processes waiting in the ready state, it may not be defined which one will be selected, although priorities (if defined) must be taken into account. 
- Blocked: the process waits for an input/output operation to be completed before getting into the Ready state. 
- Terminated: the process has no more instructions to be executed. 
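The following small Java sketch shows how some of these states can be observed for a Java thread. Note that Java's Thread.State does not correspond one-to-one to the states above: Ready and Running are both reported as RUNNABLE, while BLOCKED, WAITING and TIMED_WAITING correspond to different kinds of waiting (the class name StateDemo is illustrative).

    // Sketch: observing some scheduling states of a Java thread.
    public class StateDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(100);        // the thread is TIMED_WAITING here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            System.out.println(t.getState()); // NEW: just created, not yet started
            t.start();
            System.out.println(t.getState()); // typically RUNNABLE or TIMED_WAITING
            t.join();                         // wait for the thread to terminate
            System.out.println(t.getState()); // TERMINATED
        }
    }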

Dependencies between concurrent processes
Different kinds of dependencies
In the simplest case, concurrent processes perform their
  processing independently from one another. But often they have to
  coordinate their work, which is done through interprocess
  communication and synchronization. One may classify their
  interrelationship as follows:
  - Disjoint processes: They are independent, no
    synchronization (except, possibly, through their
    time-shared execution by the same CPU; but this is not
    seen at the logical level of the program which contains
    these processes).
- Competitive synchronization: They are logically
    independent of one another, but they share some common
    resource (which is also described by the concurrent
    program which contains these processes). Shared resources
    usually require some relative synchronization of the
    accesses performed by different processes (typical
    example: readers and writers accessing a shared database). 
- Cooperative synchronization: The logical structure of the
    processing of one process depends on information obtained
    from another process. A typical example is the
    "producer - consumer" system, which often
    contains a queue which passes objects (messages) from the
    producer to the consumer (see the "Monitors and
    concurrency in Java" topic explained in sections
    12.4 and 12.7 of Sebesta's book).
Note: Competitive and cooperative synchronization and
  communication can be realized through message passing primitives
  or through shared variables (including some basic synchronization
  primitive that performs, as one atomic operation, the testing and
  (conditional) updating of a variable). Such primitives can be
  provided by an operating system for communication between
  concurrently executing programs, or by the hardware of a
  concurrent computer architecture. Examples of synchronization
  primitives for shared variables are a machine instruction
  "test and set" and the semaphore concept described in
  Section 12.3 in Sebesta's book. 
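In Java, the effect of a "test and set" instruction can be obtained with the atomic compareAndSet operation of java.util.concurrent.atomic.AtomicBoolean. The following is a minimal sketch of a spin lock built on such an atomic test-and-update (not the mechanism recommended for Java programs, which is discussed further below):

    import java.util.concurrent.atomic.AtomicBoolean;

    // Sketch of a spin lock based on an atomic "test and set"-like operation.
    // compareAndSet(false, true) atomically tests the flag and sets it, which is
    // the essential property of the hardware "test and set" instruction.
    public class SpinLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);

        public void lock() {
            while (!locked.compareAndSet(false, true)) {
                // busy-wait until the flag was false and we managed to set it to true
            }
        }

        public void unlock() {
            locked.set(false);   // release: another process may now succeed
        }
    }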
The non-deterministic nature of concurrent systems (remembering what we discussed in Chapter 1-3)
Concurrent processes give rise to a large number of possible interaction sequences because the execution sequences of the different processes may be interleaved in many ways. In the case of independent concurrency (as for instance defined by parallel regions of UML State diagrams) one assumes by default that there is no interrelationship between the concurrent processes (no communication, no shared resources, no shared variables). This has the advantage that the behavior of each process can be analysed independently of the other processes. However, the number of possible overall execution sequences is the largest, because the actions of the different processes may be interleaved arbitrarily in the global execution trace.
An example is shown in the following diagrams, where (a) and (b) represent two independent processes, defined in the form of LTSs, and (c) shows the resulting global behavior (obtained by reachability analysis, which is simple in this case). In this example, one assumes that the LTS transitions take no time; therefore the execution of two transitions of two different LTSs can be assumed to be sequential, one after the other, and when the LTSs are independent and two transitions of different LTSs are enabled, their order of execution is not determined. There is therefore some non-determinism introduced by this form of concurrency. This is called interleaving semantics.
 
       
The main point is that one cannot predict which execution path will be taken by the interleaving concurrent processes. If, let us assume, there is some dependency between the two processes which leads to some possible difficulty when the transition z is taken from the global state (3, C), then it is very difficult to test the system to ensure that this dependency is correctly implemented: we cannot force the system to go through a particular path. It is worse if one does not know where a problem may reside in an implementation; the chance of executing a transition path that encounters the problem during normal testing is very small indeed. Conclusion: Concurrent systems are very difficult to test for correctness. Therefore one should try to verify their correctness by logical reasoning.
 Some notes on the
  "message passing" concept 
"Message passing" means that one process sends a
  message to another process and then continues its local
  processing. The message may take some time to get to the other
  process and may be stored in the input queue of the destination
  process if the latter is not immediately ready to receive the
  message. Then the message is received by the destination process,
  when the latter arrives at a point in its local processing where
  it is ready to receive messages. This is called asynchronous
  message passing (because sending and receiving do not occur at the same
  time, although the reception necessarily occurs after the sending). 
Asynchronous message passing implies the following
  synchronization between the two processes (both styles of operations are sketched in Java after this list): 
  - Blocking send and receive operations: The receiving process will be
    blocked if it arrives at the point where it may receive
    messages and no message is waiting. (The sender may get
    blocked if there is no room in the message queue between
    the sender and the receiver; however, in many cases, one
    assumes arbitrarily long queues, which means that the
    sender will never be blocked.)
- Non-blocking send and receive operations: In this case,
    the send and receive operations always return
    immediately, returning a status value which could
    indicate that no message has arrived at the receiver.
    Therefore the receiver may test whether a message is
    waiting and possibly do some other processing.
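Both styles can be seen with Java's java.util.concurrent.BlockingQueue, which can play the role of the message queue between the sender and the receiver: take() is a blocking receive, while poll() is a non-blocking receive that returns null when no message is waiting. A minimal sketch (class and message names are illustrative):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch: blocking vs. non-blocking reception of messages with a Java queue.
    public class ReceiveStyles {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> inbox = new ArrayBlockingQueue<>(10); // bounded message queue

            // Non-blocking receive before anything was sent: returns immediately with null.
            String maybe = inbox.poll();
            System.out.println("poll() returned: " + maybe);            // null

            // Asynchronous send performed by another process (thread).
            new Thread(() -> inbox.offer("hello")).start();

            // Blocking receive: the receiver waits until a message is available.
            String msg = inbox.take();
            System.out.println("take() returned: " + msg);              // hello
        }
    }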
In the case of synchronous message passing (as for instance
  realized in the Ada language, see Sebesta's book, Sections 12.5.1
  through 12.5.3), one assumes that sending and receiving takes
  place at the same time (there is no need for an intermediate
  buffer). This is also called rendezvous and implies closer
  synchronization: the combined send-and-receive operation can only
  occur if both parties (the sending and receiving processes) are
  ready to do their part. Therefore the sending process may have to
  wait for the receiving process, or the receiving process may have
  to wait for the sending one (as in the case of asynchronous
  message passing). 
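A rendezvous-like handoff can be imitated in Java with java.util.concurrent.SynchronousQueue, a queue without any buffer: a put() only completes when another thread performs the matching take(). A minimal sketch (class and message names are illustrative):

    import java.util.concurrent.SynchronousQueue;

    // Sketch: a rendezvous-like handoff; put() and take() complete together.
    public class Rendezvous {
        public static void main(String[] args) throws InterruptedException {
            SynchronousQueue<String> channel = new SynchronousQueue<>();

            Thread sender = new Thread(() -> {
                try {
                    channel.put("report");    // blocks until the receiver is ready
                    System.out.println("sender: handoff completed");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            sender.start();

            Thread.sleep(200);                // the sender has to wait during this time
            String msg = channel.take();      // the rendezvous takes place here
            System.out.println("receiver got: " + msg);
            sender.join();
        }
    }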
Resource sharing
Need for mutual exclusion 
  - The example of two users simultaneously updating the salaries of employees (a Java sketch of this race is given after this list):
    
      - One administrator increments all salaries by 5% : salary = salary * 1.05;
- The other administrator adds a 100$ bonus to employee X: salary = salary + 100;
- What are the possible outcomes for employee X, if initially he had a salary of 5000$?
 
- Notion of transactions in database systems (possibly distributed). A transaction must be executed completely, or not at all. Several transactions may be executed concurrently, but must follow certain scheduling rules that ensure that the net result is equivalent to serial execution (serializable schedule). A transaction must leave the database in a consistent state. In case of a system failure, the effect of partially executed transactions must be "rolled back". These properties of transactions are called ACID properties (atomicity, consistency, isolation, durability).
- Example of money transfer: Transfer of 1000$ from account A in bank BankA to account B in BankB. The transaction requires access to both accounts, residing on two different computers. It should be atomic: execute completely or be aborted (even if one of the computers fails during the operations). The mutual exclusion should include both accounts; they both must be reserved before they can be used, and later freed. The resource for which mutually exclusive access is reserved may include many units, e.g. a whole database, or several databases.
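Here is a sketch of the salary example as a Java program without any mutual exclusion (the class name and the use of a static variable are just illustrative). Since each update is a non-atomic read-modify-write, the final value depends on the interleaving: 5350$ (raise then bonus), 5355$ (bonus then raise), or, if one update overwrites the other, 5250$ or 5100$.

    // Sketch of the salary update race: two administrators updating the same salary
    // concurrently, without mutual exclusion.
    public class SalaryRace {
        static double salary = 5000.0;

        public static void main(String[] args) throws InterruptedException {
            Thread raise = new Thread(() -> { salary = salary * 1.05; }); // +5%
            Thread bonus = new Thread(() -> { salary = salary + 100; });  // +100$
            raise.start();
            bonus.start();
            raise.join();
            bonus.join();
            System.out.println("final salary: " + salary);
        }
    }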
Resource reservation
Resource reservation is a scheme for realizing mutual exclusion; it may be realized through semaphores or monitors (both can be programmed in Java)
  
  - Historical note: Dijkstra (at the University of Eindhoven in Holland) constructed in 1965 THE operating system (probably the first multi-tasking operating system), and he had problems sharing the different computer resources (CPU, memory, disk, etc.) among the different concurrent applications. He programmed in machine language and noticed that, given the list of instructions available on the computer, it was impossible to realize reliable resource sharing among concurrent processes. He then proposed a new instruction, sometimes called "test and set", which he introduced as part of the concept of a semaphore. The semaphore is an object that supports two operations: P (for reserve: it tests whether the value of the semaphore is > 0; if that is the case, the value is decremented by one, otherwise the process has to wait) and V (for liberate: the value of the semaphore is incremented and one of the waiting processes (if there is one) is allowed to try its P operation again). An application must call the P operation before accessing the resource associated with the semaphore, and must execute the V operation when it does not need the resource any more. (A small sketch using Java's java.util.concurrent.Semaphore is given after the conclusions below.)
    
      - Conclusions:
        
        
          - A semaphore is a very simple primitive for programming and controlling concurrency
- One cannot control concurrency with operations (machine instructions) that are solely conceived for sequential programs
- However, much later, Lamport described the Bakery algorithm (see Wikipedia) that provides mutual exclusion based on individual read and write operations on shared variables (where the write operations must be atomic).
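Java provides the semaphore concept directly in its standard library: java.util.concurrent.Semaphore offers acquire() (corresponding to P) and release() (corresponding to V). A minimal sketch of mutual exclusion with a binary semaphore (class and method names other than those of Semaphore are illustrative):

    import java.util.concurrent.Semaphore;

    // Sketch: mutual exclusion with a binary semaphore (one permit).
    public class CriticalSection {
        private static final Semaphore mutex = new Semaphore(1);

        static void useSharedResource(String who) throws InterruptedException {
            mutex.acquire();                 // P: wait until the permit is available
            try {
                System.out.println(who + " is inside the critical section");
            } finally {
                mutex.release();             // V: give the permit back
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 3; i++) {
                final String name = "process-" + i;
                new Thread(() -> {
                    try { useSharedResource(name); }
                    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                }).start();
            }
        }
    }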
 
 
- More complex concurrency rules: Often one wants to describe more complex concurrency rules, beyond simple mutual exclusion. Examples are: (a) the readers-writers problem (where several readers may access the resource concurrently), and (b) the tutorial example of the dining philosophers proposed by Dijkstra (each philosopher needs to reserve the left and right fork in order to eat; there may be conflicts and the possibility of deadlocks due to a cyclic "wait-for" graph). A sketch of the code for a philosopher, who is initialized with the parameter i indicating his position at the table, is given below. 
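The original listing from the course page is not reproduced here; the following is a minimal sketch under the assumption that each fork is represented by a binary java.util.concurrent.Semaphore shared by the two neighbouring philosophers. This naive version reserves the left fork first and then the right one, and can therefore deadlock when every philosopher holds his left fork at the same time (a cyclic wait-for graph).

    import java.util.concurrent.Semaphore;

    // Sketch of a philosopher thread; forks[i] is the left fork of philosopher i.
    public class Philosopher extends Thread {
        private final int i;               // position at the table
        private final Semaphore[] forks;   // one binary semaphore per fork

        public Philosopher(int i, Semaphore[] forks) {
            this.i = i;
            this.forks = forks;
        }

        @Override
        public void run() {
            int left = i;
            int right = (i + 1) % forks.length;
            try {
                while (true) {
                    // think ...
                    forks[left].acquire();    // reserve the left fork
                    forks[right].acquire();   // reserve the right fork (possible deadlock here)
                    // eat ...
                    forks[right].release();
                    forks[left].release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }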
- Deadlocks
    
      - Necessary conditions:
        
          - mutual exclusion
- hold and wait: while holding a given resource, a process may ask for another resource
- no preemption
- possibility of circular waiting chains
 
- Strategies for dealing with deadlocks
          
            - prevent: prevent the above conditions (at least one)
- avoid: organize the system such that no circular waiting will occur. One example of such a scheme is the so-called two-phase resource reservation: the resources are ordered, and each process must first reserve all resources required for its task, requesting the reservations in the order of the resources; afterwards, all resources must be released. (A small Java sketch of reserving resources in a fixed order is given after this list.)
- deadlock detection and recovery: this requires a mechanism to detect the occurrence of a deadlock and a scheme for undoing the actions of one of the processes involved in the deadlock (which may retry its actions later). In the case of transaction management, this means that one of the transactions involved in the deadlock cycle must be aborted.
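As a small illustration of avoiding circular waits by ordering the resources, here is a sketch of the money-transfer example in Java where the two accounts are always locked in the order of their (assumed) identifiers, whichever direction the transfer goes; two concurrent transfers can then never wait for each other in a cycle. The Account class and its fields are illustrative, not taken from the course notes.

    // Sketch: deadlock avoidance by reserving resources (accounts) in a fixed order.
    public class Transfer {
        static class Account {
            final int id;           // the global ordering criterion
            double balance;
            Account(int id, double balance) { this.id = id; this.balance = balance; }
        }

        static void transfer(Account from, Account to, double amount) {
            Account first  = (from.id < to.id) ? from : to;   // lock the smaller id first
            Account second = (from.id < to.id) ? to : from;
            synchronized (first) {
                synchronized (second) {
                    from.balance -= amount;
                    to.balance   += amount;
                }
            }
        }
    }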
 
 
The monitor concept
  - Historical note: 1967 Simula (the first OO language, defined by Dahl in Norway); 1972 Pascal (defined by Wirth in
    Zurich, Switzerland); 1976 Concurrent Pascal (defined by Brinch-Hansen,
    a Danish emigrant to the US): this language combines the simplicity of Pascal with the concept of Class from Simula and introduces the concept of controlling resource sharing for concurrent processes by monitors. 
- A monitor is an instance of a Class with the following properties:
    
      - The executions of concurrent calls of methods are mutually excluded in time (in Java, this is indicated by using the synchronized keyword) 
- The possibility of using within the body of methods the operations wait (temporarily suspending the mutual exclusion until some programmer-defined condition, depending on internal monitor variables and method parameters, is satisfied) and notify (executed just before terminating a method call; used to activate a waiting process, which will again check its waiting condition, which may have been modified by the process that executed the notify operation).
 
- A semaphore is a special case of a monitor (a sketch of a semaphore programmed as a Java monitor is given below).
- In Java, each object instance (which inherits from the Object class) can be used as a monitor; the keyword synchronized indicates the need for mutual exclusion for a given method, and the operations wait and notify (or notifyAll) are defined on any instance of Object. 
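Since a semaphore is a special case of a monitor, it can be programmed in Java with synchronized, wait and notify. A minimal sketch of a counting semaphore written as a Java monitor (the class and method names are illustrative):

    // Sketch: a counting semaphore programmed as a Java monitor.
    public class MonitorSemaphore {
        private int value;   // number of available permits

        public MonitorSemaphore(int initial) {
            this.value = initial;
        }

        // P operation: wait until the value is positive, then decrement it.
        public synchronized void p() throws InterruptedException {
            while (value == 0) {
                wait();      // suspends the mutual exclusion while waiting
            }
            value--;
        }

        // V operation: increment the value and wake up a waiting process.
        public synchronized void v() {
            value++;
            notify();        // a waiting process re-checks its condition
        }
    }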
Concurrency in Java (Please read in the book by Sebesta)
Examples: 
  - A FIFO queue with producer and consumer processes is given in the book by Sebesta (see copy (a) of the Queue class (note: there is a bug in the incrementation of the pointer variables, such as nextIn) and (b) of the producer and consumer classes that call an instance of Queue). At the end, there is the main program that creates instances of these objects and passes the Queue instance to the producer and consumer when they are constructed. (A corrected sketch of such a queue is given after this list.) 
- Note: A very similar example is provided by SUN's Java tutorial under the title "Guarded Blocks". 
- Here are some additional Java programs using semaphores and other kinds of monitors.
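Since Sebesta's Queue class is only referenced above, here is a corrected minimal sketch of such a bounded FIFO queue used as a monitor by a producer and a consumer; the nextIn/nextOut pointers are incremented modulo the buffer length (the detail that is buggy in the printed copy). The class names and buffer size are illustrative.

    // Sketch: a bounded FIFO queue (monitor) with a producer and a consumer thread.
    public class BoundedQueue {
        private final int[] buffer;
        private int nextIn = 0, nextOut = 0, count = 0;

        public BoundedQueue(int capacity) {
            buffer = new int[capacity];
        }

        public synchronized void deposit(int item) throws InterruptedException {
            while (count == buffer.length) {
                wait();                              // queue full: the producer waits
            }
            buffer[nextIn] = item;
            nextIn = (nextIn + 1) % buffer.length;   // wrap around correctly
            count++;
            notifyAll();                             // wake up a possibly waiting consumer
        }

        public synchronized int fetch() throws InterruptedException {
            while (count == 0) {
                wait();                              // queue empty: the consumer waits
            }
            int item = buffer[nextOut];
            nextOut = (nextOut + 1) % buffer.length;
            count--;
            notifyAll();                             // wake up a possibly waiting producer
            return item;
        }

        public static void main(String[] args) {
            BoundedQueue q = new BoundedQueue(5);
            new Thread(() -> {                       // producer
                try { for (int i = 0; i < 10; i++) q.deposit(i); }
                catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            }).start();
            new Thread(() -> {                       // consumer
                try { for (int i = 0; i < 10; i++) System.out.println(q.fetch()); }
                catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            }).start();
        }
    }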
Some additional questions:  
  
    - How could one implement the monitor concept of Java in the Java Virtual Machine (suppose you had to design such a machine)?
- Program the example of readers and writers in different versions (different versions of priorities).
   
Originally written: March 22, 2003; 
updated: March 19, 2004; typographical revision: April 4, 2005; translated to English: March 3, 2008; revised 2009, 2010, 2011, 2013