Short Notes By: Salman Fazal
Contents
Operating Systems ............................................. 1
Linux ......................................................... 3
The CPU and Assembly Code ..................................... 5
Booting ....................................................... 9
Outside the CPU ............................................... 10
Memory Management ............................................. 12
Cache & Buffer ................................................ 16
File Management ............................................... 18
The Kernel 1 .................................................. 20
The Kernel 2 .................................................. 22
Jobs Doing Work ............................................... 24
Process Control ............................................... 26
Scheduling .................................................... 28
Synchronisation ............................................... 32
IPC & Synchronisation ......................................... 35
IPC Pipes and Sockets ......................................... 36
IPC Threads ................................................... 39
OS Security ................................................... 42
Defensive Programming ......................................... 45
220CT Notes Salman Fazal
Operating Systems

An operating system is a computer program designed to serve these basic purposes:
- Provide an interface between the computer hardware and software
- Control and manage resources among the various users and tasks
- Execute programs on behalf of users
The kernel is the part of the operating system closest to the hardware. It provides:
- device driver control
- process management
- memory management
- system calls
The kernel is the core of the operating system. It is a small piece of code loaded into the device memory at booting.
Why do we need an OS?
- Simplified platform for developing software
- Allows users to perform tasks and manage the system
- Portable and 'hides the hardware' – e.g. when writing an application to read from a file, you don't need to find its physical position on disk.
- Resource sharing – can run several programs at once. Saves time and money.
Difference between an operating system and an application: the operating system manages system resources and gives application software an environment in which it can run. Application software is the front-end software that users handle (e.g. word processors). An OS is nothing but its own machine code; i.e. it doesn't rely on other software or libraries.
Where is the data stored?
Early computers used separate storage and signal pathways for data and instructions. This is called the Harvard Architecture.
The architecture we mostly use today is called the von Neumann Architecture. Here, data and instructions are stored in the same place and are read in the same way. The line between the two is not clear anymore.
EXTRA: Difference between kernel and OS: The word "kernel" means "nucleus". It is the core component of any OS. It handles the low-level interface with the hardware, i.e. the "dirty work" like managing memory usage, scheduling tasks, I/O access, etc. The OS, on the other hand, includes the kernel plus everything else that makes a machine usable. This includes tools to manage files and directories, standard libraries that interface with the kernel, printing and networking servers, office suites, GUI servers, etc.
Linux

Linux is a UNIX-like operating system that runs on many different hardware platforms. It is free and open-source.
Facts:
- Runs on about 2/3 of servers
- More than 90% of supercomputers
- About 70-80% of phones are based on UNIX
The Shell
The shell is the interface between you and Linux.
The prompt shows three things: username, host and the final character. The final character shows whether you are root or not:
$ means normal user
# means root
Basic Terminal Shortcuts
COMMAND LINE

NAVIGATION
- pwd – print working directory; tells you what directory you are in.
  pwd → /home/music
- cd – change directory; move to another location.
  cd → home directory; cd .. → upper directory; cd /documents/uni → go to uni
- ls – list files & directories you are in.
  ls → list files & directories; ls -a → show all including hidden; ls -l → long listed format; ls -t → sort by modification time; ls -S → sort by size
  *You can combine options, e.g. ls -al

DIRECTORIES/FOLDERS
- mkdir – create directory (folder). mkdir myFiles → make myFiles folder
- rmdir – remove directory. rmdir myFiles → remove myFiles folder

FILES
- touch – create blank file. touch testFile → make testFile
- cp – copy file to destination. cp Documents/myFiles/testFile ../Downloads/ → copy file to Downloads
- mv – move file to destination. mv Documents/myFiles/testFile ../Downloads/ → move file to Downloads
- rm – remove file. rm testFile → remove testFile

FILES – EXTRAS
- cat – concatenate files; view multiple files at once. cat file1.txt file2.txt → displays entire text from both files
- head – show the first n lines. head -4 file1.txt → outputs first 4 lines
- tail – show the last n lines. tail -2 file1.txt → outputs last 2 lines
- sort – sort lines in a given way. sort file1.txt → outputs in ascending order (sort -r for reverse)
- wc – count number of lines, words and characters. wc file1.txt → 19 77 654 file1.txt (use wc -l/-w/-c for lines/words/characters)
- grep – search for a given pattern. grep corn text1.txt → outputs lines containing corn
- diff – compare differences between two files. diff file1 file2 → finds differences between the two files
- wget – download given URL. wget www.mySite.com/textFIle.txt

PERMISSIONS
- chmod – change file/folder permissions. chmod permissions filename
  Permissions are listed as user-group-others, each as read-write-execute, e.g. rwxr--r--.
  Each digit is calculated by adding: 4=read; 2=write; 1=execute.
  chmod 754 file1.txt → user has full control, group can read & execute, and others can read only.
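The 4/2/1 digit arithmetic above can be sketched in Python (helper names here are illustrative, not part of any real chmod API):

```python
# Sketch: building chmod's numeric permissions from the
# 4=read, 2=write, 1=execute digits described above.

def rwx_to_digit(rwx):
    """Convert a 3-char string like 'rwx' or 'r--' to its octal digit."""
    return (4 if rwx[0] == 'r' else 0) + \
           (2 if rwx[1] == 'w' else 0) + \
           (1 if rwx[2] == 'x' else 0)

def mode_to_digits(mode):
    """Convert a 9-char mode string (user-group-others) to its digits."""
    return ''.join(str(rwx_to_digit(mode[i:i+3])) for i in (0, 3, 6))

print(mode_to_digits('rwxr-xr--'))  # 754
```

So rwxr-xr-- is exactly chmod 754: rwx=4+2+1=7, r-x=4+1=5, r--=4.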
The CPU and Assembly Code

The CPU is also known as the processor or microprocessor. The CPU is responsible for executing a sequence of stored instructions called a program. Before discussing further, let's look back at the von Neumann architecture. More or less all computers are based on the same basic design:
- ALU – capable of performing arithmetic operations
- CU – interprets the instructions in memory and causes them to be executed
- Registers – used to carry temporary data for performing operations
(Low level languages make direct use of the hardware within the CPU. This means the programmer needs to know a lot about the internal structure of the CPU in order to write low level code. In particular, they need to know about the registers within the CPU.)
Types of registers:
1. Program Counter – also a pointer. Keeps track of the memory address of the next instruction to be executed.
2. Accumulator – stores results (from the ALU) of the currently running instructions.
3. Instruction Register – holds the current instruction to be executed.
4. Memory Address Register – holds the memory location of the next piece of data or instruction to be fetched.

Fetch-Execute Cycle: the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction dictates, and carries out those actions.
1. The CPU reads the next instruction from RAM and stores it in the instruction register.
2. The instruction is decoded and used to instruct the arithmetic and logic unit (ALU).
3. The ALU performs the actual instruction and the result is stored, either in the accumulator and/or in RAM.
4. Repeat and fetch the next instruction.
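The cycle above can be sketched as a toy simulation (a hypothetical 3-instruction machine, not a real ISA):

```python
# Toy fetch-decode-execute loop: program counter, instruction register
# and accumulator behave as described above.

memory = [("LOAD", 5), ("ADD", 3), ("HALT", None)]  # program held in "RAM"

pc = 0           # program counter: address of next instruction
acc = 0          # accumulator: holds ALU results
running = True

while running:
    ir = memory[pc]          # fetch into the instruction register
    pc += 1                  # point at the next instruction
    op, operand = ir         # decode
    if op == "LOAD":         # execute on the "ALU"
        acc = operand
    elif op == "ADD":
        acc += operand
    elif op == "HALT":
        running = False

print(acc)  # 8
```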
ASSEMBLY LANGUAGE
"A low-level programming language for computers, microprocessors and other programmable devices". Assembly language is just one level higher than machine language (raw machine code). NASM and YASM are assemblers – programs that translate assembly source into machine code.
Today, assembly language is used primarily for direct hardware manipulation, access to specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems. Assembly language is as close to the processor as you can get as a programmer. Assembly is great for speed optimization: it's all about performance and efficiency. Assembly language gives you complete control over the system's resources; you write code to push single values into registers and deal with memory addresses directly to retrieve values or pointers.

Referring to Registers
In a 16-bit CPU, AX is our register (for example):
It contains a 16-bit value. AH is the "high" byte and AL is the "low" byte.
Example:
AX = 4902 = 0001 0011 0010 0110
AH =   19 = 0001 0011
AL =   38 = 0010 0110

8-bit and 16-bit CPUs are no longer common. In a 32-bit CPU, our register becomes EAX.
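The AX/AH/AL split above is just byte masking and shifting, which can be checked directly:

```python
# Reproducing the AX = 4902 example with bit operations.
ax = 4902                 # 0001 0011 0010 0110
ah = (ax >> 8) & 0xFF     # high byte
al = ax & 0xFF            # low byte

print(ah, al)             # 19 38
assert ax == (ah << 8) | al   # recombining the halves gives AX back
```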
Other Registers:
AX (Accumulator) – where results of operations go
BX (Address/pointer) – used to direct syscalls
CX (Counter) – counter for loops, etc.
DX (Data) – extra results that don't fit in AX
In a 64-bit CPU, the register becomes RAX.

IP (Instruction Pointer) – points to the next instruction
SI (Source Index) – location of the start of a string/array/etc.
Structure of an assembly language

We'll be using the NASM/YASM assembler.
A data section – initialized data or constants in memory.
    section .data
A bss section – variables in memory.
    section .bss
A text section – the actual program code. This section starts with a global _start declaration.
    section .text
    global _start
    _start:
Comments start with a semi-colon ; This is a comment
Assembly Language Syntax
Assembly language statements are entered one statement per line. Format : [label:] mnemonic [operands] [;comment]
Two parts: first is the name of the instruction, second are the parameters of the command. A label is like a variable name; it labels the location of the variable.
Assembly Language Instructions Arithmetic mnemonics: ADD, SUB, DIV, MUL, INC, DEC Data transfer mnemonics: MOV, POP, IN, OUT Logic mnemonics: AND, OR, XOR, NOT
TERMS Compiling – convert symbolic code into machine code. Linking – links libraries to compiled code to make an executable.
Compiling and running the code (Linux): To assemble the program, type “ nasm -f elf hello.asm “ To link the object and make it executable, type “ ld -m elf_i386 -s -o hello hello.o “ Execute the program by typing “ ./hello “
SAMPLE CODE – Hello World
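The original sample was an image and is not reproduced here; a minimal hello-world sketch consistent with the section layout and the 32-bit build commands above (int 0x80 Linux syscalls) would look roughly like this:

```nasm
section .data
    msg db 'Hello, World!', 0xA   ; the string plus a newline
    len equ $ - msg               ; length of the string

section .text
    global _start

_start:
    mov eax, 4        ; syscall number for sys_write
    mov ebx, 1        ; file descriptor 1 = stdout
    mov ecx, msg      ; address of the string
    mov edx, len      ; number of bytes to write
    int 0x80          ; call the kernel

    mov eax, 1        ; syscall number for sys_exit
    mov ebx, 0        ; exit status 0
    int 0x80
```

Saved as hello.asm, this assembles and links with the nasm/ld commands listed above.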
More Instructions
Booting

Powering On
Once the power button is clicked:
1. The power supply confirms that the power it can give the system is OK, then signals the motherboard to begin.
2. The BIOS (Basic Input/Output System) responds to the signal. The BIOS sees this signal as power_good.
3. The BIOS then begins the POST (Power On Self Test) process.
4. After POST, the system is ready for booting. The BIOS reads the first 512 bytes from its boot device and executes them.
   - The 512-byte image is called the bootsector.
   - It usually starts with the floppy drive, then moves to the HDD, then other devices.
   - The last two bytes of the bootsector must be 55AA.
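The 55AA signature check can be sketched like this (synthetic bytes here, not a real disk image):

```python
# Sketch: checking the 0x55AA signature on a 512-byte boot sector.
SECTOR_SIZE = 512

def is_bootable(sector: bytes) -> bool:
    """A valid bootsector is exactly 512 bytes and ends in 0x55 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[-2:] == b'\x55\xaa'

blank = bytes(SECTOR_SIZE)                        # all zeros: not bootable
bootable = bytes(SECTOR_SIZE - 2) + b'\x55\xaa'   # correct signature

print(is_bootable(blank), is_bootable(bootable))  # False True
```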
Bootloader
A mini executable program that loads an operating system when a computer is turned on. It is placed on the disk drive.
i. Run bootloader
ii. Load operating system
iii. Run operating system
The BIOS
The BIOS is boot firmware which is executed as soon as the PC is powered on. It has some pretty powerful capabilities:
- It can access RAM and secondary storage
- It can interact with the user
* The bootsector is only 512 bytes, but that is enough to load a second, much larger chunk of code from somewhere else on the disk (this approach is called a two-stage bootloader). The bootloader is able to:
- Display information
- Read from drives
- Interact with the user
- And so on
Once the BIOS loads the bootloader, it leaves some code in memory that interrupts are mapped to. The bootloader now loads the core of the OS and hands over control of the computer.

Q. If the BIOS interrupts are still around, why can't programmers use them to take control of the OS?
A. When booting, the CPU is in "real mode", meaning software can use any command and do pretty much anything. But when the OS process starts executing, it first switches to protected mode, where only limited instructions can be called and there is restricted memory that can be written/read. (Real mode has the power to change privileges.)
Outside The CPU

Modern machines have one or more CPUs and device controllers connected through a common system bus.
- I/O devices and the CPU can execute concurrently.
- Each device controller is in charge of a particular device.
DEVICE (I/O) CONTROLLERS
- A device controller makes sense of the signals going to, and coming from, the CPU.
- Each device controller has a local buffer and a command register. It communicates with the CPU via interrupts.
- A device's controller plays an important role in the operation of that device; it functions as a bridge between the device and the operating system. Its job is to manage the devices connected to it, mainly moving data around to the different peripherals.
INTERRUPTS
Operating systems are event driven. When you switch on, the bootstrap is loaded; if nothing happens, the OS will just wait for an event. Interrupts are events external to the currently executing process. Hardware interrupts are triggered by sending signals to the CPU; software interrupts are triggered by executing an operation called a system call.
When an I/O device sends an interrupt to the CPU, the request (IRQ) goes through a chip called the PIC. The PIC selects the highest-priority device and passes its request to the CPU. The CPU pauses whatever it's doing and handles the request. Once the CPU performs the required action, it informs the PIC that it has finished and returns to whatever it was doing before.
INTERRUPT VECTOR
Pointers to the start of each interrupt service routine are stored in an array called the Interrupt Vector. Each entry holds the address of the interrupt service routine for a particular device.
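Conceptually, the interrupt vector is just an array of handler addresses indexed by interrupt number. A sketch (handlers here are plain Python functions standing in for service routines):

```python
# Sketch of an interrupt vector: index = interrupt number,
# entry = the service routine to jump to.

def timer_isr():
    return "tick"

def keyboard_isr():
    return "key pressed"

interrupt_vector = [timer_isr, keyboard_isr]

def dispatch(irq):
    """What the CPU does conceptually: jump to the routine at vector[irq]."""
    return interrupt_vector[irq]()

print(dispatch(1))  # key pressed
```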
PROCESS
A process is a basic unit of execution in an operating system – basically a program in execution. The OS controls the execution of a process; at the simplest level a process is in one of two states: running or not running. The CPU switches back and forth from process to process; this rapid switching is called multiprogramming.
- New – process is being created.
- Ready – process is waiting to run on the CPU.
- Running – process is running on the CPU.
- Blocked – process is waiting for an event.
- Exit – process has finished execution.
Each process in the OS is represented by a Process Control Block, which includes:
- Process state (new, ready, running, ...)
- Program counter – indicates the address of the next instruction.
- Registers – vary in number and type.
- CPU-scheduling information – includes a process priority and any other scheduling parameters.
- Memory-management information.
Memory Management

MEMORY HIERARCHY
ADDRESS SPACE
An address uniquely identifies a location in memory. There are two types of addresses: logical and physical. A logical address (also called a virtual address) is what the user/program sees; it is generated by the CPU during program execution. A physical address identifies a real location in the memory unit.
BINDING A program, to be executed, must be brought into the main memory. Address Binding is the process of mapping the program’s logical address (virtual) to its corresponding physical address (main memory). Binding can take place at ‘compile time’ and ‘load time’.
SWAPPING
Because processes can only be executed when they are in main memory, and main memory is limited in size, we might need more. So we swap some memory onto the backing store (hard disk). Processes can be temporarily swapped out of main memory to secondary memory, which allows execution of other processes. When the OS needs the data from disk, it will swap the data back into main memory (swapping some other piece out). Different approaches are used to choose what to swap, such as swapping out lower-priority processes in favour of higher-priority ones.
Contiguous Memory Allocation
The main memory must be allocated to the OS and all the processes. Numerous processes run in main memory at the same time, so there is a need to allocate memory to processes. Contiguous memory allocation is a method of assigning memory: the size of the process is compared with the amount of contiguous main memory available to execute the process.
*Contiguous -> neighbouring – think of it as a line of memory.
- First fit – allocate the first hole that is big enough.
- Best fit – allocate the smallest hole that is big enough.
- Worst fit – allocate the largest hole.
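The three placement strategies can be sketched over a list of free holes (hole sizes in KB are illustrative):

```python
# First/best/worst fit over a list of free hole sizes.

def first_fit(holes, size):
    """Return the first hole big enough, or None."""
    return next((h for h in holes if h >= size), None)

def best_fit(holes, size):
    """Return the smallest hole big enough, or None."""
    fits = [h for h in holes if h >= size]
    return min(fits, default=None)

def worst_fit(holes, size):
    """Return the largest hole big enough, or None."""
    fits = [h for h in holes if h >= size]
    return max(fits, default=None)

holes = [100, 500, 200, 300, 600]
print(first_fit(holes, 212))  # 500 - first hole that fits
print(best_fit(holes, 212))   # 300 - smallest hole that fits
print(worst_fit(holes, 212))  # 600 - largest hole
```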
VIRTUAL MEMORY
The conceptual separation of logical memory from physical memory. Thus, we can map a large virtual memory onto a small physical memory. Virtual memory is a feature of an operating system (OS) that allows a computer to compensate for shortages of physical memory by temporarily transferring pages of data from random access memory (RAM) to disk storage.
PAGING
It is not possible to keep all the necessary data and programs in main memory, so data is split up into chunks called 'pages'. Only the necessary pages are in main memory at any one time. Paging is a memory-management scheme that allows the physical address space of a process to be noncontiguous, where the processor stores and retrieves data from secondary storage for use in main memory. The OS now has to keep a table of which pages are allocated and which are empty. When a page is referenced and it is not in memory (a page fault), the page is fetched from disk and the instruction is re-executed.
Page Fault Occurs when there is a reference to the page that isn’t mapped to a physical page.
Steps to handle a page fault:
1. The process references a page which isn't in memory.
2. If the reference is valid, get the page from secondary storage.
3. Find a free frame (one which is currently not in use).
4. Read the desired page into the newly allocated frame.
5. Alter the page table to show the page is now resident.
6. Restart the instruction that failed.

PAGE REPLACEMENT ALGORITHMS
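Two common replacement policies, FIFO (evict the oldest page) and LRU (evict the least recently used), can be sketched by counting page faults for a reference string (3 frames and the reference string below are illustrative):

```python
# Counting page faults under FIFO and LRU replacement.
from collections import OrderedDict

def fifo_faults(refs, frames):
    mem, faults = [], 0
    for p in refs:
        if p not in mem:
            faults += 1
            if len(mem) == frames:
                mem.pop(0)               # evict the oldest page
            mem.append(p)
    return faults

def lru_faults(refs, frames):
    mem, faults = OrderedDict(), 0
    for p in refs:
        if p in mem:
            mem.move_to_end(p)           # mark as most recently used
        else:
            faults += 1
            if len(mem) == frames:
                mem.popitem(last=False)  # evict least recently used
            mem[p] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), lru_faults(refs, 3))  # 9 10
```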
THRASHING
In a virtual memory system, thrashing is the excessive swapping of pages of data between memory and the hard disk, causing the application to respond more slowly.
Solutions:
- Good paging algorithms
- Reduce the number of processes running.
Cache & Buffer

All processes must run from main memory, and transferring from one type of memory to another can be time consuming (e.g. HDD -> RAM). All processes exist in RAM; the OS must get instructions from RAM and put them in registers. The problem here is that RAM is about 25x slower than the registers in the CPU, meaning the registers have to wait for the RAM to send them information. The solution is to put some quicker memory between the RAM and the CPU: the cache memory, whose access speed lies between the RAM and the registers. This technique is called caching.
CACHE AND CACHING
Cache memory is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations. It is kept between the RAM and the processor to increase data execution speed.
- Whenever we read something from the RAM, make a copy of it in the cache memory.
- Whenever we need to access something from the RAM, first look for it in the cache memory; if the instructions are found there, we avoid a more time-consuming read from larger memory or other data storage devices.

Writing to cache:
- Write-through (immediate): every write to the cache also causes a write to RAM.
- Write-back (later): don't write to RAM until the data is evicted (replaced) from the cache.
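The two write policies can be sketched with a tiny dict-based "cache" and "RAM" (illustrative only; no real eviction policy beyond an explicit flush):

```python
# Write-through vs write-back, side by side.
ram = {}

class WriteThroughCache:
    def __init__(self):
        self.data = {}
    def write(self, addr, value):
        self.data[addr] = value
        ram[addr] = value                 # every write also goes to RAM

class WriteBackCache:
    def __init__(self):
        self.data = {}
    def write(self, addr, value):
        self.data[addr] = value           # RAM is NOT updated yet
    def evict(self, addr):
        ram[addr] = self.data.pop(addr)   # written back only on eviction

wt = WriteThroughCache()
wt.write('a', 1)
print(ram.get('a'))   # 1 - already in RAM

wb = WriteBackCache()
wb.write('b', 2)
print(ram.get('b'))   # None - still only in the cache
wb.evict('b')
print(ram.get('b'))   # 2 - written back on eviction
```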
LOCALITY
The concept of locality is used to increase efficiency: hold blocks of the most recently referenced instructions and data items.
- Temporal locality – locality in time. The principle that data being accessed now will probably be accessed again soon.
- Spatial locality – locality in space. The principle that data near the data accessed now will probably be needed soon.

Phonebook example: consider looking up a telephone number in a personal organizer; the more phone numbers, the slower the access.
- Temporal – if you call somebody today, you're more likely to call them again tomorrow.
- Spatial – you're likely to call a lot of the people you know.
BUFFER
A buffer is a portion of a computer's memory that is set aside as a temporary holding place for data that is being sent to or received from an external device. Peripheral devices operate at a slower speed than the processor; to allow for these speed differences, a buffer is used. For instance, consider transferring a file via a modem to the HDD. The file is received and placed into a buffer in main memory. When the buffer is full, the data can be written to the disk; while the data is being written, another buffer may be used to keep receiving the file. Buffers are also used as a cache for file I/O to improve the efficiency of file access. This is called the BUFFER CACHE.
220CT Notes Salman Fazal
File Management -
-
Previously, we looked at RAM, Cache, buffer and registers. These are temporary, meaning as soon as the processor terminates, the memory is reallocated. Now we look how the OS manages more permanent memory. The way in which an operating system organizes, structures, names, accesses, uses, protects, and implements is called a file management system.
All computer applications need to store and retrieve information. While a process is running, it can store a limited amount of information within its own address space. For applications such as airline reservation, we need a much larger amount of memory. Problems of storing information in process o When process terminates, the information is lost. o For some applications (ie. database), information information must be retained for a longer period. o Information stored inside a single process is only accessible by that process. It is frequently necessary for multiple processes to access (parts of) the information at the same time. Solution is needed from a file management management system. o Store large amount of information that can be accessed quickly. o Information stored must survive the termination of the process using it. o Multiple processes must be able to access the information simultaneously. simultaneously. FILES -
Files are logical units of information created by processes. Processes can read existing files and create new ones if required. Files are managed by the OS. File types can be: .doc, .exe, .bat, .zip, .c, etc.
File Operations: Operations can be: create, update, delete, open, close, append, seek, rename, get/set attributes…
- Seek: method to specify from where to take the data (moves the file pointer to a specific place in the file).
- Get: returns file attributes for processing.
- Set: sets the user-settable attributes when files are changed.
* File attributes: name, type, location, size, protection, time, date, user identification…
FILE DIRECTORIES
- Files are stored on disks, CDs etc., which store data in fixed-sized blocks.
- Filesystems store information in volumes. Each volume is treated as a numbered sequence of blocks. A list of filenames and locations is called a directory. There must be a root directory on each volume.
/Directory/Directory2/FileName.txt
./FileName.txt <- Hierarchical/Tree Structure.
LINKS
It is useful to keep the same file in several places. We use links, which are files (pointers) containing the pathname of the file being linked to. A link can be 'soft' or 'hard'.
A Soft Link is a file that is a pointer to another file; that other file points to the data on the HDD. This behaves similar to a Windows shortcut.
A Hard Link is a direct pointer to the data on the HDD. A hard link is identical to the original file. Any modifications you make to the hard-linked version are made to the original as well, since you're modifying the same physical space on the HDD.
*File attributes are stored in the inode. An inode is a data structure that stores all the information about a file except its name and data.
File Systems
- FAT (MS-DOS, Windows)
- Ext2fs (Linux 'second extended filesystem')
- NTFS (Windows NT, 2000, XP)
EXT2 File System
- A popular file system for Linux.
- Data held in files is kept in data blocks. Each file in the system is described with an inode data structure (which contains file attributes – permissions, last access time, etc). Directories contain filename and inode number.
- Each inode has a unique number. The inodes are stored in inode tables. Inodes are 128 bytes in size and contain 15 block numbers:
  - 1 to 12: direct links to the first 12 blocks of the file (so small files are handled efficiently)
  - single indirect block: points to a data block containing 256 more block numbers
  - double indirect block: points to a data block containing 256 more single indirect block numbers
  - triple indirect block: points to a data block containing 256 more double indirect block numbers.
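With the figures above (256 block numbers per indirect block), the maximum file size this layout can address works out directly. A minimal sketch, assuming 1 KiB blocks (the block size is not stated in these notes; ext2 also supports larger blocks):

```python
BLOCK_SIZE = 1024      # assumed 1 KiB blocks
PTRS_PER_BLOCK = 256   # 256 block numbers fit in one indirect block (as above)

# Blocks addressable from one inode's 15 block numbers:
direct = 12                   # block numbers 1-12 point straight at data
single = PTRS_PER_BLOCK       # 13th number -> one indirect block
double = PTRS_PER_BLOCK ** 2  # 14th number -> a block of indirect blocks
triple = PTRS_PER_BLOCK ** 3  # 15th number -> three levels of indirection

max_blocks = direct + single + double + triple
max_bytes = max_blocks * BLOCK_SIZE
print(max_blocks)   # 16843020 blocks
print(max_bytes)    # 17247252480 bytes, roughly 16 GiB
```

The key design point this illustrates: small files (under 12 blocks) need no indirection at all, while the indirect levels extend the reachable size multiplicatively.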
The Kernel 1
A kernel is the part of the operating system that facilitates access to system resources. It's responsible for enabling multiple applications to effectively share the hardware by controlling access to CPU, memory, disk I/O, and networking:
- Setting up interrupt handlers,
- Accessing memory,
- Interfacing with hardware,
- Caching & buffering, etc.
Kernel Mode vs User Mode
User-defined functions are executed in user mode and system-defined functions are executed in kernel mode.
Kernel Mode:
- Executing code has full, unrestricted access to hardware.
- Lowest-level, most trusted functions of the operating system; can reference any memory.
- Crashes are severe; can halt the entire PC.
User Mode:
- Executing code has no ability to directly access the hardware.
- Must delegate to system APIs to access hardware or memory.
- Crashes are usually recoverable.
Kernel jobs:
- Switching between programs
- Device control
- Memory Management
- Interprocess Communication (IPC)
- Scheduling
- Process Management
- Interrupt Handling
Types of Kernel
MACRO KERNEL is a single large process running entirely in a single address space.
- All kernel services exist and execute in kernel-space.
- The kernel can invoke functions directly.
MICRO KERNEL is broken down into separate processes, known as servers.
- Only the essential functions of an OS run in kernel-space; all the other OS processes run in user-space.
- Servers are kept separate and in different address spaces. They communicate with each other by sending messages via IPC.
MACRO KERNEL:
+ All components have full privileges
+ Efficiency: no need to swap out programs
- Writing one big piece of software isn't easy
- A problem in one area can cause a problem in all parts of the kernel
MICRO KERNEL:
+ Kernel can remain unchanged while user-space changes
+ Hard to find exploits, as there is limited code in the kernel
- Less efficient, as swapping is required
- IPC to communicate between servers means slower processing
NOTES
Microkernel example – Mach
Macrokernel example – Linux
The Kernel 2
APIs and System Calls
The API specifies a set of functions that are available to the programmer; these functions are used by software components to communicate with each other. The app could be something like Facebook, or weather.com…
System Calls – All OSes provide some mechanism to request services from the kernel; these requests are called system calls. In a Linux system, there is an interface layer between user space and kernel space; this layer consists of a library made up of functions that communicate between normal user applications and the kernel. These calls are implemented using interrupts.
An API does not necessarily correspond to a specific system call. The API could offer its services directly in User Mode, and an API function could make several system calls.
System call examples:
- exit(int) – terminates the current process with an exit value
- nice(int) – changes the niceness (priority) of the process by adding the given number
- ioctl – controls devices at a very low level by setting access modes, buffering, etc
- int open(const char *pathname, int flags) – opens the file at pathname. The flags integer conveys information to the kernel about how to open the file. The parameter is a bitmask. Eg. -> open("file.txt", O_APPEND | O_DIRECT)
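The same flag-bitmask idea can be sketched from Python, whose os module exposes the same open() flags (the temporary file below is purely illustrative):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo_flags.txt")  # throwaway file

# Create the file if needed, open it write-only, and append on every write.
# The flags are OR-ed together into one integer bitmask, as with C's open().
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"first line\n")
os.write(fd, b"second line\n")
os.close(fd)

# Reading the file back shows both writes landed, in order.
with open(path) as f:
    contents = f.read()
print(contents)
os.remove(path)
```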
Bitmasks
A bitmask is an integer that is stored in memory as a sequence/string of bits. (Not in slides.)
Eg. Say I have a Days Picker (Mon-Sun); how do I know which days are selected? Bitmask. You can combine these numbers using an OR operator, without losing the knowledge of which ones were combined.
   0000 0010
OR 0100 0000
 = 0100 0010 <- Tuesday and Sunday
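The Days Picker idea above can be sketched as follows (the Monday-is-lowest-bit mapping is an assumption consistent with the example):

```python
# One bit per day, Monday = least-significant bit (mapping assumed, as above)
MON, TUE, WED, THU, FRI, SAT, SUN = (1 << i for i in range(7))

selected = TUE | SUN        # combine with OR, as in the example above
print(bin(selected))        # 0b1000010

# Test membership with AND: a non-zero result means that bit is set
assert selected & TUE
assert selected & SUN
assert not (selected & FRI)

# Remove a day by AND-ing with the complement of its bit
selected &= ~SUN
assert not (selected & SUN)
```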
Why use an API to create programs rather than system calls?
- Portability – A program developed using an API can compile and run on any system that supports the same API.
- Complexity – Actual system calls are often more detailed and difficult to work with than the API available.
Why low-level calls? Many of the system calls are wrapped by higher-level libraries (ie. the C library); a handful of low-level calls are replaced by a single libc call. However, you don't always have libc, and it might not do what you want it to do.
Modules
Kernel modules are pieces of code that can be loaded into and unloaded from the kernel on demand. These modules run in kernel mode and have full privilege and access to all the hardware in the machine. Modules can be compiled and dynamically added/removed from the kernel address space while the kernel is running. This helps by reducing memory consumption, since only the necessary kernel modules are loaded. A module object file is installed into the running kernel using "insmod mod_name". Removing the kernel module involves invoking the rmmod command.
Jobs Doing Work
Batch Processing
Batch processing is the execution of a series of programs ("jobs") on a computer without human intervention. E.g. Printing – send a few documents to print and the computer takes care of the job.
Multiprocesses and Scheduling
Multiprogramming
The ability to store and execute multiple programs in memory. The CPU executes only one program at a time, while the others wait their turn. It increases CPU utilisation such that the CPU always has one program to execute.
E.g. The CPU picks a job to run; at some point the job might have to wait for some task (such as an I/O operation) to complete. At this point, the OS switches to another job that can be executed (and so on), and as soon as the first job finishes waiting, the CPU gets back to it. This maximises the use of the CPU, so that something is always running on it.
Multitasking (time-sharing)
The ability to execute more than one task at the same time on a processor. These tasks share common resources such as the CPU. Multiple jobs in the main memory are executed by switching between them quickly, making the system responsive to user input and making the jobs appear to run simultaneously.
Basically, the goal of multiprogramming is to have a process running all the time to maximise CPU usage, and the goal of multitasking is to switch among processes so that users have the impression that each program is running simultaneously.
Scheduling
The utilisation of the CPU to run several jobs. It basically handles the submission (or removal) of jobs to the CPU and decides which process to run given a particular strategy/algorithm. Two levels of scheduling are considered when multiprocessing:
• Job Scheduling – chooses which process to take from the secondary memory and put in the main memory.
• CPU Scheduling – manages how processes in the main store get to use the CPU.
When processes enter the system, they join the job queue (which represents all processes in the system, held in secondary storage). Job Scheduling places processes which are ready to execute in the main memory, where they are kept in the ready queue ('ready' state). A process will then be selected from the ready queue for processing ('running' state).
When the process is executing in the CPU, one of several things could happen (we'll consider two):
1. Process issues an I/O request – the process would waste CPU time while waiting for the request to be completed, so the process is added to the I/O Queue ('waiting' state). Once the request is completed, the process is added back to the Ready Queue.
2. Process creates a sub-process – think about entering a command on a command prompt. When a process creates a new sub-process, it changes state to 'waiting' and waits for the sub-process to finish (once completed, the process is added back to the ready queue). E.g. When the terminal executes a process, it has to wait until that process has finished before it can continue its own operations.
Process Control
The Linux process state diagram: starting an application creates a new process and puts it into the 'ready' state. The OS then moves it into the 'running' state when there's a slot free for it. P.s. the command 'ls' is a process too (located in PATH).
Terminating a process – In UNIX, one way to terminate a process is to press CTRL + C in the terminal. However, sometimes the process might have been started elsewhere, so there is no option for that; the kill command can be used instead. E.g. use ps to list processes, then use kill -9 1234 (or kill -KILL 1234) to kill the process with id 1234 forcefully.
Suspending a process – It is also possible to put a process into the 'waiting' state until it gets a signal to restart. Use CTRL + Z.
Multiple Processes and Forks
- Fork makes a child process that is an exact copy of its parent process.
- When a program calls fork, a duplicate process, called the child process, is created. The parent process continues executing the program from the point that fork was called. The child process, too, executes the same program from the same place.
- All the statements after the call to fork will be executed twice: once by the parent process and once by the child process.
  o fork() splits the process into two processes
  o If pid is 0, run the child-process code
  o Else run the parent-process code (as the process id isn't 0)
- The order of Child and Parent output might differ between runs because we now have two separate processes.
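The pattern above can be sketched in Python, whose os.fork wraps the same system call (the exit code 7 is an arbitrary illustrative value; Unix only):

```python
import os

def spawn_child():
    """Fork once; the child exits with code 7, the parent reaps it."""
    pid = os.fork()        # called once, returns twice
    if pid == 0:
        # fork() returned 0: we are in the child process
        os._exit(7)
    # Non-zero return: we are the parent, and pid is the child's process id
    _, status = os.waitpid(pid, 0)   # wait for the child to terminate
    return os.WEXITSTATUS(status)    # recover the child's exit value

print(spawn_child())   # 7
```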
Summary:
1. Concept – for one process to create another, it forks. This leads to two identical processes running the same code.
2. Function – fork() causes the process to fork. The function is called once, but returns twice. getpid() returns the process id.
Looping fork() – E.g. if we loop twice, we basically call fork 2 times, so each existing process splits again: one process becomes two, then four. This is exactly what happens:
Exec command
- The exec function replaces the program running in a process with another program.
- When a program calls an exec function, that process immediately stops executing its current program and begins executing the new program from the beginning.
- Because exec replaces the calling program with another one, it never returns unless an error occurs.
- The new program has the same process id as the original one. Unlike fork, exec results in a single process.
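The usual fork-then-exec pairing can be sketched as follows ('true' and 'false' are standard Unix utilities, used here purely as convenient test programs):

```python
import os

def run(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        os.execvp(argv[0], argv)   # replaces the child's program; never returns on success
        os._exit(127)              # only reached if exec itself failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

print(run(["true"]))    # 0 - 'true' always succeeds
print(run(["false"]))   # 1 - 'false' always fails
```

Note the design: fork gives us a second process to sacrifice, so the parent survives while the child's program is replaced.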
Scheduling
RECAP
- Multiprogramming – The ability to store and execute multiple programs in memory.
- Multitasking (time-sharing/splitting) – The ability to execute more than one task at the same time on a processor.
- Scheduling – basically handles the submission (or removal) of jobs to the CPU. Process scheduling decides which process should run on the CPU next (moved from the ready state to the running state).
- Processes that use very little IO are called CPU bound, and processes that spend more time doing IO compared to computation are called IO bound.
Scheduling
When a process terminates, or goes into a waiting state (while IO is waited upon), the OS must choose a new job to execute; this choice is made by the scheduler. Once the scheduler makes a decision, the process is started by a dispatcher.
Scheduling – metrics
Ways to measure scheduling performance:
- CPU utilisation – keep this as high as possible.
- Throughput – the number of processes completed per time unit. A measure of work being done.
- Turnaround time – average time for a process to finish.
- Waiting time – sum of the times processes spend waiting in the ready queue.
- Response time – how quickly a process begins executing.
Scheduling Algorithms/Techniques
These algorithms are either preemptive or non-preemptive:
- Preemptive – a process can be interrupted by other (higher-priority) tasks.
- Non-preemptive – once a process starts, it will run until it terminates.
First-come-first-served (non-preemptive)
- Jobs are executed on a first come, first served basis.
- Non-preemptive (processes do not have to give up the CPU for other processes).
- Its implementation is based on FIFO queues.
- Prevents parallel use of resources (as only one process runs at a time), therefore poor in performance, as the average waiting time is high.
Diagram next page ->
Shortest Job First (preemptive and non-preemptive)
- The order of processes is based on the CPU time they will need.
- A really good approach to minimise waiting time; however, it can cause some processes to never get CPU time if short jobs keep coming in.
- The preemptive version could be based on shortest time remaining: the processor is allocated to the job closest to completion, but it can be interrupted by a newer job with a shorter time to completion.
Round-robin Scheduling (preemptive)
- Each process is given a fixed time to execute, called a quantum.
- Once a process has executed for its given time period (quantum), it gets interrupted and goes to the back of the list, and another process executes for a given time period.
- Prevents the starvation of processes.
Multi-Level Queues Scheduling
- This is not an independent scheduling algorithm; it makes use of other existing algorithms, a mixture of round-robin and priority scheduling.
- Multiple queues/groups are maintained for processes. Each queue has a different priority. The scheduler selects the next process from the highest priority queue.
- The priority levels are organised like this (higher to lower):
  o System processes
  o Interactive processes (e.g.: the CPU waits for a keyboard input)
  o Batch processes
  o User processes
- Low priority queues may not execute if there are higher priority queues. To overcome this, we can use a multi-level feedback queue scheduler, in which processes can be moved between priority queues. E.g. a process gets lower priority if it uses a lot of CPU time, and a process gets higher priority if it has been waiting a long time.
NOTES
Turnaround time = completion time - arrival time
Wait time = turnaround time - execution time
Example (First Come First Served):
Average TT = (24 + 24 + 30) / 3 = 26
Average WT = (0 + 21 + 27) / 3 = 16
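The worked example above can be reproduced with a short FCFS sketch. The arrival times (0, 3, 0) and burst times (24, 3, 3) are inferred from the averages in the notes, with jobs served in submission order:

```python
def fcfs(arrivals, bursts):
    """First-come-first-served: return (turnaround, waiting) times per job."""
    t, turnaround, waiting = 0, [], []
    for arrive, burst in zip(arrivals, bursts):
        t = max(t, arrive) + burst          # job runs to completion, no preemption
        turnaround.append(t - arrive)       # completion time - arrival time
        waiting.append(t - arrive - burst)  # turnaround time - execution time
    return turnaround, waiting

tt, wt = fcfs([0, 3, 0], [24, 3, 3])
print(sum(tt) / 3)   # 26.0  (average turnaround time)
print(sum(wt) / 3)   # 16.0  (average waiting time)
```

Note how the long first job (burst 24) inflates everyone's waiting time — the FCFS weakness the notes mention.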
Synchronisation
Recap
Previously, we've seen how to create and manipulate processes to perform basic tasks. In this chapter, we look at solutions to the problems of multi-process applications: communication and synchronisation.
- How do we deal with multiple processes using the same resource at the same time?
- How do we prevent processes from getting in each other's way?
- How do we transfer information safely between processes?
SYNCHRONISATION
Synchronisation
Means sharing resources between processes in such a way that simultaneous access to shared data is handled, thus minimising the chance of inconsistent data. These are basically techniques to manage execution among processes, ie. one process may have to wait for another. Shared resources (the critical section) may require limited access. Example: a printer is a shared resource which multiple processes can access, but they can't access it at the same time. If they try, a deadlock can occur.
A simple solution… Flags
Each process maintains a flag indicating that it wants to enter the critical section. It checks the flag of the other process and does not enter the critical section if that other process wants to get in.
Critical Section
The region of code in which a process accesses a shared resource. At a given point in time, only one process must be executing in the critical section; if any other process wants to execute too, it must wait its turn until the first one finishes. A process must have mutual exclusion when entering the critical section. This basically means that no more than one process can execute in the critical section at one time.
Note: 'turn' in the above image is a flag.
Peterson's Algorithm
- This algorithm is used for mutual exclusion and allows two processes to share resources without conflict.
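The scheme can be sketched in Python, with threads standing in for processes (a busy-waiting demonstration; the per-process interest flags and shared turn variable follow the description in these notes, and N = 1000 iterations is an arbitrary choice):

```python
import sys
import threading

sys.setswitchinterval(1e-4)   # switch threads often so the busy-wait resolves quickly

flag = [False, False]   # flag[i]: process i wants to enter the critical section
turn = 0                # which process must wait when both are interested
counter = 0             # shared data protected by the critical section
N = 1000

def worker(me):
    global turn, counter
    other = 1 - me
    for _ in range(N):
        flag[me] = True                      # announce interest
        turn = other                         # politely give priority to the other
        while flag[other] and turn == other:
            pass                             # busy-wait (spin) until it's safe
        counter += 1                         # critical section
        flag[me] = False                     # leave the critical section

threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)   # 2000 - no increments were lost
```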
- This algorithm uses 2 variables: flag indicates that a process wants to enter the critical section; turn grants entrance to the critical section.
Synchronisation Hardware
Many systems provide hardware support for the critical section code. The problem could easily be solved by preventing interrupts from occurring while the shared resource is being modified. However, this solution is not possible in multiprocessor environments. As a result, a software approach called mutex locks was introduced: when entering a critical section, a LOCK is acquired over the critical resources to prevent access by other processes, and when exiting, the LOCK is released.
Semaphores
This is a synchronisation method and can be viewed as an extension of mutex locks. A semaphore is a shared integer variable, int S, that is accessed and modified only through two standard operations, wait() and signal(). The variable keeps the semaphore state. These operations are atomic, meaning once the activity of wait() starts, it will continue to the end without any interruption. Similarly, the same applies to signal().
Initially the counter variable S is set to 1. Suppose a number of processes try to execute wait(): because only one process can execute wait() at a time, Process A causes the counter to decrease by 1 (making S=0) and enters the critical section; as a result, all other processes executing wait() will be blocked (similar to lock). When process A exits the critical section, it executes signal(). If there are any other waiting processes, one of them will be released and enters the critical section. Note: the counter is still 0; signal does not increase S as there are still processes in the list, and wait does not decrement S as S is already 0. This means all processes executing wait() are blocked. On the other hand, if there are no waiting processes in the list, signal() increments S to 1. In this
case, the next process to execute wait() can enter the critical section. Therefore, signal is similar to unlock. As the values of S are only 1 or 0, this is known as a binary semaphore. Pseudocode:
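A sketch of the wait()/signal() pair described above, built on a condition variable; this is a simplified illustrative reconstruction (a real semaphore would be provided by the OS, and here a blocked waiter simply re-claims S when woken):

```python
import threading

class BinarySemaphore:
    """S starts at 1; wait() takes it to 0 or blocks, signal() releases one waiter."""
    def __init__(self):
        self.S = 1
        self._cond = threading.Condition()   # makes wait()/signal() atomic

    def wait(self):
        with self._cond:
            while self.S == 0:     # someone is in the critical section
                self._cond.wait()  # block until signalled
            self.S = 0             # claim the semaphore and enter

    def signal(self):
        with self._cond:
            self.S = 1             # release the semaphore
            self._cond.notify()    # wake one blocked wait(), if any

sem = BinarySemaphore()
sem.wait()      # enter the critical section (S: 1 -> 0)
# ... critical section ...
sem.signal()    # leave it (S: 0 -> 1)
print(sem.S)    # 1
```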
IPC & Synchronisation
Inter-process communication (IPC) is a set of techniques used to allow processes to communicate and exchange data with one another. This enables several processes to share the same data without interference.
The ftok() Function
The ftok() function creates an IPC key from a file pathname and an identifier passed as parameters:
key = ftok("/salman/filename", 'b');
The reason for this is that you want two or more independent processes to have access to the same IPC resource. If two processes call ftok() with the same arguments, both obtain the same key and can therefore access the same resource (semaphore, shared memory or message queue).
Signals
Signals are a simple way to communicate between processes by indicating various conditions (e.g. KILL). A signal can be sent to any process as long as its ID is known. When a process receives a signal, a default action happens (usually the process exits), unless the process has arranged its own signal handler. Examples of signals:
- SIGHUP = hang up the process
- SIGQUIT = quit the process
- SIGKILL = kill the process
- SIGCONT = continue a stopped process
- SIGABRT, SIGALRM, SIGCHLD ...
A callback is what catches these signals. Callbacks are similar to interrupts: basically a piece of code attached to some event.
Shared Memory
This is an implementation of IPC where a memory section is shared between different processes. Meaning, process A writes to the memory and process B can read from it, or vice versa. The downside is that it is difficult to manage changes to the shared memory area.
Any reading/writing needs to be mutually exclusive. So, we create a critical section and protect it with semaphores. We could keep a counter in the shared memory segment so we know how much data has been written. Figure below.
A second option to buffer between processes would be:
- The consumer can read some of the available data.
- When the counter reaches the end of the buffer, it rolls back to the start.
- This ensures the producer doesn't overwrite existing data and the consumer doesn't go beyond the producer.
This is called a circular/ring buffer.
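The circular/ring buffer described above can be sketched in Python (a toy in-process model; a real shared-memory buffer would also need the semaphore protection noted earlier, and the class and method names here are invented for illustration):

```python
class RingBuffer:
    """Toy circular buffer: the producer may not overwrite unread data,
    and the consumer may not read past the producer."""

    def __init__(self, size):
        self.buf = [None] * size
        self.head = 0    # next slot the producer writes
        self.tail = 0    # next slot the consumer reads
        self.count = 0   # how much unread data is in the buffer

    def put(self, item):
        if self.count == len(self.buf):
            return False                      # full: don't overwrite
        self.buf[self.head] = item
        self.head = (self.head + 1) % len(self.buf)  # roll back to start
        self.count += 1
        return True

    def get(self):
        if self.count == 0:
            return None                       # empty: don't pass producer
        item = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        self.count -= 1
        return item

rb = RingBuffer(3)
for x in (1, 2, 3):
    rb.put(x)
assert rb.put(4) is False   # buffer full, producer must wait
assert rb.get() == 1        # FIFO order preserved
assert rb.put(4) is True    # a slot was freed, index wrapped around
```

The modulo arithmetic is what makes the buffer "circular": the head and tail indices wrap back to slot 0 instead of running off the end.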
IPC Pipes and Sockets
Recap
In the last chapter, we looked at two ways in which processes can share data: shared memory and signals.
- Shared memory allows direct sharing of primary storage and must be guarded by a mutually exclusive critical section.
- Signals remove the need to worry about data consistency, but can only communicate a limited amount of information.
In this section, we will look at two more methods of inter-process communication: pipes and sockets. Both methods remove the need for a critical section, but do not limit communication to simple events like signals.
Pipes
Simply put, a pipe is a method of connecting the output of one process to the input of another. Pipes provide a method of one-way communication between processes. A pipe has two ends: P2 inserts data into one end and P1 reads that data from the other end (a FIFO data structure). Example:
ls | grep tcp <= list files and search for 'tcp'
This command takes the output of the ls command as the input of the grep command ( | = pipe).
The above describes an unnamed pipe. Unnamed pipes are unidirectional by nature; hence, to enable two-way communication, two separate pipes need to be created. Once data is written into the pipe, the system holds the piped information until it is read by the receiving process.
FIFOs or Named Pipes are similar to pipes, but they are represented by files in a filesystem. Processes can read and write to them as if they were actual files, and they exist until explicitly deleted. With this method, any process can access the pipe if it knows the name and has the right privileges.
We can use the piping method within our programs by first creating the pipe and then forking, so that both parent and child have access to it. A basic difference between pipes and shared memory is that when we fork, we share a file descriptor. Using piping, we eliminate the problems of data consistency, as we can only access the pipe through file operations.
Note: Having read and write in both processes doesn't make it a full-duplex pipe. It is actually half duplex, with complications. To have full-duplex communication, you need to close read on one end and write on the other, then create a second pipe with the closing reversed.
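The pipe-then-fork pattern described above can be sketched with Python's POSIX wrappers (a minimal sketch, assuming a Unix-like system; the message text is arbitrary). Note how each process closes the end it does not use, which is exactly the half-duplex discipline the note describes:

```python
import os

# Create the pipe first, then fork: after fork() both processes hold
# both file descriptors, so each closes the end it won't use.
r, w = os.pipe()
pid = os.fork()
if pid == 0:                          # child: the writer
    os.close(r)                       # close the unused read end
    os.write(w, b"hello via pipe")
    os.close(w)
    os._exit(0)                       # exit child without cleanup handlers
else:                                 # parent: the reader
    os.close(w)                       # close the unused write end
    data = os.read(r, 1024)
    os.close(r)
    os.waitpid(pid, 0)                # reap the child
    assert data == b"hello via pipe"
```

Because the kernel buffers the piped bytes until they are read, the parent sees the child's message even though the two processes share no memory.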
Sockets
A socket is a bidirectional communication device that can be used to communicate with another process running on the same or a different machine. Sockets work like a two-way pipe; however, all communication takes place through the socket's network interface instead of a file interface. Programming a socket requires a client/server model, whereby the client and server establish a dedicated connection over which they can communicate. A TCP server may serve several clients simultaneously by creating a child process for each client and establishing a TCP connection with each. A unique dedicated socket is created for each connection.
Client vs Server:
- The client requires a service; the server provides it.
- The client normally ends after using the server a finite number of times; the server accepts requests and provides responses indefinitely.
- The client obtains a random, unused port for communication; the server listens for requests on a well-known, reserved port.
- The client is "sometimes on"; the server is "always on".
- The client initiates a request to the server when interested; the server handles service requests from many client hosts.
- The client doesn't communicate directly with other clients; the server doesn't initiate contact with clients.
- The client needs to know the server's address; the server needs a fixed, known address.
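The client/server interaction above can be sketched with Python's socket module (a minimal localhost echo exchange; the port choice, message and function name are illustrative):

```python
import socket
import threading

def serve_one(srv):
    """Accept a single client on a dedicated socket and echo its request."""
    conn, _addr = srv.accept()        # unique dedicated socket per client
    with conn:
        conn.sendall(conn.recv(1024)) # provide the service: echo back

# Server side: bind to a known address and listen for requests.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))            # port 0: kernel assigns a free port
srv.listen(1)
t = threading.Thread(target=serve_one, args=(srv,))
t.start()

# Client side: must know the server's address; its own port is random.
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
cli.sendall(b"ping")                  # initiate a request
reply = cli.recv(1024)
cli.close()
t.join()
srv.close()
assert reply == b"ping"
```

A real server would loop on accept() and fork or spawn a handler per client, as the note describes; here one thread stands in for that child process.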
The diagram above shows a connection-oriented socket, where authentication is required. A connectionless socket is slightly different: the server does not need to authenticate the client's connection request, so the 'listen for client' and 'accept connection' steps are eliminated.
IPC Threads
A thread is, simply, a part of a process that is created by the process itself and is required for performing a task.
A ‘process’ is your whole program, whereas a ‘thread’ is an independent task within your program.
Think of it this way: your process is a business. For a business to work, it needs at least one person working there; that's a thread. If you want your business to do more things at the same time, you add more workers (threads), who can use anything the business owns (memory/resources). You can start another business (process), but workers (threads) can only work inside their own business, and the two businesses cannot share the same location (memory space). Basically, threads running within the same process can access each other's memory, but no memory outside of that process.
Multithreading allows a single process to run several threads (parts of the program) at the same time.
Each thread within the process has its own program counter (to keep track of the instructions to execute), stack (which holds the execution history) and set of registers; however, threads share common code, data and files.
Benefits of Multithreading
- Resource sharing – threads share many system resources (e.g. files, memory, code).
- Responsiveness – other threads may still be running if one thread is blocked.
- Economy – creating and managing threads is much faster than processes.
- Utilization of multiprocessors – multiple processors can run multiple threads at the same time.
Threads and Processes
- Processes are independent, while threads exist as a subset of a process.
- A thread takes less time to create than a process.
- A thread takes less time to terminate than a process.
- Switching between threads takes less time than switching between processes.
- Threads enhance efficiency in communication between programs, as resources are shared.
Why use threads?
- Applications can avoid per-process overheads (resources required).
- Threads have full access to the address space.
- Threads can execute in parallel on multiprocessors.
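Multithreading can be sketched with Python's threading module (a minimal illustration; the worker function and its squaring task are arbitrary). Note how all threads write into the same list, because they share the process's memory:

```python
import threading

results = [0] * 4   # shared memory: visible to every thread in the process

def worker(i):
    # Each thread runs independently but can touch shared data directly.
    results[i] = i * i

# Create and start one thread per task, then wait for all to finish.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert results == [0, 1, 4, 9]
```

Spawning a thread here is much cheaper than forking a whole process for each task, which is the "economy" benefit listed above.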
Types of Threading
Threads managed by the operating system are called kernel-level threads or lightweight processes. These threads suffer from too much overhead; to make threads cheap and fast, they need to be implemented at user level. Two types:
1. User level – these are what application programmers use in their programs. They sit above the kernel and don't require kernel support.
Advantages:
- Threads are fast and easy to create and manage.
- Threads do not require kernel privileges.
- Threads can run on any operating system.
- Some implementations use timed switching.
Disadvantages:
- Timed switching requires more overhead.
- If a thread blocks (e.g. waiting for IO), the whole process blocks.
- Can't make use of multiprocessor architectures.
2. Kernel level – these are managed by the operating system and are supported within the kernel.
Advantages:
- Useful for multiprocessor architectures.
- If one thread is blocked, the kernel can schedule another thread of the same process.
Disadvantages:
- Generally slower to create and manage than user threads.
- Transfers between threads require more switching (user → kernel → user) within process quanta.
- Threads require kernel privileges.
Differences
- User-level threads are faster and easier to create and manage; kernel-level threads are slower and harder.
- User-level threads are implemented by a thread library at user level; kernel-level threads are created with OS support.
- User-level threads can run on any OS; kernel-level threads are specific to the OS.
- User-level threads cannot take advantage of multiprocessing; kernel-level threads can use multiprocessor architectures.
Multithreading Models
Some operating systems provide a combined user-level and kernel-level thread facility. In this way, multiple threads within a process can run in parallel on multiple processors, and a blocking system call will not block the entire process. There are three multithreading models:
- Many-to-one relationship
- One-to-one relationship
- Many-to-many relationship
Many-to-one relationship
Many user threads are mapped to one kernel thread. Typically used on systems that do not support kernel threads.
+ Creating and managing threads is cheap.
- If a thread issues a blocking system call (e.g. an IO request), all other threads are blocked.
One-to-one relationship
One user thread mapped to only one kernel thread
+ Threads do not block other threads.
- Overhead, as there is a limit on the number of threads that can be created; every user thread requires a corresponding kernel thread.
Many-to-many relationship
Many user threads are mapped to many (equally many or fewer) kernel threads.
+ Not all threads are blocked by blocking system calls.
- Difficult to implement.
Threading Issues
- If one thread calls fork(), does it duplicate only that thread or all the threads?
- Signal handling: signals are used to notify a process that a particular event has occurred. Which thread receives the signal? It can be delivered to all threads, or to a single thread.
- Thread pooling: create a number of threads at start-up and place them in a 'pool' where they wait for work. It is usually faster to assign work to an existing thread than to create a new one.
- Introducing a mutex in threading helps prevent data inconsistencies due to multiple threads operating on the same memory area.
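The mutex point above can be sketched with Python's threading.Lock (a minimal illustration; the thread count and iteration count are arbitrary). Each increment is a read-modify-write on shared memory, so without mutual exclusion updates could be lost:

```python
import threading

counter = 0
lock = threading.Lock()   # the mutex guarding the shared counter

def increment(n):
    global counter
    for _ in range(n):
        with lock:        # critical section: only one thread at a time
            counter += 1  # read-modify-write on shared memory

threads = [threading.Thread(target=increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 40_000  # with the lock, no increment is lost
```

Removing the `with lock:` line would make the final count nondeterministic on a truly concurrent interpreter, which is exactly the data inconsistency the mutex prevents.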
OS Security
The processes in an OS must be protected from one another's activities. Protection refers to a mechanism for controlling the access of programs, processes and users to system resources.
The term protection/security follows the CIA model:
Confidentiality – only the owner should be able to specify who can see/modify the data.
Integrity – unauthorised users should not be able to modify any data.
Availability – nobody should be able to disturb the system to make it unusable.
An example of a security threat would be an attempt to disrupt connections between two machines, thereby preventing access to a service, or flooding a network to prevent traffic … Protection Domains
A domain is a set of (object, permissions) pairs. Each pair specifies an object and some operations that can be performed on it.
A protection matrix (or access control matrix) visualises protection as a matrix. Rows represent domains, columns represent objects, and each entry consists of a set of rights: read, write, execute. A domain in these examples could be a username or a group of users.
Access control lists specify the list of domains that are authorised to access a specific object. These are the columns of the matrix – one list per object.
Capability lists specify the access rights a certain domain possesses for specific objects. These are the rows of the matrix – one list per domain/user.
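The matrix/ACL/capability relationship can be sketched in Python (the domains, objects and rights here are invented for illustration): an ACL is one column of the matrix, a capability list is one row.

```python
# A tiny access-control matrix: keys are (domain, object) pairs,
# values are the sets of rights in that matrix entry.
matrix = {
    ("alice", "file1"): {"read", "write"},
    ("alice", "file2"): {"read"},
    ("bob",   "file1"): {"read"},
}

def acl(obj):
    """Access control list: one column -> which domains may use this object."""
    return {d: rights for (d, o), rights in matrix.items() if o == obj}

def capabilities(domain):
    """Capability list: one row -> which objects this domain may use."""
    return {o: rights for (d, o), rights in matrix.items() if d == domain}

assert acl("file1") == {"alice": {"read", "write"}, "bob": {"read"}}
assert capabilities("alice") == {"file1": {"read", "write"},
                                 "file2": {"read"}}
```

Storing only the populated entries, as this dictionary does, reflects why real systems use ACLs or capability lists rather than the full matrix: the matrix is mostly empty.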
Authentication
Authentication refers to identifying each user of the system and linking the executing processes with those users. Authentication consists of two steps:
1. Identification – presenting an identifier to the security system.
2. Verification – providing authentication information that verifies the binding between the entity and the identifier.
Four ways to authenticate a user:
1. "Something the person knows" – password, PIN.
2. "Something the person has" – smartcard, electronic keycard.
3. "Something the person is" – fingerprint, retina, face.
4. "Something the person does" – voice pattern, handwriting.
Passwords
Ways passwords can be stored:
- Clear text in the filing system
- Encrypted: password + encryption
- Hashed: password + hash function
- Salted hash: password + salt + hash function
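The salted-hash scheme above can be sketched with Python's hashlib (illustrative only; the function names are invented, and production systems would use a deliberately slow function such as bcrypt or Argon2 rather than plain SHA-256):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Store pass + salt + hash: the salt is random per user."""
    if salt is None:
        salt = os.urandom(16)            # fresh random salt
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt, digest                  # both are stored; salt isn't secret

def verify(password, salt, digest):
    # Re-hash the attempt with the stored salt and compare digests.
    return hmac.compare_digest(hash_password(password, salt)[1], digest)

salt, digest = hash_password("hunter2")
assert verify("hunter2", salt, digest)       # correct password accepted
assert not verify("letmein", salt, digest)   # wrong password rejected

# Two users with the same password get different stored digests,
# because their random salts differ (with overwhelming probability).
_salt2, digest2 = hash_password("hunter2")
assert digest2 != digest
```

The salt defeats precomputed "rainbow table" attacks: an attacker must attack each stored hash individually.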
Passwords in UNIX
This diagram shows the way UNIX stores a password. When verifying a password, a similar process occurs: a user ID and password are input and compared to the hashed password generated by the salt and hash function.
Cryptography
Cryptography is a method of storing and communicating data in a form that only those it is intended for can read or process. It is the science of protecting information by encoding it into an unreadable format. A cryptographic algorithm (or cipher) is the mathematical function embodied in the encryption and decryption functions. The algorithm has two inputs: data and a key. The key is known only to 'authorized' users.
Encryption: the process of encoding text to make it unreadable in transit.
Decryption: the process of transforming it back into the original message.
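Encryption and decryption can be illustrated with a toy symmetric XOR cipher (insecure, purely for illustration; the key and plaintext are arbitrary). Because XOR is its own inverse, the same function with the same key performs both operations:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # Toy symmetric cipher: XOR each byte with a repeating key.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"attack at dawn"
ciphertext = xor_cipher(plaintext, b"k3y")        # encrypt
assert ciphertext != plaintext                    # unreadable in transit
assert xor_cipher(ciphertext, b"k3y") == plaintext  # same key decrypts
```

This captures the structure of the definition above (algorithm + data + key), while a real cipher such as AES replaces the trivially breakable XOR step with a cryptographically strong transformation.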
Key encryption
A public-key cryptographic system uses two keys – a public key known to everyone and a private key known only to the recipient of the message. When John wants to send a secure message to Jane, he uses Jane's public key to encrypt the message; Jane then uses her private key to decrypt it. Summarising:
1. The private key is kept secret, and the public key is shared with others.
2. If a message is encrypted using one key (public or private), the other corresponding key must be used to decrypt the message.
Example: SSL
Secure Socket Layer (SSL) is a cryptographic protocol that enables two computers to communicate securely. The authentication technique it uses is public-key cryptography.
- When mutual authentication is carried out, at least one public key is needed.
- The client sends the server its SSL information and vice versa.
- The client then generates a 'pre-master secret' and sends it encrypted with the server's public key.
- The parties now have a chosen algorithm, their public & private keys and two shared keys.
- Encrypted communication then occurs between the two.
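The public/private key relationship summarised above can be illustrated with a textbook RSA keypair built from tiny primes (hopelessly insecure, purely illustrative; requires Python 3.8+ for the modular-inverse form of pow):

```python
# Textbook RSA with tiny primes -- never use numbers this small for real.
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)    # Euler's totient of n
e = 17                     # public exponent (coprime with phi)
d = pow(e, -1, phi)        # private exponent: modular inverse of e

def apply_key(m, key, n):
    # The same modular exponentiation serves encryption and decryption.
    return pow(m, key, n)

message = 42                               # must be < n
cipher = apply_key(message, e, n)          # encrypt with the PUBLIC key
assert cipher != message                   # unreadable without d
assert apply_key(cipher, d, n) == message  # decrypt with the PRIVATE key

# The keys are interchangeable, as point 2 above says:
# encrypting with d (a signature) is undone by e.
assert apply_key(apply_key(message, d, n), e, n) == message
```

The last assertion is the basis of digital signatures: anyone holding the public key can verify what only the private-key holder could have produced.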
Defensive Programming