CS2304 – SYSTEM SOFTWARE

UNIT I INTRODUCTION 8
System software and machine architecture – The Simplified Instructional Computer (SIC) - Machine architecture - Data and instruction formats - Addressing modes - Instruction sets - I/O and programming.

UNIT II ASSEMBLERS 10
Basic assembler functions - A simple SIC assembler – Assembler algorithm and data structures - Machine dependent assembler features - Instruction formats and addressing modes – Program relocation - Machine independent assembler features - Literals – Symbol-defining statements – Expressions - One pass assemblers and Multi pass assemblers - Implementation example - MASM assembler.

UNIT III LOADERS AND LINKERS 9
Basic loader functions - Design of an Absolute Loader – A Simple Bootstrap Loader - Machine dependent loader features - Relocation – Program Linking – Algorithm and Data Structures for Linking Loader - Machine-independent loader features - Automatic Library Search – Loader Options - Loader design options - Linkage Editors – Dynamic Linking – Bootstrap Loaders - Implementation example - MSDOS linker.

UNIT IV MACRO PROCESSORS 9
Basic macro processor functions - Macro Definition and Expansion – Macro Processor Algorithm and data structures - Machine-independent macro processor features - Concatenation of Macro Parameters – Generation of Unique Labels – Conditional Macro Expansion – Keyword Macro Parameters - Macro within Macro - Implementation example - MASM Macro Processor – ANSI C Macro language.

UNIT V SYSTEM SOFTWARE TOOLS 9
Text editors - Overview of the Editing Process - User Interface – Editor Structure. Interactive debugging systems - Debugging functions and capabilities – Relationship with other parts of the system – User-Interface Criteria.

TEXT BOOK
1. Leland L. Beck, "System Software – An Introduction to Systems Programming", 3rd Edition, Pearson Education Asia, 2006.

REFERENCES
1. D. M. Dhamdhere, "Systems Programming and Operating Systems", Second Revised Edition, Tata McGraw-Hill, 2000.
2. John J. Donovan, "Systems Programming", Tata McGraw-Hill Edition, 2000.
UNIT I INTRODUCTION TO SYSTEM SOFTWARE AND MACHINE STRUCTURE

1.1 SYSTEM SOFTWARE
• System software consists of a variety of programs that support the operation of a computer.
• It is a set of programs that perform system functions such as file editing, resource management, I/O management and storage management.
• The characteristic in which system software differs from application software is machine dependency.
• An application program is primarily concerned with the solution of some problem, using the computer as a tool.
• System programs, on the other hand, are intended to support the operation and use of the computer itself, rather than any particular application.
• For this reason, they are usually related to the architecture of the machine on which they run.
• For example, assemblers translate mnemonic instructions into machine code, so the instruction formats and addressing modes are of direct concern in assembler design.
• Some aspects of system software do not directly depend on the type of computing system being supported. These are known as machine-independent features.
• For example, the general design and logic of an assembler is basically the same on most computers.
TYPES OF SYSTEM SOFTWARE:
1. Operating system
2. Language translators
   a. Compilers
   b. Interpreters
   c. Assemblers
   d. Preprocessors
3. Loaders

OPERATING SYSTEM
• The operating system (OS) is the most important system program. It acts as an interface between the users and the system and makes the computer easier to use.
• It provides an interface that is more user-friendly than the underlying hardware.
• The functions of the OS are:
  1. Process management
  2. Memory management
  3. Resource management
LANGUAGE TRANSLATORS
A language translator is a program that takes an input program in one language and produces an output program in another language.

[Figure: Source Program -> Language Translator -> Object Program]
Compiler
• A compiler is a language program that translates programs written in any high-level language into an equivalent machine language program.
• It bridges the semantic gap between the programming language domain and the execution domain.
• Two aspects of compilation are:
  o Generate code to implement the meaning of a source program in the execution domain.
  o Provide diagnostics for violations of programming language semantics in a source program.
• The program instructions are taken as a whole.

[Figure: High-level language program -> Compiler -> Machine language program]
Interpreter:
• It is a translator program that translates a statement of a high-level language to machine language and executes it immediately. The program instructions are taken line by line.
• The interpreter reads the source program and stores it in memory.
• During interpretation, it takes a source statement, determines its meaning and performs the actions that implement it. This includes computational and I/O actions.
• The program counter (PC) indicates which statement of the source program is to be interpreted next. This statement is subjected to the interpretation cycle.
• The interpretation cycle consists of the following steps:
  o Fetch the statement.
  o Analyze the statement and determine its meaning.
  o Execute the meaning of the statement.
• The following are the characteristics of interpretation:
  o The source program is retained in the source form itself; no target program exists.
  o A statement is analyzed during the interpretation.

[Figure: Interpreter with its program counter, operating on the source program held in memory]
Assembler:
• Programmers found it difficult to write or read programs in machine language.
• In a quest for a more convenient language, they began to use a mnemonic (symbol) for each machine instruction, which would subsequently be translated into machine language.
• Such a mnemonic language is called assembly language. Programs known as assemblers are written to automate the translation of assembly language into machine language.

[Figure: Assembly language program -> Assembler -> Machine language program]

Fundamental functions:
1. Translating mnemonic operation codes to their machine language equivalents.
2. Assigning machine addresses to the symbolic labels used by the programmers.
1.2 THE SIMPLIFIED INSTRUCTIONAL COMPUTER (SIC)
SIC is similar to a typical microcomputer. It comes in two versions:
• The standard model
• The XE version

SIC Machine Structure:

Memory:
• Memory consists of bytes (8 bits) and words (24 bits, formed by 3 consecutive bytes). A word is addressed by the location of its lowest-numbered byte.
• There are 32,768 (2^15) bytes of memory in total.

Registers:
There are 5 registers, namely
1. Accumulator (A)
2. Index Register (X)
3. Linkage Register (L)
4. Program Counter (PC)
5. Status Word (SW)

Data formats:
• Integers are stored as 24-bit binary numbers; 2's complement representation is used for negative values.
• Characters are stored using their 8-bit ASCII codes.
• The standard SIC does not support floating-point data items.
Instruction format:
All machine instructions are 24 bits wide:

  Opcode (8) | X (1) | Address (15)

• X is the flag bit used to indicate indexed-addressing mode.

Addressing modes:
• Two types of addressing are available, namely
  1. Direct addressing mode
  2. Indexed addressing mode

  Mode      Indication   Target address calculation
  Direct    x = 0        TA = address
  Indexed   x = 1        TA = address + (X)

• Here (X) represents the contents of the index register X.
Instruction set:
It includes instructions such as:
1. Data movement instructions. Ex: LDA, LDX, STA, STX.
2. Arithmetic instructions. Ex: ADD, SUB, MUL, DIV. These involve register A and a word in memory, with the result being left in the register.
3. Branching instructions. Ex: JLT, JEQ, JGT (conditional jumps) and JSUB, RSUB (subroutine linkage).

Input and Output:
• I/O is performed by transferring one byte at a time to or from the rightmost 8 bits of register A.
• Each device is assigned a unique 8-bit code.
• There are 3 I/O instructions (TD, RD, WD), used as follows:
  1) The Test Device (TD) instruction tests whether the addressed device is ready to send or receive a byte of data.
  2) A program must wait until the device is ready, and then execute a Read Data (RD) or Write Data (WD).
  3) The sequence must be repeated for each byte of data to be read or written.
1.3 SIC/XE

Memory:
• 1 word = 24 bits (three 8-bit bytes)
• Total (SIC/XE) = 2^20 (1,048,576) bytes (1 Mbyte)

Registers:
• 9 registers, each 24 bits wide except F (48 bits); register number 7 is not used

  Mnemonic   Number   Purpose
  A          0        Accumulator
  X          1        Index register
  L          2        Linkage register (JSUB/RSUB)
  B          3        Base register
  S          4        General register
  T          5        General register
  F          6        Floating Point Accumulator (48 bits)
  PC         8        Program Counter (PC)
  SW         9        Status Word (includes Condition Code, CC)
Data Formats:
• Integers are stored in 24-bit, 2's complement format.
• Characters are stored in 8-bit ASCII format.
• Floating point is stored in a 48-bit sign-exponent-fraction format:

  s {1} | exponent {11} | fraction {36}

• The fraction is represented as a 36-bit number and has a value between 0 and 1.
• The exponent is represented as an 11-bit unsigned binary number between 0 and 2047.
• The sign of the floating point number is indicated by s: 0 = positive, 1 = negative.
• Therefore, the absolute value of the floating point number is: f * 2^(e-1024)
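The value formula above can be checked with a small decoder. This is only a sketch: the function name `sicxe_float_value` and the choice to pass the 48-bit word as a Python integer are assumptions made for illustration, not part of the SIC/XE definition.

```python
def sicxe_float_value(bits: int) -> float:
    """Decode a 48-bit SIC/XE floating-point word given as an integer."""
    sign = (bits >> 47) & 0x1              # s: 1 bit
    exponent = (bits >> 36) & 0x7FF        # e: 11 bits, unsigned, 0..2047
    fraction = bits & ((1 << 36) - 1)      # 36 fraction bits
    f = fraction / (1 << 36)               # interpreted as 0 <= f < 1
    value = f * 2.0 ** (exponent - 1024)   # f * 2^(e-1024)
    return -value if sign else value


# f = 0.5 (only the high-order fraction bit set) and e = 1025
# give 0.5 * 2^1.
one = (1025 << 36) | (1 << 35)
minus_one = one | (1 << 47)
```

Setting the sign bit on the same word negates the result, which matches the s = 1 case described above.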
Instruction Formats:
• There are 4 different instruction formats available (field widths in bits are shown in braces):

  Format 1 (1 byte):   op {8}
  Format 2 (2 bytes):  op {8} | r1 {4} | r2 {4}
  Format 3 (3 bytes):  op {6} | n i x b p e | displacement {12}
  Format 4 (4 bytes):  op {6} | n i x b p e | address {20}
Formats 3 & 4 introduce addressing mode flag bits:
• Flags n & i:
  n=0 & i=1   Immediate addressing - TA is used as the operand value (no memory reference)
  n=1 & i=0   Indirect addressing - the word at TA (in memory) is fetched and used as the address from which the operand is fetched
  n=0 & i=0   Simple addressing - TA is the location of the operand
  n=1 & i=1   Simple addressing - same as n=0 & i=0
• Flag x:
  x=1   Indexed addressing - add the contents of the X register to the TA calculation
• Flags b & p (Format 3 only):
  b=0 & p=0   Direct addressing - the displacement/address field contains TA (Format 4 always uses direct addressing)
  b=0 & p=1   PC relative addressing - TA = (PC) + disp  (-2048 <= disp <= 2047)
  b=1 & p=0   Base relative addressing - TA = (B) + disp  (0 <= disp <= 4095)
• Flag e:
  e=0   use Format 3
  e=1   use Format 4
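The flag rules above can be sketched as a single target-address calculation. The function `target_address` and its parameter names are hypothetical, chosen only for this illustration; the register contents are passed in as plain integers. The sample values (displacement 02D with PC = 0003, displacement FEC with PC = 001A) are illustrative.

```python
def target_address(field: int, x: int, b: int, p: int, e: int,
                   pc: int = 0, base: int = 0, xreg: int = 0) -> int:
    """Compute TA for a Format 3/4 instruction from its flag bits.

    field is the 12-bit displacement (e=0) or the 20-bit address (e=1).
    """
    if e == 1:
        ta = field                        # Format 4: direct addressing
    elif b == 0 and p == 1:
        # PC relative: the displacement is a 12-bit signed number
        disp = field - 0x1000 if field & 0x800 else field
        ta = pc + disp
    elif b == 1 and p == 0:
        ta = base + field                 # base relative: unsigned disp
    elif b == 0 and p == 0:
        ta = field                        # direct addressing
    else:
        raise ValueError("b=1 and p=1 is not a valid combination")
    if x == 1:
        ta += xreg                        # indexed addressing
    return ta & 0xFFFFF                   # addresses are 20 bits wide


# Forward reference: disp = 02D, PC = 0003 after the fetch
forward = target_address(0x02D, x=0, b=0, p=1, e=0, pc=0x0003)
# Backward reference: disp = FEC (that is, -20), PC = 001A
backward = target_address(0xFEC, x=0, b=0, p=1, e=0, pc=0x001A)
```

Whether TA is then used as a value, an operand address, or the address of an address is decided separately by the n and i flags.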
Instructions:
SIC provides 26 instructions; SIC/XE provides an additional 33 instructions (59 in total).
SIC/XE has 8 categories of instructions:
• Load/store registers (LDA, LDX, LDCH, STA, STX, STCH, etc.)
• Integer arithmetic operations (ADD, SUB, MUL, DIV): these use register A and a word in memory; results are placed into register A
• Compare (COMP): compares the contents of register A with a word in memory and sets CC (Condition Code) to <, >, or =
• Conditional jumps (JLT, JEQ, JGT): jump according to the setting of CC
• Subroutine linkage (JSUB, RSUB): jump into/return from a subroutine using register L
• Input & output control (RD, WD, TD)
• Floating point arithmetic operations (ADDF, SUBF, MULF, DIVF)
• Register manipulation, operands-from-registers, and register-to-register arithmetic (RMO, COMPR, SHIFTR, SHIFTL, ADDR, SUBR, MULR, DIVR, etc.)

Input and Output (I/O):
• 2^8 (256) I/O devices may be attached; each has its own unique 8-bit address.
• 1 byte of data is transferred to/from the rightmost 8 bits of register A.
• Three I/O instructions are provided:
  o RD   Read Data from the I/O device into A
  o WD   Write Data to the I/O device from A
  o TD   Test Device: determines whether the addressed I/O device is ready to send/receive a byte of data. The CC (Condition Code) is set with the result of this test:
         <   device is ready to send/receive
         =   device isn't ready
• SIC/XE can also perform I/O through I/O channels (an I/O device may input/output data while the CPU does other work). 3 additional instructions are provided:
  o SIO   Start I/O
  o HIO   Halt I/O
  o TIO   Test I/O
1.4 SIC/XE ADDRESSING MODES

  Type        n i x b p e   Notation   Target address (TA)   Operand   Notes
  Simple      1 1 0 0 0 0   op c       disp                  (TA)      Direct-addressing instruction
              1 1 0 0 0 1   +op m      addr                  (TA)      Format 4 & direct-addressing instruction
              1 1 0 0 1 0   op m       (PC) + disp           (TA)      Assembler selects either base-relative or program-counter relative mode
              1 1 0 1 0 0   op m       (B) + disp            (TA)      Assembler selects either base-relative or program-counter relative mode
              1 1 1 0 0 0   op c,X     disp + (X)            (TA)      Direct-addressing instruction
              1 1 1 0 0 1   +op m,X    addr + (X)            (TA)      Format 4 & direct-addressing instruction
              1 1 1 0 1 0   op m,X     (PC) + disp + (X)     (TA)      Assembler selects either base-relative or program-counter relative mode
              1 1 1 1 0 0   op m,X     (B) + disp + (X)      (TA)      Assembler selects either base-relative or program-counter relative mode
              0 0 0 - - -   op m       b/p/e/disp            (TA)      Direct-addressing instruction; SIC compatible format
              0 0 1 - - -   op m,X     b/p/e/disp + (X)      (TA)      Direct-addressing instruction; SIC compatible format
  Indirect    1 0 0 0 0 0   op @c      disp                  ((TA))    Direct-addressing instruction
              1 0 0 0 0 1   +op @m     addr                  ((TA))    Format 4 & direct-addressing instruction
              1 0 0 0 1 0   op @m      (PC) + disp           ((TA))    Assembler selects either base-relative or program-counter relative mode
              1 0 0 1 0 0   op @m      (B) + disp            ((TA))    Assembler selects either base-relative or program-counter relative mode
  Immediate   0 1 0 0 0 0   op #c      disp                  TA        Direct-addressing instruction
              0 1 0 0 0 1   +op #m     addr                  TA        Format 4 & direct-addressing instruction
              0 1 0 0 1 0   op #m      (PC) + disp           TA        Assembler selects either base-relative or program-counter relative mode
              0 1 0 1 0 0   op #m      (B) + disp            TA        Assembler selects either base-relative or program-counter relative mode
UNIT II ASSEMBLERS

2.1. BASIC ASSEMBLER FUNCTIONS
Fundamental functions of an assembler:
• Translating mnemonic operation codes to their machine language equivalents.
• Assigning machine addresses to symbolic labels used by the programmer.

[Figure 2.1: Assembler language program for basic SIC version]

Indexed addressing is indicated by adding the modifier ",X" following the operand. Lines beginning with "." contain comments only. The following assembler directives are used:
• START: Specify name and starting address for the program.
• END: Indicate the end of the source program and specify the first executable instruction in the program.
• BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
• WORD: Generate a one-word integer constant.
• RESB: Reserve the indicated number of bytes for a data area.
• RESW: Reserve the indicated number of words for a data area.

The program contains a main routine that reads records from an input device (code F1) and copies them to an output device (code 05). The main routine calls two subroutines:
• RDREC – To read a record into a buffer.
• WRREC – To write the record from the buffer to the output device.

The end of each record is marked with a null character (hexadecimal 00).
2.1.1. A Simple SIC Assembler
The translation of a source program to object code requires the following functions:
1. Convert mnemonic operation codes to their machine language equivalents. Eg: Translate STL to 14 (line 10).
2. Convert symbolic operands to their equivalent machine addresses. Eg: Translate RETADR to 1033 (line 10).
3. Build the machine instructions in the proper format.
4. Convert the data constants specified in the source program into their internal machine representations. Eg: Translate EOF to 454F46 (line 80).
5. Write the object program and the assembly listing.

All functions except function 2 can be accomplished by sequential processing of the source program, one line at a time. Consider the statement

  Line   Loc    Label   Opcode   Operand   Object code
  10     1000   FIRST   STL      RETADR    141033
This instruction contains a forward reference, i.e. a reference to a label (RETADR) that is defined later in the program. The assembler is unable to process this line in a single sequential scan because the address that will be assigned to RETADR is not yet known. Hence most assemblers make two passes over the source program, where the second pass does the actual translation.

The assembler must also process statements called assembler directives or pseudo-instructions, which are not translated into machine instructions. Instead they provide instructions to the assembler itself. Examples: RESB and RESW instruct the assembler to reserve memory locations without generating data values.

The assembler must write the generated object code onto some output device. This object program will later be loaded into memory for execution.

The object program format contains three types of records:
• Header record: Contains the program name, starting address and length.
• Text record: Contains the machine code and data of the program.
• End record: Marks the end of the object program and specifies the address in the program where execution is to begin.
The record formats are as follows:

Header record:
  Col. 1       H
  Col. 2-7     Program name
  Col. 8-13    Starting address of object program (hexadecimal)
  Col. 14-19   Length of object program in bytes (hexadecimal)

Text record:
  Col. 1       T
  Col. 2-7     Starting address for object code in this record (hexadecimal)
  Col. 8-9     Length of object code in this record in bytes (hexadecimal)
  Col. 10-69   Object code, represented in hexadecimal (2 columns per byte of object code)

End record:
  Col. 1       E
  Col. 2-7     Address of first executable instruction in object program (hexadecimal)
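The column layouts above translate directly into fixed-width string formatting. The sketch below is illustrative only; the helper names `header_record`, `text_record` and `end_record` are assumptions, and the sample values are taken to be a program named COPY starting at address 1000.

```python
def header_record(name: str, start: int, length: int) -> str:
    """Col. 1 'H', Col. 2-7 name, Col. 8-13 start, Col. 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start: int, code_hex: str) -> str:
    """Col. 1 'T', Col. 2-7 start, Col. 8-9 length in bytes, Col. 10-69 code."""
    nbytes = len(code_hex) // 2          # 2 hex columns per byte
    if nbytes > 30:                      # columns 10-69 hold at most 30 bytes
        raise ValueError("a Text record holds at most 30 bytes of object code")
    return f"T{start:06X}{nbytes:02X}{code_hex}"


def end_record(first_instruction: int) -> str:
    """Col. 1 'E', Col. 2-7 address of the first executable instruction."""
    return f"E{first_instruction:06X}"


records = [
    header_record("COPY", 0x1000, 0x107A),
    text_record(0x1000, "141033482039"),
    end_record(0x1000),
]
```

Each function returns one line of the object program; a real assembler would start a new Text record whenever 30 bytes are filled or a reserved area (RESB/RESW) interrupts the object code.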
Functions of the two passes of the assembler:

Pass 1 (Define symbols)
1. Assign addresses to all statements in the program.
2. Save the addresses assigned to all labels for use in Pass 2.
3. Perform some processing of assembler directives.

Pass 2 (Assemble instructions and generate object program)
1. Assemble instructions (translating operation codes and looking up addresses).
2. Generate data values defined by BYTE, WORD etc.
3. Perform processing of assembler directives not done in Pass 1.
4. Write the object program and the assembly listing.
2.1.2. Assembler Algorithm and Data Structures
The assembler uses two major internal data structures:
1. Operation Code Table (OPTAB): Used to look up mnemonic operation codes and translate them into their machine language equivalents.
2. Symbol Table (SYMTAB): Used to store values (addresses) assigned to labels.

Location Counter (LOCCTR):
• A variable used to help in the assignment of addresses.
• It is initialized to the beginning address specified in the START statement.
• After each source statement is processed, the length of the assembled instruction or data area is added to LOCCTR.
• Whenever a label is reached in the source program, the current value of LOCCTR gives the address to be associated with that label.

Operation Code Table (OPTAB):
• Contains the mnemonic operation code and its machine language equivalent.
• Also contains information about instruction format and length.
• In Pass 1, OPTAB is used to look up and validate operation codes in the source program.
• In Pass 2, it is used to translate the operation codes to machine language.
• During Pass 2, the information in OPTAB tells which instruction format to use in assembling the instruction and any peculiarities of the object code instruction.

Symbol Table (SYMTAB):
• Includes the name and value for each label in the source program, and flags to indicate error conditions.
• During Pass 1 of the assembler, labels are entered into SYMTAB as they are encountered in the source program, along with their assigned addresses.
• During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions.
Pass 1 usually writes an intermediate file that contains each source statement together with its assigned address and error indicators. This file is used as the input to Pass 2. This copy of the source program can also be used to retain the results of certain operations that may be performed during Pass 1, such as scanning the operand field for symbols and addressing flags, so these need not be performed again during Pass 2.
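The interaction of OPTAB, SYMTAB, LOCCTR and the intermediate file can be sketched as a very small two-pass assembler for standard SIC. This is a simplified illustration, not the textbook algorithm: the tuple-based source format, the function names `pass1` and `pass2`, and the four-entry OPTAB are assumptions, and BYTE, error flags and listing output are omitted.

```python
OPTAB = {"LDA": 0x00, "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}  # subset only


def pass1(lines):
    """Assign addresses; return SYMTAB and the intermediate 'file' (a list)."""
    symtab, intermediate, locctr = {}, [], 0
    for label, op, operand in lines:
        if op == "START":
            locctr = int(operand, 16)        # LOCCTR starts at START address
            intermediate.append((locctr, label, op, operand))
            continue
        intermediate.append((locctr, label, op, operand))
        if op == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol {label}")
            symtab[label] = locctr           # label gets current LOCCTR
        if op in OPTAB or op == "WORD":
            locctr += 3
        elif op == "RESW":
            locctr += 3 * int(operand)
        elif op == "RESB":
            locctr += int(operand)
        else:
            raise ValueError(f"invalid operation code {op}")
    return symtab, intermediate


def pass2(symtab, intermediate):
    """Translate opcodes and look up operand addresses (forward refs are fine)."""
    obj = []
    for loc, _label, op, operand in intermediate:
        if op in OPTAB:
            addr = 0
            if operand:
                name, _, index = operand.partition(",")
                addr = symtab[name]
                if index == "X":
                    addr |= 0x8000           # set the x bit (indexed addressing)
            obj.append((loc, f"{OPTAB[op]:02X}{addr:04X}"))
        elif op == "WORD":
            obj.append((loc, f"{int(operand) & 0xFFFFFF:06X}"))
    return obj


source = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),   # forward reference to RETADR
    ("", "LDA", "THREE"),
    ("", "RSUB", ""),
    ("THREE", "WORD", "3"),
    ("RETADR", "RESW", "1"),
    ("", "END", "FIRST"),
]
symtab, intermediate = pass1(source)
object_code = pass2(symtab, intermediate)
```

In this made-up program RETADR ends up at address 100C, so the first instruction assembles to 14100C; the forward reference causes no difficulty because SYMTAB is complete before Pass 2 begins.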
2.2. MACHINE DEPENDENT ASSEMBLER FEATURES
Consider the design and implementation of an assembler for the SIC/XE version.

• Indirect addressing is indicated by adding the prefix @ to the operand (line 70).
• Immediate operands are denoted with the prefix # (lines 25, 55, 133).
• Instructions that refer to memory are normally assembled using either the program counter relative or the base relative mode.
• The assembler directive BASE (line 13) is used in conjunction with base relative addressing.
• The four-byte extended instruction format is specified with the prefix + added to the operation code in the source statement.
• Register-to-register instructions are used wherever possible. For example, the statement on line 150 is changed from COMP ZERO to COMPR A,S.
• Immediate and indirect addressing have also been used as much as possible.
• Register-to-register instructions are faster than the corresponding register-to-memory operations because they are shorter and do not require another memory reference.
• With immediate addressing, the operand is already present as part of the instruction and need not be fetched from anywhere.
• The use of indirect addressing often avoids the need for another instruction.
2.2.1 Instruction Formats and Addressing Modes
• SIC/XE
  o PC-relative or Base-relative addressing: op m
  o Indirect addressing: op @m
  o Immediate addressing: op #c
  o Extended format: +op m
  o Index addressing: op m,X
  o Register-to-register instructions
  o Larger memory -> multi-programming (program allocation)

Translation
• Register translation
  o Register names (A, X, L, B, S, T, F, PC, SW) and their values (0, 1, 2, 3, 4, 5, 6, 8, 9)
  o Preloaded in SYMTAB
• Address translation
  o Most register-memory instructions use program counter relative or base relative addressing
  o Format 3: 12-bit address field
    - base-relative: 0~4095
    - pc-relative: -2048~2047
  o Format 4: 20-bit address field
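The address-translation rule for Format 3 (try PC-relative first, then base-relative) can be sketched as follows. The function name `assemble_format3` and its parameters are hypothetical; `opcode` is the 8-bit opcode value whose two low bits are replaced by the n and i flags, and the addresses used in the examples are illustrative sample values.

```python
def assemble_format3(opcode: int, target: int, pc: int,
                     base=None, n: int = 1, i: int = 1, x: int = 0) -> str:
    """Assemble a Format 3 instruction, choosing PC- or base-relative mode."""
    disp = target - pc
    if -2048 <= disp <= 2047:
        b, p = 0, 1
        disp &= 0xFFF                      # 12-bit two's complement
    elif base is not None and 0 <= target - base <= 4095:
        b, p = 1, 0
        disp = target - base
    else:
        raise ValueError("displacement out of range: use Format 4 (+op)")
    word = (((opcode | (n << 1) | i) << 16)    # op {6} plus n, i
            | (x << 15) | (b << 14) | (p << 13)  # x, b, p (e = 0)
            | disp)                              # displacement {12}
    return f"{word:06X}"


# STL (opcode 14) at 0000, operand at 0030: PC = 0003, disp = 02D
stl = assemble_format3(0x14, target=0x0030, pc=0x0003)
# J (opcode 3C) backwards from PC = 001A to 0006: disp = -20 = FEC
jump = assemble_format3(0x3C, target=0x0006, pc=0x001A)
# STCH (opcode 54), indexed; too far for PC-relative, B = 0033
stch = assemble_format3(0x54, target=0x0036, pc=0x1051, base=0x0033, x=1)
```

The base register contents are known to the assembler only because the programmer states them with the BASE directive; without it (base=None here) the sketch falls through to the error case, where the extended format would be needed.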
2.2.2 Program Relocation
The need for program relocation
• It is desirable to load and run several programs at the same time.
• The system must be able to load programs into memory wherever there is room.
• The exact starting address of the program is not known until load time.

Absolute Program
• A program with its starting address specified at assembly time.
• The addresses may be invalid if the program is loaded somewhere else.

[Figure: Example of program relocation]

• The only parts of the program that require modification at load time are those that specify direct addresses.
• The rest of the instructions need not be modified:
  o Not a memory address (immediate addressing)
  o PC-relative, Base-relative
• From the object program alone, it is not possible to distinguish an address from a constant.
  o The assembler must keep some information to tell the loader.
  o The object program that contains the modification records is called a relocatable program.

The way to solve the relocation problem
• For an address label, its address is assigned relative to the start of the program (START 0).
• Produce a Modification record to store the starting location and the length of the address field to be modified.
• The command for the loader must also be a part of the object program.
Modification record
• One modification record for each address to be modified.
• The length is stored in half-bytes (4 bits).
• The starting location is the location of the byte containing the leftmost bits of the address field to be modified.
• If the field contains an odd number of half-bytes, it begins in the middle of the first byte at the starting location.
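How a loader might apply one Modification record can be sketched as below. The function `apply_modification` and the in-memory `bytearray` representation are assumptions for illustration; the example uses a 4-byte extended-format instruction 4B101036 located at relative address 0006, so its 20-bit address field starts at byte 0007 and is 5 half-bytes long.

```python
def apply_modification(code: bytearray, start: int, half_bytes: int,
                       load_address: int) -> None:
    """Add load_address to the address field described by a Modification record.

    start      - location of the byte holding the leftmost bits of the field
    half_bytes - field length in half-bytes (4-bit units)
    """
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(code[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1     # low-order half-bytes = the field
    # With an odd length the field begins in the middle of the first byte,
    # so the upper half-byte (flag bits) must be preserved unchanged.
    updated = (field & ~mask) | ((field + load_address) & mask)
    code[start:start + nbytes] = updated.to_bytes(nbytes, "big")


# Program assembled relative to 0; the instruction sits at offset 0006.
program = bytearray(6) + bytearray.fromhex("4B101036")
apply_modification(program, start=0x0007, half_bytes=5, load_address=0x5000)
relocated = program[6:10].hex().upper()
```

After relocation to load address 5000 the address field 01036 becomes 06036, while the opcode byte and the flag half-byte are left untouched.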
[Figure: Relocatable object program]
2.3. MACHINE INDEPENDENT ASSEMBLER FEATURES

2.3.1 Literals
• The programmer writes the value of a constant operand as a part of the instruction that uses it. This avoids having to define the constant elsewhere in the program and make up a label for it.
• Such an operand is called a literal because the value is literally in the instruction.
• It is convenient to write the value of a constant operand as a part of the instruction.
• A literal is identified with the prefix =, followed by a specification of the literal value.

[Figure: Example of a literal operand]

Literals vs. Immediate Operands
• Literals: The assembler generates the specified value as a constant at some other memory location.
• Immediate operands: The operand value is assembled as part of the machine instruction.
• We can have literals in SIC, but immediate operands are only valid in SIC/XE.
Literal Pools
• Normally literals are placed into a pool at the end of the program.
• In some cases, it is desirable to place literals into a pool at some other location in the object program.
• Assembler directive LTORG
  o When the assembler encounters a LTORG statement, it generates a literal pool (containing all literal operands used since the previous LTORG).
• Reason: keep the literal operand close to the instruction
  o Otherwise PC-relative addressing may not be allowed.

Duplicate literals
• The same literal may be used more than once in the program.
  o Only one copy of the specified value needs to be stored.
  o For example, =X'05'
• In order to recognize duplicate literals:
  o Compare the character strings defining them
    - Easier to implement, but has a potential problem
    - e.g. =X'05'
  o Compare the generated data value
    - Better, but will increase the complexity of the assembler
    - e.g. =C'EOF' and =X'454F46'

Problem of duplicate-literal recognition
• '*' denotes a literal that refers to the current value of the program counter
  o BUFEND EQU *
• There may be some literals that have the same name, but different values
  o BASE *
  o LDB =* (#LENGTH)
• The literal =* repeatedly used in the program has the same name, but different values.
• The literal "=*" represents an "address" in the program, so the assembler must generate the appropriate "Modification records".
Literal table - LITTAB
• Content
  o Literal name
  o Operand value and length
  o Address
• LITTAB is often organized as a hash table, using the literal name or value as the key.
Implementation of Literals

Pass 1
• Build LITTAB with the literal name, operand value and length, leaving the address unassigned.
• When a LTORG or END statement is encountered, assign an address to each literal not yet assigned an address.
  o LOCCTR is updated to reflect the number of bytes occupied by each literal.

Pass 2
• Search LITTAB for each literal operand encountered.
• Generate data values as if by BYTE or WORD statements.
• Generate a Modification record for literals that represent an address in the program.
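The Pass 1 handling of LITTAB, with duplicates recognized by comparing generated data values, might look like the sketch below. The class `Littab`, its method names and the dictionary layout are assumptions for illustration; only C'...' and X'...' literals are handled, and the =* literal (whose value depends on where it is used) is deliberately left out.

```python
def literal_value(literal: str) -> bytes:
    """Return the data generated by a literal such as =C'EOF' or =X'05'."""
    kind, body = literal[1], literal[3:-1]
    if kind == "C":
        return body.encode("ascii")
    if kind == "X":
        return bytes.fromhex(body)
    raise ValueError(f"unsupported literal {literal}")


class Littab:
    def __init__(self):
        # Keyed by generated value, so =C'EOF' and =X'454F46' share an entry.
        self.entries = {}

    def add(self, literal: str) -> None:
        """Pass 1: record a literal operand, address still unassigned."""
        value = literal_value(literal)
        self.entries.setdefault(value, {"name": literal, "address": None})

    def place_pool(self, locctr: int) -> int:
        """On LTORG or END: assign addresses and return the updated LOCCTR."""
        for value, entry in self.entries.items():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += len(value)       # bytes occupied by this literal
        return locctr

    def address_of(self, literal: str) -> int:
        """Pass 2: look up the address to assemble into the instruction."""
        return self.entries[literal_value(literal)]["address"]


littab = Littab()
for operand in ("=C'EOF'", "=X'05'", "=X'454F46'"):
    littab.add(operand)
next_locctr = littab.place_pool(0x002D)
```

Keying the table by the generated bytes is the "compare the generated data value" strategy described above; keying by the literal string instead would store EOF twice.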
[Figure: SYMTAB and LITTAB]
2.3.2 Symbol-Defining Statements
• Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.

Assembler directive EQU
• Syntax: symbol EQU value
• Used to improve program readability, avoid magic numbers, and make it easier to find and change constant values.
• Replace
    +LDT  #4096
  with
    MAXLEN  EQU   4096
            +LDT  #MAXLEN
• Define mnemonic names for registers:
    A  EQU  0
    X  EQU  1
    RMO  A,X
• An expression is allowed:
    MAXLEN  EQU  BUFEND-BUFFER

Assembler directive ORG
• Allows the programmer to reset the location counter (LOCCTR) to a specified value.
  o Syntax: ORG value
• When ORG is encountered, the assembler resets its LOCCTR to the specified value.
• ORG will affect the values of all labels defined until the next ORG.
• If the previous value of LOCCTR can be automatically remembered, we can return to the normal use of LOCCTR by simply writing
  o ORG

Example: using ORG
• If ORG statements are used:

[Figure: Symbol-table layout defined with ORG statements]

• We can fetch the VALUE field by
    LDA  VALUE,X
  with X = 0, 11, 22, … for each entry.
Forward-Reference Problem
• Forward reference is not allowed for either EQU or ORG.
• All terms in the value field must have been defined previously in the program.
• The reason is that all symbols must have been defined during Pass 1 in a two-pass assembler.
• Allowed:
    ALPHA  RESW  1
    BETA   EQU   ALPHA
• Not allowed:
    BETA   EQU   ALPHA
    ALPHA  RESW  1
2.3.3 Expressions
• Assemblers allow the use of expressions as operands.
• The assembler evaluates the expression and produces a single operand address or value.
• Expressions consist of
  o Operators: +, -, *, / (division is usually defined to produce an integer result)
  o Individual terms
    - Constants
    - User-defined symbols
    - Special terms, e.g. *, the current value of LOCCTR
• Examples
    MAXLEN  EQU   BUFEND-BUFFER
    STAB    RESB  (6+3+2)*MAXENTRIES

Relocation Problem in Expressions
• Values of terms can be
  o Absolute (independent of program location)
    - constants
  o Relative (to the beginning of the program)
    - Address labels
    - * (value of LOCCTR)
• Expressions can be
  o Absolute
    - Only absolute terms:
        MAXLEN  EQU  1000
    - Relative terms in pairs with opposite signs for each pair:
        MAXLEN  EQU  BUFEND-BUFFER
  o Relative
    - All the relative terms except one can be paired as described in "absolute". The remaining unpaired relative term must have a positive sign:
        STAB  EQU  OPTAB + (BUFEND – BUFFER)
Restrictions on Relative Expressions
• No relative terms may enter into a multiplication or division operation
  o 3 * BUFFER
• Expressions that do not meet the conditions of either "absolute" or "relative" should be flagged as errors.
  o BUFEND + BUFFER
  o 100 – BUFFER

Handling Relative Symbols in SYMTAB
• To determine the type of an expression, we must keep track of the types of all symbols defined in the program.
• We need a "flag" in the SYMTAB for indication.
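The pairing rule for absolute and relative expressions can be sketched by counting signed relative terms. This is a simplified illustration: the function `classify`, the (sign, term) input format, and the "A"/"R" type flags stored in SYMTAB are assumptions, only + and - are modelled, and the addresses are sample values.

```python
def classify(terms, symtab):
    """Evaluate a sum of signed terms and decide absolute ("A") or relative ("R").

    terms  - list of (sign, term); sign is +1 or -1, term is an int constant
             or a symbol name
    symtab - maps symbol name to (value, type flag "A" or "R")
    """
    value, unpaired = 0, 0
    for sign, term in terms:
        if isinstance(term, int):
            term_value, kind = term, "A"      # constants are absolute
        else:
            term_value, kind = symtab[term]
        value += sign * term_value
        if kind == "R":
            unpaired += sign                  # +1 and -1 cancel as a pair
    if unpaired == 0:
        return value, "A"                     # all relative terms paired
    if unpaired == 1:
        return value, "R"                     # one positive term left over
    raise ValueError("expression is neither absolute nor relative")


SYMTAB = {
    "BUFFER": (0x0036, "R"),
    "BUFEND": (0x1036, "R"),
    "OPTAB": (0x2000, "R"),
}
maxlen = classify([(+1, "BUFEND"), (-1, "BUFFER")], SYMTAB)
stab = classify([(+1, "OPTAB"), (+1, "BUFEND"), (-1, "BUFFER")], SYMTAB)
```

With these values BUFEND-BUFFER comes out absolute and OPTAB + (BUFEND - BUFFER) relative, while BUFEND + BUFFER and 100 - BUFFER raise an error, matching the cases listed above.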
2.3.4 Program Blocks
• Allow the generated machine instructions and data to appear in the object program in a different order.
• Separate blocks are used for storing code, data, stack, and larger data blocks.
• Program blocks versus control sections
  o Program blocks: segments of code that are rearranged within a single object program unit.
  o Control sections: segments of code that are translated into independent object program units.
• The assembler rearranges these segments to gather together the pieces of each block and assign addresses.
• Separate the program into blocks in a particular order.
• The large buffer area is moved to the end of the object program.
• Program readability is better if data areas are placed in the source program close to the statements that reference them.

Assembler directive: USE
• USE [blockname]
• At the beginning, statements are assumed to be part of the unnamed (default) block.
• If no USE statements are included, the entire program belongs to this single block.
• Each program block may actually contain several separate segments of the source program.

[Figure: Example program with multiple program blocks]

Three blocks are used
• default: executable instructions.
• CDATA: all data areas that are short in length.
• CBLKS: all data areas that consist of larger blocks of memory.
Rearranging Code into Program Blocks

Pass 1
• A separate location counter is kept for each program block.
  o Save and restore LOCCTR when switching between blocks.
  o At the beginning of a block, LOCCTR is set to 0.
• Assign each label an address relative to the start of the block.
• Store the block name or number in the SYMTAB along with the assigned relative address of the label.
• Indicate the block length as the latest value of LOCCTR for each block at the end of Pass 1.
• Assign to each block a starting address in the object program by concatenating the program blocks in a particular order.

Pass 2
• Calculate the address for each symbol relative to the start of the object program by adding
  o The location of the symbol relative to the start of its block
  o The starting address of this block
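The per-block location counters and the Pass 2 address calculation can be sketched as follows. The statement format (tuples tagged "USE" or "STMT"), the function names and the sample lengths are assumptions made up for this illustration; they do not reproduce the example program in the figure.

```python
def layout_blocks(statements, order):
    """Pass 1: keep one LOCCTR per block, then concatenate blocks in `order`."""
    locctr = {name: 0 for name in order}      # each block starts at 0
    symtab = {}
    current = order[0]                        # unnamed (default) block
    for kind, name, length in statements:
        if kind == "USE":
            current = name or order[0]        # USE with no name -> default
        else:
            if name:
                symtab[name] = (current, locctr[current])
            locctr[current] += length
    start, address = {}, 0
    for block in order:                       # block length = final LOCCTR
        start[block] = address
        address += locctr[block]
    return symtab, locctr, start


def absolute_address(symbol, symtab, start):
    """Pass 2: block-relative location plus the block's starting address."""
    block, offset = symtab[symbol]
    return start[block] + offset


statements = [
    ("STMT", "FIRST", 3),
    ("USE", "CDATA", 0),
    ("STMT", "RETADR", 3),
    ("STMT", "LENGTH", 3),
    ("USE", "CBLKS", 0),
    ("STMT", "BUFFER", 4096),
    ("USE", "", 0),
    ("STMT", "CLOOP", 3),
]
symtab, lengths, start = layout_blocks(statements, ["default", "CDATA", "CBLKS"])
```

Here the default block is 6 bytes long, so CDATA starts at 6 and CBLKS at 12; LENGTH (offset 3 in CDATA) resolves to address 9 even though it appears in the source between two pieces of the default block.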
[Figure: Program blocks loaded in memory]

Object Program
• It is not necessary to physically rearrange the generated code in the object program.
• The assembler simply inserts the proper load address in each Text record.
• The loader will load these codes into the correct place.
2.3.5 Control Sections and Program Linking
Control sections
• can be loaded and relocated independently of each other
• are most often used for subroutines or other logical subdivisions of a program
• the programmer can assemble, load, and manipulate each of these control sections separately
• because of this, there should be some means for linking control sections together
• assembler directive: CSECT
    secname  CSECT
• a separate location counter is kept for each control section

External Definition and Reference
• Instructions in one control section may need to refer to instructions or data located in another section.
• External definition
  o EXTDEF name [, name]
  o EXTDEF names symbols that are defined in this control section and may be used by other sections
  o Ex: EXTDEF BUFFER, BUFEND, LENGTH
• External reference
  o EXTREF name [, name]
  o EXTREF names symbols that are used in this control section and are defined elsewhere
  o Ex: EXTREF RDREC, WRREC
• To reference an external symbol, the extended format instruction is needed.
External Reference Handling

Case 1
      15    0003   CLOOP    +JSUB   RDREC            4B100000
• The operand RDREC is an external reference.
• The assembler has no idea where RDREC is, so it
  o inserts an address of zero;
  o can only use extended format, to provide enough room (that is, relative addressing is invalid for an external reference).
• The assembler generates information for each external reference that will allow the loader to perform the required linking.

Case 2
      190   0028   MAXLEN   WORD    BUFEND-BUFFER    000000
• There are two external references in the expression, BUFEND and BUFFER.
• The assembler
  o inserts a value of zero;
  o passes information to the loader: add to this data area the address of BUFEND, and subtract from it the address of BUFFER.

Case 3
      107   1000   MAXLEN   EQU     BUFEND-BUFFER
• On line 107, BUFEND and BUFFER are defined in the same control section, so the expression can be calculated immediately.
Records for Object Program
• The assembler must include information in the object program that will cause the loader to insert proper values where they are required.
• Define record (EXTDEF)
      Col. 1       D
      Col. 2-7     Name of external symbol defined in this control section
      Col. 8-13    Relative address within this control section (hexadecimal)
      Col. 14-73   Repeat information in Col. 2-13 for other external symbols
• Refer record (EXTREF)
      Col. 1       R
      Col. 2-7     Name of external symbol referred to in this control section
      Col. 8-73    Names of other external reference symbols
• Modification record
      Col. 1       M
      Col. 2-7     Starting address of the field to be modified (hexadecimal)
      Col. 8-9     Length of the field to be modified, in half-bytes (hexadecimal)
      Col. 10      Modification flag (+ or -)
      Col. 11-16   External symbol whose value is to be added to or subtracted from the indicated field
• The control section name is automatically an external symbol, i.e. it is available for use in Modification records.
Object Program
Expressions in Multiple Control Sections
• Extended restriction
  o Both terms in each pair of an expression must be within the same control section.
  o Legal: BUFEND-BUFFER
  o Illegal: RDREC-COPY
• How to enforce this restriction
  o When an expression involves external references, the assembler cannot determine whether or not the expression is legal.
  o The assembler evaluates all of the terms it can, combines these to form an initial expression value, and generates Modification records.
  o The loader checks the expression for errors and finishes the evaluation.
2.4 ASSEMBLER DESIGN
The assembler design options covered here are
• Two-pass assembler with overlay structure
• One-pass assemblers
• Multi-pass assemblers

2.4.1 One-Pass Assemblers
Load-and-Go Assembler
• A load-and-go assembler generates its object code in memory for immediate execution.
• No object program is written out, and no loader is needed.
• It is useful in a system with frequent program development and testing.
• Programs are re-assembled nearly every time they are run, so the efficiency of the assembly process is an important consideration.
One-Pass Assemblers
• Scenarios for one-pass assemblers
  o Object code is generated in memory for immediate execution (load-and-go assembler).
  o External storage for the intermediate file between two passes is slow or inconvenient to use.
• Main problem: forward references
  o Data items
  o Labels on instructions
• Solution
  o Require that all areas be defined before they are referenced.
  o This is possible, although inconvenient, for data items.
  o Forward jumps to instruction labels cannot be easily eliminated.
  o For these, insert (label, address_to_be_modified) into SYMTAB. Usually, address_to_be_modified is stored in a linked list.
Sample program for a one-pass assembler

Forward Reference in One-Pass Assembler
• Omits the operand address if the symbol has not yet been defined.
• Enters this undefined symbol into SYMTAB and indicates that it is undefined.
• Adds the address of this operand field to a list of forward references associated with the SYMTAB entry.
• When the definition for the symbol is encountered, scans the reference list and inserts the address.
• At the end of the program, reports an error if there are still SYMTAB entries marked as undefined symbols.
• For a load-and-go assembler: searches SYMTAB for the symbol named in the END statement and jumps to this location to begin execution if there is no error.
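The forward-reference bookkeeping described above can be sketched as follows. This is a simplified illustration, not an actual assembler: every statement is assumed to occupy one memory cell, only the operand address is stored, and the names `assemble_one_pass` and `pending` are hypothetical.

```python
def assemble_one_pass(lines):
    """lines: list of (label_or_None, operand_symbol_or_None), one cell each."""
    memory = [None] * len(lines)   # operand address stored in each cell
    symtab = {}                    # symbol -> address, once defined
    pending = {}                   # undefined symbol -> cells waiting for it
    for addr, (label, operand) in enumerate(lines):
        if label is not None:
            symtab[label] = addr
            # Definition found: walk the reference list and patch each cell.
            for waiting in pending.pop(label, []):
                memory[waiting] = addr
        if operand is not None:
            if operand in symtab:
                memory[addr] = symtab[operand]
            else:
                memory[addr] = 0   # placeholder until the symbol is defined
                pending.setdefault(operand, []).append(addr)
    if pending:                    # symbols still undefined at the end
        raise ValueError("undefined symbols: " + ", ".join(sorted(pending)))
    return memory, symtab


source = [
    ("FIRST", "RETADR"),   # forward reference
    ("CLOOP", "RDREC"),    # forward reference
    (None, "CLOOP"),       # backward reference, resolved at once
    ("RETADR", None),
    ("RDREC", None),
]
image, table = assemble_one_pass(source)
print(image, table)
```

A dictionary of Python lists stands in for the linked lists hanging off the SYMTAB entries; the patching step is the same either way.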
Object Code in Memory and SYMTAB

After scanning line 40 of the above program

After scanning line 160 of the above program
If One-Pass Assemblers Need to Produce Object Code
• If the operand contains an undefined symbol, use 0 as the address and write the Text record to the object program.
• Forward references are entered into lists as in the load-and-go assembler.
• When the definition of a symbol is encountered, the assembler generates another Text record with the correct operand address for each entry in the reference list.
• When the program is loaded, the incorrect address 0 is overwritten by the later Text record generated at the symbol definition.

Object code generated by one-pass assembler
2.4.2 Two-Pass Assembler with Overlay Structure
• Most assemblers divide the processing of the source program into two passes.
• The internal tables and subroutines that are used only during Pass 1 are no longer needed after the first pass is completed.
• The routines and tables for Pass 1 and Pass 2 are never required at the same time.
• There are certain tables (SYMTAB) and certain processing subroutines (searching SYMTAB) that are used by both passes.
• Since the Pass 1 and Pass 2 segments are never needed at the same time, they can occupy the same locations in memory during execution of the assembler.
• Initially the Root and Pass 1 segments are loaded into memory.
• The assembler then makes the first pass over the program being assembled.
• At the end of Pass 1, the Pass 2 segment is loaded, replacing the Pass 1 segment.
• The assembler then makes its second pass over the source program and terminates.
• The assembler needs much less memory to run in this way than it would if both Pass 1 and Pass 2 were loaded at the same time.
• A program that is designed to execute in this way is called an overlay program because some of its segments overlay others during execution.
2.4.3 Multi-Pass Assemblers
• For a two-pass assembler, forward references in symbol definition are not allowed:
      ALPHA   EQU    BETA
      BETA    EQU    DELTA
      DELTA   RESW   1
• The symbol BETA cannot be assigned a value when it is encountered during Pass 1 because DELTA has not yet been defined. Hence ALPHA cannot be evaluated during Pass 2.
• Symbol definition must be completed in Pass 1.
• Prohibiting forward references in symbol definition is not a serious inconvenience; forward references tend to create difficulty for a person reading the program.
• The general solution for forward references is a multi-pass assembler that can make as many passes as are needed to process the definitions of symbols.
• It is not necessary for such an assembler to make more than 2 passes over the entire program.
• The portions of the program that involve forward references in symbol definition are saved during Pass 1.
• Additional passes through these stored definitions are made as the assembly progresses.
• This process is followed by a normal Pass 2.
Implementation
• For a forward reference in symbol definition, we store in SYMTAB:
  o the symbol name
  o the defining expression
  o the number of undefined symbols in the defining expression
• Each undefined symbol (marked with a flag *) has an associated list of the symbols that depend on it.
• When a symbol is defined, we can recursively evaluate the symbol expressions depending on the newly defined symbol.
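A minimal sketch of this symbol-table scheme is given below. It assumes that a defining expression can be represented as a Python function plus the list of symbols it uses; the names `Entry`, `define`, and `_finish` are hypothetical, and the value given to MAXLEN is made up for the demonstration.

```python
class Entry:
    def __init__(self):
        self.value = None      # known value, or None while undefined
        self.expr = None       # (function, operand names) of the definition
        self.missing = 0       # count of still-undefined operands (the "&n" entry)
        self.dependents = []   # symbols whose definitions wait on this one


def define(symtab, name, func, operands=()):
    entry = symtab.setdefault(name, Entry())
    entry.expr = (func, tuple(operands))
    entry.missing = 0
    for op in operands:
        dep = symtab.setdefault(op, Entry())   # undefined operands get an entry too
        if dep.value is None:
            entry.missing += 1
            dep.dependents.append(name)
    if entry.missing == 0:
        _finish(symtab, name)


def _finish(symtab, name):
    entry = symtab[name]
    func, operands = entry.expr
    entry.value = func(*(symtab[op].value for op in operands))
    for waiting in entry.dependents:           # recursively evaluate dependents
        symtab[waiting].missing -= 1
        if symtab[waiting].missing == 0:
            _finish(symtab, waiting)
    entry.dependents = []


symtab = {}
define(symtab, "HALFSZ", lambda maxlen: maxlen // 2, ["MAXLEN"])  # HALFSZ EQU MAXLEN/2
pending_before = symtab["HALFSZ"].missing
define(symtab, "MAXLEN", lambda: 0x1000)                          # assumed value
print(hex(symtab["HALFSZ"].value))
```

Defining MAXLEN triggers the evaluation of HALFSZ without another scan of the source, which is why such an assembler never needs more than two passes over the whole program.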
Example of Multi-Pass Assembler
Consider the symbol table entries from Pass 1 processing of the statement
      HALFSZ   EQU   MAXLEN/2
• Since MAXLEN has not yet been defined, no value for HALFSZ can be computed.
• The defining expression for HALFSZ is stored in the symbol table in place of its value.
• The entry &1 indicates that 1 symbol in the defining expression is undefined.
• SYMTAB simply contains a pointer to the defining expression.
• The symbol MAXLEN is also entered in the symbol table, with the flag * identifying it as undefined.
• Associated with this entry is a list of the symbols whose values depend on MAXLEN.
UNIT III LOADERS AND LINKERS

INTRODUCTION
• A loader is a system program that performs the loading function.
• Many loaders also support relocation and linking.
• Some systems have a linker (linkage editor) to perform the linking operations and a separate loader to handle relocation and loading.
• One system loader or linker can be used regardless of the original source programming language.
• Loading: brings the object program into memory for execution.
• Relocation: modifies the object program so that it can be loaded at an address different from the location originally specified.
• Linking: combines two or more separate object programs and supplies the information needed to allow references between them.
3.1 BASIC LOADER FUNCTIONS
Fundamental functions of a loader:
1. Bringing an object program into memory.
2. Starting its execution.

3.1.1 Design of an Absolute Loader
For a simple absolute loader, all functions are accomplished in a single pass as follows:
1) The Header record of the object program is checked to verify that the correct program has been presented for loading.
2) As each Text record is read, the object code it contains is moved to the indicated address in memory.
3) When the End record is encountered, the loader jumps to the specified address to begin execution of the loaded program.

An example object program is shown in Fig (a).

Fig (b) shows a representation of the program from Fig (a) after loading.

Algorithm for Absolute Loader

• It is very important to realize that in Fig (a), each printed character represents one byte of the object program record.
• In Fig (b), on the other hand, each printed character represents one hexadecimal digit in memory (a half-byte).
• Therefore, to save space and execution time of loaders, most machines store object programs in a binary form, with each byte of object code stored as a single byte in the object program.
• In this type of representation a byte may contain any binary value.
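The single-pass absolute loader can be sketched as follows for an object program in character form, where two printed hexadecimal characters are packed into one byte of memory. The record layout assumed here (H: name, start, length; T: address, length, code; E: first instruction) follows the usual SIC conventions; the function name and the sample records are hypothetical.

```python
def absolute_load(records, memory, expected_name):
    """Load H/T/E records into memory (a bytearray); return the start address."""
    for rec in records:
        kind = rec[0]
        if kind == "H":
            name = rec[1:7].strip()
            start = int(rec[7:13], 16)
            length = int(rec[13:19], 16)
            if name != expected_name:
                raise ValueError(f"wrong program presented: {name}")
            if start + length > len(memory):
                raise ValueError("program does not fit in memory")
        elif kind == "T":
            addr = int(rec[1:7], 16)
            count = int(rec[7:9], 16)              # length in bytes
            # Pack each pair of hex characters into a single byte.
            memory[addr:addr + count] = bytes.fromhex(rec[9:9 + 2 * count])
        elif kind == "E":
            return int(rec[1:7], 16)               # address to jump to
    raise ValueError("missing End record")


memory = bytearray(0x2000)
object_program = [
    "HCOPY  001000000006",
    "T00100006141033482039",
    "E001000",
]
entry_point = absolute_load(object_program, memory, "COPY")
print(hex(entry_point), memory[0x1000:0x1006].hex())
```

The call to `bytes.fromhex` is exactly the character-to-binary conversion that a loader avoids when the object program is already stored in binary form.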
3.1.2 A Simple Bootstrap Loader
When a computer is first turned on or restarted, a special type of absolute loader, called a bootstrap loader, is executed. This bootstrap loads the first program to be run by the computer, usually an operating system.

Working of a simple bootstrap loader
• The bootstrap begins at address 0 in the memory of the machine.
• It loads the operating system at address 80.
• Each byte of object code to be loaded is represented on device F1 as two hexadecimal digits, just as it is in a Text record of a SIC object program.
• The object code from device F1 is always loaded into consecutive bytes of memory, starting at address 80.
• The main loop of the bootstrap keeps the address of the next memory location to be loaded in register X.
• After all of the object code from device F1 has been loaded, the bootstrap jumps to address 80, which begins the execution of the program that was loaded.
• Much of the work of the bootstrap loader is performed by the subroutine GETC.
• GETC is used to read and convert a pair of characters from device F1 representing 1 byte of object code to be loaded. For example, the two characters "D8" (bytes '4438'H) are converted to the one byte 'D8'H.
• The resulting byte is stored at the address currently in register X, using an STCH instruction that refers to location 0 using indexed addressing.
• The TIXR instruction is then used to add 1 to the value in X.
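The character conversion done by GETC and the main loop of the bootstrap can be sketched in Python. This is not the SIC/XE source itself: the device is modelled as a byte stream, and two details are assumptions made for the sketch, namely that a byte 0x04 marks end of input and that characters below '0' (such as newlines) are skipped. The load address is a parameter.

```python
import io


def getc(device):
    """Return the 4-bit value of the next hex character, or None at end of input."""
    while True:
        ch = device.read(1)
        if ch == b"" or ch[0] == 0x04:   # assumed end-of-input marker
            return None
        code = ch[0]
        if code < 0x30:                  # skip control characters such as newline
            continue
        code -= 0x30                     # '0'..'9' -> 0..9
        if code >= 10:
            code -= 7                    # 'A'..'F' -> 10..15
        return code


def bootstrap(device, memory, load_address=0x80):
    x = load_address                     # plays the role of register X
    while True:
        high = getc(device)
        if high is None:
            break
        low = getc(device)
        if low is None:
            break
        memory[x] = (high << 4) | low    # one byte from two hex characters
        x += 1                           # like TIXR adding 1 to X
    return load_address                  # where execution would begin


device = io.BytesIO(b"D8\n4C")
memory = bytearray(0x100)
start = bootstrap(device, memory)
print(hex(start), hex(memory[0x80]), hex(memory[0x81]))
```

The character 'D' has code 0x44: subtracting 0x30 gives 20, and subtracting a further 7 gives 13, the value of the hex digit D.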
Source code for bootstrap loader

3.2 MACHINE-DEPENDENT LOADER FEATURES
• The absolute loader has several potential disadvantages. One of the most obvious is the need for the programmer to specify the actual address at which the program will be loaded into memory.
• On a simple computer with a small memory, the actual address at which the program will be loaded can be specified easily.
• On a larger and more advanced machine, we often like to run several independent programs together, sharing memory between them. We do not know in advance where a program will be loaded. Hence we write relocatable programs instead of absolute ones.
• Writing absolute programs also makes it difficult to use subroutine libraries efficiently. This could not be done effectively if all of the subroutines had preassigned absolute addresses.
• The need for program relocation is an indirect consequence of the change to larger and more powerful computers. The way relocation is implemented in a loader is also dependent upon machine characteristics.
• Loaders that allow for program relocation are called relocating loaders or relative loaders.
3.2.1 Relocation
Two methods for specifying relocation as part of the object program:

The first method:
• A Modification record is used to describe each part of the object code that must be changed when the program is relocated.

Fig (1): Consider the program

• Most of the instructions in this program use relative or immediate addressing.
• The only portions of the assembled program that contain actual addresses are the extended format instructions on lines 15, 35, and 65. Thus these are the only items whose values are affected by relocation.
Object program
• Each Modification record specifies the starting address and length of the field whose value is to be altered.
• It then describes the modification to be performed.
• In this example, all modifications add the value of the symbol COPY, which represents the starting address of the program.
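How a loader might apply one such Modification record is sketched below. The record form `Maaaaaall` (6-digit starting address, 2-digit length in half-bytes) and the sample instruction bytes are assumptions chosen for illustration; `apply_modification` is a hypothetical helper that works on the program image before it is copied to its load address.

```python
def apply_modification(image, record, program_start):
    """Add program_start to the address field described by 'Maaaaaall'."""
    addr = int(record[1:7], 16)            # start of the field, relative to the program
    half_bytes = int(record[7:9], 16)      # field length in half-bytes
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(image[addr:addr + nbytes], "big")
    # With an odd length the field occupies the low-order half-bytes,
    # so the high half-byte of the first byte must be preserved.
    mask = (1 << (4 * half_bytes)) - 1
    field = (field & ~mask) | ((field + program_start) & mask)
    image[addr:addr + nbytes] = field.to_bytes(nbytes, "big")


image = bytearray(16)
image[6:10] = bytes.fromhex("4B101036")    # extended-format instruction at 0006
apply_modification(image, "M00000705", 0x5000)
print(image[6:10].hex())
```

The 20-bit address field 01036 becomes 06036 when the program is relocated to address 5000, while the opcode and flag bits in front of it are left untouched.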
Fig (2): Consider a relocatable program for a standard SIC machine

• The Modification record is not well suited for use with all machine architectures. Consider, for example, the program in Fig (2). This is a relocatable program written for the standard version of SIC.
• The important difference between this example and the one in Fig (1) is that the standard SIC machine does not use relative addressing.
• In this program the addresses in all the instructions except RSUB must be modified when the program is relocated. This would require 31 Modification records, which results in an object program more than twice as large as the one in Fig (1).
The second method:
• There are no Modification records.
• The Text records are the same as before except that there is a relocation bit associated with each word of object code.
• Since all SIC instructions occupy one word, this means that there is one relocation bit for each possible instruction.

Fig (3): Object program with relocation by bit mask

• The relocation bits are gathered together into a bit mask following the length indicator in each Text record. In Fig (3) this mask is represented (in character form) as three hexadecimal digits.
• If the relocation bit corresponding to a word of object code is set to 1, the program's starting address is to be added to this word when the program is relocated. A bit value of 0 indicates that no modification is necessary.
• If a Text record contains fewer than 12 words of object code, the bits corresponding to unused words are set to 0.
• For example, the bit mask FFC (representing the bit string 111111111100) in the first Text record specifies that all 10 words of object code are to be modified during relocation.
• Example: Note that the LDX instruction on line 210 (Fig (2)) begins a new Text record. If it were placed in the preceding Text record, it would not be properly aligned to correspond to a relocation bit because of the 1-byte data value generated from line 185.
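Relocation by bit mask can be sketched as follows. The record layout assumed is 'T', a 6-digit address, a 2-digit length in bytes, a 3-digit mask, then the object code; the function name and the sample record are hypothetical.

```python
def relocate_text_record(record, program_start):
    """Return (load_address, relocated_bytes) for one Text record with a bit mask."""
    addr = int(record[1:7], 16)
    length = int(record[7:9], 16)              # length in bytes
    mask = int(record[9:12], 16)               # 12 bits, leftmost bit = first word
    code = bytearray.fromhex(record[12:12 + 2 * length])
    for i in range(length // 3):               # one 3-byte word per relocation bit
        if mask & (1 << (11 - i)):
            word = int.from_bytes(code[3 * i:3 * i + 3], "big")
            word = (word + program_start) & 0xFFFFFF
            code[3 * i:3 * i + 3] = word.to_bytes(3, "big")
    return addr + program_start, bytes(code)


# Mask 800 = 1000 0000 0000: only the first word holds an address.
record = "T00000006800140033000005"
load_address, relocated = relocate_text_record(record, 0x1000)
print(hex(load_address), relocated.hex())
```

Because each bit lines up with one 3-byte word, a record containing a 1-byte data value would shift every later word out of alignment, which is why such a value forces a new Text record.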
3.2.2 Program Linking
Consider the three (separately assembled) programs in the figure, each of which consists of a single control section.

Program 1 (PROGA):

Program 2 (PROGB):

Program 3 (PROGC):

Consider first the reference marked REF1.

For the first program (PROGA),
• REF1 is simply a reference to a label within the program.
• It is assembled in the usual way as a PC-relative instruction.
• No modification for relocation or linking is necessary.
In PROGB, the same operand refers to an external symbol.
• The assembler uses an extended-format instruction with address field set to 00000.
• The object program for PROGB contains a Modification record instructing the loader to add the value of the symbol LISTA to this address field when the program is linked.
For PROGC, REF1 is handled in exactly the same way.
Corresponding object programs

PROGA:

PROGB:

PROGC:

• The reference marked REF2 is processed in a similar manner.
• REF3 is an immediate operand whose value is to be the difference between ENDA and LISTA (that is, the length of the list in bytes).
  o In PROGA, the assembler has all of the information necessary to compute this value.
  o During the assembly of PROGB (and PROGC), the values of the labels are unknown.
  o In these programs, the expression must be assembled as an external reference (with two Modification records) even though the final result will be an absolute value independent of the locations at which the programs are loaded.

Consider REF4.
• The assembler for PROGA can evaluate all of the expression in REF4 except for the value of LISTC. This results in an initial value of '000014'H and one Modification record.
• The same expression in PROGB contains no terms that can be evaluated by the assembler. The object code therefore contains an initial value of 000000 and three Modification records.
• For PROGC, the assembler can supply the value of LISTC relative to the beginning of the program (but not the actual address, which is not known until the program is loaded).
• The initial value of this data word contains the relative address of LISTC ('000030'H). Modification records instruct the loader to add the beginning address of the program (i.e., the value of PROGC), to add the value of ENDA, and to subtract the value of LISTA.
Fig (4): The three programs as they might appear in memory after loading and linking.

PROGA has been loaded starting at address 4000, with PROGB and PROGC immediately following.

For example, the value for reference REF4 in PROGA is located at address 4054 (the beginning address of PROGA plus 0054).

Fig (5): Relocation and linking operations performed on REF4 in PROGA

The initial value (from the Text record) is 000014.
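The way the loader finishes the evaluation of REF4 can be checked with a short calculation. The load address 4000 for PROGA and the initial values come from the discussion above; the other addresses in `estab` (LISTA, ENDA, PROGC, LISTC) are assumed values chosen for illustration, consistent with PROGB and PROGC following PROGA in memory. `finish_expression` is a hypothetical helper.

```python
def finish_expression(initial_value, modifications, estab):
    """Apply (+/-, symbol) modifications to the value taken from the Text record."""
    value = initial_value
    for sign, symbol in modifications:
        value += estab[symbol] if sign == "+" else -estab[symbol]
    return value & 0xFFFFFF      # a SIC word holds 24 bits


# Assumed external symbol addresses after loading (illustrative only).
estab = {"PROGA": 0x4000, "LISTA": 0x4040, "ENDA": 0x4054,
         "PROGC": 0x40E2, "LISTC": 0x4112}

# REF4 is ENDA-LISTA+LISTC in all three programs.
ref4_in_proga = finish_expression(0x000014, [("+", "LISTC")], estab)
ref4_in_progb = finish_expression(
    0x000000, [("+", "ENDA"), ("-", "LISTA"), ("+", "LISTC")], estab)
ref4_in_progc = finish_expression(
    0x000030, [("+", "PROGC"), ("+", "ENDA"), ("-", "LISTA")], estab)
print(hex(ref4_in_proga), hex(ref4_in_progb), hex(ref4_in_progc))
```

Although each program starts from a different initial value and a different set of Modification records, all three copies of REF4 end up holding the same value after linking, as they must.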
3.2.3 Algorithm and Data Structures for a Linking Loader
• The algorithm for a linking loader is considerably more complicated than the absolute loader algorithm.
• A linking loader usually makes two passes over its input, just as an assembler does. In terms of general function, the two passes of a linking loader are quite similar to the two passes of an assembler:
  o Pass 1 assigns addresses to all external symbols.
  o Pass 2 performs the actual loading, relocation, and linking.
• The main data structure needed for our linking loader is an external symbol table ESTAB.
  (1) This table, which is analogous to SYMTAB in our assembler algorithm, is used to store the name and address of each external symbol in the set of control sections being loaded.
  (2) A hashed organization is typically used for this table.
• Two other important variables are PROGADDR (program load address) and CSADDR (control section address).
  (1) PROGADDR is the beginning address in memory where the linked program is to be loaded. Its value is supplied to the loader by the OS.
  (2) CSADDR contains the starting address assigned to the control section currently being scanned by the loader. This value is added to all relative addresses within the control section to convert them to actual addresses.
3.2.3.1 PASS 1
• During Pass 1, the loader is concerned only with Header and Define record types in the control sections.

Algorithm for Pass 1 of a Linking loader

1) The beginning load address for the linked program (PROGADDR) is obtained from the OS. This becomes the starting address (CSADDR) for the first control section in the input sequence.
2) The control section name from the Header record is entered into ESTAB, with value given by CSADDR. All external symbols appearing in the Define record for the control section are also entered into ESTAB. Their addresses are obtained by adding the value specified in the Define record to CSADDR.
3) When the End record is read, the control section length CSLTH (which was saved from the Header record) is added to CSADDR. This calculation gives the starting address for the next control section in sequence.

• At the end of Pass 1, ESTAB contains all external symbols defined in the set of control sections together with the address assigned to each.
• Many loaders include as an option the ability to print a load map that shows these symbols and their addresses.
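Pass 1 can be sketched as follows. The sketch assumes character-form records with 6-column fields (Header: name, start, length; Define: repeated name and relative address pairs); the function names and the sample lengths and addresses are illustrative assumptions.

```python
def linking_loader_pass1(control_sections, progaddr):
    """Build ESTAB from the Header and Define records of each control section."""
    estab = {}
    csaddr = progaddr
    for records in control_sections:
        cslth = 0
        for rec in records:
            if rec[0] == "H":
                cslth = int(rec[13:19], 16)          # length, saved for the End record
                _enter(estab, rec[1:7].strip(), csaddr)
            elif rec[0] == "D":
                for i in range(1, len(rec), 12):     # (name, relative address) pairs
                    symbol = rec[i:i + 6].strip()
                    relative = int(rec[i + 6:i + 12], 16)
                    _enter(estab, symbol, csaddr + relative)
            elif rec[0] == "E":
                csaddr += cslth                      # start of the next control section
    return estab


def _enter(estab, symbol, address):
    if symbol in estab:
        raise ValueError(f"duplicate external symbol {symbol}")
    estab[symbol] = address


sections = [
    ["HPROGA 000000000063", "DLISTA 000040ENDA  000054", "E000020"],
    ["HPROGB 00000000007F", "DLISTB 000060ENDB  000070", "E"],
    ["HPROGC 000000000051", "DLISTC 000030ENDC  000042", "E"],
]
estab = linking_loader_pass1(sections, 0x4000)
for symbol, address in estab.items():     # a simple load map
    print(f"{symbol:<6} {address:04X}")
```

A Python dictionary is itself a hashed table, so it matches the hashed organization suggested for ESTAB.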
3.2.3.2 PASS 2
• Pass 2 performs the actual loading, relocation, and linking of the program.

Algorithm for Pass 2 of a Linking loader

1) As each Text record is read, the object code is moved to the specified address (plus the current value of CSADDR).
2) When a Modification record is encountered, the symbol whose value is to be used for modification is looked up in ESTAB.
3) This value is then added to or subtracted from the indicated location in memory.
4) The last step performed by the loader is usually the transferring of control to the loaded program to begin execution.

• The End record for each control section may contain the address of the first instruction in that control section to be executed. Our loader takes this as the transfer point to begin execution. If more than one control section specifies a transfer address, the loader arbitrarily uses the last one encountered.
• If no control section contains a transfer address, the loader uses the beginning of the linked program (i.e., PROGADDR) as the transfer point.
• Normally, a transfer address would be placed in the End record for a main program, but not for a subroutine.
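Pass 2 can be sketched in the same style. The Modification record layout assumed here is 'M', a 6-digit address, a 2-digit length in half-bytes, a sign, and a symbol name; the function name, the sample records, and the ESTAB contents are illustrative assumptions.

```python
def linking_loader_pass2(control_sections, progaddr, estab, memory):
    """Load, relocate, and link; return the address where execution begins."""
    csaddr = progaddr
    execaddr = progaddr                       # default transfer point
    for records in control_sections:
        cslth = 0
        for rec in records:
            kind = rec[0]
            if kind == "H":
                cslth = int(rec[13:19], 16)
            elif kind == "T":
                addr = csaddr + int(rec[1:7], 16)
                count = int(rec[7:9], 16)
                memory[addr:addr + count] = bytes.fromhex(rec[9:9 + 2 * count])
            elif kind == "M":
                addr = csaddr + int(rec[1:7], 16)
                half_bytes = int(rec[7:9], 16)
                sign, symbol = rec[9], rec[10:16].strip()
                if symbol not in estab:
                    raise KeyError(f"undefined external symbol {symbol}")
                delta = estab[symbol] if sign == "+" else -estab[symbol]
                nbytes = (half_bytes + 1) // 2
                mask = (1 << (4 * half_bytes)) - 1
                field = int.from_bytes(memory[addr:addr + nbytes], "big")
                field = (field & ~mask) | ((field + delta) & mask)
                memory[addr:addr + nbytes] = field.to_bytes(nbytes, "big")
            elif kind == "E":
                if len(rec) > 1:              # a transfer address is present
                    execaddr = csaddr + int(rec[1:7], 16)
                csaddr += cslth
    return execaddr


memory = bytearray(0x5000)
estab = {"PROGA": 0x4000, "LISTC": 0x4112}    # as Pass 1 might have produced
proga = ["HPROGA 000000000063", "T00005403000014", "M00005406+LISTC", "E000020"]
start = linking_loader_pass2([proga], 0x4000, estab, memory)
print(hex(start), memory[0x4054:0x4057].hex())
```

The later a transfer address appears in the input, the later it overwrites `execaddr`, so the last one encountered wins, as described above.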
This algorithm can be made more efficient. Assign a reference number, which is used (instead of the symbol name) in Modification records, to each external symbol referred to in a control section. Suppose we always assign the reference number 01 to the control section name.

Fig: Object programs using reference numbers for code modification
3.3 MACHINE-INDEPENDENT LOADER FEATURES
• Loading and linking are often thought of as OS service functions. Therefore, most loaders include fewer different features than are found in a typical assembler.
• They include the use of an automatic library search process for handling external references and some common options that can be selected at the time of loading and linking.

3.3.1 Automatic Library Search
• Many linking loaders can automatically incorporate routines from a subprogram library into the program being loaded.
• Linking loaders that support automatic library search must keep track of external symbols that are referred to, but not defined, in the primary input to the loader.
• At the end of Pass 1, the symbols in ESTAB that remain undefined represent unresolved external references.
• The loader searches the library or libraries specified for routines that contain the definitions of these symbols, and processes the subroutines found by this search exactly as if they had been part of the primary input stream.
• The subroutines fetched from a library in this way may themselves contain external references. It is therefore necessary to repeat the library search process until all references are resolved.
• If unresolved external references remain after the library search is completed, these must be treated as errors.
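The repeated search can be sketched as a loop that keeps pulling in routines until no unresolved reference is left. The library is modelled as a dictionary from routine name to the symbols that routine defines and references; this representation, the function name, and the sample routines are assumptions made for illustration.

```python
def resolve_with_library(defined, referenced, library):
    """Return the library routines pulled in, in the order they were loaded."""
    defined, referenced = set(defined), set(referenced)
    loaded = []
    while True:
        progress = False
        for symbol in sorted(referenced - defined):
            if symbol in defined:            # satisfied by a routine just loaded
                continue
            for routine, (defs, refs) in library.items():
                if symbol in defs and routine not in loaded:
                    loaded.append(routine)
                    defined |= set(defs)
                    referenced |= set(refs)  # fetched routines may refer further
                    progress = True
                    break
        if not progress:
            break
    unresolved = referenced - defined
    if unresolved:
        raise LookupError("unresolved external references: "
                          + ", ".join(sorted(unresolved)))
    return loaded


library = {
    "STDDEV": ({"STDDEV"}, {"SQRT"}),
    "SQRT": ({"SQRT"}, set()),
    "PLOT": ({"PLOT"}, set()),
}
pulled_in = resolve_with_library({"MAIN"}, {"STDDEV"}, library)
print(pulled_in)
```

STDDEV is found in the first round; its own reference to SQRT is only discovered afterwards, which is why the search has to be repeated.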
3.3.2 Loader Options
• Many loaders allow the user to specify options that modify the standard processing.
• Typical loader option 1: allows the selection of alternative sources of input.
  Ex: INCLUDE program-name(library-name)
  might direct the loader to read the designated object program from a library and treat it as if it were part of the primary loader input.
• Loader option 2: allows the user to delete external symbols or entire control sections.
  Ex: DELETE csect-name
  might instruct the loader to delete the named control section(s) from the set of programs being loaded.
      CHANGE name1, name2
  might cause the external symbol name1 to be changed to name2 wherever it appears in the object programs.
• Loader option 3: involves the automatic inclusion of library routines to satisfy external references.
  Ex: LIBRARY MYLIB
  Such user-specified libraries are normally searched before the standard system libraries. This allows the user to use special versions of the standard routines.
      NOCALL STDDEV, PLOT, CORREL
  instructs the loader that these external references are to remain unresolved. This avoids the overhead of loading and linking the unneeded routines, and saves the memory space that would otherwise be required.
3.4 LOADER DESIGN OPTIONS
• Linking loaders perform all linking and relocation at load time.
• There are two alternatives:
  1. Linkage editors, which perform linking prior to load time.
  2. Dynamic linking, in which the linking function is performed at execution time.
• Precondition: the source program is first assembled or compiled, producing an object program.
• A linking loader performs all linking and relocation operations, including automatic library search if specified, and loads the linked program directly into memory for execution.
• A linkage editor produces a linked version of the program (load module or executable image), which is written to a file or library for later execution.
3.4.1 Linkage Editors
• The linkage editor performs relocation of all control sections relative to the start of the linked program. Thus, all items that need to be modified at load time have values that are relative to the start of the linked program.
• This means that the loading can be accomplished in one pass with no external symbol table required.
• If a program is to be executed many times without being reassembled, the use of a linkage editor substantially reduces the overhead required.
• Linkage editors can perform many useful functions besides simply preparing an object program for execution. E.g., a typical sequence of linkage editor commands:
      INCLUDE PLANNER(PROGLIB)
      DELETE PROJECT            {delete from existing PLANNER}
      INCLUDE PROJECT(NEWLIB)   {include new version}
      REPLACE PLANNER(PROGLIB)
• Linkage editors can also be used to build packages of subroutines or other control sections that are generally used together. This can be useful when dealing with subroutine libraries that support high-level programming languages.
• Linkage editors often include a variety of other options and commands like those discussed for linking loaders. Compared to linking loaders, linkage editors in general tend to offer more flexibility and control.

Fig: Processing of an object program using (a) Linking loader and (b) Linkage editor
3.4.2 Dynamic Linking
• Linkage editors perform linking operations before the program is loaded for execution.
• Linking loaders perform these same operations at load time.
• Dynamic linking (also called dynamic loading or load on call) postpones the linking function until execution time: a subroutine is loaded and linked to the rest of the program when it is first called.
• Dynamic linking is often used to allow several executing programs to share one copy of a subroutine or library, e.g. run-time support routines for a high-level language like C.
• With a program that allows its user to interactively call any of the subroutines of a large mathematical and statistical library, all of the library subroutines could potentially be needed, but only a few will actually be used in any one execution.
• Dynamic linking avoids the need to load the entire library for each execution; only the subroutines actually called are loaded.

Fig (a): Instead of executing a JSUB instruction referring to an external symbol, the program makes a load-and-call service request to the OS. The parameter of this request is the symbolic name of the routine to be called.
Fig (b): The OS examines its internal tables to determine whether or not the routine is already loaded. If necessary, the routine is loaded from the specified user or system libraries.
Fig (c): Control is then passed from the OS to the routine being called.
Fig (d): When the called subroutine completes its processing, it returns to its caller (i.e., the OS). The OS then returns control to the program that issued the request.
Fig (e): If a subroutine is still in memory, a second call to it may not require another load operation. Control may simply be passed from the dynamic loader to the called routine.
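The load-and-call sequence can be imitated with a small table of routines that are loaded on first use. This is only an analogy in Python: `DynamicLoader`, `load_and_call`, and the library contents are hypothetical names, and "loading" is modelled as calling a factory function.

```python
import math


class DynamicLoader:
    def __init__(self, library):
        self.library = library     # routine name -> function that "loads" it
        self.loaded = {}           # internal table of routines already in memory
        self.load_count = 0

    def load_and_call(self, name, *args):
        routine = self.loaded.get(name)
        if routine is None:                      # not yet loaded: load on call
            if name not in self.library:
                raise LookupError(f"routine {name} not found in any library")
            routine = self.library[name]()
            self.loaded[name] = routine
            self.load_count += 1
        return routine(*args)                    # pass control, then return to caller


library = {"SQRT": lambda: math.sqrt, "ABS": lambda: abs}
os_loader = DynamicLoader(library)
first = os_loader.load_and_call("SQRT", 16.0)    # loads SQRT, then calls it
second = os_loader.load_and_call("SQRT", 25.0)   # already in memory: no second load
print(first, second, os_loader.load_count)
```

ABS is never called in this run, so it is never loaded, which is the saving that dynamic linking offers for large libraries.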
3.4.3 Bootstrap Loaders
• With the machine empty and idle there is no need for program relocation.
• We can specify the absolute address for whatever program is first loaded. This will be the OS, which occupies a predefined location in memory.
• We need some means of accomplishing the functions of an absolute loader:
  1. To have the operator enter into memory the object code for an absolute loader, using switches on the computer console.
  2. To have the absolute loader program permanently resident in a ROM.
     o When some hardware signal occurs, the machine begins to execute this ROM program.
     o On some computers, the program is executed directly in the ROM; on others, the program is copied from ROM to main memory and executed there.
  3. To have a built-in hardware function that reads a fixed-length record from some device into memory at a fixed location.
     o The particular device to be used can often be selected via console switches.
     o After the read operation is complete, control is automatically transferred to the address in memory where the record was stored, which contains machine instructions that load the absolute program that follows.
• If the loading process requires more instructions than can be read in a single record, this first record causes the reading of others, and these in turn can cause the reading of still more records: hence the term bootstrap.
• The first record is generally referred to as a bootstrap loader. Such a loader is added to the beginning of all object programs that are to be loaded into an empty and idle system.
• This includes the OS itself and all stand-alone programs that are to be run without an OS.
UNIT IV MACROPROCESSORS

INTRODUCTION

Macro Instructions
• A macro instruction (macro)
  o is simply a notational convenience for the programmer to write a shorthand version of a program;
  o represents a commonly used group of statements in the source program;
  o is replaced by the macro processor with the corresponding group of source language statements. This operation is called "expanding the macro".
• For example:
  o Suppose it is necessary to save the contents of all registers before calling a subroutine.
  o This requires a sequence of instructions.
  o We can define and use a macro, SAVEREGS, to represent this sequence of instructions.

Macro Processor
• A macro processor
  o Its functions essentially involve the substitution of one group of characters or lines for another.
  o Normally, it performs no analysis of the text it handles.
  o It is not concerned with the meaning of the involved statements during macro expansion.
• Therefore, the design of a macro processor generally is machine independent.
• Macro processors are used in
  o assembly language
  o high-level programming languages, e.g., C or C++
  o OS command languages
  o general purpose
Format of macro definition
A macro can be defined as follows:
      MACRO                        MACRO pseudo-op marks the start of the macro definition.
      Name [List of Parameters]    Macro name with a list of formal parameters.
      ....
      ....                         Sequence of assembly language instructions.
      ....
      MEND                         MEND (MACRO-END) pseudo-op marks the end of the macro definition.

Example:
      MACRO
      SUM X,Y
      LDA X
      MOV BX,X
      LDA Y
      ADD BX
      MEND
4.1 BASIC MACROPROCESSOR FUNCTIONS
The fundamental functions common to all macro processors are:
1. Macro Definition
2. Macro Invocation
3. Macro Expansion

Macro Definition and Expansion
• Two new assembler directives are used in macro definition:
  o MACRO: identifies the beginning of a macro definition
  o MEND: identifies the end of a macro definition
• Prototype for the macro:
  o Each parameter begins with '&'.
      label   op      operands
      name    MACRO   parameters
              :
              body
              :
              MEND
• Body: the statements that will be generated as the expansion of the macro.

• The figure shows an example of a SIC/XE program using macro instructions.
• This program defines and uses two macro instructions, RDBUFF and WRBUFF.
• The functions and logic of the RDBUFF macro are similar to those of the RDREC subroutine.
• The WRBUFF macro is similar to the WRREC subroutine.
• Two assembler directives (MACRO and MEND) are used in macro definitions.
• The first MACRO statement identifies the beginning of a macro definition.
• The symbol in the label field (RDBUFF) is the name of the macro, and entries in the operand field identify the parameters of the macro instruction.
• In our macro language, each parameter begins with the character &, which facilitates the substitution of parameters during macro expansion.
• The macro name and parameters define the pattern or prototype for the macro instruction used by the programmer.
• In the expanded program, the macro definitions have been deleted, since they are no longer needed after the macros are expanded.
• Each macro invocation statement has been expanded into the statements that form the body of the macro, with the arguments from the macro invocation substituted for the parameters in the macro prototype.
• The arguments and parameters are associated with one another according to their positions.
Macro Invocation
• A macro invocation statement (a macro call) gives the name of the macro instruction being invoked and the arguments to be used in expanding the macro.
• The processes of macro invocation and subroutine call are quite different.
  o Statements of the macro body are expanded each time the macro is invoked.
  o Statements of the subroutine appear only once, regardless of how many times the subroutine is called.
• The macro invocation statements are treated as comments, and the statements generated from macro expansion are assembled as though they had been written by the programmer.
Macro Expansion
• Each macro invocation statement will be expanded into the statements that form the body of the macro.
• Arguments from the macro invocation are substituted for the parameters in the macro prototype.
  o The arguments and parameters are associated with one another according to their positions.
  o The first argument in the macro invocation corresponds to the first parameter in the macro prototype, and so on.
• Comment lines within the macro body have been deleted, but comments on individual statements have been retained.
• The macro invocation statement itself has been included as a comment line.

Example of a macro expansion
• In expanding the macro invocation on line 190, the argument F1 is substituted for the parameter &INDEV wherever it occurs in the body of the macro.
• Similarly, BUFFER is substituted for &BUFADR and LENGTH is substituted for &RECLTH.
• Lines 190a through 190m show the complete expansion of the macro invocation on line 190.
• The label on the macro invocation statement (CLOOP) has been retained as a label on the first statement generated in the macro expansion.
• This allows the programmer to use a macro instruction in exactly the same way as an assembler language mnemonic.
• After macro processing, the expanded file can be used as input to the assembler.
• The macro invocation statements will be treated as comments, and the statements generated from the macro expansions will be assembled exactly as though they had been written directly by the programmer.
4.1.1 Macro Processor Algorithm and Data Structures
• It is easy to design a two-pass macro processor in which all macro definitions are processed during the first pass and all macro invocation statements are expanded during the second pass.
• Such a two-pass macro processor would not allow the body of one macro instruction to contain definitions of other macros.

[Figures: Example 1 and Example 2]
• Defining MACROS or MACROX does not define RDBUFF and the other macro instructions. These definitions are processed only when an invocation of MACROS or MACROX is expanded.
• A one-pass macro processor that can alternate between macro definition and macro expansion is able to handle macros like these.
• There are three main data structures involved in our macro processor.

Definition table (DEFTAB)
1. The macro definitions themselves are stored in the definition table (DEFTAB), which contains the macro prototype and the statements that make up the macro body.
2. Comment lines from the macro definition are not entered into DEFTAB because they will not be part of the macro expansion.
3. References to macro instruction parameters are converted to a positional notation.

Name table (NAMTAB)
1. The macro names are entered into NAMTAB, which serves as an index to DEFTAB.
2. For each macro instruction defined, NAMTAB contains pointers to the beginning and end of the definition in DEFTAB.

Argument table (ARGTAB)
1. The third data structure is an argument table (ARGTAB), which is used during the expansion of macro invocations.
2. When a macro invocation statement is recognized, the arguments are stored in ARGTAB according to their position in the argument list.
3. As the macro is expanded, arguments from ARGTAB are substituted for the corresponding parameters in the macro body.
• Positional notation is used for the parameters. The parameter &INDEV has been converted to ?1, and &BUFADR has been converted to ?2.
• When the ?n notation is recognized in a line from DEFTAB, a simple indexing operation supplies the proper argument from ARGTAB.
Algorithm:
• The procedure DEFINE, which is called when the beginning of a macro definition is recognized, makes the appropriate entries in DEFTAB and NAMTAB.
• EXPAND is called to set up the argument values in ARGTAB and expand a macro invocation statement.
• The procedure GETLINE gets the next line to be processed. This line may come from DEFTAB or from the input file, depending upon whether the Boolean variable EXPANDING is set to TRUE or FALSE.
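The interplay of DEFTAB, NAMTAB and ARGTAB can be sketched in a few lines of Python. This is only an illustrative sketch, not the textbook algorithm: source lines are assumed to be (label, opcode, operands) tuples, the function names `process`, `define` and `expand` are hypothetical, and the RDBUFF body is shortened to two statements. Nested definitions, unique labels and conditionals are left out.

```python
import re

def process(source):
    """One-pass macro processing over (label, opcode, operands) tuples."""
    deftab = []    # DEFTAB: macro prototypes and body lines
    namtab = {}    # NAMTAB: macro name -> index range of its body in DEFTAB
    output = []
    lines = iter(source)

    def define(name, params):
        # DEFINE: enter the prototype and body into DEFTAB, rewriting each
        # parameter reference in positional notation (?1, ?2, ...).
        position = {p: f"?{i}" for i, p in enumerate(params, start=1)}
        deftab.append((name, "MACRO", ",".join(params)))
        first = len(deftab)
        for label, opcode, operands in lines:
            if opcode == "MEND":
                break
            operands = re.sub(r"&\w+",
                              lambda m: position.get(m.group(), m.group()),
                              operands)
            deftab.append((label, opcode, operands))
        namtab[name] = (first, len(deftab))

    def expand(label, name, operands):
        # EXPAND: load ARGTAB by position, then substitute ARGTAB[n] for ?n.
        argtab = operands.split(",")
        output.append(("." + label, name, operands))  # invocation kept as a comment
        first, last = namtab[name]
        for i in range(first, last):
            body_label, opcode, body_operands = deftab[i]
            body_operands = re.sub(r"\?(\d+)",
                                   lambda m: argtab[int(m.group(1)) - 1],
                                   body_operands)
            # The label of the invocation goes on the first generated line.
            line_label = (label or body_label) if i == first else body_label
            output.append((line_label, opcode, body_operands))

    for label, opcode, operands in lines:
        if opcode == "MACRO":
            define(label, operands.split(","))
        elif opcode in namtab:
            expand(label, opcode, operands)
        else:
            output.append((label, opcode, operands))
    return output

program = [
    ("RDBUFF", "MACRO", "&INDEV,&BUFADR"),
    ("", "TD", "=X'&INDEV'"),
    ("", "STCH", "&BUFADR,X"),
    ("", "MEND", ""),
    ("CLOOP", "RDBUFF", "F1,BUFFER"),
]
expanded = process(program)
```

Because the definition must be read before the invocation can be looked up in NAMTAB, the sketch also shows why a one-pass processor requires macros to be defined before use.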
4.2 MACHINE INDEPENDENT MACRO PROCESSOR FEATURES
Machine-independent macro processor features are extended features that are not directly related to the architecture of the computer for which the macro processor is written.
4.2.1 Concatenation of Macro Parameters
• Most macro processors allow parameters to be concatenated with other character strings.
• Suppose a program contains several series of variables:
    XA1, XA2, XA3, ...
    XB1, XB2, XB3, ...
• If similar processing is to be performed on each series of variables, the programmer might want to incorporate this processing into a macro instruction.
• The parameter to such a macro instruction could specify the series of variables to be operated on (A, B, C, ...).
• The macro processor constructs the symbols by concatenating X, (A, B, ...), and (1, 2, 3, ...) in the macro expansion.
• Suppose such a parameter is named &ID. The macro body may contain a statement LDA X&ID1, in which &ID is concatenated after the string "X" and before the string "1":
    LDA XA1 (&ID=A)
    LDA XB1 (&ID=B)
• Ambiguity problem: X&ID1 may mean "X" + &ID + "1" or "X" + &ID1. This problem occurs because the end of the parameter is not marked.
• Solution to this ambiguity problem: use a special concatenation operator "→" to specify the end of the parameter:
    LDA X&ID→1
  so that the end of the parameter &ID is clearly identified.
[Figures: macro definition and macro invocation statements]
• The macro processor deletes all occurrences of the concatenation operator immediately after performing parameter substitution, so the operator character will not appear in the macro expansion.
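A minimal sketch of this substitution rule, assuming the concatenation operator is written → and that arguments are held in a dictionary keyed by parameter name (the function name `substitute` is hypothetical):

```python
import re

def substitute(line, args):
    """Replace each &NAME by its argument, then delete the → operators."""
    def replace_parameter(match):
        name = match.group(1)
        # An unknown name (such as ID1 when only ID is defined) is left as is.
        return args.get(name, match.group(0))

    # \w+ stops at →, so the operator marks the end of the parameter name.
    line = re.sub(r"&(\w+)", replace_parameter, line)
    return line.replace("→", "")

first = substitute("LDA X&ID→1", {"ID": "A"})
second = substitute("LDA X&ID→1", {"ID": "B"})
unmarked = substitute("LDA X&ID1", {"ID": "A"})
```

Without the operator, the scanner reads the parameter name as ID1, which is exactly the ambiguity described above.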
4.2.2 Generation of Unique Labels
• Labels in the macro body may cause a "duplicate labels" problem if the macro is invoked and expanded multiple times.
• Use of relative addressing at the source statement level is very inconvenient, error-prone, and difficult to read.
• It is highly desirable to
  1. Let the programmer use labels in the macro body.
     • Labels used within the macro body begin with $.
  2. Let the macro processor generate unique labels for each macro invocation and expansion.
     • During macro expansion, the $ will be replaced with $xx, where xx is a two-character alphanumeric counter of the number of macro instructions expanded.
     • xx = AA, AB, AC, ...
(Consider the definition of WRBUFF)

[Figure: definition of the WRBUFF macro in program COPY; the operands recoverable from the listing are =X'&OUTDEV', *-3 and *-14]
• If a label was placed on the TD instruction on line 135, this label would be defined twice, once for each invocation of WRBUFF.
• This duplicate definition would prevent correct assembly of the resulting expanded program.
• The jump instructions on lines 140 and 155 are written using relative addressing.
RDBUFF definition
• Labels within the macro body begin with the special character $.

Macro expansion
• Unique labels are generated within the macro expansion.
• Each symbol beginning with $ has been modified by replacing $ with $AA.
• The character $ will be replaced by $xx, where xx is a two-character alphanumeric counter of the number of macro instructions expanded.
• For the first macro expansion in a program, xx will have the value AA. For succeeding macro expansions, xx will be set to AB, AC, etc.
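The label-generation rule can be sketched as follows. The notes only give the first values of the counter (AA, AB, AC); the ordering after AZ used here (letters A-Z followed by digits 0-9) is an assumption, and `suffix` and `uniquify` are hypothetical names.

```python
import string

# Assumed ordering of the "alphanumeric" counter digits: A-Z, then 0-9.
ALPHABET = string.ascii_uppercase + string.digits

def suffix(expansion_number):
    """Two-character counter: 0 -> AA, 1 -> AB, 2 -> AC, ..."""
    high, low = divmod(expansion_number, len(ALPHABET))
    return ALPHABET[high] + ALPHABET[low]

def uniquify(body, expansion_number):
    """Replace $ by $xx in every line of one macro expansion."""
    tag = "$" + suffix(expansion_number)
    return [line.replace("$", tag) for line in body]

body = ["$LOOP  TD   =X'F1'", "       JEQ  $LOOP"]
first_expansion = uniquify(body, 0)
second_expansion = uniquify(body, 1)
```

A real macro processor would have to avoid rewriting a $ that appears inside a character literal; the sketch ignores that case.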
4.2.3 Conditional Macro Expansion
• Arguments in a macro invocation can be used to:
  o Substitute the parameters in the macro body without changing the sequence of statements expanded.
  o Modify the sequence of statements for conditional macro expansion (or conditional assembly, when related to the assembler).
• This capability adds greatly to the power and flexibility of a macro language.

Consider the example
[Figure: example of conditional macro expansion, with the macro-time variable and the Boolean expression marked]

• Two additional parameters are used in the example of conditional macro expansion:
  o &EOR: specifies a hexadecimal character code that marks the end of a record
  o &MAXLTH: specifies the maximum length of a record
• A macro-time variable (SET symbol)
  o can be used to store working values during the macro expansion, to store the evaluation result of a Boolean expression, and to control the macro-time conditional structures
  o begins with "&" and is not a macro instruction parameter
  o is initialized to a value of 0
  o is set by a macro processor directive, SET
• Macro-time conditional structures
  o IF-ELSE-ENDIF
  o WHILE-ENDW
4.2.3.1 Implementation of Conditional Macro Expansion (IF-ELSE-ENDIF Structure)
• A symbol table is maintained by the macro processor.
  o This table contains the values of all macro-time variables used.
  o Entries in this table are made or modified when SET statements are processed.
  o This table is used to look up the current value of a macro-time variable whenever it is required.
• The testing of the condition and looping are done while the macro is being expanded.
• When an IF statement is encountered during the expansion of a macro, the specified Boolean expression is evaluated. If the value is
  o TRUE: the macro processor continues to process lines from DEFTAB until it encounters the next ELSE or ENDIF statement. If ELSE is encountered, it then skips to ENDIF.
  o FALSE: the macro processor skips ahead in DEFTAB until it finds the next ELSE or ENDIF statement.
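The IF-ELSE-ENDIF handling can be sketched as a single scan over the body lines with a "skipping" flag. This is a simplified sketch under stated assumptions: body lines are (label, opcode, operands) tuples, conditions have the form (a EQ b) or (a NE b) with no spaces inside the operands, IF structures are not nested, and `evaluate` and `expand_if` are hypothetical names.

```python
def evaluate(condition, symtab):
    """Evaluate '(left EQ right)' or '(left NE right)' as a string comparison."""
    left, relation, right = condition.strip("()").split()
    left = symtab.get(left, left)      # look up macro-time variables
    right = symtab.get(right, right)
    return left == right if relation == "EQ" else left != right

def expand_if(body, symtab):
    """Expand body lines, honouring SET and a non-nested IF/ELSE/ENDIF."""
    output = []
    skipping = False
    for label, opcode, operands in body:
        if opcode == "IF":
            skipping = not evaluate(operands, symtab)
        elif opcode == "ELSE":
            skipping = not skipping    # TRUE branch done, or FALSE branch starts
        elif opcode == "ENDIF":
            skipping = False
        elif skipping:
            continue
        elif opcode == "SET":
            symtab[label] = operands   # entry made or modified in the symbol table
        else:
            output.append((label, opcode, operands))
    return output

body = [
    ("&EORCK", "SET", "1"),
    ("", "IF", "(&EORCK EQ 1)"),
    ("", "RMO", "A,S"),
    ("", "ELSE", ""),
    ("", "CLEAR", "S"),
    ("", "ENDIF", ""),
    ("", "CLEAR", "X"),
]
true_branch = expand_if(body, {})
false_branch = expand_if([("&EORCK", "SET", "0")] + body[1:], {})
```

The IF, ELSE, ENDIF and SET lines never reach the output: they are consumed by the macro processor at expansion time.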
4.2.3.2 Implementation of Conditional Macro Expansion (WHILE-ENDW Structure)
• When a WHILE statement is encountered during the expansion of a macro, the specified Boolean expression is evaluated. If the value is
  o TRUE: the macro processor continues to process lines from DEFTAB until it encounters the next ENDW statement. When ENDW is encountered, the macro processor returns to the preceding WHILE, re-evaluates the Boolean expression, and takes action again.
  o FALSE: the macro processor skips ahead in DEFTAB until it finds the next ENDW statement and then resumes normal macro expansion.
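A WHILE-ENDW loop can be sketched in the same style by keeping an index into the body lines and jumping back at ENDW. The sketch assumes integer macro-time variables, SET operands that are sums such as &CTR+1, conditions of the form (a LE b), and no nested loops; the names `term` and `expand_while` are hypothetical.

```python
import operator

RELATIONS = {"EQ": operator.eq, "NE": operator.ne, "LT": operator.lt,
             "LE": operator.le, "GT": operator.gt, "GE": operator.ge}

def term(text, symtab):
    """Value of a macro-time variable, or of an integer constant."""
    return symtab[text] if text in symtab else int(text)

def expand_while(body, symtab):
    """Expand body lines, honouring SET and a non-nested WHILE/ENDW."""
    output = []
    i = 0
    loop_start = None
    while i < len(body):
        label, opcode, operands = body[i]
        if opcode == "WHILE":
            left, relation, right = operands.strip("()").split()
            if RELATIONS[relation](term(left, symtab), term(right, symtab)):
                loop_start = i              # remember where to come back to
            else:
                while body[i][1] != "ENDW":  # skip ahead to the next ENDW
                    i += 1
        elif opcode == "ENDW":
            i = loop_start                   # return to the preceding WHILE
            continue
        elif opcode == "SET":
            symtab[label] = sum(term(t, symtab) for t in operands.split("+"))
        else:
            for name, value in symtab.items():
                operands = operands.replace(name, str(value))
            output.append((label, opcode, operands))
        i += 1
    return output

body = [
    ("&CTR", "SET", "1"),
    ("", "WHILE", "(&CTR LE 3)"),
    ("", "WORD", "&CTR"),
    ("&CTR", "SET", "&CTR+1"),
    ("", "ENDW", ""),
    ("", "RESW", "1"),
]
generated = expand_while(body, {})
```

The loop runs entirely at macro-expansion time; the assembler only sees the three generated WORD statements followed by RESW.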
4.2.4 Keyword Macro Parameters
• Positional parameters
  o Parameters and arguments are associated according to their positions in the macro prototype and invocation. The programmer must specify the arguments in the proper order.
  o If an argument is to be omitted, a null argument should be used to maintain the proper order in the macro invocation statement.
  o For example: suppose a macro instruction GENER has 10 possible parameters, but in a particular invocation of the macro only the 3rd and 9th parameters are to be specified.
  o The statement is GENER ,,DIRECT,,,,,,3.
  o This is not suitable if a macro has a large number of parameters and only a few of these are given values in a typical invocation.
• Keyword parameters
  o Each argument value is written with a keyword that names the corresponding parameter.
  o Arguments may appear in any order.
  o Null arguments no longer need to be used.
  o If the 3rd parameter is named &TYPE and the 9th parameter is named &CHANNEL, the macro invocation would be GENER TYPE=DIRECT,CHANNEL=3.
  o It is easier to read and much less error-prone than the positional method.

Consider the example
[Figure: macro definition and invocation using keyword parameters]
• Here each parameter name is followed by an equal sign, which identifies a keyword parameter, and a default value is specified for some of the parameters.
• Here the value of &INDEV is specified as F3 and the value of &EOR is specified as null.
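Binding keyword arguments amounts to filling ARGTAB with the defaults from the prototype and then overriding the entries named in the invocation. The sketch below assumes the prototype and invocation are given as comma-separated strings; the default values shown (F1, 04, 4096) are illustrative, and `bind_keyword_arguments` is a hypothetical name.

```python
def bind_keyword_arguments(prototype, invocation):
    """Build ARGTAB from 'name=default' parameters and 'keyword=value' arguments."""
    argtab = {}
    for item in prototype.split(","):
        name, _, default = item.partition("=")
        argtab[name.lstrip("&")] = default       # start from the default value
    for item in invocation.split(","):
        if not item:
            continue                              # invocation with no arguments
        keyword, _, value = item.partition("=")
        if keyword not in argtab:
            raise KeyError(f"unknown keyword parameter: {keyword}")
        argtab[keyword] = value                   # may be null, e.g. EOR=
    return argtab

argtab = bind_keyword_arguments(
    "&INDEV=F1,&BUFADR=,&RECLTH=,&EOR=04,&MAXLTH=4096",
    "RECLTH=LENGTH,BUFADR=BUFFER,EOR=,INDEV=F3",
)
```

The arguments are written in a different order from the prototype, &EOR is explicitly set to null, and &MAXLTH keeps its default because the invocation does not mention it.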
4.3 MACROPROCESSOR DESIGN OPTIONS
4.3.1 Recursive Macro Expansion
• RDCHAR:
  o reads one character from a specified device into register A
  o should be defined beforehand (i.e., before RDBUFF)

Implementation of Recursive Macro Expansion
• The previous macro processor design cannot handle this kind of recursive macro invocation and expansion, e.g., RDBUFF BUFFER,LENGTH,F1
• Reasons:
  1) The procedure EXPAND would be called recursively, so the invocation arguments in ARGTAB would be overwritten.
  2) The Boolean variable EXPANDING would be set to FALSE when the "inner" macro expansion is finished; that is, the macro processor would forget that it had been in the middle of expanding an "outer" macro.
  3) A similar problem would occur with PROCESSLINE, since this procedure too would be called recursively.
• Solutions:
  1) Write the macro processor in a programming language that allows recursive calls, so that local variables are retained.
  2) Use a stack to take care of pushing and popping local variables and return addresses.
• Another problem: can a macro invoke itself recursively?
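The first solution can be sketched directly, because a language with recursive calls gives every call of the expansion routine its own local ARGTAB. In this sketch the macro table layout (name mapped to a parameter list and a body of (opcode, operands) pairs) and the shortened macro bodies are assumptions made for illustration.

```python
def expand(name, args, macros, output):
    """Expand one invocation; nested invocations recurse with their own ARGTAB."""
    params, body = macros[name]
    argtab = dict(zip(params, args))    # local to this call, so an inner
                                        # expansion cannot overwrite it
    for opcode, operands in body:
        for param, arg in argtab.items():
            operands = operands.replace(param, arg)
        if opcode in macros:
            expand(opcode, operands.split(","), macros, output)
        else:
            output.append((opcode, operands))

macros = {
    "RDCHAR": (["&IN"], [("TD", "=X'&IN'"), ("RD", "=X'&IN'")]),
    "RDBUFF": (["&BUFADR", "&RECLTH", "&INDEV"],
               [("CLEAR", "X"), ("RDCHAR", "&INDEV"), ("STCH", "&BUFADR,X")]),
}
generated = []
expand("RDBUFF", ["BUFFER", "LENGTH", "F1"], macros, generated)
```

After the inner RDCHAR expansion returns, the outer call still substitutes BUFFER for &BUFADR, which is exactly what a single global ARGTAB would get wrong.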
4.3.2 One-Pass Macro Processor
• A one-pass macro processor that alternates between macro definition and macro expansion in a recursive way is able to handle recursive macro definition.
• Because of the one-pass structure, the definition of a macro must appear in the source program before any statements that invoke that macro.

Handling Recursive Macro Definition
• In the DEFINE procedure:
  o When a macro definition is being entered into DEFTAB, the normal approach is to continue until an MEND directive is reached.
  o This would not work for recursive macro definition, because the first MEND encountered in the inner macro would terminate the whole macro definition process.
  o To solve this problem, a counter LEVEL is used to keep track of the level of macro definitions.
    - Increase LEVEL by 1 each time a MACRO directive is read.
    - Decrease LEVEL by 1 each time a MEND directive is read.
    - A MEND can terminate the whole macro definition process only when LEVEL reaches 0.
  o This process is very much like matching left and right parentheses when scanning an arithmetic expression.
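The LEVEL counter can be sketched as follows, assuming the outer MACRO line has already been read and the remaining input is an iterator of (label, opcode, operands) tuples (the function name `read_definition` is hypothetical):

```python
def read_definition(lines):
    """Collect the body up to the MEND that closes the outermost MACRO."""
    level = 1                      # the outer MACRO directive has been read
    body = []
    for label, opcode, operands in lines:
        if opcode == "MACRO":
            level += 1             # a nested definition starts
        elif opcode == "MEND":
            level -= 1
            if level == 0:
                return body        # only this MEND ends the whole definition
        body.append((label, opcode, operands))
    raise ValueError("MEND missing for macro definition")

remaining = iter([
    ("RDBUFF", "MACRO", "&INDEV"),
    ("", "TD", "=X'&INDEV'"),
    ("", "MEND", ""),
    ("WRBUFF", "MACRO", "&OUTDEV"),
    ("", "TD", "=X'&OUTDEV'"),
    ("", "MEND", ""),
    ("", "MEND", ""),              # closes the outer macro (e.g. MACROX)
    ("", "END", ""),
])
outer_body = read_definition(remaining)
next_line = next(remaining)
```

The inner MACRO and MEND lines are stored as ordinary body lines, so the inner macros become defined only when the outer macro is later expanded.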
4.3.3 Two-Pass Macro Processor
• Two-pass macro processor
  o Pass 1: process macro definitions
  o Pass 2: expand all macro invocation statements
• Problem
  o This kind of macro processor cannot allow recursive macro definition, that is, a macro body that contains definitions of other macros (because all macros would have to be defined during the first pass, before any macro invocations were expanded).

Example of Recursive Macro Definition
• MACROS (for SIC)
  o Contains the definitions of RDBUFF and WRBUFF written in SIC instructions.
• MACROX (for SIC/XE)
  o Contains the definitions of RDBUFF and WRBUFF written in SIC/XE instructions.
• A program that is to be run on a SIC system could invoke MACROS, whereas a program to be run on SIC/XE can invoke MACROX.
• Defining MACROS or MACROX does not define RDBUFF and WRBUFF. These definitions are processed only when an invocation of MACROS or MACROX is expanded.
4.3.4 General-Purpose Macro Processors
Goal
• Macro processors that do not depend on any particular programming language, but can be used with a variety of different languages.

Advantages
• Programmers do not need to learn many macro languages.
• Although its development costs are somewhat greater than those for a language-specific macro processor, this expense does not need to be repeated for each language, which saves substantial overall cost.

Disadvantages
• A large number of details must be dealt with in a real programming language:
  • Situations in which normal macro parameter substitution should not occur, e.g., comments
  • Facilities for grouping together terms, expressions, or statements
  • Tokens, e.g., identifiers, constants, operators, keywords
  • Syntax
4.3.5 Macro Processing within Language Translators
Macro processors can be
1) Preprocessors
  o Process macro definitions.
  o Expand macro invocations.
  o Produce an expanded version of the source program, which is then used as input to an assembler or compiler.
2) Line-by-line macro processors
  o Used as a sort of input routine for the assembler or compiler.
  o Read the source program.
  o Process macro definitions and expand macro invocations.
  o Pass output lines to the assembler or compiler.
3) Integrated macro processors

4.3.5.1 Line-by-Line Macro Processor
Benefits
• It avoids making an extra pass over the source program.
• Data structures required by the macro processor and the language translator can be combined (e.g., OPTAB and NAMTAB).
• Utility subroutines can be used by both the macro processor and the language translator:
  o Scanning input lines
  o Searching tables
  o Data format conversion
• It is easier to give diagnostic messages related to the source statements.

4.3.5.2 Integrated Macro Processor
• An integrated macro processor can potentially make use of any information about the source program that is extracted by the language translator.
• As an example, in FORTRAN:
    DO 100 I = 1,20 is a DO statement
      - DO: keyword
      - 100: statement number
      - I: variable name
    DO 100 I = 1 is an assignment statement
      - DO100I: variable (blanks are not significant in FORTRAN)
• An integrated macro processor can support macro instructions that depend upon the context in which they occur.

Drawbacks of Line-by-Line or Integrated Macro Processors
• They must be specially designed and written to work with a particular implementation of an assembler or compiler.
• The cost of macro processor development is added to the cost of the language translator, which results in more expensive software.
• The assembler or compiler will be considerably larger and more complex.
UNIT V TEXT EDITORS
OVERVIEW OF THE EDITING PROCESS

An interactive editor is a computer program that allows a user to create and revise a target document. The term document includes objects such as computer programs, texts, equations, tables, diagrams, line art and photographs: anything that one might find on a printed page. A text editor is one in which the primary elements being edited are character strings of the target text.

The document editing process is an interactive user-computer dialogue designed to accomplish four tasks:
1) Select the part of the target document to be viewed and manipulated.
2) Determine how to format this view on-line and how to display it.
3) Specify and execute operations that modify the target document.
4) Update the view appropriately.

Traveling - Selection of the part of the document to be viewed and edited. It involves first traveling through the document to locate the area of interest, with operations such as "next screenful", "bottom", and "find pattern". Traveling specifies where the area of interest is.

Filtering - The selection of what is to be viewed and manipulated is controlled by filtering. Filtering extracts the relevant subset of the target document at the point of interest, such as the next screenful of text or the next statement.

Formatting - Formatting determines how the result of filtering will be seen as a visible representation (the view) on a display screen or other device.

Editing - In the actual editing phase, the target document is created or altered with a set of operations such as insert, delete, replace, move or copy. Manuscript-oriented editors operate on elements such as single characters, words, lines, sentences and paragraphs; program-oriented editors operate on elements such as identifiers, keywords and statements.

THE USER INTERFACE OF AN EDITOR
The user of an interactive editor is presented with a conceptual model of the editing system. The model is an abstract framework on which the editor and the world in which it operates are based.

The line editors simulated the world of the keypunch; they allowed operations on numbered sequences of 80-character card-image lines.

The screen editors define a world in which a document is represented as a quarter-plane of text lines, unbounded both down and to the right. The user sees, through a cutout, only a rectangular subset of this plane on a multi-line display terminal. The cutout can be moved left or right, and up or down, to display other portions of the document.

The user interface is also concerned with the input devices, the output devices, and the interaction language of the system.

INPUT DEVICES
The input devices are used to enter elements of the text being edited, to enter commands, and to designate editable elements. Input devices are categorized as:
1) Text devices
2) Button devices
3) Locator devices

1) Text or string devices are typically typewriter-like keyboards on which the user presses and releases keys, sending a unique code for each key. Virtually all computer keyboards are of the QWERTY type.

2) Button or choice devices generate an interrupt or set a system flag, usually causing an invocation of an associated application program. Special function keys are also available on the keyboard. Alternatively, buttons can be simulated in software by displaying text strings or symbols on the screen. The user chooses a string or symbol instead of pressing a button.

3) Locator devices are two-dimensional analog-to-digital converters that position a cursor symbol on the screen by observing the user's movement of the device. The most common such devices are the mouse and the tablet.

The data tablet is a flat, rectangular, electromagnetically sensitive panel. Either a ballpoint-pen-like stylus or a puck, a small device similar to a mouse, is moved over the surface. The tablet returns to a system program the coordinates of the position on the data tablet at which the stylus or puck is currently located. The program can then map these data-tablet coordinates to screen coordinates and move the cursor to the corresponding screen position.

Text devices with arrow (cursor) keys can be used to simulate locator devices. Each of these keys shows an arrow that points up, down, left or right. Pressing an arrow key typically generates an appropriate character sequence; the program interprets this sequence and moves the cursor in the direction of the arrow on the key pressed.

VOICE-INPUT DEVICES
Voice-input devices, which translate spoken words to their textual equivalents, may prove to be the text input devices of the future. Voice recognizers are currently available for command input on some systems.

OUTPUT DEVICES
The output devices let the user view the elements being edited and the results of the editing operations.

The first output devices were teletypewriters and other character-printing terminals that generated output on paper. Next came "glass teletypes" based on cathode ray tube (CRT) technology, which used the CRT screen essentially to simulate the hard-copy teletypewriter. Today's advanced CRT terminals use hardware assistance for such features as moving the cursor, inserting and deleting characters and lines, and scrolling lines and pages. Modern professional workstations are based on personal computers with high-resolution displays; they support multiple proportionally spaced character fonts to produce realistic facsimiles of hard-copy documents.
INTERACTION LANGUAGE
The interaction language of the text editor is generally one of several common types.

The typing-oriented or text command-oriented method
It is the oldest of the major editing interfaces. The user communicates with the editor by typing text strings both for command names and for operands. These strings are sent to the editor and are usually echoed to the output device. Typed specification often requires the user to remember the exact form of all commands, or at least their abbreviations. If the command language is complex, the user must continually refer to a manual or an on-line Help function. The typing required can be time-consuming for inexperienced users.

Function key interfaces
Each command is associated with a marked key on the keyboard. This eliminates much typing. E.g.: Insert key, Shift key, Control key.
Disadvantages:
• Too many unique keys are needed
• Multiple-keystroke commands

Menu-oriented interface
A menu is a multiple-choice set of text strings or icons, which are graphical symbols that represent objects or operations. The user can perform actions by selecting items from the menus. The editor prompts the user with a menu. One problem with a menu-oriented system can arise when there are many possible actions and several choices are required to complete an action. The display area of the menu is rather limited.
Most text editors have a similar structure, built from the components described below.

The Command Language Processor
It accepts input from the user's input devices and analyzes the tokens and syntactic structure of the commands. It functions much like the lexical and syntactic phases of a compiler. The command language processor may invoke the semantic routines directly. In a text editor, these semantic routines perform functions such as editing and viewing. The semantic routines involve traveling, editing, viewing and display functions. Editing operations are always specified by the user, and display operations are specified implicitly by the other three categories of operations. Traveling and viewing operations may be invoked either explicitly by the user or implicitly by the editing operations.

Editing Component
In editing a document, the start of the area to be edited is determined by the current editing pointer maintained by the editing component, which is the collection of modules dealing with editing tasks. The current editing pointer can be set or reset explicitly by the user using traveling commands, such as next paragraph and next screen, or implicitly as a side effect of the previous editing operation, such as delete paragraph.

Traveling Component
The traveling component of the editor actually performs the setting of the current editing and viewing pointers, and thus determines the point at which the viewing and/or editing filtering begins.

Viewing Component
The start of the area to be viewed is determined by the current viewing pointer. This pointer is maintained by the viewing component of the editor, which is a collection of modules responsible for determining the next view. The current viewing pointer can be set or reset explicitly by the user or implicitly by the system as a result of the previous editing operation. The viewing component formulates an ideal view, often expressed in a device-independent intermediate representation. This view may be a very simple one consisting of a window's worth of text arranged so that lines are not broken in the middle of words.

Display Component
It takes the idealized view from the viewing component and maps it to a physical output device in the most efficient manner. The display component produces a display by mapping the buffer to a rectangular subset of the screen, usually called a window.

Editing Filter
Filtering consists of the selection of contiguous characters beginning at the current point. The editing filter filters the document to generate a new editing buffer based on the current editing pointer as well as on the editing filter parameters.

Editing Buffer
It contains the subset of the document filtered by the editing filter, based on the editing pointer and the editing filter parameters.

Viewing Filter
When the display needs to be updated, the viewing component invokes the viewing filter. This component filters the document to generate a new viewing buffer based on the current viewing pointer as well as on the viewing filter parameters.

Viewing Buffer
It contains the subset of the document filtered by the viewing filter, based on the viewing pointer and the viewing filter parameters.

E.g., the user of a certain editor might travel to line 75 and, after viewing it, decide to change all occurrences of "ugly duckling" to "swan" in lines 1 through 50 of the file by using a change command such as

[1,50] c/ugly duckling/swan/

As a part of the editing command there is implicit travel to the first line of the file. Lines 1 through 50 are then filtered from the document to become the editing buffer. Successive substitutions take place in this editing buffer without corresponding updates of the view.

In line editors, the viewing buffer may contain the current line; in screen editors, this buffer may contain a rectangular cutout of the quarter-plane of text. This viewing buffer is then passed to the display component of the editor, which produces a display by mapping the buffer to a rectangular subset of the screen, usually called a window.

The editing and viewing buffers, while independent, can be related in many ways. In the simplest case, they are identical: the user edits the material directly on the screen. On the other hand, the editing and viewing buffers may be completely disjoint.
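The relationship between the editing filter, the editing buffer and the viewing filter in this example can be sketched as follows. It is only an illustration under simple assumptions: the document is a list of lines held in memory, line numbers are 1-based, and the function names `change` and `view` are hypothetical.

```python
def change(document, first, last, old, new):
    """Apply [first,last] c/old/new/ and return the updated document."""
    # Editing filter: select lines first..last into the editing buffer.
    editing_buffer = document[first - 1:last]
    # Successive substitutions take place in the editing buffer only.
    editing_buffer = [line.replace(old, new) for line in editing_buffer]
    return document[:first - 1] + editing_buffer + document[last:]

def view(document, viewing_pointer, height):
    """Viewing filter: the window's worth of lines starting at the pointer."""
    return document[viewing_pointer - 1:viewing_pointer - 1 + height]

document = [f"line {n}: ugly duckling" for n in range(1, 101)]
document = change(document, 1, 50, "ugly duckling", "swan")
window = view(document, 75, 2)
```

Here the editing buffer (lines 1 to 50) and the viewing buffer (starting at line 75) are disjoint, so the change does not alter what the user currently sees.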
Windows typically cover the entire screen or a rectangular portion of it. Mapping viewing buffers to windows that cover only part of the screen is especially useful for editors on modern graphics-based workstations. Such systems can support multiple windows, simultaneously showing different portions of the same file or portions of different files. This approach allows the user to perform inter-file editing operations much more effectively than with a system that has only a single window.

The mapping of the viewing buffer to a window is accomplished by two components of the system.
(i) First, the viewing component formulates an ideal view, often expressed in a device-independent intermediate representation. This view may be a very simple one consisting of a window's worth of text arranged so that lines are not broken in the middle of words. At the other extreme, the idealized view may be a facsimile of a page of fully formatted and typeset text with equations, tables and figures.
(ii) Second, the display component takes this idealized view from the viewing component and maps it to a physical output device in the most efficient manner possible.

The components of the editor deal with a user document on two levels: (i) in main memory and (ii) in the disk file system. Loading an entire document into main memory may be infeasible. However, if only part of a document is loaded and if many user-specified operations require a disk read by the editor to locate the affected portions, editing might be unacceptably slow. In some systems this problem is solved by mapping the entire file into virtual memory and letting the operating system perform efficient demand paging.

An alternative is to provide editor paging routines, which read one or more logical portions of a document into memory as needed. Such portions are often termed pages, although there is usually no relationship between these pages and the hard-copy document pages or virtual memory pages. These pages remain resident in main memory until a user operation requires that another portion of the document be loaded.

Editors function in three basic types of computing environment:
(i) Time-sharing environment
(ii) Stand-alone environment
(iii) Distributed environment
Each type of environment imposes some constraint on the design of an editor.

The Time-Sharing Environment
The time-sharing editor must function swiftly within the context of the load on the computer's processor, central memory and I/O devices.

The Stand-Alone Environment
The editor on a stand-alone system must have access to the functions that the time-sharing editor obtains from its host operating system. These may be provided in part by a small local operating system, or they may be built into the editor itself if the stand-alone system is dedicated to editing.

The Distributed Environment
The editor operating in a distributed resource-sharing local network must, like a stand-alone editor, run independently on each user's machine and must, like a time-sharing editor, contend for shared resources such as files.
INTERACTIVE DEBU##IN# SYSTEMS $n interacti0e de#ugging system pro0ides programmers with facilities that aid in testing and de#ugging of programs interacti0ely! DEBU##IN# FUNCTIONS AND CAPABILITIES 'ecution seuencing> It is the o#ser0ation and control of the flow of program eecution! ;or eample2 the program may #e halted after a fied num#er of instructions are eecuted! B&(*/$5. – The programmer may define #rea* points which cause eecution to #e suspended2 when a specified point in the program is reached! $fter eecution is suspended2 the de#ugging command is used to analyAe the progress of the program and to diagnose errors detected! 'ecution of the program can then #e remo0ed! C$5:.$5* E?/&($5 – %rogrammers can define some conditional epressions2 e0aluated during the de#ugging session2 program eecution is suspended2 when conditions are met2 analysis is made2 later eecution is resumed #*.- +i0en a good graphical representation of program progress may e0en #e useful in running the program in 0arious speeds called gaits! $ De#ugging system should also pro0ide functions such as tracing and trace#ac*! Tracing can #e used to trac* the flow of eecution logic and data modifications! The control flow can #e traced at different le0els of detail – procedure2 #ranch2 indi0idual instruction2 and so onY T&*'(,*' can show the path #y which the current statement in the program was reached! It can also show which statements ha0e modified a gi0en 0aria#le or parameter! The statements are displayed rather than as headecimal displacements! %rogram-display Capa#ilities It is also important for a de#ugging system to ha0e good program display capa#ilities! It must #e possi#le to display the program #eing de#ugged2 complete with statement num#ers! Multilingual Capa#ility $ de#ugging system should consider the language in which the program #eing de#ugged is written! Most user en0ironments and many applications systems in0ol0e the use of different programming languages! 
A single debugging tool should be usable in multilingual situations.

Context Effects
The context in use has many different effects on the debugging interaction. For example, the form of an assignment statement depends on the language:
COBOL - MOVE 6.5 TO X
FORTRAN - X = 6.5
Likewise, conditional statements should use the notation of the source language:
COBOL - IF A NOT EQUAL TO B
FORTRAN - IF (A .NE. B)
Similar differences exist with respect to the form of statement labels, keywords, and so on.

Display of Source Code
The language translator may provide the source code or source listing tagged in some standard way, so that the debugger has a uniform method of navigating about it.
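The functions described in this section (halting after a fixed number of statements, breakpoints, conditional expressions, tracing, traceback, and language-dependent assignment commands) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the class `ToyDebugger`, its method names, and the toy program format (a list of `(target, fn)` pairs) are hypothetical and do not correspond to any real debugger.

```python
import re


class ToyDebugger:
    """Debugger for a toy program given as a list of (target, fn) statements.

    Each statement assigns fn(variables) to the variable named `target`.
    Statement numbers are 1-based, as in a source listing.
    """

    def __init__(self, program, language="FORTRAN"):
        self.program = program
        self.language = language  # source language of the debugged program
        self.vars = {}
        self.pc = 0               # index of the next statement to execute
        self.breakpoints = set()  # statement numbers to stop in front of
        self.condition = None     # optional predicate over the variables
        self.stopped_at = None    # breakpoint we are currently suspended at
        self.trace = []           # (statement number, variable, new value)

    def run(self, max_steps=None):
        """Start or resume execution; return the reason it was suspended."""
        steps = 0
        while self.pc < len(self.program):
            number = self.pc + 1
            # Suspend in front of a breakpoint, unless resuming from it.
            if number in self.breakpoints and self.stopped_at != number:
                self.stopped_at = number
                return f"breakpoint at statement {number}"
            self.stopped_at = None
            target, fn = self.program[self.pc]
            self.vars[target] = fn(self.vars)
            self.trace.append((number, target, self.vars[target]))  # tracing
            self.pc += 1
            steps += 1
            if self.condition is not None and self.condition(self.vars):
                return f"condition met after statement {number}"
            if max_steps is not None and steps >= max_steps:
                return f"halted after {steps} statements"
        return "program finished"

    def modified_by(self, name):
        """Traceback: numbers of the statements that modified `name`."""
        return [number for number, var, _ in self.trace if var == name]

    def assign(self, command):
        """Change a variable using the notation of the source language."""
        text = command.strip()
        if self.language == "COBOL":
            match = re.fullmatch(r"MOVE\s+(\S+)\s+TO\s+(\w+)", text)
            order = ("value", "name")
        else:  # FORTRAN-style assignment
            match = re.fullmatch(r"(\w+)\s*=\s*(\S+)", text)
            order = ("name", "value")
        if match is None:
            raise ValueError(f"not a valid {self.language} assignment: {command!r}")
        parts = dict(zip(order, match.groups()))
        self.vars[parts["name"]] = float(parts["value"])


program = [
    ("X", lambda v: 1.0),              # 1: X = 1
    ("Y", lambda v: v["X"] + 1),       # 2: Y = X + 1
    ("X", lambda v: v["X"] * 10),      # 3: X = X * 10
    ("Z", lambda v: v["X"] + v["Y"]),  # 4: Z = X + Y
]
dbg = ToyDebugger(program, language="COBOL")
dbg.breakpoints.add(3)
first_stop = dbg.run()        # suspended in front of statement 3
dbg.assign("MOVE 6.5 TO X")   # COBOL notation; FORTRAN would be "X = 6.5"
second_stop = dbg.run()       # resumes and runs to completion
print(first_stop, "|", second_stop, "|", dbg.vars, "|", dbg.modified_by("X"))
```

Keeping the trace as a list of (statement number, variable, value) records lets the same data serve both tracing and traceback, and reports results in terms of statement numbers rather than machine addresses.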
Optimization
It is also important that a debugging system be able to deal with optimized code. Many optimizations involve the rearrangement of segments of code in the program. For example:
- invariant expressions can be removed from loops
- separate loops can be combined into a single loop
- redundant expressions may be eliminated
- unnecessary branch instructions may be eliminated
The debugging of optimized code requires a substantial amount of cooperation from the optimizing compiler.

Relationship with Other Parts of the System
An interactive debugger must be related to other parts of the system in many different ways.

Availability
The interactive debugger must appear to be a part of the run-time environment and an integral part of the system. When an error is discovered, immediate debugging must be possible, because it may be difficult or impossible to reproduce the program failure in some other environment or at some other time.

Consistency with security and integrity components
Users need to be able to debug in a production environment. When an application fails during a production run, work dependent on that application stops. Since the production environment is often quite different from the test environment, many program failures cannot be repeated outside the production environment.
The debugger must also exist in a way that is consistent with the security and integrity components of the system. Use of the debugger must be subject to the normal authorization mechanisms and must leave the usual audit trails. An unauthorized user must not be able to access any data or code. It must not be possible to use the debugger to interfere with any aspect of system integrity.

Coordination with existing and future systems
The debugger must coordinate its activities with those of existing and future language compilers and interpreters. It is assumed that the debugging facilities in existing languages will continue to exist and be maintained. The requirement for a cross-language debugger assumes that such a facility would be installed as an alternative to the individual language debuggers.
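The requirement that debugger use pass through the normal authorization mechanism and leave an audit trail can be sketched as follows. This is only an illustration under stated assumptions: the permission table `PERMISSIONS`, the user names, the target name, and the function `debug_request` are all hypothetical. A real system would rely on the operating system's own authorization and logging services.

```python
from datetime import datetime, timezone

# Hypothetical permission table: which debugging actions each user may perform.
PERMISSIONS = {"alice": {"inspect", "modify"}, "bob": {"inspect"}}

audit_trail = []  # every request is recorded, whether or not it is allowed


def debug_request(user, action, target):
    """Check authorization for a debugging action and record the attempt."""
    allowed = action in PERMISSIONS.get(user, set())
    audit_trail.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "target": target,
        "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{user} may not {action} {target}")
    return f"{user}: {action} {target}"


debug_request("bob", "inspect", "PAYROLL.TOTAL")
try:
    debug_request("bob", "modify", "PAYROLL.TOTAL")
except PermissionError as err:
    print("refused:", err)
print(len(audit_trail), "audit entries")
```

Recording the attempt before the permission check fails means that refused requests also appear in the audit trail.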
USER-INTERFACE CRITERIA
The interactive debugging system should be user friendly. Its facilities should be organized into a few basic categories of functions, which should closely reflect common user tasks.

Full-screen displays and windowing systems
The user interaction should make use of full-screen displays and windowing systems. The advantage of such an interface is that information can be displayed and changed easily and quickly.
Menus
With menus and full-screen editors, the user has far less information to enter and remember. It should be possible to go directly to a menu without having to retrace an entire hierarchy. When a full-screen terminal device is not available, the user should be able to perform equivalent actions through commands in a linear debugging language.
Command language
The command language should have a clear, logical, simple syntax. Parameter names should be consistent across the set of commands.
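As a rough illustration of a simple command syntax with consistent parameter names, the sketch below parses commands of the form `VERB NAME=value`. The command set (`BREAK`, `TRACE`, `DISPLAY`) and the parameter names (`AT`, `VAR`) are assumptions invented for this example, not the syntax of any particular debugger.

```python
# Hypothetical command set. The parameter AT always means a statement number
# and VAR always means a variable name, whichever command they appear in.
COMMANDS = {
    "BREAK": {"AT"},
    "TRACE": {"AT", "VAR"},
    "DISPLAY": {"AT", "VAR"},
}


def parse_command(line):
    """Parse 'VERB NAME=value ...' into (verb, {NAME: value})."""
    words = line.split()
    if not words:
        raise ValueError("empty command")
    verb = words[0].upper()
    if verb not in COMMANDS:
        raise ValueError(f"unknown command: {verb}")
    params = {}
    for word in words[1:]:
        name, sep, value = word.partition("=")
        name = name.upper()
        if not sep or not value or name not in COMMANDS[verb]:
            raise ValueError(f"bad parameter for {verb}: {word}")
        # AT is numeric in every command; other parameters stay as text.
        params[name] = int(value) if name == "AT" else value
    return verb, params


print(parse_command("break at=12"))
print(parse_command("DISPLAY VAR=TOTAL AT=12"))
```

Because `AT` and `VAR` carry the same meaning in every command, a user who has learned one command can guess the parameters of the others.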