Table of Contents

Overview
Basic Theory
First Program - Hello World
Adding two single digit numbers
Average of single digit numbers
Comparing two single characters
Multiple digits and strings
Entering a double digit number
Add two digit numbers
Printing Alphabets
Compare input string
Reverse a string
Palindrome
Date
Cursor movement
Order of alphabets
Copying strings
Directory lookup
Strings
Case Conversion
Files
Appendix I - Debug
Appendix II - Interrupts
Overview

Assembly is a foundational programming language. It is an essential component of learning computer architecture and organization. Assembly language helps one understand the workings of higher-level languages such as C++ and Java. The basic requirement for this book is an understanding of how a computer works in general. There is no other prerequisite for this language. The book is intended as a practical laboratory guide for computer science and should therefore be considered supplementary in nature.

Programs are best understood and appreciated when they are actually typed. For the code in this book one can use the TASM editor. I have run this code on a 32-bit computer using TASM. TASM for 64-bit computers is available for free on the internet. In addition to TASM for 64-bit, one may have to download DOSBox to run the scripts shown in this practical guide. Those using 32-bit computers can run TASM without DOSBox.

The website for my books: http://beam.to/fjtbooks
Before we start:

Check whether the computer is a 32-bit or a 64-bit one by right-clicking on “This PC” on the desktop. The scripts in this book were tested on a machine with Windows 10 installed. Download a copy of TASM that matches the computer. If it is a 32-bit computer, copy TASM to a folder on the desktop. For a 64-bit computer, follow the installation instructions. In addition, check for a Notepad editor to type the actual programs. The online editor of TASM will also do, but I generally use Notepad.

The program is saved with the extension ".asm", either in Notepad or in the online editor of TASM. The program file should be saved in the Tasm folder, which might be a subfolder of the main Tasm folder on the desktop.

Installing TASM:
If you are installing TASM for 32-bit, download the copy and extract the contents to a location on your computer. Inside the default Tasm folder there will be another folder of the same name. It is here that the ".asm" files have to be placed.

For 64-bit computers, first download the 64-bit version of TASM along with the relevant DOSBox version and install it on the C drive. Open DOSBox and mount the Tasm folder in DOSBox by using the command:

    mount c c:\tasm

Change the drive to C if that is not the current drive. To avoid typing the mount command every time, go to C:\Program Files (x86)\DOSBox-0.74 (for 64-bit) and open "DOSBox 0.74 Options", which opens a text file in Notepad. Go to the end of the file, where it reads “#You can put your mount lines here”, and add:

    mount c c:\tasm
    c:

Save the file. Now DOSBox and TASM should work in tandem. One of the features that DOSBox helps to run with the 64-bit version of TASM is the debug utility, which in older versions could be run directly from the command line. On 32-bit computers, just go to the command line and type debug to start the debug utility. Check Appendix I for debug.
Chapter 1. Basic Theory

The 8086 processor, for which the code in this book has been written, has 14 registers of 16 bits each. The registers are as follows:

Register    Purpose
AX          Data
BX          Data
CX          Data
DX          Data
SP          Stack pointer
BP          Base pointer
SI          Source index
DI          Destination index
IP          Instruction pointer
ES          Extra segment
CS          Code segment
DS          Data segment
SS          Stack segment
Flags       Status and control
The general purpose data registers can be split into two parts, a low byte and a high byte. For example, AL is the low byte of AX and AH is the high byte. So a number, say 1234h, held in AX is split as 12h in AH and 34h in AL, where the h in the number indicates that the number is hexadecimal.
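As a small illustration of the register halves, the sketch below loads 1234h into AX and copies each half into BX. It is only a minimal example in the same TASM layout used by the programs in later chapters; it prints nothing, so the register values are best watched in a debugger (see Appendix I).

```asm
.model small
.stack 100h
.code
start:
    mov ax,1234h    ; AX = 1234h, so AH = 12h and AL = 34h
    mov bl,ah       ; BL = 12h (the high byte of AX)
    mov bh,al       ; BH = 34h (the low byte of AX)
                    ; BX now holds 3412h, the two bytes swapped
    mov ax,4c00h    ; return to DOS
    int 21h
end start
```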
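Flags consist of the following:

Flag    Purpose
OF      Overflow flag: set when an instruction produces a signed result that is too large or too small for the destination.
DF      Direction flag: controls whether string operations move through memory in the increasing or decreasing direction.
IF      Interrupt flag: enables or disables maskable interrupts.
TF      Trap flag: makes the processor work in single-step mode.
SF      Sign flag: set when the result of an operation is negative.
ZF      Zero flag: set when the result of an operation is zero.
AF      Auxiliary carry flag: set when an operation causes a carry from bit 3 to bit 4.
PF      Parity flag: set when an instruction produces an even number of 1 bits in the low byte of the destination operand.
CF      Carry flag: set when the result of an unsigned arithmetic operation is too large to fit in the destination.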
Assembly instructions are made up of an operation code (op-code) and a set of operands. The op-code identifies the action to be taken. The operands identify the source and destination of the data; they can be CPU registers, memory locations, or I/O ports. The complete form of an instruction is:

    op-code destination operand, source operand

For example:

    INC AX          ; one operand (add 1 to register AX)
    MOV AX, 100     ; two operands (store 100 in register AX)

Segments:
The 8086 processor defines four memory blocks of 64 KB each, called the code segment, data segment, stack segment, and extra segment.

Data Transfer Group:

The data transfer instructions move data between registers or between memory and registers. Many of the mnemonics listed in this chapter (for example MVI, LDA, STA, INR and ANA) are Intel 8085 mnemonics; the 8086 expresses the same operations with instructions such as MOV, INC, DEC, AND and OR.

MOV     Move
MVI     Move Immediate
LDA     Load Accumulator Directly from Memory
STA     Store Accumulator Directly in Memory
LHLD    Load H & L Registers Directly from Memory
SHLD    Store H & L Registers Directly in Memory

An 'X' in the name of a data transfer instruction implies that it deals with a register pair (16 bits):

LXI     Load Register Pair with Immediate Data
LDAX    Load Accumulator from Address in Register Pair
STAX    Store Accumulator in Address in Register Pair
XCHG    Exchange H & L with D & E
XTHL    Exchange Top of Stack with H & L

Arithmetic Group:
The arithmetic instructions add, subtract, increment, or decrement data in registers or memory.

ADD     Add to Accumulator
ADI     Add Immediate Data to Accumulator
ADC     Add to Accumulator Using Carry Flag
ACI     Add Immediate Data to Accumulator Using Carry
SUB     Subtract from Accumulator
SUI     Subtract Immediate Data from Accumulator
SBB     Subtract from Accumulator Using Borrow (Carry) Flag
SBI     Subtract Immediate from Accumulator Using Borrow (Carry) Flag
INR     Increment Specified Byte by One
DCR     Decrement Specified Byte by One
INX     Increment Register Pair by One
DCX     Decrement Register Pair by One
DAD     Double Register Add; Add Content of Register Pair to H & L Register Pair

Logical Group:

This group performs logical (Boolean) operations on data in registers and memory and on condition flags. The logical AND, OR, and Exclusive OR instructions enable you to set specific bits in the accumulator ON or OFF.

ANA     Logical AND with Accumulator
ANI     Logical AND with Accumulator Using Immediate Data
ORA     Logical OR with Accumulator
ORI     Logical OR with Accumulator Using Immediate Data
XRA     Exclusive Logical OR with Accumulator
XRI     Exclusive OR Using Immediate Data

The Compare instructions compare the content of an 8-bit value with the contents of the accumulator:

CMP     Compare
CPI     Compare Using Immediate Data

The rotate instructions shift the contents of the accumulator one bit position to the left or right:

RLC     Rotate Accumulator Left
RRC     Rotate Accumulator Right
RAL     Rotate Left Through Carry
RAR     Rotate Right Through Carry

Complement and carry flag instructions:

CMA     Complement Accumulator
CMC     Complement Carry Flag
STC     Set Carry Flag

Branch Group:
The branching instructions alter normal sequential program flow, either unconditionally or conditionally. The unconditional branching instructions are as follows:

JMP     Jump
CALL    Call
RET     Return

Conditional branching instructions examine the status of one of four condition flags to determine whether the specified branch is to be executed. The conditions that may be specified are as follows:

NZ      Not Zero (Z = 0)
Z       Zero (Z = 1)
NC      No Carry (C = 0)
C       Carry (C = 1)
PO      Parity Odd (P = 0)
PE      Parity Even (P = 1)
P       Plus (S = 0)
M       Minus (S = 1)

Thus, the conditional branching instructions are specified as follows:

Jumps   Calls   Returns
JC      CC      RC  (Carry)
JNC     CNC     RNC (No Carry)
JZ      CZ      RZ  (Zero)
JNZ     CNZ     RNZ (Not Zero)
JP      CP      RP  (Plus)
JM      CM      RM  (Minus)
JPE     CPE     RPE (Parity Even)
JPO     CPO     RPO (Parity Odd)

Stack, I/O, and Machine Control Instructions:
PUSH    Push Two Bytes of Data onto the Stack
POP     Pop Two Bytes of Data off the Stack
XTHL    Exchange Top of Stack with H & L
SPHL    Move Content of H & L to Stack Pointer

The I/O instructions are as follows:

IN      Initiate Input Operation
OUT     Initiate Output Operation

The Machine Control instructions are as follows:

EI      Enable Interrupt System
DI      Disable Interrupt System
HLT     Halt
NOP     No Operation

In addition, the following conditional jumps will be helpful in the programs to be tried and written in the chapters that follow:

JNE     Jump if not equal
JE      Jump if equal
JL      Jump if less than
JG      Jump if greater than
JLE     Jump if less than or equal
JGE     Jump if greater than or equal

These are usually used in conjunction with CMP (compare), whose result decides whether to jump to a different part of the program. There are no “if” statements in Assembly, unlike other languages, so a combination of CMP and a jump acts in a similar way to a condition. The jump requires a label to jump to. In other words, the transfer from one part of the program to another is based on labels. Common actions in a program, such as displaying to the screen, can be put under a label, say print, to reduce the amount of code written.

Assembly language programs are converted into executable machine code by a utility program referred to as an assembler. The assembler used in this book is TASM; others include MASM, NASM and FASM. Assembly is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices in which each statement corresponds to a single machine language instruction. Assembly language is specific to a certain computer architecture, in contrast to most high-level programming
languages, which generally are portable to multiple systems.

Assembly code in general has the following parts:

.model (tiny, small, medium, compact, large or huge)
.stack 100h (or another size)
.data (define variables and messages)
.code (to start the actual code)
start: (or another name for a label; there should be at least one label)

One can further define procedures, but the five parts mentioned are standard to any assembly program on the 8086 microprocessor. Let us look at the memory allocation line, .model.

Tiny
In the tiny model all code and data are placed in one physical segment. Procedures and variables are addressed as ‘near’ by pointing to their offsets only. The tiny model is used to generate a .com file, which is smaller in size than an .exe file.

Small
Use the small model for average size applications. The code and data segments are different and don't overlap, so you have 64K of code and 64K of data and stack. Near pointers are always used. All code is placed in one physical segment and all data in another physical segment.

Compact
All elements of code such as procedures are placed into one physical segment. Each element of data can be placed by default into its own physical segment. Data elements are addressed by pointing at both the segment and offset addresses. All code elements such as procedures are addressed as ‘near’ and data elements such as variables are addressed as ‘far’.

Medium
The medium model is best for large programs that don't keep much data in memory. Far pointers are used for code but not for data. As a result, data plus stack are limited to 64K, but code can occupy up to 1MB. Data elements are treated as ‘near’ and code elements are addressed as ‘far’.

Each one of the segments (stack, data and code) in a program is called a logical segment. Depending on the model used, segments may be in one or in different physical segments.

Large
Code elements such as procedures and data elements such as variables are put
in different physical segments. Procedures and variables are addressed as ‘far’ by pointing at both the segment and offset addresses that contain those elements. No data array can have a size that exceeds one physical segment (64 KB).

Huge
The huge model is similar to the large model, with the exception that a data array may have a size that exceeds one physical segment (64 KB).

Memory Model    Size of Code             Size of Data
Tiny            Code + data < 64KB       Code + data < 64KB
Small           Less than 64KB           Less than 64KB
Medium          Can be more than 64KB    Less than 64KB
Compact         Less than 64KB           Can be more than 64KB
Large           Can be more than 64KB    Can be more than 64KB
Huge            Can be more than 64KB    Can be more than 64KB
The amount of data that has to be manipulated and the amount of code that needs to be written are the criteria for choosing an appropriate model. For example, a small, fast program that operates on small quantities of data can use the small or tiny models. This allows up to 64K of memory, and the executable code is fast since only near references are used in the calculation of addresses. For programs that require more than one code segment and operate on large amounts of data, which would require more than one data segment, the large and huge models are most appropriate. Most of the programs in this book use the small model.
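The programs themselves start in the next chapter, but as a preview of how the model choice changes the source, here is a minimal sketch of a tiny-model program. It assumes the object file is linked with TLINK's /t switch to produce a .com file. The label and message text are only illustrative. Because code and data share one segment, there is no .data section and DS does not need to be loaded.

```asm
.model tiny
.code
org 100h                    ; a .com program is loaded at offset 100h
start:
    mov dx,offset message   ; DS already equals CS in a .com program
    mov ah,9h               ; display a '$'-terminated string
    int 21h
    mov ax,4c00h            ; return to DOS
    int 21h
message db "Tiny model",13,10,'$'
end start
```

The data is placed after the exit call so that the processor never tries to execute it as instructions.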
Chapter 2. First Program

It is an unwritten tradition in programming classes that the first program written when learning a new language should print the words "Hello World". In keeping with this, let us write our first program in Assembly to display "Hello World".

    .model small
    .stack 100h
    .data
    message db "Hello World",13,10,'$'
    .code
    start:
        mov ax,@data            ; load the address of the data segment
        mov ds,ax               ; DS now points to our data
        mov dx,offset message   ; DX = address of the string
        mov ah,9h               ; function 9h: display a '$'-terminated string
        int 21h
        mov ax,4c00h            ; function 4Ch: return to DOS
        int 21h
    end start

In this program:

.model small means the program has one code segment and one data segment of up to 64K each. It is a directive that tells the assembler to use one data segment and one code segment.
.stack 100h reserves 100h (256) bytes for the stack.
.data marks the start of the data. Variables are defined here; in our example that is message, "Hello World". It is a directive to the assembler to put what follows in the data segment.
@data is the address of the data segment. It is first put in the ax register and then copied to ds, because a segment register cannot be loaded with a constant directly.
.code starts the code and is the directive to the assembler to put what follows in the code segment.
start: is a label which indicates the start of the primary code.

Like all programs, if there is a beginning there should be an end. end start tells the assembler that the program text has ended and that execution begins at the label start. Save the file in Notepad with the ".asm" extension. Hello.asm is the name given to the above code; place it in the Tasm folder and compile it. Note the carriage return (13) and line feed (10) in line four of the code.

Now let us try to display a single character on the screen. A single character could be any letter, number or anything else on the keyboard. The code for that is as follows:

    .model small
    .stack 100h
    .code
    start:
        mov dl,"a"      ; DL = character to display
        mov ah,2h       ; function 2h: display the character in DL
        int 21h
        mov ax,4c00h
        int 21h
    end start

The code above will display the character ‘a’ on the screen. It can also display any other single character from the keyboard by replacing the character in line five with another.

The following script will accept a single character from the keyboard and display it.

    .model small
    .stack 100h
    .code
    start:
        mov ah,1h       ; function 1h: read a character into AL (with echo)
        int 21h
        mov dl,al       ; copy the character to DL
        mov ah,2h       ; function 2h: display the character in DL
        int 21h
        mov ax,4c00h
        int 21h
    end start
Is there anything in common in these three scripts? The lines int 21h (where h means hexadecimal) and mov ax,4c00h are common. Remove them from the scripts and see the effect.

Exercise:

1. Use the script above that displays a single character on the screen to display the word Hello one character at a time.
2. Write a program that will display a question mark (?) and then accept a character from the keyboard.
3. Modify the program in question 2 to provide a prompt such as “Enter a character” before accepting an input from the keyboard.
4. List the functions of the interrupts in the three scripts of this chapter.
5. Is there another way to include a carriage return for a prompt?
6. What is the output of the following code?

    .model small
    .stack 100h
    .code
    start:
        mov dl,'?'
        mov ah,2h
        int 21h
        mov ah,1h
        int 21h
        mov bl,al
        mov dl,' '
        mov ah,2h
        int 21h
        mov dl,bl
        mov ah,2h
        int 21h
        mov ax,4c00h
        int 21h
    end start
Chapter 3. Adding two single digit numbers

In this chapter, we look at code to add two single digit numbers. The numbers are entered as single digits from 0 through 9.

    .model small
    .stack 100h
    .data
    message1 db "Enter First Number:",13,10,"$"
    message2 db 0dh,0ah,"Enter Second Number:",13,10,"$"
    message3 db 0dh,0ah,"Sum of Entered =",13,10,"$"
    num1 db ?,13,10,"$"
    num2 db ?,13,10,"$"
    ans  db ?,13,10,"$"
    .code
    start:
        mov ax,@data            ; initialize ds
        mov ds,ax
        mov dx,offset message1  ; load and display message1
        mov ah,09
        int 21h
        mov ah,1h               ; read first digit
        int 21h
        sub al,30h              ; convert ASCII to a number
        mov num1,al
        mov dx,offset message2  ; load and display message2
        mov ah,9
        int 21h
        mov ah,1h               ; read second digit
        int 21h
        sub al,30h              ; convert ASCII to a number
        mov num2,al
        mov dx,offset message3  ; load and display message3
        mov ah,9
        int 21h
        mov al,num1             ; add num1 and num2
        add al,num2
        add al,30h              ; convert the sum back to ASCII
        mov ans,al              ; move the value into ans
        mov dx,offset ans       ; load and display the answer
        mov ah,9
        int 21h
        mov ah,4ch              ; return control to DOS
        int 21h
    end start

In this program two single digit values can be entered. Thus acceptable input is from 0 through 9. In the previous chapter it was noticed that using mov ah,9 followed by int 21h would print a message on the screen. The same is used here.

    mov ah,1h
    int 21h
    sub al,30h

The above lines are used to take an input from the keyboard. Say the number selected on the keyboard is 5; the number will be received in ASCII format (35h) by the assembly program.
To transform that into its numeric value (unpacked BCD form) we use the line sub al,30h. When the addition is complete we have to convert the result back into ASCII format before showing it on the screen. This is achieved by using add al,30h. Here sub means subtract and add means addition. The final lines display the answer on the screen and return control to DOS.

    mov dx,offset ans
    mov ah,9
    int 21h
    mov ah,4ch
    int 21h

One data segment and one code segment have been initialized in this program. The word ‘mul’ is used to multiply and ‘div’ for division.

Now let us look at the following code:

    .model small
    .stack 100h
    .data
    num1 db 5h,13,10,"$"
    num2 db 2h,13,10,"$"
    ans  db ?,13,10,"$"
    .code
    start:
        mov ax,@data
        mov ds,ax
        mov al,num1
        add al,num2
        add al,30h      ; convert the sum to ASCII
        mov ans,al
        mov dx,offset ans
        mov ah,9
        int 21h
        mov ah,4ch
        int 21h
    end start

The code adds two numbers which are predefined in the code, 5 and 2. Is there a difference between the two codes? Here there are two variables with predefined numbers. Because they are already stored as numeric values and not as ASCII characters, they are not converted by subtraction; only the sum is converted into ASCII before the result is printed.

Exercise:
1. Write an assembly program to multiply two numbers received from the keyboard using appropriate prompts.
2. Modify the code to add two numbers such that if the sum is greater than 9 a user friendly message will be displayed.
3. Write a program to divide two numbers received from the keyboard using appropriate prompts.
4. Modify the above program (3) to make sure division by zero is flagged or not allowed.
Chapter 4. Average of single digit numbers

The code below will find the average of n single digit numbers.

    .model small
    .stack 100h
    .data
    x1 db ?
    v1 db 'Enter the number of entries:',13,10,'$'
    v2 db 'ENTER NO:',13,10,'$'
    v3 db 'Average is:',13,10,'$'
    .code
    start:
        mov ax,@data
        mov ds,ax
        lea dx,v1
        mov ah,09h
        int 21h
        mov ah,01h      ; read the number of entries
        int 21h
        sub al,30h
        mov cl,al       ; CL = loop counter
        mov bl,al       ; BL = divisor for the average
        mov al,00
        mov x1,al       ; the running sum starts at 0
    l1:
        lea dx,v2
        mov ah,09h
        int 21h
        mov ah,01h      ; read the next number
        int 21h
        sub al,30h
        add al,x1       ; add it to the running sum
        mov x1,al
        dec cl
        cmp cl,0
        jne l1          ; repeat until the counter reaches 0
        lea dx,v3
        mov ah,09h
        int 21h
        mov ax,00
        mov al,x1
        div bl          ; AL = quotient, AH = remainder
        add ax,3030h    ; convert both to ASCII
        mov dx,ax       ; DL = quotient as an ASCII digit
        mov ah,02h
        int 21h
        mov ah,4ch
        int 21h
    end start
This code has some features not seen in the previous chapters. First the number of entries to be averaged is asked for. Once that is done it is put in the low byte of the counter register (cl). It is also put in the low byte of the base register (bl) for the division later. The variable x1 is initialized to 0 because it holds the running sum, which should initially be zero. Under label l1 we see a section of code that repeatedly requests a number to be averaged. Each time the counter is decremented by 1 and the cmp statement compares the counter with 0. If it is 0 the program continues with the code below the line (jne l1); otherwise it loops back to label l1. The numbers entered are added to x1, and finally that sum in al is divided by bl and printed after conversion to ASCII through the add ax,3030h statement and the statements that follow it.

The other new term in this code is ‘lea’. It calculates the effective address specified by its second operand as if it were going to load or store data from it, but instead it stores the calculated address into the register specified by its first operand. LEA means Load Effective Address. MOV means Load Value. LEA loads a pointer to the item being addressed whereas MOV loads the actual value at that address.

It can be seen that key words in an assembly program can be written in small or capital letters. I always use small letters as most keyboards default to small letters.

Exercise:

1. Modify the code above to find the sum of n single digit numbers.
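The division step in this chapter relies on where div leaves its results. As a minimal sketch with fixed values rather than keyboard input (17 divided by 5 is chosen only for illustration), the following shows that after div bl the quotient is in AL and the remainder in AH, and that add ax,3030h converts both to ASCII at once.

```asm
.model small
.stack 100h
.code
start:
    mov ax,17       ; dividend in AX (AH = 0, AL = 17)
    mov bl,5        ; divisor
    div bl          ; AL = quotient (3), AH = remainder (2)
    add ax,3030h    ; AL = '3', AH = '2'
    mov bx,ax       ; keep a copy: BL = quotient, BH = remainder
    mov dl,bl
    mov ah,2h
    int 21h         ; display the quotient
    mov dl,' '
    mov ah,2h
    int 21h         ; display a space
    mov dl,bh
    mov ah,2h
    int 21h         ; display the remainder
    mov ax,4c00h
    int 21h
end start
```

The copy in BX is needed because AH must be reloaded with the function number before each int 21h, which would otherwise overwrite the remainder.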
Chapter 5. Comparing two single characters
This program will compare two characters input from the keyboard.

    .model small
    .stack 100h
    .data
    x1 db 'Equal',13,10,'$'
    x2 db 'Not Equal',13,10,'$'
    .code
    start:
        mov ax,@data
        mov ds,ax
        mov ah,01h      ; read the first character
        int 21h
        mov dl,al       ; keep it in DL
        int 21h         ; AH is still 01h: read the second character
        cmp al,dl
        jne noequal
        mov ah,09h
        lea dx,x1
        int 21h
        jmp print
    noequal:
        mov ah,09h
        lea dx,x2
        int 21h
    print:
        mov ah,4ch
        int 21h
    end start

The characters can be letters or numbers. In this program, as in the previous chapter, we are using ‘lea’ to load the address of the message to be displayed.
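The pairing of cmp with a conditional jump used above is the general way to write an "if ... else" in assembly. The sketch below is a minimal illustration of that pattern; the messages and the labels other_key and show are made up for this example.

```asm
.model small
.stack 100h
.data
yesmsg db 13,10,'You pressed y',13,10,'$'
nomsg  db 13,10,'You pressed another key',13,10,'$'
.code
start:
    mov ax,@data
    mov ds,ax
    mov ah,01h
    int 21h             ; AL = character typed
    cmp al,'y'          ; the condition
    jne other_key       ; condition false: go to the "else" part
    lea dx,yesmsg       ; the "if" part
    jmp show            ; skip over the "else" part
other_key:
    lea dx,nomsg        ; the "else" part
show:
    mov ah,09h
    int 21h
    mov ax,4c00h
    int 21h
end start
```

The jmp show line matters: without it the program would fall through into the "else" part after running the "if" part.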
Exercise:
1. Modify the code to accept only numbers to compare.
Chapter 6. Multiple Digit or Strings

The program below will allow input of multiple digits or strings.

    .model small
    .stack 100h
    .data
    cr equ 0dh
    lf equ 0ah
    nl equ 00h
    msgx db 32h dup (nl)        ; 50-byte input buffer filled with zeros
    .code
    start:
        mov ax,@data
        mov ds,ax
        mov ax,offset msgx      ; AX = address of the buffer
        call getstr
        jmp exit

    getchr:                     ; read one character into AL
        push bx
        push cx
        push dx
        mov ah,01h
        int 21h
        pop dx
        pop cx
        pop bx
        ret

    getstr:                     ; read characters into the buffer until Enter
        push ax
        push bx
        push cx
        push dx
        mov bx,ax
    getstrloop:
        call getchr
        mov byte ptr [bx],al
        cmp byte ptr [bx],cr
        je getstrfree
        inc bx
        jmp getstrloop
    getstrfree:
        mov byte ptr [bx],nl    ; replace the carriage return with the terminator
        pop dx
        pop cx
        pop bx
        pop ax
        ret

    exit:
        mov ah,4ch
        int 21h
    end start
The above code uses a stack to accomplish the reading of multiple digit and string inputs. A stack has two main operations, push and pop. Push adds an element to the stack and pop removes an element from the stack. A stack works on the basis of LIFO, that is, last in first out. The program uses call to get to the labels in different parts of the program, and ret to return from them.

The program also uses pointers. The PTR operator is used to define a memory reference with a certain type. The assembler determines the correct instruction to assemble based on the type of the operands to the instruction. The type field can have one of the following values: byte, word, dword, qword, tbyte, near, far. The field(s) following the PTR keyword can be a variable name, a label name, an address or register expression, or an integer that represents an offset.

The most important and frequent use for PTR is to assure that the assembler understands what attribute the expression is supposed to have. The second use of PTR is to access data by a type other than the type in the variable definition. This occurs in structures. If the structure is defined as WORD but it is required to access an item as a byte, PTR is the operator for this.
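Both uses of PTR can be seen in a minimal sketch. The variable name value and the numbers stored in it are chosen only for this example. With [bx] alone the assembler cannot tell how many bytes to write, so PTR supplies the size; the last mov reads a word variable as a byte.

```asm
.model small
.stack 100h
.data
value dw 1234h                  ; one word, stored low byte first (34h, then 12h)
.code
start:
    mov ax,@data
    mov ds,ax
    mov bx,offset value
    mov byte ptr [bx],56h       ; writes one byte: value becomes 1256h
    mov word ptr [bx],0abcdh    ; writes two bytes: value becomes 0ABCDh
    mov al,byte ptr value       ; reads only the low byte: AL = 0CDh
    mov ax,4c00h
    int 21h
end start
```

The program prints nothing; the contents of value and AL can be checked in the debugger.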
Chapter 7. Entering a double digit number

The example below shows how to enter a double digit number using a non-stack method. A macro is defined and used in this code.

    .model small
    .stack 100h
    .data
    msg1 db 10,13,"Enter a two digit number: $"

    disp macro msg              ; definition of macro
        mov ah,09h
        lea dx,msg
        int 21h
    endm                        ; end of macro

    .code
    start:
        mov ax,@data            ; initialization of data segment
        mov ds,ax
        disp msg1               ; calling macro
        mov ah,01h              ; accept first digit
        int 21h
        cmp al,39h              ; compare the ASCII of 9 with the digit/char
        jle xx                  ; jump if less than or equal
        sub al,07h              ; letters A-F: skip the gap between '9' and 'A'
    xx: sub al,30h
        mov cl,04
        shl al,cl               ; move the first digit into the high nibble
        mov bl,al
        mov ah,01h              ; accept second digit
        int 21h
        cmp al,39h              ; compare the ASCII of 9 with the digit/char
        jle x                   ; jump if less than or equal
        sub al,07h
    x:  sub al,30h
        add al,bl               ; AL = both digits packed into one byte
        mov ah,4ch              ; exit from program to DOS
        int 21h
    end start
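A macro can also take a value rather than a variable as its parameter. The sketch below uses a hypothetical macro named putc (not part of the program above) to show that each call is replaced by the macro's body with the parameter substituted in.

```asm
putc macro letter       ; hypothetical macro: display one character
    mov dl,letter
    mov ah,02h
    int 21h
endm

.model small
.stack 100h
.code
start:
    putc 'O'            ; expands to mov dl,'O' / mov ah,02h / int 21h
    putc 'K'            ; expands again with 'K'
    mov ax,4c00h
    int 21h
end start
```

Unlike a label reached with call, a macro is copied into the program at every place it is used, so the code grows with each use but no call and ret are needed.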
Chapter 8. Add two digit numbers

This code will add two numbers which have two digits each.

    .model small
    .stack 100h
    .data
    msga db 13,10,"Input first number: ","$"
    msgb db 13,10,"Input second number: ","$"
    msgc db 13,10,"The sum is: ","$"
    num1 db ?
    num2 db ?
    num3 db ?
    .code
    start:
        mov ax,@data
        mov ds,ax
        lea dx,msga         ; input the first number
        mov ah,09h
        int 21h
        mov ah,01
        int 21h
        sub al,'0'
        mov bl,al           ; BL = tens digit of the first number
        mov ah,01
        int 21h
        sub al,'0'
        mov cl,al           ; CL = ones digit of the first number
        lea dx,msgb         ; input the second number
        mov ah,09h
        int 21h
        mov ah,01
        int 21h
        sub al,'0'
        mov dl,al           ; DL = tens digit of the second number
        mov ah,01
        int 21h
        sub al,'0'
        mov dh,al           ; DH = ones digit of the second number
        mov ah,0
        mov al,cl
        add al,dh           ; add the ones digits
        aaa                 ; AL = ones digit of the sum, AH = carry (0 or 1)
        mov num1,al
        add num1,'0'
        mov ch,ah           ; keep the carry for the tens digits
        mov ah,0
        mov al,bl
        add al,dl           ; add the tens digits
        aaa                 ; AL = tens digit, AH = hundreds digit (0 or 1)
        add al,ch           ; add the carry from the ones digits
        aaa                 ; adjust again in case the tens digit passed 9
        mov num2,al
        add num2,'0'
        mov num3,ah
        add num3,'0'
        lea dx,msgc
        mov ah,09h
        int 21h
        cmp num3,'0'        ; checking for a three digit sum
        je not_3digit
        mov dl,num3         ; hundreds digit
        mov ah,02h
        int 21h
    not_3digit:
        mov dl,num2         ; tens digit
        mov ah,02h
        int 21h
    print_lastdigit:
        mov dl,num1         ; ones digit
        mov ah,02h
        int 21h
    exit:
        mov ah,4ch
        int 21h
    end start

In this program the sum of two numbers which have two digits each is shown. First we get two numbers which are two digits long. The ones digits are added first, and any carry is added to the sum of the tens digits. The sum is then checked to see whether it is two or three digits long before it is shown on the screen; if it is a three-digit number, the hundreds digit is displayed as well. The num values are not initialized but memory has been reserved for them.

The key word ‘aaa’ should be used after a one-byte add instruction whose destination was the al register. The value in the low nibble of AL and the auxiliary carry flag AF determine whether the addition has overflowed past 9, and aaa makes the adjustment.

Exercise:
1. Extend the method used in the above code to add numbers which are three digits long.
2. Write a program based on the previous chapter where two numbers, irrespective of the number of digits, can be entered for the purpose of:
   Addition
   Multiplication
   Division
   Subtraction
3. Modify the code in question 2 to create a menu interface that shows the various options of math before entering the two numbers.
4. What do the terms aaa, aam, aas, aad mean in assembly language?
5. How are the terms in question 4 used?
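The adjustment made by aaa, as described in this chapter, can be watched in isolation with fixed values. This is a minimal sketch; 9 and 8 are chosen only because their sum passes 9.

```asm
.model small
.stack 100h
.code
start:
    mov ah,0
    mov al,9
    add al,8        ; AL = 11h (17 in decimal), AF is set
    aaa             ; AH = 1 (tens digit), AL = 7 (ones digit)
    add ax,3030h    ; AH = '1', AL = '7'
    mov bx,ax       ; keep both digits safe in BX
    mov dl,bh
    mov ah,02h
    int 21h         ; display the tens digit
    mov dl,bl
    mov ah,02h
    int 21h         ; display the ones digit
    mov ax,4c00h
    int 21h
end start
```

The program displays 17. AH is cleared before the addition because aaa adds the carry to whatever AH already holds.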
Chapter 9. Printing Alphabets

This code will print all the letters of the English alphabet.

    .model small
    .stack 100h
    .code
    start:
        mov dl,'a'
        mov cx,26       ; initialize counter to 26
        mov ah,2
    loop1:
        cmp cx,0
        je end1
        int 21h
        inc dl          ; going for the next alphabet
        dec cx          ; decrease counter by 1
        jmp loop1
    end1:
        mov ah,4ch
        int 21h
    end start

In this code the number 26 indicates the number of alphabets to print. If the number 26 is replaced with the number 10, the first ten alphabets, a through j, will be printed.

Exercise:
1. Write a code that will print the last ten alphabets of the English language.
2. Write a code that will print alphabets from a through f and then from s through z.
3. Write a code that will ask for an alphabet and print the remaining alphabets from that point to z.
4. Write a code that will check if the input character is a number or an alphabet. If it is a number, it will give an error message and if not will print the next alphabet if any.
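The counting done by hand in this chapter (cmp cx,0, dec cx and jmp) can also be left to the 8086 loop instruction, which decrements CX and jumps to a label as long as CX is not zero. The following is a minimal sketch of the same program written that way; the label next_letter is made up for this example.

```asm
.model small
.stack 100h
.code
start:
    mov dl,'a'
    mov cx,26           ; number of alphabets to print
    mov ah,2h
next_letter:
    int 21h             ; display the character in DL
    inc dl              ; going for the next alphabet
    loop next_letter    ; CX = CX - 1, jump back while CX is not 0
    mov ax,4c00h
    int 21h
end start
```

Note that loop tests CX only after the body has run once, so CX should not be 0 on entry.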