Syllabus

ARM Instruction Set

Structure of an ARM Instruction

All ARM instructions conform to the following structural layout:

label <space> opcode <space> operands <space> ; comment

  • Each field (label, opcode, operands and comment) must be separated by one or more <space> elements. A <space> element can be an actual space character (from your space bar) or a tab.
  • If there is no label, there must be a <space> element before the actual instruction (i.e. the opcode and any operands.)
  • The semicolon character indicates the start of a comment, which is terminated by the end of the line.
  • All fields are optional. The assembler also accepts blank lines, which can be used to improve code clarity.
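The layout rules above can be sketched as a small line splitter. This is only an illustrative Python model, not how any real assembler is implemented; the function name parse_line is hypothetical.

```python
def parse_line(line):
    """Split one source line into (label, opcode, operands, comment).

    Hypothetical helper that follows the layout rules described above.
    """
    # Everything after the first semicolon is the comment.
    code, _, comment = line.partition(";")
    comment = comment.strip()
    # A line that starts with a space or tab has no label.
    has_label = bool(code) and not code[0].isspace()
    fields = code.split(None, 2 if has_label else 1)
    label = fields.pop(0) if has_label and fields else ""
    opcode = fields.pop(0) if fields else ""
    operands = fields[0].strip() if fields else ""
    return label, opcode, operands, comment


print(parse_line("loop ADD r0, r0, r1 ; running total"))
print(parse_line("     MOV r0, #5"))
```

Every field may come back empty, which reflects the rule that all fields are optional.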

Types of Instructions

There are three broad categories of instruction in the ARM assembly language. These are

  • Data Processing: this includes arithmetic and logical operations, comparison operations and register movement operations;
  • Data Movement: these are instructions to load and store data from and to memory;
  • Control Flow: these are software interrupts and branch instructions that alter the order of execution.

Note: For ARM data processing instructions, operands (values) are always 32 bits wide. The operands are either held in registers or are specified as constants (called literals) in the instruction itself. The result of a data processing instruction is also a 32 bit datum and is stored in a register. Most data processing instructions will have three operands, two of which are inputs and one for the result.

Data Processing Instructions

Register Move Operations

  1. MOV r0, r1 ; r0 := r1
    • The MOV instruction loads a value (either a literal or the contents of a source register, e.g. r1) into the destination register (r0 in this example). A MOV instruction that has the same source and destination register does nothing.
  2. MVN r0, r1 ; r0 := NOT r1
    • MVN is shorthand for "move negated". The MVN instruction takes a value (which may be a literal or held in a register), inverts all the bits in this value, and then places the result in the destination register.
      • r1: 1101 0110 1010 0000 0111 0101 1010 0011
      • r0: 0010 1001 0101 1111 1000 1010 0101 1100
    • The MVN instruction allows a programmer to put a negative value into a register. Because inverting all the bits of n gives -(n + 1) in 2's complement, the operand must be one less than the magnitude of the desired value, e.g. MVN r0, #4 places -5 in r0.
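The bit inversion performed by MVN, and the "one less" rule for negative values, can be checked with a short Python model. The helper names mvn and to_signed are assumptions made for this sketch; they are not ARM or assembler names.

```python
MASK32 = 0xFFFFFFFF


def mvn(value):
    """Model MVN: invert all 32 bits of the operand."""
    return ~value & MASK32


def to_signed(word):
    """Interpret a 32-bit word as a 2's complement signed integer."""
    return word - (1 << 32) if word & 0x80000000 else word


r1 = 0b1101_0110_1010_0000_0111_0101_1010_0011
print(f"{mvn(r1):032b}")   # bit pattern left in r0 by MVN r0, r1
print(to_signed(mvn(4)))   # value left in r0 by MVN r0, #4
```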

Arithmetic Operations

Example: ADD r0, r1, r2 ; r0 := r1 + r2

r1 and r2 are the source registers containing the input operands, r0 is the destination register where the result is stored. The destination register can be the same as one of the source registers, i.e. ADD r0, r0, r1 is legal and means add the values in r0 and r1 together and place the result in r0.

This ADD instruction can be used on both unsigned and 2's complement signed numbers. The addition may produce a carry out or a signed overflow, but a plain ADD does not record these in the condition code flags; they are recorded only when the "S" suffix is used (ADDS).

ADD r0, r1, r2 ; r0 := r1 + r2  (Addition)
ADC r0, r1, r2 ; r0 := r1 + r2 + C (ADD with Carry)
SUB r0, r1, r2 ; r0 := r1 - r2 (Subtract)
SBC r0, r1, r2 ; r0 := r1 - r2 + C - 1 (Subtract with Carry)
RSB r0, r1, r2 ; r0 := r2 - r1 (Reverse subtract)
RSC r0, r1, r2 ; r0 := r2 - r1 + C - 1 (Reverse subtract with carry)
  • The ADC, SBC, and RSC instructions utilise the value of the carry bit (C) whose value is stored in the Current Program Status Register (covered in the next lecture).
  • Thus the ADC instruction example, ADC r0, r1, r2, adds the values stored in r1 and r2 and the value of the carry bit, placing the result in r0. In the SBC instruction, the carry bit is used to indicate a "borrow", which is unset (set to 0) when a "borrow" is required. These instructions are useful in that they allow programmers to perform arithmetic operations on numbers that are larger than 32 bits.
  • RSB is shorthand for "reverse subtraction" (and RSC is "reverse subtraction with carry") and switches the order of operands for the subtraction process.
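To illustrate how the carry bit lets these instructions work on values wider than 32 bits, here is a minimal Python sketch of a 64-bit addition and subtraction built from 32-bit halves. The helper names (adds, adc, subs, sbc, add64, sub64) are hypothetical; the arithmetic follows the formulas listed above.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Model ADDS: return (32-bit result, carry out)."""
    total = a + b
    return total & MASK32, total >> 32


def adc(a, b, carry):
    """Model ADC: a + b + C, truncated to 32 bits."""
    return (a + b + carry) & MASK32


def subs(a, b):
    """Model SUBS: return (32-bit result, C), where C = 0 means a borrow."""
    return (a - b) & MASK32, int(a >= b)


def sbc(a, b, carry):
    """Model SBC: a - b + C - 1, truncated to 32 bits."""
    return (a - b + carry - 1) & MASK32


def add64(a_hi, a_lo, b_hi, b_lo):
    lo, carry = adds(a_lo, b_lo)   # low words first, remember the carry
    return adc(a_hi, b_hi, carry), lo


def sub64(a_hi, a_lo, b_hi, b_lo):
    lo, carry = subs(a_lo, b_lo)   # low words first, remember the borrow
    return sbc(a_hi, b_hi, carry), lo


print([hex(w) for w in add64(0x1, 0xFFFFFFFF, 0x0, 0x1)])
print([hex(w) for w in sub64(0x2, 0x0, 0x0, 0x1)])
```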

Logical Operations

Logical operations are performed bit by bit on the input operands (which may be values in registers or constants), and the result is placed in a destination register.

  1. AND r0, r1, r2 ; r0 := r1 AND r2
    • r1: 0101 0011 1010 1111 1101 1010 0110 1011
    • r2: 1101 0110 1010 0000 0111 0101 1010 0011
    • r0: 0101 0010 1010 0000 0101 0000 0010 0011
    • AND instructions are useful for "masking" parts of values that you are not interested in currently.
  2. ORR r0, r1, r2 ; r0 := r1 OR r2
    • r1: 0101 0011 1010 1111 1101 1010 0110 1011
    • r2: 1101 0110 1010 0000 0111 0101 1010 0011
    • r0: 1101 0111 1010 1111 1111 1111 1110 1011
    • ORR instructions are useful for ensuring that certain bits are set.
  3. EOR r0, r1, r2 ; r0 := r1 XOR r2
    • r1: 0101 0011 1010 1111 1101 1010 0110 1011
    • r2: 1101 0110 1010 0000 0111 0101 1010 0011
    • r0: 1000 0101 0000 1111 1010 1111 1100 1000
    • EOR instructions can be useful for inverting specific bits.
  4. BIC r0, r1, r2 ; r0 := r1 AND (NOT r2)
    • BIC stands for 'bit clear', where every '1' in the second operand clears the corresponding bit in the first:
    • r1: 0101 0011 1010 1111 1101 1010 0110 1011
    • r2: 1101 0110 1010 0000 0111 0101 1010 0011
    • r0: 0000 0001 0000 1111 1000 1010 0100 1000
    • A BIC instruction can be thought of as an AND with the inverse of the second operand. It can be used to clear specific regions of a word, e.g. a second operand of
    • r2: 1111 1111 1111 1111 0000 0000 0000 0000
    • would clear the upper halfword, leaving the bits in the lower two bytes untouched.
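The four worked bit patterns above can be reproduced with Python's bitwise operators. This is only a model of the operations; the function names are hypothetical, and the constants are the r1 and r2 values from the examples written in hexadecimal.

```python
MASK32 = 0xFFFFFFFF

R1 = 0x53AFDA6B  # 0101 0011 1010 1111 1101 1010 0110 1011
R2 = 0xD6A075A3  # 1101 0110 1010 0000 0111 0101 1010 0011


def and_(a, b):
    return a & b & MASK32


def orr(a, b):
    return (a | b) & MASK32


def eor(a, b):
    return (a ^ b) & MASK32


def bic(a, b):
    """Model BIC: clear every bit of a that is set in b."""
    return a & ~b & MASK32


for name, op in (("AND", and_), ("ORR", orr), ("EOR", eor), ("BIC", bic)):
    print(f"{name}: {op(R1, R2):032b}")

# Clearing the upper halfword with the mask from the last example
print(f"{bic(R1, 0xFFFF0000):08X}")
```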

Comparison Operations

There are four comparison operations in ARM assembly language. These comparisons work by performing arithmetic or logical operations on the values stored in the source registers and setting the appropriate condition code flags in the Current Program Status Register as necessary. However, the actual result of the underlying arithmetic or logical operation is not stored in any register.

Note that the comparison operations can have literals instead of registers as operands.

  1. CMP r1, r2    ; set condition codes according to the result of r1 - r2
    • The CMP (compare) instruction will set the condition codes as follows:
    • N = 1 if the most significant bit of (r1 - r2) is 1, i.e. the result is negative when viewed as a signed integer
    • Z = 1 if (r1 - r2) = 0, i.e. r1 = r2
    • C = 1 if the subtraction did not need a borrow, i.e. r1 >= r2 when both are treated as unsigned integers
    • V = 1 if the subtraction overflowed when r1 and r2 are treated as signed integers, i.e. the true result does not fit in 32 bits
  2. CMN r1, r2    ;set condition codes according to the result of r1 + r2
    • The CMN (compare negative) instruction determines the condition codes by performing the equivalent of: operand1 - ( - operand2). It is useful for comparing the values in registers against small negative numbers (such as -1 which might be used to mark the end of a data structure.)
  3. TST r1, r2    ; set condition codes on r1 AND r2
    • The TST (test bits) instruction can be used to test if one or more bits are set. The first operand is the value to be tested; the second operand is the bit mask. The Z flag will be set if none of the bits selected by the mask are set in the value (i.e. the AND result is zero); otherwise it will be cleared.
  4. TEQ r1, r2    ; set condition codes on r1 XOR r2
    • The TEQ (test equivalent) instruction is similar to TST, but differs in that it uses an exclusive-or operation. It can be used to determine if specific bits in two operands are the same or different. It does not change the overflow flag, unlike CMP. TEQ can be used to determine if two values have the same sign.

Note: The CMP, CMN, TST, and TEQ instructions always alter the condition codes. Other data processing instructions (such as ADD, ADC, SUB, etc.) alter the condition codes only if they have an "S" suffix.
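As a rough illustration of how a comparison sets the flags, the following Python sketch models CMP as a 32-bit subtraction whose result is discarded, and TST as a discarded AND. The function names cmp_flags and tst_z are assumptions for this example; note that on the ARM the C flag after a subtraction means "no borrow".

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Model CMP a, b: return the (N, Z, C, V) flags for a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31                      # sign bit of the result
    z = int(result == 0)                  # result was zero
    c = int(a >= b)                       # 1 means no borrow was needed
    # Signed overflow: operands differ in sign and the result's sign
    # differs from the first operand's sign.
    v = ((a ^ b) & (a ^ result)) >> 31
    return n, z, c, v


def tst_z(value, mask):
    """Model TST value, mask: return the Z flag (1 if no masked bit is set)."""
    return int((value & mask & MASK32) == 0)


print(cmp_flags(5, 5))
print(cmp_flags(3, 5))
print(cmp_flags(0x80000000, 1))
print(tst_z(0b1010, 0b0100))
```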

Control Flow

Branch Instructions

Basic Branching

Moving from one point in a program to another, rather than continuing with the next instruction in sequence, is called branching. It is accomplished by suitable branch instructions, sometimes known as jump instructions.

The simplest branch instruction is:

B    label    ; unconditionally branch to the instruction at "label"
      ...          
      ...          
label ...          

The B instruction is an unconditional branch - when the processor encounters such a branch, it always jumps to the designated point. In assembly language, it is possible and desirable to represent the destination of the branch by a symbolic label. The assembler translates this label into the correct memory location, saving the programmer from having to figure out where this actually will be!

Conditional Branches

Conditional branches are much more useful than unconditional branches. If a comparison operation gives a computer the ability to make a decision, conditional branches allow the computer to act on a decision.

In the ARM, conditional branches are executed (taken) or not according to the contents of the four condition codes of the Current Program Status Register (CPSR). For instance, the BNE (branch not equal) instruction refers to the Z bit flag in the CPSR. If the Z bit is set (i.e. some result was zero or a comparison was equal), then the branch is not taken. If the Z bit is clear (i.e. some result was non-zero or a comparison was not equal), then the branch will be taken.

Consider the following example code, which uses a BNE instruction to control a loop:

      MOV r0, #5     ; use register r0 as a loop counter and initialize it to the value 5
loop  ...            ; label indicating the start of the loop
      ...
      ...
      SUB r0, r0, #1 ; decrement the loop counter by subtracting 1 from r0
      CMP r0, #0     ; perform a comparison between the value in r0 and zero
      BNE loop       ; if comparison was not equal, jump back to the loop label and repeat
      ...            ; instructions after the end of the loop

Remember that the CMP instruction does not issue a result; rather it tinkers with the condition code bits. The BNE instruction then inspects the Z bit to determine whether the branch should be taken or not. The above code will repeat the loop 5 times before the program can proceed to executing the instructions after the loop's end.

The same program can be rewritten by using the "S" suffix version of the SUB instruction, namely SUBS.

      MOV r0, #5      ; use register r0 as a loop counter
                      ; and initialize it to the value 5
loop  ...             ; label indicating the start of the loop
      ...             ; various instructions that form the body of the loop
      ...             ; and perform useful tasks
      SUBS r0, r0, #1 ; decrement the loop counter by subtracting 1 from r0
                      ; and set the condition code bits based on the subtraction result
      BNE loop        ; if the Z bit is clear (i.e. "0"), jump back to the loop label and repeat
      ...             ; instructions after the end of the loop
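The behaviour of the counting loop can be traced with a small Python model. This is a sketch only: the function names subs and count_iterations are hypothetical, and the loop body is reduced to counting how many times it runs.

```python
def subs(a, b):
    """Model SUBS: return (32-bit result, Z flag)."""
    result = (a - b) & 0xFFFFFFFF
    return result, int(result == 0)


def count_iterations(start):
    """Run the counted loop and return how many times the body executes."""
    r0 = start               # MOV r0, #start
    iterations = 0
    while True:              # loop
        iterations += 1      #   ... body of the loop
        r0, z = subs(r0, 1)  #   SUBS r0, r0, #1
        if z == 1:           #   BNE loop is not taken once Z is set
            break
    return iterations


print(count_iterations(5))
```

Because the test comes at the bottom of the loop, the body always runs at least once.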

Conditional Branch Instructions

There are 16 possible conditional branches in the ARM assembly language, including "always" (which is effectively an unconditional branch) and "never" (which is never used but exists for future possible extensions to the architecture). The complete set of branch instructions is given in the table:

| Branch | Condition Test | Meaning | Uses |
|---|---|---|---|
| B | No test | Unconditional | Always take the branch |
| BAL | No test | Always | Always take the branch |
| BEQ | Z=1 | Equal | Comparison equal or zero result |
| BNE | Z=0 | Not equal | Comparison not equal or non-zero result |
| BCS | C=1 | Carry set | Arithmetic operation gave carry out |
| BCC | C=0 | Carry clear | Arithmetic operation did not produce a carry |
| BHS | C=1 | Higher or same | Unsigned comparison gave higher or same result |
| BLO | C=0 | Lower | Unsigned comparison gave lower result |
| BMI | N=1 | Minus | Result is minus or negative |
| BPL | N=0 | Plus | Result is positive (plus) or zero |
| BVS | V=1 | Overflow Set | Signed integer operation: overflow occurred |
| BVC | V=0 | Overflow Clear | Signed integer operation: no overflow occurred |
| BHI | ((NOT C) OR Z) = 0 {C set and Z clear} | Higher | Unsigned comparison gave higher |
| BLS | ((NOT C) OR Z) = 1 {C clear or Z set} | Lower or same | Unsigned comparison gave lower or same |
| BGE | (N EOR V) = 0 {(N and V) set or (N and V) clear} | Greater or Equal | Signed integer comparison gave greater than or equal |
| BLT | (N EOR V) = 1 {(N set and V clear) or (N clear and V set)} | Less Than | Signed integer comparison gave less than |
| BGT | (Z OR (N EOR V)) = 0 {((N and V) set or clear) and Z clear} | Greater Than | Signed integer comparison gave greater than |
| BLE | (Z OR (N EOR V)) = 1 {(N set and V clear) or (N clear and V set) or Z set} | Less or Equal | Signed integer comparison gave less than or equal |
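The condition tests can also be written out as boolean expressions over the four flags. The following Python sketch is a model for checking one's understanding, not a description of the hardware; the function name condition_passed is hypothetical, and the keys are the two-letter condition mnemonics without the leading B.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if the condition holds for the given N, Z, C, V flags."""
    tests = {
        "EQ": z == 1,
        "NE": z == 0,
        "CS": c == 1, "HS": c == 1,     # two names for the same test
        "CC": c == 0, "LO": c == 0,     # two names for the same test
        "MI": n == 1,
        "PL": n == 0,
        "VS": v == 1,
        "VC": v == 0,
        "HI": c == 1 and z == 0,
        "LS": c == 0 or z == 1,
        "GE": n == v,
        "LT": n != v,
        "GT": z == 0 and n == v,
        "LE": z == 1 or n != v,
        "AL": True,
    }
    return tests[cond]


# Flags left by comparing 3 with 5: N=1, Z=0, C=0, V=0
print(condition_passed("LO", 1, 0, 0, 0))
print(condition_passed("GE", 1, 0, 0, 0))
```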

Conditional Execution

  • Unlike most processor architectures, almost all instructions in the ARM instruction set can be conditionally executed.
  • For example, in the sequence CMP r0, #10 followed by ADDNE r1, r1, r0 and SUBNE r2, r2, r0, the ADDNE and SUBNE instructions are executed only if the Z bit is "0", i.e. the CMP instruction had a non-zero result and cleared the Z bit.
  • This form of conditional execution is efficient if the conditional sequence is only three or fewer instructions. For longer sequences, it is more efficient to use conventional branching.
  • The "S" suffix can be appended to instructions with the conditional suffixes, enabling them to adjust the condition codes. However, this can lead to unexpected behaviour in succeeding instructions.

Let r0 be 4, r1 be -4, and r2 be 7, and add the "S" suffix to the ADDNE instruction so that it becomes ADDNES:

CMP r0, #10        ; compare the value in r0 with the value 10
ADDNES r1, r1, r0  ; r0 ≠ 10, so Z is 0, so r1 = -4 + 4 = 0, and Z is now set
SUBNE r2, r2, r0   ; Z is set (i.e. "1"), so SUBNE is not executed
...                ; other code now follows
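The effect described in the comments above, where a flag-setting conditional add suppresses the conditional subtract that follows it, can be traced with a short Python model. The function name run and its set_flags parameter are hypothetical; only the Z flag is modelled.

```python
def run(set_flags):
    """Trace the three-instruction sequence; return the final (r1, r2).

    set_flags=False models a plain ADDNE, set_flags=True models the
    S-suffixed form that updates the Z flag.
    """
    r0, r1, r2 = 4, -4, 7
    z = int(r0 - 10 == 0)     # CMP r0, #10
    if z == 0:                # conditional add, executed because Z is 0
        r1 = r1 + r0
        if set_flags:
            z = int(r1 == 0)  # the S suffix updates Z from the result
    if z == 0:                # SUBNE r2, r2, r0
        r2 = r2 - r0
    return r1, r2


print(run(set_flags=False))   # SUBNE executes
print(run(set_flags=True))    # SUBNE is skipped
```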

The conditional suffix mnemonics are the same as for conditional branches:

| Suffix | Condition Test | Meaning |
|---|---|---|
| EQ | Z=1 | Equal |
| NE | Z=0 | Not Equal |
| CS | C=1 | Carry Set (Unsigned Higher or Same) |
| CC | C=0 | Carry Clear (Unsigned Lower than) |
| MI | N=1 | Minus |
| PL | N=0 | Plus |
| VS | V=1 | Overflow Set |
| VC | V=0 | Overflow Clear |
| HI | ((NOT C) OR Z) = 0 {C set and Z clear} | Higher unsigned |
| LS | ((NOT C) OR Z) = 1 {C clear or Z set} | Lower or same unsigned |
| GE | (N EOR V) = 0 {(N and V) set or (N and V) clear} | Greater or Equal |
| LT | (N EOR V) = 1 {(N set and V clear) or (N clear and V set)} | Less Than |
| GT | (Z OR (N EOR V)) = 0 {((N and V) set or clear) and Z clear} | Greater Than |
| LE | (Z OR (N EOR V)) = 1 {(N set and V clear) or (N clear and V set) or Z set} | Less or Equal |

References

  • Prof. Sujit Wagh
  • www-mdp.eng.cam.ac.uk
Created by Sujit Wagh on 2017/04/23 12:46