Computer Architecture 2 - lab 2:
Instruction set architecture x86

The subject of this exercise is programming in the assembly (machine) language of the x86 architecture and interfacing assembly code with a high-level programming language.

1. Introduction

Become familiar with the basic properties of the x86 machine language, particularly the available addressing modes and registers [1], [2]. Become familiar with the conventions for passing parameters to subroutines [3], in particular the cdecl convention for 32-bit operating systems, which will be used in this exercise.

2. The programming environment

Subroutines in x86 assembly language will be called from a program written in C++. Instruction execution can be traced in a debugger (for example gdb). Unfortunately, x86 assembly syntax differs between popular C/C++ compilers: there are two syntaxes, AT&T syntax used by gcc (GNU Compiler Collection) and Intel syntax used by MSVC (Microsoft Visual C++). We use Intel syntax in this exercise, and gcc offers suitable compile-time options for it (gcc options -S -masm=intel), so the syntax difference is not a problem. What remains is how calls to functions written in x86 assembly are specified within C++ code. For this reason we provide short instructions for both popular compilers, MSVC and gcc; it is up to the student to choose which compiler to use in this exercise.

3. Work with gcc

The easiest way to write an x86 routine in gcc is to write it in a separate file with the extension .s. The file that defines the x86 routine subroutine_asm has the following basic structure:

  // this is a comment (like in C++)
  //
  // syntax directive (we use Intel):
  .intel_syntax noprefix

  // we want the subroutine name (subroutine_asm)
  // to be visible from the C++ code, so we
  // specify its name like this:
  .global subroutine_asm

  // and here we repeat the same label again:
  subroutine_asm:

  // ... assembly code

Routines defined in this manner are called in the same way as ordinary routines in C/C++ (this will be shown a little later). For now, just assume that the main program is in the file main.cpp, while the assembly routine is located in subroutine.s. Compiling and linking can then be done as follows (note that you might need to modify this on 64-bit UNIX systems, as explained in section 7):

 $ g++ -g -o main subroutine.s main.cpp

Program tracing can now be initiated by the command (using gdb):

 $ gdb main

For this exercise, we need only a small subset of the capabilities of gdb described in [4]: break, run, next, step, print and info registers. The way to use these commands is explained in the gdb documentation [5].

4. Work with MSVC

The easiest way to write an assembly routine in MSVC is to provide the function body as a so-called naked function, as follows:

  // the __declspec(naked) directive tells the compiler not to
  // generate any prologue or epilogue code for the function
  // (thus the name naked); parameters are still passed using
  // the cdecl convention, which is the MSVC default for
  // 32-bit x86 code
  int __declspec(naked) function_asm(int i){

    __asm{
      // ... assembly code
    }
  }

Subroutines written in assembly are called in the same way as ordinary routines in C, as will be explained in more detail later. Files containing assembly subroutines are compiled in the standard way: add the subroutine source file to a Visual Studio console project (the project type we use to build console C++ applications) and start the build (Build Solution).

Program tracing can be initiated through the integrated development environment (by clicking on Start debugging). Useful tracing actions are Toggle Breakpoint, Step over and Step into. Useful windows are Watch and Registers.

5. Assembly subroutine structure

A basic introduction to x86 assembly programming can be found in the x86 assembly guide (it is advisable to skim through it). The standard way to access parameters and local variables in a subroutine is through the register ebp (base pointer). To make this possible, under the cdecl convention an assembly subroutine has the following structure (more on the topic is available in [3]):

                     /* cdecl prologue: */
  push  ebp          /* save ebp on the stack */
  mov   ebp, esp     /* copy esp to ebp */

                     /* allocate 4 bytes for the local variables */
                     /* (or more if needed) */
  sub   esp, 4       /* local variables are "under" ebp (the stack */
                     /* grows toward lower addresses) */


                     /* main subroutine functionality */
  ...
                     /* return value is in eax */


                     /* release local variables: */
  add   esp, 4

                     /* cdecl epilogue: */
  pop   ebp          /* instead of 'add esp, 4' and 'pop ebp' we can also write 'leave' */
  ret                /* return from the subroutine */

6. An example

We will write a C/C++ subroutine that computes the expression (a+b)*c for given integer values a, b, c and returns the result. This subroutine in C (subroutine_c) looks like this:

  int subroutine_c(int a, int b, int c) {
    return (a + b) * c;
  }  

The body of the corresponding subroutine written in x86 assembly (subroutine_asm) is the following:

                      /* [ebp] holds the previous value of ebp */
                      /* [ebp+4] holds the return address (the saved eip) */
  mov   eax, [ebp+12] /* b */
  add   eax, [ebp+8]  /* a */
  imul  eax, [ebp+16] /* c */  

The subroutine returns the result in register eax. The previous code snippet presents the main functionality of the function subroutine_asm, but to write a complete function we have to wrap it in the standard prologue and epilogue shown in the previous section (we can paste the snippet at the place denoted by the three dots). Also, since this example has no local variables, we can omit the instructions sub esp, 4 and add esp, 4.
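To see why the parameters sit at [ebp+8], [ebp+12] and [ebp+16], the stack frame can be traced by hand. The following is only a toy model written in C++, not real machine state: the names ToyStack and simulate_subroutine_asm are made up for this illustration, the return address is a placeholder value, and the stack is assumed to hold 16 four-byte words.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack (hypothetical helper, for illustration only).
struct ToyStack {
    std::vector<std::int32_t> words;  // memory, one element per 4-byte word
    std::uint32_t esp;                // byte address of the top of the stack
    std::uint32_t ebp;                // frame (base) pointer

    explicit ToyStack(std::uint32_t size_in_words)
        : words(size_in_words, 0), esp(size_in_words * 4), ebp(0) {}

    // Access the 4-byte word stored at the given byte address.
    std::int32_t& at(std::uint32_t address) { return words.at(address / 4); }

    // The stack grows toward lower addresses.
    void push(std::int32_t value) { esp -= 4; at(esp) = value; }
    std::int32_t pop() { std::int32_t value = at(esp); esp += 4; return value; }
};

// Replays the cdecl call of subroutine_asm(a, b, c) on the toy stack.
int simulate_subroutine_asm(int a, int b, int c) {
    ToyStack s(16);
    const std::uint32_t esp_before_call = s.esp;

    // Caller: push the arguments right to left, then "call".
    s.push(c);
    s.push(b);
    s.push(a);
    s.push(0);  // placeholder for the return address pushed by call

    // Callee prologue.
    s.push(static_cast<std::int32_t>(s.ebp));  // push ebp
    s.ebp = s.esp;                             // mov  ebp, esp

    // Body of the subroutine.
    std::int32_t eax = s.at(s.ebp + 12);  // mov  eax, [ebp+12]  (b)
    eax += s.at(s.ebp + 8);               // add  eax, [ebp+8]   (a)
    eax *= s.at(s.ebp + 16);              // imul eax, [ebp+16]  (c)

    // Callee epilogue.
    s.ebp = static_cast<std::uint32_t>(s.pop());  // pop ebp
    s.pop();                                      // ret (removes the return address)

    // Under cdecl the caller removes the three arguments (add esp, 12).
    s.esp += 12;
    assert(s.esp == esp_before_call);
    return eax;
}
```

Calling simulate_subroutine_asm with the same arguments as subroutine_c should give the same value, as long as the arithmetic does not overflow. The model only mirrors the order of pushes and the resulting offsets.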

Note that in this simple subroutine the prologue and epilogue are not really necessary. That is, the function can be rewritten as follows:

  sub_asm_noebp:
                        /* [esp] return address */
    mov   eax, [esp+8]  /* b */
    add   eax, [esp+4]  /* a */
    imul  eax, [esp+12] /* c */
    ret                 /* return from subroutine */  

Here, all references go through register esp (not ebp as before). In general, however, we will use the prologue and epilogue in our assembly subroutines, since they will not be this simple. Using the prologue and epilogue helps keep the code structured and maintainable. Note finally that most compilers emit a prologue and epilogue when translating C/C++ code to assembly code.

Please note that calling conventions require some registers (e.g. EBX) to be preserved across subroutine calls: a subroutine that modifies such a register must restore its original value before returning. Such registers are denoted as callee-saved in the documentation.

7. Assembly subroutines under 64-bit Linux, FreeBSD and OS X systems

Instructions from sections 3, 5 and 6 do not apply directly to 64-bit UNIX systems, because cdecl is not the default calling convention there. The problem can be solved in either of two ways:
  1. compile the program for a 32-bit platform by supplying the -m32 flag to the g++ invocation
  2. pass parameters under the default calling convention (System V AMD64 ABI), in which the first six integer arguments are passed in registers rdi, rsi, rdx, rcx, r8 and r9 and the integer result is returned in rax
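As a rough sketch of the second option, the example from section 6 can be rewritten for the System V AMD64 ABI, where the three int parameters arrive in edi, esi and edx and the result is returned in eax. To keep the sketch in a single C++ file it uses a top-level asm block, which is a GCC/Clang extension, instead of a separate .s file. The name subroutine_asm64 is made up for this illustration, the assembly is only used on 64-bit Linux, and the compiler is assumed to emit its default AT&T syntax (no -masm=intel). Elsewhere a plain C++ fallback is compiled.

```cpp
#include <cassert>

// C linkage, so that the symbol name is not mangled.
extern "C" int subroutine_asm64(int a, int b, int c);

#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__)
// System V AMD64 ABI: a is in edi, b is in esi, c is in edx.
// No prologue is needed, since nothing is read from the stack.
asm(
    ".pushsection .text\n"
    ".intel_syntax noprefix\n"
    ".globl subroutine_asm64\n"
    "subroutine_asm64:\n"
    "    mov   eax, edi\n"     // eax = a
    "    add   eax, esi\n"     // eax = a + b
    "    imul  eax, edx\n"     // eax = (a + b) * c
    "    ret\n"                // result is returned in eax
    ".att_syntax prefix\n"     // restore the compiler's default syntax
    ".popsection\n"
);
#else
// Fallback for other platforms, so that the sketch still builds.
extern "C" int subroutine_asm64(int a, int b, int c) {
    return (a + b) * c;
}
#endif

// Small self-check against the C++ expression.
void check_subroutine_asm64() {
    assert(subroutine_asm64(2, 3, 4) == (2 + 3) * 4);
}
```

The same four instructions could equally be placed in a .s file after the .intel_syntax noprefix and .global directives shown in section 3.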

8. x86 assembly subroutine call

Calls to assembly subroutines are transparent: they look exactly like calls to C/C++ routines. That means the subroutines subroutine_asm and subroutine_c are called in exactly the same way. If the subroutine declaration is not visible where it is called from the C/C++ program (e.g. the subroutine is defined in a separate file), then it is necessary to provide an appropriate prototype (we just mention it, since this is a usual practice in C/C++ programs).

Assembly subroutines written in pure assembly (in a separate file with the extension .s, gcc) are translated to object code that follows the platform's application binary interface (ABI) for the C language. To call such a subroutine from C++, we need to prefix the subroutine prototype with extern "C" to prevent name mangling (the C++ compiler otherwise encodes the parameter types into the symbol name). Independently of that, on some platforms the compiler decorates function names according to the calling convention; for cdecl on 32-bit Windows it adds an underscore as a prefix. In our example the prototype looks like this:

extern "C" int subroutine_asm(int,int,int);  
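A minimal sketch of how the calling side might look is given below. It compares subroutine_asm with subroutine_c from section 6. The macro USE_ASM_ROUTINE and the helpers routines_agree and check_routines are made up for this illustration: when the macro is not defined, a C++ stand-in with the same name is compiled, so that the file also builds without subroutine.s.

```cpp
#include <cassert>

// Reference implementation in C++ (from section 6).
int subroutine_c(int a, int b, int c) {
    return (a + b) * c;
}

// Prototype of the assembly routine; extern "C" prevents name mangling.
extern "C" int subroutine_asm(int, int, int);

#ifndef USE_ASM_ROUTINE
// Stand-in used only when the real routine from subroutine.s is not linked.
extern "C" int subroutine_asm(int a, int b, int c) {
    return (a + b) * c;
}
#endif

// Both routines are called in exactly the same way.
bool routines_agree(int a, int b, int c) {
    return subroutine_asm(a, b, c) == subroutine_c(a, b, c);
}

// Small self-check that could be called from main.
void check_routines() {
    assert(routines_agree(1, 2, 3));
    assert(routines_agree(-4, 5, 6));
}
```

When the file is built together with subroutine.s, the macro would be defined on the command line (for example with -DUSE_ASM_ROUTINE), so that the stand-in is left out and the linker picks up the assembly routine.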
  

If we want to use gcc on Windows, we need to tell the compiler not to prefix the assembly function name with an underscore when compiling the main function. This is achieved with the asm() keyword in the external subroutine prototype declaration:

extern "C" int subroutine_asm(int,int,int) asm("subroutine_asm");  
  
If we don't do that, we will get a link error, because the linker will not be able to resolve the reference to the symbol _subroutine_asm. There is also another way to achieve the same result: in the assembly code we can add a second global label which has the underscore prefix in its name. It will look like this:
  .global subroutine_asm
  .global _subroutine_asm

  subroutine_asm:
  _subroutine_asm:
  ...
  

9. Exercises

  1. Test and analyse the example with the functions subroutine_asm and subroutine_c which was discussed in the previous sections:
  2. Using the x86 instruction set reference [6], write an assembly subroutine which does the following
  3. Write two subroutines (one in C, the other in assembly) that sum all integer values in the interval [0, n), where n is specified as a subroutine parameter. Instructions:
  4. Write three subroutines (the first using standard C, the second using instructions from the x87 instruction set (for floating point arithmetic), and the third using instructions from the Streaming SIMD Extensions (SSE) instruction set) that sum up two single precision float vectors of length n. Instructions:

    References

    [1] Wikipedia: x86 architecture

    [2] Wikipedia: x86 assembly language

    [3] Wikipedia: x86 calling conventions

    [4] Wikipedia: GNU Debugger

    [5] Using GNU's GDB Debugger

    [6] x86 Instruction Set Reference

    [7] Wikipedia: Streaming SIMD Extensions

    [8] x86 Instruction Set Reference

    [9] x86 Instruction Set Reference

    [10] x86 instruction listings

    Last change: 15th October 2012.