Assembly Language Style Guidelines


Assembly Language Style Guidelines - Module Organization

3.0 Module Organization

A module is a collection of objects that are logically related. Those objects may include constants, data types, variables, and program units (functions, procedures, and so on). Note that objects in a module need not be physically related. For example, it is quite possible to construct a module using several different source files. Likewise, it is quite possible to have several different modules in the same source file. However, the best modules are physically related as well as logically related; that is, all the objects associated with a module exist in a single source file (or directory if the source file would be too large) and nothing else is present.

Modules contain several different objects including constants, types, variables, and program units (routines). Modules share many attributes with routines (program units); this is not surprising since routines are the major component of a typical module. However, modules have some additional attributes of their own. The following sections describe the attributes of a well-written module.

Note:: Unit and package are both synonyms for the term module.

3.1 Module Attributes

"Module" is a generic term that describes a set of program-related objects (program units as well as data and type objects) that are somehow coupled. Good modules share many attributes with good program units; in addition, they can hide certain details from code outside the module.

3.1.1 Module Cohesion

Modules exhibit the following different kinds of cohesion (listed from good to bad):

Functional or logical cohesion exists if the module accomplishes exactly one (simple) task.
Sequential or pipelined cohesion exists when a module does several sequential operations that must be performed in a certain order with the data from one operation being fed to the next in a "filter-like" fashion.
Global or communicational cohesion exists when a module performs a set of operations that make use of a common set of data, but are otherwise unrelated.
Temporal cohesion exists when a module performs a set of operations that need to be done at the same time (though not necessarily in the same order). A typical initialization module is an example of such code.
Procedural cohesion exists when a module performs a sequence of operations in a specific order, but the only thing that binds them together is the order in which they must be done. Unlike sequential cohesion, the operations do not share data.
State cohesion occurs when several different (unrelated) operations appear in the same module and a state variable (e.g., a parameter) selects the operation to execute. Typically such modules contain a case (switch) or if..elseif..elseif... statement.
No cohesion exists if the operations in a module have no apparent relationship with one another.

The first three forms of cohesion above are generally acceptable in a program. The fourth (temporal) is probably okay, but you should rarely use it. The last three forms should almost never appear in a program. For some reasonable examples of module cohesion, you should consult "Code Complete".

Guideline:: Design good modules! Good modules exhibit strong cohesion. That is, a module should offer a (small) group of services that are logically related. For example, a "printer" module might provide all the services one would expect from a printer. The individual routines within the module would provide the individual services.

3.1.2 Module Coupling

Coupling refers to the way that two modules communicate with one another. There are several criteria that define the level of coupling between two modules:

Cardinality- the number of objects communicated between two modules. The fewer objects the better (i.e., fewer parameters).
Intimacy- how "private" is the communication? Parameter lists are the most private form; private data fields in a class or object are the next level; public data fields in a class or object come next; global variables are even less intimate; and passing data in a file or database is the least intimate connection. Well-written modules exhibit a high degree of intimacy.
Visibility- this is somewhat related to intimacy above. It refers to how obvious the transfer of data between two modules is to someone reading the code. For example, passing data in a parameter list is direct and very visible (you always see the data the caller is passing in the call to the routine); passing data in global variables makes the transfer less visible (you could have set up the global variable long before the call to the routine). Likewise, passing simple (scalar) variables is more visible than loading a bunch of values into a structure/record and passing that structure/record to the callee.
Flexibility- This refers to how easy it is to make the connection between two routines that may not have been originally intended to call one another. For example, suppose you pass a structure containing three fields into a function. If you want to call that function but you only have three data objects, not the structure, you would have to create a dummy structure, copy the three values into the fields of that structure, and then call the function. On the other hand, had you simply passed the three values as separate parameters, you could still pass in structures (by specifying each field) as well as call the function with separate values. The module containing this latter function is more flexible.
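
The flexibility trade-off can be sketched in MASM 6.x syntax as follows. This is only an illustration: the names Point3, Pt, SumPt, Sum3, and UseSum3 are hypothetical, and the code assumes DS already addresses dseg when the routines run.

```asm
; A sketch only: Point3, Pt, SumPt, Sum3, and UseSum3 are hypothetical names.

Point3          struct
X               word     ?
Y               word     ?
Z               word     ?
Point3          ends

dseg            segment  para public 'data'
Pt              Point3   {1, 2, 3}
dseg            ends

cseg            segment  para public 'code'
                assume   cs:cseg, ds:dseg   ;Assumes DS already points at dseg.

; Less flexible: the caller must build a Point3 and pass its address in BX.
SumPt           proc     near
                mov      ax, [bx].Point3.X
                add      ax, [bx].Point3.Y
                add      ax, [bx].Point3.Z
                ret
SumPt           endp

; More flexible: the caller passes the three values in AX, CX, and DX.
Sum3            proc     near
                add      ax, cx
                add      ax, dx
                ret
Sum3            endp

; A caller that holds a structure can still use Sum3 by naming each field.
UseSum3         proc     near
                mov      ax, Pt.X
                mov      cx, Pt.Y
                mov      dx, Pt.Z
                call     Sum3               ;Returns the sum in AX.
                ret
UseSum3         endp

cseg            ends
                end
```

A caller holding three unrelated values can call Sum3 directly, whereas calling SumPt would first require copying those values into a dummy Point3. The price is that Sum3 passes three objects rather than one, which raises its cardinality.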

A module is loosely coupled if its functions exhibit low cardinality, high intimacy, high visibility, and high flexibility. Often, these features are in conflict with one another (e.g., increasing the flexibility by breaking out the fields from a structure [a good thing] will also increase the cardinality [a bad thing]). It is the traditional goal of any engineer to choose the appropriate compromises for each individual circumstance; therefore, you will need to carefully balance each of the four attributes above.

A module that uses loose coupling generally contains fewer errors per KLOC (thousand lines of code). Furthermore, modules that exhibit loose coupling are easier to reuse (both in the current and future projects). For more information on coupling, see the appropriate chapter in "Code Complete".

Guideline:: Design good modules! Good modules exhibit loose coupling. That is, there are only a few, well-defined (visible) interfaces between the module and the outside world. Most data is private, accessible only through accessor functions (see information hiding below). Furthermore, the interface should be flexible.
Guideline:: Design good modules! Good modules exhibit information hiding. Code outside the module should only have access to the module through a small set of public routines. All data should be private to that module. A module should implement an abstract data type. All access to the module should be through a well-defined set of operations.

3.1.3 Physical Organization of Modules

Many languages provide direct support for modules (e.g., packages in Ada, modules in Modula-2, and units in Delphi/Pascal). Some languages provide only indirect support for modules (e.g., a source file in C/C++). Others, like BASIC, don't really support modules, so you would have to simulate them by physically grouping objects together and exercising some discipline. Assembly language falls into the middle ground. The primary mechanism for hiding names from other modules is to implement a module as an individual source file and publish only those names that are part of the module's interface to the outside world.

Rule:: Each module should completely reside in a single source file. If size considerations prevent this, then all the source files for a given module should reside in a subdirectory specifically designated for that module.

Some people have the crazy idea that modularization means putting each function in a separate source file. Such physical modularization generally impairs the readability of a program more than it helps. Strive instead for logical modularization, that is, defining a module by its actions rather than by source code syntax (e.g., separating out functions).

This document does not address the decomposition of a problem into its modular components. Presumably, you can already handle that part of the task. There are a wide variety of texts on this subject if you feel weak in this area.

3.1.4 Module Interface

In any language system that supports modules, there are two primary components of a module: the interface component that publicizes the module's visible names and the implementation component that contains the actual code, data, and private objects. MASM (like most assemblers) uses a scheme that is very similar to the one C/C++ uses. There are directives that let you import and export names. As in C/C++, you could place these directives directly in the related source modules. However, such code is difficult to maintain (since you need to change the directives in every file whenever you modify a public name). The solution, as adopted in the C/C++ programming languages, is to use header files. Header files contain all the public definitions and exports (as well as common data type definitions and constant definitions). The header file provides the interface to the other modules that want to use the code present in the implementation module.

The MASM 6.x externdef directive is perfect for creating interface files. When you use externdef within a source module that defines a symbol, externdef behaves like the public directive, exporting the name to other modules. When you use externdef within a source module that refers to an external name, externdef behaves like the extern (or extrn) directive. This lets you place an externdef directive in a single file and include this file into both the modules that import and export the public names.
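
As a minimal sketch of this arrangement (three files shown in one listing), consider a hypothetical "Counter" module. The file names Counter.a, Counter.asm, and Main.asm, and the symbols IncCount, GetCount, and Count, are assumptions made up for illustration; the code also assumes DS addresses dseg when the routines run.

```asm
; File: Counter.a  (hypothetical interface file)

                externdef IncCount:near, GetCount:near


; File: Counter.asm  (defines the symbols, so externdef acts like public)

                include  Counter.a

dseg            segment  para public 'data'
Count           word     0                  ;Private: no externdef for this name.
dseg            ends

cseg            segment  para public 'code'
                assume   cs:cseg, ds:dseg   ;Assumes DS already points at dseg.

IncCount        proc     near
                inc      Count
                ret
IncCount        endp

GetCount        proc     near
                mov      ax, Count          ;Return the current count in AX.
                ret
GetCount        endp

cseg            ends
                end


; File: Main.asm  (only refers to the symbols, so externdef acts like extern)

                include  Counter.a

cseg            segment  para public 'code'
                assume   cs:cseg

Main            proc     near
                call     IncCount
                call     GetCount           ;AX now holds the count.
                ret
Main            endp

cseg            ends
                end
```

Because Count never appears in Counter.a, code outside the module can reach it only through IncCount and GetCount.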

If you are using an assembler that does not support externdef, you should probably consider switching to MASM 6.x. If switching to a better assembler (that supports externdef) is not feasible, the last thing you want to do is have to maintain the interface information in several separate files. Instead, use the assembler's ifdef conditional assembly directives so that the header file assembles a set of public statements if a symbol with the module's name is defined prior to including the header file, and a set of extrn statements otherwise. Although you still have to maintain the public and external information in two places (in the ifdef true and false sections), they are in the same file and located near one another.
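
One way this fallback might look is sketched below. The module name Counter and the routine names IncCount and GetCount are hypothetical, and the sketch assumes the routines are near procedures.

```asm
; File: Counter.a  (hypothetical interface file, no externdef available)

                ifdef    Counter
                public   IncCount, GetCount             ;Inside the module: export.
                else
                extrn    IncCount:near, GetCount:near   ;Everywhere else: import.
                endif


; File: Counter.asm  (the implementation module)

Counter         =        0                  ;Define the module's name first...
                include  Counter.a          ;...so the header assembles the public list.

; ...definitions of IncCount and GetCount follow here...


; File: Main.asm  (a client; Counter is undefined here)

                include  Counter.a          ;Assembles the extrn list instead.
```

Adding a public routine still means editing both branches of the ifdef, but the two lists sit a few lines apart in one file.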

Rule:: Keep all module interface directives (public, extrn, extern, and externdef) in a single header file for a given module. Place any other common data type definitions and constant definitions in this header file as well.
Guideline:: There should only be a single header file associated with any one module (even if the module has multiple source files associated with it). If, for some reason, you feel it is necessary to have multiple header files associated with a module, you should create a single file that includes all of the other interface files. That way a program that wants to use all the header files need only include the single file.
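
If a module really does need several interface files, the single covering file might be sketched like this. The module name Printer and the three included file names are hypothetical.

```asm
; File: Printer.a  (hypothetical single header for a "Printer" module)

                ifndef   Printer_A
Printer_A       =        0

                include  PrnText.a          ;Hypothetical: text output interface.
                include  PrnGraph.a         ;Hypothetical: graphics output interface.
                include  PrnStat.a          ;Hypothetical: status query interface.

                endif
```

Client code then includes only Printer.a and need not know how the interface is split up internally.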

When designing header files, make sure you can include a file more than once without ill effects (e.g., duplicate symbol errors). The traditional way to do this is to put an IFNDEF statement like the following around all the statements in a header file:



; Module: MyHeader.a

                ifndef   MyHeader_A
MyHeader_A      =       0
                .
                .       ;Statements in this header file.
                .
                endif

The first time a source file includes "MyHeader.a" the symbol "MyHeader_A" is undefined. Therefore, the assembler will process all the statements in the header file. In successive include operations (during the same assembly) the symbol "MyHeader_A" is already defined, so the assembler ignores the body of the include file.

Why would you ever include a file twice? Easy. Some header files may include other header files. By including the file "YourHeader.a" a module might also be including "MyHeader.a" (assuming "YourHeader.a" contains the appropriate include directive). Your main program, which includes "YourHeader.a", might also need "MyHeader.a", so it explicitly includes this file, not realizing "YourHeader.a" has already processed "MyHeader.a". Without the IFNDEF statement, this second inclusion would cause symbol redefinitions.
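
The double inclusion described above might arise as in the following sketch. The file Main.asm and the guard symbol YourHeader_A are assumptions for illustration, and "MyHeader.a" is assumed to carry its own IFNDEF statement.

```asm
; File: YourHeader.a

                ifndef   YourHeader_A
YourHeader_A    =        0

                include  MyHeader.a         ;YourHeader.a depends on MyHeader.a.

; ...statements in this header file...

                endif


; File: Main.asm  (hypothetical main program)

                include  YourHeader.a       ;Also processes MyHeader.a.
                include  MyHeader.a         ;MyHeader_A is already defined, so the
                                            ; assembler skips the body this time.
```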

Rule:: Always put an appropriate IFNDEF statement around all the definitions in a header file to allow multiple inclusion of the header file without ill effect.
Guideline:: Use the ".a" suffix for assembly language header/interface files.
Rule:: Include files for library functions on a system should exist as ".a" files and should appear in the "/include" or "/asminc" subdirectory.
Guideline:: "/asminc" is probably a better choice if you're using multiple languages since those other languages may need to put files in a "/include" directory.
Exception:: It's probably reasonable to leave the UCR Standard Library's "stdlib.a" file in the "/stdlib/include" directory since most people expect it there.
