3.0 Module Organization
A module is a collection of objects that are logically related. Those objects
may include constants, data types, variables, and program units (e.g.,
functions, procedures, etc.). Note that objects in a module need not be
physically related. For example, it is quite possible to construct a module
using several different source files. Likewise, it is quite possible to have
several different modules in the same source file. However, the best modules are
physically related as well as logically related; that is, all the objects
associated with a module exist in a single source file (or directory if the
source file would be too large) and nothing else is present.
As noted above, modules contain several different kinds of objects, including
constants, types, variables, and program units (routines). Modules share many
attributes with routines (program units); this is not surprising, since routines
are the major component of a typical module. However, modules have some
additional attributes of their own. The following sections describe the
attributes of a well-written module.
- Note:
- Unit and package are both synonyms for the term module.
3.1 Module Attributes
"Module" is a generic term that describes a set of program-related objects
(program units as well as data and type objects) that are somehow coupled. Good
modules share many of the attributes of good program units and add the ability
to hide certain details from code outside the module.
3.1.1 Module Cohesion
Modules exhibit the following different kinds of cohesion (listed from good
to bad):
- Functional or logical cohesion exists if the module accomplishes exactly
one (simple) task.
- Sequential or pipelined cohesion exists when a module does several
sequential operations that must be performed in a certain order with the data
from one operation being fed to the next in a "filter-like" fashion.
- Global or communicational cohesion exists when a module performs a set of
operations that make use of a common set of data, but are otherwise unrelated.
- Temporal cohesion exists when a module performs a set of operations that
need to be done at the same time (though not necessarily in the same order). A
typical initialization module is an example of such code.
- Procedural cohesion exists when a module performs a sequence of operations
in a specific order, but the only thing that binds them together is the order
in which they must be done. Unlike sequential cohesion, the operations do not
share data.
- State cohesion occurs when several different (unrelated) operations appear
in the same module and a state variable (e.g., a parameter) selects the
operation to execute. Typically such modules contain a case (switch) or
if..elseif..elseif... statement.
- No cohesion exists if the operations in a module have no apparent
relationship with one another.
The first three forms of cohesion above are generally acceptable in a
program. The fourth (temporal) is probably okay, but you should rarely use it.
The last three forms should almost never appear in a program. For some
reasonable examples of module cohesion, you should consult "Code Complete".
- Guideline:
- Design good modules! Good modules exhibit strong cohesion. That is, a
module should offer a (small) group of services that are logically related.
For example, a "printer" module might provide all the services one would
expect from a printer. The individual routines within the module would provide
the individual services.
3.1.2 Module Coupling
Coupling refers to the way that two modules communicate with one another.
There are several criteria that define the level of coupling between two
modules:
- Cardinality- the number of objects communicated between two modules. The
fewer objects the better (i.e., fewer parameters).
- Intimacy- how "private" is the communication? Parameter lists are the most
private form; private data fields in a class or object are the next level;
public data fields in a class or object come next; global variables are even
less intimate; and passing data in a file or database is the least intimate
connection. Well-written modules exhibit a high degree of intimacy.
- Visibility- this is somewhat related to intimacy above. It refers to how
apparent the data you pass between two modules is to someone reading the code.
For example, passing data in a parameter list is direct and very visible (you
always see the data the caller is passing in the call to the routine); passing
data in global variables makes the transfer less visible (you could have set
up the global variable long before the call to the routine). Likewise, passing
simple (scalar) variables is more visible than loading a bunch of values into
a structure/record and passing that structure/record to the callee.
- Flexibility- This refers to how easy it is to make the connection between
two routines that may not have been originally intended to call one another.
For example, suppose you pass a structure containing three fields into a
function. If you want to call that function but you only have three data
objects, not the structure, you would have to create a dummy structure, copy
the three values into the fields of that structure, and then call the
function. On the other hand, had you simply passed the three values as
separate parameters, you could still pass in structures (by specifying each
field) as well as call the function with separate values. The module
containing this latter function is more flexible.
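The flexibility trade-off above can be sketched with MASM 6.x prototypes. This
is only an illustrative fragment, not a complete program: the structure
Point3D, the routines PlotRec and PlotXYZ, and the variable Origin are all
hypothetical names, and the fragment assumes the small memory model with the C
calling convention.

```asm
                .model  small, c

Point3D         struct
X               word    ?
Y               word    ?
Z               word    ?
Point3D         ends

; Less flexible: every caller must first build a Point3D structure.
PlotRec         proto   near, pPoint:ptr Point3D

; More flexible: accepts loose values or individual structure fields.
PlotXYZ         proto   near, xVal:word, yVal:word, zVal:word

                .data
Origin          Point3D <0, 0, 0>

                .code
                ; Caller that only has three separate values:
                invoke  PlotXYZ, 10, 20, 30

                ; Caller that has a structure passes each field:
                invoke  PlotXYZ, Origin.X, Origin.Y, Origin.Z
```

Note that PlotXYZ gains its flexibility at the cost of higher cardinality
(three parameters instead of one).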
A module is loosely coupled if its functions exhibit low cardinality, high
intimacy, high visibility, and high flexibility. Often, these features are in
conflict with one another (e.g., increasing the flexibility by breaking out the
fields from a structure [a good thing] will also increase the cardinality [a
bad thing]). It is the traditional goal of any engineer to choose the
appropriate compromises for each individual circumstance; therefore, you will
need to carefully balance each of the four attributes above.
A module that uses loose coupling generally contains fewer errors per KLOC
(thousands of lines of code). Furthermore, modules that exhibit loose coupling
are easier to reuse (both in the current and future projects). For more
information on coupling, see the appropriate chapter in "Code Complete".
- Guideline:
- Design good modules! Good modules exhibit loose coupling. That is, there
are only a few, well-defined (visible) interfaces between the module and the
outside world. Most data is private, accessible only through accessor
functions (see information hiding below). Furthermore, the interface should be
flexible.
- Guideline:
- Design good modules! Good modules exhibit information hiding. Code outside
the module should only have access to the module through a small set of public
routines. All data should be private to that module. A module should implement
an abstract data type. All access to the module should be through a
well-defined set of operations.
3.1.3 Physical Organization of Modules
Many languages provide direct support for modules (e.g., packages in Ada,
modules in Modula-2, and units in Delphi/Pascal). Some languages provide only
indirect support for modules (e.g., a source file in C/C++). Others, like BASIC,
don't really support modules, so you would have to simulate them by physically
grouping objects together and exercising some discipline. Assembly language
falls into the middle ground. The primary mechanism for hiding names from other
modules is to implement a module as an individual source file and publish only
those names that are part of the module's interface to the outside world.
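As a minimal sketch of this mechanism, the following source file exports one
routine and keeps its data private simply by not publishing it. The file name
Counter.asm and the symbols IncCount and Count are hypothetical, and the code
assumes the small memory model with DS already pointing at the data segment.

```asm
; Module: Counter.asm  (hypothetical example)

                .model  small

                public  IncCount        ;Part of the module's interface.

                .data
Count           word    0               ;Private: never declared public.

                .code

; IncCount- Adds one to the module's private counter.

IncCount        proc    near
                inc     Count
                ret
IncCount        endp

                end
```

Because no public directive names Count, other source files cannot refer to
it; they can only affect it by calling IncCount.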
- Rule:
- Each module should completely reside in a single source file. If size
considerations prevent this, then all the source files for a given module
should reside in a subdirectory specifically designated for that module.
Some people have the crazy idea that modularization means putting each
function in a separate source file. Such physical modularization generally
impairs the readability of a program more than it helps. Strive instead for
logical modularization, that is, defining a module by its actions rather than by
source code syntax (e.g., separating out functions).
This document does not address the decomposition of a problem into its
modular components. Presumably, you can already handle that part of the task.
There are a wide variety of texts on this subject if you feel weak in this
area.
3.1.4 Module Interface
In any language system that supports modules, there are two primary
components of a module: the interface component that publicizes the module's
visible names and the implementation component that contains the actual code,
data, and private objects. MASM (like most assemblers) uses a scheme that is
very similar to the one C/C++ uses. There are directives that let you import and
export names. As in C/C++, you could place these directives directly in the
related source modules. However, such code is difficult to maintain (since you
need to change the directives in every file whenever you modify a public name).
The solution, as adopted in the C/C++ programming languages, is to use header
files. Header files contain all the public definitions and exports (as well as
common data type definitions and constant definitions). The header file provides
the interface to the other modules that want to use the code present in the
implementation module.
The MASM 6.x externdef directive is perfect for creating interface files.
When you use externdef within a source module that defines a symbol, externdef
behaves like the public directive, exporting the name to other modules. When you
use externdef within a source module that refers to an external name, externdef
behaves like the extern (or extrn) directive. This lets you place an externdef
directive in a single file and include this file into both the modules that
import and the modules that export the public names.
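A short sketch of such an interface file follows. The file names (Printer.a,
Printer.asm, Main.asm) and the symbols PrintStr and PrintErrs are hypothetical,
chosen only for illustration.

```asm
; Module: Printer.a  (hypothetical interface file)

                externdef PrintStr:near, PrintErrs:word


; In Printer.asm, which defines PrintStr and PrintErrs, the
; externdef above acts like a public directive:

                include Printer.a


; In Main.asm, which only refers to those symbols, the very
; same line acts like an extern directive:

                include Printer.a
```

Changing a public name now means editing one line in Printer.a rather than a
directive in every file that uses the module.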
If you are using an assembler that does not support externdef, you should
probably consider switching to MASM 6.x. If switching to a better assembler
(one that supports externdef) is not feasible, the last thing you want is to
maintain the interface information in several separate files. Instead, use the
assembler's ifdef conditional assembly directive in the header file: if a
symbol with the module's name is defined before the header file is included,
the header file assembles a set of public statements; otherwise, it assembles
a set of extrn statements. Although you still have to maintain the public and
external information in two places (the true and false sections of the ifdef),
they are in the same file and located near one another.
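The ifdef scheme might look something like the following sketch. The module
name Printer and the symbols PrintStr and PrintErrs are hypothetical; the only
assumption is that the implementation file defines a symbol named after the
module before it includes the header.

```asm
; Module: Printer.a  (hypothetical; for assemblers without externdef)

                ifdef   Printer         ;Defined only by Printer.asm.
                public  PrintStr, PrintErrs
                else
                extrn   PrintStr:near, PrintErrs:word
                endif


; In Printer.asm (the implementation), define the module symbol
; first so the header emits the public statements:

Printer         =       0
                include Printer.a


; In any other source file, include the header without defining
; Printer so the header emits the extrn statements:

                include Printer.a
```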
- Rule:
- Keep all module interface directives (public, extrn, extern, and
externdef) in a single header file for a given module. Place any other common
data type definitions and constant definitions in this header file as well.
- Guideline:
- There should only be a single header file associated with any one module
(even if the module has multiple source files associated with it). If, for
some reason, you feel it is necessary to have multiple header files associated
with a module, you should create a single file that includes all of the other
interface files. That way a program that wants to use all the header files
need only include the single file.
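For instance, if a module really did need two interface files, the single
"umbrella" header could be as simple as the sketch below. The file names
Printer.a, PrnText.a, and PrnGfx.a are hypothetical.

```asm
; Module: Printer.a  (hypothetical umbrella header)
;
; Includes every interface file that belongs to the printer module
; so that client code needs only one include directive.

                include PrnText.a       ;Text output interface.
                include PrnGfx.a        ;Graphics output interface.
```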
When designing header files, make sure you can include a file more than once
without ill effects (e.g., duplicate symbol errors). The traditional way to do
this is to put an IFNDEF statement like the following around all the statements
in a header file:
; Module: MyHeader.a

                ifndef  MyHeader_A
MyHeader_A      =       0
                 .
                 .      ;Statements in this header file.
                 .
                endif
The first time a source file includes "MyHeader.a" the symbol "MyHeader_A" is
undefined. Therefore, the assembler will process all the statements in the
header file. In successive include operations (during the same assembly) the
symbol "MyHeader_A" is already defined, so the assembler ignores the body of the
include file.
Why would you ever include a file twice? Easy. Some header files may include
other header files. By including the file "YourHeader.a" a module might also be
including "MyHeader.a" (assuming "YourHeader.a" contains the appropriate include
directive). Your main program, which includes "YourHeader.a", might also need
"MyHeader.a", so it explicitly includes this file, not realizing "YourHeader.a"
has already processed "MyHeader.a". Without the IFNDEF guard, this second
inclusion would cause symbol redefinition errors.
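The scenario just described can be sketched as follows. The contents shown for
"YourHeader.a" and the main program are assumptions made for illustration; only
the file names and the "MyHeader_A" guard symbol come from the text above.

```asm
; Module: YourHeader.a

                ifndef  YourHeader_A
YourHeader_A    =       0
                include MyHeader.a      ;YourHeader.a depends on MyHeader.a.
                 .
                 .      ;Statements in this header file.
                 .
                endif


; Main program (hypothetical):

                include YourHeader.a    ;Also pulls in MyHeader.a.
                include MyHeader.a      ;MyHeader_A is already defined, so
                                        ; the assembler skips the body.
```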
- Rule:
- Always put an appropriate IFNDEF statement around all the definitions in a
header file to allow multiple inclusion of the header file without ill effect.
- Guideline:
- Use the ".a" suffix for assembly language header/interface files.
- Rule:
- Include files for library functions on a system should exist as ".a" files
and should appear in the "/include" or "/asminc" subdirectory.
- Guideline:
- "/asminc" is probably a better choice if you're using multiple languages
since those other languages may need to put files in a "/include" directory.
- Exception:
- It's probably reasonable to leave the UCR Standard Library's "stdlib.a"
file in the "/stdlib/include" directory since most people expect it there.