This chapter presents several topics related to program performance.
It first describes some of the tradeoffs that need to be considered
and some of the techniques for making your program run faster.
It then documents the gnatelim tool, which can reduce the size of program executables.
7.1 Performance Considerations
7.2 Reducing the Size of Ada Executables with gnatelim
7.1 Performance Considerations
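The GNAT system provides a number of options that allow a trade-off between performance of the generated code, speed of compilation, minimization of dependences and recompilation, and the degree of run-time checking.
The defaults (if no options are selected) aim at improving the speed of compilation and minimizing dependences, at the expense of performance of the generated code: no optimization, no inlining of subprogram calls, and all run-time checks enabled except overflow and elaboration checks.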
These options are suitable for most program development purposes. This chapter describes how you can modify these choices, and also provides some guidelines on debugging optimized code.
7.1.1 Controlling Run-Time Checks
7.1.2 Use of Restrictions
7.1.3 Optimization Levels
7.1.4 Debugging Optimized Code
7.1.5 Inlining of Subprograms
7.1.6 Optimization and Strict Aliasing
7.1.1 Controlling Run-Time Checks
By default, GNAT generates all run-time checks, except arithmetic overflow checking for integer operations and checks for access before elaboration on subprogram calls. The latter are not required in default mode, because all necessary checking is done at compile time. Two GNAT switches, `-gnatp' and `-gnato', allow this default to be modified. See section 3.2.6 Run-Time Checks.
Our experience is that the default is suitable for most development purposes.
We treat integer overflow checks specially because they are quite expensive and, in our experience, are not as important as other run-time checks in the development process. Note that division by zero is not considered an overflow check, and divide by zero checks are generated where required by default.
Elaboration checks are off by default, and also not needed by default, since GNAT uses a static elaboration analysis approach that avoids the need for run-time checking. This manual contains a full chapter discussing the issue of elaboration checks, and if the default is not satisfactory for your use, you should read that chapter.
For validity checks, the minimal checks required by the Ada Reference Manual (for case statements and assignments to array elements) are on by default. These can be suppressed by use of the `-gnatVn' switch. Note that in Ada 83, there were no validity checks, so if the Ada 83 mode is acceptable (or when comparing GNAT performance with an Ada 83 compiler), it may be reasonable to routinely use `-gnatVn'. Validity checks are also suppressed entirely if `-gnatp' is used.
Note that the setting of the switches controls the default setting of the checks. They may be modified using either pragma Suppress (to remove checks) or pragma Unsuppress (to add back suppressed checks) in the program source.
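As a minimal sketch of how the two pragmas can be applied in source, the following hypothetical program (the names Checks_Demo, Sum, and Double are illustrative and do not come from this manual) removes index checks inside one function and requests overflow checks inside another, whatever the switch settings are:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Checks_Demo is
   type Index is range 1 .. 8;
   type Vector is array (Index) of Integer;

   --  Index checks are suppressed inside this function only.
   function Sum (V : Vector) return Integer is
      pragma Suppress (Index_Check);
      Total : Integer := 0;
   begin
      for J in V'Range loop
         Total := Total + V (J);
      end loop;
      return Total;
   end Sum;

   --  Overflow checks are requested here even if a switch or an
   --  enclosing pragma Suppress has turned them off.
   function Double (X : Integer) return Integer is
      pragma Unsuppress (Overflow_Check);
   begin
      return X * 2;
   end Double;

   Data : constant Vector := (others => 1);
begin
   Put_Line (Integer'Image (Double (Sum (Data))));
end Checks_Demo;
```

Placing each pragma in the declarative part of a subprogram limits its effect to that subprogram, so the rest of the unit keeps the defaults selected by the switches.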
7.1.2 Use of Restrictions
The use of pragma Restrictions allows you to control which features are permitted in your program. Apart from the obvious point that avoiding relatively expensive features like finalization (enforceable by the use of pragma Restrictions (No_Finalization)) saves their cost, the use of this pragma does not affect the generated code in most cases.
One notable exception to this rule is that the possibility of task abort results in some distributed overhead, particularly if finalization or exception handlers are used. The reason is that certain sections of code have to be marked as non-abortable.
If you use neither the abort statement nor asynchronous transfer of control (select ... then abort), then this distributed overhead is removed, which may improve overall performance. In particular, code that makes frequent use of tasking constructs and controlled types will show much improved performance. The relevant restrictions pragmas are
    pragma Restrictions (No_Abort_Statements);
    pragma Restrictions (Max_Asynchronous_Select_Nesting => 0);
It is recommended that these restriction pragmas be used if possible. Note that this also means that you can write code without worrying about the possibility of an immediate abort at any point.
7.1.3 Optimization Levels
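The default is optimization off. This results in the fastest compile times, but GNAT makes absolutely no attempt to optimize, and the generated programs are considerably larger and slower than when optimization is enabled. You can use the `-On' switch to gcc, where n is an integer from 0 to 3, to control the optimization level:
- `-O0': no optimization (the default)
- `-O1': medium level optimization
- `-O2': full optimization
- `-O3': full optimization, and also attempt automatic inlining of small subprograms within a unit (see 7.1.5 Inlining of Subprograms)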
Higher optimization levels perform more global transformations on the program and apply more expensive analysis algorithms in order to generate faster and more compact code. The price in compilation time, and the resulting improvement in execution time, both depend on the particular application and the hardware environment. You should experiment to find the best level for your application.
Since the precise set of optimizations done at each level will vary from release to release (and sometimes from target to target), it is best to think of the optimization settings in general terms. The Using GNU GCC manual contains details about the `-O' settings and a number of `-f' options that individually enable or disable specific optimizations.
Unlike some other compilation systems, gcc has been tested extensively at all optimization levels. There are some bugs which appear only with optimization turned on, but there have also been bugs which show up only in unoptimized code. Selecting a lower level of optimization does not improve the reliability of the code generator, which in practice is highly reliable at all optimization levels.
Note regarding the use of `-O3': The use of this optimization level is generally discouraged with GNAT, since it often results in larger executables which run more slowly. See further discussion of this point in 7.1.5 Inlining of Subprograms.
7.1.4 Debugging Optimized Code
Although it is possible to do a reasonable amount of debugging at non-zero optimization levels, the higher the level the more likely that source-level constructs will have been eliminated by optimization. For example, if a loop is strength-reduced, the loop control variable may be completely eliminated and thus cannot be displayed in the debugger. This can only happen at `-O2' or `-O3'. Explicit temporary variables that you code might be eliminated at level `-O1' or higher.
The use of the `-g' switch, which is needed for source-level debugging, affects the size of the program executable on disk, and indeed the debugging information can be quite large. However, it has no effect on the generated code (and thus does not degrade performance).
Since the compiler generates debugging tables for a compilation unit before it performs optimizations, the optimizing transformations may invalidate some of the debugging data. You therefore need to anticipate certain anomalous situations that may arise while debugging optimized code. These are the most common cases:
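- Repeated step or next commands show the PC bouncing back and forth in the code. This may result from optimizations such as common subexpression elimination, invariant code motion, and instruction scheduling.
- Cross-jumping, in which two identical pieces of code are merged, can make the program counter suddenly jump to a statement that is not supposed to be executed. This effect is typically seen in sequences that end in a jump, such as a goto, a return, or a break in a C switch statement.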
In general, when an unexpected value appears for a local variable or parameter you should first ascertain if that value was actually computed by your program, as opposed to being incorrectly reported by the debugger. Record fields or array elements in an object designated by an access value are generally less of a problem, once you have ascertained that the access value is sensible. Typically, this means checking variables in the preceding code and in the calling subprogram to verify that the value observed is explainable from other values (one must apply the procedure recursively to those other values); or re-running the code and stopping a little earlier (perhaps before the call) and stepping to better see how the variable obtained the value in question; or continuing to step from the point of the strange value to see if code motion had simply moved the variable's assignments later.
In light of such anomalies, a recommended technique is to use `-O0' early in the software development cycle, when extensive debugging capabilities are most needed, and then move to `-O1' and later `-O2' as the debugger becomes less critical. Whether to use the `-g' switch in the release version is a release management issue. Note that if you use `-g' you can then use the strip program on the resulting executable, which removes both debugging information and global symbols.
7.1.5 Inlining of Subprograms
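A call to a subprogram in the current unit is inlined if all the following conditions are met:
- The optimization level is at least `-O1'.
- The called subprogram is suitable for inlining: it must be small enough and must not contain nested subprograms or anything else that gcc cannot support in inlined subprograms.
- The call occurs after the definition of the body of the subprogram.
- Either pragma Inline applies to the subprogram or it is small and automatic inlining (optimization level `-O3') is specified.
Calls to subprograms in with'ed units are normally not inlined. To achieve this level of inlining, the following conditions must all be true:
- The optimization level is at least `-O1'.
- The called subprogram is suitable for inlining: it must be small enough and must not contain nested subprograms or anything else that gcc cannot support in inlined subprograms.
- The call appears in a body (not in a package spec).
- There is a pragma Inline for the subprogram.
- The `-gnatn' switch is used in the gcc command line.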
Note that specifying the `-gnatn' switch causes additional compilation dependencies. Consider the following:
    package R is
       procedure Q;
       pragma Inline (Q);
    end R;
    package body R is
       ...
    end R;

    with R;
    procedure Main is
    begin
       ...
       R.Q;
    end Main;
With the default behavior (no `-gnatn' switch specified), the compilation of the Main procedure depends only on its own source, `main.adb', and the spec of the package in file `r.ads'. This means that editing the body of R does not require recompiling Main.
On the other hand, the call R.Q is not inlined under these circumstances. If the `-gnatn' switch is present when Main is compiled, the call will be inlined if the body of Q is small enough, but now Main depends on the body of R in `r.adb' as well as on the spec. This means that if this body is edited, the main program must be recompiled. Note that this extra dependency occurs whether or not the call is in fact inlined by gcc.
The use of front end inlining with `-gnatN' generates similar additional dependencies.
Note: The `-fno-inline' switch can be used to prevent all inlining. This switch overrides all other conditions and ensures that no inlining occurs. The extra dependences resulting from `-gnatn' will still be active, even if this switch is used to suppress the resulting inlining actions.
Note regarding the use of `-O3': There is no difference in inlining behavior between `-O2' and `-O3' for subprograms with an explicit pragma Inline, assuming the use of `-gnatn' or `-gnatN' (the switches that activate inlining). If you have used pragma Inline in appropriate cases, then it is usually much better to use `-O2' and `-gnatn' and avoid the use of `-O3', which in this case only has the effect of inlining subprograms you did not think should be inlined. We often find that the use of `-O3' slows down code by performing excessive inlining, leading to increased instruction cache pressure from the increased code size. So the bottom line here is that you should not automatically assume that `-O3' is better than `-O2', and indeed you should use `-O3' only if tests show that it actually improves performance.
7.1.6 Optimization and Strict Aliasing
The strong typing capabilities of Ada allow an optimizer to generate efficient code in situations where other languages would be forced to make worst case assumptions preventing such optimizations. Consider the following example:
    procedure R is
       type Int1 is new Integer;
       type Int2 is new Integer;
       type Int1A is access Int1;
       type Int2A is access Int2;
       Int1V : Int1A;
       Int2V : Int2A;
       ...
    begin
       ...
       for J in Data'Range loop
          if Data (J) = Int1V.all then
             Int2V.all := Int2V.all + 1;
          end if;
       end loop;
       ...
    end R;
In this example, since the variable Int1V can only access objects of type Int1, and Int2V can only access objects of type Int2, there is no possibility that the assignment to Int2V.all affects the value of Int1V.all. This means that the compiler optimizer can "know" that the value Int1V.all is constant for all iterations of the loop and avoid the extra memory reference required to dereference it each time through the loop.
This kind of optimization, called strict aliasing analysis, is triggered by specifying an optimization level of `-O2' or higher and allows GNAT to generate more efficient code when access values are involved.
However, although this optimization is always correct in terms of the formal semantics of the Ada Reference Manual, difficulties can arise if features like Unchecked_Conversion are used to break the typing system. Consider the following complete program example:
    package p1 is
       type int1 is new integer;
       type int2 is new integer;
       type a1 is access int1;
       type a2 is access int2;
    end p1;

    with p1; use p1;
    package p2 is
       function to_a2 (Input : a1) return a2;
    end p2;

    with Unchecked_Conversion;
    package body p2 is
       function to_a2 (Input : a1) return a2 is
          function to_a2u is
          new Unchecked_Conversion (a1, a2);
       begin
          return to_a2u (Input);
       end to_a2;
    end p2;

    with p2; use p2;
    with p1; use p1;
    with Text_IO; use Text_IO;
    procedure m is
       v1 : a1 := new int1;
       v2 : a2 := to_a2 (v1);
    begin
       v1.all := 1;
       v2.all := 0;
       put_line (int1'image (v1.all));
    end;
This program prints out 0 in -O0 or -O1 mode, but it prints out 1 in -O2 mode. That's because in strict aliasing mode, the compiler can and does assume that the assignment to v2.all could not affect the value of v1.all, since different types are involved.
This behavior is not a case of non-conformance with the standard, since the Ada RM specifies that an unchecked conversion where the resulting bit pattern is not a correct value of the target type can result in an abnormal value and attempting to reference an abnormal value makes the execution of a program erroneous. That's the case here since the result does not point to an object of type int2. This means that the effect is entirely unpredictable.
However, although that explanation may satisfy a language lawyer, in practice an applications programmer expects an unchecked conversion involving pointers to create true aliases and the behavior of printing 1 seems plain wrong. In this case, the strict aliasing optimization is unwelcome.
Indeed the compiler recognizes this possibility, and the unchecked conversion generates a warning:
    p2.adb:5:07: warning: possible aliasing problem with type "a2"
    p2.adb:5:07: warning: use -fno-strict-aliasing switch for references
    p2.adb:5:07: warning:  or use "pragma No_Strict_Aliasing (a2);"
Unfortunately the problem is recognized when compiling the body of package p2, but the actual "bad" code is generated while compiling the body of m and this latter compilation does not see the suspicious Unchecked_Conversion.
As implied by the warning message, there are approaches you can use to avoid the unwanted strict aliasing optimization in a case like this.
One possibility is to simply avoid the use of -O2, but that is a bit drastic, since it throws away a number of useful optimizations that do not involve strict aliasing assumptions.
A less drastic approach is to compile the program using the option -fno-strict-aliasing. Actually it is only the unit containing the dereferencing of the suspicious pointer that needs to be compiled. So in this case, if we compile unit m with this switch, then we get the expected value of zero printed. Analyzing which units might need the switch can be painful, so a more reasonable approach is to compile the entire program with options -O2 and -fno-strict-aliasing. If the performance is satisfactory with this combination of options, then the advantage is that the entire issue of possible "wrong" optimization due to strict aliasing is avoided.
To avoid the use of compiler switches, the configuration pragma No_Strict_Aliasing with no parameters may be used to specify that for all access types, the strict aliasing optimization should be suppressed.
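As a small sketch of the configuration-pragma form, assuming the program is compiled from a directory that contains a `gnat.adc' configuration file (the file placement is an assumption of this sketch; the pragma itself is the one named above), the file could contain:

```ada
--  gnat.adc: configuration pragmas applied to the units compiled
--  with this file (placement of the file is an assumption here).
--  With no parameters, the pragma covers all access types.
pragma No_Strict_Aliasing;
```

Because a configuration pragma changes how units are compiled, it is prudent to recompile the whole program after adding it, so that every unit is built with the same setting.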
However, these approaches are still overkill, in that they cause all manipulations of all access values to be deoptimized. A more refined approach is to concentrate attention on the specific access type identified as problematic.
First, if a careful analysis of uses of the pointer shows that there are no possible problematic references, then the warning can be suppressed by bracketing the instantiation of Unchecked_Conversion to turn the warning off:
    pragma Warnings (Off);
    function to_a2u is
      new Unchecked_Conversion (a1, a2);
    pragma Warnings (On);
Of course that approach is not appropriate for this particular example, since indeed there is a problematic reference. In this case we can take one of two other approaches.
The first possibility is to move the instantiation of unchecked conversion to the unit in which the type is declared. In this example, we would move the instantiation of Unchecked_Conversion from the body of package p2 to the spec of package p1. Now the warning disappears. That's because any use of the access type knows there is a suspicious unchecked conversion, and the strict aliasing optimization is automatically suppressed for the type.
If it is not practical to move the unchecked conversion to the same unit in which the destination access type is declared (perhaps because the source type is not visible in that unit), you may use pragma No_Strict_Aliasing for the type. This pragma must occur in the same declarative sequence as the declaration of the access type:
    type a2 is access int2;
    pragma No_Strict_Aliasing (a2);
Here again, the compiler now knows that the strict aliasing optimization should be suppressed for any reference to type a2 and the expected behavior is obtained.
Finally, note that although the compiler can generate warnings for simple cases of unchecked conversions, there are trickier and more indirect ways of creating type-incorrect aliases which the compiler cannot detect. Examples are the use of address overlays and unchecked conversions involving composite types containing access types as components. In such cases, no warnings are generated, but there can still be aliasing problems. One safe coding practice is to forbid the use of address clauses for type overlaying, and to allow unchecked conversion only for primitive types. This is not really a significant restriction since any possible desired effect can be achieved by unchecked conversion of access values.
7.2 Reducing the Size of Ada Executables with gnatelim
This section describes gnatelim, a tool which detects unused subprograms and helps the compiler to create a smaller executable for your program.
7.2.1 About gnatelim
7.2.2 Running gnatelim
7.2.3 Correcting the List of Eliminate Pragmas
7.2.4 Making Your Executables Smaller
7.2.5 Summary of the gnatelim Usage Cycle
7.2.1 About gnatelim
When a program shares a set of Ada packages with other programs, it may happen that this program uses only a fraction of the subprograms defined in these packages. The code created for these unused subprograms increases the size of the executable.
gnatelim tracks unused subprograms in an Ada program and outputs a list of GNAT-specific pragmas Eliminate marking all the subprograms that are declared but never called. By placing the list of Eliminate pragmas in the GNAT configuration file `gnat.adc' and recompiling your program, you may decrease the size of its executable, because the compiler will not generate the code for 'eliminated' subprograms. See GNAT Reference Manual for more information about this pragma.
gnatelim needs as its input data the name of the main subprogram and a bind file for a main subprogram.
To create a bind file for gnatelim, run gnatbind for the main subprogram. gnatelim can work with both Ada and C bind files; when both are present, it uses the Ada bind file. The following commands will build the program and create the bind file:
    $ gnatmake -c Main_Prog
    $ gnatbind main_prog
Note that gnatelim needs neither object nor ALI files.
7.2.2 Running gnatelim
gnatelim has the following command-line interface:
    $ gnatelim [options] name
name should be a name of a source file that contains the main subprogram of a program (partition).
gnatelim has the following switches:
- By default, gnatelim outputs to the standard error stream the number of program units left to be processed. This option turns this trace off.
- gnatelim version information is printed as Ada comments to the standard output stream. Also, in addition to the number of program units left, gnatelim will output the name of the current unit being processed.
- ... gnatmake.
- Instructs gnatelim not to look for sources in the current directory.
- Instructs gnatelim to use a specific gcc compiler instead of the one available on the path.
- Instructs gnatelim to use a specific gnatmake instead of the one available on the path.
gnatelim sends its output to the standard output stream, and all the tracing and debug information is sent to the standard error stream. In order to produce a proper GNAT configuration file `gnat.adc', redirection must be used:
    $ gnatelim main_prog.adb > gnat.adc
or
    $ gnatelim main_prog.adb >> gnat.adc
in order to append the gnatelim output to the existing contents of `gnat.adc'.
7.2.3 Correcting the List of Eliminate Pragmas
In some rare cases gnatelim may try to eliminate subprograms that are actually called in the program. In this case, the compiler will generate an error message of the form:
    file.adb:106:07: cannot call eliminated subprogram "My_Prog"
You will need to manually remove the wrong Eliminate pragmas from the `gnat.adc' file. You should recompile your program from scratch after that, because you need a consistent `gnat.adc' file during the entire compilation.
7.2.4 Making Your Executables Smaller
In order to get a smaller executable for your program you now have to recompile the program completely with the new `gnat.adc' file created by gnatelim in your current directory:
    $ gnatmake -f main_prog
(Use the `-f' option for gnatmake to recompile everything with the set of pragmas Eliminate that you have obtained with gnatelim.)
Be aware that the set of Eliminate pragmas is specific to each program. It is not recommended to merge sets of Eliminate pragmas created for different programs in one `gnat.adc' file.
7.2.5 Summary of the gnatelim Usage Cycle
Here is a quick summary of the steps to be taken in order to reduce the size of your executables with gnatelim. You may use other GNAT options to control the optimization level, to produce the debugging information, to set search path, etc.
1. Produce a bind file:
    $ gnatmake -c main_prog
    $ gnatbind main_prog
2. Produce a list of Eliminate pragmas:
    $ gnatelim main_prog >[>] gnat.adc
3. Recompile the application:
    $ gnatmake -f main_prog