spike(1)

NAME
  spike - Performs code optimization after linking a program

SYNOPSIS
  spike binary_file [options...]

OPTIONS
  -align_threshold n
      Determines the frequency cutoff point for the alignment of profile-
      based basic blocks. Valid threshold values are floating-point numbers
      between 0 and 1, inclusive. A higher number means more basic blocks
      will be quadword aligned. [Default: 0.95]

  -arch option
      Specifies the version of the Alpha architecture for which to generate
      instructions. See cc(1) for information about the possible values of
      option and for a comparison of -arch and -tune. The default option is
      ev4. Spike will accept binaries that contain instructions that require
      an architectural extension not present in the processor specified by
      -arch. However, Spike will assume that the instructions are guarded by
      code that prevents their execution on some systems and will restrict
      some optimizations. For best results, use an appropriate -arch option.

  -D n
      Specifies new data segment starting address, where n is a 64-bit
      hexadecimal number without the leading "0x".

  -dcpi_prefetch
      Enables DCPI profile-based prefetching. This option works only with a
      feedback profile.

  -dcpi_prefetch_latency n
      Controls DCPI profile-based prefetching. The number (n) describes the
      minimum latency (in cycles) that a load must have to become a candidate
      for prefetching. [Default: 50]

  -dcpi_prefetch_threshold float-arg
      Controls DCPI profile-based prefetching. Valid threshold values are
      floating-point numbers between 0 and 1, inclusive. The number is used
      to determine the frequency cutoff for loads to be prefetched; a higher
      number means more loads will be prefetched. [Default: 0.75]

  -feedback file
      Causes Spike to use the feedback database stored in file, where file is
      the name of the input executable. This database is created by first
      compiling the program with the -feedback option (for example, cc
      -feedback prog) and then instrumenting and running the program with the
      pixie -update or prof -pixie -update command (see cc(1), pixie(1), and
      prof(1)).

  -fb file
      Causes Spike to use file.Addrs (basic block addresses file) and
      file.Counts (basic block counts file) for profile-based optimization.
      These files are produced by the pixie tool (see pixie(1) and prof(1)).

  -help
      Prints a short help screen.

  -kernel
      Use this option when applying Spike to the UNIX kernel (vmunix). Spike
      can be applied only to V5.1 or later kernels.

  -noaggressiveAlign
      Reduces the number of padding nops inserted into the code to align
      instructions. The alignment usually makes the code run faster, but
      makes the code larger, which can cause more instruction cache misses.

  -nochain
      Disables basic block chaining, which arranges code so that the fall
      through path is the commonly taken path.

  -nodtk
      Invokes the original Spike and not the DTK version.

  -noporder
      Disables procedure ordering.

  -nosplit
      Disables code layout optimization that splits procedures into multiple
      parts.

  -O, -O0, -O1, -O2
      Specify Spike's optimization level. These flags are provided for
      compatibility with other compilation tools and currently have no
      effect.

  -o output_file
      Names the optimized binary output file. The default file name is a.out.

  -obsolete_linkerdefs
      Specifies that obsolete linker-defined symbols are to be ignored. (See
      RESTRICTIONS.)

  -optimize_threshold n
      Determines the frequency cutoff point for profile-based optimizations.
      Valid threshold values are floating-point numbers between 0 and 1,
      inclusive.  A higher number means more routines will be optimized.
      [Default: 0.99]

  -split_threshold n
      Determines the frequency cutoff point for profile-based routine
      splitting. Valid threshold values are floating-point numbers between 0
      and 1, inclusive. A higher number means more code is considered hot and
      less code is considered cold. [Default: 0.95]

  -stride_prefetch
      Enables stride prefetching based on a profile of data-address strides
      collected by using the Pixie tool. This optimization mainly targets
      programs in which many data cache misses occur inside loops.

  -symbols_live
      Keeps unreachable routines from being deleted by Spike if they have an
      entry in the symbol table.

  -T n
      Specifies new text segment starting address, where n is a 64-bit
      hexadecimal number without the leading "0x".

  -tune option
      Instructs the optimizer to tune the application for a specific version
      of the Alpha architecture. See cc(1) for information about the possible
      values of option and for a comparison of -tune and -arch. The default
      option is ev6.

  -V  Displays the version number of Spike.

  -verbose
      Enables extra warning messages.
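
  The value constraints above (thresholds between 0 and 1, inclusive, and -D
  and -T addresses written as hexadecimal digits without the leading "0x")
  can be checked before spike is invoked. The following Bourne shell
  functions are a minimal sketch of such a check. The names
  is_valid_threshold and spike_hex_addr are hypothetical and are not part of
  Spike, and the 16-digit limit is this sketch's reading of "64-bit".

```shell
# Hypothetical pre-flight checks for spike option values.
# These helpers are illustrative only; they are not shipped with Spike.

# Succeeds if $1 is a plain decimal number between 0 and 1, inclusive.
is_valid_threshold() {
    case $1 in
        ''|*[!0-9.]*|*.*.*|.) return 1 ;;   # empty, stray characters, or malformed
    esac
    awk -v t="$1" 'BEGIN { exit !(t + 0 >= 0 && t + 0 <= 1) }'
}

# Prints $1 in the form -D and -T expect: hex digits only, no leading "0x".
# Fails unless the value is 1 to 16 hexadecimal digits.
spike_hex_addr() {
    case $1 in
        0x*|0X*) h=${1#??} ;;               # drop the two-character prefix
        *)       h=$1 ;;
    esac
    case $h in
        ''|*[!0-9a-fA-F]*) return 1 ;;
    esac
    [ "${#h}" -le 16 ] || return 1
    printf '%s\n' "$h"
}
```

  For example, a script might run is_valid_threshold "$t" before passing
  -split_threshold "$t" to spike, and stop with its own error message if the
  check fails.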

OPERANDS
  binary_file
      Name of the binary file to which Spike is to be applied.

DESCRIPTION
  Spike is a tool for performing code optimization after linking. It is a
  replacement for om and does similar optimizations. Because it can operate
  on an entire program, Spike is able to do optimizations that cannot be done
  by the compiler.

  Some of the optimizations that Spike performs are code layout, deleting
  unreachable code, and optimization of address computations. Spike is most
  effective when it uses profile information to guide optimization.

  Spike can process binaries linked on Tru64 UNIX (formerly Digital UNIX)
  Version 4.0 or later systems. Binaries that are linked on Version 5.1 or
  later systems contain information that allows Spike to do additional
  optimization.

  You can use Spike in two ways:

    ·  By applying the spike command to a binary file after compilation.

    ·  As part of the compilation process, by specifying the -spike option
       with the cc command (or the cxx, f77, or f90 command, if the
       associated compiler is installed).

  The -spike option is more convenient when you are not using profile
  information (Example 2), or you are using profile information in the
  compiler, too (Example 3). The spike command is more convenient if you do
  not want to relink the executable (Example 1) or you are using profile
  information after compilation (Examples 4 and 5).

  All spike command options can be passed directly to the cc command's -spike
  option by using the cc command's -WS option. Example 6 shows the syntax.
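
  The -WS argument shown in Example 6 is a single comma-separated word. As a
  sketch, a Bourne shell function can assemble that word from a list of spike
  options. The function name ws_arg is hypothetical and is not part of cc or
  Spike.

```shell
# Hypothetical helper: joins spike options into one -WS argument for cc.
# Prints "-WS,opt1,opt2,..."; fails if no options are given.
ws_arg() {
    [ "$#" -gt 0 ] || return 1
    out=-WS
    for opt in "$@"; do
        out="$out,$opt"
    done
    printf '%s\n' "$out"
}
```

  The result can then be placed on the cc command line, for example as
  "$(ws_arg -noaggressiveAlign)" after the -spike option.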

RESTRICTIONS
  Spike cannot process the following images:

    ·  Images that have been stripped.

    ·  Images that contain certain obsolete linker-defined symbols and
       structures such as RPDR tables (see Section 2.3.7, Special Symbols, of
       the Object File/Symbol Table Format Specification). This can be
       overruled by using the -obsolete_linkerdefs option, but the resulting
       binary files may be incorrect, so use with caution.

    ·  Images that modify the text section at run time.

  Using cord, atom, pixie, hiprof, or third on an image that has been
  processed with Spike is unsupported.

NOTES
  Spike tries to update the symbol table in the binary so that the optimized
  binary can be debugged. As with other compiler optimizations, the debugger
  may sometimes be unable to report the current location in the program
  correctly or to display the values of variables. If
  Spike divides a procedure into multiple disjoint parts, the main body will
  keep the original procedure name, but the other parts will have names that
  are the original name with _cold_n (where n is a unique number) appended to
  the end.
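
  When reading debugger or profiler output from an optimized binary, it can
  help to map a split part back to its procedure. The following Bourne shell
  function is a minimal sketch that strips a trailing _cold_n suffix. The
  name spike_original_name is hypothetical, and the sketch assumes that no
  procedure in the original source already ends in _cold_ followed by digits.

```shell
# Hypothetical helper: maps a name such as foo_cold_3 back to foo.
# Names without a trailing _cold_<digits> suffix are printed unchanged.
spike_original_name() {
    printf '%s\n' "$1" | sed 's/_cold_[0-9][0-9]*$//'
}
```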

EXAMPLES
   1.  In the following example, Spike is applied to the binary my_prog,
       producing the optimized output file prog1.opt:
	    % spike my_prog -o prog1.opt

   2.  In the following example, Spike is applied during compilation with the
       cc command's -spike option:
	    % cc -c file1.c
	    % cc -spike -o prog3 file1.o

       The first command line creates the object file file1.o. The second
       command line links file1.o into an executable image and uses Spike to
       optimize the executable image.

   3.  The following example shows how to optimize a program, prog, by first
       compiling it with the -feedback option, then merging profiling
       statistics from two instrumented runs of the program, and then
       compiling it with the -spike and -feedback options so that the
       feedback information stored in the executable image is used by the
       compiler and Spike:
	    % cc -feedback prog -o prog *.c
	    % pixie -pids prog
	    % prog.pixie
	    (input set 1)
	    % prog.pixie
	    (input set 2)
	    % prof -pixie -update prog prog.Counts.*
	    % cc -spike -feedback prog -o prog *.c

       The first compilation produces an augmented executable image that will
       later accept feedback information.

       The pixie command creates an instrumented program (prog.pixie), which
       is then run twice. The -pids option adds the process ID of each test
       run to the name of the profiling data file produced -- for example,
       prog.Counts.371 and prog.Counts.422.

       The prof -pixie command merges the two data files.  The -update option
       updates the executable image, prog, with the combined information.

       The program is compiled with the -spike and -feedback options so the
       feedback information stored in the executable image is used by the
       compiler and Spike.

   4.  The following example shows how to optimize a program, prog, by first
       compiling it with the -feedback option, then merging profiling
       statistics from two instrumented runs of the program, and then
       applying the spike -feedback command to use the feedback information
       stored in the executable image:
	    % cc -feedback prog -o prog *.c
	    % pixie -pids prog
	    % prog.pixie
	    (input set 1)
	    % prog.pixie
	    (input set 2)
	    % prof -pixie -update prog prog.Counts.*
	    % spike prog -feedback prog -o prog.opt

       As in the previous example, the first compilation produces an
       augmented executable image. The instrumented program is run twice,
       producing a uniquely named data file each time. The prof -pixie
       -update command merges the two data files and updates the executable
       image with the combined information.

       The spike -feedback command uses the combined profiling information to
       produce the optimized output file prog.opt.

   5.  The following example shows how to optimize a program, prog, by
       merging profiling statistics from two instrumented runs of the
       program, then applying the spike -fb command to use the feedback
       information in the .Addrs and .Counts files:
	    % cc -o prog *.c
	    % pixie -pids prog
	    % prog.pixie
	    (input set 1)
	    % prog.pixie
	    (input set 2)
	    % prof -pixie -merge prog.Counts prog prog.Addrs prog.Counts.*
	    % spike prog -fb prog -o prog.opt

       The first compilation produces a normal executable image. As in the
       previous example, the instrumented program is run twice, producing a
       uniquely named data file each time.

       The prof -pixie -merge command merges the two data files into one
       combined prog.Counts file.

       The spike -fb command uses the information in prog.Addrs and
       prog.Counts to produce the optimized output file prog.opt.

       The method in Example 4 is preferred. You should use the method in
       Example 5 only if you cannot compile with the -feedback option, which
       uses feedback information stored in the executable image.

   6.  The following example shows the syntax for passing spike command
       options to the cc command's -spike option by using the cc command's
       -WS option:
	    % cc -spike -feedback prog -o prog *.c \
		 -WS,-split_threshold,.999,-noaggressiveAlign

   7.  The following example shows how to optimize a program, prog, using
       profiles obtained by using the DCPI profiler:
	    % mkdir db	 # create profile directory
	    % dcpid db	 # start DCPI daemon
	    % ./prog	  # run your program
	    % dcpiquit	 # stop DCPI daemon
	    % dcpi2bb -make_bbdb -counts -pm all -conf_low -db db prog
					       # store feedback information in the binary
	    % spike prog -feedback prog	  # spike your program utilizing feedback

   8.  The following example is similar to the previous one, but it contains
       three modifications for DCPI-based prefetching:
	    % mkdir db	  # create profile directory
	    % dcpid -vtrace /usr/lib/dcpi/vp-ldlatency.so db	  # start DCPI daemon
	    % ./prog	   # run your program
	    % dcpiquit	  # stop DCPI daemon
	    % dcpi2bb -make_bbdb -counts -pm all -conf_low -load_lat -db db prog
					    # store feedback information in the binary
	    % spike prog -dcpi_prefetch -feedback prog
					    # spike your program utilizing feedback

   9.  The following example demonstrates how to perform stride prefetching:

	a.  First, instrument an executable image (prog) for profiling
	    address strides with the following command:
		 % pixie -stats dstride prog	  # Step (a): instrumentation

	    This command creates an instrumented program (prog.pixie).

	b.  Second, run the instrumented program with the input intended for
	    training:
		 % prog.pixie input		   # Step (b): stride profiling

	    This command generates a profile of address strides, which is
	    stored into the file prog.Counts.

	c.  Finally, invoke Spike to insert stride prefetches:
		 % spike prog -fb prog -stride_prefetch -o prog.pf
				       # Step (c): prefetch insertion

	    The output (prog.pf) is a version of the program with stride
	    prefetches inserted.

       You can perform stride prefetching and other feedback-directed
       optimizations at the same time. To do so, first collect the feedback
       information for the other optimizations and store it in the executable
       image by using the following sequence:
	    % cc -feedback prog -o prog *.c
	    % pixie prog
	    % prog.pixie input
	    % prof -pixie -update prog prog.Counts

       Then repeat Steps (a) to (c) for stride prefetching, but enable both
       stride prefetching and the other feedback-directed optimizations in a
       single spike command:
	    % pixie -stats dstride prog	     # same as Step (a)
	    % prog.pixie input			# same as Step (b)
	    % spike prog -feedback prog -fb prog -stride_prefetch -o prog.opt_pf
			     # Step (c) plus other feedback-directed optimizations

       The output (prog.opt_pf) is a version of the program with both stride
       prefetching and other feedback-directed optimizations.
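
  The command sequence of Example 4 changes only in the program name and the
  number of instrumented runs. As a sketch, the following Bourne shell
  function prints that sequence for review; it does not run anything. The
  function name print_feedback_steps and its arguments are hypothetical, and
  the printed lines still need the real input for each run.

```shell
# Hypothetical helper: prints (does not run) the Example 4 steps for
# program $1 with $2 instrumented runs (default 2).
print_feedback_steps() {
    prog=$1
    runs=${2:-2}
    echo "cc -feedback $prog -o $prog *.c"
    echo "pixie -pids $prog"
    i=1
    while [ "$i" -le "$runs" ]; do
        echo "$prog.pixie    # run with input set $i"
        i=$((i + 1))
    done
    echo "prof -pixie -update $prog $prog.Counts.*"
    echo "spike $prog -feedback $prog -o $prog.opt"
}
```

  For example, print_feedback_steps prog 3 lists the steps for three
  instrumented runs of prog.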

RETURN STATUS
  Spike returns the following status values:

  0:	     Success
  Nonzero:   Error
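
  A build script can test this status before replacing a binary with its
  optimized version. The following Bourne shell function is a minimal sketch.
  The function name optimize_or_keep and the SPIKE variable (which lets a
  different command stand in for spike) are assumptions of this sketch, not
  features of Spike.

```shell
# SPIKE names the command to run. The override is a convenience of this
# sketch; it is not an environment variable that Spike itself reads.
SPIKE=${SPIKE:-spike}

# Usage: optimize_or_keep binary_file output_file [spike options...]
# Reports success, or reports the nonzero status and returns it.
optimize_or_keep() {
    in=$1
    out=$2
    shift 2
    if "$SPIKE" "$in" "$@" -o "$out"; then
        echo "optimized: $out"
    else
        status=$?
        echo "spike failed (status $status); keeping $in" >&2
        return "$status"
    fi
}
```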

SEE ALSO
  cc(1), pixie(1), prof(1)

  Programmer's Guide

  The spike web page at http://www.tru64unix.compaq.com/spike/

  The DCPI web page at http://www.tru64unix.compaq.com/dcpi/