Man Page inline.1




NAME

     inline - in-line procedure call expander


DESCRIPTION

     Assembly language call instructions are replaced by  a  copy
     of  their  corresponding  function  body  obtained  from the
     inline template (*.il) file.

     Inline template files have the suffix .il and are named on the
     compiler command line, for example:

          % CC foo.il hello.c

     Inlining is done by the code generator (cg).


USAGE

     Each inline file contains one or more labeled assembly
     language templates of the form:

          inline-directive
          instructions
          ...
          .end

     where the instructions constitute an  in-line  expansion  of
     the  named routine.  An inline-directive is a command of the
     form:

          .inline   identifier, argsize

     This declares a block of code for the routine named by iden-
     tifier,  with  argsize  as  the  total size of the routine's
     arguments,  in  bytes.   Calls  to  the  named  routine  are
     replaced by the code in the in-line template.

     NOTE:
     The value of argsize is ignored but the argument  should  be
     included  for compatibility with compiler versions predating
     the Sun WorkShop[tm] 5.0 compilers.

     Multiple templates for the same routine are permitted; only
     the first matching template is used, and later ones are
     ignored.  Duplicate templates may be placed in order of
     decreasing performance of the corresponding hardware; thus
     the most efficient usable version will be selected.
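
     The first-match rule can be illustrated with a small model.
     The following Python sketch is not part of the compiler; it
     assumes the template layout described above and matches on
     the routine name only.  How the compiler decides that a
     template is "usable" on given hardware is not modeled.  The
     names parse_templates, select_template and demo_op, and the
     placeholder body lines, are hypothetical.

```python
def parse_templates(text):
    """Return a list of (identifier, argsize, body_lines) in file order."""
    templates = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(".inline"):
            # Directive form: .inline identifier, argsize
            ident, _, argsize = line[len(".inline"):].partition(",")
            current = (ident.strip(), argsize.strip(), [])
        elif line == ".end" and current is not None:
            templates.append(current)
            current = None
        elif current is not None and line:
            current[2].append(line)
    return templates


def select_template(templates, name):
    """Return the first template for `name`; later duplicates are ignored."""
    for template in templates:
        if template[0] == name:
            return template
    return None


# Placeholder bodies, not real instructions (hypothetical example data).
SAMPLE = """\
.inline demo_op, 0
    fast_variant
.end
.inline demo_op, 0
    portable_variant
.end
"""

chosen = select_template(parse_templates(SAMPLE), "demo_op")
print(chosen[2])
```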

  Coding Conventions for all Sun Systems
     Inline  templates  should  be  coded  as  expansions  of  C-
     compatible  procedure  calls,  with  the difference that the
     return address cannot be depended upon to be in the expected
     place, since no call instruction will have been executed.

     Inline templates must  conform  to  standard  Sun  parameter
     passing  and  register usage conventions, as detailed below.
     They must not call routines that violate these conventions;
     for example, assembly language routines such as setjmp(3C)
     may cause problems.

     Registers other than the ones mentioned below  must  not  be
     used or set.

     Branch instructions in an in-line template may only transfer
     to  numeric  labels  (1f,  2b, and so on) defined within the
     in-line template.  No other control transfers are allowed.

     Templates do not need ret or retl instructions,  and  should
     not include them.
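
     Because inline performs no checking of its own, a small
     external check can catch some mistakes early.  The following
     Python sketch is a hypothetical helper, not a Sun tool.  It
     covers only two of the rules above: label definitions must be
     numeric, and ret or retl must not appear.  It assumes one
     instruction per line with no comments, and it does not verify
     branch targets.

```python
import re

# A label definition at the start of a line, such as "1:" or "loop:".
LABEL_DEF = re.compile(r"^([A-Za-z_.$0-9]+):")


def lint_template(body_lines):
    """Return a list of (line_number, message) for two simple rule checks."""
    problems = []
    for number, raw in enumerate(body_lines, start=1):
        line = raw.strip()
        if not line:
            continue
        match = LABEL_DEF.match(line)
        if match:
            if not match.group(1).isdigit():
                problems.append((number, "non-numeric label: " + match.group(1)))
            # Continue with whatever follows the label on the same line.
            line = line[match.end():].strip()
            if not line:
                continue
        mnemonic = line.split()[0]
        if mnemonic in ("ret", "retl"):
            problems.append((number, "return instruction: " + mnemonic))
    return problems


print(lint_template(["1:", "nop", "retl"]))
```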

     Only opcodes and addressing modes generated by Sun compilers
     are  guaranteed  to  work.  Binary encodings of instructions
     are not supported.

  Coding Conventions for SPARC Systems
     Arguments are  passed  in  registers  %o0-%o5,  followed  by
     memory  locations starting at [%sp+0x5c].  %sp is guaranteed
     to be 64-bit aligned.  The contents of  %o7  are  undefined,
     since no call instruction will have been executed.
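
     As an illustration of the argument layout, the following
     Python sketch maps a zero-based argument word index to its
     location.  It assumes 32-bit argument words, and it assumes
     that memory arguments after the first follow at 4-byte
     intervals; the text above states only the starting address.
     The function name is hypothetical.

```python
REGISTER_WORDS = 6         # %o0 through %o5
FIRST_STACK_OFFSET = 0x5C  # from the text: memory arguments start at [%sp+0x5c]
WORD_BYTES = 4             # assumption: 32-bit argument words


def sparc_arg_location(word_index):
    """Return the register or memory operand for a zero-based argument word."""
    if word_index < 0:
        raise ValueError("word_index must be non-negative")
    if word_index < REGISTER_WORDS:
        return "%o" + str(word_index)
    # Assumption: later words are laid out contiguously after the first one.
    offset = FIRST_STACK_OFFSET + WORD_BYTES * (word_index - REGISTER_WORDS)
    return "[%sp+" + hex(offset) + "]"


for index in range(8):
    print(index, sparc_arg_location(index))
```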

     Results are returned in %o0 or %f0/%f1.

     Registers %o0-%o5 and %f0-%f31 may be used as temporaries.

     Integral and single-precision floating-point  arguments  are
     32-bit aligned.

     Double-precision floating-point arguments are guaranteed  to
     be 64-bit aligned if their offsets are multiples of 8.

     Each control-transfer instruction (branches and calls)  must
     be immediately followed by a nop.

     Call instructions must include  an  extra  (final)  argument
     which indicates the number of registers used to pass parame-
     ters to the called routine.

     Note that for SPARC systems, the  instruction  following  an
     expanded 'call' is deleted.

  Coding Conventions for x86 Systems
     Arguments are passed on the stack. Since no call instruction
     was issued, the first argument is at (%esp), the second
     argument is at 4(%esp), and so on. Integer results of 32 bits
     or less are returned in %eax; 64-bit integer results are
     returned in %edx:%eax. Floating-point results are returned
     in %st(0).
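
     The stack offsets can be worked out mechanically.  The
     following Python sketch assumes that every argument occupies
     a whole number of 4-byte stack slots, which the text above
     implies for the first two arguments but does not state in
     general.  The function names are hypothetical.

```python
def x86_arg_offsets(arg_sizes):
    """Return the byte offset from %esp of each argument, in order."""
    offsets = []
    cursor = 0
    for size in arg_sizes:
        offsets.append(cursor)
        # Assumption: each argument is padded to a multiple of 4 bytes.
        cursor += (size + 3) // 4 * 4
    return offsets


def operand(offset):
    """Format an offset as an assembler operand relative to %esp."""
    return "(%esp)" if offset == 0 else str(offset) + "(%esp)"


# An int, a double and another int (sizes in bytes).
print([operand(o) for o in x86_arg_offsets([4, 8, 4])])
```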

     The code may use registers %eax, %ecx and %edx.  The  values
     in any other registers must be preserved. The floating point
     stack will be empty at the start  of  the  inline  expansion
     template,  and must be empty (except for a returned floating
     point value) at the end.


SPECIAL x86 NOTE

     Programs compiled with -xarch={sse|sse2} to run on Solaris
     x86 SSE/SSE2 Pentium 4-compatible platforms must be run only
     on platforms that are SSE/SSE2-enabled.  Running such
     programs on platforms that are not SSE/SSE2-enabled could
     result in segmentation faults or incorrect results occurring
     without any explicit warning messages. Patches to the OS and
     compilers to prevent execution of SSE/SSE2-compiled binaries
     on platforms that are not SSE/SSE2-enabled might be made
     available at a later date.

     OS releases starting with Solaris 9 update 6  are  SSE/SSE2-
     enabled  on Pentium 4-compatible platforms. Earlier versions
     of Solaris OS are not SSE/SSE2-enabled.

     This warning also extends to programs that employ .il inline
     assembly language functions or __asm() assembler code that
     utilize SSE/SSE2 instructions.

     If you compile and link in separate steps, always link using
     the  compiler  and with -xarch={sse|sse2} to ensure that the
     correct startup routine is linked.

  Coding Conventions for AMD-64 Platforms
     Arguments are passed according to their classification.  The
     classes are integer, sse, and memory arguments.

     Arguments of the (signed and unsigned) types _Bool, char,
     short, int, long, and long long, and of pointer types, are
     integer arguments.  Arguments of aggregate types (struct,
     union, array) that are no larger than 16 bytes and contain
     aligned members of types _Bool, char, short, int, long, long
     long, and pointers are also integer arguments.

     Arguments of types  float  and  double  are  sse  arguments.
     Arguments  of  aggregate types of size less than or equal to
     16 bytes and that contain aligned members of types float and
     double are also sse.

     Arguments of type long double, and of aggregate types that
     are larger than 16 bytes or have unaligned members, are
     memory arguments.

     Integer arguments are passed in integer registers in the
     following order: %rdi, %rsi, %rdx, %rcx, %r8 and %r9. One
     integer argument of aggregate type can occupy up to 2 integer
     registers. If there are more than 6 integer arguments, the
     7th and subsequent integer arguments are treated as memory
     arguments.

     Sse arguments are passed in sse registers in order from
     %xmm0 to %xmm7. One sse argument of aggregate type can occupy
     up to 2 sse registers; each sse register holds up to 8 bytes
     of the argument.  For example, an argument of type double
     complex is passed in 2 consecutive sse registers, and an
     argument of type float complex is passed in 1 sse register.
     If there are more than 8 sse arguments, the 9th and
     subsequent sse arguments are treated as memory arguments.

     Integer and sse arguments are numbered independently.
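
     The independent numbering can be sketched as follows.  This
     Python model handles scalar arguments only, so each argument
     takes exactly one register; aggregates that occupy two
     registers are not modeled.  The function name and the class
     strings are hypothetical.

```python
INTEGER_REGS = ["%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"]
SSE_REGS = ["%xmm" + str(i) for i in range(8)]


def assign_scalar_args(classes):
    """Map each argument class ('integer', 'sse', 'memory') to a location."""
    next_int = 0
    next_sse = 0
    locations = []
    for cls in classes:
        if cls == "integer" and next_int < len(INTEGER_REGS):
            locations.append(INTEGER_REGS[next_int])
            next_int += 1
        elif cls == "sse" and next_sse < len(SSE_REGS):
            locations.append(SSE_REGS[next_sse])
            next_sse += 1
        elif cls in ("integer", "sse", "memory"):
            # Register classes that have run out spill to the stack.
            locations.append("memory")
        else:
            raise ValueError("unknown class: " + cls)
    return locations


# For example, arguments of type long, double, pointer:
print(assign_scalar_args(["integer", "sse", "integer"]))
```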

     Memory arguments are passed on the stack in right-to-left
     order as they appear in the function's argument list. Each
     argument on the stack is aligned according to its size: on
     an 8-byte boundary if its size is less than or equal to 8
     bytes, and on a 16-byte boundary otherwise. At the start of
     the inline expansion template, the stack is aligned on a
     16-byte boundary.

     Since no call instruction was issued, the first memory argu-
     ment  is  at (%rsp), the second argument is at 8(%rsp) or at
     16(%rsp) depending on the first memory argument size and the
     second memory argument alignment, etc.
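
     The offsets of memory arguments follow from the alignment
     rule.  The following Python sketch assumes that each argument
     is padded to a multiple of 8 bytes on the stack, which is
     consistent with the offsets given above but is not stated
     explicitly.  The function names are hypothetical.

```python
def align_up(value, boundary):
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary


def memory_arg_offsets(sizes):
    """Return the byte offset from %rsp of each memory argument, in order."""
    offsets = []
    cursor = 0
    for size in sizes:
        # Alignment rule from the text: 8 if size <= 8, otherwise 16.
        boundary = 8 if size <= 8 else 16
        offset = align_up(cursor, boundary)
        offsets.append(offset)
        # Assumption: each argument occupies a multiple of 8 bytes.
        cursor = offset + align_up(size, 8)
    return offsets


# An 8-byte argument followed by a 16-byte argument (sizes in bytes).
print(memory_arg_offsets([8, 16]))
```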

     Return values are classified in the same way as arguments.

     Integer results of 8 bytes or less  are  returned  in  %rax,
     integer results of 9 to 16 bytes are returned in %rdx:%rax.

     Sse results are likewise returned according to their size,
     in %xmm0 or in %xmm1:%xmm0.

     Results of type long double are returned in %st(0).

     If the return value is of type long double complex, the real
     part of the value is returned in %st(0) and the imaginary
     part in %st(1).

     For memory results the caller provides space for the  return
     value  and  passes the address of this storage in %rdi as if
     it were the first argument to the function.  In effect, this
     address  becomes  a  hidden  first argument.  On return %rax
     will contain the address that has  been  passed  in  by  the
     caller in %rdi.
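
     The return rules above can be summarized in a lookup.  The
     following Python sketch assumes that sse results use the same
     8-byte threshold as integer results; the text says only that
     the register choice depends on size.  The function name and
     the class strings are hypothetical, and long double complex
     is not modeled.

```python
def return_location(cls, size):
    """Return where a result of the given class and byte size is found."""
    if cls == "integer":
        if size <= 8:
            return "%rax"
        if size <= 16:
            return "%rdx:%rax"
    elif cls == "sse":
        # Assumption: same 8-byte threshold as for integer results.
        if size <= 8:
            return "%xmm0"
        if size <= 16:
            return "%xmm1:%xmm0"
    elif cls == "long double":
        return "%st(0)"
    elif cls == "memory":
        return "caller storage; address passed in %rdi, returned in %rax"
    raise ValueError("unsupported class or size: " + cls + ", " + str(size))


print(return_location("integer", 4))
print(return_location("sse", 16))
```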

     The code must not change register %rbp.  The floating-point
     stack will be empty at the start of the inline expansion
     template, and must be empty (except for a returned
     floating-point value) at the end.


EXAMPLES

     Please review libm.il or vis.il for examples. You can find a
     version  of  these  libraries  that is specific to each sup-
     ported architecture under the compiler's lib/ directory.


WARNING

     inline does not check for violations of the  coding  conven-
     tions described above.


SEE ALSO

     "Techniques for Optimizing  Applications:  High  Performance
     Computing"  by  Rajat P. Garg and Ilya Sharapov uses Fortran
     to provide a useful explanation  of  inline  templates.  See
     Chapter 8.

     "The SPARC Architecture Manual Version 9" provided by  SPARC
     International Inc. at http://www.sparc.com/resource.htm. See
     appendix G.