eXtensible Programming Language Core (XPL-Core)
1. Introduction
In order to create your own XPL-based programming language, it is necessary to understand the primitives upon which it is built. This document contains the definition of the XPL-Core language, which is at the heart of any XPL-based language. The XPL-Core language is akin to the assembly language of a programming system that also includes higher-order languages. Its semantics are closely analogous to those of the LLVM assembly language.
In this specification, we use the Relax NG schema for XPL-Core directly, as it is both succinct and fairly easy to read. In general, Relax NG schemata define a set of patterns. Patterns specify what is allowed for element and attribute content, or portions of them. Patterns are created with the <define> element and referenced in other patterns with the <ref> element. The definition of a pattern consists of several kinds of elements that define selection of choices, optionality, repetition, elements and attributes. If all this is unfamiliar to you, or you feel you need a primer on Relax NG, it is recommended that you review the Relax NG Tutorial provided by OASIS.
2. Abstract Definitions
Before we begin the detailed specification, there are some fundamental Relax NG patterns that are used throughout the XPL-Core schema. Those patterns are defined here first, so we can get them out of the way and use them in subsequent pattern definitions.
Pattern Naming
The pattern definitions in the XPL-Core schema use a naming convention that helps determine what the pattern is used for. Each pattern name has a suffix that starts with a period. The suffixes have the following meanings:
Identifier
<define name="Identifier.type"> <choice> <ref name="Unprefixed_Identifier.type"/> <ref name="Prefixed_Identifier.type"/> </choice> </define> <define name="Unprefixed_Identifier.type"> <data type="string"> <param name="pattern">[^:]+</param> <param name="maxLength">1024</param> </data> </define> <define name="Prefixed_Identifier.type"> <data type="string"> <param name="pattern">[^:]+:[^:]+</param> <param name="maxLength">1024</param> </data> </define>
These three patterns define identifiers: Identifier, Unprefixed_Identifier, and Prefixed_Identifier. The Identifier pattern simply allows a choice between the prefixed and unprefixed identifier types. Any identifier is up to 1024 characters long and is modeled on the NCName XML data type; that is, it allows pretty much any character except the colon. A prefixed name permits a single colon between the prefix and the name portion of the identifier. Prefixed names are used to reference definitions in other modules. The prefix must match the prefix of an import statement in the current module.
Documentation
<define name="Documentation.pat"> <optional> <element name="doc"> <zeroOrMore> <choice> <text/> <ref name="Any.pat"/> </choice> </zeroOrMore> </element> </optional> </define> <define name="Any.pat"> <element> <anyName/> <zeroOrMore> <choice> <attribute> <anyName/> </attribute> <text/> <ref name="Any.pat"/> </choice> </zeroOrMore> </element> </define>
The Documentation.pat pattern is used as the basis for any XPL element definition that can include documentation. It simply defines an optional <doc> element that allows XPL programs to be documented directly. The <doc> element allows text content to be interleaved with any element. The Any.pat pattern defines an arbitrary element with arbitrary content. However, the only acceptable content is that of the XHTML Text Module, which includes the common formatting elements that XHTML permits within the <body> element.
This pattern does not formally restrict the content to XHTML because the overhead of specifying those patterns in the XPL-Core schema is too high for content that has a net-zero effect on the program being compiled. It is up to the author to use the correct element combinations per the XHTML standard. Documentation generators may apply the XHTML schema to the content of the <doc> elements when documentation is being generated.
Names And Types
<define name="Named_Element.pat"> <ref name="Documentation.pat"/> <attribute name="name"> <ref name="Identifier.type"/> </attribute> </define> <define name="Typed_Element.pat"> <ref name="Documentation.pat"/> <attribute name="type"> <ref name="Identifier.type"/> </attribute> </define> <define name="Named_Typed_Element.pat"> <ref name="Documentation.pat"/> <attribute name="name"> <ref name="Identifier.type"/> </attribute> <attribute name="type"> <ref name="Identifier.type"/> </attribute> </define>
These patterns define element types that include the attributes name, type, or both. The name attribute is used in XPL-Core definitions that require a unique name. The type attribute is used in elements that refer to a named type. The content of both of these attributes is Identifier.type.
Boolean
<define name="Boolean.type"> <choice> <data type="boolean"/> <value>TRUE</value> <value>True</value> <value>yes</value> <value>Yes</value> <value>YES</value> <value>1</value> <value>FALSE</value> <value>False</value> <value>no</value> <value>No</value> <value>NO</value> <value>0</value> </choice> </define>
The Boolean.type pattern defines a data type that results in a true or false value. There are several ways to express a boolean to aid readability.
Linkage
<define name="Linkage.type"> <choice> <value>appending</value> <value>external</value> <value>internal</value> <value>linkonce</value> <value>weak</value> </choice> </define>
Both variables and functions participate in linking (binding of modules together to form a complete program).
Together they are known as global values because their values (an address in the program) are global to the program. We'll describe the linkage types here so it doesn't have to be repeated in the definitions of variables and functions to follow. Linkage of a global value is specified with the linkage attribute. If no linkage attribute is provided, the linkage defaults to external. The following linkage types are available:
Value
<define name="Value.pat"> <choice> <ref name="Binary_Operators.pat"/> <ref name="Bitwise_Operators.pat"/> <ref name="Constant.pat"/> <ref name="Invoke.elem"/> <ref name="Get.elem"/> <ref name="Alloca.elem"/> <ref name="Malloc.elem"/> <ref name="Index.elem"/> <ref name="Cast.elem"/> <ref name="Select.elem"/> <ref name="Call.elem"/> <ref name="VA_Arg.elem"/> <ref name="VA_Next.elem"/> </choice> </define>
The Value.pat pattern is used anywhere a computable value is required as an operand to an instruction. It consists of all of the instructions that return a value as well as literal constants. Note that the type of the value is not factored into the schema at all. A syntactically valid program may still be incorrect, because the types of the operands to most instructions must agree.
Constant
<define name="Constant.pat"> <choice> <ref name="Reference.elem"/> <ref name="Literal.pat"/> <ref name="Cast.elem"/> <ref name="Index.elem"/> <ref name="Binary_Operators.pat"/> <ref name="Bitwise_Operators.pat"/> </choice> </define>
The Constant.pat pattern is used anywhere a constant value is required. It consists of literal constants, references to variables and functions, and a few of the instructions that can be used to derive constant values. These will all be defined later in this document, but it is important to recognize at this point that some patterns require constant rather than variable values and that this pattern defines what a valid constant is.
Extensions
Since XPL is all about extending itself to do additional things, it is natural that the notion of an extension be reflected in the schema. However, there is no one pattern that defines extensions in XPL-Core. Instead there is an idiom. This is required because of the way Relax NG works. To specify an extension in Relax NG, we use the combine attribute. This attribute may have the values choice or interleave and identifies a pattern that may be re-defined multiple times.
The combine attribute specifies the way in which these multiple pattern definitions are combined to arrive at a final definition for the pattern. In this way, we can allow re-definition or alternate definition of a pattern. In XPL-Core, any pattern defined with the combine attribute having a value of choice and empty content is an extension pattern. The names of such patterns have the .ext suffix to further identify them. For example:
<define name="Example.ext" combine="choice"> <empty/> </define>
Note that this is not part of the XPL-Core schema but simply an example of the extension idiom.
3. High Level Structure
XPL is XML
If you hadn't noticed already, all XPL programs are proper XML documents. It is XML that gives XPL its extensibility features. As with any XML document, you need to declare it as XML. The following content must come first in any XPL program:
<?xml version="1.0" encoding="UTF-8"?>
XPL should be encoded in either UTF-8 or UTF-16.
Compilation Units
<define name="XPL.elem"> <element name="XPL"> <ref name="Documentation.pat"/> <oneOrMore> <interleave> <ref name="Module.elem"/> <ref name="XPL.ext"/> </interleave> </oneOrMore> </element> </define> <define name="XPL.ext" combine="choice"><empty/></define>
The content of the <XPL> element is called a compilation unit. It defines what the compiler will compile into code. A compilation unit is simply a set of interleaved modules and compilation unit extensions. The start of a compilation unit is defined by an <XPL> element that declares the language of the compilation unit. For the XPL-Core language, we use the "xplcore.rng" schema and its corresponding namespace in the usual XML way:
<XPL xmlns="http://x-p-s.org/XPS/xps/schemas/xplcore.rng">
Other languages may use other namespaces and top-level element names.
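As an illustration, a minimal compilation unit might look like the sketch below. Only the XML declaration, the <XPL> element and its namespace come from the text above. The module name and the pubid value are hypothetical, and the <Module> element itself is defined in the Modules section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the module name and pubid are hypothetical placeholders. -->
<XPL xmlns="http://x-p-s.org/XPS/xps/schemas/xplcore.rng">
  <doc>A compilation unit that holds a single, empty module.</doc>
  <Module name="HelloWorld" pubid="file:///xps/examples/HelloWorld.xpl">
    <!-- imports, types, variables and functions would go here -->
  </Module>
</XPL>
```

The optional <doc> element comes first because Documentation.pat precedes the module content in the XPL.elem pattern.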
Modules
<define name="Module.elem"> <element name="Module"> <ref name="Module.pat"/> </element> </define> <define name="Module.pat"> <ref name="Named_Element.pat"/> <attribute name="pubid"><data type="anyURI"/></attribute> <optional> <attribute name="prefix"><data type="NCName"/></attribute> </optional> <zeroOrMore> <ref name="Import.elem"/> </zeroOrMore> <zeroOrMore> <choice> <ref name="Type.pat"/> <ref name="Variable.elem"/> <ref name="Function.elem"/> <ref name="Module.ext"/> </choice> </zeroOrMore> </define> <define name="Import.elem"> <element name="import"> <ref name="Documentation.pat"/> <attribute name="prefix"> <ref name="Identifier.type"/> </attribute> <attribute name="pubid"> <data type="anyURI"/> </attribute> </element> </define> <define name="Module.ext" combine="choice"><empty/></define>
The Module element defines a module of definitions that are grouped together under one name. Each Module has a pubid attribute of type anyURI. This attribute is the public identifier for the module and it must uniquely identify the module. The URI also provides the means by which the module may be retrieved if it is not available locally.
Modules may import the definitions of other modules using an import element. If a module imports the definitions of another module, the import elements must occur at the beginning of the module. For example, suppose module "InputOutput" defines I/O primitives that the "HelloWorld" module needs to display its message. You would import those definitions with:
<import pubid="file:///xps/io/InputOutput.xpl" prefix="io"/>
In this example, we have declared that elements prefixed with io in the current module refer to definitions found in the InputOutput module, which the compiler can find at file:///xps/io/InputOutput.xpl. As many import statements as necessary can be used.
The content of a module consists of only four types of definitions:
Each of these is defined in the following sections.
Variables
<define name="Variable.elem"> <element name="Var"> <ref name="Named_Typed_Element.pat"/> <optional> <choice> <element name="const"> <ref name="Constant.pat"/> </element> <element name="init"> <ref name="Constant.pat"/> </element> <element name="zero"> <empty/> </element> </choice> </optional> <optional> <attribute name="linkage"> <ref name="Linkage.type"/> </attribute> </optional> <ref name="Variable.ext"/> </element> </define> <define name="Variable.ext" combine="choice"><empty/></define>
Variables at the module level are defined with the <Var> element. They are global and therefore available for linkage. Variables have a name and type, specified with attributes, and an initializer specified with one of three elements:
Functions<define name="Function.elem"> <element name="Function"> <ref name="Named_Typed_Element.pat"/> <optional> <attribute name="symbol"><ref name="Identifier.type"/></attribute> </optional> <optional> <attribute name="linkage"><ref name="Linkage.type"/></attribute> </optional> <zeroOrMore> <element name="var"> <ref name="Named_Typed_Element.pat"/> </element> </zeroOrMore> <zeroOrMore> <ref name="Block.elem"/> </zeroOrMore> <ref name="Function.ext"/> </element> </define> <define name="Function.ext" combine="choice"><empty/></define> Functions are defined with the <Function> element. They are global and therefore available for linkage. Functions have a name and a signature (type) that are specified with the name and type attributes. Two element types are permitted in the content of a function: <var> and <block>. These define the automatic (stack) variables the function uses and the basic blocks of instructions in the function, respectively. Function definitions need to store partial results in variables that are automatically allocated when the function is active. Automatic variables do not participate in linkage and have a life span no longer than the lifespan of the function on the dynamic stack. The initial portion of a function definition may define as many automatic variables as needed by the function using the <var> element which requires a name and a type attribute. The <var> element allows an optional attribute named gc. If this attribute is set to the value true then the variable is identified as a stack root for garbage collection. In this case, the variable must have pointer type. Stack roots help the garbage collector determine which memory areas in the heap are "live" and therefore not available for collection. 
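A function definition following the pattern above might look like the sketch below. All names in it are hypothetical. It assumes that the module also defines a signature named VoidFn whose result is void (see Signature Types), that an intrinsic atom such as i32 can be named directly in a type attribute, and it uses the <block> and <ret> elements described under Basic Blocks and Terminators.

```xml
<!-- Sketch only: "noop", "VoidFn" and "scratch" are hypothetical names. -->
<!-- Assumes <Signature name="VoidFn" result="void"/> exists in the module. -->
<Function name="noop" type="VoidFn" linkage="internal">
  <doc>Does nothing and returns to the caller.</doc>
  <!-- automatic (stack) variable; naming i32 directly is an assumption -->
  <var name="scratch" type="i32"/>
  <!-- the first block needs no label; it ends with a terminator -->
  <block>
    <ret/>
  </block>
</Function>
```

The empty <ret/> is consistent with a void result; a signature with any other result type would require a value inside the <ret> element.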
Basic Blocks<define name="Block.elem"> <element name="block"> <optional> <attribute name="label"><ref name="Identifier.type"/></attribute> </optional> <zeroOrMore> <choice> <ref name="Binary_Operators.pat"/> <ref name="Bitwise_Operators.pat"/> <ref name="Memory_Accessors.pat"/> <ref name="Other_Instructions.pat"/> </choice> </zeroOrMore> <ref name="Terminals.pat"/> </element> </define> Basic blocks define what a function does. Each basic block is a list of instructions that are executed sequentially terminating in one of the terminator instructions which will transfer control to another block or return from the function. Each block may have a label which is the name used in a <br> (branch) instruction to transfer control to the first instruction in the basic block. All blocks except the first must have a label. 4. Types<define name="Type.pat"> <choice> <ref name="Alias.elem"/> <ref name="Atom.elem"/> <ref name="Enumeration.elem"/> <ref name="Pointer.elem"/> <ref name="Array.elem"/> <ref name="Vector.elem"/> <ref name="Aggregate.elem"/> <ref name="Signature.elem"/> <ref name="Opaque.elem"/> </choice> </define> Types in XPL define how memory should be interpreted. There are two classes of types in XPL: atomic and compound. Atomic types are indivisible. Their values cannot be further decomposed. Compound types are groups of atomic (or other compound) types whose elements can be selected. XPL defines a set of intrinsic (pre-defined) atomic types but no intrinsic compound types. Atomic types are defined with the Atom, Enumeration and Pointer elements. Compound types are defined with the Vector, Array and Aggregate elements. There are three additional type definitions: Alias simply allows types to be renamed; Signature defines a function type; and, Opaque defines a type with undisclosed structure. The following sections define each of these XPL types. 
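To give a feel for how these type elements fit together, the sketch below declares one type of each kind inside a module. Every name and the pubid value are hypothetical. The attributes follow the patterns given in the sections below, and using an intrinsic atom name such as f64 where an identifier is expected is an assumption.

```xml
<!-- Sketch only: all type names and the pubid are hypothetical. -->
<Module name="Shapes" pubid="file:///xps/examples/Shapes.xpl">
  <Alias name="Coord" renames="f64"/>          <!-- renamed intrinsic atom -->
  <Atom name="Count" is="u32"/>                <!-- atomic type -->
  <Pointer name="CoordPtr" to="Coord"/>        <!-- typed pointer -->
  <Vector name="Vec4" length="4" of="f32"/>    <!-- packed, atomic elements -->
  <Array name="Row" length="8" of="Coord"/>    <!-- uniform element type -->
  <Aggregate name="Point">                     <!-- named fields -->
    <field name="x" type="Coord"/>
    <field name="y" type="Coord"/>
  </Aggregate>
  <Signature name="Distance" result="Coord">   <!-- function type -->
    <arg name="a" type="Point"/>
    <arg name="b" type="Point"/>
  </Signature>
  <Opaque name="Handle"/>                      <!-- undisclosed structure -->
</Module>
```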
Aliases<define name="Alias.elem"> <element name="Alias"> <ref name="Named_Element.pat"/> <attribute name="renames"><ref name="Identifier.type"/></attribute> </element> <ref name="Alias.ext"/> </define> <define name="Alias.ext" combine="choice"><empty/></define> Aliases are simply a renaming of an existing type. This is just syntactic sugar to make programs a little more readable. In particular, the intrinsic atom types such as i32 are often renamed to give the type more meaning. Aliases can rename both atomic and compound types. Atomic Types<define name="Atom.elem"> <element name="Atom"> <ref name="Named_Element.pat"/> <attribute name="is"> <ref name="Intrinsic_Atoms.type"/> </attribute> <ref name="Atom.ext"/> </element> </define> <define name="Intrinsic_Atoms.type"> <choice> <value>bool</value> <value>char</value> <value>f32</value> <value>f64</value> <value>i8</value> <value>i16</value> <value>i32</value> <value>i64</value> <value>u8</value> <value>u16</value> <value>u32</value> <value>u64</value> <value>void</value> </choice> </define> <define name="Atom.ext" combine="choice"><empty/></define> At first blush, you might think that the <Atom> element is unnecessary, but it is important for extensions. The XPL-Core atom definition is simply an equivalent of an Alias, with one exception: it permits the Atom.ext pattern to be part of the definition. This allows extensions to define new kinds of atomic data types. Enumeration Types<define name="Enumeration.elem"> <element name="Enum"> <ref name="Named_Element.pat"/> <oneOrMore> <element name="value"> <ref name="Named_Element.pat"/> <ref name="Constant.pat"/> </element> </oneOrMore> <ref name="Enumeration.ext"/> </element> </define> <define name="Enumeration.ext" combine="choice"><empty/></define> The Enumeration element defines an atomic integer type that has named constant values. The element consists of a list of name/value pairs. 
Note that the corresponding size in bytes of an enumeration depends on the range of constant values (not the number of enumerators) in the enumeration. The fewest number of bytes required to store the complete range of values will be used. Pointers<define name="Pointer.elem"> <element name="Pointer"> <ref name="Named_Element.pat"/> <attribute name="to"><ref name="Identifier.type"/></attribute> <ref name="Pointer.ext"/> </element> </define> <define name="Pointer.ext" combine="choice"><empty/></define> The Pointer element defines an atomic type whose content is a pointer to another storage location. All pointers are typed in XPL. Vectors<define name="Vector.elem"> <element name="Vector"> <ref name="Named_Element.pat"/> <attribute name="length"><data type="nonNegativeInteger"/></attribute> <attribute name="of"> <ref name="Intrinsic_Atoms.type"/> </attribute> <ref name="Vector.ext"/> </element> </define> <define name="Vector.ext" combine="choice"><empty/></define> The Vector element defines a type whose elements are of uniform atomic type. They are similar to arrays but have two differences: the elements must be of atomic type and the vector is packed to eliminate wasted space. Vectors are useful for supporting processors that have vector type operations such as SIMD (Single Instruction, Multiple Data) architectures. Aggregate<define name="Aggregate.elem"> <element name="Aggregate"> <ref name="Named_Element.pat"/> <zeroOrMore> <element name="field"> <ref name="Named_Typed_Element.pat"/> </element> </zeroOrMore> <ref name="Aggregate.ext"/> </element> </define> <define name="Aggregate.ext" combine="choice"><empty/></define> The Aggregate element defines a type whose elements are named and may have any type. Each element in an aggregate is called a field and has a name. The field implies an offset from the beginning of the aggregate at which location the field's data is stored. This is similar to the struct declaration in the "C" language. 
Fields of the aggregate are accessed with the index instruction. Array<define name="Array.elem"> <element name="Array"> <ref name="Named_Element.pat"/> <attribute name="length"><data type="nonNegativeInteger"/></attribute> <attribute name="of"><ref name="Identifier.type"/></attribute> <ref name="Array.ext"/> </element> </define> <define name="Array.ext" combine="choice"><empty/></define> The Array element defines a type whose elements have arbitrary but uniform type. The number of elements is specified with the length attribute. Only uni-dimensional arrays may be defined, but multi-dimensional arrays can be constructed by defining an array whose element type is also an array. Elements are accessed by indexing them relative to the base (first) element of the array with the index instruction. Signature Types<define name="Signature.elem"> <element name="Signature"> <ref name="Named_Element.pat"/> <attribute name="result"><ref name="Identifier.type"/></attribute> <optional> <attribute name="varargs"><ref name="Boolean.type"/></attribute> </optional> <optional> <attribute name="cc"><ref name="Calling_Convention.type"/></attribute> </optional> <zeroOrMore> <element name="arg"> <ref name="Named_Typed_Element.pat"/> </element> </zeroOrMore> <ref name="Signature.ext"/> </element> </define> <define name="Signature.ext" combine="choice"><empty/></define> The Signature element defines a type of function. Signatures do not define a memory structure but rather the interface or protocol for calling a given type of function. Signatures define a function with respect to the following factors:
Opaques<define name="Opaque.elem"> <element name="Opaque"> <ref name="Named_Element.pat"/> <empty/> </element> <ref name="Opaque.ext"/> </define> <define name="Opaque.ext" combine="choice"><empty/></define> The Opaque element defines a type of undisclosed structure. Opaques are not atomic or compound, but could be either when linking resolves the opaque types. Opaques are a way of defining encapsulated types that shield their users from the internal details of the opaque type. It is not valid to declare a variable of opaque type. It is very common to declare a pointer type that points to an opaque type which is then cast to the actual type by the type's implementation module. 5. InstructionsInstructions in XPL-Core define the basic operations that can be performed in defining a function. The XPL-Core instruction set consists of several different classifications of instructions: Terminators, Binary Operators, Bitwise Binary Operators, Memory Access Operators, and Other Operators as defined in the following sections. In XPL, instructions are composable. That is, most instructions produce a value. That value is either discarded or used as the value for the operand of another instruction. For example, suppose we wanted to increment a variable x by the value of variable y. We could write this in XPL as: <put> <ref name="x"/> <add> <get><ref name="x"/></get> <get><ref name="y"/></get> </add> </put> Note how the add instruction is the value to be placed into the variable x since it is the second operand of the put instruction. Similarly note how the two get instructions are used as the operands of the add instruction. This composition of instructions makes the use of many temporary variables unnecessary. 
Terminators<define name="Terminators.pat"> <choice> <ref name="Return.elem"/> <ref name="Branch.elem"/> <ref name="Switch.elem"/> <ref name="Invoke.elem"/> <ref name="Unwind.elem"/> <ref name="Unreachable.elem"/> </choice> </define> The terminator instructions can only be used as the last instruction of a basic block. They are illegal anywhere else. Terminators all transfer program control to some other basic block. Each terminator instruction is defined below. ret<define name="Return.elem"> <element name="ret"> <optional> <ref name="Value.pat"/> </optional> </element> </define>The ret instruction is used to return control (and possibly a value) from a function back to the caller of that function. The content of the <ret> element specifies the value to be returned. If the content is empty, no value is returned. The type of the returned value must match the type specified in the function signature's result attribute. If the returned value is empty, the function signature's result must be void.
When the ret instruction is executed, control flow returns back to the calling function's context. If the call was made using a call instruction, execution continues at the instruction after the call. If the call was made with an invoke instruction, execution continues at the beginning of the normal destination block, specified by the to attribute of the invoke instruction. If the instruction returns a value, that value will be used as the value of the call or invoke instruction.
br
<define name="Branch.elem"> <element name="br"> <choice> <group> <attribute name="to"><ref name="Identifier.type"/></attribute> <empty/> </group> <group> <attribute name="then"><ref name="Identifier.type"/></attribute> <attribute name="else"><ref name="Identifier.type"/></attribute> <ref name="Value.pat"/> </group> </choice> </element> </define>
The br instruction has two forms: conditional and unconditional. When used with the to attribute, the instruction unconditionally transfers control to the block named by the attribute and the element has empty content. In its conditional form, the instruction requires both the then and else attributes to specify the blocks to which control is transferred if the condition is satisfied or unsatisfied, respectively. The condition is provided in the content of the element and may be any boolean value.
switch
<define name="Switch.elem"> <element name="switch"> <attribute name="default"><ref name="Identifier.type"/></attribute> <element name="on"><ref name="Value.pat"/></element> <zeroOrMore> <element name="jump"> <attribute name="to"><ref name="Identifier.type"/></attribute> <ref name="Value.pat"/> </element> </zeroOrMore> </element> </define>
The switch instruction selects a particular case from a set of alternatives and transfers control to the matching case. The first operand of the switch must be an integer control value, which is compared to the various alternatives given by the jump elements.
The switch element must have a default attribute which specifies the label to which control is transferred if the control value does not match any of the alternatives. Each alternative is specified by a jump element whose to attribute specifies the label of the block to which control is transferred and whose content specifies the integer value to be matched against the control value. This instruction is not limited in its number of operands. There can be as many jump alternatives as necessary. For example:
<switch default="dflt"> <on><get><ref name="control_var"/></get></on> <jump to="case1"><dec>1</dec></jump> <jump to="case2"><dec>2</dec></jump> <jump to="case3"><dec>3</dec></jump> </switch>
In this example, the control value, given in the on element, is the value of the variable named control_var as indicated by the get and ref instructions. There are three alternatives that jump to basic blocks labelled case1, case2, and case3 based on the decimal integer literal constants 1, 2 and 3. If the value of control_var is not 1, 2, or 3, then control is transferred to the basic block named dflt.
invoke
<define name="Invoke.elem"> <element name="invoke"> <attribute name="var"><ref name="Identifier.type"/></attribute> <attribute name="to"><ref name="Identifier.type"/></attribute> <attribute name="except"><ref name="Identifier.type"/></attribute> <ref name="Value.pat"/> <zeroOrMore> <ref name="Value.pat"/> </zeroOrMore> </element> </define>
The invoke instruction is similar to a call instruction in that it transfers control to another function by calling it. However, call and invoke differ in the way they return. A call instruction is not terminating because after the called function returns, execution resumes with the instruction after the call instruction. This is not the case with invoke. Similar to a conditional branch, an invoke instruction requires two labels; control is transferred to one of them based on how the called function returns.
The to attribute specifies the branch for normal returns, that is, when the ret instruction is used to return. The except attribute specifies the branch for exceptional returns, that is, when the unwind instruction is used to return from any subordinate function context. The var attribute is used to capture the result value of the called function. Since the invoke instruction is a terminator instruction, it cannot be used as the operand of another instruction (terminator instructions have no value).
unwind
<define name="Unwind.elem"> <element name="unwind"><empty/></element> </define>
The unwind instruction unwinds the call stack, continuing the execution at the "except" label of the first caller in the dynamic call stack that used the "invoke" instruction for the call. This is primarily used to implement exception handling but could be used for other purposes. Undefined results occur if an unwind instruction is executed with no prior invoke instruction in the dynamic call stack.
unreachable
The unreachable instruction informs the compiler that execution beyond the unreachable instruction cannot occur due to higher order semantics such as a function that does not return or that only throws an exception.
Binary Operators
<define name="Binary_Operators.pat"> <choice> <ref name="Add.elem"/> <ref name="Subtract.elem"/> <ref name="Multiply.elem"/> <ref name="Divide.elem"/> <ref name="Remainder.elem"/> <ref name="Equal.elem"/> <ref name="NotEqual.elem"/> <ref name="GreaterThan.elem"/> <ref name="GreaterEqual.elem"/> <ref name="LessThan.elem"/> <ref name="LessEqual.elem"/> </choice> </define> <define name="Binary_Operator.pat"> <ref name="Value.pat"/> <ref name="Value.pat"/> </define>
The binary operators all require two operands of the same integral or floating point type, as specified in the Binary_Operator.pat pattern. These operators provide the basic mathematical operations:
and the comparison operations:
Bitwise Binary Operators<define name="Bitwise_Operators.pat"> <choice> <ref name="And.elem"/> <ref name="Or.elem"/> <ref name="Xor.elem"/> <ref name="ShiftLeft.elem"/> <ref name="ShiftRight.elem"/> </choice> </define> The bitwise binary operators all require two operands of the same integral type. These operators provide bitwise logical operators:
For the shr and shl instructions, the second operand may be of any unsigned integer type and specifies the number of bits to shift.
Memory Access Operators
<define name="Memory_Accessors.pat"> <choice> <ref name="Get.elem"/> <ref name="Put.elem"/> <ref name="Alloca.elem"/> <ref name="Malloc.elem"/> <ref name="Free.elem"/> <ref name="Index.elem"/> </choice> </define>
get
<define name="Get.elem"> <element name="get"> <ref name="Value.pat"/> <optional> <attribute name="volatile"> <ref name="Boolean.type"/> </attribute> </optional> <optional> <element name="gc"> <ref name="Value.pat"/> </element> </optional> </element> </define>
The get instruction loads a value from memory. The value of the instruction is the value loaded from memory. The instruction takes a single operand, in the element content, which must be a pointer to an atomic type. If the gc element is used, it specifies that the load is from a memory area that was allocated with garbage collection semantics. The content of the gc element must contain a value that specifies the base address of the memory area that was allocated. This assists garbage collection with identification of loads that require read barriers. If the value of the volatile attribute is true then the instruction will not be optimized out and a load of the value will always occur.
put
<define name="Put.elem"> <element name="put"> <ref name="Value.pat"/> <ref name="Value.pat"/> <optional> <attribute name="volatile"><ref name="Boolean.type"/></attribute> </optional> <optional> <element name="gc"> <ref name="Value.pat"/> </element> </optional> </element> </define>
The put instruction stores a value into memory. This instruction has no value so it must occur as the immediate child of a block element (i.e. it can't be the operand of some other instruction). If the gc element is used, it specifies that the store is to a memory area that was allocated with garbage collection semantics.
The content of the gc element must be a value that specifies the base address of the allocated memory area. This helps the garbage collector identify stores that require write barriers. If the value of the volatile attribute is true, the instruction will not be optimized out and a store of the value will always occur.
alloca
<define name="Alloca.elem"> <element name="alloca"> <ref name="Typed_Element.pat"/> <ref name="Value.pat"/> </element> </define>
The alloca instruction allocates a block of memory from the stack. Its operands, a type and a count, determine the size of the memory block allocated, which is defined as sizeof(type)*count. The value of the instruction is a pointer to the base of the memory area allocated. The type attribute names the type to be allocated.
malloc
<define name="Malloc.elem"> <element name="malloc"> <ref name="Typed_Element.pat"/> <ref name="Value.pat"/> </element> </define>
The malloc instruction allocates a block of memory from the heap. Its operands, a type and a count, determine the size of the memory block allocated, which is defined as sizeof(type)*count. The value of the instruction is a pointer to the base of the memory area allocated. The type attribute names the type to be allocated.
free
<define name="Free.elem"> <element name="free"> <ref name="Value.pat"/> </element> </define>
The free instruction returns memory to the heap so that it may be reallocated. Its single operand, a pointer value, must be the same value that was returned from a malloc instruction. Accessing the freed memory after this instruction executes may lead to undefined results. This instruction has no value and may not be used as an operand to another instruction.
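The size rule shared by alloca and malloc, sizeof(type)*count, can be illustrated with a small Python sketch. The type names and byte sizes in the table are assumptions made for the example; this section does not define them.

```python
# Hypothetical byte sizes for a few type names. XPL-Core's actual type
# names and sizes are not defined in this section.
SIZEOF = {"u8": 1, "u16": 2, "u32": 4, "f64": 8}


def allocation_size(type_name: str, count: int) -> int:
    """Return the bytes requested by alloca or malloc: sizeof(type) * count."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return SIZEOF[type_name] * count


print(allocation_size("u32", 16))  # 64
```

Both instructions then yield a pointer to the base of a block of that size; they differ only in where the block comes from (the stack or the heap).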
index
<define name="Index.elem"> <element name="index"> <ref name="Value.pat"/> <zeroOrMore> <choice> <element name="idx"><ref name="Value.pat"/></element> <element name="fld"><ref name="Named_Element.pat"/></element> </choice> </zeroOrMore> </element> </define>
The index operator makes it possible to index into an array, vector or aggregate to obtain a pointer to one of its elements. The instruction must have at least two operands. The first operand is always a pointer to the base of the array, vector or aggregate being indexed. The second and subsequent operands must be either <idx> elements, for indexing into an array or vector, or <fld> elements, for indexing into an aggregate. The <idx> element takes an integer value that specifies the element index into the array or vector. The <fld> element requires a name attribute that must specify a valid field name for the aggregate being indexed at that point.
Other Instructions
<define name="Other_Instructions.pat"> <choice> <ref name="Cast.elem"/> <ref name="Select.elem"/> <ref name="Call.elem"/> <ref name="VA_Arg.elem"/> <ref name="VA_Next.elem"/> </choice> </define>
Operators in this section defy categorization.
cast
<define name="Cast.elem"> <element name="cast"> <ref name="Value.pat"/> <attribute name="to"><ref name="Identifier.type"/></attribute> </element> </define>
The cast instruction permits conversion between types, which really amounts to re-interpretation of memory. This instruction can be used to convert between integer and floating point values, to change the size of integer values, and to break type safety rules by converting pointer types. The to attribute provides the name of the type to which the value is cast. The first operand provides the value being cast. The instruction yields a value of the type specified.
select
<define name="Select.elem"> <element name="select"> <ref name="Value.pat"/> <ref name="Value.pat"/> <ref name="Value.pat"/> </element> </define>
The select instruction chooses one of two atomic values based on a condition, without the use of branching. The instruction takes three operands. The first must be a boolean value that specifies the condition. If the condition is true, the second operand becomes the value of the instruction. If the condition is false, the third operand becomes the value of the instruction. The second and third operands must have the same type.
call
<define name="Call.elem"> <element name="call"> <ref name="Value.pat"/> <zeroOrMore> <ref name="Value.pat"/> </zeroOrMore> </element> </define>
The call instruction performs a call to a function. Control flow transfers to the specified function with its incoming arguments bound to the values specified in the call instruction. Upon a ret instruction in the called function, control flow continues with the instruction immediately after the call instruction, and the value of the call is bound to the operand of the ret instruction (if any). Note that if an unwind instruction is executed in the called function, control will not be transferred to the instruction following the call instruction. See unwind and invoke for details.
5.
Literal Constants
<define name="Literal.pat"> <choice> <ref name="Binary_Literal.elem"/> <ref name="Octal_Literal.elem"/> <ref name="Decimal_Literal.elem"/> <ref name="Hexadecimal_Literal.elem"/> <ref name="Boolean_Literal.elem"/> <ref name="Character_Literal.elem"/> <ref name="Real_Literal.elem"/> <ref name="Text_Literal.elem"/> <ref name="Null_Literal.elem"/> <ref name="Array_Literal.elem"/> <ref name="Vector_Literal.elem"/> <ref name="Aggregate_Literal.elem"/> </choice> </define> <define name="Typed_Literal.pat"> <optional> <attribute name="type"><ref name="Identifier.type"/></attribute> </optional> </define>
Literal constants come in various forms. Both atomic and compound types can have literal constants. The sections below define the XPL-Core literal constants.
ref
<define name="Reference.elem"> <element name="ref"> <ref name="Named_Element.pat"/> <empty/> </element> </define>
The ref element returns a constant pointer to a global value (variable or function) or an automatic (stack) variable.
bin
<define name="Binary_Literal.elem"> <element name="bin"> <ref name="Typed_Literal.pat"/> <ref name="Binary.type"/> </element> </define> <define name="Binary.type"> <data type="string"> <param name="minLength">1</param> <param name="maxLength">1024</param> <param name="pattern">[01]+</param> </data> </define>
The bin element permits a binary integer literal constant to be specified. A binary integer consists of a string of up to 1024 binary digits (0s or 1s). If the number of bits exceeds the range of the type required, high-order bits will be truncated. If fewer bits are specified than required by the type, the value will be zero filled in the high-order bits.
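The truncation and zero-fill rule for bin literals can be modelled in a few lines of Python. This is only a sketch: the function name and the idea of passing the target type's width as a bit count are assumptions made for illustration, not part of XPL-Core.

```python
import re

# Binary.type: 1 to 1024 characters, each 0 or 1.
BINARY = re.compile(r"[01]+")


def bin_literal_value(digits: str, width: int) -> int:
    """Fit a bin literal into an unsigned integer of `width` bits.

    Bits above `width` are dropped (high-order truncation). Shorter
    literals are zero filled, which is implicit in the integer value.
    """
    if not (1 <= len(digits) <= 1024) or BINARY.fullmatch(digits) is None:
        raise ValueError("not a valid Binary.type string")
    return int(digits, 2) & ((1 << width) - 1)


print(bin_literal_value("101", 8))        # 5: zero filled to 00000101
print(bin_literal_value("100000001", 8))  # 1: the ninth bit is truncated
```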
oct
<define name="Octal_Literal.elem"> <element name="oct"> <ref name="Typed_Literal.pat"/> <ref name="Octal.type"/> </element> </define> <define name="Octal.type" > <data type="string"> <param name="minLength">1</param> <param name="maxLength">1024</param> <param name="pattern">[0-7]+</param> </data> </define>
The oct element permits an octal integer literal constant to be specified. An octal integer constant is a string of up to 1024 octal digits (base 8). Only digits in the range 0-7 are permitted. This is for compatibility with older 7 bit systems or situations where an octal encoding is more natural (e.g. places where groups of 3 bits are used frequently). If the number of octal digits exceeds the range of the type required, high-order bits (not octal digits) will be truncated. If fewer digits are specified than required by the type, the value will be zero filled in the high-order bits.
dec
<define name="Decimal_Literal.elem"> <element name="dec"> <ref name="Typed_Literal.pat"/> <ref name="Decimal.type"/> </element> </define> <define name="Decimal.type"> <data type="string"> <param name="minLength">1</param> <param name="maxLength">1024</param> <param name="pattern">[+\-]?\d+</param> </data> </define>
The dec element permits a decimal integer literal constant to be specified. A decimal integer constant is a string of up to 1024 characters consisting of decimal digits (base 10), optionally preceded by a sign (+ or -). Only digits in the range 0-9 are permitted. If the number of decimal digits exceeds the range of the type required, high-order bits (not decimal digits) will be truncated. If fewer digits are specified than required by the type, the value will be zero filled in the high-order bits.
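The Octal.type and Decimal.type patterns, and the rule that truncation removes bits rather than digits, can be checked with a short Python sketch. The helper names and the explicit bit width are assumptions made for the example. Schema patterns match the whole string, which is why `fullmatch` is used.

```python
import re

OCTAL = re.compile(r"[0-7]+")        # Octal.type pattern
DECIMAL = re.compile(r"[+\-]?\d+")   # Decimal.type pattern


def is_octal(text: str) -> bool:
    return 1 <= len(text) <= 1024 and OCTAL.fullmatch(text) is not None


def is_decimal(text: str) -> bool:
    return 1 <= len(text) <= 1024 and DECIMAL.fullmatch(text) is not None


def truncate_to_width(value: int, width: int) -> int:
    """Keep only the low `width` bits of a non-negative value."""
    return value & ((1 << width) - 1)


print(is_octal("778"))                      # False: 8 is not an octal digit
print(is_decimal("-42"))                    # True: a leading sign is allowed
print(truncate_to_width(int("777", 8), 8))  # 255, not 0o77 (63)
print(truncate_to_width(int("300"), 8))     # 44
```

The third line shows why the text says bits, not digits: octal 777 is nine bits wide, and dropping the top bit leaves 255, whereas dropping the top octal digit would leave 63.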
hex
<define name="Hexadecimal_Literal.elem"> <element name="hex"> <ref name="Typed_Literal.pat"/> <ref name="Hexadecimal.type"/> </element> </define> <define name="Hexadecimal.type"> <data type="string"> <param name="minLength">1</param> <param name="maxLength">1024</param> <param name="pattern">([0-9A-Fa-f][0-9A-Fa-f])+</param> </data> </define>
The hex element permits a hexadecimal integer literal constant to be specified. A hexadecimal integer constant is a string of up to 1024 hexadecimal digits (base 16). Only digits in the range 0-9 and A-F (or a-f) are permitted. The pattern requires the digits to come in pairs, so the number of digits must be even. If the number of hexadecimal digits exceeds the range of the type required, high-order bits (not hexadecimal digits) will be truncated. If fewer digits are specified than required by the type, the value will be zero filled in the high-order bits.
true and false
<define name="Boolean_Literal.elem"> <choice> <element name="true"><empty/></element> <element name="false"><empty/></element> </choice> </define>
The true and false elements represent the two boolean values as literal constants.
char
<define name="Character_Literal.elem">
<element name="char"><ref name="Character.type"/></element>
</define>
<define name="Character.type">
<choice>
<data type="string"><param name="length">1</param></data>
<data type="string">
<param name="length">5</param>
<param name="pattern">[#][0-9A-Fa-f]{4,4}</param>
</data>
</choice>
</define>
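The two alternatives of Character.type can be exercised with a small Python sketch. The validation follows the schema directly. The decoding step assumes that the # form names a Unicode code point in hexadecimal, which this section does not state, so treat `decode_char_literal` as a guess at the intended meaning.

```python
import re

# Second alternative of Character.type: '#' followed by four hex digits.
ESCAPE = re.compile(r"[#][0-9A-Fa-f]{4,4}")


def is_char_literal(text: str) -> bool:
    """True if `text` is one character or a five-character #XXXX form."""
    if len(text) == 1:
        return True
    return len(text) == 5 and ESCAPE.fullmatch(text) is not None


def decode_char_literal(text: str) -> str:
    """Return the character denoted by a char literal.

    ASSUMPTION: #XXXX is read as a hexadecimal code point.
    """
    if not is_char_literal(text):
        raise ValueError("not a valid Character.type string")
    return text if len(text) == 1 else chr(int(text[1:], 16))


print(is_char_literal("A"))      # True
print(is_char_literal("#00G9"))  # False: G is not a hex digit
print(decode_char_literal("#0041"))  # A
```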
The char element permits a single Unicode character literal to be specified, either as the character itself or as a # followed by exactly four hexadecimal digits.
flt and dbl
<define name="Real_Literal.elem">
<choice>
<element name="flt"><ref name="Real.type"/></element>
<element name="dbl"><ref name="Real.type"/></element>
</choice>
</define>
<define name="Real.type">
<data type="string">
<param name="minLength">1</param>
<param name="maxLength">1024</param>
<param name="pattern">
ninf|pinf|nan|signan|zero|nzero|
[+\-]?0x[0-9A-Fa-f](\.[0-9A-Fa-f]+)?p[-+][0-9]+|
#[0-9A-Fa-f]{16}|
#[0-9a-fA-F]{8}|
[+\-]?\d+\.\d*([Ee][+\-]?\d+)?</param>
</data>
</define>
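To see which strings the Real.type pattern accepts, the alternatives can be transcribed into a Python regular expression. This sketch assumes the line breaks inside the pattern parameter are layout only and removes them; the helper name is illustrative.

```python
import re

# Real.type pattern, one alternative per line, with layout whitespace removed.
REAL = re.compile(
    r"ninf|pinf|nan|signan|zero|nzero|"
    r"[+\-]?0x[0-9A-Fa-f](\.[0-9A-Fa-f]+)?p[-+][0-9]+|"
    r"#[0-9A-Fa-f]{16}|"
    r"#[0-9a-fA-F]{8}|"
    r"[+\-]?\d+\.\d*([Ee][+\-]?\d+)?"
)


def is_real_literal(text: str) -> bool:
    """True if `text` satisfies the length limits and the pattern."""
    return 1 <= len(text) <= 1024 and REAL.fullmatch(text) is not None


for sample in ["pinf", "-2.5E-3", "1.", "0x1.8p+1", "#3F800000", "42", "1e5"]:
    print(sample, is_real_literal(sample))
```

Under this reading, the decimal form always needs a decimal point (so `42` and `1e5` are rejected while `1.` is accepted), and the hexadecimal form always needs an explicit sign on its exponent.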
The flt and dbl elements are used to specify floating point literal constants corresponding to IEEE 32-bit single-precision and IEEE 64-bit double-precision, respectively. Several forms of floating point numbers are accepted, as follows:
text
<define name="Text_Literal.elem"> <element name="text"><text/></element> </define>
The text element can be used to initialize an array of u8 (UTF-8 encoded) or u16 (UTF-16 encoded) elements. The characters are placed into successive elements of the array. This is simply shorthand for using the more verbose char element to initialize each element of the array. It is invalid to initialize any other kind of array with a text element.
null
<define name="Null_Literal.elem"> <element name="null"><empty/></element> </define>
This element provides a null (zero) value in whatever type it is applied to.
array
<define name="Array_Literal.elem"> <element name="array"> <ref name="Typed_Element.pat"/> <oneOrMore> <ref name="Constant.pat"/> </oneOrMore> </element> </define>
The array element allows a literal array to be specified. This is typically used to initialize an array typed variable. The array literal consists of a list of constants.
vector
<define name="Vector_Literal.elem"> <element name="vector"> <ref name="Typed_Element.pat"/> <oneOrMore> <ref name="Constant.pat"/> </oneOrMore> </element> </define>
The vector element allows a literal vector to be specified. This is typically used to initialize a vector typed variable. The vector literal consists of a list of constants.
aggregate
<define name="Aggregate_Literal.elem"> <element name="aggregate"> <ref name="Typed_Element.pat"/> <oneOrMore> <ref name="Constant.pat"/> </oneOrMore> </element> </define>
The aggregate element allows a literal aggregate constant to be specified. This is typically used to initialize an aggregate typed variable. The aggregate literal consists of a list of constants that correspond to the fields in the aggregate in declaration order.
7. Intrinsic Functions
Intrinsic functions have not yet been defined for XPL-Core; however, several are planned. When the definitions are complete, documentation will be provided here.