This document is intended as a reference manual for the OCaml language. It lists the language constructs, and gives their precise syntax and informal semantics. It is by no means a tutorial introduction to the language: there is not a single example. A good working knowledge of OCaml is assumed.
No attempt has been made at mathematical rigor: words are employed with their intuitive meaning, without further definition. As a consequence, the typing rules have been left out, by lack of the mathematical framework required to express them, while they are definitely part of a full formal definition of the language.
The syntax of the language is given in BNF-like notation. Terminal symbols are set in typewriter font (like this). Non-terminal symbols are set in italic font (like that). Square brackets […] denote optional components. Curly brackets {…} denotes zero, one or several repetitions of the enclosed components. Curly brackets with a trailing plus sign {…}+ denote one or several repetitions of the enclosed components. Parentheses (…) denote grouping.
|
See also the following language extensions: object notations, first-class modules, overriding in open statements, syntax for Bigarray access, attributes, extension nodes, local exceptions extended indexing operators.
The table below shows the relative precedences and associativity of operators and non-closed constructions. The constructions with higher precedence come first. For infix and prefix symbols, we write “*…” to mean “any symbol starting with *”.
Construction or operator | Associativity |
prefix-symbol | – |
. .( .[ .{ (see section 8.15) | – |
#… | – |
function application, constructor application, tag application, assert, lazy | left |
- -. (prefix) | – |
**… lsl lsr asr | right |
*… /… %… mod land lor lxor | left |
+… -… | left |
:: | right |
@… ^… | right |
=… <… >… |… &… $… != | left |
& && | right |
or || | right |
, | – |
<- := | right |
if | – |
; | right |
let match fun function try | – |
An expression consisting in a constant evaluates to this constant.
An expression consisting in an access path evaluates to the value bound to this path in the current evaluation environment. The path can be either a value name or an access path to a value component of a module.
The expressions ( expr ) and begin expr end have the same value as expr. The two constructs are semantically equivalent, but it is good style to use begin … end inside control structures:
if … then begin … ; … end else begin … ; … end
and ( … ) for the other grouping situations.
Parenthesized expressions can contain a type constraint, as in ( expr : typexpr ). This constraint forces the type of expr to be compatible with typexpr.
Parenthesized expressions can also contain coercions ( expr [: typexpr] :> typexpr) (see subsection 7.7.6 below).
Function application is denoted by juxtaposition of (possibly labeled) expressions. The expression expr argument1 … argumentn evaluates the expression expr and those appearing in argument1 to argumentn. The expression expr must evaluate to a functional value f, which is then applied to the values of argument1, …, argumentn.
The order in which the expressions expr, argument1, …, argumentn are evaluated is not specified.
Arguments and parameters are matched according to their respective labels. Argument order is irrelevant, except among arguments with the same label, or no label.
If a parameter is specified as optional (label prefixed by ?) in the type of expr, the corresponding argument will be automatically wrapped with the constructor Some, except if the argument itself is also prefixed by ?, in which case it is passed as is. If a non-labeled argument is passed, and its corresponding parameter is preceded by one or several optional parameters, then these parameters are defaulted, i.e. the value None will be passed for them. All other missing parameters (without corresponding argument), both optional and non-optional, will be kept, and the result of the function will still be a function of these missing parameters to the body of f.
As a special case, if the function has a known arity, all the arguments are unlabeled, and their number matches the number of non-optional parameters, then labels are ignored and non-optional parameters are matched in their definition order. Optional arguments are defaulted.
In all cases but exact match of order and labels, without optional parameters, the function type should be known at the application point. This can be ensured by adding a type constraint. Principality of the derivation can be checked in the -principal mode.
Two syntactic forms are provided to define functions. The first form is introduced by the keyword function:
|
This expression evaluates to a functional value with one argument. When this function is applied to a value v, this value is matched against each pattern pattern1 to patternn. If one of these matchings succeeds, that is, if the value v matches the pattern patterni for some i, then the expression expri associated to the selected pattern is evaluated, and its value becomes the value of the function application. The evaluation of expri takes place in an environment enriched by the bindings performed during the matching.
If several patterns match the argument v, the one that occurs first in the function definition is selected. If none of the patterns matches the argument, the exception Match_failure is raised.
The other form of function definition is introduced by the keyword fun:
This expression is equivalent to:
An optional type constraint typexpr can be added before -> to enforce the type of the result to be compatible with the constraint typexpr:
is equivalent to
Beware of the small syntactic difference between a type constraint on the last parameter
and one on the result
The parameter patterns ~lab and ~(lab [: typ]) are shorthands for respectively ~lab: lab and ~lab:( lab [: typ]), and similarly for their optional counterparts.
A function of the form fun ? lab :( pattern = expr0 ) -> expr is equivalent to
where ident is a fresh variable, except that it is unspecified when expr0 is evaluated.
After these two transformations, expressions are of the form
If we ignore labels, which will only be meaningful at function application, this is equivalent to
That is, the fun expression above evaluates to a curried function with n arguments: after applying this function n times to the values v1 … vn, the values will be matched in parallel against the patterns pattern1 … patternn. If the matching succeeds, the function returns the value of expr in an environment enriched by the bindings performed during the matchings. If the matching fails, the exception Match_failure is raised.
The cases of a pattern matching (in the function, match and try constructs) can include guard expressions, which are arbitrary boolean expressions that must evaluate to true for the match case to be selected. Guards occur just before the -> token and are introduced by the when keyword:
|
Matching proceeds as described before, except that if the value matches some pattern patterni which has a guard condi, then the expression condi is evaluated (in an environment enriched by the bindings performed during matching). If condi evaluates to true, then expri is evaluated and its value returned as the result of the matching, as usual. But if condi evaluates to false, the matching is resumed against the patterns following patterni.
The let and let rec constructs bind value names locally. The construct
evaluates expr1 … exprn in some unspecified order and matches their values against the patterns pattern1 … patternn. If the matchings succeed, expr is evaluated in the environment enriched by the bindings performed during matching, and the value of expr is returned as the value of the whole let expression. If one of the matchings fails, the exception Match_failure is raised.
An alternate syntax is provided to bind variables to functional values: instead of writing
in a let expression, one may instead write
Recursive definitions of names are introduced by let rec:
The only difference with the let construct described above is that the bindings of names to values performed by the pattern-matching are considered already performed when the expressions expr1 to exprn are evaluated. That is, the expressions expr1 to exprn can reference identifiers that are bound by one of the patterns pattern1, …, patternn, and expect them to have the same value as in expr, the body of the let rec construct.
The recursive definition is guaranteed to behave as described above if the expressions expr1 to exprn are function definitions (fun … or function …), and the patterns pattern1 … patternn are just value names, as in:
This defines name1 … namen as mutually recursive functions local to expr.
The behavior of other forms of let rec definitions is implementation-dependent. The current implementation also supports a certain class of recursive definitions of non-functional values, as explained in section 8.2.
(Introduced in OCaml 3.12)
Polymorphic type annotations in let-definitions behave in a way similar to polymorphic methods:
These annotations explicitly require the defined value to be polymorphic, and allow one to use this polymorphism in recursive occurrences (when using let rec). Note however that this is a normal polymorphic type, unifiable with any instance of itself.
The expression expr1 ; expr2 evaluates expr1 first, then expr2, and returns the value of expr2.
The expression if expr1 then expr2 else expr3 evaluates to the value of expr2 if expr1 evaluates to the boolean true, and to the value of expr3 if expr1 evaluates to the boolean false.
The else expr3 part can be omitted, in which case it defaults to else ().
The expression
|
matches the value of expr against the patterns pattern1 to patternn. If the matching against patterni succeeds, the associated expression expri is evaluated, and its value becomes the value of the whole match expression. The evaluation of expri takes place in an environment enriched by the bindings performed during matching. If several patterns match the value of expr, the one that occurs first in the match expression is selected. If none of the patterns match the value of expr, the exception Match_failure is raised.
The expression expr1 && expr2 evaluates to true if both expr1 and expr2 evaluate to true; otherwise, it evaluates to false. The first component, expr1, is evaluated first. The second component, expr2, is not evaluated if the first component evaluates to false. Hence, the expression expr1 && expr2 behaves exactly as
The expression expr1 || expr2 evaluates to true if one of the expressions expr1 and expr2 evaluates to true; otherwise, it evaluates to false. The first component, expr1, is evaluated first. The second component, expr2, is not evaluated if the first component evaluates to true. Hence, the expression expr1 || expr2 behaves exactly as
The boolean operators & and or are deprecated synonyms for (respectively) && and ||.
The expression while expr1 do expr2 done repeatedly evaluates expr2 while expr1 evaluates to true. The loop condition expr1 is evaluated and tested at the beginning of each iteration. The whole while … done expression evaluates to the unit value ().
The expression for name = expr1 to expr2 do expr3 done first evaluates the expressions expr1 and expr2 (the boundaries) into integer values n and p. Then, the loop body expr3 is repeatedly evaluated in an environment where name is successively bound to the values n, n+1, …, p−1, p. The loop body is never evaluated if n > p.
The expression for name = expr1 downto expr2 do expr3 done evaluates similarly, except that name is successively bound to the values n, n−1, …, p+1, p. The loop body is never evaluated if n < p.
In both cases, the whole for expression evaluates to the unit value ().
The expression
|
evaluates the expression expr and returns its value if the evaluation of expr does not raise any exception. If the evaluation of expr raises an exception, the exception value is matched against the patterns pattern1 to patternn. If the matching against patterni succeeds, the associated expression expri is evaluated, and its value becomes the value of the whole try expression. The evaluation of expri takes place in an environment enriched by the bindings performed during matching. If several patterns match the value of expr, the one that occurs first in the try expression is selected. If none of the patterns matches the value of expr, the exception value is raised again, thereby transparently “passing through” the try construct.
The expression expr1 , … , exprn evaluates to the n-tuple of the values of expressions expr1 to exprn. The evaluation order of the subexpressions is not specified.
The expression constr expr evaluates to the unary variant value whose constructor is constr, and whose argument is the value of expr. Similarly, the expression constr ( expr1 , … , exprn ) evaluates to the n-ary variant value whose constructor is constr and whose arguments are the values of expr1, …, exprn.
The expression constr ( expr1, …, exprn) evaluates to the variant value whose constructor is constr, and whose arguments are the values of expr1 … exprn.
For lists, some syntactic sugar is provided. The expression expr1 :: expr2 stands for the constructor ( :: ) applied to the arguments ( expr1 , expr2 ), and therefore evaluates to the list whose head is the value of expr1 and whose tail is the value of expr2. The expression [ expr1 ; … ; exprn ] is equivalent to expr1 :: … :: exprn :: [], and therefore evaluates to the list whose elements are the values of expr1 to exprn.
The expression `tag-name expr evaluates to the polymorphic variant value whose tag is tag-name, and whose argument is the value of expr.
The expression { field1 [= expr1] ; … ; fieldn [= exprn ]} evaluates to the record value { field1 = v1; …; fieldn = vn } where vi is the value of expri for i = 1,… , n. A single identifier fieldk stands for fieldk = fieldk, and a qualified identifier module-path . fieldk stands for module-path . fieldk = fieldk. The fields field1 to fieldn must all belong to the same record type; each field of this record type must appear exactly once in the record expression, though they can appear in any order. The order in which expr1 to exprn are evaluated is not specified. Optional type constraints can be added after each field { field1 : typexpr1 = expr1 ;… ; fieldn : typexprn = exprn } to force the type of fieldk to be compatible with typexprk.
The expression { expr with field1 [= expr1] ; … ; fieldn [= exprn] } builds a fresh record with fields field1 … fieldn equal to expr1 … exprn, and all other fields having the same value as in the record expr. In other terms, it returns a shallow copy of the record expr, except for the fields field1 … fieldn, which are initialized to expr1 … exprn. As previously, single identifier fieldk stands for fieldk = fieldk, a qualified identifier module-path . fieldk stands for module-path . fieldk = fieldk and it is possible to add an optional type constraint on each field being updated with { expr with field1 : typexpr1 = expr1 ; … ; fieldn : typexprn = exprn }.
The expression expr1 . field evaluates expr1 to a record value, and returns the value associated to field in this record value.
The expression expr1 . field <- expr2 evaluates expr1 to a record value, which is then modified in-place by replacing the value associated to field in this record by the value of expr2. This operation is permitted only if field has been declared mutable in the definition of the record type. The whole expression expr1 . field <- expr2 evaluates to the unit value ().
The expression [| expr1 ; … ; exprn |] evaluates to a n-element array, whose elements are initialized with the values of expr1 to exprn respectively. The order in which these expressions are evaluated is unspecified.
The expression expr1 .( expr2 ) returns the value of element number expr2 in the array denoted by expr1. The first element has number 0; the last element has number n−1, where n is the size of the array. The exception Invalid_argument is raised if the access is out of bounds.
The expression expr1 .( expr2 ) <- expr3 modifies in-place the array denoted by expr1, replacing element number expr2 by the value of expr3. The exception Invalid_argument is raised if the access is out of bounds. The value of the whole expression is ().
The expression expr1 .[ expr2 ] returns the value of character number expr2 in the string denoted by expr1. The first character has number 0; the last character has number n−1, where n is the length of the string. The exception Invalid_argument is raised if the access is out of bounds.
The expression expr1 .[ expr2 ] <- expr3 modifies in-place the string denoted by expr1, replacing character number expr2 by the value of expr3. The exception Invalid_argument is raised if the access is out of bounds. The value of the whole expression is ().
Note: this possibility is offered only for backward compatibility with older versions of OCaml and will be removed in a future version. New code should use byte sequences and the Bytes.set function.
Symbols from the class infix-symbol, as well as the keywords *, +, -, -., =, !=, <, >, or, ||, &, &&, :=, mod, land, lor, lxor, lsl, lsr, and asr can appear in infix position (between two expressions). Symbols from the class prefix-symbol, as well as the keywords - and -. can appear in prefix position (in front of an expression).
Infix and prefix symbols do not have a fixed meaning: they are simply interpreted as applications of functions bound to the names corresponding to the symbols. The expression prefix-symbol expr is interpreted as the application ( prefix-symbol ) expr. Similarly, the expression expr1 infix-symbol expr2 is interpreted as the application ( infix-symbol ) expr1 expr2.
The table below lists the symbols defined in the initial environment and their initial meaning. (See the description of the core library module Pervasives in chapter 25 for more details). Their meaning may be changed at any time using let ( infix-op ) name1 name2 = …
Note: the operators &&, ||, and ~- are handled specially and it is not advisable to change their meaning.
The keywords - and -. can appear both as infix and prefix operators. When they appear as prefix operators, they are interpreted respectively as the functions (~-) and (~-.).
Operator | Initial meaning |
+ | Integer addition. |
- (infix) | Integer subtraction. |
~- - (prefix) | Integer negation. |
* | Integer multiplication. |
/ | Integer division. Raise Division_by_zero if second argument is zero. |
mod | Integer modulus. Raise Division_by_zero if second argument is zero. |
land | Bitwise logical “and” on integers. |
lor | Bitwise logical “or” on integers. |
lxor | Bitwise logical “exclusive or” on integers. |
lsl | Bitwise logical shift left on integers. |
lsr | Bitwise logical shift right on integers. |
asr | Bitwise arithmetic shift right on integers. |
+. | Floating-point addition. |
-. (infix) | Floating-point subtraction. |
~-. -. (prefix) | Floating-point negation. |
*. | Floating-point multiplication. |
/. | Floating-point division. |
** | Floating-point exponentiation. |
@ | List concatenation. |
^ | String concatenation. |
! | Dereferencing (return the current contents of a reference). |
:= | Reference assignment (update the reference given as first argument with the value of the second argument). |
= | Structural equality test. |
<> | Structural inequality test. |
== | Physical equality test. |
!= | Physical inequality test. |
< | Test “less than”. |
<= | Test “less than or equal”. |
> | Test “greater than”. |
>= | Test “greater than or equal”. |
&& & | Boolean conjunction. |
|| or | Boolean disjunction. |
When class-path evaluates to a class body, new class-path evaluates to a new object containing the instance variables and methods of this class.
When class-path evaluates to a class function, new class-path evaluates to a function expecting the same number of arguments and returning a new object of this class.
Creating directly an object through the object class-body end construct is operationally equivalent to defining locally a class class-name = object class-body end —see sections 7.9.2 and following for the syntax of class-body— and immediately creating a single object from it by new class-name.
The typing of immediate objects is slightly different from explicitly defining a class in two respects. First, the inferred object type may contain free type variables. Second, since the class body of an immediate object will never be extended, its self type can be unified with a closed object type.
The expression expr # method-name invokes the method method-name of the object denoted by expr.
If method-name is a polymorphic method, its type should be known at the invocation site. This is true for instance if expr is the name of a fresh object (let ident = new class-path … ) or if there is a type constraint. Principality of the derivation can be checked in the -principal mode.
The instance variables of a class are visible only in the body of the methods defined in the same class or a class that inherits from the class defining the instance variables. The expression inst-var-name evaluates to the value of the given instance variable. The expression inst-var-name <- expr assigns the value of expr to the instance variable inst-var-name, which must be mutable. The whole expression inst-var-name <- expr evaluates to ().
An object can be duplicated using the library function Oo.copy (see Module Oo). Inside a method, the expression {< inst-var-name = expr { ; inst-var-name = expr } >} returns a copy of self with the given instance variables replaced by the values of the associated expressions; other instance variables have the same value in the returned object as in self.
Expressions whose type contains object or polymorphic variant types can be explicitly coerced (weakened) to a supertype. The expression (expr :> typexpr) coerces the expression expr to type typexpr. The expression (expr : typexpr1 :> typexpr2) coerces the expression expr from type typexpr1 to type typexpr2.
The former operator will sometimes fail to coerce an expression expr from a type typ1 to a type typ2 even if type typ1 is a subtype of type typ2: in the current implementation it only expands two levels of type abbreviations containing objects and/or polymorphic variants, keeping only recursion when it is explicit in the class type (for objects). As an exception to the above algorithm, if both the inferred type of expr and typ are ground (i.e. do not contain type variables), the former operator behaves as the latter one, taking the inferred type of expr as typ1. In case of failure with the former operator, the latter one should be used.
It is only possible to coerce an expression expr from type typ1 to type typ2, if the type of expr is an instance of typ1 (like for a type annotation), and typ1 is a subtype of typ2. The type of the coerced expression is an instance of typ2. If the types contain variables, they may be instantiated by the subtyping algorithm, but this is only done after determining whether typ1 is a potential subtype of typ2. This means that typing may fail during this latter unification step, even if some instance of typ1 is a subtype of some instance of typ2. In the following paragraphs we describe the subtyping relation used.
A fixed object type admits as subtype any object type that includes all its methods. The types of the methods shall be subtypes of those in the supertype. Namely,
is a supertype of
which may contain an ellipsis .. if every typi is a supertype of the corresponding typ′i.
A monomorphic method type can be a supertype of a polymorphic method type. Namely, if typ is an instance of typ′, then 'a1 … 'an . typ′ is a subtype of typ.
Inside a class definition, newly defined types are not available for subtyping, as the type abbreviations are not yet completely defined. There is an exception for coercing self to the (exact) type of its class: this is allowed if the type of self does not appear in a contravariant position in the class type, i.e. if there are no binary methods.
A polymorphic variant type typ is a subtype of another polymorphic variant type typ′ if the upper bound of typ (i.e. the maximum set of constructors that may appear in an instance of typ) is included in the lower bound of typ′, and the types of arguments for the constructors of typ are subtypes of those in typ′. Namely,
which may be a shrinkable type, is a subtype of
which may be an extensible type, if every typi is a subtype of typ′i.
Other types do not introduce new subtyping, but they may propagate the subtyping of their arguments. For instance, typ1 * typ2 is a subtype of typ′1 * typ′2 when typ1 and typ2 are respectively subtypes of typ′1 and typ′2. For function types, the relation is more subtle: typ1 -> typ2 is a subtype of typ′1 -> typ′2 if typ1 is a supertype of typ′1 and typ2 is a subtype of typ′2. For this reason, function types are covariant in their second argument (like tuples), but contravariant in their first argument. Mutable types, like array or ref are neither covariant nor contravariant, they are nonvariant, that is they do not propagate subtyping.
For user-defined types, the variance is automatically inferred: a parameter is covariant if it has only covariant occurrences, contravariant if it has only contravariant occurrences, variance-free if it has no occurrences, and nonvariant otherwise. A variance-free parameter may change freely through subtyping, it does not have to be a subtype or a supertype. For abstract and private types, the variance must be given explicitly (see section 7.8.1), otherwise the default is nonvariant. This is also the case for constrained arguments in type definitions.
OCaml supports the assert construct to check debugging assertions. The expression assert expr evaluates the expression expr and returns () if expr evaluates to true. If it evaluates to false the exception Assert_failure is raised with the source file name and the location of expr as arguments. Assertion checking can be turned off with the -noassert compiler option. In this case, expr is not evaluated at all.
As a special case, assert false is reduced to raise (Assert_failure ...), which gives it a polymorphic type. This means that it can be used in place of any expression (for example as a branch of any pattern-matching). It also means that the assert false “assertions” cannot be turned off by the -noassert option.
The expression lazy expr returns a value v of type Lazy.t that encapsulates the computation of expr. The argument expr is not evaluated at this point in the program. Instead, its evaluation will be performed the first time the function Lazy.force is applied to the value v, returning the actual value of expr. Subsequent applications of Lazy.force to v do not evaluate expr again. Applications of Lazy.force may be implicit through pattern matching (see 8.3).
The expression let module module-name = module-expr in expr locally binds the module expression module-expr to the identifier module-name during the evaluation of the expression expr. It then returns the value of expr. For example:
let remove_duplicates comparison_fun string_list = let module StringSet = Set.Make(struct type t = string let compare = comparison_fun end) in StringSet.elements (List.fold_right StringSet.add string_list StringSet.empty)
The expressions let open module-path in expr and module-path.( expr) are strictly equivalent. These constructions locally open the module referred to by the module path module-path in the respective scope of the expression expr.
When the body of a local open expression is delimited by [ ], [| |], or { }, the parentheses can be omitted. For expression, parentheses can also be omitted for {< >}. For example, module-path.[ expr] is equivalent to module-path.([ expr]), and module-path.[| expr |] is equivalent to module-path.([| expr |]).