Chapter 6  The OCaml language

Foreword

This document is intended as a reference manual for the OCaml language. It lists the language constructs, and gives their precise syntax and informal semantics. It is by no means a tutorial introduction to the language: there is not a single example. A good working knowledge of OCaml is assumed.

No attempt has been made at mathematical rigor: words are employed with their intuitive meaning, without further definition. As a consequence, the typing rules have been left out, by lack of the mathematical framework required to express them, while they are definitely part of a full formal definition of the language.

Notations

The syntax of the language is given in BNF-like notation. Terminal symbols are set in typewriter font (like this). Non-terminal symbols are set in italic font (like  that). Square brackets […] denote optional components. Curly brackets {…} denotes zero, one or several repetitions of the enclosed components. Curly brackets with a trailing plus sign {…}+ denote one or several repetitions of the enclosed components. Parentheses (…) denote grouping.

2  Values

This section describes the kinds of values that are manipulated by OCaml programs.

2.1  Base values

Integer numbers

Integer values are integer numbers from −230 to 230−1, that is −1073741824 to 1073741823. The implementation may support a wider range of integer values: on 64-bit platforms, the current implementation supports integers ranging from −262 to 262−1.

Floating-point numbers

Floating-point values are numbers in floating-point representation. The current implementation uses double-precision floating-point numbers conforming to the IEEE 754 standard, with 53 bits of mantissa and an exponent ranging from −1022 to 1023.

Characters

Character values are represented as 8-bit integers between 0 and 255. Character codes between 0 and 127 are interpreted following the ASCII standard. The current implementation interprets character codes between 128 and 255 following the ISO 8859-1 standard.

Character strings

String values are finite sequences of characters. The current implementation supports strings containing up to 224 − 5 characters (16777211 characters); on 64-bit platforms, the limit is 257 − 9.

2.2  Tuples

Tuples of values are written (v1,, vn), standing for the n-tuple of values v1 to vn. The current implementation supports tuple of up to 222 − 1 elements (4194303 elements).

2.3  Records

Record values are labeled tuples of values. The record value written { field1 = v1;;  fieldn = vn } associates the value vi to the record field fieldi, for i = 1 … n. The current implementation supports records with up to 222 − 1 fields (4194303 fields).

2.4  Arrays

Arrays are finite, variable-sized sequences of values of the same type. The current implementation supports arrays containing up to 222 − 1 elements (4194303 elements) unless the elements are floating-point numbers (2097151 elements in this case); on 64-bit platforms, the limit is 254 − 1 for all arrays.

2.5  Variant values

Variant values are either a constant constructor, or a non-constant constructor applied to a number of values. The former case is written constr; the latter case is written constr (v1, ... , vn ), where the vi are said to be the arguments of the non-constant constructor constr. The parentheses may be omitted if there is only one argument.

The following constants are treated like built-in constant constructors:

ConstantConstructor
falsethe boolean false
truethe boolean true
()the “unit” value
[]the empty list

The current implementation limits each variant type to have at most 246 non-constant constructors and 230−1 constant constructors.

2.6  Polymorphic variants

Polymorphic variants are an alternate form of variant values, not belonging explicitly to a predefined variant type, and following specific typing rules. They can be either constant, written `tag-name, or non-constant, written `tag-name(v).

2.7  Functions

Functional values are mappings from values to values.

2.8  Objects

Objects are composed of a hidden internal state which is a record of instance variables, and a set of methods for accessing and modifying these variables. The structure of an object is described by the toplevel class that created it.