Jalog Language Reference

Contents

Lexical Structure
Syntax
Notes on Semantics

Lexical Structure

A program consists of tokens. Whitespace may appear between tokens and is sometimes required to separate them.

Character Set

Jalog uses the UTF-8 character set. The ASCII character set is sufficient for writing a program.

Whitespace

Whitespace (spaces, tabs, newlines and comments) is allowed between tokens but not within tokens. However, whitespace inside a string is part of the token and is not treated as whitespace.

Comments

Block comments

Everything between the character combinations "/*" and "*/" is considered a comment.

Example:

  /* This
     is
     comment */

End-line comments

End-line comments start with a percent sign "%". Everything after it until the end of the line is considered a comment.

Example:

       write("This is program"), % This is comment

Tokens

Each token is either a simple token or of one of the following types: INT, REAL, CHAR, STRING, NAME, VARIABLE_NAME, and EOF.

Simple Tokens

Token Image
COMMA ,
SEMICOLON ;
IFSYM :-
POINT .
EQ =
STAR *
MINUS -
LPAR (
RPAR )
NE <>
GT >
GE >=
LT <
LE <=
PLUS +
SLASH /
LBRAK [
RBRAK ]
CUT !
VBAR |
OPEN _

INT

Integer literal. A decimal literal consists of digits "0"..."9". A hexadecimal literal starts with "$" or "0x" followed by hexadecimal digits "0"..."9" and "a"..."f" (upper-case is also allowed). The range of allowed values is 0...2147483647 or $0...$7FFFFFFF.

Examples:

  0
  42
  2147483647
  $0
  0x2a
  $7FFFffff
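
The INT rule above can be sketched in Java as follows. This is only an illustration of the rule as written, not the Jalog scanner itself; the class name JalogIntSketch and the method parse are hypothetical, and the sketch assumes that only a lower-case "0x" prefix is accepted.

```java
final class JalogIntSketch {
    /**
     * Returns the value of an INT literal as described above,
     * or throws NumberFormatException if the text is not a valid INT.
     */
    static int parse(String text) {
        String digits;
        int radix;
        if (text.startsWith("$")) {            // hexadecimal, "$" prefix
            digits = text.substring(1);
            radix = 16;
        } else if (text.startsWith("0x")) {    // hexadecimal, "0x" prefix
            digits = text.substring(2);
            radix = 16;
        } else {                               // decimal
            digits = text;
            radix = 10;
        }
        String allowed = (radix == 16) ? "[0-9a-fA-F]+" : "[0-9]+";
        if (!digits.matches(allowed)) {
            throw new NumberFormatException("not an INT literal: " + text);
        }
        // Integer.parseInt rejects values above 2147483647 ($7FFFFFFF).
        return Integer.parseInt(digits, radix);
    }
}
```

With this sketch, parse("0x2a") and parse("42") give the same value, and parse("2147483648") is rejected because it is out of range.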

REAL

Floating-point literal. A floating-point literal starts with decimal digits followed by a fraction part, an exponent part, or both. The fraction part consists of a point followed by one or more digits. The exponent part starts with "e" or "E" followed by an optional "+" or "-" sign and decimal digits. The largest positive finite literal is 1.7976931348623157e308.

Examples:

  0.0
  4.94065645841247e-324
  6.62607004E-34
  0.5
  2.718281828459045
  3.141592653589793
  42.0
  9.4605284E15
  1.7976931348623157e308
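
As a rough check of the REAL rule, the shape of a literal can be written as a regular expression. The following Java sketch is an illustration under that reading, not the actual scanner; the names JalogRealSketch and isReal are hypothetical. It also rejects literals that overflow to infinity, which matches the stated largest finite literal.

```java
import java.util.regex.Pattern;

final class JalogRealSketch {
    // digits, then a fraction part, an exponent part, or both
    private static final Pattern REAL = Pattern.compile(
        "[0-9]+(?:\\.[0-9]+(?:[eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)");

    /** True if the text has the form of a REAL literal and its value is finite. */
    static boolean isReal(String text) {
        if (!REAL.matcher(text).matches()) {
            return false;
        }
        double value = Double.parseDouble(text);
        return !Double.isInfinite(value);
    }
}
```

Under this sketch "42" is not a REAL (it has neither a fraction nor an exponent part), while "42.0" and "9.4605284E15" are.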

CHAR

Character literal. A character literal starts and ends with an apostrophe "'". Between the apostrophes is an ordinary character (printable character other than "'" or "\") or an escape sequence.

Image Explanation
\a Control character Bell (0x07)
\b Control character backspace (0x08)
\t Control character tab (0x09)
\n Control character line feed (0x0A), used as line separator
\v Control character vertical tabulator (0x0B)
\f Control character form feed (0x0C)
\r Control character carriage return (0x0D)
\e Control character escape (0x1B)
\xhh Character represented by the two digit hexadecimal number hh
\uhhhh Character represented by the four digit hexadecimal number hhhh
\Uhhhhhhhh Character represented by the eight digit hexadecimal number hhhhhhhh
\ddd Character represented by the three digit decimal number ddd
\\ Backslash \
\' Apostrophe '
\" Quote "

A char variable can contain a 16-bit value, so values from '\u0000' to '\uFFFF' are available. Values starting from '\U00010000' cannot be stored in char variables but can be used in strings.

Characters are encoded in Unicode. For code charts, see e.g. the Unicode Character Code Charts.

Examples:

  ' '
  '\n'
  '\''
  'a'
  '\x07'
  '\176'
  '\u0110'
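
The escape table above can be sketched as a small decoder in Java. This is an illustration only; the names JalogCharSketch, decode and fitsInChar are hypothetical. The sketch assumes that hexadecimal digits in escapes may be upper-case or lower-case, and it does not check that an ordinary character is printable. Note that "\ddd" is decimal here, so '\176' is character 176, not an octal value as it would be in Java.

```java
final class JalogCharSketch {
    // One-letter escapes and their character codes, in matching order.
    private static final String SIMPLE = "abtnvfre\\'\"";
    private static final int[] SIMPLE_CODES =
        {0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x1B, '\\', '\'', '"'};

    /** Decodes the text between the apostrophes of a CHAR literal to a code point. */
    static int decode(String body) {
        // Ordinary character: anything but an apostrophe or a backslash.
        if (body.length() == 1 && body.charAt(0) != '\\' && body.charAt(0) != '\'') {
            return body.charAt(0);
        }
        // \a, \b, \t, ... and the escaped \\, \' and \"
        if (body.length() == 2 && body.charAt(0) == '\\') {
            int index = SIMPLE.indexOf(body.charAt(1));
            if (index >= 0) {
                return SIMPLE_CODES[index];
            }
        }
        // \xhh, \uhhhh, \Uhhhhhhhh (hexadecimal)
        if (body.matches("\\\\x[0-9a-fA-F]{2}")
                || body.matches("\\\\u[0-9a-fA-F]{4}")
                || body.matches("\\\\U[0-9a-fA-F]{8}")) {
            long value = Long.parseLong(body.substring(2), 16);
            if (value <= Character.MAX_CODE_POINT) {
                return (int) value;
            }
        }
        // \ddd (three decimal digits)
        if (body.matches("\\\\[0-9]{3}")) {
            return Integer.parseInt(body.substring(1));
        }
        throw new IllegalArgumentException("not a CHAR literal body: " + body);
    }

    /** True if the code point fits in a 16-bit char variable. */
    static boolean fitsInChar(int codePoint) {
        return codePoint >= 0 && codePoint <= 0xFFFF;
    }
}
```

The argument of decode is the literal without its surrounding apostrophes, for example the two characters backslash and "n" for '\n'.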

STRING

String literal. A string literal starts and ends with a quote """. Between the quotes is a (possibly empty) sequence of characters. The characters in a string are represented in the same way as in a char literal, except that a quote must be represented using the escape sequence "\"" and an apostrophe does not need escaping. Characters in the range '\U00010000' .. '\U0010FFFF' are stored as two surrogate characters.

A string can be split into several consecutive string literals, which may be separated by whitespace.

Examples:

  ""
  "a"
  "He said \"Hello world.\""
  "The better angel is a man right fair,\nThe worser spirit a woman color'd ill."
  "Shall we see again? \U0001F642"
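
The surrogate storage mentioned above can be observed in plain Java, whose strings use the same 16-bit representation. The sketch below is an illustration in Java, not Jalog code; the class SurrogateSketch and its method are hypothetical names.

```java
final class SurrogateSketch {
    /** Builds a one-character string from a Unicode code point. */
    static String fromCodePoint(int codePoint) {
        // Character.toChars returns one char for values up to U+FFFF
        // and a surrogate pair for values from U+10000 upward.
        return new String(Character.toChars(codePoint));
    }
}
```

For U+1F642, the character used in the last example above, fromCodePoint returns a string of length 2 consisting of the surrogates '\uD83D' and '\uDE42'.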

NAME

Names can be used as data values, functors for compound data, and predicate names.

A name must start with a lower-case letter "a"..."z". The other characters can be lower-case or upper-case letters, digits, or underscores "_". Some names are reserved for library predicates, functions, or operators.

Examples:

  x
  master
  isValid
  is_valid
  find_2nd_best

VARIABLE_NAME

A variable name must start with an upper-case letter "A"..."Z" or an underscore "_". The other characters can be lower-case or upper-case letters, digits, or underscores.

A special case is the one-character variable name "_". Unlike other variable names, each occurrence of it refers to a separate dummy variable.

Examples:

  _
  X
  Index
  TotalCount
  _result
  Row_number
  BestOf3
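
The NAME and VARIABLE_NAME rules differ only in the first character, which the following Java sketch makes explicit. It is an illustration of the rules as stated, with hypothetical names (JalogIdentSketch, isName, isVariableName, isAnonymous), and it does not model reserved names.

```java
final class JalogIdentSketch {
    /** NAME: a lower-case letter, then letters, digits or underscores. */
    static boolean isName(String text) {
        return text.matches("[a-z][A-Za-z0-9_]*");
    }

    /** VARIABLE_NAME: an upper-case letter or underscore, then letters, digits or underscores. */
    static boolean isVariableName(String text) {
        return text.matches("[A-Z_][A-Za-z0-9_]*");
    }

    /** The one-character name "_", where each occurrence is a separate dummy variable. */
    static boolean isAnonymous(String text) {
        return text.equals("_");
    }
}
```

For example, isName("find_2nd_best") and isVariableName("_result") both hold, while "Index" is a variable name and not a name.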

EOF

Physical end-of-file.

Syntax

NOTE: The main differences from other Prolog syntaxes:

  1. Jalog does not support operator definitions.
  2. Arithmetic, including numeric literals, resembles that of Java.
  3. String and character literals resemble those of Java.
  4. Jalog has no "is" operator. The equal sign = is used to store the value of an expression in a variable.

Backus-Naur form is used here to present the syntax. Names in CAPITAL letters refer to tokens.

  <program> ::= <clause> <program>
            | EOF

  <clause> ::= <sentence> POINT 

  <sentence> ::= IFSYM <body> 
            | <head>
            | <head> IFSYM <body>
            
  <head> ::= <structure>

  <body> ::= <formula>
            | <body> COMMA <formula>
            
  <formula> ::= <addend>
            | <addend> EQ <addend>
            | <addend> NE <addend>
            | <addend> GT <addend>
            | <addend> GE <addend>
            | <addend> LT <addend>
            | <addend> LE <addend>
            
  <addend> ::= PLUS <factor>
            | MINUS <factor>
            | <factor>
            | <addend> PLUS <factor>
            | <addend> MINUS <factor>
  
  <factor> ::= <primary>
            | <factor> STAR <primary>
            | <factor> SLASH <primary>

  <primary> ::= VARIABLE_NAME
            | OPEN
            | INT
            | REAL
            | CHAR
            | CUT
            | LPAR <sentence> RPAR
            | <string>
            | <list>
            | <structure>

  <string> ::= STRING
            | <string> STRING

  <list> ::= LBRAK RBRAK
            | LBRAK <elements> VBAR <formula> RBRAK
            | LBRAK <elements> RBRAK

  <elements> ::= <formula>
            | <elements> COMMA <formula>

  <structure> ::= NAME LPAR RPAR
            | NAME LPAR <arguments> RPAR
            | NAME

  <arguments> ::= <formula>
            | <arguments> COMMA <formula>
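
The <addend> and <factor> rules encode the usual precedence: "*" and "/" bind tighter than "+" and "-", and both levels associate to the left. The Java sketch below illustrates this with a small recursive-descent evaluator. It is not the Jalog parser; the class AddendSketch is hypothetical, <primary> is restricted to decimal integers, and integer division in the Java style is an assumption.

```java
final class AddendSketch {
    private final String src;
    private int pos = 0;

    private AddendSketch(String src) {
        this.src = src;
    }

    /** Evaluates an <addend> whose primaries are decimal integers. */
    static long eval(String text) {
        AddendSketch parser = new AddendSketch(text);
        long value = parser.addend();
        parser.skipSpaces();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + parser.pos);
        }
        return value;
    }

    // <addend> ::= PLUS <factor> | MINUS <factor> | <factor>
    //            | <addend> PLUS <factor> | <addend> MINUS <factor>
    private long addend() {
        long value;
        if (peek('+')) {
            pos++;
            value = factor();
        } else if (peek('-')) {
            pos++;
            value = -factor();
        } else {
            value = factor();
        }
        while (peek('+') || peek('-')) {   // left recursion written as a loop
            char op = src.charAt(pos++);
            long right = factor();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // <factor> ::= <primary> | <factor> STAR <primary> | <factor> SLASH <primary>
    private long factor() {
        long value = primary();
        while (peek('*') || peek('/')) {
            char op = src.charAt(pos++);
            long right = primary();
            value = (op == '*') ? value * right : value / right;
        }
        return value;
    }

    // <primary> restricted to a decimal INT for this sketch
    private long primary() {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("digit expected at " + pos);
        }
        return Long.parseLong(src.substring(start, pos));
    }

    private boolean peek(char c) {
        skipSpaces();
        return pos < src.length() && src.charAt(pos) == c;
    }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') {
            pos++;
        }
    }
}
```

Because a leading sign applies to a whole <factor>, eval("-2*3") negates the product 2*3, and eval("10-4-3") groups as (10-4)-3.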

Notes on Semantics

For a general idea, read the short Wikipedia article "Prolog syntax and semantics".

Differences between Jalog and traditional Prolog

The main differences between traditional Prolog and Jalog are:

  1. The equal sign = unifies the left-hand and right-hand side expressions.
  2. Arithmetic expressions are replaced by their values in unification. This happens when the = sign is used or a predicate is called.
  3. When backtracking to a database query, modifications made after the previous query can be accessed.
  4. Unification is done mainly as explained in section 2.1 "Unification" of Learn Prolog Now!. The biggest difference is that arithmetic expressions of known values are evaluated.
  5. Identifying and handling cyclic terms is not supported.
  6. A string is not a list. Instead, it is a primitive data type.
  7. Single quotes do not indicate an atom but a character literal, as in Java.