Description
The Cocktail Toolbox is a set of program generators or compiler construction tools for nearly all phases of a compiler. The compiler construction tools support the automatic generation of compilers for imperative programming languages. The design goals for this toolbox were practical usability, significantly reduced construction effort for compilers, and high quality of the generated compilers. Especially with respect to efficiency the tools are competitive to programming by hand. The tools can generate compiler modules in the languages C, C++, Modula-2, or Java. Many realistic applications demonstrate the excellent performance of the tools and show that Cocktail allows the construction of compilers and language processing software in production quality.
The Cocktail Toolbox runs on all variants of Linux, Unix and
Windows (9x, ME, NT, 2000, XP, Vista, 7, 8).
The name Cocktail stands for COmpiler Compiler ToolkIt KArLsruhe.
Tools
The Cocktail Toolbox contains the following main tools:
Rex | generator for lexical analyzers |
Lark | LR(1) and LALR(2) parser generator with backtracking and predicates |
Ell | LL(1) parser generator |
Ast | generator for abstract syntax trees |
Ag | generator for attribute evaluators |
Puma | transformation of attributed trees using pattern matching |
Reuse | library of reusable modules for compiler construction |
Distribution
Two types of distribution of the Cocktail Toolbox are available: CD-ROM and ESD (Electronic Software Distribution).
A CD-ROM contains:
- source code in ANSI C
- binary executables for a few common platforms such as Linux and Microsoft Windows (9x, ME, NT, 2000, XP, Vista, 7, 8)
- manuals and documentation in the formats PDF, Postscript, and text
- example specifications (grammars) for: Ada, C, Minilax, Modula-2, Modula-3, Oberon, Occam, Pascal, Sather.
ESD (Electronic Software Distribution) is available via downloading a binary executable version from CoCoLab's web site. The items available for downloading are:
- binary executables for Linux or Microsoft Windows (9x, ME, NT, 2000, XP, Vista, 7, 8)
- manuals and documentation in the formats PDF and Postscript
- example specifications (grammars) for: Ada, C, Minilax, Modula-2, Modula-3, Oberon, Occam, Pascal, Sather.
Note, ESD is available for a single user version, only. It requires to purchase a license key from CoCoLab for enabling the software. ESD is available for Linux and Microsoft Windows (9x, ME, NT, 2000, XP, Vista) platforms, only. Source code is not included. Instructions for downloading and installation using ESD can be found on the ESD page.
Updates
Updates for the source version of the Cocktail Toolbox are available for downloading from the Updates page.
Documentation
The manuals and documentation of the Cocktail Toolbox in the formats PDF and Postscript are available for downloading from the Documentation page.
Rex
Rex (Regular EXpression tool) is a generator for scanners or lexical analyzers. Its specifications are based on regular expressions and on arbitrary semantic actions written in one of the implementation languages C, C++, Modula-2, or Java. When the context has to be considered in order to unambiguously recognize a token the right context can be specified by an additional regular expression and the left context can be handled by so-called start states. The generated scanners automatically compute the line and column position of the tokens and they contain a mechansim for include files. Identifiers and keywords can be normalized efficiently to upper case or lower case letters. The scanners are table-driven and run very fast.
Lark
Lark is a generator for LR(1) and LALR(2) parsers or syntax analyzers. It accepts grammars written in BNF notation. The grammars may be augmented with semantic actions formulated as statements of the implementation language. The parser generator provides a mechanism for S-attribution, that is synthesized attributes can be computed during parsing. In case of LR-conflicts unlike other tools Lark provides not only information about an internal state consisting of a set of items but it prints a derivation tree which is much more useful to analyze the problem. Conflicts can be resolved by specifying precedence and associativity of operators and productions as well as by using syntactic and semantic predicates. The generated parsers include automatic error recovery, error messages, and error repair. The parsers are table-driven and run very fast. Parsers can be generated in the implementation languages C, C++, Modula-2, or Java.
More Lark features are:
- fast generation of useful information about LR conflicts
- optional tracing of parsing steps during run-time
- support of named attributes as well as the $i notation
- semantic predicates control parsing by conditions
- support for backtracking parsing
- token declarations with cost and representation specs improve error handling
- parsers can have several start symbols
- Yacc input is accepted additionally to Lark input
Ell
Ell is a generator for LL(1) parsers or syntax analyzers. It accepts grammars written in extended BNF notation. During parsing an L-attribution can be evaluated. The generated parsers include automatic error recovery, error messages, and error repair like Lark. The parsers are implemented following the recursive descent method and they run very fast. The possible implementation languages are C, C++, or Modula-2. The input languages of the tools are improved versions of the Lex and Yacc inputs. All tools also understand Lex and Yacc syntax with the help of preprocessors.
Ast
Ast is a generator for program modules that define the structure of abstract syntax trees and provide general tree manipulating procedures. The defined trees may be decorated with attributes of arbitrary data types. Besides trees, graphs can be handled, too. The structure of the trees is specified by a formalism based on context-free grammars. The generated module includes procedures to construct and destroy trees, to read and write trees from (to) files, and to traverse trees in some commonly used manners. The mentioned readers and writers process text as well as binary tree representations. All procedures work for graphs as well. The advantages of this approach are: record aggregates are provided which allow a concise notation for node creation. It is possible to build trees by writing terms. An extension mechanism avoids chain rules and allows, for example lists with elements of different types. Input/output procedures for trees and complete graphs are provided. The output procedures and the interactive graph browser facilitate the debugging phase as they operate on a readable level and know the data structure. The user does not have to care about algorithms for traversing graphs. He/she is freed from the task of writing large amounts of relatively simple code. All of these features significantly increase programmer productivity.
Ast features:
- generates abstract data types (program modules) that handle trees
- the trees may be attributed
- besides trees graphs are handled as well
- nodes may be associated with arbitrary many attributes of arbitrary types
- specifications are based on extended context-free grammars
- common notation for concrete and abstract syntax
- common notation for attributed trees and graphs
- an extension mechanism provides single and multiple inheritance
- trees are stored as linked records
- generates efficient program modules
- generates modules in C, C++, Modula-2, or Java
- provides many tree operations (procedures):
- node constructors combine aggregate notation and storage management
- text graph reader and writer
- binary graph reader and writer
- processing of lists such as reversal or forall
- top down and bottom up traversal
- graphic interactive graph browser
Ag
Ag is a generator for attribute evaluators. It processes ordered attribute grammars (OAGs), well-defined attribute grammars (WAGs) as well as higher order attribute grammars (HAGs). It is oriented towards abstract syntax trees. Therefore the tree structure is fully known. The terminals and nonterminals may have an arbitrary number of attributes of any type. This includes tree-valued attributes. Ag allows attributes local to rules and offers an extension mechanism which provides single inheritance as well as multiple inheritance for attributes and for attribute computations. It also allows the elimination of chain rules. The attribute computations are expressed in the implementation language and should be written in a functional style. It is possible to call external functions of separately compiled modules. Non-functional statements and side-effects are possible but require careful consideration. The syntax of the specification language is designed to support compact, modular, and readable documents. An attribute grammar can consist of several modules where the context-free grammar is specified only once. There are shorthand notations for copy rules and threaded attributes which allow the user to omit many trivial attribute computations. The generated evaluators are very run-time efficient because they are directly coded using recursive procedures. Attribute storage for OAGs is optimized by implementing attributes as local variables and procedure parameters whenever their lifetime is contained within one visit.
Ag features:
- processes ordered attribute grammars (OAGs)
- processes well-defined attribute grammars (WAGs)
- processes higher order attribute grammars (HAGs)
- allows tree-valued attributes
- allows the use of subtrees as attribute values
- allows the creation of parts of the tree during evaluation time
- allows read access to non-local attributes
- operates on abstract syntax trees
- cooperates with the generator for abstract syntax trees Ast
- the tree structure is fully known
- terminals and nonterminals may have attributes
- allows attributes of any type
- differentiates input and output attributes
- allows the elimination of chain rules
- allows attributes local to rules
- offers single and multiple inheritance for attributes and attribute computations
- attributes are denoted by unique selector names instead of nonterminals with subscripts
- the tool is largely independent of the implementation language
- attribute computations are expressed in the implementation language
- attribute computations are written in a functional style
- attribute computations can call external functions
- non-functional statements and side-effects are possible
- allows to write compact, modular, and readable specifications
- attribute grammars can consist of several modules
- the context-free grammar is specified only once
- checks an attribute grammar for completeness of the attribute computations
- checks for unused attributes
- checks an attribute grammar for the classes WAG, SNC, DNC, OAG, LAG, and SAG
- generates efficient evaluators
- the evaluators are directly coded using recursive procedures
- the implementation of the trees is efficient
- optimizes attribute storage (for OAG evaluators)
- attributes may be implemented as local variables in procedures and passed as parameters
- generates evaluators in C, C++, Modula-2, or Java
Puma
Puma is a tool supporting the transformation and manipulation of attributed trees. It is based on pattern-matching and recursive procedures. Puma cooperates with the generator for abstract syntax trees Ast, which already supports the definition, creation, and storage of attributed trees. Puma adds a concise notation for the analysis and synthesis of trees. The pattern-matching capability facilitates the specification of decision tables. Puma provides implicit declarations of variables, strong type checking with respect to trees, and checks the single assignment restriction for variables. The output is the source code of a program module written in one of the implementation languages C, C++, Modula-2, or Java. This module implements the specified transformation routines. It can be integrated easily with arbitrary program code. The generated routines are optimized with respect to common subexpression elimination and tail recursion elimination.
References
- J. Grosch, Generators for High-Speed Front-Ends, LNCS, 371, 81-92 (Oct. 1988), Springer Verlag.
- W. M. Waite, J. Grosch and F. W. Schröer, Three Compiler Specifications, GMD-Studie Nr. 166, GMD Forschungsstelle an der Universität Karlsruhe, Aug. 1989.
- J. Grosch, Efficient Generation of Lexical Analysers, Software-Practice & Experience, 19, 1089-1103 (Nov. 1989).
- J. Grosch, Efficient and Comfortable Error Recovery in Recursive Descent Parsers, Structured Programming, 11, 129-140 (1990).
- J. Grosch, H. Emmelmann, A Tool Box for Compiler Construction, LNCS, 477, 106-116 (Oct. 1990), Springer Verlag.
- J. Grosch, Object-Oriented Attribute Grammars, in: Proceedings of the Fifth International Symposium on Computer and Information Sciences (ISCIS V) (Eds. A. E. Harmanci, E. Gelenbe), Cappadocia, Nevsehir, Turkey, 807-816, (Oct. 1990).
- J. Grosch, Lalr - a Generator for Efficient Parsers, Software-Practice & Experience, 20, 1115-1135 (Nov. 1990).
- J. Grosch, Tool Support for Data Structures, Structured Programming, 12, 31-38 (1991).
- U. Aszmann, H. Emmelmann, J. Grosch, Stein auf Stein - Cocktail: Eine Compiler-Compiler-Toolbox, iX 2/1992, 123-127.
- J. Grosch, Transformation of Attributed Trees Using Pattern Matching, LNCS, 641, 1-15 (Oct. 1992), Springer Verlag.
- R. M. Bates, Examining the Cocktail Toolbox, Dr. Dobb's Journal, #245 March 1996, 78-82, 95-96.
- J. Grosch, Are Attribute Grammars Used in Industry?, Proceedings of the Second Workshop on Attribute Grammars and their Application (WAGA 99) (Eds. D. Parigot, M. Mernik), Amsterdam, 1-15, (March 1999).