# PackCC #

## Overview ##

**PackCC** is a parser generator for C.
Its main features are as follows:

- Generates your parser in C from a grammar described in a **PEG**,
- Gives your parser great efficiency by **packrat parsing**,
- Supports direct and indirect **left-recursive** grammar rules.

The grammar of your parser can be described in a **PEG** ([Parsing Expression Grammar](http://en.wikipedia.org/wiki/Parsing_expression_grammar)).
A PEG is a [top-down parsing language](http://en.wikipedia.org/wiki/Top-down_parsing_language),
and is similar to [regular-expression](http://en.wikipedia.org/wiki/Regular_expression) grammar.
Compared with a bottom-up parsing language, such as the one used by Yacc, a PEG is much more intuitive and cannot be ambiguous.
A PEG does not require tokenization to be a separate step, and tokenization rules can be written in the same way as any other grammar rules.

Your generated parser can parse inputs very efficiently by **packrat parsing**.
Packrat parsing is the [recursive descent parsing](http://en.wikipedia.org/wiki/Recursive_descent_parser) algorithm
accelerated using [memoization](http://en.wikipedia.org/wiki/Memoization).
By using packrat parsing, any input can be parsed in linear time.
Without it, however, the resulting parser could exhibit exponential time performance in the worst case due to the unlimited look-ahead capability.

Unlike common packrat parsers, PackCC can support direct and indirect **left-recursive** grammar rules.
This powerful feature enables you to describe your language grammar in a much simpler way.
<small>(The algorithm is based on the paper [*"Packrat Parsers Can Support Left Recursion"*](http://www.cs.ucla.edu/~todd/research/pub.php?id=pepm08)
authored by A. Warth, J. R. Douglass, and T. Millstein.)</small>

Some additional features are as follows:

- Thread-safe and reentrant,
- Supports UTF-8 multibyte characters (version 1.4.0 or later),
- Generates parser source code that is easier to understand,
- Consists of just a single compact source file,
- Distributed under the MIT license (not under a certain contagious license!).

The generated code is beautified and as easy to understand as possible.
It does use lots of *goto* statements, but the control flow is much more traceable
than the *goto* spaghetti storms generated by Yacc or other parser generators.
This feature is irrelevant to most users, but it helps PackCC developers debug PackCC itself.

PackCC itself is under the MIT license, but you can distribute your generated code under any license you like.

## Installation ##

You can obtain the executable `packcc` by compiling [`src/packcc.c`](src/packcc.c) using your favorite C compiler.
For convenience, build environments for GCC, Clang, and Microsoft Visual Studio are provided under the [`build`](build) directory.

### Using GCC ###

#### Other than MinGW ####

`packcc` will be built in both directories `build/gcc/debug/bin` and `build/gcc/release/bin` using `gcc` by executing the following commands:

```
cd build/gcc
make
make check  # bats-core and uncrustify are required (see tests/README.md)
```

`packcc` in the directory `build/gcc/release/bin` is suitable for practical use.

#### MinGW ####

`packcc` will be built in both directories `build/mingw-gcc/debug/bin` and `build/mingw-gcc/release/bin` using `gcc` by executing the following commands:

```
cd build/mingw-gcc
make
make check  # bats-core and uncrustify are required (see tests/README.md)
```

`packcc` in the directory `build/mingw-gcc/release/bin` is suitable for practical use.

### Using Clang ###

#### Other than MinGW ####

`packcc` will be built in both directories `build/clang/debug/bin` and `build/clang/release/bin` using `clang` by executing the following commands:

```
cd build/clang
make
make check  # bats-core and uncrustify are required (see tests/README.md)
```

`packcc` in the directory `build/clang/release/bin` is suitable for practical use.

#### MinGW ####

`packcc` will be built in both directories `build/mingw-clang/debug/bin` and `build/mingw-clang/release/bin` using `clang` by executing the following commands:

```
cd build/mingw-clang
make
make check  # bats-core and uncrustify are required (see tests/README.md)
```

`packcc` in the directory `build/mingw-clang/release/bin` is suitable for practical use.

### Using Microsoft Visual Studio ###

You have to install Microsoft Visual Studio 2019 in advance.
After that, you can build `packcc.exe` by following these instructions:
- Open the solution file `build\msvc\msvc.sln`,
- Select a preferred solution configuration (*Debug* or *Release*) and a preferred solution platform (*x64* or *x86*),
- Invoke the *Build Solution* menu item.

`packcc.exe` will appear in the `build\msvc\XXX\YYY` directory.
Here, `XXX` is `x64` or `x86`, and `YYY` is `Debug` or `Release`.
`packcc.exe` in the directory `build\msvc\XXX\Release` is suitable for practical use.

## Usage ##

### Command ###

You must prepare a PEG source file (see the following section).
Suppose the file is named `example.peg`.

```
packcc example.peg
```

By running this, the parser source files `example.h` and `example.c` are generated.

If no PEG file name is specified, the PEG source is read from the standard input, and `-.h` and `-.c` are generated.

The base name of the parser source files can be changed with the `-o` option.

```
packcc -o parser example.peg
```

By running this, the parser source files `parser.h` and `parser.c` are generated.

If you want to disable UTF-8 support, specify the command line option `-a` or `--ascii` (version 1.4.0 or later).

If you want to insert `#line` directives in the generated source and header files, specify the command line option `-l` or `--lines` (version 1.7.0 or later).
This helps to trace compilation errors in the generated source and header files back to the code written in the PEG source file.

If you want to confirm the version of the `packcc` command, execute the following command.

```
packcc -v
```

### Syntax ###

A grammar consists of a set of named rules.
A rule definition can be split into multiple lines.

**_rulename_ `<-` _pattern_**

The _rulename_ is the name of the rule to define.
The _pattern_ is a text pattern that contains one or more of the following elements.

**_rulename_**

The element stands for the entire pattern in the rule with the name given by _rulename_.

**_variable_`:`_rulename_**

The element stands for the entire pattern in the rule with the name given by _rulename_.
The _variable_ is an identifier associated with the semantic value returned from the rule by assigning to `$$` in its action.
The identifier can be referred to in subsequent actions as a variable.
An example is shown below.

```
term <- l:term _ '+' _ r:factor { $$ = l + r; }
```

A variable identifier must consist of letters (both uppercase and lowercase), digits, and underscores.
The first character must be a letter.
The reserved keywords in C cannot be used.

**_sequence1_ `/` _sequence2_ `/` ... `/` _sequenceN_**

Each _sequence_ is tried in turn until one of them matches, at which time matching for the overall pattern succeeds.
If no _sequence_ matches, then the matching for the overall pattern fails.
The slash operator (`/`) has the lowest precedence.
An example is shown below.

```
'foo' rule1 / 'bar'+ [0-9]? / rule2
```

This pattern first tries to match the first sequence (`'foo' rule1`).
If it succeeds, then the overall pattern matching succeeds and ends without evaluating the subsequent sequences.
Otherwise, it tries to match the next sequence (`'bar'+ [0-9]?`).
If it succeeds, then the overall pattern matching succeeds and ends without evaluating the subsequent sequence.
Finally, it tries to match the last sequence (`rule2`).
If it succeeds, then the overall pattern matching succeeds.
Otherwise, the overall pattern matching fails.

**`'`_string_`'`**

A character or string enclosed in single quotes is matched literally.
The ANSI C escape sequences are recognized within the characters.
The Unicode escape sequences (e.g. `\u20AC`) are also recognized, including surrogate pairs,
if the command line option `-a` is not specified (version 1.4.0 or later).
An example is shown below.

```
'foo bar'
```

**`"`_string_`"`**

A character or string enclosed in double quotes is matched literally.
The ANSI C escape sequences are recognized within the characters.
The Unicode escape sequences (e.g. `\u20AC`) are also recognized, including surrogate pairs,
if the command line option `-a` is not specified (version 1.4.0 or later).
An example is shown below.

```
"foo bar"
```

**`[`_character class_`]`**

A set of characters enclosed in square brackets matches any single character from the set.
The ANSI C escape sequences are recognized within the characters.
The Unicode escape sequences (e.g. `\u20AC`) are also recognized, including surrogate pairs,
if the command line option `-a` is not specified (version 1.4.0 or later).
If the set begins with an up-arrow (`^`), the set is negated (the element matches any character not in the set).
Any pair of characters separated with a dash (`-`) represents the range of characters from the first to the second, inclusive.
Examples are shown below.

```
[abc]
[^abc]
[a-zA-Z0-9_]
```

**`.`**

A dot (`.`) matches any single character.
Note that the only time this fails is at the end of input, where there is no character to match.

**_element_ `?`**

The _element_ is optional.
If present on the input, it is consumed and the match succeeds.
If not present on the input, no text is consumed and the match succeeds anyway.

**_element_ `*`**

The _element_ is optional and repeatable.
If present on the input, one or more occurrences of the _element_ are consumed and the match succeeds.
If no occurrence of the _element_ is present on the input, the match succeeds anyway.

**_element_ `+`**

The _element_ is repeatable.
If present on the input, one or more occurrences of the _element_ are consumed and the match succeeds.
If no occurrence of the _element_ is present on the input, the match fails.

**`&` _element_**

The predicate succeeds only if the _element_ can be matched.
The input text scanned while matching _element_ is not consumed from the input and remains available for subsequent matching.

**`!` _element_**

The predicate succeeds only if the _element_ cannot be matched.
The input text scanned while matching _element_ is not consumed from the input and remains available for subsequent matching.
A popular idiom is the following, which matches the end of input, after the last character of the input has already been consumed.

```
!.
```

**`(` _pattern_ `)`**

Parentheses are used for grouping (modifying the precedence of the _pattern_).

**`<` _pattern_ `>`**

Angle brackets are used for grouping (modifying the precedence of the _pattern_) and text capturing.
The captured text is numbered in evaluation order, and can be referred to later using `$1`, `$2`, etc.

**`$`_n_**

A dollar (`$`) followed by a positive integer represents a text previously captured.
The positive integer corresponds to the order of capturing.
A `$1` represents the first captured text.
Examples are shown below.

```
< [0-9]+ > 'foo' $1
```

This matches `0foo0`, `123foo123`, etc.

```
'[' < '='* > '[' ( !( ']' $1 ']' ) . )* ( ']' $1 ']' )
```

This matches `[[`...`]]`, `[=[`...`]=]`, `[==[`...`]==]`, etc.

**`{` _c source code_ `}`**

Curly braces surround an action.
The action is arbitrary C source code to be executed at the end of matching.
Any braces within the action must be properly nested.
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.
One or more actions can be inserted at any place between elements in the pattern.
Actions are not executed where matching fails.

```
[0-9]+ 'foo' { puts("OK"); } 'bar' / [0-9]+ 'foo' 'baz'
```

In this example, if the input is `012foobar`, the action `{ puts("OK"); }` is executed, but if the input is `012foobaz`,
the action is not executed.
All matched actions are guaranteed to be executed only once.

In the action, the C source code can use the predefined variables below.

- **`$$`**
    The output variable, to which the result of the rule is stored.
    The data type is the one specified by `%value`.
    The default data type is `int`.
- **`auxil`**
    The user-defined data that has been given via the API function `pcc_create()`.
    The data type is the one specified by `%auxil`.
    The default data type is `void *`.
- _variable_
    The result of another rule that has already been evaluated.
    If the rule has not been evaluated, the value is guaranteed to be zero-cleared (version 1.7.1 or later).
    The data type is the one specified by `%value`.
    The default data type is `int`.
- **`$`**_n_
    The string of the captured text.
    The _n_ is the positive integer that corresponds to the order of capturing.
    The variable `$1` holds the string of the first captured text.
- **`$`**_n_**`s`**
    The start position in the input of the captured text, inclusive.
    The _n_ is the positive integer that corresponds to the order of capturing.
    The variable `$1s` holds the start position of the first captured text.
- **`$`**_n_**`e`**
    The end position in the input of the captured text, exclusive.
    The _n_ is the positive integer that corresponds to the order of capturing.
    The variable `$1e` holds the end position of the first captured text.
- **`$0`**
    The string of the text between the start position in the input at which the rule pattern begins to match
    and the current position in the input at which the element immediately before the action finishes matching.
- **`$0s`**
    The start position in the input at which the rule pattern begins to match.
- **`$0e`**
    The current position in the input at which the element immediately before the action finishes matching.

An example is shown below.

```
term <- l:term _ '+' _ r:factor { $$ = l + r; }
factor <- < [0-9]+ >            { $$ = atoi($1); }
_ <- [ \t]*
```

Note that the string data held by `$`_n_ and `$0` are discarded immediately after evaluation of the action.
If the string data are needed after the action, they must be copied into `$$` or `auxil`.
If they need to be copied into `$$`, it is recommended to define a structure as the type of output data using `%value`,
and to copy the necessary string data into one of its member variables.
Similarly, if they need to be copied into `auxil`, it is recommended to define a structure as the type of user-defined data using `%auxil`,
and to copy the necessary string data into one of its member variables.
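
The copying described above can be done with a small helper. The following is a minimal sketch and not part of PackCC: the name `copy_text` is hypothetical, and it assumes that `$`_n_ and `$0` are NUL-terminated C strings (as the `atoi($1)` call in the example above suggests).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not provided by PackCC).
 * Returns a heap-allocated copy of a NUL-terminated string such as $1 or $0,
 * or NULL if the allocation fails. The caller owns the copy and must
 * eventually release it with free(). */
static char *copy_text(const char *text) {
    size_t n = strlen(text) + 1; /* include the terminating NUL */
    char *copy = (char *)malloc(n);
    if (copy != NULL) memcpy(copy, text, n);
    return copy;
}
```

An action could then keep the text alive with something like `{ $$.text = copy_text($1); }`, assuming `%value` names a structure that has a `char *text` member (also a hypothetical name).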

The position values are 0-based; that is, the first position is 0.
The data type is `size_t` (before version 1.4.0, it was `int`).
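
Because the positions are plain 0-based offsets, they can be converted into line numbers for diagnostics. The sketch below is illustrative only: `line_of_offset` is a hypothetical helper, and it assumes that the program keeps the whole input text in memory (for instance, reachable through `auxil`) and that the offset counts bytes from the start of that text.

```c
#include <stddef.h>

/* Hypothetical helper (not provided by PackCC).
 * Converts a 0-based byte offset, such as the value of $0s, into a 1-based
 * line number by counting the newlines that precede the offset.
 * The caller must ensure that offset does not exceed the length of input. */
static size_t line_of_offset(const char *input, size_t offset) {
    size_t line = 1;
    size_t i;
    for (i = 0; i < offset; i++) {
        if (input[i] == '\n') line++;
    }
    return line;
}
```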

**_element_ `~` `{` _c source code_ `}`**

Curly braces following a tilde (`~`) surround an error action.
The error action is arbitrary C source code to be executed at the end of matching only if the preceding _element_ matching fails.
Any braces within the error action must be properly nested.
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.
One or more error actions can be inserted at any place after an element in the pattern.
The tilde operator (`~`) binds less tightly than any other operator except alternation (`/`) and sequencing.
The error action is intended to make error handling and recovery code easier to write.
In the error action, all predefined variables described above are available as well.
Examples are shown below.

```
rule1 <- e1 e2 e3 ~{ error("e[12] ok; e3 has failed"); }
rule2 <- (e1 e2 e3) ~{ error("one of e[123] has failed"); }
```

**`%header` `{` _c source code_ `}`**

The specified C source code is copied verbatim to the C header file before the generated parser API function declarations.
Any braces in the C source code must be properly nested.
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.

**`%source` `{` _c source code_ `}`**

The specified C source code is copied verbatim to the C source file before the generated parser implementation code.
Any braces in the C source code must be properly nested.
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.

**`%common` `{` _c source code_ `}`**

The specified C source code is copied verbatim to both the C header file and the C source file,
before the generated parser API function declarations and the implementation code respectively.
Any braces in the C source code must be properly nested.
Note that braces in directive lines and in comments (`/*`...`*/` and `//`...) are appropriately ignored.

**`%earlyheader` `{` _c source code_ `}`**

**`%earlysource` `{` _c source code_ `}`**

**`%earlycommon` `{` _c source code_ `}`**

Same as `%header`, `%source` and `%common`, respectively.
The only difference is that these directives place the code at the very beginning of the generated file,
before any code or includes generated by PackCC.
This can be useful, for example, when it is necessary to modify the behavior of standard libraries via a macro definition.

**`%value` `"`_output data type_`"`**

The type of output data, which is output as `$$` in each action and can be retrieved from the parser API function `pcc_parse()`,
is changed to the specified one from the default `int`.

**`%auxil` `"`_user-defined data type_`"`**

The type of user-defined data, which is passed to the parser API function `pcc_create()`,
is changed to the specified one from the default `void *`.

**`%prefix` `"`_prefix_`"`**

The prefix of the parser API functions is changed to the specified one from the default `pcc`.

**`#`_comment_**

A comment can be inserted between `#` and the end of the line.

**`%%`**

A double percent `%%` terminates the section for rule definitions of the grammar.
All text following `%%` is copied verbatim to the C source file after the generated parser implementation code.

<small>(The specification is determined by referring to [peg/leg](http://piumarta.com/software/peg/) developed by Ian Piumarta.)</small>

### Macros ###

Several macros are provided to customize the parser.
Macro definitions should be placed in the <u>`%source` section</u> of the PEG source.

```
%source {
#define PCC_GETCHAR(auxil) get_character((auxil)->input)
#define PCC_BUFFERSIZE 1024
}
```

The following macros are available.

**`PCC_GETCHAR(`**_auxil_**`)`**

The function macro to get a character from the input.
The user-defined data passed to the API function `pcc_create()` can be retrieved from the argument _auxil_.
It can be ignored if there is no user-defined data.
This macro must return a character code as an `int`, or `-1` at the end of input.

The default is defined as below.

```C
#define PCC_GETCHAR(auxil) getchar()
```
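
As an illustration of the contract above, the following sketch reads characters from an in-memory string instead of the standard input. The type `string_input_t` and this particular `get_character` are hypothetical and not part of PackCC; the wiring shown in the comment assumes `%auxil "string_input_t *"`.

```c
#include <stddef.h>

/* Hypothetical input source: a NUL-terminated string and a read cursor. */
typedef struct {
    const char *text;
    size_t pos;
} string_input_t;

/* Returns the next byte as a non-negative int, or -1 at the end of input.
 * Going through unsigned char keeps bytes >= 0x80 (e.g. UTF-8 sequences)
 * from becoming negative and being confused with the end marker.
 *
 * Possible wiring in the PEG source, assuming %auxil "string_input_t *":
 *   #define PCC_GETCHAR(auxil) get_character(auxil)
 */
static int get_character(string_input_t *in) {
    unsigned char c = (unsigned char)in->text[in->pos];
    if (c == '\0') return -1; /* end of the string: report end of input */
    in->pos++;
    return (int)c;
}
```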

**`PCC_ERROR(`**_auxil_**`)`**

The function macro to handle a syntax error.
The user-defined data passed to the API function `pcc_create()` can be retrieved from the argument _auxil_.
It can be ignored if there is no user-defined data.
This macro need not return a value.
It may abort the process (by using `exit()` for example) when a fatal error occurs, and can also return normally to deal with warnings.

The default is defined as below.

```C
#define PCC_ERROR(auxil) pcc_error()
static void pcc_error(void) {
    fprintf(stderr, "Syntax error\n");
    exit(1);
}
```
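
Since the macro may return normally, an error handler can record the problem in the user-defined data instead of exiting. The following is a minimal sketch under that assumption; `parse_state_t` and `report_error` are hypothetical names, and the wiring in the comment assumes `%auxil "parse_state_t *"`.

```c
#include <stdio.h>

/* Hypothetical user-defined data that records how many errors were seen. */
typedef struct {
    int error_count;
} parse_state_t;

/* Counts and reports a syntax error, then returns normally so that the
 * calling program can decide what to do once parsing has finished.
 *
 * Possible wiring in the PEG source, assuming %auxil "parse_state_t *":
 *   #define PCC_ERROR(auxil) report_error(auxil)
 */
static void report_error(parse_state_t *state) {
    state->error_count++;
    fprintf(stderr, "Syntax error (%d so far)\n", state->error_count);
}
```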

**`PCC_MALLOC(`**_auxil_**`,`**_size_**`)`**

The function macro to allocate a memory block.
The user-defined data passed to the API function `pcc_create()` can be retrieved from the argument _auxil_.
It can be ignored if there is no user-defined data.
The argument _size_ is the number of bytes to allocate.
This macro must return a pointer to the allocated memory block, or `NULL` if sufficient memory is not available.

The default is defined as below.

```C
#define PCC_MALLOC(auxil, size) pcc_malloc_e(size)
static void *pcc_malloc_e(size_t size) {
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    return p;
}
```

**`PCC_REALLOC(`**_auxil_**`,`**_ptr_**`,`**_size_**`)`**

The function macro to reallocate the existing memory block.
The user-defined data passed to the API function `pcc_create()` can be retrieved from the argument _auxil_.
It can be ignored if there is no user-defined data.
The argument _ptr_ is the pointer to the previously allocated memory block.
The argument _size_ is the new number of bytes to reallocate.
This macro must return a pointer to the reallocated memory block, or `NULL` if sufficient memory is not available.
The contents of the memory block should be left unchanged in any case, even if the reallocation fails.

The default is defined as below.

```C
#define PCC_REALLOC(auxil, ptr, size) pcc_realloc_e(ptr, size)
static void *pcc_realloc_e(void *ptr, size_t size) {
    void *p = realloc(ptr, size);
    if (p == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    return p;
}
```

**`PCC_FREE(`**_auxil_**`,`**_ptr_**`)`**

The function macro to free the existing memory block.
The user-defined data passed to the API function `pcc_create()` can be retrieved from the argument _auxil_.
It can be ignored if there is no user-defined data.
The argument _ptr_ is the pointer to the previously allocated memory block.
This macro need not return a value.

The default is defined as below.

```C
#define PCC_FREE(auxil, ptr) free(ptr)
```

**`PCC_DEBUG(`**_auxil_**`,`**_event_**`,`**_rule_**`,`**_level_**`,`**_pos_**`,`**_buffer_**`,`**_length_**`)`**

The function macro for debugging (version 1.5.0 or later).
Sometimes, especially for complex parsers, it is useful to see exactly how the parser processes the input.
This macro is called on important *events* and allows you to log or display the current state of the parser.
The argument `rule` is a string that contains the name of the currently evaluated rule.
The non-negative integer `level` specifies how deep in the rule hierarchy the parser currently is.
The argument `pos` holds the position from the start of the current context in bytes.
In case of `event == PCC_DBG_MATCH`, the argument `buffer` holds the matched input and `length` is its size.
For other events, `buffer` and `length` indicate a part of the currently loaded input, which is used to evaluate the current rule.

**Caution:** Since version 1.6.0, the first argument _auxil_ is added to this macro.
The user-defined data passed to the API function `pcc_create()` can be retrieved from this argument.

There are currently three supported events:
 - `PCC_DBG_EVALUATE` (= 0) - called when the parser starts to evaluate `rule`
 - `PCC_DBG_MATCH` (= 1) - called when `rule` is matched, at which point `buffer` holds the entire matched string
 - `PCC_DBG_NOMATCH` (= 2) - called when the parser determines that the input does not match the currently evaluated `rule`

A very simple implementation could look like this:

```C
static const char *dbg_str[] = { "Evaluating rule", "Matched rule", "Abandoning rule" };
#define PCC_DEBUG(auxil, event, rule, level, pos, buffer, length) \
    fprintf(stderr, "%*s%s %s @%zu [%.*s]\n", (int)((level) * 2), "", dbg_str[event], rule, pos, (int)(length), buffer)
```

The default is to do nothing:

```C
#define PCC_DEBUG(auxil, event, rule, level, pos, buffer, length) ((void)0)
```

**`PCC_BUFFERSIZE`**

The initial size (the number of characters) of the text buffer.
The text buffer is expanded as needed.
The default is `256`.

**`PCC_ARRAYSIZE`**

The initial size (the number of elements) of the internal arrays other than the text buffer.
The arrays are expanded as needed.
The default is `2`.

### API ###

The parser API consists of only the three simple functions below.

```C
pcc_context_t *pcc_create(void *auxil);
```

Creates a parser context.
This context needs to be passed to the functions below.
The `auxil` can be used to pass user-defined data to be bound to the context.
`NULL` can be specified if there is no user-defined data.

```C
int pcc_parse(pcc_context_t *ctx, int *ret);
```

Parses an input text (from standard input by default) and returns the result in `ret`.
The `ret` can be `NULL` if no output data is needed.
This function returns `0` if no text is left to be parsed, or a nonzero value otherwise.

```C
void pcc_destroy(pcc_context_t *ctx);
```

Destroys the parser context.
All resources allocated in the parser context are released.

The type of output data `ret` can be changed.
If you want to change it to `char *`, specify `%value "char *"` in the PEG source.
The default is `int`.

The type of user-defined data `auxil` can be changed.
If you want to change it to `long`, specify `%auxil "long"` in the PEG source.
The default is `void *`.

The prefix `pcc` can be changed.
If you want to change it to `foo`, specify `%prefix "foo"` in the PEG source.
The default is `pcc`.

After the above settings, the API functions change as shown below.

```C
foo_context_t *foo_create(long auxil);
```

```C
int foo_parse(foo_context_t *ctx, char **ret);
```

```C
void foo_destroy(foo_context_t *ctx);
```

The typical usage of the API functions is shown below.

```C
int ret;
pcc_context_t *ctx = pcc_create(NULL);
while (pcc_parse(ctx, &ret));
pcc_destroy(ctx);
```

## Examples ##

### Desktop calculator ###

A simple example that interactively performs the four basic arithmetic operations on integers is shown here.
Note that **left-recursive** grammar rules are defined in this example.

```
%prefix "calc"

%source {
#include <stdio.h>
#include <stdlib.h>
}

statement <- _ e:expression _ EOL { printf("answer=%d\n", e); }
           / ( !EOL . )* EOL      { printf("error\n"); }

expression <- e:term { $$ = e; }

term <- l:term _ '+' _ r:factor { $$ = l + r; }
      / l:term _ '-' _ r:factor { $$ = l - r; }
      / e:factor                { $$ = e; }

factor <- l:factor _ '*' _ r:unary { $$ = l * r; }
        / l:factor _ '/' _ r:unary { $$ = l / r; }
        / e:unary                  { $$ = e; }

unary <- '+' _ e:unary { $$ = +e; }
       / '-' _ e:unary { $$ = -e; }
       / e:primary     { $$ = e; }

primary <- < [0-9]+ >               { $$ = atoi($1); }
         / '(' _ e:expression _ ')' { $$ = e; }

_      <- [ \t]*
EOL    <- '\n' / '\r\n' / '\r' / ';'

%%
int main() {
    calc_context_t *ctx = calc_create(NULL);
    while (calc_parse(ctx, NULL));
    calc_destroy(ctx);
    return 0;
}
```

### AST builder for Tiny-C ###

You can find a more practical example in the directory [`examples/ast-tinyc`](examples/ast-tinyc).
It builds an AST (abstract syntax tree) from an input source file
written in [Tiny-C](http://www.iro.umontreal.ca/~felipe/IFT2030-Automne2002/Complements/tinyc.c) and dumps the AST.
708