.. _writing_parser_in_c:

=============================================================================
Writing a parser in C
=============================================================================

This section is based on the section "Integrating a new language parser" in "`How
to Add Support for a New Language to Exuberant Ctags (EXTENDING)
<http://ctags.sourceforge.net/EXTENDING.html>`_" in the Exuberant Ctags documentation.

Now suppose that I want to truly integrate compiled-in support for Swine into
ctags.

Registering a parser
-------------------------------------------------
First, I create a new module, ``swine.c``, add one externally visible function
to it, ``extern parserDefinition *SwineParser(void)``, and add its name to the
table in ``parsers.h``. The job of this parser definition function is to create
an instance of the ``parserDefinition`` structure (using ``parserNew()``) and
populate it with information defining how files of this language are recognized,
what kinds of tags it can locate, and the function used to invoke the parser on
the currently open file.

The structure ``parserDefinition`` allows assignment of the following fields:

.. code-block:: c

    struct sParserDefinition {
        /* defined by parser */
        char* name;                    /* name of language */
        kindDefinition* kindTable;     /* tag kinds handled by parser */
        unsigned int kindCount;        /* size of 'kinds' list */
        const char *const *extensions; /* list of default extensions */
        const char *const *patterns;   /* list of default file name patterns */
        const char *const *aliases;    /* list of default aliases (alternative names) */
        parserInitialize initialize;   /* initialization routine, if needed */
        parserFinalize finalize;       /* finalize routine, if needed */
        simpleParser parser;           /* simple parser (common case) */
        rescanParser parser2;          /* rescanning parser (unusual case) */
        selectLanguage* selectLanguage; /* may be used to resolve conflicts */
        unsigned int method;           /* See METHOD_ definitions above */
        unsigned int useCork;          /* bit fields of corkUsage */
        ...
    };

The ``name`` field must be set to a non-empty string. In addition, either
``parser`` or ``parser2`` must be set to point to a parsing routine which will
generate the tag entries. All other fields are optional.

Reading input file stream
-------------------------------------------------
Now all that is left is to implement the parser. In order to do its job, the
parser should read the file stream using one of the two I/O interfaces:
either the character-oriented ``getcFromInputFile()``, or the line-oriented
``readLineFromInputFile()``.

See ":ref:`input-text-stream`" for more details.

Parsing
-------------------------------------------------
How our Swine parser actually parses the contents of the file is entirely up to
the writer of the parser: it can be as crude or as elegant as desired. You will
find a variety of examples, from the most complex (``parsers/cxx/*.[hc]``) to the
simplest (``parsers/make.[ch]``).
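As an illustration of how crude a line-oriented parser can be, here is a
minimal, standalone sketch of the per-line work. It assumes, purely for the
sake of the example, that a Swine definition is a line starting with ``def``
followed by a name; this syntax and the helper ``swineDefName()`` are
hypothetical and not part of ctags.

.. code-block:: c

    #include <ctype.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper, not part of ctags: if `line` starts with the
     * assumed Swine keyword "def" followed by whitespace, copy the
     * identifier that follows into `out` and return 1; otherwise return 0. */
    static int swineDefName(const char *line, char *out, size_t outSize)
    {
        size_t n = 0;

        if (outSize == 0 || strncmp(line, "def", 3) != 0
            || !isspace((unsigned char) line[3]))
            return 0;

        line += 4;
        while (isspace((unsigned char) *line))  /* skip extra blanks */
            ++line;

        /* Collect [A-Za-z0-9_]+, leaving room for the terminator. */
        while ((isalnum((unsigned char) *line) || *line == '_')
               && n + 1 < outSize)
            out[n++] = *line++;
        out[n] = '\0';

        return n > 0;
    }

In a real parser, a helper of this kind would be applied to each line returned
by ``readLineFromInputFile()``, and every successful match would then be
turned into a tag entry.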

Adding a tag to the tag file
-------------------------------------------------
When the Swine parser identifies an interesting token for which it wants to add
a tag to the tag file, it should create a ``tagEntryInfo`` structure and
initialize it by calling ``initTagEntry()``, which initializes defaults and
fills in information about the current line number and the file position of the
beginning of the line. After filling in information defining the current entry
(and possibly overriding the file position or other defaults), the parser passes
this structure to ``makeTagEntry()``.

See ":ref:`output-tag-stream`" for more details.

Adding the parser to ``ctags``
-------------------------------------------------
Lastly, be sure to add the name of the file containing your parser (e.g.
``parsers/swine.c``) to the macro ``PARSER_SRCS`` in the file ``source.mak``, so
that your new module will be compiled into the program.

Misc.
-------------------------------------------------
This is all there is to it. All other details are specific to the parser and how
it wants to do its job.

There are some support functions which can take care of some commonly needed
parsing tasks, such as *keyword table lookups* (see ``main/keyword.c``), which you
can make use of if desired (examples of their use can be found in ``parsers/c.c``,
``parsers/eiffel.c``, and ``parsers/fortran.c``).

Support functions can be found in ``main/*.h`` excluding ``main/*_p.h``.

Almost everything is already taken care of automatically for you by the
infrastructure. Writing the actual parsing algorithm is the hardest part, but it
is not constrained by any need to conform to anything in ctags other than what
is mentioned above.

There are several different approaches used in the parsers inside Universal
Ctags and you can browse through these as examples of how to go about creating
your own.
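
To give an idea of what a keyword table lookup involves, the following
standalone sketch maps a word to a keyword id using a sorted table and the
standard ``bsearch()``. It deliberately does not use the API of
``main/keyword.c``; the keyword set, the type names, and
``lookupSwineKeyword()`` are all hypothetical and exist only for this example.

.. code-block:: c

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical keyword ids for the illustrative Swine language. */
    typedef enum {
        KEYWORD_NONE = -1,
        KEYWORD_DEF,
        KEYWORD_END,
        KEYWORD_VAR
    } swineKeyword;

    typedef struct {
        const char *name;
        swineKeyword id;
    } swineKeywordEntry;

    /* Must stay sorted by name, because bsearch() requires it. */
    static const swineKeywordEntry swineKeywordTable[] = {
        { "def", KEYWORD_DEF },
        { "end", KEYWORD_END },
        { "var", KEYWORD_VAR },
    };

    /* bsearch() comparator: `key` is the word, `entry` a table element. */
    static int compareKeyword(const void *key, const void *entry)
    {
        return strcmp((const char *) key,
                      ((const swineKeywordEntry *) entry)->name);
    }

    /* Return the keyword id for `word`, or KEYWORD_NONE if it is not one. */
    static swineKeyword lookupSwineKeyword(const char *word)
    {
        const swineKeywordEntry *hit = bsearch(
            word, swineKeywordTable,
            sizeof swineKeywordTable / sizeof swineKeywordTable[0],
            sizeof swineKeywordTable[0], compareKeyword);
        return hit ? hit->id : KEYWORD_NONE;
    }

A parser would typically call such a lookup on each identifier it reads, to
decide whether the token is a language keyword or a candidate tag name. For a
real parser, prefer the support functions declared in ``main/*.h``.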