.. _writing_parser_in_c:

=============================================================================
Writing a parser in C
=============================================================================

This section is based on the section "Integrating a new language parser" in "`How
to Add Support for a New Language to Exuberant Ctags (EXTENDING)
<http://ctags.sourceforge.net/EXTENDING.html>`_" in the Exuberant Ctags
documentation.

Now suppose that I want to truly integrate compiled-in support for Swine into
ctags.

Registering a parser
-------------------------------------------------
First, I create a new module, ``swine.c``, add one externally visible function
to it, ``extern parserDefinition *SwineParser(void)``, and add its name to the
table in ``parsers.h``. The job of this parser definition function is to create
an instance of the ``parserDefinition`` structure (using ``parserNew()``) and
populate it with information defining how files of this language are recognized,
what kinds of tags it can locate, and the function used to invoke the parser on
the currently open file.

The structure ``parserDefinition`` allows assignment of the following fields:

.. code-block:: c

    struct sParserDefinition {
        /* defined by parser */
        char* name;                    /* name of language */
        kindDefinition* kindTable;     /* tag kinds handled by parser */
        unsigned int kindCount;        /* size of 'kinds' list */
        const char *const *extensions; /* list of default extensions */
        const char *const *patterns;   /* list of default file name patterns */
        const char *const *aliases;    /* list of default aliases (alternative names) */
        parserInitialize initialize;   /* initialization routine, if needed */
        parserFinalize finalize;       /* finalize routine, if needed */
        simpleParser parser;           /* simple parser (common case) */
        rescanParser parser2;          /* rescanning parser (unusual case) */
        selectLanguage* selectLanguage; /* may be used to resolve conflicts */
        unsigned int method;           /* See METHOD_ definitions above */
        unsigned int useCork;          /* bit fields of corkUsage */
        ...
    };

The ``name`` field must be set to a non-empty string. In addition, either
``parser`` or ``parser2`` must be set to point to a parsing routine which will
generate the tag entries. All other fields are optional.

Reading input file stream
-------------------------------------------------
Now all that is left is to implement the parser.
In order to do its job, the
parser should read the file stream using one of the two I/O interfaces:
either the character-oriented ``getcFromInputFile()``, or the line-oriented
``readLineFromInputFile()``.

See ":ref:`input-text-stream`" for more details.

Parsing
-------------------------------------------------
How our Swine parser actually parses the contents of the file is entirely up to
the writer of the parser: it can be as crude or as elegant as desired. You will
find a variety of examples, from the most complex (``parsers/cxx/*.[hc]``) to the
simplest (``parsers/make.[ch]``).

Adding a tag to the tag file
-------------------------------------------------
When the Swine parser identifies an interesting token for which it wants to add
a tag to the tag file, it should create a ``tagEntryInfo`` structure and
initialize it by calling ``initTagEntry()``, which initializes defaults and
fills in information about the current line number and the file position of the
beginning of the line. After filling in information defining the current entry
(and possibly overriding the file position or other defaults), the parser passes
this structure to ``makeTagEntry()``.

See ":ref:`output-tag-stream`" for more details.

Adding the parser to ``ctags``
-------------------------------------------------
Lastly, be sure to add the name of the file containing your parser (e.g.
``parsers/swine.c``) to the macro ``PARSER_SRCS`` in the file ``source.mak``, so
that your new module will be compiled into the program.

Misc.
-------------------------------------------------
This is all there is to it. All other details are specific to the parser and how
it wants to do its job.

There are some support functions which can take care of some commonly needed
parsing tasks, such as *keyword table lookups* (see ``main/keyword.c``), which you
can make use of if desired (examples of their use can be found in ``parsers/c.c``,
``parsers/eiffel.c``, and ``parsers/fortran.c``).

Support functions can be found in ``main/*.h``, excluding ``main/*_p.h``.

Almost everything is already taken care of automatically for you by the
infrastructure. Writing the actual parsing algorithm is the hardest part, but it
is not constrained by any need to conform to anything in ctags other than what
is mentioned above.

There are several different approaches used in the parsers inside Universal
Ctags, and you can browse through these as examples of how to go about creating
your own.
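
As a rough illustration of the crude, line-oriented end of that spectrum, the
following standalone sketch pulls a name out of a single line of text. It is
not taken from any existing parser: the Swine syntax it assumes (a line of the
form ``pig NAME`` declares ``NAME``) and the helper ``extractPigName()`` are
hypothetical, and it uses only the standard C library so that it can be tried
outside the ctags source tree.

.. code-block:: c

    #include <ctype.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical Swine syntax: a line of the form "pig NAME" declares NAME.
     * On success, copies NAME into 'name' (silently truncated to fit
     * 'nameSize' bytes, including the terminating NUL) and returns true.
     * Returns false if the line is not a declaration. */
    static bool extractPigName(const char *line, char *name, size_t nameSize)
    {
        static const char keyword[] = "pig";
        const size_t keywordLen = sizeof(keyword) - 1;
        size_t len = 0;

        if (nameSize == 0)
            return false;

        /* Skip leading whitespace. */
        while (isspace((unsigned char) *line))
            line++;

        /* The keyword must be followed by at least one whitespace character,
         * so that "piglet" is not mistaken for "pig". */
        if (strncmp(line, keyword, keywordLen) != 0
            || !isspace((unsigned char) line[keywordLen]))
            return false;
        line += keywordLen;
        while (isspace((unsigned char) *line))
            line++;

        /* Copy the identifier: letters, digits, and underscores. */
        while ((isalnum((unsigned char) line[len]) || line[len] == '_')
               && len + 1 < nameSize)
        {
            name[len] = line[len];
            len++;
        }
        name[len] = '\0';
        return len > 0;
    }

In an actual parser, each line would presumably come from
``readLineFromInputFile()``, and a name found this way would be used to fill in
the ``tagEntryInfo`` structure that is passed to ``makeTagEntry()``.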