.. _writing_parser_in_c:

=============================================================================
Writing a parser in C
=============================================================================

This section is based on the section "Integrating a new language parser" in "`How
to Add Support for a New Language to Exuberant Ctags (EXTENDING)
<http://ctags.sourceforge.net/EXTENDING.html>`_" in the Exuberant Ctags
documentation.

Now suppose that I want to truly integrate compiled-in support for Swine into
ctags.

Registering a parser
-------------------------------------------------
First, I create a new module, ``swine.c``, containing one externally visible
function, ``extern parserDefinition *SwineParser(void)``, and add that function's
name to the table in ``parsers.h``. The job of this parser definition function is
to create an instance of the ``parserDefinition`` structure (using
``parserNew()``) and populate it with information defining how files of this
language are recognized, what kinds of tags it can locate, and the function used
to invoke the parser on the currently open file.

The structure ``parserDefinition`` allows assignment of the following fields:

.. code-block:: c

	struct sParserDefinition {
		/* defined by parser */
		char* name;                     /* name of language */
		kindDefinition* kindTable;      /* tag kinds handled by parser */
		unsigned int kindCount;         /* size of 'kindTable' list */
		const char *const *extensions;  /* list of default extensions */
		const char *const *patterns;    /* list of default file name patterns */
		const char *const *aliases;     /* list of default aliases (alternative names) */
		parserInitialize initialize;    /* initialization routine, if needed */
		parserFinalize finalize;        /* finalize routine, if needed */
		simpleParser parser;            /* simple parser (common case) */
		rescanParser parser2;           /* rescanning parser (unusual case) */
		selectLanguage* selectLanguage; /* may be used to resolve conflicts */
		unsigned int method;            /* See METHOD_ definitions above */
		unsigned int useCork;           /* bit fields of corkUsage */
		...
	};

The ``name`` field must be set to a non-empty string. In addition, either
``parser`` or ``parser2`` must be set to point to a parsing routine which will
generate the tag entries. All other fields are optional.

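As an illustration, a minimal definition function for Swine might look like the
following sketch. The kind ``pig``, the extension ``swn``, and the routine name
``findSwineTags`` are assumptions made up for this example. So that the sketch
compiles on its own, it declares simplified stand-ins for ``kindDefinition``,
``parserDefinition``, and ``parserNew()`` that carry only the fields used here; a
real parser would instead include the ctags headers (for example ``general.h``
and ``parse.h``) and use the real definitions.

.. code-block:: c

	#include <stdbool.h>
	#include <stdlib.h>

	/* Stand-ins so that this sketch compiles outside the ctags tree.
	 * In a real parser, delete this part and include the ctags headers. */
	typedef struct {
		bool enabled;             /* are tags of this kind emitted? */
		char letter;              /* one-letter kind name */
		const char *name;         /* long kind name */
		const char *description;  /* text shown in help output */
	} kindDefinition;

	typedef void (*simpleParser) (void);

	typedef struct {
		char *name;
		kindDefinition *kindTable;
		unsigned int kindCount;
		const char *const *extensions;
		simpleParser parser;
	} parserDefinition;

	/* Simplified stand-in: allocates a zeroed definition and sets its name. */
	static parserDefinition *parserNew (const char *name)
	{
		parserDefinition *def = calloc (1, sizeof *def);
		if (def != NULL)
			def->name = (char *) name;
		return def;
	}
	/* End of stand-ins. */

	/* Kind indexes; the order must match SwineKinds below (hypothetical kind). */
	typedef enum { K_PIG } swineKind;

	static kindDefinition SwineKinds[] = {
		{ true, 'p', "pig", "pigs" },
	};

	/* Parsing routine; a real one reads the input file and makes tags. */
	static void findSwineTags (void)
	{
	}

	extern parserDefinition *SwineParser (void)
	{
		/* Assumed default extension; the list is NULL-terminated. */
		static const char *const extensions[] = { "swn", NULL };
		parserDefinition *def = parserNew ("Swine");

		if (def == NULL)
			return NULL;
		def->kindTable  = SwineKinds;
		def->kindCount  = sizeof SwineKinds / sizeof SwineKinds[0];
		def->extensions = extensions;
		def->parser     = findSwineTags;
		return def;
	}

Only the name (passed to ``parserNew()``) and ``parser`` are required here; the
kind table and the extension list are optional fields that this sketch fills in
to show how they are populated.
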
Reading input file stream
-------------------------------------------------
Now all that is left is to implement the parser. In order to do its job, the
parser should read the file stream using one of the two I/O interfaces:
either the character-oriented ``getcFromInputFile()``, or the line-oriented
``readLineFromInputFile()``.

See ":ref:`input-text-stream`" for more details.

Parsing
-------------------------------------------------
How our Swine parser actually parses the contents of the file is entirely up to
the writer of the parser; it can be as crude or elegant as desired. The source
tree contains a variety of examples, from the most complex
(``parsers/cxx/*.[hc]``) to the simplest (``parsers/make.[ch]``).

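To make the crude end of that range concrete, here is a sketch of a line-based
recognizer. It assumes, purely for illustration, that Swine introduces a
definition with ``def NAME``; that syntax and the function name
``swineDefinedName`` are hypothetical. The function uses only the standard C
library. In a real parser, the line would come from ``readLineFromInputFile()``
and the extracted name would be used to make a tag.

.. code-block:: c

	#include <ctype.h>
	#include <stddef.h>
	#include <string.h>

	/* Hypothetical Swine syntax: a line "def NAME ..." defines NAME.
	 * Copies NAME into out (of size outsz) and returns 1. Returns 0 when
	 * the line is not a definition or NAME does not fit into out. */
	static int swineDefinedName (const char *line, char *out, size_t outsz)
	{
		size_t len = 0;

		/* Skip leading whitespace. */
		while (isspace ((unsigned char) *line))
			line++;

		/* Require the keyword followed by at least one whitespace character. */
		if (strncmp (line, "def", 3) != 0 || !isspace ((unsigned char) line[3]))
			return 0;
		line += 3;
		while (isspace ((unsigned char) *line))
			line++;

		/* The name is a run of letters, digits, and underscores. */
		while (isalnum ((unsigned char) line[len]) || line[len] == '_')
			len++;
		if (len == 0 || len >= outsz)
			return 0;

		memcpy (out, line, len);
		out[len] = '\0';
		return 1;
	}

A line-oriented parser built this way would call the function once per input
line and skip lines for which it returns 0.
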
Adding a tag to the tag file
-------------------------------------------------
When the Swine parser identifies an interesting token for which it wants to add
a tag to the tag file, it should create a ``tagEntryInfo`` structure and
initialize it by calling ``initTagEntry()``, which initializes defaults and
fills information about the current line number and the file position of the
beginning of the line. After filling in information defining the current entry
(and possibly overriding the file position or other defaults), the parser passes
this structure to ``makeTagEntry()``.

See ":ref:`output-tag-stream`" for more details.

Adding the parser to ``ctags``
-------------------------------------------------
Lastly, be sure to add the name of the file containing your parser (e.g.
``parsers/swine.c``) to the macro ``PARSER_SRCS`` in the file ``source.mak``, so
that your new module will be compiled into the program.

Misc.
-------------------------------------------------
This is all there is to it. All other details are specific to the parser and how
it wants to do its job.

There are some support functions which can take care of some commonly needed
parsing tasks, such as *keyword table lookups* (see ``main/keyword.c``), which you
can make use of if desired. Examples of their use can be found in
``parsers/c.c``, ``parsers/eiffel.c``, and ``parsers/fortran.c``.

Support functions can be found in ``main/*.h``, excluding ``main/*_p.h``.

Almost everything is already taken care of automatically for you by the
infrastructure. Writing the actual parsing algorithm is the hardest part, but it
is not constrained by any need to conform to anything in ctags other than what is
mentioned above.

There are several different approaches used in the parsers inside Universal
Ctags, and you can browse through these as examples of how to go about creating
your own.