.. _optlib:

Extending ctags with Regex parser (*optlib*)
---------------------------------------------------------------------

:Maintainer: Masatake YAMATO <yamato@redhat.com>

.. contents:: `Table of contents`
    :depth: 3
    :local:

.. TODO:
    add a section on debugging

Exuberant Ctags allows a user to add a new parser to ctags with ``--langdef=<LANG>``
and ``--regex-<LANG>=...`` options.
Universal Ctags follows and extends the design of Exuberant Ctags in more
powerful ways and calls the feature an *optlib parser*, which is described in
:ref:`ctags-optlib(7) <ctags-optlib(7)>` and the following sections.

:ref:`ctags-optlib(7) <ctags-optlib(7)>` is the primary document of the optlib
parser feature. The following sections provide additional information and more
advanced features. Note that some of the features are experimental, and will be
marked as such in the documentation.

Many optlib parsers are included in Universal Ctags; see
`optlib/*.ctags <https://github.com/universal-ctags/ctags/tree/master/optlib>`_.
They are good examples to consult when you develop your own parsers.

An optlib parser can be translated into C source code. Your optlib parser can
thus easily become a built-in parser. See ":ref:`optlib2c`" for details.
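
As a rough illustration of the options mentioned above, the following sketch
defines a tiny parser for a hypothetical language. The language name
``Mylang``, the ``.mylang`` extension, and the ``def`` keyword are assumptions
made up for this example; only the option names come from ctags itself.

.. code-block:: ctags

    # mylang.ctags: a hypothetical optlib file, not shipped with ctags
    --langdef=Mylang
    --map-Mylang=+.mylang
    --kinddef-Mylang=f,function,functions
    # tag the identifier following "def" at the start of a line
    --regex-Mylang=/^def[ \t]+([a-zA-Z_][a-zA-Z0-9_]*)/\1/f/

Such a file could then be loaded with something like
``ctags --options=mylang.ctags -o - input.mylang``, in the same way as the
examples later in this document.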

Regular expression (regex) engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Universal Ctags uses `the POSIX Extended Regular Expressions (ERE)
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>`_
syntax by default, the same as Exuberant Ctags.

While building Universal Ctags, the ``configure`` script runs compatibility
tests of the regex engine in the system library. If the tests pass, that engine
is used; otherwise the regex engine imported from `the GNU Gnulib library
<https://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>`_
is used. In the latter case, the output of ``ctags --list-features`` contains
``gnulib_regex``.

See ``regex(7)`` or `the GNU Gnulib Manual
<https://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>`_
for the details of the regular expression syntax.

.. note::

    The GNU regex engine supports some GNU extensions described `here
    <https://www.gnu.org/software/gnulib/manual/gnulib.html#posix_002dextended-regular-expression-syntax>`_.
    Note that an optlib parser using the extensions may not work with Universal
    Ctags on some other systems.
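
To see which engine a particular binary ended up with, one can filter the
feature list. This is only a sketch: it assumes a POSIX shell with ``grep``
available, and the exact set of feature names printed depends on how ctags
was built.

.. code-block:: console

    $ ctags --list-features | grep -i -e regex -e pcre2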

The POSIX Extended Regular Expressions (ERE) syntax does
*not* support many of the "modern" extensions such as lazy captures,
non-capturing grouping, atomic grouping, possessive quantifiers, look-ahead/behind,
etc. It can also be notoriously slow when backtracking.

A common error is forgetting that a
POSIX ERE engine is always *greedy*; the '``*``' and '``+``' quantifiers match
as much as possible, before backtracking from the end of their match.

For example this pattern::

    foo.*bar

will match this entire string, not just the first part::

    foobar, bar, and even more bar

Another detail to keep in mind is how the regex engine treats newlines.
Universal Ctags compiles the regular expressions in the ``--regex-<LANG>`` and
``--mline-regex-<LANG>`` options with ``REG_NEWLINE`` set. What that means is documented
in the
`POSIX specification <https://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html>`_.
One obvious effect is that the regex special dot any-character '``.``' does not match
newline characters, the '``^``' anchor *does* match right after a newline, and
the '``$``' anchor matches right before a newline.
A more subtle issue is this text from the
chapter "`Regular Expressions <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>`_":
"the use of literal <newline>s or any escape sequence equivalent produces undefined
results". What that means is that using a regex pattern with ``[^\n]+`` is invalid,
and indeed in glibc it produces very odd results. **Never use** '``\n``' in patterns
for ``--regex-<LANG>``, and **never use it** in non-matching bracket expressions
for ``--mline-regex-<LANG>`` patterns. For the experimental ``--_mtable-regex-<LANG>``
you can safely use '``\n``' because that regex is not compiled with ``REG_NEWLINE``.

POSIX ERE also has some known "quirks"
with respect to escaping special characters in bracket expressions.
For example, a pattern of ``[^\]]+`` is invalid in POSIX ERE, because the '``]``' is
*not* special inside a bracket expression, and thus should **not** be escaped.
Most regex engines ignore this subtle detail in POSIX ERE, and instead allow
escaping it with '``\]``' inside the bracket expression and treat it as the
literal character '``]``'. GNU glibc, however, does not generate an error but
instead considers it undefined behavior, and in fact it will match very odd
things. Instead you **must** use the more unintuitive ``[^]]+`` syntax. The same
is technically true of other special characters inside a bracket expression,
such as ``[^\)]+``, which should instead be ``[^)]+``.
The ``[^\)]+`` form will
usually appear to work, but only because what it is really doing is matching any
character but '``\``' *or* '``)``'. The only exceptions for using '``\``' inside a
bracket expression are for '``\t``' and '``\n``', which ctags converts to their
single literal character control codes before passing the pattern to glibc.

You should always test your regex patterns against test files with strings that
do and do not match. Pay particular attention to when a pattern should *not* match, and
how *much* it matches when it should.

Perl-compatible regular expressions (PCRE2) engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Universal Ctags optionally supports `Perl-Compatible Regular Expressions (PCRE2)
<https://www.pcre.org/current/doc/html/pcre2syntax.html>`_ syntax,
but only if Universal Ctags is built with the ``pcre2`` library.
See the output of the ``--list-features`` option to check whether your Universal
Ctags is built with ``pcre2``.

PCRE2 *does* support many "modern" extensions.
For example this pattern::

    foo.*?bar

will match just the first part, ``foobar``, not this entire string::

    foobar, bar, and even more bar

Regex option argument flags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many regex-based options described in this document support additional arguments
in the form of long flags. Long flags are specified with surrounding '``{``' and
'``}``'.

The general format and placement is as follows:

.. code-block:: ctags

    --regex-<LANG>=/<PATTERN>/<NAME>/[<KIND>/]LONGFLAGS

Some examples:

.. code-block:: ctags

    --regex-Pod=/^=head1[ \t]+(.+)/\1/c/
    --regex-Foo=/set=([^;]+)/\1/v/{icase}
    --regex-Man=/^\.TH[[:space:]]{1,}"([^"]{1,})".*/\1/t/{exclusive}{icase}{scope=push}
    --regex-Gdbinit=/^#//{exclusive}

Note that the last example only has two '``/``' forward-slashes following
the regex pattern, as a shortened form when no kind-spec exists.

The ``--mline-regex-<LANG>`` option also follows the above format.
The
experimental ``--_mtable-regex-<LANG>`` option follows a slightly
modified version as well.

Regex control flags
......................................................................

.. Q: why even discuss the single-character version of the flags? Just
    make everyone use the long form.

The regex matching can be controlled by adding flags to the ``--regex-<LANG>``,
``--mline-regex-<LANG>``, and experimental ``--_mtable-regex-<LANG>`` options.
This is done either by using the single character short flags ``b``, ``e`` and
``i`` as explained in the *ctags.1* man page, or by using the long flags
described earlier. The long flags require more typing but are much more
readable.

The mapping between the older short flag names and long flag names is:

=========== =========== ===========
short flag  long flag   description
=========== =========== ===========
b           basic       POSIX basic regular expression syntax.
e           extend      POSIX extended regular expression syntax (default).
i           icase       Case-insensitive matching.
=========== =========== ===========


So the following ``--regex-<LANG>`` expression:

.. code-block:: ctags

    --kinddef-m4=d,definition,definitions
    --regex-m4=/^m4_define\(\[([^]$\(]+).+$/\1/d/e

is the same as:

.. code-block:: ctags

    --kinddef-m4=d,definition,definitions
    --regex-m4=/^m4_define\(\[([^]$\(]+).+$/\1/d/{extend}

The characters '``{``' and '``}``' may not be suitable for command line
use, but long flags are mostly intended for option files.

Exclusive flag in regex
......................................................................

By default, lines read from the input files will be matched against all the
regular expressions defined with ``--regex-<LANG>``. Each successfully matched
regular expression will emit a tag.

In some cases another policy, exclusive-matching, is preferable to the
all-matching policy. Exclusive-matching means that, for a given input line,
the remaining regular expressions are not tried once one of the regular
expressions has matched successfully.

For specifying exclusive-matching the flags ``exclusive`` (long) and ``x``
(short) were introduced. For example, this is used in
:file:`optlib/gdbinit.ctags` for ignoring comment lines in gdb files,
as follows:

.. code-block:: ctags

    --regex-Gdbinit=/^#//{exclusive}

Comments in gdb files start with '``#``', so the above line is the first regex
match line in :file:`gdbinit.ctags`; subsequent regex matches are then
not tried for a comment line.

If an empty name pattern (``//``) is used for the ``--regex-<LANG>`` option,
ctags warns about it as incorrect usage of the option. However, if the
``exclusive`` or ``x`` flag is specified, the warning is suppressed.
This is useful to ignore matched patterns as above.

NOTE: This flag does not make sense in the multi-line ``--mline-regex-<LANG>``
option nor the multi-table ``--_mtable-regex-<LANG>`` option.


Experimental flags
......................................................................

.. note:: These flags are experimental. They apply to all regex option
    types: basic ``--regex-<LANG>``, multi-line ``--mline-regex-<LANG>``,
    and the experimental multi-table ``--_mtable-regex-<LANG>`` option.

``_extra``

    This flag indicates the tag should only be generated if the given
    ``extra`` type is enabled, as explained in ":ref:`extras`".

``_field``

    This flag allows a regex match to add additional custom fields to the
    generated tag entry, as explained in ":ref:`fields`".

``_role``

    This flag allows a regex match to generate a reference tag entry and
    specify the role of the reference, as explained in ":ref:`roles`".

.. NOT REVIEWED YET

``_anonymous=PREFIX``

    This flag allows a regex match to generate an anonymous tag entry.
    ctags gives a name starting with ``PREFIX`` and emits it.
    This flag is useful to record the position for a language object
    having no name. A lambda function in a functional programming
    language is a typical example of a language object having no name.

    Consider the following input (``input.foo``):

    .. code-block:: lisp

        (let ((f (lambda (x) (+ 1 x))))
          ...
          )

    Consider the following optlib file (``foo.ctags``):

    .. code-block:: ctags
        :emphasize-lines: 4

        --langdef=Foo
        --map-Foo=+.foo
        --kinddef-Foo=l,lambda,lambda functions
        --regex-Foo=/.*\(lambda .*//l/{_anonymous=L}

    You can get the following tags file:

    .. code-block:: console

        $ u-ctags --options=foo.ctags -o - /tmp/input.foo
        Le4679d360100	/tmp/input.foo	/^(let ((f (lambda (x) (+ 1 x))))$/;"	l


.. _extras:

Conditional tagging with extras
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. NEEDS MORE REVIEWS

If a matched pattern should only be tagged when an ``extra`` flag is enabled,
mark the pattern with ``{_extra=XNAME}`` where ``XNAME`` is the name of the
extra. You must define ``XNAME`` with the
``--_extradef-<LANG>=XNAME,DESCRIPTION`` option before defining a regex
pattern marked ``{_extra=XNAME}``.

.. code-block:: python

    if __name__ == '__main__':
        do_something()

To capture the lines above in a Python program (``input.py``), an ``extra`` flag can
be used.

.. code-block:: ctags
    :emphasize-lines: 1-2

    --_extradef-Python=main,__main__ entry points
    --regex-Python=/^if __name__ == '__main__':/__main__/f/{_extra=main}

The above optlib (``python-main.ctags``) introduces the ``main`` extra to the Python parser.
The pattern matching is done only when the ``main`` extra is enabled.

.. code-block:: console

    $ ctags --options=python-main.ctags -o - --extras-Python='+{main}' input.py
    __main__	input.py	/^if __name__ == '__main__':$/;"	f


.. TODO: this "fields" section should probably be moved up this document, as a
    subsection in the "Regex option argument flags" section

.. _fields:

Adding custom fields to the tag output
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. NEEDS MORE REVIEWS

Exuberant Ctags allows just one of the specified groups in a regex pattern to
be used as a part of the name of a tag entry.

Universal Ctags allows using the other groups in the regex pattern.
An optlib parser can have its own specific fields. The groups can be used as
values of the fields of a tag entry.

Let's think about `Unknown`, an imaginary language.
Here is a source file (``input.unknown``) written in `Unknown`:

.. code-block:: java

    public func foo(n, m);
    protected func bar(n);
    private func baz(n,...);

With ``--regex-Unknown=...`` Exuberant Ctags can capture ``foo``, ``bar``, and ``baz``
as names. Universal Ctags can attach extra context information to the
names as values for fields. Let's focus on ``bar``. ``protected`` is a
keyword to control how widely the identifier ``bar`` can be accessed.
``(n)`` is the parameter list of ``bar``. ``protected`` and ``(n)`` are
extra context information of ``bar``.

With the following optlib file (``unknown.ctags``), ctags can attach
``protected`` to the field protection and ``(n)`` to the field signature.

.. code-block:: ctags
    :emphasize-lines: 5-9

    --langdef=unknown
    --kinddef-unknown=f,func,functions
    --map-unknown=+.unknown

    --_fielddef-unknown=protection,access scope
    --_fielddef-unknown=signature,signatures

    --regex-unknown=/^((public|protected|private) +)?func ([^\(]+)\((.*)\)/\3/f/{_field=protection:\1}{_field=signature:(\4)}
    --fields-unknown=+'{protection}{signature}'

For the line ``protected func bar(n);`` you will get following tags output::

    bar	input.unknown	/^protected func bar(n);$/;"	f	protection:protected	signature:(n)

Let's see the detail of ``unknown.ctags``.

.. code-block:: ctags

    --_fielddef-unknown=protection,access scope

``--_fielddef-<LANG>=name,description`` defines a new field for a parser
specified by *<LANG>*. Before defining a new field for the parser,
the parser must be defined with ``--langdef=<LANG>``. ``protection`` is
the field name used in tags output. ``access scope`` is the description
used in the output of ``--list-fields`` and ``--list-fields=Unknown``.

.. code-block:: ctags

    --_fielddef-unknown=signature,signatures

This defines a field named ``signature``.

.. code-block:: ctags

    --regex-unknown=/^((public|protected|private) +)?func ([^\(]+)\((.*)\)/\3/f/{_field=protection:\1}{_field=signature:(\4)}

This option requests making a tag for the name specified by group 3 of the
pattern, attaching group 1 as the value of the ``protection`` field of the tag, and attaching
group 4 as the value of the ``signature`` field of the tag. You can use the long regex flag
``_field`` for attaching fields to a tag with the following notation rule::

    {_field=FIELDNAME:GROUP}


``--fields-<LANG>=[+|-]{FIELDNAME}`` can be used to enable or disable the specified field.

A newly defined parser-specific field is disabled by default. Enable the
field explicitly to use it. See ":ref:`Parser specific fields <parser-specific-fields>`"
about the ``--fields-<LANG>`` option.

The `passwd` parser is a simple example that uses the ``--fields-<LANG>`` option.


.. _roles:

Capturing reference tags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. NOT REVIEWED YET

To make a reference tag with an optlib parser, specify a role with the
``_role`` long regex flag. Let's see an example:

.. code-block:: ctags
    :emphasize-lines: 3-6

    --langdef=FOO
    --kinddef-FOO=m,module,modules
    --_roledef-FOO.m=imported,imported module
    --regex-FOO=/import[ \t]+([a-z]+)/\1/m/{_role=imported}
    --extras=+r
    --fields=+r

A role must be defined before specifying it as the value of the ``_role`` flag.
The ``--_roledef-<LANG>.<KIND>=<ROLE>,<ROLEDESC>`` option is for defining a role.
See the line ``--regex-FOO=...``. In this parser `FOO`, the name of an
imported module is captured as a reference tag with role ``imported``.

For specifying *<KIND>* where the role is defined, you can use either a
kind letter or a kind name surrounded by '``{``' and '``}``'.

The option has two parameters separated by a comma:

*<ROLE>*

    the role name, and

*<ROLEDESC>*

    the description of the role.

The first parameter is the name of the role. The role is defined in
the kind *<KIND>* of the language *<LANG>*. In the example, the
``imported`` role is defined in the ``module`` kind, which is specified
with ``m``. You can use ``{module}``, the name of the kind, instead.

The kind specified in the ``--_roledef-<LANG>.<KIND>`` option must be
defined *before* using the option.
See the description of
``--kinddef-<LANG>`` for defining a kind.

The roles are listed with ``--list-roles=<LANG>``. The name and description
passed to the ``--_roledef-<LANG>.<KIND>`` option are used in the output like::

    $ ctags --langdef=FOO --kinddef-FOO=m,module,modules \
            --_roledef-FOO.m='imported,imported module' --list-roles=FOO
    #KIND(L/N) NAME     ENABLED DESCRIPTION
    m/module   imported on      imported module


By specifying the ``_role`` regex flag multiple times with different roles, you can
assign multiple roles to a reference tag. See the following C input:

.. code-block:: C

    x = 0;
    i += 1;

An ultra fine grained C parser may capture the variable ``x`` with the
``lvalue`` role and the variable ``i`` with the ``lvalue`` and ``incremented``
roles.

You can implement such roles by extending the built-in C parser:

.. code-block:: ctags
    :emphasize-lines: 2-5

    # c-extra.ctags
    --_roledef-C.v=lvalue,locator values
    --_roledef-C.v=incremented,incremented with ++ operator
    --regex-C=/([a-zA-Z_][a-zA-Z_0-9]*) *=/\1/v/{_role=lvalue}
    --regex-C=/([a-zA-Z_][a-zA-Z_0-9]*) *\+=/\1/v/{_role=lvalue}{_role=incremented}

.. code-block:: console

    $ ctags --options=c-extra.ctags --extras=+r --fields=+r -o - input.c
    i	input.c	/^i += 1;$/;"	v	roles:lvalue,incremented
    x	input.c	/^x = 0;$/;"	v	roles:lvalue


Scope tracking in a regex parser
......................................................................

For the ``{scope=..}`` flag itself, see the "FLAGS FOR
--regex-<LANG> OPTION" section of :ref:`ctags-optlib(7) <ctags-optlib(7)>`.

Example 1:

.. code-block:: python

    # in /tmp/input.foo
    class foo:
        def bar(baz):
            print(baz)
    class goo:
        def gar(gaz):
            print(gaz)

.. code-block:: ctags
    :emphasize-lines: 7,8

    # in /tmp/foo.ctags:
    --langdef=Foo
    --map-Foo=+.foo
    --kinddef-Foo=c,class,classes
    --kinddef-Foo=d,definition,definitions

    --regex-Foo=/^class[[:blank:]]+([[:alpha:]]+):/\1/c/{scope=set}
    --regex-Foo=/^[[:blank:]]+def[[:blank:]]+([[:alpha:]]+).*:/\1/d/{scope=ref}

.. code-block:: console

    $ ctags --options=/tmp/foo.ctags -o - /tmp/input.foo
    bar	/tmp/input.foo	/^    def bar(baz):$/;"	d	class:foo
    foo	/tmp/input.foo	/^class foo:$/;"	c
    gar	/tmp/input.foo	/^    def gar(gaz):$/;"	d	class:goo
    goo	/tmp/input.foo	/^class goo:$/;"	c


Example 2:

.. code-block:: c

    // in /tmp/input.pp
    class foo {
        int bar;
    }

.. code-block:: ctags
    :emphasize-lines: 7-9

    # in /tmp/pp.ctags:
    --langdef=pp
    --map-pp=+.pp
    --kinddef-pp=c,class,classes
    --kinddef-pp=v,variable,variables

    --regex-pp=/^[[:blank:]]*\}//{scope=pop}{exclusive}
    --regex-pp=/^class[[:blank:]]*([[:alnum:]]+)[[:blank:]]*\{/\1/c/{scope=push}
    --regex-pp=/^[[:blank:]]*int[[:blank:]]*([[:alnum:]]+)/\1/v/{scope=ref}

.. code-block:: console

    $ ctags --options=/tmp/pp.ctags -o - /tmp/input.pp
    bar	/tmp/input.pp	/^    int bar;$/;"	v	class:foo
    foo	/tmp/input.pp	/^class foo {$/;"	c


Example 3:

.. code-block::

    # in /tmp/input.docdoc
    title T
    ...
    section S0
    ...
    section S1
    ...

..
code-block:: ctags 586f998e51dSMasatake YAMATO :emphasize-lines: 15,21 587f998e51dSMasatake YAMATO 588f998e51dSMasatake YAMATO # in /tmp/doc.ctags: 589f998e51dSMasatake YAMATO --langdef=doc 590f998e51dSMasatake YAMATO --map-doc=+.docdoc 591f998e51dSMasatake YAMATO --kinddef-doc=s,section,sections 592f998e51dSMasatake YAMATO --kinddef-doc=S,subsection,subsections 593f998e51dSMasatake YAMATO 594f998e51dSMasatake YAMATO --_tabledef-doc=main 595f998e51dSMasatake YAMATO --_tabledef-doc=section 596f998e51dSMasatake YAMATO --_tabledef-doc=subsection 597f998e51dSMasatake YAMATO 598f998e51dSMasatake YAMATO --_mtable-regex-doc=main/section +([^\n]+)\n/\1/s/{scope=push}{tenter=section} 599f998e51dSMasatake YAMATO --_mtable-regex-doc=main/[^\n]+\n|[^\n]+|\n// 600f998e51dSMasatake YAMATO --_mtable-regex-doc=main///{scope=clear}{tquit} 601f998e51dSMasatake YAMATO 602f998e51dSMasatake YAMATO --_mtable-regex-doc=section/section +([^\n]+)\n/\1/s/{scope=replace} 603f998e51dSMasatake YAMATO --_mtable-regex-doc=section/subsection +([^\n]+)\n/\1/S/{scope=push}{tenter=subsection} 604f998e51dSMasatake YAMATO --_mtable-regex-doc=section/[^\n]+\n|[^\n]+|\n// 605f998e51dSMasatake YAMATO --_mtable-regex-doc=section///{scope=clear}{tquit} 606f998e51dSMasatake YAMATO 607f998e51dSMasatake YAMATO --_mtable-regex-doc=subsection/(section )//{_advanceTo=0start}{tleave}{scope=pop} 608f998e51dSMasatake YAMATO --_mtable-regex-doc=subsection/subsection +([^\n]+)\n/\1/S/{scope=replace} 609f998e51dSMasatake YAMATO --_mtable-regex-doc=subsection/[^\n]+\n|[^\n]+|\n// 610f998e51dSMasatake YAMATO --_mtable-regex-doc=subsection///{scope=clear}{tquit} 611f998e51dSMasatake YAMATO 612f998e51dSMasatake YAMATO.. 
code-block:: console 613f998e51dSMasatake YAMATO 614f998e51dSMasatake YAMATO % ctags --sort=no --fields=+nl --options=/tmp/doc.ctags -o - /tmp/input.docdoc 615f998e51dSMasatake YAMATO SEC0 /tmp/input.docdoc /^section SEC0$/;" s line:1 language:doc 616f998e51dSMasatake YAMATO SUB0-1 /tmp/input.docdoc /^subsection SUB0-1$/;" S line:3 language:doc section:SEC0 617f998e51dSMasatake YAMATO SUB0-2 /tmp/input.docdoc /^subsection SUB0-2$/;" S line:5 language:doc section:SEC0 618f998e51dSMasatake YAMATO SEC1 /tmp/input.docdoc /^section SEC1$/;" s line:7 language:doc 619f998e51dSMasatake YAMATO SUB1-1 /tmp/input.docdoc /^subsection SUB1-1$/;" S line:9 language:doc section:SEC1 620f998e51dSMasatake YAMATO SUB1-2 /tmp/input.docdoc /^subsection SUB1-2$/;" S line:11 language:doc section:SEC1 621f998e51dSMasatake YAMATO 622f998e51dSMasatake YAMATO 623641e337aSMasatake YAMATONOTE: This flag doesn't work well with ``--mline-regex-<LANG>=``. 6248370e4a6SMasatake YAMATO 625b40096fdSHadriel KaplanOverriding the letter for file kind 626eb375513SMasatake YAMATO~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 62709be9c82SMasatake YAMATO 628b40096fdSHadriel Kaplan.. Q: this was fixed in https://github.com/universal-ctags/ctags/pull/331 629b40096fdSHadriel Kaplan so can we remove this section? 630b40096fdSHadriel Kaplan 631dccba5efSHiroo HAYASHIOne of the built-in tag kinds in Universal Ctags is the ``F`` file kind. 632dccba5efSHiroo HAYASHIOverriding the letter for file kind is not allowed in Universal Ctags. 633599fcc99SMasatake YAMATO 634b40096fdSHadriel Kaplan.. warning:: 635f7c45d47SMasatake YAMATO 63604cce070SHiroo HAYASHI Don't use ``F`` as a kind letter in your parser. 
   (See issue `#317
   <https://github.com/universal-ctags/ctags/issues/317>`_ on GitHub.)

Generating fully qualified tags automatically from scope information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If scope fields are filled properly with ``{scope=...}`` regex flags,
you can use the field values for generating fully qualified tags.
About the ``{scope=..}`` flag itself, see the "FLAGS FOR --regex-<LANG>
OPTION" section of :ref:`ctags-optlib(7) <ctags-optlib(7)>`.

Append ``{_autoFQTag}`` to the end of the ``--langdef=<LANG>`` option, as in
``--langdef=Foo{_autoFQTag}``, to make ctags generate fully qualified
tags automatically.

'``.``' is the (ctags global) default separator combining names into a
fully qualified tag. You can customize separators with the
``--_scopesep-<LANG>=...`` option.

input.foo::

    class X
        var y
    end

foo.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=foo{_autoFQTag}
    --map-foo=+.foo
    --kinddef-foo=c,class,classes
    --kinddef-foo=v,var,variables
    --regex-foo=/class ([A-Z]*)/\1/c/{scope=push}
    --regex-foo=/end///{placeholder}{scope=pop}
    --regex-foo=/[ \t]*var ([a-z]*)/\1/v/{scope=ref}

Output::

    $ u-ctags --quiet --options=./foo.ctags -o - input.foo
    X	input.foo	/^class X$/;"	c
    y	input.foo	/^    var y$/;"	v	class:X

    $ u-ctags --quiet --options=./foo.ctags --extras=+q -o - input.foo
    X	input.foo	/^class X$/;"	c
    X.y	input.foo	/^    var y$/;"	v	class:X
    y	input.foo	/^    var y$/;"	v	class:X


``X.y`` is printed as a fully qualified tag when ``--extras=+q`` is given.

.. NOT REVIEWED YET (--_scopesep)

Customizing scope separators
......................................................................

Use the ``--_scopesep-<LANG>=[<parent-kindLetter>]/<child-kindLetter>:<sep>``
option to customize separators if the language uses ``{_autoFQTag}``.

``parent-kindLetter``

    The kind letter for a tag of outer-scope.

    You can use '``*``' as a wildcard that means
    *any kind* for a tag of outer-scope.

    If you omit ``parent-kindLetter``, the separator is used as
    a prefix for tags having the kind specified with ``child-kindLetter``.
    This prefix can be used to refer to the global namespace or similar
    concepts if the language has one.

``child-kindLetter``

    The kind letter for a tag of inner-scope.

    You can use '``*``' as a wildcard that means
    *any kind* for a tag of inner-scope.

``sep``

    In a qualified tag, if the outer-scope has the kind ``parent-kindLetter``
    and the inner-scope has the kind ``child-kindLetter``, then ``sep`` is
    inserted between the scope names in the generated tags file.

Specifying '``*``' as both ``parent-kindLetter`` and ``child-kindLetter``
sets ``sep`` as the language default separator. It is used as fallback.

Specifying '``*``' as ``child-kindLetter`` and omitting ``parent-kindLetter``
sets ``sep`` as the language default prefix. It is used as fallback.


NOTE: There is no ctags global default prefix.
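
The forms described above can be sketched as an option-file fragment. This is
only an illustration: the language ``Foo``, its kinds ``c`` and ``v``, and the
separator strings are hypothetical choices, not taken from any existing parser.

.. code-block:: ctags

    # Hypothetical language and kinds, for illustration only.
    --langdef=Foo{_autoFQTag}
    --kinddef-Foo=c,class,classes
    --kinddef-Foo=v,var,variables

    # Language default separator: fallback for any parent/child pair.
    --_scopesep-Foo=*/*:->
    # Language default prefix: parent omitted, any child kind.
    --_scopesep-Foo=/*:^
    # Separator used only between a class (outer) and a variable (inner).
    --_scopesep-Foo=c/v:+

The entry naming specific kind letters applies to that one pair of kinds, and
the two wildcard entries act as fallbacks for everything else.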

NOTE: The ``--_scopesep-<LANG>=...`` option affects only a parser that
enables ``_autoFQTag``. A parser building fully qualified tags
manually ignores the option.

Let's see an example.
The input file is written in Tcl. The Tcl parser is not an optlib
parser. However, it uses the ``_autoFQTag`` feature internally.
Therefore, the ``--_scopesep-Tcl=`` option works well. The Tcl parser
defines two kinds, ``n`` (``namespace``) and ``p`` (``procedure``).

By default, the Tcl parser uses ``::`` as the scope separator. The parser also
uses ``::`` as the root prefix.

.. code-block:: tcl

    namespace eval N {
        namespace eval M {
            proc pr0 {s} {
                puts $s
            }
        }
    }

    proc pr1 {s} {
        puts $s
    }

``M`` is defined under the scope of ``N``. ``pr0`` is defined under the scope
of ``M``. ``N`` and ``pr1`` are at top level (so they are candidates for
receiving a prefix). ``M`` and ``N`` are language objects with the ``n``
(``namespace``) kind. ``pr0`` and ``pr1`` are language objects with the ``p``
(``procedure``) kind.

.. code-block:: console

    $ ctags -o - --extras=+q input.tcl
    ::N	input.tcl	/^namespace eval N {$/;"	n
    ::N::M	input.tcl	/^    namespace eval M {$/;"	n	namespace:::N
    ::N::M::pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:::N::M
    ::pr1	input.tcl	/^proc pr1 {s} {$/;"	p
    M	input.tcl	/^    namespace eval M {$/;"	n	namespace:::N
    N	input.tcl	/^namespace eval N {$/;"	n
    pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:::N::M
    pr1	input.tcl	/^proc pr1 {s} {$/;"	p

Let's change the default separator to ``->``:

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' input.tcl
    ::N	input.tcl	/^namespace eval N {$/;"	n
    ::N->M	input.tcl	/^    namespace eval M {$/;"	n	namespace:::N
    ::N->M->pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:::N->M
    ::pr1	input.tcl	/^proc pr1 {s} {$/;"	p
    M	input.tcl	/^    namespace eval M {$/;"	n	namespace:::N
    N	input.tcl	/^namespace eval N {$/;"	n
    pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:::N->M
    pr1	input.tcl	/^proc pr1 {s} {$/;"	p

Let's define '``^``' as the default prefix:

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' input.tcl
    M	input.tcl	/^    namespace eval M {$/;"	n	namespace:^N
    N	input.tcl	/^namespace eval N {$/;"	n
    ^N	input.tcl	/^namespace eval N {$/;"	n
    ^N->M	input.tcl	/^    namespace eval M {$/;"	n	namespace:^N
    ^N->M->pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:^N->M
    ^pr1	input.tcl	/^proc pr1 {s} {$/;"	p
    pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:^N->M
    pr1	input.tcl	/^proc pr1 {s} {$/;"	p

Let's override the separator used for combining a namespace and a procedure
with '``+``'. (For combining a namespace and another namespace, ctags still
uses the default separator.)

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' --_scopesep-Tcl='n/p:+' input.tcl
    M	input.tcl	/^    namespace eval M {$/;"	n	namespace:^N
    N	input.tcl	/^namespace eval N {$/;"	n
    ^N	input.tcl	/^namespace eval N {$/;"	n
    ^N->M	input.tcl	/^    namespace eval M {$/;"	n	namespace:^N
    ^N->M+pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:^N->M
    ^pr1	input.tcl	/^proc pr1 {s} {$/;"	p
    pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:^N->M
    pr1	input.tcl	/^proc pr1 {s} {$/;"	p

Let's override the prefix for a namespace with '``@``'. (For procedures,
ctags still uses the default prefix.)

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' --_scopesep-Tcl='n/p:+' --_scopesep-Tcl='/n:@' input.tcl
    @N	input.tcl	/^namespace eval N {$/;"	n
    @N->M	input.tcl	/^    namespace eval M {$/;"	n	namespace:@N
    @N->M+pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:@N->M
    M	input.tcl	/^    namespace eval M {$/;"	n	namespace:@N
    N	input.tcl	/^namespace eval N {$/;"	n
    ^pr1	input.tcl	/^proc pr1 {s} {$/;"	p
    pr0	input.tcl	/^        proc pr0 {s} {$/;"	p	namespace:@N->M
    pr1	input.tcl	/^proc pr1 {s} {$/;"	p


Multi-line pattern match
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We often need to scan multiple lines to generate a tag, whether due to
needing contextual information to decide whether to tag or not, or to
constrain generating tags to only certain cases, or to grab multiple
substrings to generate the tag name.

Universal Ctags has two ways to accomplish this: *multi-line regex options*,
and the experimental *multi-table regex options* described later.

The newly introduced ``--mline-regex-<LANG>`` is similar to ``--regex-<LANG>``
except the pattern is applied to the whole file's contents, not line by line.

This example is based on issue `#219
<https://github.com/universal-ctags/ctags/issues/219>`_ posted by
@andreicristianpetcu:

.. code-block:: java

    // in input.java:

    @Subscribe
    public void catchEvent(SomeEvent e)
    {
        return;
    }

    @Subscribe
    public void
        recover(Exception e)
    {
        return;
    }

The above Java code is similar to code written for the Java
`Spring <https://spring.io>`_ framework. The ``@Subscribe`` annotation is a
keyword for the framework, and the developer would like to have a tag
generated for each method annotated with ``@Subscribe``, using the name of the
method followed by a dash followed by the type of the argument. For example
the developer wants the tag name ``Event-SomeEvent`` generated for the first
method shown above.

To accomplish this, the developer creates a :file:`spring.ctags` file with
the following:

.. code-block:: ctags
    :emphasize-lines: 4

    # in spring.ctags:
    --langdef=javaspring
    --map-javaspring=+.java
    --mline-regex-javaspring=/@Subscribe([[:space:]])*([a-z ]+)[[:space:]]*([a-zA-Z]*)\(([a-zA-Z]*)/\3-\4/s,subscription/{mgroup=3}
    --fields=+ln

And now, using :file:`spring.ctags`, the tag file has this:

.. code-block:: console

    $ ctags -o - --options=./spring.ctags input.java
    Event-SomeEvent	input.java	/^public void catchEvent(SomeEvent e)$/;"	s	line:2	language:javaspring
    recover-Exception	input.java	/^    recover(Exception e)$/;"	s	line:10	language:javaspring

Multiline pattern flags
......................................................................

.. note:: These flags also apply to the experimental ``--_mtable-regex-<LANG>``
   option described later.

``{mgroup=N}``

    This flag indicates the pattern should be applied to the whole file
    contents, not line by line. ``N`` is the number of a capture group in the
    pattern, which is used to record the line number location of the tag. In the
    above example ``3`` is specified. The start position of the regex capture
    group 3, relative to the whole file, is used.

.. warning:: You **must** add an ``{mgroup=N}`` flag to the multi-line
   ``--mline-regex-<LANG>`` option, even if ``N`` is ``0`` (meaning the
   start position of the whole regex pattern). You do not need to add it for
   the multi-table ``--_mtable-regex-<LANG>``.

.. TODO: Q: isn't the above restriction really a bug? I think it is. I should fix it.
   Q to @masatake-san: Do you mean that {mgroup=0} can be omitted? -> #2918 is opened


``{_advanceTo=N[start|end]}``

    A regex pattern is applied to the whole file's contents iteratively. This
    long flag specifies from where the pattern should be applied in the next
    iteration for regex matching. When a pattern matches, the next pattern
    matching starts from the start or end of capture group ``N``. By default it
    advances to the end of the whole match (i.e., ``{_advanceTo=0end}`` is
    the default).


    Consider the following input
    ::

        def def abc

    Consider two sets of options, ``foo.ctags`` and ``bar.ctags``.

    .. code-block:: ctags
        :emphasize-lines: 5

        # foo.ctags:
        --langdef=foo
        --langmap=foo:.foo
        --kinddef-foo=a,something,something
        --mline-regex-foo=/def *([a-z]+)/\1/a/{mgroup=1}


    .. code-block:: ctags
        :emphasize-lines: 5

        # bar.ctags:
        --langdef=bar
        --langmap=bar:.bar
        --kinddef-bar=a,something,something
        --mline-regex-bar=/def *([a-z]+)/\1/a/{mgroup=1}{_advanceTo=1start}

    ``foo.ctags`` emits the following tags output::

        def	input.foo	/^def def abc$/;"	a

    ``bar.ctags`` emits the following tags output::

        def	input-0.bar	/^def def abc$/;"	a
        abc	input-0.bar	/^def def abc$/;"	a

    ``_advanceTo=1start`` is specified in ``bar.ctags``.
    This allows ctags to capture ``abc``.

    At the first iteration, the patterns of both
    ``foo.ctags`` and ``bar.ctags`` match as follows
    ::

        0   1       (start)
        v   v
        def def abc
               ^
               0,1  (end)

    ``def`` at group 1 is captured as a tag in
    both languages. At the next iteration, the positions
    where the pattern matching is applied are not the
    same in the two languages.

    ``foo.ctags``
    ::

               0end (default)
               v
        def def abc


    ``bar.ctags``
    ::

            1start (as specified in _advanceTo long flag)
            v
        def def abc

    This difference in positions accounts for the difference in tags output.

    A more relevant use-case is when ``{_advanceTo=N[start|end]}`` is used in
    the experimental ``--_mtable-regex-<LANG>``, to "advance" back to the
    beginning of a match, so that one can generate multiple tags for the same
    input line(s).

.. note:: This flag doesn't work well with scope related flags and ``exclusive`` flags.


.. Q: this was previously titled "Byte oriented pattern matching...", presumably
   because it "matched against the input at the current byte position, not line".
   But that's also true for --mline-regex-<LANG>, as far as I can tell.

Advanced pattern matching with multiple regex tables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note:: This is a highly experimental feature. This will not go into
   the man page of 6.0. But let's be honest, it's the most exciting feature!

In some cases, the ``--regex-<LANG>`` and ``--mline-regex-<LANG>`` options are not
sufficient to generate the tags for a particular language. Some of the common
reasons for this are:

* To ignore commented lines or sections for the language file, so that
  tags aren't generated for symbols that are within the comments.
* To enter and exit scope, and use it for tagging based on contextual
  state or with end-scope markers that are difficult to match to their
  associated scope entry point.
* To support nested scopes.
* To change the pattern searched for, or the resultant tag for the same
  pattern, based on scoping or contextual location.
* To break up an overly complicated ``--mline-regex-<LANG>`` pattern into
  separate regex patterns, for performance or readability reasons.

To help handle such things, Universal Ctags has been enhanced with multi-table
regex matching. The feature is inspired by `lex`, the fast lexical analyzer
generator, which is a popular tool on Unix environments for writing parsers, and
`RegexLexer <http://pygments.org/docs/lexerdevelopment/>`_ of Pygments.
Knowledge about them will help you understand the new options.

The new options are:

``--_tabledef-<LANG>``
    Declares a new regex matching table of a given name for the language,
    as described in ":ref:`tabledef`".

``--_mtable-regex-<LANG>``
    Adds a regex pattern and associated tag generation information and flags to
    the given table, as described in ":ref:`mtable_regex`".

``--_mtable-extend-<LANG>``
    Includes a previously-defined regex table in the named one.

The above will be discussed in more detail shortly.

First, let's explain the feature with an example. Consider an
imaginary language `X` with a syntax similar to JavaScript: ``var`` is
used for defining variable(s), and "``/* ... */``" is used for block
comments.

Here is our input, :file:`input.x`:

.. code-block:: java

    /* BLOCK COMMENT
    var dont_capture_me;
    */
    var a /* ANOTHER BLOCK COMMENT */, b;

We want ctags to capture ``a`` and ``b`` - but it is difficult to write a parser
that will ignore ``dont_capture_me`` in the comment with a classical regex
parser defined with ``--regex-<LANG>`` or ``--mline-regex-<LANG>``, because of
the block comments.

The ``--regex-<LANG>`` option only works on one line at a time, so it cannot
know that ``dont_capture_me`` is within a comment. The ``--mline-regex-<LANG>``
option could do it in theory, but due to the greedy nature of the regex engine
it is impractical and potentially inefficient to do so, given that there could
be multiple block comments in the file, with '``*``' inside them, etc.

A parser written with multi-table regex, on the other hand, can capture only
``a`` and ``b`` safely. But it is more complicated to understand.

Here is the 1st version of :file:`X.ctags`:

.. code-block:: ctags

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

Not so interesting. It doesn't really *do* anything yet.
It just creates a new
language named ``X``, for files ending with a :file:`.x` suffix, and defines a
new tag kind, ``var`` (letter ``v``), for variables.

When writing a multi-table parser, you have to think about the necessary states
of parsing. For the parser of language `X`, we need the following states:

* `toplevel` (initial state)
* `comment` (inside comment)
* `vars` (var statements)

.. _tabledef:

Declaring a new regex table
......................................................................

Before adding regular expressions, you have to declare a table for each state
with the ``--_tabledef-<LANG>=<TABLE>`` option.

Here is the 2nd version of :file:`X.ctags` doing so:

.. code-block:: ctags
    :emphasize-lines: 5-7

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars

For table names, only characters in the range ``[0-9a-zA-Z_]`` are acceptable.

For a given language, for each file's input the ctags multi-table parser begins
with the first declared table.
For :file:`X.ctags`, that is ``toplevel``.
The other tables are only ever entered and checked if a pattern in another table
says to switch to them, starting with the first table. In other words, if the
first declared table does not find a match for the current input, and does not
specify to go to another table, the other tables for that language won't be used.
The flags to go to another table are ``{tenter}``, ``{tleave}``, and ``{tjump}``,
as described later.

.. _mtable_regex:

Adding a regex to a regex table
......................................................................

The new option to add a regex to a declared table is ``--_mtable-regex-<LANG>``,
and it follows this form:

.. code-block:: ctags

    --_mtable-regex-<LANG>=<TABLE>/<PATTERN>/<NAME>/[<KIND>]/LONGFLAGS

The parameters for ``--_mtable-regex-<LANG>`` look complicated. However,
``<PATTERN>``, ``<NAME>``, and ``<KIND>`` are the same as the parameters of the
``--regex-<LANG>`` and ``--mline-regex-<LANG>`` options. ``<TABLE>`` is simply
the name of a table previously declared with the ``--_tabledef-<LANG>`` option.

A regex pattern added to a parser with ``--_mtable-regex-<LANG>`` is matched
against the input at the current byte position, not line.
Even if you do not
specify the '``^``' anchor at the start of the pattern, ctags adds '``^``' to
the pattern automatically. Unlike the ``--regex-<LANG>`` and
``--mline-regex-<LANG>`` options, a '``^``' anchor does not mean "beginning of
line" in ``--_mtable-regex-<LANG>``; instead it means the beginning of the
input string (i.e., the current byte position).

The ``LONGFLAGS`` include the already discussed flags for ``--regex-<LANG>`` and
``--mline-regex-<LANG>``: ``{scope=...}``, ``{mgroup=N}``, ``{_advanceTo=N}``,
``{basic}``, ``{extend}``, and ``{icase}``. The ``{exclusive}`` flag does not
make sense for multi-table regex.

In addition, several new flags are introduced exclusively for multi-table
regex use:

``{tenter}``
    Push the current table on the stack, and enter another table.

``{tleave}``
    Leave the current table, pop the stack, and go to the table that was
    just popped from the stack.

``{tjump}``
    Jump to another table, without affecting the stack.

``{treset}``
    Clear the stack, and go to another table.

``{tquit}``
    Clear the stack, and stop processing the current input file for this
    language.

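
As a rough illustration of how these flags might sit together in one option
file, here is a minimal sketch. Everything specific in it is an assumption
made for the illustration: the language ``Y``, the table names, and the
``__TRAILER__`` and ``__END__`` markers are hypothetical, and the ``=<TABLE>``
argument of ``{tjump}`` is written by analogy with the ``{tenter=<TABLE>}``
form used elsewhere in this section.

.. code-block:: ctags

    # Hypothetical language "Y" with three states (tables).
    --langdef=Y
    --map-Y=.y
    --_tabledef-Y=main
    --_tabledef-Y=string
    --_tabledef-Y=trailer

    # {tenter}: push "main" on the stack and enter "string" at an opening quote.
    --_mtable-regex-Y=main/"//{tenter=string}
    # {tjump}: switch to "trailer" without touching the stack.
    --_mtable-regex-Y=main/__TRAILER__//{tjump=trailer}
    # Fallback: consume one character so matching can continue.
    --_mtable-regex-Y=main/.//

    # {tleave}: pop the stack and return to the table that entered "string".
    --_mtable-regex-Y=string/"//{tleave}
    --_mtable-regex-Y=string/.//

    # {tquit}: clear the stack and stop processing this input file.
    --_mtable-regex-Y=trailer/__END__//{tquit}
    --_mtable-regex-Y=trailer/.//

None of these lines generates a tag; the sketch only shows the table
transitions. Each table ends with a catch-all ``.`` pattern so that an
unmatched character does not end processing early.
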

To explain the above new flags, we'll continue using our example in the
next section.

Skipping block comments
......................................................................

Let's continue with our example. Here is the 3rd version of :file:`X.ctags`:

.. code-block:: ctags
    :emphasize-lines: 9-13
    :linenos:

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}
    --_mtable-regex-X=toplevel/.//

    --_mtable-regex-X=comment/\*\///{tleave}
    --_mtable-regex-X=comment/.//

Four ``--_mtable-regex-X`` lines are added for skipping the block comments. Let's
discuss them one by one.

For each new file it scans, ctags always chooses the first pattern of the
first table of the parser. Even if it's an empty table, ctags will only try
the first declared table.
(In such a case, it would immediately fail to match
anything, and thus stop processing the input file and effectively do nothing.)

The first declared table (``toplevel``) has the following regex added to
it first:

.. code-block:: ctags
    :linenos:
    :lineno-start: 9

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}

A pattern of ``\/\*`` is added to the ``toplevel`` table, to match the
beginning of a block comment. A backslash character is used in front of the
leading '``/``' to escape the separation character '``/``' that separates the fields
of ``--_mtable-regex-<LANG>``. Another backslash inside the pattern is used
before the asterisk '``*``', to make it a literal asterisk character in regex.

The last ``//`` means ctags should not tag something matching this pattern.
In ``--regex-<LANG>`` you never use ``//`` because it would be pointless to
match something and not tag it using a single-line ``--regex-<LANG>``; in
multi-line ``--mline-regex-<LANG>`` you rarely see it, because it would rarely
be useful. But in multi-table regex it's quite common, since you frequently
want to transition from one state to another (i.e., ``tenter`` or ``tjump``
from one table to another).

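
To make the contrast concrete, the two lines below are taken from the
versions of :file:`X.ctags` developed in this section. They differ in
whether the ``<NAME>`` and ``<KIND>`` fields are filled. The ``#`` comment
lines are added here only as annotation.

.. code-block:: ctags

    # Empty <NAME> and <KIND>: nothing is tagged, only the table is switched.
    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}
    # <NAME> is \1 and <KIND> is v: the captured identifier is tagged.
    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
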

The long flag added to the first regex of our first table is ``tenter``, which
pushes the current table on the stack and switches to another table.
``{tenter=comment}`` means "switch the table from toplevel to comment".

So given the input file :file:`input.x` shown earlier, ctags will begin at
the ``toplevel`` table and try to match the first regex. It will succeed, and
thus push ``toplevel`` on the stack and go to the ``comment`` table.

It will begin at the top of the ``comment`` table (it always begins at the top
of a given table), and try each regex line in sequence until it finds a match.
If it fails to find a match, it will pop the stack and go to the table that was
just popped from the stack, and begin trying to match at the top of *that* table.
If it continues failing to find a match, and ultimately reaches the end of the
stack, it will stop processing for this file. For the next input file, it will
begin again from the top of the first declared table.

Getting back to our example, the top of the ``comment`` table has this regex:

.. code-block:: ctags
    :linenos:
    :lineno-start: 12

    --_mtable-regex-X=comment/\*\///{tleave}

Similar to the previous ``toplevel`` table pattern, this one for ``\*\/`` uses
a backslash to escape the separator '``/``', as well as one before the '``*``' to
make it a literal asterisk in regex. So what it's looking for, from a simple
string perspective, is the sequence ``*/``. Note that this means even though
you see three slashes ``///`` at the end, the first one is escaped and used
for the pattern itself, and the ``--_mtable-regex-X`` option only has ``//`` to
separate the regex pattern from the long flags, instead of the usual ``///``.
Thus it's using the shorthand form of the ``--_mtable-regex-X`` option.
It could instead have been:

.. code-block:: ctags

    --_mtable-regex-X=comment/\*\////{tleave}

The above would have worked exactly the same.


Getting back to our example, remember we're looking at the :file:`input.x`
file, currently using the ``comment`` table, and trying to match the first
regex of that table, shown above, at the following location::

           ,ctags is trying to match starting here
          v
        /* BLOCK COMMENT
        var dont_capture_me;
        */
        var a /* ANOTHER BLOCK COMMENT */, b;

The pattern doesn't match for the position just after ``/*``, because that
position is a space character. So ctags tries the next pattern in the same
table:

.. code-block:: ctags
    :linenos:
    :lineno-start: 13

    --_mtable-regex-X=comment/.//

This pattern matches any one character, including newline; the current
position moves one character forward. Now the character at the current position is
'``B``'. The first pattern of the table, which looks for ``*/``, still does not
match the input, so ctags uses the next pattern again. When the current position
moves to the ``*/`` of the 3rd line of :file:`input.x`, it will finally match this:

.. code-block:: ctags
    :linenos:
    :lineno-start: 12

    --_mtable-regex-X=comment/\*\///{tleave}

In this pattern, the long flag ``{tleave}`` is specified. This triggers table
switching again. ``{tleave}`` makes ctags switch the table back to the last
table used before doing ``{tenter}``. In this case, ``toplevel`` is the table.
ctags manages a stack where references to tables are put. ``{tenter}`` pushes
the current table to the stack. ``{tleave}`` pops the table at the top of the
stack and chooses it.

So now ctags is back to the ``toplevel`` table, and tries the first regex
of that table, which was this:

.. code-block:: ctags
    :linenos:
    :lineno-start: 9

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}

It tries to match that against its current position, which is now the
newline on line 3, between the ``*/`` and the word ``var``::

    /* BLOCK COMMENT
    var dont_capture_me;
    */ <--- ctags is now at this newline (\n) character
    var a /* ANOTHER BLOCK COMMENT */, b;

The first regex of the ``toplevel`` table does not match a newline, so it tries
the second regex:

.. code-block:: ctags
    :linenos:
    :lineno-start: 10

    --_mtable-regex-X=toplevel/.//

This matches a newline successfully, but has no actions to perform. So ctags
moves one character forward (the newline it just matched), and goes back to the
top of the ``toplevel`` table, and tries the first regex again. Eventually we'll
reach the beginning of the second block comment, and do the same things as before.

When ctags finally reaches the end of the file (the position after ``b;``),
it will not be able to match either the first or second regex of the
``toplevel`` table, and will quit processing the input file.

So far, we've successfully skipped over block comments for our new ``X``
language, but haven't generated any tags. The point of ctags is to generate
tags, not just keep your computer warm. So now let's move onto actually tagging
variables...


Capturing variables in a sequence
......................................................................

Here is the 4th version of :file:`X.ctags`:

.. code-block:: ctags
    :emphasize-lines: 10,16-19
    :linenos:

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}
    --_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}
    --_mtable-regex-X=toplevel/.//

    --_mtable-regex-X=comment/\*\///{tleave}
    --_mtable-regex-X=comment/.//

    --_mtable-regex-X=vars/;//{tleave}
    --_mtable-regex-X=vars/\/\*//{tenter=comment}
    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
    --_mtable-regex-X=vars/.//

One pattern in ``toplevel`` was added, and a new table ``vars`` with four
patterns was also added.

The new regex in ``toplevel`` is this:

.. code-block:: ctags
    :linenos:
    :lineno-start: 10

    --_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}

The purpose of this being in ``toplevel`` is to switch to the ``vars`` table when
the keyword ``var`` is found in the input stream.
We need to switch states
(i.e., tables) because we can't simply capture the variables ``a`` and ``b``
with a single regex pattern in the ``toplevel`` table, because there might be
block comments inside the ``var`` statement (as there are in our
:file:`input.x`), and we also need to create *two* tags: one for ``a`` and one
for ``b``, even though the word ``var`` only appears once. In other words, we
need to "remember" that we saw the keyword ``var``, when we later encounter the
names ``a`` and ``b``, so that we know to tag each of them; and saving that
"in-variable-statement" state is accomplished by switching tables to the
``vars`` table.

The first regex in our new ``vars`` table is:

.. code-block:: ctags
    :linenos:
    :lineno-start: 16

    --_mtable-regex-X=vars/;//{tleave}

This pattern is used to match a single semicolon '``;``', and if it matches,
to pop back to the ``toplevel`` table using the ``{tleave}`` long flag. We
didn't have to make this the first regex pattern, because it doesn't overlap
with any of the other ones other than the ``/.//`` last one (which must be
last for this example to work).

The second regex in our ``vars`` table is:

.. code-block:: ctags
    :linenos:
    :lineno-start: 17

    --_mtable-regex-X=vars/\/\*//{tenter=comment}

We need this because block comments can be in variable definitions::

    var a /* ANOTHER BLOCK COMMENT */, b;

So to skip block comments in such a position, the pattern ``\/\*`` is used just
like it was used in the ``toplevel`` table: to find the literal ``/*`` beginning
of the block comment and enter the ``comment`` table. Because we're using
``{tenter}`` and ``{tleave}`` to push/pop from a stack of tables, we can
use the same ``comment`` table for both ``toplevel`` and ``vars`` to go to,
because ctags will *remember* the previous table and ``{tleave}`` will
pop back to the right one.

The third regex in our ``vars`` table is:

.. code-block:: ctags
    :linenos:
    :lineno-start: 18

    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/

This is nothing special, but is the one that actually tags something: it
captures the variable name and uses it for generating a tag of the ``var``
kind (letter ``v``).

The last regex in the ``vars`` table we've seen before:

.. code-block:: ctags
    :linenos:
    :lineno-start: 19

    --_mtable-regex-X=vars/.//

This makes ctags ignore any other characters, such as whitespace or the
comma '``,``'.


Running our example
......................................................................

.. code-block:: console

    $ cat input.x
    /* BLOCK COMMENT
    var dont_capture_me;
    */
    var a /* ANOTHER BLOCK COMMENT */, b;

    $ u-ctags -o - --fields=+n --options=X.ctags input.x
    a	input.x	/^var a \/* ANOTHER BLOCK COMMENT *\/, b;$/;"	v	line:4
    b	input.x	/^var a \/* ANOTHER BLOCK COMMENT *\/, b;$/;"	v	line:4

It works!

You can find additional examples of multi-table regex in our GitHub repo, under
the ``optlib`` directory. For example, ``puppetManifest.ctags`` is a serious
example. It is the primary parser for testing multi-table regex parsers, and is
used in the actual ctags program for parsing Puppet manifest files.


.. _guest-regex-flag:

Scheduling a guest parser with ``_guest`` regex flag
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. NOT REVIEWED YET

With the ``_guest`` regex flag, you can run a parser (a guest parser) on an
area of the current input file.
See ":ref:`host-guest-parsers`" about the concept of the guest parser.

The ``_guest`` regex flag specifies a *guest spec*, and attaches it to
the associated regex pattern.

A guest spec has three fields: *<PARSER>*, *<START>* of area, and *<END>* of area.
The ``_guest`` regex flag has the following form::

    {_guest=<PARSER>,<START>,<END>}

ctags maintains a data structure called a *guest request* during parsing. A
guest request also has three fields: `parser`, `start of area`, and
`end of area`.

You, a parser developer, have to fill the fields of guest specs.
ctags consults the guest spec when matching the regex pattern
associated with it, tries to fill the fields of the guest request,
and runs a guest parser when all the fields of the guest request are
filled.

If you use `Multi-line pattern match`_ to define a host parser,
you must specify all the fields of `guest request`.


On the other hand, if you don't use `Multi-line pattern match`_ to define a host parser,
ctags can fill the fields of `guest request` incrementally; more than
one guest spec is used to fill the fields. In other words, you can
leave some of the fields of a guest spec empty.

The *<PARSER>* field of ``_guest`` regex flag
......................................................................

For *<PARSER>*, you can specify one of the following items:

a name of a parser

    If you know the guest parser you want to run before parsing
    the input file, specify the name of the parser. Aliases of parsers
    are also considered when finding a parser for the name.

    An example of running the C parser as a guest parser::

        {_guest=C,...

the group number of a regex pattern started from '``\``' (backslash)

    If a parser name appears in an input file, write a regex pattern
    to capture the name. Specify the number of the group that stores
    the name as *<PARSER>*. In such a case, use '``\``' as the prefix for
    the number. Aliases of parsers are also considered when finding
    a parser for the name.

    Let's see an example. GitHub Flavored Markdown (GFM) is a language for
    documentation.
    It provides a notation for quoting a snippet of
    program code; the language treats the area from one ``~~~`` to the
    next ``~~~`` as a snippet. You can specify the programming language of
    the snippet by starting the area with
    ``~~~<THE_NAME_OF_LANGUAGE>``, like ``~~~C`` or ``~~~Java``.

    To run a guest parser on the area, you have to capture the
    *<THE_NAME_OF_LANGUAGE>* with a regex pattern:

    .. code-block:: ctags

        --_mtable-regex-Markdown=main/~~~([a-zA-Z0-9][-#+a-zA-Z0-9]*)[\n]//{_guest=\1,0end,}

    The pattern captures the language name in the input file with the
    regex group 1, and specifies it as *<PARSER>*::

        {_guest=\1,...

the group number of a regex pattern started from '``*``' (asterisk)

    If a file name implying a programming language appears in an input
    file, capture the file name with the regex pattern that the guest
    spec is attached to. ctags tries to find a proper parser for the
    file name by consulting the langmap.

    Use '``*``' as the prefix to the number for specifying the group of
    the regex pattern that captures the file name.

    Let's see an example. Suppose you have a shell script that emits
    program code instantiated from one of several templates.
    Here documents
    are used to represent the templates like:

    .. code-block:: sh

        i=...
        cat > foo.c <<EOF
        int main (void) { return $i; }
        EOF

        cat > foo.el <<EOF
        (defun foo () (1+ $i))
        EOF

    To run guest parsers for the here document areas, the shell
    script parser of ctags must choose the parsers from the file
    names (``foo.c`` and ``foo.el``):

    .. code-block:: ctags

        --regex-sh=/cat > ([a-z.]+) <<EOF//{_guest=*1,0end,}

    The pattern captures the file name in the input file with the
    regex group 1, and specifies it as *<PARSER>*::

        {_guest=*1,...

The *<START>* and *<END>* fields of ``_guest`` regex flag
......................................................................

The *<START>* and *<END>* fields specify the area the *<PARSER>* parses. *<START>*
specifies the start of the area. *<END>* specifies the end of the area.

The forms of the two fields are the same: a regex group number
followed by ``start`` or ``end``, e.g. ``3start`` or ``0end``.
The suffixes,
``start`` and ``end``, each represent one of the two boundaries of the group.

Let's see an example::

    {_guest=C,2end,3start}

This guest regex flag means running the C parser on the area between
``2end`` and ``3start``. ``2end`` means the area starts from the end of
the match of the 2nd regex group associated with the flag. ``3start``
means the area ends at the beginning of the match of the 3rd regex
group associated with the flag.

Let's see a more realistic example.
Here is an optlib file for an imaginary language `single`:

.. code-block:: ctags
    :emphasize-lines: 3

    --langdef=single
    --map-single=.single
    --regex-single=/^(BEGIN_C<).*(>END_C)$//{_guest=C,1end,2start}

This parser can run the C parser and extract the ``main`` function from the
following input file::

    BEGIN_C<int main (int argc, char **argv) { return 0; }>END_C
           ^                                              ^
            `- "1end" points here.                        |
                                   "2start" points here. -+

.. NOT REVIEWED YET

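
To make the meaning of ``1end`` and ``2start`` concrete, the following Python
sketch computes the same area with Python's ``re`` module. It only models how
the *<START>* and *<END>* fields select offsets in the matched text; it is not
how ctags implements guest parsers. ``guest_area`` is a hypothetical helper
introduced for illustration, and Python's regex syntax is assumed to be close
enough to POSIX ERE for this particular pattern.

.. code-block:: python

    import re

    # The pattern from the ``single`` optlib example.
    PATTERN = re.compile(r"^(BEGIN_C<).*(>END_C)$")


    def guest_area(line, start_spec, end_spec):
        """Return the text between two group boundaries.

        Each spec is a (group number, "start" or "end") pair, mirroring
        fields such as ``1end`` and ``2start``. This helper is hypothetical.
        """
        match = PATTERN.match(line)
        if match is None:
            return None

        def offset(spec):
            group, side = spec
            # "start" is the left boundary of the group, "end" the right one.
            return match.start(group) if side == "start" else match.end(group)

        return line[offset(start_spec):offset(end_spec)]


    line = "BEGIN_C<int main (int argc, char **argv) { return 0; }>END_C"
    area = guest_area(line, (1, "end"), (2, "start"))
    print(area)  # int main (int argc, char **argv) { return 0; }

Group 0 stands for the whole match, so ``(0, "start")`` and ``(0, "end")``
select the entire matched text in this model.
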
.. _defining-subparsers:

Defining a subparser
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Basic
.........................................................................

About the concept of subparser, see ":ref:`base-sub-parsers`".

The ``--langdef=<LANG>`` option is extended as
``--langdef=<LANG>[{base=<LANG>}[{shared|dedicated|bidirectional}]][{_autoFQTag}]`` to define
a subparser for a specified base parser. Combined with the ``--kinddef-<LANG>``
and ``--regex-<LANG>`` options, it lets you extend an existing parser
without risk of kind conflicts.

Let's see an example.

input.c

.. code-block:: C

    static int set_one_prio(struct task_struct *p, int niceval, int error)
    {
    }

    SYSCALL_DEFINE3(setpriority, int, which, int, who, int, niceval)
    {
        ...;
    }

.. code-block:: console

    $ ctags -x --_xformat="%20N %10K %10l" -o - input.c
            set_one_prio   function          C
         SYSCALL_DEFINE3   function          C

The C parser doesn't understand that ``SYSCALL_DEFINE3`` is a macro for defining an
entry point for a system call.

Let's define a `linux` subparser which uses the C parser as a base parser (``linux.ctags``):

.. code-block:: ctags
    :emphasize-lines: 1,3

    --langdef=linux{base=C}
    --kinddef-linux=s,syscall,system calls
    --regex-linux=/SYSCALL_DEFINE[0-9]\(([^, )]+)[\),]*/\1/s/

The output changes as follows with the `linux` parser:

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=./linux.ctags -x --_xformat="%20N %10K %10l" -o - input.c
             setpriority    syscall      linux
            set_one_prio   function          C
         SYSCALL_DEFINE3   function          C

``setpriority`` is recognized as a ``syscall`` of `linux`.

You can capture ``setpriority`` using only ``--regex-C=...``.
However, there are concerns about kind conflicts: when introducing
a new kind with ``--regex-C=...``, you cannot use a letter or name already
used in the C parser or in ``--regex-C=...`` options specified in other places.

You can use a newly defined subparser as a new namespace of kinds.
In addition, you can enable or disable the subparser with the
``--languages=[+|-]`` option:

.. code-block:: console

    $ ctags --options=./linux.ctags --languages=-linux -x --_xformat="%20N %10K %10l" -o - input.c
            set_one_prio   function          C
         SYSCALL_DEFINE3   function          C

.. _optlib_directions:

Direction flags
.........................................................................

.. TESTCASE: Units/flags-langdef-directions.r

As explained in ":ref:`multiple_parsers_directions`" in
":ref:`multiple_parsers`", you can use direction flags to choose the
direction(s) in which a base parser and a subparser work together.

The following examples are taken from `#1409
<https://github.com/universal-ctags/ctags/issues/1409>`_ submitted by @sgraham on
the Universal Ctags repository on GitHub.

``input.cc`` and ``input.mojom`` are input files, and have the same
contents::

     ABC();
    int main(void)
    {
    }

The C++ parser can capture ``main`` as a function.
The `Mojom` subparser, defined
later, runs on top of the C++ parser and captures ``ABC``.

shared combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{shared}`` is specified, for ``input.cc``, tags captured by both the C++ parser
and the mojom parser are recorded to the tags file. For ``input.mojom``, only
tags captured by the mojom parser are recorded to the tags file.

mojom-shared.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=mojom{base=C++}{shared}
    --map-mojom=+.mojom
    --kinddef-mojom=f,function,functions
    --regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=mojom-shared.ctags --fields=+l -o - input.cc
    ABC	input.cc	/^ ABC();$/;"	f	language:mojom
    main	input.cc	/^int main(void)$/;"	f	language:C++	typeref:typename:int

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=mojom-shared.ctags --fields=+l -o - input.mojom
    ABC	input.mojom	/^ ABC();$/;"	f	language:mojom

The mojom parser uses the C++ parser internally, but tags captured by the C++ parser are
dropped in the output.

dedicated combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{dedicated}`` is specified, for ``input.cc``, only tags captured by the C++
parser are recorded to the tags file. For ``input.mojom``, tags captured
by both the C++ parser and the mojom parser are recorded to the tags file.

mojom-dedicated.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=mojom{base=C++}{dedicated}
    --map-mojom=+.mojom
    --kinddef-mojom=f,function,functions
    --regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/

.. code-block:: console

    $ ctags --options=mojom-dedicated.ctags --fields=+l -o - input.cc
    main	input.cc	/^int main(void)$/;"	f	language:C++	typeref:typename:int

.. code-block:: console
    :emphasize-lines: 2-3

    $ ctags --options=mojom-dedicated.ctags --fields=+l -o - input.mojom
    ABC	input.mojom	/^ ABC();$/;"	f	language:mojom
    main	input.mojom	/^int main(void)$/;"	f	language:C++	typeref:typename:int

The mojom parser works only when a ``.mojom`` file is given as input.

bidirectional combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{bidirectional}`` is specified, tags captured by both the C++ parser and
the mojom parser are recorded to the tags file for both ``input.cc`` and
``input.mojom``.

mojom-bidirectional.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=mojom{base=C++}{bidirectional}
    --map-mojom=+.mojom
    --kinddef-mojom=f,function,functions
    --regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=mojom-bidirectional.ctags --fields=+l -o - input.cc
    ABC	input.cc	/^ ABC();$/;"	f	language:mojom
    main	input.cc	/^int main(void)$/;"	f	language:C++	typeref:typename:int

.. code-block:: console
    :emphasize-lines: 2-3

    $ ctags --options=mojom-bidirectional.ctags --fields=+l -o - input.mojom
    ABC	input.mojom	/^ ABC();$/;"	f	language:mojom
    main	input.mojom	/^int main(void)$/;"	f	language:C++	typeref:typename:int


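
The three direction flags can be summarized as a small lookup table. The
following Python sketch is only a model of the behavior described in the
examples above, not how ctags implements direction flags; the names
``RECORDED`` and ``recorded_parsers`` are made up for illustration.

.. code-block:: python

    # Which parsers' tags reach the tags file, keyed by direction flag and
    # then by the parser selected for the input file ("C++" for input.cc,
    # "mojom" for input.mojom). The table restates the examples above.
    RECORDED = {
        "shared": {"C++": {"C++", "mojom"}, "mojom": {"mojom"}},
        "dedicated": {"C++": {"C++"}, "mojom": {"C++", "mojom"}},
        "bidirectional": {"C++": {"C++", "mojom"}, "mojom": {"C++", "mojom"}},
    }


    def recorded_parsers(direction, selected_parser):
        """Return the sorted parser names whose tags are recorded."""
        return sorted(RECORDED[direction][selected_parser])


    for direction in ("shared", "dedicated", "bidirectional"):
        for selected in ("C++", "mojom"):
            print(direction, selected, recorded_parsers(direction, selected))

Reading the table row by row: ``shared`` lets the subparser add tags to files
of the base parser, ``dedicated`` lets the base parser add tags to files of
the subparser, and ``bidirectional`` allows both.
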
.. _optlib2c:

Translating an option file into C source code (optlib2c)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Universal Ctags has an ``optlib2c`` script that translates an option file into C
source code. Your optlib parser can thus easily become a built-in parser.

To add your optlib file, ``foo.ctags``, to ctags, do the following steps:

* copy the ``foo.ctags`` file into the ``optlib/`` directory
* add ``foo.ctags`` to the ``OPTLIB2C_INPUT`` variable in ``source.mak``
* add ``fooParser`` to the ``PARSER_LIST`` macro variable in ``main/parser_p.h``

You are encouraged to submit your :file:`.ctags` file to our repository on
GitHub through a pull request. See ":ref:`contributions`" for more details.