.. _optlib:

Extending ctags with Regex parser (*optlib*)
---------------------------------------------------------------------

:Maintainer: Masatake YAMATO <yamato@redhat.com>

.. contents:: `Table of contents`
    :depth: 3
    :local:

.. TODO:
    add a section on debugging

Exuberant Ctags allows a user to add a new parser to ctags with the
``--langdef=<LANG>`` and ``--regex-<LANG>=...`` options.
Universal Ctags follows and extends the design of Exuberant Ctags in more
powerful ways and calls the feature an *optlib parser*, which is described in
:ref:`ctags-optlib(7) <ctags-optlib(7)>` and the following sections.

:ref:`ctags-optlib(7) <ctags-optlib(7)>` is the primary document of the optlib
parser feature. The following sections provide additional information and more
advanced features. Note that some of the features are experimental, and will be
marked as such in the documentation.

Many optlib parsers are included in Universal Ctags; see
`optlib/*.ctags <https://github.com/universal-ctags/ctags/tree/master/optlib>`_.
They are good examples to study when you develop your own parsers.

An optlib parser can be translated into C source code. Your optlib parser can
thus easily become a built-in parser. See ":ref:`optlib2c`" for details.

Regular expression (regex) engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, Universal Ctags uses the same `POSIX Extended Regular Expressions (ERE)
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>`_
syntax as Exuberant Ctags.

When Universal Ctags is built, the ``configure`` script runs compatibility
tests of the regex engine in the system library. If the tests pass, that engine is
used; otherwise the regex engine imported from `the GNU Gnulib library
<https://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>`_
is used.
In the latter case, the output of ``ctags --list-features`` will contain
``gnulib_regex``.

See ``regex(7)`` or `the GNU Gnulib Manual
<https://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>`_
for the details of the regular expression syntax.

.. note::

    The GNU regex engine supports some GNU extensions described `here
    <https://www.gnu.org/software/gnulib/manual/gnulib.html#posix_002dextended-regular-expression-syntax>`_.
    Note that an optlib parser using the extensions may not work with Universal
    Ctags on some other systems.

The POSIX Extended Regular Expressions (ERE) syntax does
*not* support many of the "modern" extensions such as lazy (non-greedy) quantifiers,
non-capturing grouping, atomic grouping, possessive quantifiers, look-ahead/behind,
etc. It may also be notoriously slow when backtracking.

A common error is forgetting that a
POSIX ERE engine is always *greedy*; the '``*``' and '``+``' quantifiers match
as much as possible, before backtracking from the end of their match.

For example this pattern::

    foo.*bar

will match this entire string, not just the first part::

    foobar, bar, and even more bar

Another detail to keep in mind is how the regex engine treats newlines.
Universal Ctags compiles the regular expressions in the ``--regex-<LANG>`` and
``--mline-regex-<LANG>`` options with ``REG_NEWLINE`` set. What that means is documented
in the
`POSIX specification <https://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html>`_.
One obvious effect is that the regex special dot any-character '``.``' does not match
newline characters, the '``^``' anchor *does* match right after a newline, and
the '``$``' anchor matches right before a newline.
A more subtle issue is this text from the
chapter "`Regular Expressions <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>`_":
"the use of literal <newline>s or any escape sequence equivalent produces undefined
results". This means that a regex pattern containing ``[^\n]+`` is invalid,
and indeed in glibc it produces very odd results. **Never use** '``\n``' in patterns
for ``--regex-<LANG>``, and **never use it** in non-matching bracket expressions
for ``--mline-regex-<LANG>`` patterns. For the experimental ``--_mtable-regex-<LANG>``
you can safely use '``\n``' because that regex is not compiled with ``REG_NEWLINE``.

The POSIX ERE engine may also have some known "quirks"
with respect to escaping special characters in bracket expressions.
For example, a pattern of ``[^\]]+`` is invalid in POSIX ERE, because the '``]``' is
*not* special inside a bracket expression, and thus should **not** be escaped.
Most regex engines ignore this subtle detail in POSIX ERE, and instead allow
escaping it with '``\]``' inside the bracket expression and treat it as the
literal character '``]``'. GNU glibc, however, does not generate an error but
instead considers it undefined behavior, and in fact it will match very odd
things. Instead you **must** use the less intuitive ``[^]]+`` syntax. The same
is technically true of other special characters inside a bracket expression,
such as ``[^\)]+``, which should instead be ``[^)]+``. The ``[^\)]+`` will
usually appear to work, but only because what it is really doing is matching any
character but '``\``' *or* '``)``'. The only exceptions for using '``\``' inside a
bracket expression are for '``\t``' and '``\n``', which ctags converts to their
single literal character control codes before passing the pattern to glibc.

You should always test your regex patterns against test files with strings that
do and do not match.
Pay particular attention to when it should *not* match, and
how *much* it matches when it should.

Perl-compatible regular expressions (PCRE2) engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Universal Ctags optionally supports `Perl-Compatible Regular Expressions (PCRE2)
<https://www.pcre.org/current/doc/html/pcre2syntax.html>`_ syntax,
but only if Universal Ctags is built with the ``pcre2`` library.
See the output of the ``--list-features`` option to find out whether your Universal
Ctags is built with ``pcre2`` or not.

PCRE2 *does* support many "modern" extensions.
For example this pattern::

    foo.*?bar

will match just the first part, ``foobar``, not this entire string::

    foobar, bar, and even more bar

Regex option argument flags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many regex-based options described in this document support additional arguments
in the form of long flags. Long flags are specified with surrounding '``{``' and
'``}``'.

The general format and placement is as follows:

.. code-block:: ctags

    --regex-<LANG>=<PATTERN>/<NAME>/[<KIND>/]LONGFLAGS

Some examples:

.. code-block:: ctags

    --regex-Pod=/^=head1[ \t]+(.+)/\1/c/
    --regex-Foo=/set=([^;]+)/\1/v/{icase}
    --regex-Man=/^\.TH[[:space:]]{1,}"([^"]{1,})".*/\1/t/{exclusive}{icase}{scope=push}
    --regex-Gdbinit=/^#//{exclusive}

Note that the last example only has two '``/``' forward-slashes following
the regex pattern, as a shortened form when no kind-spec exists.

The ``--mline-regex-<LANG>`` option also follows the above format. The
experimental ``--_mtable-regex-<LANG>`` option follows a slightly
modified version as well.

Regex control flags
......................................................................

.. Q: why even discuss the single-character version of the flags?
.. Just make everyone use the long form.

The regex matching can be controlled by adding flags to the ``--regex-<LANG>``,
``--mline-regex-<LANG>``, and experimental ``--_mtable-regex-<LANG>`` options.
This is done by either using the single-character short flags ``b``, ``e`` and
``i`` as explained in the *ctags.1* man page, or by using the long flags
described earlier. The long flags require more typing but are much more
readable.

The mapping between the older short flag names and long flag names is:

=========== =========== ===========
short flag  long flag   description
=========== =========== ===========
b           basic       POSIX basic regular expression syntax.
e           extend      POSIX extended regular expression syntax (default).
i           icase       Case-insensitive matching.
=========== =========== ===========


So the following ``--regex-<LANG>`` expression:

.. code-block:: ctags

    --kinddef-m4=d,definition,definitions
    --regex-m4=/^m4_define\(\[([^]$\(]+).+$/\1/d/e

is the same as:

.. code-block:: ctags

    --kinddef-m4=d,definition,definitions
    --regex-m4=/^m4_define\(\[([^]$\(]+).+$/\1/d/{extend}

The characters '``{``' and '``}``' may not be suitable for command line
use, but long flags are mostly intended for option files.

Exclusive flag in regex
......................................................................

By default, lines read from the input files will be matched against all the
regular expressions defined with ``--regex-<LANG>``. Each successfully matched
regular expression will emit a tag.

In some cases another policy, exclusive-matching, is preferable to the
all-matching policy. Exclusive-matching means that, for a given input line,
the rest of the regular expressions are not tried once one of the regular
expressions has matched successfully.
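
The two policies can be sketched in a few lines of Python. This is only an
illustration of the matching order, not the ctags implementation: the ``scan``
function and the rule list are hypothetical, and Python's ``re`` module stands
in for the POSIX ERE engine.

.. code-block:: python

    import re

    # Hypothetical rule list: (compiled pattern, name template or None, exclusive?)
    RULES = [
        (re.compile(r"^#"), None, True),                   # like /^#//{exclusive}
        (re.compile(r"define ([a-z_]+)"), r"\1", False),   # like /define ([a-z_]+)/\1/
    ]

    def scan(line, rules=RULES):
        """Return the tag names emitted for one input line."""
        tags = []
        for pattern, name, exclusive in rules:
            m = pattern.search(line)
            if m is None:
                continue
            if name is not None:
                tags.append(m.expand(name))
            if exclusive:
                break  # exclusive-matching: skip the remaining rules for this line
        return tags

    print(scan("define foo"))    # ['foo']
    print(scan("# define foo"))  # []

Without the exclusive rule, the second line would still be tried against the
``define`` rule and would emit ``foo``.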

To specify exclusive-matching, the flags ``exclusive`` (long) and ``x``
(short) were introduced. For example, this is used in
:file:`optlib/gdbinit.ctags` for ignoring comment lines in gdb files,
as follows:

.. code-block:: ctags

    --regex-Gdbinit=/^#//{exclusive}

Comments in gdb files start with '``#``', so the above line is the first regex
match line in :file:`gdbinit.ctags`; subsequent regex matches are
not tried for a comment line.

If an empty name pattern (``//``) is used for the ``--regex-<LANG>`` option,
ctags warns about it as a wrong usage of the option. However, if the flag
``exclusive`` or ``x`` is specified, the warning is suppressed.
This is useful for ignoring matched patterns as above.

NOTE: This flag does not make sense in the multi-line ``--mline-regex-<LANG>``
option nor the multi-table ``--_mtable-regex-<LANG>`` option.


Experimental flags
......................................................................

.. note:: These flags are experimental. They apply to all regex option
    types: basic ``--regex-<LANG>``, multi-line ``--mline-regex-<LANG>``,
    and the experimental multi-table ``--_mtable-regex-<LANG>`` option.

``_extra``

    This flag indicates the tag should only be generated if the given
    ``extra`` type is enabled, as explained in ":ref:`extras`".

``_field``

    This flag allows a regex match to add additional custom fields to the
    generated tag entry, as explained in ":ref:`fields`".

``_role``

    This flag allows a regex match to generate a reference tag entry and
    specify the role of the reference, as explained in ":ref:`roles`".

.. NOT REVIEWED YET

``_anonymous=PREFIX``

    This flag allows a regex match to generate an anonymous tag entry.
    ctags gives a name starting with ``PREFIX`` and emits it.
    This flag is useful to record the position for a language object
    having no name. A lambda function in a functional programming
    language is a typical example of a language object having no name.

    Consider the following input (``input.foo``):

    .. code-block:: lisp

        (let ((f (lambda (x) (+ 1 x))))
          ...
          )

    Consider the following optlib file (``foo.ctags``):

    .. code-block:: ctags
        :emphasize-lines: 4

        --langdef=Foo
        --map-Foo=+.foo
        --kinddef-Foo=l,lambda,lambda functions
        --regex-Foo=/.*\(lambda .*//l/{_anonymous=L}

    You can get the following tags file:

    .. code-block:: console

        $ u-ctags --options=foo.ctags -o - /tmp/input.foo
        Le4679d360100 /tmp/input.foo /^(let ((f (lambda (x) (+ 1 x))))$/;" l


.. _extras:

Conditional tagging with extras
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. NEEDS MORE REVIEWS

If a matched pattern should only be tagged when an ``extra`` flag is enabled,
mark the pattern with ``{_extra=XNAME}`` where ``XNAME`` is the name of the
extra. You must define ``XNAME`` with the
``--_extradef-<LANG>=XNAME,DESCRIPTION`` option before defining a regex
marked with ``{_extra=XNAME}``.

.. code-block:: python

    if __name__ == '__main__':
        do_something()

To capture the lines above in a Python program (``input.py``), an ``extra`` flag can
be used.

.. code-block:: ctags
    :emphasize-lines: 1-2

    --_extradef-Python=main,__main__ entry points
    --regex-Python=/^if __name__ == '__main__':/__main__/f/{_extra=main}

The above optlib (``python-main.ctags``) introduces the ``main`` extra to the Python parser.
The pattern matching is done only when the ``main`` extra is enabled.

.. code-block:: console

    $ ctags --options=python-main.ctags -o - --extras-Python='+{main}' input.py
    __main__ input.py /^if __name__ == '__main__':$/;" f

.. TODO: this "fields" section should probably be moved up this document, as a
    subsection in the "Regex option argument flags" section

.. _fields:

Adding custom fields to the tag output
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. NEEDS MORE REVIEWS

Exuberant Ctags allows just one of the specified groups in a regex pattern to
be used as a part of the name of a tag entry.

Universal Ctags allows using the other groups in the regex pattern.
An optlib parser can have its own specific fields. The groups can be used as
values of the fields of a tag entry.

Let's think about `Unknown`, an imaginary language.
Here is a source file (``input.unknown``) written in `Unknown`:

.. code-block:: java

    public func foo(n, m);
    protected func bar(n);
    private func baz(n,...);

With ``--regex-Unknown=...`` Exuberant Ctags can capture ``foo``, ``bar``, and ``baz``
as names. Universal Ctags can attach extra context information to the
names as values for fields. Let's focus on ``bar``. ``protected`` is a
keyword to control how widely the identifier ``bar`` can be accessed.
``(n)`` is the parameter list of ``bar``. ``protected`` and ``(n)`` are
extra context information of ``bar``.

With the following optlib file (``unknown.ctags``), ctags can attach
``protected`` to the field ``protection`` and ``(n)`` to the field ``signature``.

.. code-block:: ctags
    :emphasize-lines: 5-9

    --langdef=unknown
    --kinddef-unknown=f,func,functions
    --map-unknown=+.unknown

    --_fielddef-unknown=protection,access scope
    --_fielddef-unknown=signature,signatures

    --regex-unknown=/^((public|protected|private) +)?func ([^\(]+)\((.*)\)/\3/f/{_field=protection:\1}{_field=signature:(\4)}
    --fields-unknown=+'{protection}{signature}'

For the line ``protected func bar(n);`` you will get the following tags output::

    bar input.unknown /^protected func bar(n);$/;" f protection:protected signature:(n)

Let's look at ``unknown.ctags`` in detail.

.. code-block:: ctags

    --_fielddef-unknown=protection,access scope

``--_fielddef-<LANG>=name,description`` defines a new field for a parser
specified by *<LANG>*. Before defining a new field for the parser,
the parser must be defined with ``--langdef=<LANG>``. ``protection`` is
the field name used in tags output. ``access scope`` is the description
used in the output of ``--list-fields`` and ``--list-fields=Unknown``.

.. code-block:: ctags

    --_fielddef-unknown=signature,signatures

This defines a field named ``signature``.

.. code-block:: ctags

    --regex-unknown=/^((public|protected|private) +)?func ([^\(]+)\((.*)\)/\3/f/{_field=protection:\1}{_field=signature:(\4)}

This option requests making a tag for the name specified by group 3 of the
pattern, attaching group 1 as the value of the ``protection`` field of the tag, and attaching
group 4 as the value of the ``signature`` field of the tag. You can use the long regex flag
``_field`` for attaching fields to a tag with the following notation rule::

    {_field=FIELDNAME:GROUP}


``--fields-<LANG>=[+|-]{FIELDNAME}`` can be used to enable or disable the specified field.

When defining a new parser specific field, it is disabled by default.
Enable the
field explicitly to use the field. See ":ref:`Parser specific fields <parser-specific-fields>`"
about the ``--fields-<LANG>`` option.

The `passwd` parser is a simple example that uses the ``--fields-<LANG>`` option.


.. _roles:

Capturing reference tags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. NOT REVIEWED YET

To make a reference tag with an optlib parser, specify a role with the
``_role`` long regex flag. Let's see an example:

.. code-block:: ctags
    :emphasize-lines: 3-6

    --langdef=FOO
    --kinddef-FOO=m,module,modules
    --_roledef-FOO.m=imported,imported module
    --regex-FOO=/import[ \t]+([a-z]+)/\1/m/{_role=imported}
    --extras=+r
    --fields=+r

A role must be defined before specifying it as the value of the ``_role`` flag.
The ``--_roledef-<LANG>.<KIND>=<ROLE>,<ROLEDESC>`` option is for defining a role.
See the line ``--regex-FOO=...``. In this parser `FOO`, the name of an
imported module is captured as a reference tag with the role ``imported``.

For specifying *<KIND>* where the role is defined, you can use either a
kind letter or a kind name surrounded by '``{``' and '``}``'.

The option has two parameters separated by a comma:

*<ROLE>*

    the role name, and

*<ROLEDESC>*

    the description of the role.

The first parameter is the name of the role. The role is defined in
the kind *<KIND>* of the language *<LANG>*. In the example, the
``imported`` role is defined in the ``module`` kind, which is specified
with ``m``. You can use ``{module}``, the name of the kind, instead.

The kind specified in the ``--_roledef-<LANG>.<KIND>`` option must be
defined *before* using the option. See the description of
``--kinddef-<LANG>`` for defining a kind.

The roles are listed with ``--list-roles=<LANG>``.
The name and description
passed to the ``--_roledef-<LANG>.<KIND>`` option are used in the output like::

    $ ctags --langdef=FOO --kinddef-FOO=m,module,modules \
            --_roledef-FOO.m='imported,imported module' --list-roles=FOO
    #KIND(L/N) NAME     ENABLED DESCRIPTION
    m/module   imported on      imported module


By specifying the ``_role`` regex flag multiple times with different roles, you can
assign multiple roles to a reference tag. Consider the following C input:

.. code-block:: C

    x = 0;
    i += 1;

An ultra fine grained C parser may capture the variable ``x`` with the
``lvalue`` role and the variable ``i`` with the ``lvalue`` and ``incremented``
roles.

You can implement such roles by extending the built-in C parser:

.. code-block:: ctags
    :emphasize-lines: 2-5

    # c-extra.ctags
    --_roledef-C.v=lvalue,locator values
    --_roledef-C.v=incremented,incremented with ++ operator
    --regex-C=/([a-zA-Z_][a-zA-Z_0-9]*) *=/\1/v/{_role=lvalue}
    --regex-C=/([a-zA-Z_][a-zA-Z_0-9]*) *\+=/\1/v/{_role=lvalue}{_role=incremented}

.. code-block:: console

    $ ctags --options=c-extra.ctags --extras=+r --fields=+r -o - input.c
    i input.c /^i += 1;$/;" v roles:lvalue,incremented
    x input.c /^x = 0;$/;" v roles:lvalue


Scope tracking in a regex parser
......................................................................

About the ``{scope=..}`` flag itself for scope tracking, see the "FLAGS FOR
--regex-<LANG> OPTION" section of :ref:`ctags-optlib(7) <ctags-optlib(7)>`.

Example 1:

.. code-block:: python

    # in /tmp/input.foo
    class foo:
        def bar(baz):
            print(baz)
    class goo:
        def gar(gaz):
            print(gaz)

.. code-block:: ctags
    :emphasize-lines: 7,8

    # in /tmp/foo.ctags:
    --langdef=Foo
    --map-Foo=+.foo
    --kinddef-Foo=c,class,classes
    --kinddef-Foo=d,definition,definitions

    --regex-Foo=/^class[[:blank:]]+([[:alpha:]]+):/\1/c/{scope=set}
    --regex-Foo=/^[[:blank:]]+def[[:blank:]]+([[:alpha:]]+).*:/\1/d/{scope=ref}

.. code-block:: console

    $ ctags --options=/tmp/foo.ctags -o - /tmp/input.foo
    bar /tmp/input.foo /^    def bar(baz):$/;" d class:foo
    foo /tmp/input.foo /^class foo:$/;" c
    gar /tmp/input.foo /^    def gar(gaz):$/;" d class:goo
    goo /tmp/input.foo /^class goo:$/;" c


Example 2:

.. code-block:: c

    // in /tmp/input.pp
    class foo {
        int bar;
    }

.. code-block:: ctags
    :emphasize-lines: 7-9

    # in /tmp/pp.ctags:
    --langdef=pp
    --map-pp=+.pp
    --kinddef-pp=c,class,classes
    --kinddef-pp=v,variable,variables

    --regex-pp=/^[[:blank:]]*\}//{scope=pop}{exclusive}
    --regex-pp=/^class[[:blank:]]*([[:alnum:]]+)[[:blank:]]*\{/\1/c/{scope=push}
    --regex-pp=/^[[:blank:]]*int[[:blank:]]*([[:alnum:]]+)/\1/v/{scope=ref}

.. code-block:: console

    $ ctags --options=/tmp/pp.ctags -o - /tmp/input.pp
    bar /tmp/input.pp /^    int bar;$/;" v class:foo
    foo /tmp/input.pp /^class foo {$/;" c


Example 3:

.. code-block::

    # in /tmp/input.docdoc
    title T
    ...
    section S0
    ...
    section S1
    ...

.. code-block:: ctags
    :emphasize-lines: 15,21

    # in /tmp/doc.ctags:
    --langdef=doc
    --map-doc=+.docdoc
    --kinddef-doc=s,section,sections
    --kinddef-doc=S,subsection,subsections

    --_tabledef-doc=main
    --_tabledef-doc=section
    --_tabledef-doc=subsection

    --_mtable-regex-doc=main/section +([^\n]+)\n/\1/s/{scope=push}{tenter=section}
    --_mtable-regex-doc=main/[^\n]+\n|[^\n]+|\n//
    --_mtable-regex-doc=main///{scope=clear}{tquit}

    --_mtable-regex-doc=section/section +([^\n]+)\n/\1/s/{scope=replace}
    --_mtable-regex-doc=section/subsection +([^\n]+)\n/\1/S/{scope=push}{tenter=subsection}
    --_mtable-regex-doc=section/[^\n]+\n|[^\n]+|\n//
    --_mtable-regex-doc=section///{scope=clear}{tquit}

    --_mtable-regex-doc=subsection/(section )//{_advanceTo=0start}{tleave}{scope=pop}
    --_mtable-regex-doc=subsection/subsection +([^\n]+)\n/\1/S/{scope=replace}
    --_mtable-regex-doc=subsection/[^\n]+\n|[^\n]+|\n//
    --_mtable-regex-doc=subsection///{scope=clear}{tquit}

.. code-block:: console

    % ctags --sort=no --fields=+nl --options=/tmp/doc.ctags -o - /tmp/input.docdoc
    SEC0 /tmp/input.docdoc /^section SEC0$/;" s line:1 language:doc
    SUB0-1 /tmp/input.docdoc /^subsection SUB0-1$/;" S line:3 language:doc section:SEC0
    SUB0-2 /tmp/input.docdoc /^subsection SUB0-2$/;" S line:5 language:doc section:SEC0
    SEC1 /tmp/input.docdoc /^section SEC1$/;" s line:7 language:doc
    SUB1-1 /tmp/input.docdoc /^subsection SUB1-1$/;" S line:9 language:doc section:SEC1
    SUB1-2 /tmp/input.docdoc /^subsection SUB1-2$/;" S line:11 language:doc section:SEC1


NOTE: This flag doesn't work well with ``--mline-regex-<LANG>=``.

Overriding the letter for file kind
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. Q: this was fixed in https://github.com/universal-ctags/ctags/pull/331
    so can we remove this section?

One of the built-in tag kinds in Universal Ctags is the ``F`` file kind.
Overriding the letter for file kind is not allowed in Universal Ctags.

.. warning::

    Don't use ``F`` as a kind letter in your parser. (See issue `#317
    <https://github.com/universal-ctags/ctags/issues/317>`_ on GitHub.)

Generating fully qualified tags automatically from scope information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If scope fields are filled properly with ``{scope=...}`` regex flags,
you can use the field values for generating fully qualified tags.
About the ``{scope=..}`` flag itself, see the "FLAGS FOR --regex-<LANG>
OPTION" section of :ref:`ctags-optlib(7) <ctags-optlib(7)>`.

Append ``{_autoFQTag}`` to the end of the ``--langdef=<LANG>`` option, as in
``--langdef=Foo{_autoFQTag}``, to make ctags generate fully qualified
tags automatically.

'``.``' is the (ctags global) default separator combining names into a
fully qualified tag. You can customize separators with the
``--_scopesep-<LANG>=...`` option.

input.foo::

    class X
      var y
    end

foo.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=foo{_autoFQTag}
    --map-foo=+.foo
    --kinddef-foo=c,class,classes
    --kinddef-foo=v,var,variables
    --regex-foo=/class ([A-Z]*)/\1/c/{scope=push}
    --regex-foo=/end///{placeholder}{scope=pop}
    --regex-foo=/[ \t]*var ([a-z]*)/\1/v/{scope=ref}

Output::

    $ u-ctags --quiet --options=./foo.ctags -o - input.foo
    X input.foo /^class X$/;" c
    y input.foo /^  var y$/;" v class:X

    $ u-ctags --quiet --options=./foo.ctags --extras=+q -o - input.foo
    X input.foo /^class X$/;" c
    X.y input.foo /^  var y$/;" v class:X
    y input.foo /^  var y$/;" v class:X


``X.y`` is printed as a fully qualified tag when ``--extras=+q`` is given.

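
How a qualified name such as ``X.y`` is assembled can be sketched with a small
scope stack. This is a simplified model under stated assumptions, not the
ctags implementation: ``tag_lines`` is a hypothetical helper, Python's ``re``
module replaces the POSIX engine, and the patterns are anchored at the start of
the line for brevity.

.. code-block:: python

    import re

    SEP = "."  # the ctags global default separator

    def tag_lines(lines, sep=SEP):
        """Return (name, scope, fully qualified name) for each tagged line."""
        stack = []    # names of the enclosing scopes, innermost last
        entries = []
        for line in lines:
            m = re.match(r"class ([A-Z]*)", line)
            if m:                                   # like {scope=push}
                name = m.group(1)
                entries.append((name, sep.join(stack), sep.join(stack + [name])))
                stack.append(name)
                continue
            if re.match(r"end", line):              # like {scope=pop}
                if stack:
                    stack.pop()
                continue
            m = re.match(r"[ \t]*var ([a-z]*)", line)
            if m:                                   # like {scope=ref}
                name = m.group(1)
                entries.append((name, sep.join(stack), sep.join(stack + [name])))
        return entries

    for entry in tag_lines(["class X", "  var y", "end"]):
        print(entry)

For the three input lines above the sketch yields ``('X', '', 'X')`` and
``('y', 'X', 'X.y')``; the third element corresponds to the extra tag emitted
with ``--extras=+q``.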
.. NOT REVIEWED YET (--_scopesep)

Customizing scope separators
......................................................................

Use the ``--_scopesep-<LANG>=[<parent-kindLetter>]/<child-kindLetter>:<sep>``
option to customize the separators if the language uses ``{_autoFQTag}``.

``parent-kindLetter``

    The kind letter for a tag of outer-scope.

    You can use '``*``' as a wildcard that means
    *any kind* for a tag of outer-scope.

    If you omit ``parent-kindLetter``, the separator is used as
    a prefix for tags having the kind specified with ``child-kindLetter``.
    This prefix can be used to refer to the global namespace or similar concepts if the
    language has one.

``child-kindLetter``

    The kind letter for a tag of inner-scope.

    You can use '``*``' as a wildcard that means
    *any kind* for a tag of inner-scope.

``sep``

    In a qualified tag, if the outer-scope has the kind ``parent-kindLetter`` and
    the inner-scope has the kind ``child-kindLetter``, then ``sep`` is inserted
    between the scope names in the generated tags file.

Specifying '``*``' as both ``parent-kindLetter`` and ``child-kindLetter``
sets ``sep`` as the language default separator. It is used as a fallback.

Specifying '``*``' as ``child-kindLetter`` and omitting ``parent-kindLetter``
sets ``sep`` as the language default prefix. It is used as a fallback.


NOTE: There is no ctags global default prefix.

NOTE: The ``--_scopesep-<LANG>=...`` option affects only a parser that
enables ``_autoFQTag``. A parser building fully qualified tags
manually ignores the option.

Let's see an example.
The input file is written in Tcl. The Tcl parser is not an optlib
parser. However, it uses the ``_autoFQTag`` feature internally.
Therefore, the ``--_scopesep-Tcl=`` option works well. The Tcl parser
defines two kinds, ``n`` (``namespace``) and ``p`` (``procedure``).

By default, Tcl parser uses ``::`` as scope separator. The parser also
uses ``::`` as root prefix.

.. code-block:: tcl

    namespace eval N {
        namespace eval M {
            proc pr0 {s} {
                puts $s
            }
        }
    }

    proc pr1 {s} {
        puts $s
    }

``M`` is defined under the scope of ``N``. ``pr0`` is defined under the scope
of ``M``. ``N`` and ``pr1`` are at top level (so they are candidates to be added
prefixes). ``M`` and ``N`` are language objects with ``n`` (``namespace``) kind.
``pr0`` and ``pr1`` are language objects with ``p`` (``procedure``) kind.

.. code-block:: console

    $ ctags -o - --extras=+q input.tcl
    ::N input.tcl /^namespace eval N {$/;" n
    ::N::M input.tcl /^    namespace eval M {$/;" n namespace:::N
    ::N::M::pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:::N::M
    ::pr1 input.tcl /^proc pr1 {s} {$/;" p
    M input.tcl /^    namespace eval M {$/;" n namespace:::N
    N input.tcl /^namespace eval N {$/;" n
    pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:::N::M
    pr1 input.tcl /^proc pr1 {s} {$/;" p

Let's change the default separator to ``->``:

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' input.tcl
    ::N input.tcl /^namespace eval N {$/;" n
    ::N->M input.tcl /^    namespace eval M {$/;" n namespace:::N
    ::N->M->pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:::N->M
    ::pr1 input.tcl /^proc pr1 {s} {$/;" p
    M input.tcl /^    namespace eval M {$/;" n namespace:::N
    N input.tcl /^namespace eval N {$/;" n
    pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:::N->M
    pr1 input.tcl /^proc pr1 {s} {$/;" p

Let's define '``^``' as default prefix:

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' input.tcl
    M input.tcl /^    namespace eval M {$/;" n namespace:^N
    N input.tcl /^namespace eval N {$/;" n
    ^N input.tcl /^namespace eval N {$/;" n
    ^N->M input.tcl /^    namespace eval M {$/;" n namespace:^N
    ^N->M->pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:^N->M
    ^pr1 input.tcl /^proc pr1 {s} {$/;" p
    pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:^N->M
    pr1 input.tcl /^proc pr1 {s} {$/;" p

Let's override the specification of separator for combining a
namespace and a procedure with '``+``': (About the separator for
combining a namespace and another namespace, ctags uses the default separator.)

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' --_scopesep-Tcl='n/p:+' input.tcl
    M input.tcl /^    namespace eval M {$/;" n namespace:^N
    N input.tcl /^namespace eval N {$/;" n
    ^N input.tcl /^namespace eval N {$/;" n
    ^N->M input.tcl /^    namespace eval M {$/;" n namespace:^N
    ^N->M+pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:^N->M
    ^pr1 input.tcl /^proc pr1 {s} {$/;" p
    pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:^N->M
    pr1 input.tcl /^proc pr1 {s} {$/;" p

Let's override the definition of prefix for a namespace with '``@``':
(About the prefix for procedures, ctags uses the default prefix.)

.. code-block:: console
    :emphasize-lines: 1

    $ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' --_scopesep-Tcl='n/p:+' --_scopesep-Tcl='/n:@' input.tcl
    @N input.tcl /^namespace eval N {$/;" n
    @N->M input.tcl /^    namespace eval M {$/;" n namespace:@N
    @N->M+pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:@N->M
    M input.tcl /^    namespace eval M {$/;" n namespace:@N
    N input.tcl /^namespace eval N {$/;" n
    ^pr1 input.tcl /^proc pr1 {s} {$/;" p
    pr0 input.tcl /^        proc pr0 {s} {$/;" p namespace:@N->M
    pr1 input.tcl /^proc pr1 {s} {$/;" p


Multi-line pattern match
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We often need to scan multiple lines to generate a tag, whether due to
needing contextual information to decide whether to tag or not, or to
constrain generating tags to only certain cases, or to grab multiple
substrings to generate the tag name.

Universal Ctags has two ways to accomplish this: the *multi-line regex option*,
and the experimental *multi-table regex option* described later.

The newly introduced ``--mline-regex-<LANG>`` is similar to ``--regex-<LANG>``
except the pattern is applied to the whole file's contents, not line by line.

This example is based on issue `#219
<https://github.com/universal-ctags/ctags/issues/219>`_ posted by
@andreicristianpetcu:

.. code-block:: java

    // in input.java:

    @Subscribe
    public void catchEvent(SomeEvent e)
    {
        return;
    }

    @Subscribe
    public void
        recover(Exception e)
    {
        return;
    }

The above Java code is similar to code written for the Java `Spring <https://spring.io>`_
framework. The ``@Subscribe`` annotation is a keyword for the framework, and the
developer would like to have a tag generated for each method annotated with
``@Subscribe``, using the name of the method followed by a dash followed by the
type of the argument.
For example, the developer wants the tag name
``Event-SomeEvent`` generated for the first method shown above.

To accomplish this, the developer creates a :file:`spring.ctags` file with
the following:

.. code-block:: ctags
    :emphasize-lines: 4

    # in spring.ctags:
    --langdef=javaspring
    --map-javaspring=+.java
    --mline-regex-javaspring=/@Subscribe([[:space:]])*([a-z ]+)[[:space:]]*([a-zA-Z]*)\(([a-zA-Z]*)/\3-\4/s,subscription/{mgroup=3}
    --fields=+ln

And now, using :file:`spring.ctags`, the tags file has this:

.. code-block:: console

    $ ctags -o - --options=./spring.ctags input.java
    Event-SomeEvent input.java /^public void catchEvent(SomeEvent e)$/;" s line:2 language:javaspring
    recover-Exception input.java /^ recover(Exception e)$/;" s line:10 language:javaspring

Note that the first tag is ``Event-SomeEvent`` rather than
``catchEvent-SomeEvent``: the second group, ``([a-z ]+)``, also consumes the
lowercase prefix ``catch``, leaving only ``Event`` for the third group.

Multiline pattern flags
......................................................................

.. note:: These flags also apply to the experimental ``--_mtable-regex-<LANG>``
    option described later.

``{mgroup=N}``

    This flag indicates the pattern should be applied to the whole file
    contents, not line by line. ``N`` is the number of a capture group in the
    pattern, which is used to record the line number location of the tag. In the
    above example ``3`` is specified, so the start position of regex capture
    group 3, relative to the whole file, is used.

.. warning:: You **must** add an ``{mgroup=N}`` flag to the multi-line
    ``--mline-regex-<LANG>`` option, even if ``N`` is ``0`` (meaning the
    start position of the whole regex pattern). You do not need to add it for
    the multi-table ``--_mtable-regex-<LANG>``.

.. TODO: Q: isn't the above restriction really a bug? I think it is. I should fix it.
    Q to @masatake-san: Do you mean that {mgroup=0} can be omitted? -> #2918 is opened


``{_advanceTo=N[start|end]}``

    A regex pattern is applied to the whole file's contents iteratively. This long
    flag specifies from where the pattern should be applied in the next
    iteration of regex matching. When a pattern matches, the next pattern
    matching starts from the start or end of capture group ``N``. By default it
    advances to the end of the whole match (i.e., ``{_advanceTo=0end}`` is
    the default).


    Consider the following input
    ::

        def def abc

    Consider two sets of options, ``foo.ctags`` and ``bar.ctags``.

    .. code-block:: ctags
        :emphasize-lines: 5

        # foo.ctags:
        --langdef=foo
        --langmap=foo:.foo
        --kinddef-foo=a,something,something
        --mline-regex-foo=/def *([a-z]+)/\1/a/{mgroup=1}


    .. code-block:: ctags
        :emphasize-lines: 5

        # bar.ctags:
        --langdef=bar
        --langmap=bar:.bar
        --kinddef-bar=a,something,something
        --mline-regex-bar=/def *([a-z]+)/\1/a/{mgroup=1}{_advanceTo=1start}

    ``foo.ctags`` emits the following tags output::

        def input.foo /^def def abc$/;" a

    ``bar.ctags`` emits the following tags output::

        def input-0.bar /^def def abc$/;" a
        abc input-0.bar /^def def abc$/;" a

    ``_advanceTo=1start`` is specified in ``bar.ctags``.
    This allows ctags to capture ``abc``.

    At the first iteration, the patterns of both
    ``foo.ctags`` and ``bar.ctags`` match as follows
    ::

        0   1       (start)
        v   v
        def def abc
               ^
               0,1  (end)

    ``def`` in group 1 is captured as a tag in
    both languages. At the next iteration, the positions
    where the pattern matching is applied are not the
    same in the two languages.

    ``foo.ctags``
    ::

               0end (default)
               v
        def def abc


    ``bar.ctags``
    ::

            1start (as specified in _advanceTo long flag)
            v
        def def abc

    This difference in positions accounts for the difference in the tags output.

    A more relevant use case is when ``{_advanceTo=N[start|end]}`` is used in
    the experimental ``--_mtable-regex-<LANG>``, to "advance" back to the
    beginning of a match, so that one can generate multiple tags for the same
    input line(s).

.. note:: This flag doesn't work well with scope-related flags and ``exclusive`` flags.


.. Q: this was previously titled "Byte oriented pattern matching...", presumably
    because it "matched against the input at the current byte position, not line".
    But that's also true for --mline-regex-<LANG>, as far as I can tell.

Advanced pattern matching with multiple regex tables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note:: This is a highly experimental feature. This will not go into
    the man page of 6.0. But let's be honest, it's the most exciting feature!

In some cases, the ``--regex-<LANG>`` and ``--mline-regex-<LANG>`` options are not
sufficient to generate the tags for a particular language. Some of the common
reasons for this are:

* To ignore commented lines or sections for the language file, so that
  tags aren't generated for symbols that are within the comments.
* To enter and exit scope, and use it for tagging based on contextual
  state or with end-scope markers that are difficult to match to their
  associated scope entry point.
* To support nested scopes.
* To change the pattern searched for, or the resultant tag for the same
  pattern, based on scoping or contextual location.
* To break up an overly complicated ``--mline-regex-<LANG>`` pattern into
  separate regex patterns, for performance or readability reasons.

To help handle such things, Universal Ctags has been enhanced with multi-table
regex matching. The feature is inspired by `lex`, the fast lexical analyzer
generator, which is a popular tool in Unix environments for writing parsers, and
`RegexLexer <http://pygments.org/docs/lexerdevelopment/>`_ of Pygments.
Knowledge about them will help you understand the new options.

The new options are:

``--_tabledef-<LANG>``
    Declares a new regex matching table of a given name for the language,
    as described in ":ref:`tabledef`".

``--_mtable-regex-<LANG>``
    Adds a regex pattern and associated tag generation information and flags to
    the given table, as described in ":ref:`mtable_regex`".

``--_mtable-extend-<LANG>``
    Includes a previously defined regex table into the named one.

The above will be discussed in more detail shortly.

First, let's explain the feature with an example. Consider an
imaginary language `X` with a syntax similar to JavaScript: ``var`` is
used for defining variable(s), and "``/* ... */``" is used for block
comments.

Here is our input, :file:`input.x`:

.. code-block:: java

    /* BLOCK COMMENT
    var dont_capture_me;
    */
    var a /* ANOTHER BLOCK COMMENT */, b;

We want ctags to capture ``a`` and ``b``, but it is difficult to write a parser
that will ignore ``dont_capture_me`` in the comment with a classical regex
parser defined with ``--regex-<LANG>`` or ``--mline-regex-<LANG>``, because of
the block comments.

The ``--regex-<LANG>`` option only works on one line at a time, so it cannot know
that ``dont_capture_me`` is within comments.
The ``--mline-regex-<LANG>`` option could
do it in theory, but due to the greedy nature of the regex engine it is
impractical and potentially inefficient to do so, given that there could be
multiple block comments in the file, with '``*``' inside them, etc.

A parser written with multi-table regex, on the other hand, can capture only
``a`` and ``b`` safely. But it is more complicated to understand.

Here is the 1st version of :file:`X.ctags`:

.. code-block:: ctags

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

Not so interesting. It doesn't really *do* anything yet. It just creates a new
language named ``X``, for files ending with a :file:`.x` suffix, and defines a
new kind for variables.

When writing a multi-table parser, you have to think about the necessary states
of parsing. For the parser of language `X`, we need the following states:

* `toplevel` (initial state)
* `comment` (inside comment)
* `vars` (var statements)

.. _tabledef:

Declaring a new regex table
......................................................................

Before adding regular expressions, you have to declare tables for each state
with the ``--_tabledef-<LANG>=<TABLE>`` option.

Here is the 2nd version of :file:`X.ctags` doing so:

.. code-block:: ctags
    :emphasize-lines: 5-7

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars

For table names, only characters in the range ``[0-9a-zA-Z_]`` are acceptable.

For a given language, for each file's input the ctags multi-table parser begins
with the first declared table. For :file:`X.ctags`, ``toplevel`` is the one.
The other tables are only ever entered/checked if another table specifies to do
so, starting with the first table.
In other words, if the first declared table
does not find a match for the current input, and does not specify to go to
another table, the other tables for that language won't be used. The flags to go
to another table are ``{tenter}``, ``{tleave}``, and ``{tjump}``, as described
later.

.. _mtable_regex:

Adding a regex to a regex table
......................................................................

The new option to add a regex to a declared table is ``--_mtable-regex-<LANG>``,
and it follows this form:

.. code-block:: ctags

    --_mtable-regex-<LANG>=<TABLE>/<PATTERN>/<NAME>/[<KIND>]/LONGFLAGS

The parameters for ``--_mtable-regex-<LANG>`` look complicated. However,
``<PATTERN>``, ``<NAME>``, and ``<KIND>`` are the same as the parameters of the
``--regex-<LANG>`` and ``--mline-regex-<LANG>`` options. ``<TABLE>`` is simply
the name of a table previously declared with the ``--_tabledef-<LANG>`` option.

A regex pattern added to a parser with ``--_mtable-regex-<LANG>`` is matched
against the input at the current byte position, not line. Even if you do not
specify the '``^``' anchor at the start of the pattern, ctags adds '``^``' to
the pattern automatically. Unlike the ``--regex-<LANG>`` and
``--mline-regex-<LANG>`` options, a '``^``' anchor does not mean "beginning of
line" in ``--_mtable-regex-<LANG>``; instead it means the beginning of the
input string (i.e., the current byte position).

The ``LONGFLAGS`` include the already discussed flags for ``--regex-<LANG>`` and
``--mline-regex-<LANG>``: ``{scope=...}``, ``{mgroup=N}``, ``{_advanceTo=N}``,
``{basic}``, ``{extend}``, and ``{icase}``. The ``{exclusive}`` flag does not
make sense for multi-table regex.
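
The anchoring rule described above (each pattern is tried only at the current
position, as if it started with '``^``') can be mimicked outside ctags. The
following Python sketch is only an illustration: it uses Python's ``re`` module
rather than the POSIX engine that ctags uses, and the helper name ``match_at``
is hypothetical.

.. code-block:: python

    import re


    def match_at(pattern, text, pos):
        """Try `pattern` only at offset `pos`; return the matched string or None."""
        # Pattern.match(text, pos) never scans forward: it either matches
        # starting exactly at `pos` or fails, which mirrors the behaviour
        # described for patterns added with --_mtable-regex-<LANG>.
        m = re.compile(pattern, re.DOTALL).match(text, pos)
        return m.group(0) if m else None


    text = "x = 1; var a;"
    print(match_at(r"var", text, 0))  # no match: "var" does not start at offset 0
    print(match_at(r"var", text, 7))  # match: "var" starts at offset 7

A line-oriented search would find ``var`` in both calls; the anchored form
finds it only when the current position is exactly where ``var`` begins.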

Several new flags are also introduced exclusively for multi-table
regex use:

``{tenter}``
    Push the current table on the stack, and enter another table.

``{tleave}``
    Leave the current table, pop the stack, and go to the table that was
    just popped from the stack.

``{tjump}``
    Jump to another table, without affecting the stack.

``{treset}``
    Clear the stack, and go to another table.

``{tquit}``
    Clear the stack, and stop processing the current input file for this
    language.

To explain the above new flags, we'll continue using our example in the
next section.

Skipping block comments
......................................................................

Let's continue with our example. Here is the 3rd version of :file:`X.ctags`:

.. code-block:: ctags
    :emphasize-lines: 9-13
    :linenos:

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}
    --_mtable-regex-X=toplevel/.//

    --_mtable-regex-X=comment/\*\///{tleave}
    --_mtable-regex-X=comment/.//

Four ``--_mtable-regex-X`` lines are added for skipping the block comments. Let's
discuss them one by one.

For each new file it scans, ctags always chooses the first pattern of the
first table of the parser. Even if it's an empty table, ctags will only try
the first declared table. (In such a case it would immediately fail to match
anything, and thus stop processing the input file and effectively do nothing.)

The first declared table (``toplevel``) has the following regex added to
it first:

.. code-block:: ctags
    :linenos:
    :lineno-start: 9

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}

A pattern of ``\/\*`` is added to the ``toplevel`` table, to match the
beginning of a block comment. A backslash character is used in front of the
leading '``/``' to escape the separation character '``/``' that separates the fields
of ``--_mtable-regex-<LANG>``. Another backslash inside the pattern is used
before the asterisk '``*``', to make it a literal asterisk character in regex.

The last ``//`` means ctags should not tag something matching this pattern.
In ``--regex-<LANG>`` you never use ``//`` because it would be pointless to
match something and not tag it using a single-line ``--regex-<LANG>``; in
multi-line ``--mline-regex-<LANG>`` you rarely see it, because it would rarely
be useful. But in multi-table regex it's quite common, since you frequently
want to transition from one state to another (i.e., ``tenter`` or ``tjump``
from one table to another).

The long flag added to the first regex of our first table is ``tenter``, which
switches the table and pushes the current one on the stack. ``{tenter=comment}``
means "switch the table from toplevel to comment".

So given the input file :file:`input.x` shown earlier, ctags will begin at
the ``toplevel`` table and try to match the first regex. It will succeed, and
thus push on the stack and go to the ``comment`` table.

It will begin at the top of the ``comment`` table (it always begins at the top
of a given table), and try each regex line in sequence until it finds a match.
If it fails to find a match, it will pop the stack and go to the table that was
just popped from the stack, and begin trying to match at the top of *that* table.
If it continues failing to find a match, and ultimately reaches the end of the
stack, it will stop processing for this file.
For the next input file, it will
begin again from the top of the first declared table.

Getting back to our example, the top of the ``comment`` table has this regex:

.. code-block:: ctags
    :linenos:
    :lineno-start: 12

    --_mtable-regex-X=comment/\*\///{tleave}

Similar to the previous ``toplevel`` table pattern, this one for ``\*\/`` uses
a backslash to escape the separator '``/``', as well as one before the '``*``' to
make it a literal asterisk in regex. So what it's looking for, from a simple
string perspective, is the sequence ``*/``. Note that this means even though
you see three slashes ``///`` at the end, the first one is escaped and used
for the pattern itself, and the ``--_mtable-regex-X`` only has ``//`` to
separate the regex pattern from the long flags, instead of the usual ``///``.
Thus it's using the shorthand form of the ``--_mtable-regex-X`` option.
It could instead have been:

.. code-block:: ctags

    --_mtable-regex-X=comment/\*\////{tleave}

The above would have worked exactly the same.

Getting back to our example, remember we're looking at the :file:`input.x`
file, currently using the ``comment`` table, and trying to match the first
regex of that table, shown above, at the following location::

       ,ctags is trying to match starting here
      v
    /* BLOCK COMMENT
    var dont_capture_me;
    */
    var a /* ANOTHER BLOCK COMMENT */, b;

The pattern doesn't match for the position just after ``/*``, because that
position is a space character. So ctags tries the next pattern in the same
table:

.. code-block:: ctags
    :linenos:
    :lineno-start: 13

    --_mtable-regex-X=comment/.//

This pattern matches any one character, including newline; the current
position moves one character forward. Now the character at the current position is
'``B``'.
The first pattern of the table, ``*/``, still does not match the input. So
ctags falls through to the next pattern again. When the current position moves to the ``*/``
on the 3rd line of :file:`input.x`, it will finally match this:

.. code-block:: ctags
    :linenos:
    :lineno-start: 12

    --_mtable-regex-X=comment/\*\///{tleave}

In this pattern, the long flag ``{tleave}`` is specified. This triggers table
switching again. ``{tleave}`` makes ctags switch the table back to the last
table used before doing ``{tenter}``. In this case, ``toplevel`` is the table.
ctags manages a stack where references to tables are put. ``{tenter}`` pushes
the current table to the stack. ``{tleave}`` pops the table at the top of the
stack and chooses it.

So now ctags is back to the ``toplevel`` table, and tries the first regex
of that table, which was this:

.. code-block:: ctags
    :linenos:
    :lineno-start: 9

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}

It tries to match that against its current position, which is now the
newline on line 3, between the ``*/`` and the word ``var``::

    /* BLOCK COMMENT
    var dont_capture_me;
    */ <--- ctags is now at this newline (\n) character
    var a /* ANOTHER BLOCK COMMENT */, b;

The first regex of the ``toplevel`` table does not match a newline, so it tries
the second regex:

.. code-block:: ctags
    :linenos:
    :lineno-start: 10

    --_mtable-regex-X=toplevel/.//

This matches a newline successfully, but has no actions to perform. So ctags
moves one character forward (the newline it just matched), and goes back to the
top of the ``toplevel`` table, and tries the first regex again. Eventually we'll
reach the beginning of the second block comment, and do the same things as before.
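
The walk-through above can be replayed with a small simulation. The following
Python sketch is not how ctags is implemented; it is a simplified model that
assumes Python's ``re`` module in place of the POSIX regex engine, and the
names ``TABLES`` and ``run`` are hypothetical. It models only the ``{tenter}``
and ``{tleave}`` flags and the rule that the patterns of the current table are
tried in order at the current position.

.. code-block:: python

    import re

    # Hypothetical model of the two tables used by the 3rd version of X.ctags.
    # Each entry is (compiled pattern, action, target table).
    TABLES = {
        "toplevel": [
            (re.compile(r"/\*", re.DOTALL), "tenter", "comment"),
            (re.compile(r".", re.DOTALL), None, None),
        ],
        "comment": [
            (re.compile(r"\*/", re.DOTALL), "tleave", None),
            (re.compile(r".", re.DOTALL), None, None),
        ],
    }


    def run(text, tables, first="toplevel"):
        """Scan `text` and return the table switches as (offset, flag, new table)."""
        current, stack, pos, trace = first, [], 0, []
        while True:
            # Always restart from the first pattern of the current table.
            for pattern, action, target in tables[current]:
                m = pattern.match(text, pos)  # anchored at the current position
                if m is None:
                    continue
                pos = m.end()
                if action == "tenter":
                    stack.append(current)  # remember where to come back to
                    current = target
                    trace.append((m.start(), action, current))
                elif action == "tleave":
                    if not stack:
                        return trace
                    current = stack.pop()
                    trace.append((m.start(), action, current))
                break
            else:
                # No pattern matched: fall back to the previous table, or stop
                # when the stack is empty (for example at the end of the input).
                if not stack:
                    return trace
                current = stack.pop()


    text = (
        "/* BLOCK COMMENT\n"
        "var dont_capture_me;\n"
        "*/\n"
        "var a /* ANOTHER BLOCK COMMENT */, b;\n"
    )
    trace = run(text, TABLES)
    for offset, flag, table in trace:
        print(offset, flag, table)

Each printed line corresponds to one table switch: entering ``comment`` at a
``/*`` and returning to ``toplevel`` at the matching ``*/``. Every other
character is consumed by one of the ``.`` patterns without any action.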

When ctags finally reaches the end of the file (the position after ``b;``),
it will not be able to match either the first or second regex of the
``toplevel`` table, and quit processing the input file.

So far, we've successfully skipped over block comments for our new ``X``
language, but haven't generated any tags. The point of ctags is to generate
tags, not just keep your computer warm. So now let's move onto actually tagging
variables...


Capturing variables in a sequence
......................................................................

Here is the 4th version of :file:`X.ctags`:

.. code-block:: ctags
    :emphasize-lines: 10,16-19
    :linenos:

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}
    --_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}
    --_mtable-regex-X=toplevel/.//

    --_mtable-regex-X=comment/\*\///{tleave}
    --_mtable-regex-X=comment/.//

    --_mtable-regex-X=vars/;//{tleave}
    --_mtable-regex-X=vars/\/\*//{tenter=comment}
    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
    --_mtable-regex-X=vars/.//

One pattern in ``toplevel`` was added, and a new table ``vars`` with four
patterns was also added.

The new regex in ``toplevel`` is this:

.. code-block:: ctags
    :linenos:
    :lineno-start: 10

    --_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}

The purpose of this being in `toplevel` is to switch to the `vars` table when
the keyword ``var`` is found in the input stream.
We need to switch states
(i.e., tables) because we can't simply capture the variables ``a`` and ``b``
with a single regex pattern in the ``toplevel`` table: there might be
block comments inside the ``var`` statement (as there are in our
:file:`input.x`), and we also need to create *two* tags, one for ``a`` and one
for ``b``, even though the word ``var`` only appears once. In other words, we
need to "remember" that we saw the keyword ``var`` when we later encounter the
names ``a`` and ``b``, so that we know to tag each of them; and saving that
"in-variable-statement" state is accomplished by switching tables to the
``vars`` table.

The first regex in our new ``vars`` table is:

.. code-block:: ctags
    :linenos:
    :lineno-start: 16

    --_mtable-regex-X=vars/;//{tleave}

This pattern is used to match a single semicolon '``;``', and if it matches,
pop back to the ``toplevel`` table using the ``{tleave}`` long flag. We
didn't have to make this the first regex pattern, because it doesn't overlap
with any of the other ones except the last one, ``/.//`` (which must be
last for this example to work).

The second regex in our ``vars`` table is:

.. code-block:: ctags
    :linenos:
    :lineno-start: 17

    --_mtable-regex-X=vars/\/\*//{tenter=comment}

We need this because block comments can be in variable definitions::

    var a /* ANOTHER BLOCK COMMENT */, b;

So to skip block comments in such a position, the pattern ``\/\*`` is used just
like it was used in the ``toplevel`` table: to find the literal ``/*`` beginning
of the block comment and enter the ``comment`` table.
Because we're using
``{tenter}`` and ``{tleave}`` to push/pop from a stack of tables, we can
use the same ``comment`` table for both ``toplevel`` and ``vars`` to go to,
because ctags will *remember* the previous table and ``{tleave}`` will
pop back to the right one.

The third regex in our ``vars`` table is:

.. code-block:: ctags
    :linenos:
    :lineno-start: 18

    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/

This is nothing special, but is the one that actually tags something: it
captures the variable name and uses it for generating a tag of the ``var``
kind (shorthand ``v``).

The last regex in the ``vars`` table we've seen before:

.. code-block:: ctags
    :linenos:
    :lineno-start: 19

    --_mtable-regex-X=vars/.//

This makes ctags ignore any other characters, such as whitespace or the
comma '``,``'.


Running our example
......................................................................

.. code-block:: console

    $ cat input.x
    /* BLOCK COMMENT
    var dont_capture_me;
    */
    var a /* ANOTHER BLOCK COMMENT */, b;

    $ u-ctags -o - --fields=+n --options=X.ctags input.x
    a input.x /^var a \/* ANOTHER BLOCK COMMENT *\/, b;$/;" v line:4
    b input.x /^var a \/* ANOTHER BLOCK COMMENT *\/, b;$/;" v line:4

It works!

You can find additional examples of multi-table regex in our GitHub repo, under
the ``optlib`` directory. For example, ``puppetManifest.ctags`` is a serious
example. It is the primary parser for testing multi-table regex parsers, and is
used in the actual ctags program for parsing Puppet manifest files.


.. _guest-regex-flag:

Scheduling a guest parser with ``_guest`` regex flag
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. NOT REVIEWED YET

With the ``_guest`` regex flag, you can run a parser (a guest parser) on an
area of the current input file.
See ":ref:`host-guest-parsers`" about the concept of the guest parser.

The ``_guest`` regex flag specifies a *guest spec*, and attaches it to
the associated regex pattern.

A guest spec has three fields: *<PARSER>*, *<START>* of area, and *<END>* of area.
The ``_guest`` regex flag has the following form::

    {_guest=<PARSER>,<START>,<END>}

ctags maintains a data structure called a *guest request* during parsing. A
guest request also has three fields: `parser`, `start of area`, and
`end of area`.

You, a parser developer, have to fill in the fields of guest specs.
ctags consults the guest spec when matching the regex pattern
associated with it, tries to fill in the fields of the guest request,
and runs a guest parser when all the fields of the guest request are
filled.

If you use `Multi-line pattern match`_ to define a host parser,
you must specify all the fields of the `guest request`.

On the other hand, if you don't use `Multi-line pattern match`_ to define a host parser,
ctags can fill in the fields of the `guest request` incrementally; more than
one guest spec can be used to fill in the fields. In other words, you can
leave some of the fields of a guest spec empty.

The *<PARSER>* field of ``_guest`` regex flag
......................................................................
For *<PARSER>*, you can specify one of the following items:

a name of a parser

    If you know the guest parser you want to run before parsing
    the input file, specify the name of the parser. Aliases of parsers
    are also considered when finding a parser for the name.

    An example of running the C parser as a guest parser::

        {_guest=C,...

the group number of a regex pattern prefixed with '``\``' (backslash)

    If a parser name appears in an input file, write a regex pattern
    to capture the name. Specify the number of the group where the name is
    stored as the parser. In such a case, use '``\``' as the prefix for
    the number. Aliases of parsers are also considered when finding
    a parser for the name.

    Let's see an example. GitHub Flavored Markdown (GFM) is a language for
    documentation. It provides a notation for quoting a snippet of
    program code; the language treats the area from ``~~~`` to
    ``~~~`` as a snippet. You can specify the programming language of
    the snippet by starting the area with
    ``~~~<THE_NAME_OF_LANGUAGE>``, like ``~~~C`` or ``~~~Java``.

    To run a guest parser on the area, you have to capture the
    *<THE_NAME_OF_LANGUAGE>* with a regex pattern:

    .. code-block:: ctags

        --_mtable-regex-Markdown=main/~~~([a-zA-Z0-9][-#+a-zA-Z0-9]*)[\n]//{_guest=\1,0end,}

    The pattern captures the language name in the input file with
    regex group 1, and specifies it as *<PARSER>*::

        {_guest=\1,...

the group number of a regex pattern prefixed with '``*``' (asterisk)

    If a file name implying a programming language appears in an input
    file, capture the file name with the regex pattern that the guest
    spec is attached to. ctags tries to find a proper parser for the
    file name by consulting the langmap.

    Use '``*``' as the prefix to the number for specifying the group of
    the regex pattern that captures the file name.

    Let's see an example. Suppose you have a shell script that emits
    program code instantiated from one of several templates. Here documents
    are used to represent the templates like:

    .. code-block:: sh

        i=...
        cat > foo.c <<EOF
            int main (void) { return $i; }
        EOF

        cat > foo.el <<EOF
            (defun foo () (1+ $i))
        EOF

    To run guest parsers for the here document areas, the shell
    script parser of ctags must choose the parsers from the file
    names (``foo.c`` and ``foo.el``):

    .. code-block:: ctags

        --regex-sh=/cat > ([a-z.]+) <<EOF//{_guest=*1,0end,}

    The pattern captures the file name in the input file with
    regex group 1, and specifies it as *<PARSER>*::

        {_guest=*1,...

The *<START>* and *<END>* fields of ``_guest`` regex flag
......................................................................

The *<START>* and *<END>* fields specify the area the *<PARSER>* parses. *<START>*
specifies the start of the area. *<END>* specifies the end of the area.

The forms of the two fields are the same: a regex group number
followed by ``start`` or ``end``, e.g. ``3start``, ``0end``. The suffixes,
``start`` and ``end``, represent one of the two boundaries of the group.

Let's see an example::

    {_guest=C,2end,3start}

This guest regex flag means running the C parser on the area between
``2end`` and ``3start``. ``2end`` means the area starts from the end of
the match of the 2nd regex group associated with the flag. ``3start``
means the area ends at the beginning of the match of the 3rd regex
group associated with the flag.

Let's see a more realistic example.
Here is an optlib file for an imaginary language `single`:

.. code-block:: ctags
    :emphasize-lines: 3

    --langdef=single
    --map-single=.single
    --regex-single=/^(BEGIN_C<).*(>END_C)$//{_guest=C,1end,2start}

This parser can run the C parser and extract the ``main`` function from the
following input file::

    BEGIN_C<int main (int argc, char **argv) { return 0; }>END_C
           ^                                              ^
           `- "1end" points here.                         |
                                   "2start" points here. -+

.. NOT REVIEWED YET

.. _defining-subparsers:

Defining a subparser
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Basic
.........................................................................

About the concept of subparser, see ":ref:`base-sub-parsers`".

The ``--langdef=<LANG>`` option is extended as
``--langdef=<LANG>[{base=<LANG>}[{shared|dedicated|bidirectional}]][{_autoFQTag}]`` to define
a subparser for a specified base parser. Combined with the ``--kinddef-<LANG>``
and ``--regex-<LANG>`` options, this lets you extend an existing parser
without risk of kind conflicts.

Let's see an example.

input.c

.. code-block:: C

    static int set_one_prio(struct task_struct *p, int niceval, int error)
    {
    }

    SYSCALL_DEFINE3(setpriority, int, which, int, who, int, niceval)
    {
        ...;
    }

.. code-block:: console

    $ ctags -x --_xformat="%20N %10K %10l" -o - input.c
            set_one_prio   function          C
         SYSCALL_DEFINE3   function          C

The C parser doesn't understand that ``SYSCALL_DEFINE3`` is a macro for defining an
entry point for a system call.

Let's define a `linux` subparser which uses the C parser as a base parser (``linux.ctags``):

.. code-block:: ctags
    :emphasize-lines: 1,3

    --langdef=linux{base=C}
    --kinddef-linux=s,syscall,system calls
    --regex-linux=/SYSCALL_DEFINE[0-9]\(([^, )]+)[\),]*/\1/s/

The output changes as follows with the `linux` parser:

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=./linux.ctags -x --_xformat="%20N %10K %10l" -o - input.c
             setpriority    syscall      linux
            set_one_prio   function          C
         SYSCALL_DEFINE3   function          C

``setpriority`` is recognized as a ``syscall`` of `linux`.

Using only ``--regex-C=...`` you can capture ``setpriority``.
However, that approach risks kind conflicts: when introducing
a new kind with ``--regex-C=...``, you cannot use a letter or name already
used by the C parser or by ``--regex-C=...`` options specified elsewhere.

A newly defined subparser gives you a new namespace for kinds.
In addition, you can enable or disable the subparser with the
``--languages=[+|-]`` option:

.. code-block:: console

    $ ctags --options=./linux.ctags --languages=-linux -x --_xformat="%20N %10K %10l" -o - input.c
            set_one_prio   function          C
         SYSCALL_DEFINE3   function          C

.. _optlib_directions:

Direction flags
.........................................................................

.. TESTCASE: Units/flags-langdef-directions.r

As explained in ":ref:`multiple_parsers_directions`" in
":ref:`multiple_parsers`", direction flags let you choose the direction(s) in
which a base parser and a subparser work together.

The following examples are taken from `#1409
<https://github.com/universal-ctags/ctags/issues/1409>`_ submitted by @sgraham to
the Universal Ctags repository on GitHub.

``input.cc`` and ``input.mojom`` are input files with the same
contents::

     ABC();
    int main(void)
    {
    }

The C++ parser can capture ``main`` as a function. The `Mojom` subparser defined
below runs on the C++ parser and captures ``ABC``.

shared combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{shared}`` is specified, for ``input.cc``, tags captured by both the C++ parser
and the mojom parser are recorded in the tags file. For ``input.mojom``, only
tags captured by the mojom parser are recorded in the tags file.

mojom-shared.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=mojom{base=C++}{shared}
    --map-mojom=+.mojom
    --kinddef-mojom=f,function,functions
    --regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=mojom-shared.ctags --fields=+l -o - input.cc
    ABC input.cc /^ ABC();$/;" f language:mojom
    main input.cc /^int main(void)$/;" f language:C++ typeref:typename:int

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=mojom-shared.ctags --fields=+l -o - input.mojom
    ABC input.mojom /^ ABC();$/;" f language:mojom

The mojom parser uses the C++ parser internally, but tags captured by the C++ parser are
dropped from the output.

dedicated combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{dedicated}`` is specified, for ``input.cc``, only tags captured by the C++
parser are recorded in the tags file. For ``input.mojom``, tags captured
by both the C++ parser and the mojom parser are recorded in the tags file.

mojom-dedicated.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=mojom{base=C++}{dedicated}
    --map-mojom=+.mojom
    --kinddef-mojom=f,function,functions
    --regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/

.. code-block:: console

    $ ctags --options=mojom-dedicated.ctags --fields=+l -o - input.cc
    main input.cc /^int main(void)$/;" f language:C++ typeref:typename:int

.. code-block:: console
    :emphasize-lines: 2-3

    $ ctags --options=mojom-dedicated.ctags --fields=+l -o - input.mojom
    ABC input.mojom /^ ABC();$/;" f language:mojom
    main input.mojom /^int main(void)$/;" f language:C++ typeref:typename:int

The mojom parser works only when a ``.mojom`` file is given as input.

bidirectional combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{bidirectional}`` is specified, tags captured by both the C++ parser and
the mojom parser are recorded in the tags file for both ``input.cc`` and
``input.mojom``.

mojom-bidirectional.ctags:

.. code-block:: ctags
    :emphasize-lines: 1

    --langdef=mojom{base=C++}{bidirectional}
    --map-mojom=+.mojom
    --kinddef-mojom=f,function,functions
    --regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/

.. code-block:: console
    :emphasize-lines: 2

    $ ctags --options=mojom-bidirectional.ctags --fields=+l -o - input.cc
    ABC input.cc /^ ABC();$/;" f language:mojom
    main input.cc /^int main(void)$/;" f language:C++ typeref:typename:int

.. code-block:: console
    :emphasize-lines: 2-3

    $ ctags --options=mojom-bidirectional.ctags --fields=+l -o - input.mojom
    ABC input.mojom /^ ABC();$/;" f language:mojom
    main input.mojom /^int main(void)$/;" f language:C++ typeref:typename:int


.. _optlib2c:

Translating an option file into C source code (optlib2c)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Universal Ctags has an ``optlib2c`` script that translates an option file into C
source code. Your optlib parser can thus easily become a built-in parser.

To add your optlib file, ``foo.ctags``, to ctags, follow these steps:

* copy the ``foo.ctags`` file into the ``optlib/`` directory
* add ``foo.ctags`` to the ``OPTLIB2C_INPUT`` variable in ``source.mak``
* add ``fooParser`` to the ``PARSER_LIST`` macro in ``main/parser_p.h``

You are encouraged to submit your :file:`.ctags` file to our repository on
GitHub through a pull request. See ":ref:`contributions`" for more details.
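
As an illustration of the kind of file these steps start from, here is a minimal
sketch of a hypothetical ``foo.ctags``. The language name ``foo``, the ``.foo``
extension, the ``d`` kind, and the ``def <name>`` syntax it matches are all
assumptions made up for this example; they do not describe a parser shipped with
Universal Ctags.

.. code-block:: ctags

    # Hypothetical parser for an imaginary language "foo".
    --langdef=foo
    --map-foo=+.foo
    --kinddef-foo=d,definition,definitions
    # Tag NAME in lines of the form "def NAME".
    --regex-foo=/^def[ \t]+([a-zA-Z_][a-zA-Z0-9_]*)/\1/d/

Before adding such a file to ``optlib/``, you can check that it behaves as intended
by loading it with ``--options=./foo.ctags``, in the same way ``linux.ctags`` is
loaded in the subparser example.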