.. _changes_tags_file:

Changes to the tags file format
---------------------------------------------------------------------

``F`` kind usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You cannot use the ``F`` (``file``) kind in your .ctags because Universal Ctags
reserves it. See :ref:`ctags-incompatibilities(7) <ctags-incompatibilities(7)>`.

Reference tags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traditionally ctags collects the information for locating where a
language object is DEFINED.

In addition, Universal Ctags supports reference tags. If the extra-tag
``r`` is enabled, Universal Ctags also collects the information for
locating where a language object is REFERENCED. This feature was
proposed by @shigio in `#569
<https://github.com/universal-ctags/ctags/issues/569>`_ for GNU GLOBAL.

Here are some examples, using the following input file named reftag.c.

.. code-block:: c

    #include <stdio.h>
    #include "foo.h"
    #define TYPE point
    struct TYPE { int x, y; };
    TYPE p;
    #undef TYPE


Traditional output:

.. code-block:: console

    $ ctags -o - reftag.c
    TYPE reftag.c /^#define TYPE /;" d file:
    TYPE reftag.c /^struct TYPE { int x, y; };$/;" s file:
    p reftag.c /^TYPE p;$/;" v typeref:typename:TYPE
    x reftag.c /^struct TYPE { int x, y; };$/;" m struct:TYPE typeref:typename:int file:
    y reftag.c /^struct TYPE { int x, y; };$/;" m struct:TYPE typeref:typename:int file:

Output with the extra-tag ``r`` enabled:

.. code-block:: console

    $ ctags --list-extras | grep ^r
    r Include reference tags off
    $ ctags -o - --extras=+r reftag.c
    TYPE reftag.c /^#define TYPE /;" d file:
    TYPE reftag.c /^#undef TYPE$/;" d file:
    TYPE reftag.c /^struct TYPE { int x, y; };$/;" s file:
    foo.h reftag.c /^#include "foo.h"/;" h
    p reftag.c /^TYPE p;$/;" v typeref:typename:TYPE
    stdio.h reftag.c /^#include <stdio.h>/;" h
    x reftag.c /^struct TYPE { int x, y; };$/;" m struct:TYPE typeref:typename:int file:
    y reftag.c /^struct TYPE { int x, y; };$/;" m struct:TYPE typeref:typename:int file:

`#undef TYPE` and the two `#include` lines are newly collected.

"roles" is a newly introduced field in Universal Ctags. The field
records how a tag is referenced. If a tag is a definition
tag, the roles field has "def" as its value.

Universal Ctags prints the role information when the `r`
field is enabled with ``--fields=+r``.

.. code-block:: console

    $ ctags -o - --extras=+r --fields=+r reftag.c
    TYPE reftag.c /^#define TYPE /;" d file:
    TYPE reftag.c /^#undef TYPE$/;" d file: roles:undef
    TYPE reftag.c /^struct TYPE { int x, y; };$/;" s file: roles:def
    foo.h reftag.c /^#include "foo.h"/;" h roles:local
    p reftag.c /^TYPE p;$/;" v typeref:typename:TYPE roles:def
    stdio.h reftag.c /^#include <stdio.h>/;" h roles:system
    x reftag.c /^struct TYPE { int x, y; };$/;" m struct:TYPE typeref:typename:int file: roles:def
    y reftag.c /^struct TYPE { int x, y; };$/;" m struct:TYPE typeref:typename:int file: roles:def

The `Reference tag marker` field, ``R``, is a specialized GNU GLOBAL
requirement; ``D`` is used for the traditional definition tags, and ``R`` is
used for the new reference tags. The field can be used only with
``--_xformat``.

.. code-block:: console

    $ ctags -x --_xformat="%R %-16N %4n %-16F %C" --extras=+r reftag.c
    D TYPE                3 reftag.c         #define TYPE point
    D TYPE                4 reftag.c         struct TYPE { int x, y; };
    D p                   5 reftag.c         TYPE p;
    D x                   4 reftag.c         struct TYPE { int x, y; };
    D y                   4 reftag.c         struct TYPE { int x, y; };
    R TYPE                6 reftag.c         #undef TYPE
    R foo.h               2 reftag.c         #include "foo.h"
    R stdio.h             1 reftag.c         #include <stdio.h>

See :ref:`Customizing xref output <xformat>` for more details about
``--_xformat``.

Although the facility for collecting reference tags is implemented,
only a few parsers currently utilize it. All available roles can be
listed with ``--list-roles``:

.. code-block:: console

    $ ctags --list-roles
    #LANGUAGE      KIND(L/N)         NAME          ENABLED DESCRIPTION
    SystemdUnit    u/unit            Requires      on      referred in Requires key
    SystemdUnit    u/unit            Wants         on      referred in Wants key
    SystemdUnit    u/unit            After         on      referred in After key
    SystemdUnit    u/unit            Before        on      referred in Before key
    SystemdUnit    u/unit            RequiredBy    on      referred in RequiredBy key
    SystemdUnit    u/unit            WantedBy      on      referred in WantedBy key
    Yaml           a/anchor          alias         on      alias
    DTD            e/element         attOwner      on      attributes owner
    Automake       c/condition       branched      on      used for branching
    Cobol          S/sourcefile      copied        on      copied in source file
    Maven2         g/groupId         dependency    on      dependency
    DTD            p/parameterEntity elementName   on      element names
    DTD            p/parameterEntity condition     on      conditions
    LdScript       s/symbol          entrypoint    on      entry points
    LdScript       i/inputSection    discarded     on      discarded when linking
    ...

.. NOTE: --xformat is the only way to extract referenced tag

The first column shows the name of the parser.
The second column shows the letter/name of the kind.
The third column shows the name of the role.
The fourth column shows whether the role is enabled or not.
The fifth column shows the description of the role.
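
A client tool can tell definition tags from reference tags by reading the
``roles`` field. The following is a minimal Python sketch, not part of ctags:
the helper names ``parse_tag_line`` and ``is_reference`` are hypothetical, the
sample lines are copied from the ``--fields=+r`` output above with the tab
separators written out explicitly, and a tag without a ``roles`` field is
assumed to be a definition.

.. code-block:: python

    def parse_tag_line(line):
        """Split one tags-file line into (name, path, kind, fields).

        Assumes the kind is the first extension field and that the pattern
        does not itself contain the sequence ';"<TAB>'.
        """
        head, _, tail = line.rstrip("\n").partition(';"\t')
        name, path, _pattern = head.split("\t", 2)
        kind, *rest = tail.split("\t")
        # Extension fields look like "key:value"; the value may be empty ("file:").
        fields = dict(item.split(":", 1) for item in rest if ":" in item)
        return name, path, kind, fields


    def is_reference(fields):
        """Treat any role other than "def" as a reference (assumption)."""
        return fields.get("roles", "def") != "def"


    SAMPLE = [
        'TYPE\treftag.c\t/^#define TYPE /;"\td\tfile:',
        'TYPE\treftag.c\t/^#undef TYPE$/;"\td\tfile:\troles:undef',
        'foo.h\treftag.c\t/^#include "foo.h"/;"\th\troles:local',
        'p\treftag.c\t/^TYPE p;$/;"\tv\ttyperef:typename:TYPE\troles:def',
    ]

    references = []
    for line in SAMPLE:
        name, _path, _kind, fields = parse_tag_line(line)
        if is_reference(fields):
            references.append((name, fields["roles"]))

    print(references)  # [('TYPE', 'undef'), ('foo.h', 'local')]

Only the ``#undef`` and ``#include`` entries are reported, which matches the
``R`` rows in the ``--_xformat`` output.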

You can define a role in an optlib parser for capturing reference
tags. See :ref:`Capturing reference tags <roles>` for more
details.

``--roles-<LANG>.<KIND>`` is the option for enabling/disabling
specified roles.

Pseudo-tags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. IN MAN PAGE

See :ref:`ctags-client-tools(7) <ctags-client-tools(7)>` about the
concept of the pseudo-tags.

.. TODO move the following contents to ctags-client-tools(7).

``TAG_KIND_DESCRIPTION``
.........................................................................

This is a newly introduced pseudo-tag. It is not emitted by default.
It is emitted only when ``--pseudo-tags=+TAG_KIND_DESCRIPTION`` is
given.

This is for describing kinds; their letter, name, and description are
enumerated in the tag.

ctags emits ``TAG_KIND_DESCRIPTION`` with the following format::

    !_TAG_KIND_DESCRIPTION!{parser} {letter},{name} /{description}/

A backslash and a slash in {description} are escaped with a backslash.


``TAG_KIND_SEPARATOR``
.........................................................................

This is a newly introduced pseudo-tag. It is not emitted by default.
It is emitted only when ``--pseudo-tags=+TAG_KIND_SEPARATOR`` is
given.

This is for describing separators placed between two kinds in a
language.

Tag entries including the separators are emitted when ``--extras=+q``
is given; fully qualified tags contain the separators. The separators
are used in scope information, too.

ctags emits ``TAG_KIND_SEPARATOR`` with the following format::

    !_TAG_KIND_SEPARATOR!{parser} {sep} /{upper}{lower}/

or ::

    !_TAG_KIND_SEPARATOR!{parser} {sep} /{lower}/

Here {parser} is the name of the language, e.g. PHP.
{lower} is the letter representing the kind of the lower item.
{upper} is the letter representing the kind of the upper item.
{sep} is the separator placed between the upper item and the lower
item.

The format without {upper} is for representing a root separator. The
root separator is used as a prefix for an item which has no upper scope.

`*` given as {upper} is a fallback wild card; if it is given, the
{sep} is used in combination with any upper item and the item
specified with {lower}.

Each backslash character used in {sep} is escaped with an extra
backslash character.

Example output:

.. code-block:: console

    $ ctags -o - --extras=+p --pseudo-tags= --pseudo-tags=+TAG_KIND_SEPARATOR input.php
    !_TAG_KIND_SEPARATOR!PHP :: /*c/
    ...
    !_TAG_KIND_SEPARATOR!PHP \\ /c/
    ...
    !_TAG_KIND_SEPARATOR!PHP \\ /nc/
    ...

The first line means ``::`` is used when combining something with an
item of the class kind.

The second line means ``\\`` is used when a class item is at the top
level; no upper item is specified.

The third line means ``\\`` is used for combining a namespace item
(upper) and a class item (lower).

Of course, ctags uses the more specific line when choosing a
separator; the third line has higher priority than the first.

``TAG_OUTPUT_FILESEP``
.........................................................................

This pseudo-tag represents the separator used in file names: slash or
backslash. This is always 'slash' on Unix-like environments.
It is also 'slash' by default on Windows; however, when
``--output-format=e-ctags`` or ``--use-slash-as-filename-separator=no``
is specified, it becomes 'backslash'.


``TAG_OUTPUT_MODE``
.........................................................................

.. NOT REVIEWED YET

This pseudo-tag represents the output mode: u-ctags or e-ctags.
This is controlled by the ``--output-format`` option.
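
For client tools that read pseudo-tags, the lookup rules for
``TAG_KIND_SEPARATOR`` entries described earlier in this section can be
sketched in Python. This is only an illustration under stated assumptions: the
function names (``unescape``, ``load_separators``, ``pick_separator``) are
hypothetical, the sample lines are the three PHP lines from the example output
with the tab separators written out, and kind letters are assumed to be single
characters.

.. code-block:: python

    def unescape(sep):
        """Undo the backslash escaping applied to {sep}."""
        out, i = [], 0
        while i < len(sep):
            if sep[i] == "\\" and i + 1 < len(sep):
                i += 1  # skip the escaping backslash, keep the next character
            out.append(sep[i])
            i += 1
        return "".join(out)


    def load_separators(lines, parser):
        """Map (upper, lower) kind letters to a separator.

        upper is None for a root separator and "*" for the wild card.
        """
        prefix = "!_TAG_KIND_SEPARATOR!" + parser + "\t"
        table = {}
        for line in lines:
            if not line.startswith(prefix):
                continue
            sep, kinds = line[len(prefix):].rstrip("\n").split("\t", 1)
            kinds = kinds[1:-1]  # strip the surrounding slashes
            if len(kinds) == 1:
                upper, lower = None, kinds
            else:
                upper, lower = kinds[0], kinds[1]
            table[(upper, lower)] = unescape(sep)
        return table


    def pick_separator(table, upper, lower):
        """Prefer the specific entry; fall back to the "*" wild card."""
        if upper is None:
            return table.get((None, lower))
        return table.get((upper, lower), table.get(("*", lower)))


    SAMPLE = [
        "!_TAG_KIND_SEPARATOR!PHP\t::\t/*c/",
        "!_TAG_KIND_SEPARATOR!PHP\t\\\\\t/c/",
        "!_TAG_KIND_SEPARATOR!PHP\t\\\\\t/nc/",
    ]

    table = load_separators(SAMPLE, "PHP")
    print(pick_separator(table, "n", "c"))   # a single backslash
    print(pick_separator(table, "f", "c"))   # ::
    print(pick_separator(table, None, "c"))  # a single backslash

The namespace/class pair resolves to the specific ``/nc/`` entry, while any
other upper kind (``f`` here is just a placeholder letter) falls back to the
``*`` entry.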

See also :ref:`Compatible output and weakness <compat-output>`.

Truncating the pattern for long input lines
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the ``--pattern-length-limit=N`` option in :ref:`ctags(1) <ctags(1)>`.

.. _parser-specific-fields:

Parser specific fields
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A tag has a `name`, an `input` file name, and a `pattern` as basic
information. Some fields like `language:`, `signature:`, etc. are
attached to the tag as optional information.

In Exuberant Ctags, fields are common to all languages.
Universal Ctags extends the concept of fields; a parser can define
its own specific fields. This extension was proposed by @pragmaware in
`#857 <https://github.com/universal-ctags/ctags/issues/857>`_.

To support parser specific fields, the options for listing and
enabling/disabling fields are also extended.

In the output of ``--list-fields``, the owner of the field is printed
in the `LANGUAGE` column:

.. code-block:: console

    $ ctags --list-fields
    #LETTER NAME          ENABLED LANGUAGE         XFMT DESCRIPTION
    ...
    -       end           off     C                TRUE end lines of various constructs
    -       properties    off     C                TRUE properties (static, inline, mutable,...)
    -       end           off     C++              TRUE end lines of various constructs
    -       template      off     C++              TRUE template parameters
    -       captures      off     C++              TRUE lambda capture list
    -       properties    off     C++              TRUE properties (static, virtual, inline, mutable,...)
    -       sectionMarker off     reStructuredText TRUE character used for declaring section
    -       version       off     Maven2           TRUE version of artifact

For example, reStructuredText is the owner of the `sectionMarker` field, and
both C and C++ own an `end` field.

``--list-fields`` takes one optional argument, `LANGUAGE`. If it is
given, ``--list-fields`` prints only the fields for that parser:

.. code-block:: console

    $ ctags --list-fields=Maven2
    #LETTER NAME    ENABLED LANGUAGE XFMT DESCRIPTION
    -       version off     Maven2   TRUE version of artifact

A parser specific field only has a long name, no letter. For
enabling/disabling such fields, the name must be passed to
``--fields-<LANG>``.

For example, to enable the `sectionMarker` field owned by the
`reStructuredText` parser, use the following command line:

.. code-block:: console

    $ ctags --fields-reStructuredText=+{sectionMarker} ...

The wild card notation can be used for enabling/disabling parser specific
fields, too. The following example enables all fields owned by the
`C++` parser.

.. code-block:: console

    $ ctags --fields-C++='*' ...

`*` can also be used for specifying languages.

The next example enables the `end` field for all languages which
have such a field.

.. code-block:: console

    $ ctags --fields-'*'=+'{end}' ...
    ...

In this case, where the wild card notation specifies the language, not
only fields owned by parsers but also common fields having the name
specified (`end` in this example) are enabled/disabled.

Using the wild card notation to specify the language is helpful to
avoid incompatibilities between versions of Universal Ctags itself
(SELF INCOMPATIBLY).

In Universal Ctags development, a parser developer may add a new
parser specific field for a certain language. Sometimes other developers
then recognize that it is meaningful not only for the original language
but also for other languages. In this case the field may be promoted to a
common field. Such a promotion will break the command line
compatibility for ``--fields-<LANG>`` usage. The wild card for
`<LANG>` will help in avoiding this unwanted effect of the promotion.
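
A wrapper script that calls ctags can apply this advice by always generating
the wild card form. The following Python sketch is an assumption-laden
illustration: ``fields_options`` is a hypothetical helper, it emits one option
per field name, and it only builds the argument strings without running ctags.

.. code-block:: python

    def fields_options(names, lang="*", enable=True):
        """Build --fields-<LANG> options, one per long field name.

        lang defaults to the wild card so that the options keep working
        if a parser specific field is later promoted to a common field.
        """
        sign = "+" if enable else "-"
        return ["--fields-%s=%s{%s}" % (lang, sign, name) for name in names]


    print(fields_options(["end"]))
    # ['--fields-*=+{end}']
    print(fields_options(["sectionMarker"], lang="reStructuredText"))
    # ['--fields-reStructuredText=+{sectionMarker}']

Passing the resulting list as separate arguments (for example to
``subprocess.run``) avoids the shell quoting of `*` and the braces that the
console examples above need.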

With respect to the tags file format, nothing is changed when
introducing parser specific fields; `<fieldname>`:`<value>` is used as
before and the name of the field owner is never prefixed. The `language:`
field of the tag identifies the owner.


Parser specific extras
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. NOT REVIEWED YET

As the man page of Exuberant Ctags says, the ``--extras`` option specifies
whether to include extra tag entries for certain kinds of information.
This option is available in Universal Ctags, too.

In Universal Ctags it is extended; a parser can define its own specific
extra flags. They can be controlled with ``--extras-<LANG>=[+|-]{...}``.

See some examples:

.. code-block:: console

    $ ctags --list-extras
    #LETTER NAME              ENABLED LANGUAGE DESCRIPTION
    F       fileScope         TRUE    NONE     Include tags ...
    f       inputFile         FALSE   NONE     Include an entry ...
    p       pseudo            FALSE   NONE     Include pseudo tags
    q       qualified         FALSE   NONE     Include an extra ...
    r       reference         FALSE   NONE     Include reference tags
    g       guest             FALSE   NONE     Include tags ...
    -       whitespaceSwapped TRUE    Robot    Include tags swapping ...

See the `LANGUAGE` column. NONE means the extra flags are language
independent (common). They can be enabled or disabled with `--extras=` as before.

Look at `whitespaceSwapped`. Its language is `Robot`. This flag is enabled
by default but can be disabled with `--extras-Robot=-{whitespaceSwapped}`.

.. code-block:: console

    $ cat input.robot
    *** Keywords ***
    it's ok to be correct
        Python_keyword_2

    $ ctags -o - input.robot
    it's ok to be correct input.robot /^it's ok to be correct$/;" k
    it's_ok_to_be_correct input.robot /^it's ok to be correct$/;" k

    $ ctags -o - --extras-Robot=-'{whitespaceSwapped}' input.robot
    it's ok to be correct input.robot /^it's ok to be correct$/;" k

When the flag is disabled, the name `it's_ok_to_be_correct` is not included
in the tags output. In other words, the name `it's_ok_to_be_correct` is
derived from the name `it's ok to be correct` when the extra flag is
enabled.

Discussion
.........................................................................

.. NOT REVIEWED YET

(This subsection should be moved somewhere for developers.)

The question is what extra tag entries are. As far as I know, no one has
answered this explicitly. I have two ideas for Universal Ctags. I
write "ideas", not "definitions", here because existing parsers don't
follow the ideas. They are kept as they are for various reasons, but the
ideas may be a good guide for people who want to write a new parser
or extend an existing parser.

The first idea is that if the name of a tag entry appears in the input
file as is, the entry is NOT an extra. (If you want to control the
inclusion of such entries, the classical ``--kinds-<LANG>=[+|-]...`` is
what you want.)

Qualified tags, whose inclusion is controlled by ``--extras=+q``, are
explained well by this idea.
Let's see an example:

.. code-block:: console

    $ cat input.py
    class Foo:
        def func (self):
            pass

    $ ctags -o - --extras=+q --fields=+E input.py
    Foo input.py /^class Foo:$/;" c
    Foo.func input.py /^    def func (self):$/;" m class:Foo extra:qualified
    func input.py /^    def func (self):$/;" m class:Foo

`Foo` and `func` are in `input.py`, so they are not extra tags. On the
other hand, `Foo.func` is not in `input.py` as is. The name is
generated by ctags as a qualified extra tag entry.
The `whitespaceSwapped` extra flag of the `Robot` parser also fits
this idea well.

I don't say all parsers follow this idea.

.. code-block:: console

    $ cat input.cc
    class A
    {
        A operator+ (int);
    };

    $ ctags --kinds-all='*' --fields= -o - input.cc
    A input.cc /^class A$/
    operator + input.cc /^    A operator+ (int);$/

In this example `operator+` is in `input.cc`.
On the other hand, `operator +` is in the ctags output as a non-extra tag entry.
Note the whitespace between the keyword `operator` and the `+` operator.
This is an exception to the first idea.

The second idea is that if the *inclusion* of a tag cannot be
controlled well with ``--kinds-<LANG>=[+|-]...``, the tag may be an
extra.

.. code-block:: console

    $ cat input.c
    static int foo (void)
    {
        return 0;
    }
    int bar (void)
    {
        return 1;
    }

    $ ctags --sort=no -o - --extras=+F input.c
    foo input.c /^static int foo (void)$/;" f typeref:typename:int file:
    bar input.c /^int bar (void)$/;" f typeref:typename:int

    $ ctags -o - --extras=-F input.c
    bar input.c /^int bar (void)$/;" f typeref:typename:int

    $

The function `foo` is included only when the `F` extra flag
is enabled. Both `foo` and `bar` are functions. Their inclusion
can be controlled with the `f` kind of the C language, ``--kinds-C=[+|-]f``,
but that option cannot distinguish between the two.

The difference between the static modifier and the implicit extern modifier in
a function definition is handled by the `F` extra flag.

Basically the concept of kind is for handling the kinds of language
objects: functions, variables, macros, types, etc. The concept of extra
can handle the other aspects like scope (static or extern).

However, a parser developer can take another approach instead of
introducing a parser specific extra; one can prepare `staticFunction` and
`exportedFunction` as kinds of one's parser. The second idea is
just a guide; the parser developer must decide on a suitable approach for the
target language.

Anyway, in the second idea, ``--extras`` is for controlling inclusion
of tags. If what you want is not about inclusion, ``--param-<LANG>``
can be used as the last resort.


Parser specific parameter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. NOT REVIEWED YET

To control the details of a parser, the ``--param-<LANG>`` option is introduced.
``--kinds-<LANG>``, ``--fields-<LANG>``, and ``--extras-<LANG>``
can be used for customizing the behavior of a parser specified with ``<LANG>``.

``--param-<LANG>`` should be used for aspects of the parser that
those options (kinds, fields, extras) cannot handle well.

A parser defines a set of parameters. Each parameter has a name and
takes an argument. A user can set a parameter with the following notation::

    --param-<LANG>.name=arg

An example of specifying a parameter::

    --param-CPreProcessor.if0=true

Here `if0` is the name of a parameter of the CPreProcessor parser and
`true` is its value.

All available parameters can be listed with the ``--list-params`` option.

.. code-block:: console

    $ ctags --list-params
    #PARSER       NAME   DESCRIPTION
    CPreProcessor if0    examine code within "#if 0" branch (true or [false])
    CPreProcessor ignore a token to be specially handled

(At this time only the CPreProcessor parser has parameters.)
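
A script can discover the parameters and build the matching option from the
``--list-params`` listing. The following Python sketch is hypothetical helper
code, not part of ctags: it assumes the three columns are separated by
whitespace, that parser and parameter names contain no spaces, and it works on
the sample listing above rather than running ctags.

.. code-block:: python

    def parse_list_params(text):
        """Return {parser: {name: description}} from --list-params output."""
        params = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and the header row
            parser, name, description = line.split(None, 2)
            params.setdefault(parser, {})[name] = description
        return params


    def param_option(parser, name, value):
        """Format one --param-<LANG>.name=arg option."""
        return "--param-%s.%s=%s" % (parser, name, value)


    LISTING = """\
    #PARSER       NAME   DESCRIPTION
    CPreProcessor if0    examine code within "#if 0" branch (true or [false])
    CPreProcessor ignore a token to be specially handled
    """

    params = parse_list_params(LISTING)
    print(sorted(params["CPreProcessor"]))
    # ['if0', 'ignore']
    print(param_option("CPreProcessor", "if0", "true"))
    # --param-CPreProcessor.if0=true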