.. _ctags-client-tools(7):

==============================================================
ctags-client-tools
==============================================================
---------------------------------------------------------------------------------
Hints for developing a tool using @CTAGS_NAME_EXECUTABLE@ command and tags output
---------------------------------------------------------------------------------
:Version: @VERSION@
:Manual group: Universal Ctags
:Manual section: 7

SYNOPSIS
--------
| **@CTAGS_NAME_EXECUTABLE@** [options] [file(s)]
| **@ETAGS_NAME_EXECUTABLE@** [options] [file(s)]


DESCRIPTION
-----------
**Client tool** means a tool running the @CTAGS_NAME_EXECUTABLE@ command
and/or reading a tags file generated by the @CTAGS_NAME_EXECUTABLE@ command.
This man page gathers hints for people who develop client tools.


PSEUDO-TAGS
-----------
**Pseudo-tags**, stored in a tags file, indicate how
@CTAGS_NAME_EXECUTABLE@ generated the tags file: whether the
tags file is sorted or not, which version of the tags file format is used,
the name of the tags generator, and so on. The opposite term for
pseudo-tags is **regular-tags**. A regular-tag is for a language
object in an input file. A pseudo-tag is for the tags file
itself. Client tools may use pseudo-tags as reference for processing
regular-tags.

A pseudo-tag is stored in a tags file in the same format as
regular-tags as described in tags(5), except that pseudo-tag names
are prefixed with "!_". For general information about
pseudo-tags, see "TAG FILE INFORMATION" in tags(5).

An example of a pseudo tag::

    !_TAG_PROGRAM_NAME	Universal Ctags	/Derived from Exuberant Ctags/

The value, "Universal Ctags", associated with the pseudo tag
"TAG_PROGRAM_NAME", is stored in the field for input file. The
description, "Derived from Exuberant Ctags", is stored in the field for
pattern.

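
As a rough illustration of the layout described above, the following
Python sketch splits one pseudo-tag line into its name, value, and
description. It is not part of ctags: the names ``PseudoTag`` and
``parse_pseudo_tag`` are hypothetical, and the sketch assumes
tab-separated fields and that every line starting with "!_" is a
pseudo-tag.

```python
from typing import NamedTuple, Optional


class PseudoTag(NamedTuple):
    """Hypothetical container for one pseudo-tag line."""
    name: str         # tag name without the "!_" prefix
    value: str        # taken from the field for input file
    description: str  # taken from the field for pattern, slashes removed


def parse_pseudo_tag(line: str) -> Optional[PseudoTag]:
    """Return a PseudoTag for a "!_" line, or None for any other line."""
    if not line.startswith("!_"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None  # assumption: a pseudo-tag has at least three fields
    name, value, pattern = fields[0][2:], fields[1], fields[2]
    # Tolerate a trailing ;" after the pattern field.
    if pattern.endswith(';"'):
        pattern = pattern[:-2]
    # Strip the surrounding slashes of /description/.
    if len(pattern) >= 2 and pattern[0] == "/" and pattern[-1] == "/":
        pattern = pattern[1:-1]
    return PseudoTag(name, value, pattern)
```

A client tool could call ``parse_pseudo_tag`` on each line of a tags
file and stop at the first line for which it returns ``None`` when the
pseudo-tags are grouped at the top of the file.
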

Universal Ctags extends the naming scheme of the classical pseudo-tags
available in Exuberant Ctags for emitting language specific
information as pseudo tags::

    !_{pseudo-tag-name}!{language-name}	{associated-value}	/{description}/

The language-name is appended to the pseudo-tag name with a separator, "!".

An example of a pseudo tag with a language suffix::

    !_TAG_KIND_DESCRIPTION!C	f,function	/function definitions/

This pseudo-tag says "the function kind of C language is enabled
when generating this tags file." ``--pseudo-tags`` is the option for
enabling/disabling individual pseudo-tags. When enabling/disabling a
pseudo tag with the option, specify the tag name only
("TAG_KIND_DESCRIPTION"), without the prefix ("!_") or the suffix ("!C").


Options for Pseudo-tags
~~~~~~~~~~~~~~~~~~~~~~~
``--extras=+p`` (or ``--extras=+{pseudo}``)
    Forces writing pseudo-tags.

    @CTAGS_NAME_EXECUTABLE@ emits pseudo-tags by default when writing tags
    to a regular file (e.g. "tags"). However, when specifying ``-o -``
    or ``-f -`` for writing tags to standard output,
    @CTAGS_NAME_EXECUTABLE@ doesn't emit pseudo-tags. ``--extras=+p`` or
    ``--extras=+{pseudo}`` will force pseudo-tags to be written.

``--list-pseudo-tags``
    Lists available types of pseudo-tags and shows whether they are enabled or disabled.

    Running @CTAGS_NAME_EXECUTABLE@ with the ``--list-pseudo-tags`` option
    lists available pseudo-tags. Some of the pseudo-tags newly introduced
    in the Universal Ctags project are disabled by default. Use
    ``--pseudo-tags=...`` to enable them.

``--pseudo-tags=[+|-]names|*``
    Specifies a list of pseudo-tag types to include in the output.

    The parameters are a set of pseudo tag names. Valid pseudo tag names
    can be listed with ``--list-pseudo-tags``. Surround each name in the set
    with braces, like "{TAG_PROGRAM_AUTHOR}".
    You don't have to include the "!_"
    pseudo tag prefix when specifying a name in the option argument for the
    ``--pseudo-tags=`` option.

    Pseudo-tags don't have a notation using one-letter flags.

    If a name is preceded by either the '+' or '-' character, that
    pseudo-tag is added to or removed from the current set. Otherwise the
    names replace any current settings. All entries are included if '*' is given.

``--fields=+E`` (or ``--fields=+{extras}``)
    Attach "extras:pseudo" field to pseudo-tags.

    An example of pseudo tags with the field::

        !_TAG_PROGRAM_NAME	Universal Ctags	/Derived from Exuberant Ctags/	extras:pseudo

    If the name of a normal tag in a tags file starts with "!_", a
    client tool cannot distinguish whether the tag is a regular-tag or
    a pseudo-tag. The fields attached with this option help the tool
    distinguish them.


List of notable pseudo-tags
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running ctags with the ``--list-pseudo-tags`` option lists available types
of pseudo-tags with short descriptions. This subsection shows hints
for using notable ones.

``TAG_EXTRA_DESCRIPTION`` (new in Universal Ctags)
    Indicates the names and descriptions of enabled extras::

        !_TAG_EXTRA_DESCRIPTION	{extra-name}	/description/
        !_TAG_EXTRA_DESCRIPTION!{language-name}	{extra-name}	/description/

    If your tool relies on some extra tags (extras), refer to
    the pseudo-tags of this type. A tool can reject a tags file that
    doesn't include expected extras, and raise an error in an early
    stage of processing.

    An example of the pseudo-tags::

        $ @CTAGS_NAME_EXECUTABLE@ --extras=+p --pseudo-tags='{TAG_EXTRA_DESCRIPTION}' -o - input.c
        !_TAG_EXTRA_DESCRIPTION	anonymous	/Include tags for non-named objects like lambda/
        !_TAG_EXTRA_DESCRIPTION	fileScope	/Include tags of file scope/
        !_TAG_EXTRA_DESCRIPTION	pseudo	/Include pseudo tags/
        !_TAG_EXTRA_DESCRIPTION	subparser	/Include tags generated by subparsers/
        ...

    From the output, a client tool can tell that the "{anonymous}",
    "{fileScope}", "{pseudo}", and "{subparser}" extras are enabled.

``TAG_FIELD_DESCRIPTION`` (new in Universal Ctags)
    Indicates the names and descriptions of enabled fields::

        !_TAG_FIELD_DESCRIPTION	{field-name}	/description/
        !_TAG_FIELD_DESCRIPTION!{language-name}	{field-name}	/description/

    If your tool relies on some fields, refer to the pseudo-tags of
    this type. A tool can reject a tags file that doesn't include
    expected fields, and raise an error in an early stage of
    processing.

    An example of the pseudo-tags::

        $ @CTAGS_NAME_EXECUTABLE@ --fields-C=+'{macrodef}' --extras=+p --pseudo-tags='{TAG_FIELD_DESCRIPTION}' -o - input.c
        !_TAG_FIELD_DESCRIPTION	file	/File-restricted scoping/
        !_TAG_FIELD_DESCRIPTION	input	/input file/
        !_TAG_FIELD_DESCRIPTION	name	/tag name/
        !_TAG_FIELD_DESCRIPTION	pattern	/pattern/
        !_TAG_FIELD_DESCRIPTION	typeref	/Type and name of a variable or typedef/
        !_TAG_FIELD_DESCRIPTION!C	macrodef	/macro definition/
        ...

    From the output, a client tool can tell that the "{file}", "{input}",
    "{name}", "{pattern}", and "{typeref}" fields are enabled.
    These fields are common to all languages. In addition to the common
    fields, the tool can tell that the "{macrodef}" field of C language is
    also enabled.

``TAG_FILE_ENCODING`` (new in Universal Ctags)
    TBW

``TAG_FILE_FORMAT``
    See also tags(5).

``TAG_FILE_SORTED``
    See also tags(5).

``TAG_KIND_DESCRIPTION`` (new in Universal Ctags)
    Indicates the names and descriptions of enabled kinds::

        !_TAG_KIND_DESCRIPTION!{language-name}	{kind-letter},{kind-name}	/description/

    If your tool relies on some kinds, refer to the pseudo-tags of
    this type. A tool can reject a tags file that doesn't include
    expected kinds, and raise an error in an early stage of
    processing.

    Kinds are language specific, so a language name is always
    appended to the tag name as a suffix.

    An example of the pseudo-tags::

        $ @CTAGS_NAME_EXECUTABLE@ --extras=+p --kinds-C=vfm --pseudo-tags='{TAG_KIND_DESCRIPTION}' -o - input.c
        !_TAG_KIND_DESCRIPTION!C	f,function	/function definitions/
        !_TAG_KIND_DESCRIPTION!C	m,member	/struct, and union members/
        !_TAG_KIND_DESCRIPTION!C	v,variable	/variable definitions/
        ...

    From the output, a client tool can tell that the "{function}",
    "{member}", and "{variable}" kinds of C language are enabled.

``TAG_KIND_SEPARATOR`` (new in Universal Ctags)
    TBW

``TAG_OUTPUT_EXCMD`` (new in Universal Ctags)
    Indicates the type of EX command specified with the ``--excmd`` option.

``TAG_OUTPUT_FILESEP`` (new in Universal Ctags)
    TBW

``TAG_OUTPUT_MODE`` (new in Universal Ctags)
    TBW

``TAG_PATTERN_LENGTH_LIMIT`` (new in Universal Ctags)
    TBW

``TAG_PROC_CWD`` (new in Universal Ctags)
    Indicates the working directory of @CTAGS_NAME_EXECUTABLE@ during processing.

    This pseudo-tag helps a client tool resolve the absolute paths of
    the input files for tag entries even when they are tagged with
    relative paths.

    An example of the pseudo-tags::

        $ cat tags
        !_TAG_PROC_CWD	/tmp/	//
        main	input.c	/^int main (void) { return 0; }$/;"	f	typeref:typename:int
        ...

    From the regular tag for "main", the client tool can tell that
    "main" is in "input.c". However, it is a relative path.
    So if the
    directory where @CTAGS_NAME_EXECUTABLE@ ran and the directory
    where the client tool runs are different, the client tool cannot
    find "input.c" in the file system. In that case,
    ``TAG_PROC_CWD`` gives the tool a hint: "input.c" may be at "/tmp".

``TAG_PROGRAM_NAME``
    TBW

``TAG_ROLE_DESCRIPTION`` (new in Universal Ctags)
    Indicates the names and descriptions of enabled roles::

        !_TAG_ROLE_DESCRIPTION!{language-name}!{kind-name}	{role-name}	/description/

    If your tool relies on some roles, refer to the pseudo-tags of
    this type. Note that a role owned by a disabled kind is not listed
    even if the role itself is enabled.

REDUNDANT-KINDS
---------------
TBW

MULTIPLE-LANGUAGES FOR AN INPUT FILE
------------------------------------
Universal Ctags can run multiple parsers on one input file.
That means tags for different languages may be emitted for the same
file. The ``language``/``l`` field can be used to show the language
for each tag.

.. code-block:: console

    $ cat /tmp/foo.html
    <html>
    <script>var x = 1</script>
     <h1>title</h1>
    </html>
    $ ./ctags -o - --extras=+g /tmp/foo.html
    title	/tmp/foo.html	/^ <h1>title<\/h1>$/;"	h
    x	/tmp/foo.html	/var x = 1/;"	v
    $ ./ctags -o - --extras=+g --fields=+l /tmp/foo.html
    title	/tmp/foo.html	/^ <h1>title<\/h1>$/;"	h	language:HTML
    x	/tmp/foo.html	/var x = 1/;"	v	language:JavaScript

UTILIZING READTAGS
------------------
See readtags(1) for how to use readtags. This section discusses
some notable topics for client tools.

Build Filter/Sorter Expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Certain escape sequences in expressions are recognized by readtags. For
example, when searching for a tag that matches ``a\?b``, if using a filter
expression like ``'(eq?
$name "a\?b")'``, since ``\?`` is translated into a
single ``?`` by readtags, it actually searches for ``a?b``.

Another problem is that if a single quote appears in a filter expression
(which is itself wrapped in single quotes), it terminates the expression,
producing a broken expression, and may even cause unintended shell
injection. Single quotes can be escaped using ``'"'"'``.

So, client tools need to:

* Replace ``\`` by ``\\``
* Replace ``'`` by ``'"'"'``

inside the expressions. If the expression also contains strings, ``"`` in the
strings needs to be replaced by ``\"``.

Client tools written in Lisp could build the expression using lists. ``prin1``
(in Common Lisp style Lisps) and ``write`` (in Scheme style Lisps) can
translate the list into a string that can be directly used. For example, in
EmacsLisp:

.. code-block:: EmacsLisp

    (let ((name "hi"))
      (prin1 `(eq? $name ,name)))
    => "(eq\\? $name "hi")"

The "?" is escaped, and readtags can handle it. Scheme style Lisps should do
proper escaping so the expression readtags gets is just the expression passed
into ``write``. Common Lisp style Lisps may produce escape sequences that
readtags doesn't recognize, like ``\#``. Readtags provides some aliases for
these Lisps:

* Use ``true`` for ``#t``.
* Use ``false`` for ``#f``.
* Use ``nil`` or ``()`` for ``()``.
* Use ``(string->regexp "PATTERN")`` for ``#/PATTERN/``. Use
  ``(string->regexp "PATTERN" :case-fold true)`` for ``#/PATTERN/i``. Notice
  that ``string->regexp`` doesn't require escaping "/" in the pattern.

Notice that even when the client tool uses this method, ``'`` still needs to be
replaced by ``'"'"'`` to prevent broken expressions and shell injection.

Another thing to notice is that missing fields are represented by ``#f``, and
applying string operators to them will produce an error.
You should always
check if a field is missing before applying string operators. See the
"Filtering" section in readtags(1) to know how to do this. Run "readtags -H
filter" to see which operators take string arguments.

Parse Readtags Output
~~~~~~~~~~~~~~~~~~~~~
In the output of readtags, tabs can appear in all field values (e.g., the tag
name itself could contain tabs), which makes it hard to split the line into
fields. Client tools should use the ``-E`` option, which keeps the escape
sequences in the tags file, so the only field that could contain tabs is the
pattern field.

The pattern field could:

- Use a line number. It will look like ``number;"`` (e.g. ``10;"``).
- Use a search pattern. It will look like ``/pattern/;"`` or ``?pattern?;"``.
  Notice that the search pattern could contain tabs.
- Combine these two, like ``number;/pattern/;"`` or ``number;?pattern?;"``.

These are true for tags files using extended format, which is the default one.
The legacy format (i.e. ``--format=1``) doesn't include the semicolons. It's
old and barely used, so we won't discuss it here.

Client tools could split the line using the following steps:

* Find the first 2 tabs in the line, so we get the name and input field.
* From the 2nd tab:

  * If a ``/`` follows, then the pattern delimiter is ``/``.
  * If a ``?`` follows, then the pattern delimiter is ``?``.
  * If a number follows, then:

    * If a ``;/`` follows the number, then the delimiter is ``/``.
    * If a ``;?`` follows the number, then the delimiter is ``?``.
    * If a ``;"`` follows the number, then the field uses only line number, and
      there's no pattern delimiter (since there's no regex pattern). In this
      case the pattern field ends at the 3rd tab.

* After the opening delimiter, find the next unescaped pattern delimiter, and
  that's the closing delimiter. It will be followed by ``;"`` and then a tab.
  That's the end of the pattern field. By "unescaped pattern delimiter", we
  mean there's an even number (including 0) of backslashes before it.
* From here, split the rest of the line into fields by tabs.

Then, the escape sequences in fields other than the pattern field should be
translated. See "Proposal" in tags(5) to know about all the escape sequences.

Make Use of the Pattern Field
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The pattern field specifies how to find a tag in its source file. The code
generating this field seems to have a long history, so there are some pitfalls
and it's a bit hard to handle. A client tool could simply require the ``line:``
field and jump to the line it specifies, to avoid using the pattern field. But
anyway, we'll discuss how to make the best use of it here.

You should take the words here merely as suggestions, and not standards. A
client tool could definitely develop better (or simpler) ways to use the
pattern field.

From the last section, we know the pattern field could contain a line number
and a search pattern. When it only contains the line number, handling it is
easy: you simply go to that line.

The search pattern resembles an EX command, but as we'll see later, it's
actually not a valid one, so some manual work is required to process it.

The search pattern could look like ``/pat/``, called "forward search pattern",
or ``?pat?``, called "backward search pattern". Using a search pattern means
even if the source file is updated, as long as the part containing the tag
doesn't change, we could still locate the tag correctly by searching.

When the pattern field only contains the search pattern, you just search for
it. The search direction (forward/backward) doesn't matter, as it's decided
solely by whether the ``-B`` option is enabled, and not the actual context.
You
could always start the search from, say, the beginning of the file.

When both the search pattern and the line number are present, you could make
good use of the line number, by going to the line first, then searching for the
nearest occurrence of the pattern. A way to do this is to search both forward
and backward for the pattern, and when there is an occurrence on both sides, go
to the nearer one.

What's good about this is that when there are multiple identical lines in the
source file (e.g. the COMMON block in Fortran), this could help us find the
correct one, even after the source file is updated and the tag position is
shifted by a few lines.

Now let's discuss how to search for the pattern. After you trim the ``/`` or
``?`` around it, the pattern resembles a regex pattern. It should be a regex
pattern, as required by being a valid EX command, but it's actually not, as
you'll see below.

It could begin with a ``^``, which means the pattern starts from the beginning
of a line. It could also end with an *unescaped* ``$`` which means the pattern
ends at the end of a line. Let's keep this information, and trim them too.

Now the remaining part is the actual string containing the tag. Some characters
are escaped:

* ``\``.
* ``$``, but only at the end of the string.
* ``/``, but only in forward search patterns.
* ``?``, but only in backward search patterns.

You need to unescape these to get the literal string. Now you could convert
this literal string to a regexp that matches it (by escaping, like
``re.escape`` in Python or ``regexp-quote`` in Elisp), and assemble it with
``^`` or ``$`` if the pattern originally has it, and finally search for the tag
using this regexp.

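
The procedure above can be sketched in Python as follows. This is only
one possible reading of the suggestions in this section, not a reference
implementation: the function names ``pattern_to_regex`` and
``find_tag_line`` are hypothetical, the source file is assumed to be
already split into lines without trailing newlines, and the search
pattern is assumed to be passed without the trailing ``;"``.

```python
import re
from typing import List, Optional


def pattern_to_regex(search_pattern: str) -> re.Pattern:
    """Convert '/pat/' or '?pat?' from a tags file into a compiled regex."""
    if len(search_pattern) < 2 or search_pattern[0] not in "/?" \
            or search_pattern[-1] != search_pattern[0]:
        raise ValueError("not a search pattern: %r" % search_pattern)
    body = search_pattern[1:-1]
    # Remember and trim a leading "^".
    anchored_start = body.startswith("^")
    if anchored_start:
        body = body[1:]
    # A trailing "$" is an anchor only when it is unescaped, i.e. when
    # an even number of backslashes precedes it.
    anchored_end = False
    if body.endswith("$"):
        before = body[:-1]
        backslashes = len(before) - len(before.rstrip("\\"))
        if backslashes % 2 == 0:
            anchored_end = True
            body = before
    # Every backslash starts a two-character escape; keep the second one.
    literal = re.sub(r"\\(.)", lambda m: m.group(1), body)
    regex = re.escape(literal)
    if anchored_start:
        regex = "^" + regex
    if anchored_end:
        regex += "$"
    return re.compile(regex)


def find_tag_line(lines: List[str], search_pattern: str,
                  hint_line: Optional[int] = None) -> Optional[int]:
    """Return the 1-based number of the matching line nearest to hint_line.

    Without a hint, the first match from the beginning of the file wins.
    """
    regex = pattern_to_regex(search_pattern)
    matches = [n for n, text in enumerate(lines, start=1) if regex.search(text)]
    if not matches:
        return None
    if hint_line is None:
        return matches[0]
    return min(matches, key=lambda n: abs(n - hint_line))
```

Scanning the whole file and picking the closest match is a simplification
of "search both forward and backward"; a tool working on large files may
prefer to search outward from the hinted line instead.
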

Remark: About a Previous Format of the Pattern Field
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In some earlier versions of Universal Ctags, the line number in the pattern
field is the actual line number minus one, for forward search patterns; or plus
one, for backward search patterns. The idea is to resemble an EX command: you
go to the line, then search forward/backward for the pattern, and you can
always find the correct one. But this defeats the purpose of using a search
pattern: to tolerate file updates. For example, if the tag is at line 50,
then according to this scheme the pattern field should be::

    49;/pat/;"

Then let's assume that some code above is removed, and the tag is now at
line 45. Now you can't find it if you search forward from line 49.

For this reason, Universal Ctags now uses the actual line number. A
client tool could distinguish the two schemes by the ``TAG_OUTPUT_EXCMD``
pseudo tag: it's "combine" for the old scheme, and "combineV2" for the
present scheme. But probably there's no need to treat them differently, since
"search for the nearest occurrence from the line" gives good results on both
schemes.

JSON OUTPUT
-----------
Universal Ctags supports `JSON <https://www.json.org/>`_ (strictly
speaking `JSON Lines <https://jsonlines.org/>`_) output format if the
ctags executable is built with ``libjansson``. JSON output goes to
standard output by default.

Format
~~~~~~
Each JSON line represents a tag.

.. code-block:: console

    $ ctags --extras=+p --output-format=json --fields=-s input.py
    {"_type": "ptag", "name": "JSON_OUTPUT_VERSION", "path": "0.0", "pattern": "in development"}
    {"_type": "ptag", "name": "TAG_FILE_SORTED", "path": "1", "pattern": "0=unsorted, 1=sorted, 2=foldcase"}
    ...
    {"_type": "tag", "name": "Klass", "path": "/tmp/input.py", "pattern": "/^class Klass:$/", "language": "Python", "kind": "class"}
    {"_type": "tag", "name": "method", "path": "/tmp/input.py", "pattern": "/^ def method(self):$/", "language": "Python", "kind": "member", "scope": "Klass", "scopeKind": "class"}
    ...

A key not starting with ``_`` is mapped to a field of ctags.
"``--output-format=json --list-fields``" options list the fields.

A key starting with ``_`` represents meta information of the JSON
line. Currently only the ``_type`` key is used. If the value for the key
is ``tag``, the JSON line represents a normal tag. If the value is
``ptag``, the line represents a pseudo-tag.

The output format can be changed in the
future. The ``JSON_OUTPUT_VERSION`` pseudo-tag gives client tools a
chance to handle the changes. The current version is "0.0". A
client tool can extract the version with the ``path`` key from the
pseudo-tag.

The JSON output format is newly designed and does not have the
limitations found in the default tags file format.

* The values for the ``kind`` key are represented in long-name flags.
  One-letter flags are not used here.

* Scope names and scope kinds have distinct keys: ``scope`` and ``scopeKind``.
  They are combined in the default tags file format.

Data type used in a field
~~~~~~~~~~~~~~~~~~~~~~~~~
Values for most keys are represented in JSON string type.
However, some of them are represented in string, integer, and/or boolean type.

"``--output-format=json --list-fields``" options show what data types
are used in each field of JSON.

.. code-block:: console

    $ ctags --output-format=json --list-fields
    #LETTER NAME           ENABLED LANGUAGE JSTYPE FIXED DESCRIPTION
    F       input          yes     NONE     s--    no    input file
    ...
    P       pattern        yes     NONE     s-b    no    pattern
    ...
    f       file           yes     NONE     --b    no    File-restricted scoping
    ...
    e       end            no      NONE     -i-    no    end lines of various items
    ...

The ``JSTYPE`` column shows the data types.

'``s``'
    string

'``i``'
    integer

'``b``'
    boolean (true or false)

For example, the value for the ``pattern`` field of ctags takes a string or a boolean value.

SEE ALSO
--------
ctags(1), ctags-lang-python(7), ctags-incompatibilities(7), tags(5), readtags(1)