.. _testing_parser:

=============================================================================
Testing a parser
=============================================================================


.. contents:: `Table of contents`
   :depth: 3
   :local:

It is difficult for us to know the syntax of all languages supported by
ctags. The test facility and test cases are quite important for maintaining
ctags with limited resources.

.. _units:

*Units* test facility
---------------------------------------------------------------------

:Maintainer: Masatake YAMATO <yamato@redhat.com>

----

**Test facility**

Exuberant Ctags has a test facility. The test cases are in the *Test*
directory, so here I call it *Test*.

The main aim of the facility is detecting regressions. All files under the
Test directory are given as input to the old and new versions of the ctags
command. The output tags files of both versions are compared. If any
difference is found, the check fails. *Test* expects the older ctags
binary to be correct.

This expectation is not always met. Consider that a parser for a new
language is added. You may want to add a sample source code for that
language to *Test*. An older ctags version is unable to generate a
tags file for that sample code, but the newer ctags version is. At
this point a difference is found and *Test* reports failure.

**Units facility**

The units test facility (*Units*) I describe here takes a different
approach. An input file and an expected output file are given by the
contributor of a language parser. The units test facility runs the ctags
command on the input file and compares its output with the expected
output file. The expected output doesn't depend on ctags.

If a contributor sends a patch which may improve a language parser,
and the reviewer is not familiar with that language, the reviewer
cannot evaluate it.

*Unit* test files, the pair of an input file and an expected output file,
may be able to explain the intent of the patch well, and may help the
reviewer.

How to write a test case
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The test facility recognizes an input file and an expected output file
by patterns of file names. Each test case should have its own directory
under the Units directory. A sketch of a complete test case directory is
shown after this list of files.

*Units/TEST/input.\** **requisite**

    The input file name must have *input* as its basename. The *TEST*
    part should explain the test case well.

*Units/TEST/input[-_][0-9].\** *Units/TEST/input[-_][0-9][-_]\*.\** **optional**

    Optional input file names. They are put next to *input.\** on the
    testing command line.

*Units/TEST/expected.tags* **optional**

    The expected output file must have the name *expected.tags*. It
    should be in the same directory as the input file.

    If this file is not given, only the exit status of the ctags process
    is checked; the output is ignored.

    If you want to test etags output (specified with ``-e``),
    use **.tags-e** as the suffix instead of **.tags**.
    In that case you don't have to add ``-e`` to ``args.ctags``.
    The test facility sets ``-e`` automatically.

    If you want to test cross reference output (specified with ``-x``),
    use **.tags-x** as the suffix instead of **.tags**.
    In that case you don't have to add ``-x`` to ``args.ctags``.
    The test facility sets ``-x`` automatically.

    If you want to test json output (specified with ``--output-format=json``),
    use **.tags-json** as the suffix instead of **.tags**.
    In that case you don't have to add ``--output-format=json`` to
    ``args.ctags``, nor add ``json`` to ``features`` as described below.
    The test facility sets the option and the feature automatically.


*Units/TEST/args.ctags* **optional**

    ``-o -`` is used as the default argument when running ctags for a
    unit test. If you want to add more options, enumerate the options
    in the **args.ctags** file.

    Remember that you have to put one option per line; don't put
    multiple options on one line. Multiple options on one line don't
    work.

*Units/TEST/filter* **optional**

    You can rearrange the output of ctags with this command before it
    is compared with *expected.tags*. This command is invoked with no
    arguments. The output of ctags is given via stdin. The rearranged
    data should be written to stdout.

*Units/TEST/features* **optional**

    If a unit test case requires special features of ctags, enumerate
    them in this file, one per line. If the target ctags doesn't have
    one of the features, the test is skipped.

    If a line starts with ``!``, the effect is inverted: if the target
    ctags has the feature specified after ``!``, the test is skipped.

    All built-in features can be listed by passing ``--list-features``
    to ctags.

*Units/TEST/languages* **optional**

    If a unit test case requires certain language parsers to be
    enabled/available, enumerate them in this file, one per line. If
    one of them is disabled/unavailable, the test is skipped.

    The enabled/available language parsers can be checked by passing
    ``--list-languages`` to ctags.

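
Putting these pieces together, a minimal test case might look like the
following sketch. The directory name *sh-simple-function.d*, the file
contents, and the tag line are purely illustrative; the exact expected
output depends on the parser and on the options listed in *args.ctags*.

.. code-block:: console

    $ ls Units/sh-simple-function.d/
    args.ctags  expected.tags  input.sh
    $ cat Units/sh-simple-function.d/args.ctags
    --sort=no
    $ cat Units/sh-simple-function.d/input.sh
    hello()
    {
        echo "hello"
    }
    $ cat Units/sh-simple-function.d/expected.tags
    hello	input.sh	/^hello()$/;"	f

A convenient way to produce *expected.tags* is to run ctags by hand with
the options from *args.ctags* (plus the default ``-o -``) on the input
file and review the output before saving it.
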

Note for importing a test case from Test directory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I think all test cases under the Test directory should be converted to
Units.

If you convert one, use the following TEST name convention:

* use *.t* instead of *.d* as the suffix for the name

Here is an example::

    Test/simple.sh

This should become::

    Units/simple.sh.t

With this name convention we can track which test cases have been
converted and which have not.

Example of files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See `Units/parser-c.r/c-sample.d
<https://github.com/universal-ctags/ctags/tree/master/Units/parser-c.r/c-sample.d>`_.

How to run unit tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use the *units* make target::

    $ make units

The result of the unit tests is reported line by line. You can select
the test cases to run with ``UNITS=``.

An example to run *vim-command.d* only::

    $ make units UNITS=vim-command

Another example to run *vim-command.d* and *parser-python.r/bug1856363.py.d*::

    $ make units UNITS=vim-command,bug1856363.py

During testing, *OUTPUT.tmp*, *EXPECTED.tmp* and *DIFF.tmp* files are
generated in each test case directory. These are removed when the unit
test is **passed**. If the result is **FAILED**, they are kept for
debugging. The following command line cleans up these generated files
at once::

    $ make clean-units

Other than **FAILED** and **passed**, two more types of result are
defined:

**skipped**

    means running the test case was skipped for some reason.

**failed (KNOWN bug)**

    means the test failed but the failure is expected.
    See ":ref:`gathering_test`".

Example of running
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::

    $ make units
    Category: ROOT
    -------------------------------------------------------------------------
    Testing 1795612.js as JavaScript passed
    Testing 1850914.js as JavaScript passed
    Testing 1878155.js as JavaScript passed
    Testing 1880687.js as JavaScript passed
    Testing 2023624.js as JavaScript passed
    Testing 3184782.sql as SQL passed
    ...

Running unit tests for specific languages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can run only the tests for specific languages by setting
``LANGUAGES`` to parsers as reported by
``ctags --list-languages``::

    make units LANGUAGES=PHP,C

Multiple languages can be selected using a comma separated list.

.. _gathering_test:

Gathering test cases for known bugs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When we meet a bug, making a small test case that triggers the bug is
an important development activity. Even if the bug cannot be fixed
soon, the test case is an important result of the work. Such a result
should be merged into the source tree. However, we don't love the
**FAILED** message either. What should we do?

In such a case, merge it as usual but use *.b* as the suffix for the
test case directory instead of *.d* (see the example at the end of this
section).

``parser-autoconf.r/nested-block.ac.b/`` is an example of ``.b`` suffix
usage.

When you run the units target, you will see::

    Testing c-sample as C passed
    Testing css-singlequote-in-comment as CSS failed (KNOWN bug)
    Testing ctags-simple as ctags passed

The suffix *.i* is a variant of *.b*. *.i* is for merging/gathering
input which makes ctags enter an infinite loop. Different from *.b*,
test cases marked as *.i* are never executed. They are just skipped,
but the skips are reported::

    Testing ada-ads as Ada passed
    Testing ada-function as Ada skipped (may cause an infinite loop)
    Testing ada-protected as Ada passed
    ...

    Summary (see CMDLINE.tmp to reproduce without test harness)
    ------------------------------------------------------------
    #passed: 347
    #FIXED: 0
    #FAILED (unexpected-exit-status): 0
    #FAILED (unexpected-output): 0
    #skipped (features): 0
    #skipped (languages): 0
    #skipped (infinite-loop): 1
    ada-protected
    ...

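
For illustration, marking an existing test case as a known bug is just a
rename of its directory to the *.b* suffix (the directory name below is
hypothetical); after that, the harness reports the case as
*failed (KNOWN bug)* instead of **FAILED**::

    $ git mv Units/css-nested-selector.d Units/css-nested-selector.b
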

Running under valgrind and timeout
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If ``VG=1`` is given, each test case is run under valgrind.
If valgrind detects an error, it is reported as::

    $ make units VG=1
    Testing css-singlequote-in-comment as CSS failed (valgrind-error)
    ...
    Summary (see CMDLINE.tmp to reproduce without test harness)
    ------------------------------------------------------------
    ...
    #valgrind-error: 1
    css-singlequote-in-comment
    ...

In this case the valgrind report is recorded to
``Units/css-singlequote-in-comment/VALGRIND-CSS.tmp``.

NOTE: ``/bin/bash`` is needed to report the result. You can specify the
shell used for running the tests with the SHELL macro like::

    $ make units VG=1 SHELL=/bin/bash


If ``TIMEOUT=N`` is given, each test case is run under the timeout
command. If ctags doesn't stop within ``N`` seconds, it is stopped by
the timeout command and reported as::

    $ make units TIMEOUT=1
    Testing css-singlequote-in-comment as CSS failed (TIMED OUT)
    ...
    Summary (see CMDLINE.tmp to reproduce without test harness)
    ------------------------------------------------------------
    ...
    #TIMED-OUT: 1
    css-singlequote-in-comment
    ...

If ``TIMEOUT=N`` is given, the *.i* test cases are run instead of being
skipped. They will be reported as *TIMED-OUT*.

Categories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. NOT REVIEWED

With the *.r* suffix, you can put test cases under a subdirectory of
*Units*. ``Units/parser-ada.r`` is an example. In the *misc/units* test
harness, such a subdirectory is called a category; ``parser-ada.r`` is
the name of the category in the above example.

The *CATEGORIES* make macro is for running the units in specified
categories. The following command line runs the units in
``Units/parser-sh.r`` and ``Units/parser-ada.r``::

    $ make units CATEGORIES='parser-sh,parser-ada'


Finding minimal bad input
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a test case fails, the input causing the ``FAILED`` result is
passed to *misc/units shrink*. *misc/units shrink* tries to make the
shortest input which makes ctags exit with a non-zero status. The
result is reported to ``Units/\*/SHRINK-${language}.tmp``. It may be
useful for debugging.

Acknowledgments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The file name rule was suggested by Maxime Coste <frrrwww@gmail.com>.

Reviewing the result of Units test
------------------------------------------------------------

Try misc/review.

.. code-block:: console

    $ misc/review --help
    Usage:
        misc/review help|--help|-h    show this message
        misc/review [list] [-b]       list failed Units and Tmain
                    -b                list .b (known bug) marked cases
        misc/review inspect [-b]      inspect difference interactively
                    -b                inspect .b (known bug) marked cases
    $


Semi-fuzz(*Fuzz*) testing
---------------------------------------------------------------------

Unexpected input can lead ctags to enter an infinite loop. The fuzz
target tries to identify these conditions by passing
semi-random (semi-broken) input to ctags.

::

    $ make fuzz LANGUAGES=LANG1[,LANG2,...]

With this command line, ctags is run on random variations of all test
inputs under *Units/\*/input.\** for the languages defined by the
``LANGUAGES`` macro variable. In this target, the output of ctags is
ignored and only the exit status is analyzed. The ctags binary is also
run under the timeout command, so that if an infinite loop is found it
will exit with a non-zero status. A timeout is reported as follows::

    [timeout C] Units/test.vhd.t/input.vhd

This means that the C parser didn't stop within N seconds when
*Units/test.vhd.t/input.vhd* was given as input, and the timeout
command interrupted ctags. The default duration can be changed with the
``TIMEOUT=N`` argument of the *make* command. If there is no timeout
but the exit status is non-zero, the target reports it as follows::

    [unexpected-status(N) C] Units/test.vhd.t/input.vhd

The list of parsers which can be used as a value for ``LANGUAGES`` can
be obtained with the following command line:

::

    $ ctags --list-languages

Besides ``LANGUAGES`` and ``TIMEOUT``, the fuzz target also takes the
following parameter:

    ``VG=1``

        Run ctags under valgrind. If valgrind finds a memory
        error it is reported as::

            [valgrind-error Verilog] Units/array_spec.f90.t/input.f90

        The valgrind report is recorded at
        ``Units/\*/VALGRIND-${language}.tmp``.

Like the units target, this semi-fuzz test target also calls
*misc/units shrink* when a test case fails. See "*Units* test facility"
about the shrunk result.

*Noise* testing
---------------------------------------------------------------------

After enjoying developing Semi-fuzz testing, I'm looking for a more
unfair approach. Run

::

    $ make noise LANGUAGES=LANG1[,LANG2,...]

The noise target generates test cases by inserting or deleting one
character in the test cases of *Units*.

It takes a long time, even without ``VG=1``, so this cannot be run
under Travis CI. However, it is a good idea to run it locally.

*Chop* and *slap* testing
---------------------------------------------------------------------

After reviewing many bug reports, we recognized that some of them spot
unexpected EOF. The chop target was developed based on this
recognition.

The chop target generates many input files from an existing input file
under *Units* by truncating the existing input file at a variety of
file positions.

::

    $ make chop LANGUAGES=LANG1[,LANG2,...]

It takes a long time, especially with ``VG=1``, so this cannot be run
under Travis CI. However, it is a good idea to run it locally.

The slap target is derived from the chop target. While the chop target
truncates the existing input files from the tail, the slap target does
the same from the head.

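
The slap target is invoked in the same way as the chop target shown
above; the same ``LANGUAGES`` selection should apply::

    $ make slap LANGUAGES=LANG1[,LANG2,...]
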

.. _input-validation:

Input validation for *Units*
---------------------------------------------------------------------

We have to maintain parsers for languages that we don't know well. We
don't have enough time to learn those languages.

*Units* test cases help us avoid introducing wrong changes to a parser.

However, there is still an issue; a developer who doesn't know a target
language well may write a broken test input file for that language.
Here comes "input validation."

How to run an example session of input validation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can validate the test input files of *Units* with the
*validate-input* make target if a validator for a language is defined.

Here is an example validating an input file for JSON.

.. code-block:: console

    $ make validate-input VALIDATORS=jq
    ...
    Category: ROOT
    ------------------------------------------------------------
    simple-json.d/input.json with jq valid

    Summary
    ------------------------------------------------------------
    #valid: 1
    #invalid: 0
    #skipped (known invalidation) 0
    #skipped (validator unavailable) 0


This example shows validating *simple-json.d/input.json* as an input
file with the *jq* validator. With the VALIDATORS variable passed via
the command line, you can specify which validators to run. Multiple
validators can be specified using a comma-separated list. If you don't
give VALIDATORS, the make target tries to use all available validators.

The meanings of "valid" and "invalid" in "Summary" are apparent. In two
cases, the target skips validating input files:

#skipped (known invalidation)

    The test case specifies KNOWN-INVALIDATION in its *validator* file.

#skipped (validator unavailable)

    The command for the validator is not available.

*validator* file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The *validator* file in a *Units* test directory specifies which
validator the make target should use.

.. code-block:: console

    $ cat Units/simple-json.d/validator
    jq

If you put a *validator* file in a category directory (a directory with
the *.r* suffix), the make target uses the validator specified in that
file as the default. The default validator can be overridden with a
*validator* file in a subdirectory.

.. code-block:: console

    $ cat Units/parser-puppetManifest.r/validator
    puppet
    $ cat Units/parser-puppetManifest.r/puppet-append.d/validator
    KNOWN-INVALIDATION

In this example, the make target uses the *puppet* validator for
validating most of the input files under the
*Units/parser-puppetManifest.r* directory. An exception is the input
file under the *Units/parser-puppetManifest.r/puppet-append.d*
directory. That directory has its own specific *validator* file.

If a *Unit* test case doesn't have an *expected.tags* file, the make
target doesn't run the validator on its input file even if a default
validator is given in its category directory.

If a *Unit* test case specifies KNOWN-INVALIDATION in its *validator*
file, the make target just increments the "#skipped (known
invalidation)" counter. The target reports the counter at the end of
execution.

validator command
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A validator specified in a *validator* file is a command file put under
the *misc/validators* directory. The command must have "validator-" as
the prefix of its file name. For example,
*misc/validators/validator-jq* is the command for "jq".

The command file must be executable. The *validate-input* make target
runs the command in two ways.

*is_runnable* method

    Before running the command as a validator, the target runs the
    command with "is_runnable" as the first argument. The validator
    command lets the target know whether it is runnable or not via its
    exit status: 0 means ready to run; non-zero means not ready to run.

    The make target never runs the validator command for validation
    purposes if the exit status is non-zero.

    For example, the *misc/validators/validator-jq* command uses the
    *jq* command as its backend. If the *jq* command is not available
    on a system, *validator-jq* can do nothing. In such a case, the
    *is_runnable* method of *validator-jq* should exit with a non-zero
    value.

*validate* method

    The make target runs the command with "validate" and an input file
    name as arguments for validating the input file. The command exits
    non-zero if the input file contains invalid syntax. This method is
    never run if the *is_runnable* method of the command exits with a
    non-zero value.

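
To make the two methods concrete, here is a minimal sketch of what a
validator command could look like. It is not the actual
*misc/validators/validator-jq* script; the file name *validator-myjson*
and the use of *jq* as its backend are assumptions made for
illustration.

.. code-block:: sh

    #!/bin/sh
    # Hypothetical misc/validators/validator-myjson: a sketch of the
    # interface described above, not the real validator-jq script.
    case "$1" in
        is_runnable)
            # Exit 0 only if the backend command is available.
            type jq > /dev/null 2>&1
            ;;
        validate)
            # "$2" is the input file; exit non-zero on invalid syntax.
            jq . < "$2" > /dev/null
            ;;
        *)
            exit 1
            ;;
    esac
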

.. _man_test:

Testing examples in language specific man pages
---------------------------------------------------------------------

:Maintainer: Masatake YAMATO <yamato@redhat.com>

----

`man-test` is a target for testing the examples in the language
specific man pages (``man/ctags-lang-<LANG>.7.rst.in``). The command
line for running the target is:

.. code-block:: console

    $ make man-test

An example for testing must have the following form:

.. code-block:: ReStructuredText

    "input.<EXT>"

    .. code-block:: <LANG>

        <INPUT LINES>

    "output.tags"
    with "<OPTIONS FOR CTAGS>"

    .. code-block:: tags

        <TAGS OUTPUT LINES>


The man-test target recognizes this form and does the same as the
following shell code for each example in the man page:

.. code-block:: console

    $ echo <INPUT LINES> > input.<EXT>
    $ echo <TAGS OUTPUT LINES> > output.tags
    $ ctags <OPTIONS FOR CTAGS> > actual.tags
    $ diff output.tags actual.tags

A backslash character at the end of ``<INPUT LINES>`` or
``<TAGS OUTPUT LINES>`` represents the continuation of lines;
the subsequent newline is ignored.

.. code-block:: ReStructuredText

    .. code-block:: tags

        very long\
        line

is read as:

.. code-block:: ReStructuredText

    .. code-block:: tags

        very long line

Here is an example of a test case taken from
``ctags-lang-python.7.rst.in``:

.. code-block:: ReStructuredText

    "input.py"

    .. code-block:: Python

        import X0

    "output.tags"
    with "--options=NONE -o - --extras=+r --fields=+rzK input.py"

    .. code-block:: tags

        X0	input.py	/^import X0$/;"	kind:module	roles:imported

``make man-test`` returns 0 if all the test cases in all the language
specific man pages pass.

Here is an example output of the man-test target.

.. code-block:: console

    $ make man-test
    RUN man-test
    # Run test cases in ./man/ctags-lang-julia.7.rst.in
    ```
    ./man/ctags-lang-julia.7.rst.in[0]:75...passed
    ./man/ctags-lang-julia.7.rst.in[1]:93...passed
    ```
    # Run test cases in ./man/ctags-lang-python.7.rst.in
    ```
    ./man/ctags-lang-python.7.rst.in[0]:116...passed
    ./man/ctags-lang-python.7.rst.in[1]:133...passed
    ./man/ctags-lang-python.7.rst.in[2]:154...passed
    ./man/ctags-lang-python.7.rst.in[3]:170...passed
    ./man/ctags-lang-python.7.rst.in[4]:187...passed
    ./man/ctags-lang-python.7.rst.in[5]:230...passed
    ```
    # Run test cases in ./man/ctags-lang-verilog.7.rst.in
    ```
    ./man/ctags-lang-verilog.7.rst.in[0]:51...passed
    ```
    OK

NOTE: keep the examples in the man pages simple. If you want to test
ctags with complicated (and/or subtle) input, use the units target. The
main purpose of the examples is explaining the parser.