/* * CDDL HEADER START * * The contents of this file are subject to the terms of the * Common Development and Distribution License (the "License"). * You may not use this file except in compliance with the License. * * See LICENSE.txt included in this distribution for the specific * language governing permissions and limitations under the License. * * When distributing Covered Code, include this CDDL HEADER in each * file and include the License file at LICENSE.txt. * If applicable, add the following below this CDDL HEADER, with the * fields enclosed by brackets "[]" replaced with your own identifying * information: Portions Copyright [yyyy] [name of copyright owner] * * CDDL HEADER END */ /* * Copyright (c) 2008, 2016, Oracle and/or its affiliates. All rights reserved. * Portions Copyright (c) 2017, Chris Fraire . * * Copyright © 1993 The Regents of the University of California. * Copyright © 1994-1996 Sun Microsystems, Inc. * Copyright © 1995-1997 Roger E. Critchlow Jr. */ Number = ([0-9]+\.[0-9]+|[0-9][0-9]*|"#" [boxBOX] [0-9a-fA-F]+) /* * [1] Commands. ... Semi-colons and newlines are command separators unless * quoted as described below. * * [3] Words. Words of a command are separated by white space (except for * newlines, which are command separators). * [4] Double quotes. If the first character of a word is double-quote (``"'') * then the word is terminated by the next double-quote character. * [5] Braces. If the first character of a word is an open brace (``{'') then * the word is terminated by the matching close brace (``}''). * N.b. OpenGrok handles [4] and [5] as special matches distinct from {Word}. * * [9] Comments. If a hash character (``#'') appears at a point where Tcl is * expecting the first character of the first word of a command, then the hash * character and the characters that follow it, up through the next newline, * are treated as a comment and ignored. The comment character only has * significance when it appears at the beginning of a command. * * N.b. this "OrdinaryWord" is for OpenGrok's purpose of symbol tokenization * and deviates from the above definitions by treating backslash escapes as * word breaking and precluding some characters from starting words and mostly * the same from continuing words. E.g., hyphen is not allowed by OpenGrok to * start OrdinaryWord but can be present afterward. */ OrdinaryWord = [\S--\-,=#\"\}\{\]\[\)\(\\] [\S--#\"\}\{\]\[\)\(\\]* /* * [7] Variable substitution. * * $name * Name is the name of a scalar variable; the name is a sequence of one or * more characters that are a letter, digit, underscore, or namespace * separators (two or more colons). */ Varsub1 = \$ {name_unit}+ name_unit = ([\p{Letter}\p{Digit}_] | [:][:]+) /* * $name(index) * Name gives the name of an array variable and index gives the name of an * element within that array. Name must contain only letters, digits, * underscores, and namespace separators, and may be an empty string. */ Varsub2 = \$ {name_unit}* \( {name_unit}+ \) /* * ${name} * Name is the name of a scalar variable. It may contain any characters * whatsoever except for close braces. */ Varsub3 = \$\{ [^\}]+ \} /* * [8] Backslash substitution. * Backslash plus a character, where ... in all cases but [for the characters] * described below, the backslash is dropped and the following character is * treated as an ordinary character and included in the word. * * Special cases: * a,f,b,n,r,t,v,backslash; * \whiteSpace; * \ooo The digits ooo (one, two, or three of them); * \xhh The hexadecimal digits hh .... Any number of hexadecimal digits may be * present; * \uhhhh The hexadecimal digits hhhh (one, two, three, or four of them) * * "Backslash substitution is not performed on words enclosed in braces, except * for backslash-newline as described above." */ Backslash_sub = [\\] ([afbnrtv\\] | \p{Number}{1,3} | [x][0-9a-fA-F]+ | [u][0-9a-fA-F]{1,4} | [[^]--[afbnrtv\n\p{Number}xu\\]]) Backslash_nl = [\\] \n\s+ WordOperators = ("*" | "&&" | "||")