.. _python:

======================================================================
The new Python parser
======================================================================

:Maintainer: Colomban Wendling <ban@herbesfolles.org>

Introduction
---------------------------------------------------------------------

The old Python parser was a line-oriented parser that grew far beyond
its capabilities, and ended up riddled with hacks and easily fooled by
perfectly valid input. By design, it had particular trouble with
constructs spanning multiple lines, like triple-quoted strings or
implicitly continued lines, but several less tricky constructs were
also mishandled. In addition, the handling of lexical constructs was
duplicated, and each copy evolved in its own direction, supporting
different features and having different bugs depending on the location.

All this made it very hard to fix some existing bugs or to add new
features. To remedy this, the parser has been rewritten from scratch,
separating lexical analysis (generating tokens) from syntactic analysis
(understanding what the tokens mean). This moves the recognition of
lexemes to a single location, making it consistent and easier to extend
with new lexemes, and lightens the burden on the parsing code, making
it more concise, robust and clear.

The rewrite made it fairly easy to fix all known bugs of the old
parser and to add many new features, including:

- Tagging function parameters
- Extraction of decorators
- Proper handling of semicolons
- Extracting multiple variables in a combined declaration
- More accurate support of mixed indentation
- Tagging local variables

The new parser should be compatible with the old one.
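
As an illustration, the following input exercises several of the
constructs mentioned above: a decorator, function parameters, a local
variable, a combined declaration, semicolon-separated statements, a
triple-quoted string and an implicitly continued line. It is only a
sketch written for this section, not a file from the test suite, and
all names in it are hypothetical. Which tags are actually emitted
depends on the kinds and fields enabled when running the parser. Mixed
indentation is left out because Python 3 rejects inconsistent use of
tabs and spaces.

```python
# Illustrative input only; every name below is hypothetical.
import functools


@functools.lru_cache(maxsize=None)   # a decorator
def scale(value, factor=2):          # two parameters: value, factor
    result = value * factor          # a local variable
    return result


width, height = 640, 480             # several variables in one declaration
x = 1; y = 2                         # two statements separated by a semicolon

# A triple-quoted string: the "def" line inside it is string content,
# not a function definition, which a line-oriented parser can misread.
LONG_TEXT = """first line
def not_a_function(): pass
"""

total = (width +                     # an implicitly continued line
         height)
```

A parser that tokenizes first sees ``LONG_TEXT`` as a single string
token, so the ``def`` inside it cannot be mistaken for a definition.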