ComparisonWithPygments » History » Version 2

Kornelius Kalnbach, 12/05/2010 03:39 AM

h1. Comparison with Pygments

h2. General differences

* CodeRay is a Ruby library; Pygments is written in Python.
* CodeRay supports 19 languages, while Pygments supports over 90.
* CodeRay has handwritten scanners. In Pygments, scanners are defined with a scanner DSL.

h2. Handwritten vs. DSL, Pro & Contra

The last two differences in the list above are closely related.

h3. Pro: handwritten scanners (CodeRay)

* faster
** lots of fine tuning is possible
** no overhead for DSL transformation and interpretation
* more flexible

h3. Pro: scanner definition (Pygments)

* easier to write, read, and maintain
** less code
* DSL interpreter can be optimized/changed independently
* porting scanners is easier
* use of higher-level features (like token groups or stacks) is simple
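
The trade-off in the two lists above can be sketched with a toy language. This is only an illustration in Python, assuming a made-up rule table (@RULES@) and made-up token kinds; it is neither CodeRay's scanner code nor the Pygments DSL. The first function interprets a declarative rule table, the second hard-codes the same logic by hand.

```python
import re
import string

# Hypothetical mini-DSL: an ordered table of (regex, token kind) rules.
RULES = [
    (r"[ \t\n]+", "space"),
    (r"[0-9]+", "integer"),
    (r"[A-Za-z_][A-Za-z0-9_]*", "ident"),
    (r"[-+*/=()]", "operator"),
]
COMPILED = [(re.compile(pattern), kind) for pattern, kind in RULES]


def scan_with_rules(text):
    """Generic interpreter: tries each rule in order at the current position."""
    pos, tokens = 0, []
    while pos < len(text):
        for regex, kind in COMPILED:
            match = regex.match(text, pos)
            if match:
                tokens.append((match.group(), kind))
                pos = match.end()
                break
        else:
            # No rule matched: emit a single character as an error token.
            tokens.append((text[pos], "error"))
            pos += 1
    return tokens


IDENT_START = string.ascii_letters + "_"
IDENT_REST = IDENT_START + string.digits


def scan_by_hand(text):
    """Handwritten scanner: same language, explicit control flow, no rule table."""
    pos, tokens = 0, []
    while pos < len(text):
        start, ch = pos, text[pos]
        if ch in " \t\n":
            while pos < len(text) and text[pos] in " \t\n":
                pos += 1
            kind = "space"
        elif ch in string.digits:
            while pos < len(text) and text[pos] in string.digits:
                pos += 1
            kind = "integer"
        elif ch in IDENT_START:
            pos += 1
            while pos < len(text) and text[pos] in IDENT_REST:
                pos += 1
            kind = "ident"
        elif ch in "-+*/=()":
            pos += 1
            kind = "operator"
        else:
            pos += 1
            kind = "error"
        tokens.append((text[start:pos], kind))
    return tokens
```

Both functions produce the same tokens for the same input. The rule table is shorter and easier to port, while the handwritten version leaves room for special cases and tuning at every branch.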

h2. Other differences

h3. Token kinds vs. token types

CodeRay represents tokens with a Token Kind (see #122), which is just a Ruby symbol ("source":http://redmine.rubychan.de/projects/coderay/repository/entry/trunk/lib/coderay/token_classes.rb?rev=452).
Pygments uses a hierarchical token type/subtype system ("source":http://bitbucket.org/birkenfeld/pygments-main/src/f90ec0252e78/pygments/token.py#cl-47), which is more complex to implement (and slower), but more flexible and easier to understand for authors of new language definitions.
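
To make the difference concrete, here is a minimal Python sketch. The @TokenType@ class, the @style_for@ helper, and the style tables are hypothetical and only loosely modeled on the idea of hierarchical types; this is not the actual Pygments or CodeRay code. Flat kinds are plain names looked up directly, while hierarchical types can fall back to a parent type.

```python
# Flat token kinds: each kind is an independent name (a symbol in Ruby,
# a plain string here). Every kind needs its own style entry.
FLAT_STYLES = {"string": "red", "keyword": "blue"}


class TokenType(tuple):
    """Hypothetical hierarchical token type: a tuple of path segments."""

    def __getattr__(self, name):
        # Attribute access with a new (capitalized) name derives a subtype.
        if name.startswith("_"):
            raise AttributeError(name)
        return TokenType(self + (name,))

    def __contains__(self, other):
        # A type "contains" itself and every one of its subtypes.
        return isinstance(other, TokenType) and other[:len(self)] == self

    def __repr__(self):
        return "Token" + "".join("." + part for part in self)


Token = TokenType()
String = Token.Literal.String
Double = String.Double


def style_for(ttype, styles):
    """Walk up from the most specific type until a style is found."""
    while True:
        if ttype in styles:      # dict lookup by tuple hash and equality
            return styles[ttype]
        if len(ttype) == 0:      # reached the root without a match
            return None
        ttype = TokenType(ttype[:-1])
```

With @styles = {String: "red"}@, a lookup for @Double@ falls back to the style of @String@. The flat table has no such fallback, which keeps it simple and fast but means every kind must be listed explicitly.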

h3. Token groups

CodeRay supports token groups, which map nicely to SPANs in the HTML output. A token group has a token kind and can contain tokens and other token groups. The final color of a token depends on the group nesting it is in (for example, @string/delimiter@ has a different color than @regexp/delimiter@). Groups are represented with special @:open@ and @:close@ tokens.

Token groups allow CSS-style color definitions, which are most useful for HTML output. Pygments doesn't have a comparable feature; you can see that strings are usually a single token in Pygments, while the delimiting quotes are usually separate tokens in CodeRay.
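
As a rough illustration of how open and close tokens map to nested SPANs, consider the following Python sketch. The @OPEN@ and @CLOSE@ markers, the @render_html@ function, and the CSS class names are assumptions made for this example; CodeRay's real encoder and class names differ.

```python
from html import escape

# Sentinel markers standing in for CodeRay's :open and :close tokens.
OPEN = object()
CLOSE = object()


def render_html(tokens):
    """Turn a stream of (text_or_marker, kind) pairs into nested spans."""
    parts = []
    depth = 0
    for text, kind in tokens:
        if text is OPEN:
            # Opening a group starts a span that wraps everything inside it.
            parts.append('<span class="%s">' % escape(kind))
            depth += 1
        elif text is CLOSE:
            if depth == 0:
                raise ValueError("close without matching open")
            parts.append("</span>")
            depth -= 1
        else:
            # Ordinary token: one span with the escaped token text.
            parts.append('<span class="%s">%s</span>' % (escape(kind), escape(text)))
    if depth != 0:
        raise ValueError("unclosed token group")
    return "".join(parts)


string_tokens = [
    (OPEN, "string"),
    ('"', "delimiter"),
    ("hi", "content"),
    ('"', "delimiter"),
    (CLOSE, "string"),
]
html_output = render_html(string_tokens)
```

Because the delimiter spans end up inside the group's span, a descendant selector such as @.string .delimiter@ can color them differently from @.regexp .delimiter@.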

CodeRay is optimized for HTML/CSS output. The concept of token groups could be ported to LaTeX or console output, but it's not trivial.

h3. Filters

Pygments has "filters":http://pygments.org/docs/filters/#builtin-filters, which manipulate the token stream in some way. You can do some cool tricks with these. CodeRay currently lacks such a feature.
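
The general idea of a filter can be sketched as a generator that takes a token stream and yields a modified one. The two functions below are hypothetical examples written in plain Python; they do not use the Pygments filter API, and the token kinds are made up.

```python
def merge_adjacent(stream):
    """Filter: join consecutive tokens that share the same kind."""
    pending = None
    for kind, text in stream:
        if pending is not None and pending[0] == kind:
            pending = (kind, pending[1] + text)
        else:
            if pending is not None:
                yield pending
            pending = (kind, text)
    if pending is not None:
        yield pending


def upcase_keywords(stream):
    """Filter: rewrite the text of keyword tokens in upper case."""
    for kind, text in stream:
        yield (kind, text.upper() if kind == "keyword" else text)


tokens = [("keyword", "sel"), ("keyword", "ect"), ("space", " "), ("ident", "x")]
# Filters compose by wrapping one generator in another.
filtered = list(upcase_keywords(merge_adjacent(tokens)))
```

Since each filter only consumes and produces a token stream, filters can be chained in any order without the scanner or the output encoder knowing about them.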

h3. Plugins

Both Pygments and CodeRay can be extended via plugins. The specific details differ, but writing a plugin is simple in both.