Kornelius Kalnbach, 12/05/2010 03:39 AM


Comparison with Pygments

General differences

  • CodeRay is a Ruby library, Pygments is written in Python.
  • CodeRay supports 19 languages, while Pygments supports over 90.
  • CodeRay has handwritten scanners. In Pygments, scanners are defined with a scanner DSL.

Handwritten vs. DSL, Pro & Contra

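The last two differences in the list above are closely related: defining a scanner in a DSL takes less effort than writing one by hand, which helps explain why Pygments covers so many more languages.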

Pro: handwritten scanners (CodeRay)

  • faster
    • lots of fine tuning is possible
    • no overhead for DSL transformation and interpretation
  • more flexible

Pro: scanner definition (Pygments)

  • easier to write, read, and maintain
    • less code
  • DSL interpreter can be optimized/changed independently
  • porting scanners is easier
  • use of higher-level features (like token groups or stacks) is simple
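The trade-off above can be sketched with a toy language of integers, identifiers, and whitespace. This is a minimal illustration in Python, not code from CodeRay or Pygments; the names scan_handwritten, scan_with_rules, and RULES are hypothetical. The first function hard-codes its control flow, while the second interprets a rule table with a generic loop.

```python
import re

# Handwritten scanner: explicit control flow, every branch can be tuned by hand.
def scan_handwritten(text):
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        start = pos
        if ch.isspace():
            while pos < len(text) and text[pos].isspace():
                pos += 1
            kind = "space"
        elif ch.isdigit():
            while pos < len(text) and text[pos].isdigit():
                pos += 1
            kind = "integer"
        elif ch.isalpha() or ch == "_":
            while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
                pos += 1
            kind = "ident"
        else:
            pos += 1
            kind = "error"
        tokens.append((kind, text[start:pos]))
    return tokens

# Rule-table scanner: the language definition is plain data (hypothetical RULES),
# and a generic loop interprets it. Adding a token kind means adding one row.
RULES = [
    (re.compile(r"\s+"), "space"),
    (re.compile(r"[0-9]+"), "integer"),
    (re.compile(r"[A-Za-z_][A-Za-z0-9_]*"), "ident"),
]

def scan_with_rules(text, rules=RULES):
    tokens = []
    pos = 0
    while pos < len(text):
        for pattern, kind in rules:
            match = pattern.match(text, pos)
            if match:
                tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched: emit one error character and move on.
            tokens.append(("error", text[pos]))
            pos += 1
    return tokens
```

On ASCII input both functions produce the same token list. The rule table is shorter and easier to port, while the handwritten version leaves room for per-branch tuning and avoids the cost of interpreting the table.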

Other differences

Token kinds vs. token types

CodeRay represents tokens with a Token Kind (see #122), which is just a Ruby symbol (source).
Pygments uses a hierarchical token type/subtype system (source), which is more complex to implement (and slower), but more flexible and easier to understand for authors of new language definitions.
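The difference can be sketched as follows. This is a simplified Python model, not the actual data structures of either library: plain strings stand in for Ruby symbols, tuples stand in for hierarchical token types, and the style tables are made up for illustration.

```python
# Flat kinds: one dictionary lookup per token, no relationship between kinds.
FLAT_STYLES = {"string": "red", "comment": "gray"}

def flat_style(kind):
    return FLAT_STYLES.get(kind)

# Hierarchical types: a path such as ("Literal", "String", "Double").
# A subtype without its own style inherits the style of its nearest ancestor.
HIER_STYLES = {
    ("Literal", "String"): "red",
    ("Comment",): "gray",
}

def hier_style(token_type):
    # Walk from the most specific type up to the root until a style is found.
    path = tuple(token_type)
    while path:
        if path in HIER_STYLES:
            return HIER_STYLES[path]
        path = path[:-1]
    return None
```

The flat lookup is a single step, which is part of why it is cheaper. The hierarchical lookup may need several steps, but a new subtype such as ("Literal", "String", "Double") is styled sensibly without any extra table entry.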

Token groups

CodeRay supports token groups, which map nicely to SPANs in the HTML output. A token group has a token kind and can contain tokens and other token groups. The final color of a token depends on the groups it is nested in (for example, string/delimiter has a different color than regexp/delimiter). Groups are represented with special :open and :close tokens.
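As a rough sketch of how such a stream could map to nested SPANs, consider the Python model below. It is not CodeRay's encoder: the (text, kind) pair layout, the OPEN and CLOSE sentinels, and the render_html function are assumptions made for illustration.

```python
from html import escape

OPEN = object()   # stands in for CodeRay's :open marker
CLOSE = object()  # stands in for CodeRay's :close marker

def render_html(tokens):
    """Render a flat token stream with group markers as nested spans."""
    parts = []
    depth = 0
    for text, kind in tokens:
        if text is OPEN:
            parts.append('<span class="%s">' % kind)
            depth += 1
        elif text is CLOSE:
            if depth == 0:
                raise ValueError("close without matching open")
            parts.append("</span>")
            depth -= 1
        else:
            parts.append('<span class="%s">%s</span>' % (kind, escape(text)))
    if depth:
        raise ValueError("unclosed token group")
    return "".join(parts)

# A double-quoted string as a group containing delimiter and content tokens.
tokens = [
    (OPEN, "string"),
    ('"', "delimiter"),
    ("hi", "content"),
    ('"', "delimiter"),
    (CLOSE, "string"),
]
html = render_html(tokens)
```

With this nesting, a CSS descendant selector such as `.string .delimiter` can color the quotes differently from `.regexp .delimiter`, even though both tokens share the kind delimiter.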

Token groups allow CSS-style color definitions, which are most useful for HTML output. Pygments doesn't have a comparable feature; you can see that strings are usually a single token in Pygments, while the delimiting quotes are usually separate tokens in CodeRay.

CodeRay is optimized for HTML/CSS output. The concept of token groups could be ported to LaTeX or console output, but doing so is not trivial.

Filters

Pygments has filters, which manipulate the token stream between the lexer and the formatter, for example to change the case of keywords or to make whitespace visible. CodeRay currently lacks such a feature.
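The idea of a filter can be sketched with plain Python generators. This is not the Pygments filter API; the function names and the (kind, text) pair layout are hypothetical, chosen only to show how filters chain over a token stream.

```python
def uppercase_keywords(tokens):
    # Rewrite keyword tokens, pass everything else through unchanged.
    for kind, text in tokens:
        if kind == "keyword":
            yield kind, text.upper()
        else:
            yield kind, text

def drop_comments(tokens):
    # Remove comment tokens from the stream.
    for kind, text in tokens:
        if kind != "comment":
            yield kind, text

def apply_filters(tokens, filters):
    # Each filter wraps the previous stream; nothing runs until it is consumed.
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens

stream = [("keyword", "def"), ("space", " "), ("name", "f"), ("comment", "# x")]
filtered = list(apply_filters(stream, [uppercase_keywords, drop_comments]))
```

Because each filter only consumes and yields tokens, filters can be combined in any order without the scanner or the output stage knowing about them.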

Plugins

Both Pygments and CodeRay can be extended via plugins. The details differ, but writing a plugin is simple in both.