Ghost Wheel Example

Posted 5 months ago in Ghost Wheel and Parsing.

There has been a fair bit of buzz around the Treetop parser in the Ruby community lately. Part of that is fueled by the nice screencast that shows off how to use the parser generator.

It doesn't get talked about as much, but I wrote a parser generator too, called Ghost Wheel. Probably the main reason Ghost Wheel doesn't receive much attention yet is that I have been slow in getting the documentation written. Given that, I thought I would show how the code built in the Treetop screencast translates to Ghost Wheel:

#!/usr/bin/env ruby -wKU

require "rubygems"
require "ghost_wheel"

# define a parser using Ghost Wheel's Ruby DSL
RubyParser    = GhostWheel.build_parser do
  rule( :additive,
        alt( seq( :multiplicative,
                  :space,
                  :additive_op,
                  :space,
                  :additive ) { |add| add[0].send(add[2], add[-1])},
             :multiplicative ) )
  rule(:additive_op, alt("+", "-"))

  rule( :multiplicative,
        alt( seq( :primary,
                  :space,
                  :multiplicative_op,
                  :space,
                  :multiplicative ) { |mul| mul[0].send(mul[2], mul[-1])},
             :primary ) )
  rule(:multiplicative_op, alt("*", "/"))

  rule(:primary, alt(:parenthized_additive, :number))
  rule( :parenthized_additive,
        seq("(", :space, :additive, :space, ")") { |par| par[2] } )
  rule(:number, /[1-9][0-9]*|0/) { |n| Integer(n) }

  rule(:space, /\s*/)
  parser(:exp, seq(:additive, eof) { |e| e[0] })
end

# define a parser using Ghost Wheel's grammar syntax
GrammarParser = GhostWheel.build_parser %q{
  additive             =  multiplicative space additive_op space additive
                          { ast[0].send(ast[2], ast[-1]) }
                       |  multiplicative
  additive_op          =  "+" | "-"

  multiplicative       =  primary space multiplicative_op space multiplicative
                          { ast[0].send(ast[2], ast[-1])}
                       |  primary
  multiplicative_op    =  "*" | "/"

  primary              = parenthized_additive | number
  parenthized_additive =  "(" space additive space ")" { ast[2] }
  number               =  /[1-9][0-9]*|0/ { Integer(ast) }

  space                =  /\s*/
  exp                  := additive EOF { ast[0] }
}

if __FILE__ == $PROGRAM_NAME
  require "test/unit"

  class TestArithmetic < Test::Unit::TestCase
    def test_paring_numbers
      assert_parses         "0"
      assert_parses         "1"
      assert_parses         "123"
      assert_does_not_parse "01"
    end

    def test_parsing_multiplicative
      assert_parses "1*2"
      assert_parses "1 * 2"
      assert_parses "1/2"
      assert_parses "1 / 2"
    end

    def test_parsing_additive
      assert_parses "1+2"
      assert_parses "1 + 2"
      assert_parses "1-2"
      assert_parses "1 - 2"

      assert_parses "1*2 + 3 * 4"
    end

    def test_parsing_parenthized_expressions
      assert_parses "1 * (2 + 3) * 4"
    end

    def test_parse_results
      assert_correct_result "0"
      assert_correct_result "1"
      assert_correct_result "123"

      assert_correct_result "1*2"
      assert_correct_result "1 * 2"
      assert_correct_result "1/2"
      assert_correct_result "1 / 2"

      assert_correct_result "1+2"
      assert_correct_result "1 + 2"
      assert_correct_result "1-2"
      assert_correct_result "1 - 2"

      assert_correct_result "1*2 + 3 * 4"
      assert_correct_result "1 * (2 + 3) * 4"
    end

    private

    PARSERS = [RubyParser, GrammarParser]

    def assert_parses(input)
      PARSERS.each do |parser|
        assert_nothing_raised(GhostWheel::FailedParseError) do
          parser.parse(input)
        end
      end
    end

    def assert_does_not_parse(input)
      PARSERS.each do |parser|
        assert_raises(GhostWheel::FailedParseError) { parser.parse(input) }
      end
    end

    def assert_correct_result(input)
      PARSERS.each { |parser| assert_equal(eval(input), parser.parse(input)) }
    end
  end
end

The primary differences you should note from the above code and Treetop are:

  • I show two different ways to build the parser using Ghost Wheel: using a Ruby DSL and using a grammar syntax. I prefer the grammar syntax in this and, in fact, most cases. The Ruby DSL can be handy when you want the AST transformations to be true closures though.
  • Ghost Wheel builds on your regular expression knowledge. Note that the grammar syntax is regex-like and you can even match Regexp literals.
  • Ghost Wheel's AST transformations are more Lispish compared to Treetop's very object oriented syntax. I think they both have strengths in certain scenarios.

There's still plenty I want to do with Ghost Wheel, but maybe this will begin the process of getting the word out about it. Feel free to post questions here.

Vish added 11 days later:

James,

Could this parser be used to check a very limited English vocabulary given a full superset of nouns, verbs and prepositions?

For illustration, I'd greatly appreciate if you could show how this might be used to generate a parser which verifies that

"Bob and John go to work" is accepted and so are "John goes to work" and "Bob goes to home" but the following are rejected: "Bob and John goes to work", "John go to work", "John goes to", "Bob John work".

The superset being-

[Prepositions: to, and], [Nouns: Bob, John, work, home], [Verbs: go, come, goes, comes]

I'm not expecting semantic/colloquial correctness so "Bob and work go to John" is acceptable and "Bob goes to home" is acceptable in place of "Bob goes home". (Of course it would be a plus to have that too.)

My question is motivated by the fact that many DSLs have a very limited vocabulary.

Thanks in advance

James Edward Gray II added 11 days later:

I'm sure you can put something together as long as you are willing to spend the effort, yes. Here's a trivial example that passes your tests, just to get you started:

#!/usr/bin/env ruby -wKU

require "rubygems"
require "ghost_wheel"

English = GhostWheel.build_parser %q{
  space       =  /\s+/

  noun        =  'Bob' | 'John' | 'work' | 'home'
  nouns       =  noun space 'and' space noun | noun

  verb        =  'go'   | 'come'
  verb_plural =  'goes' | 'comes'

  preposition =  'to' space

  statement   =  noun  space verb_plural space preposition? noun
              |  nouns space verb        space preposition? noun
  sentence    := statement EOF
}

if __FILE__ == $PROGRAM_NAME
  require "test/unit"

  class TestEnglishGrammar < Test::Unit::TestCase
    def test_good_grammar
      assert_parses "Bob and John go to work"
      assert_parses "John goes to work"
      assert_parses "Bob goes home"
    end

    def test_bad_grammar
      assert_does_not_parse "Bob and John goes to work"
      assert_does_not_parse "John goes to"
      assert_does_not_parse "Bob John work"
    end

    private

    def assert_parses(input)
      assert_nothing_raised(GhostWheel::FailedParseError) do
        English.parse(input)
      end
    end

    def assert_does_not_parse(input)
      assert_raises(GhostWheel::FailedParseError) { English.parse(input) }
    end
  end
end
Phil added 2 months later:

I wonder why you chose the name "Ghost Wheel" as Ruby 1.9's Regexp engine has the same name (in Japanese). Seems like it could cause potential confusion. Was it intentional?

James Edward Gray II added 2 months later:

It was intentional, yes. Ghost Wheel is a parser generator designed to build on your existing regular expression knowledge, so the related names seemed like a good fit to me.

Add Your Thoughts

You can use Markdown in the body of your comment to format text and make links.

Note that I reserve the right to edit any content you post here. I typically exercise this right to fix formatting issues. All posts must be approved so spam will never been seen on these pages.

Author:
URL or Email (optional):
Body: