Earley parsing

Have been digging back into the past recently, trying to find one of my old implementations of an Earley parser.

It's a very powerful parser that I have implemented in various forms over the years. And yet, I'm still not convinced I've yet created one that works properly without bugs.

Recognising a valid parse is relatively straightforward - the challenge comes when trying to build up the parse tree, particularly when the input is ambiguous.

Of course, English as a language readily leads to all manner of ambiguous parses, so it's kind of important to get this right when intending to use the parser to parse English.

Looks like my next little programming task might just be to re-implement another Earley parser and start parsing things.

[Image credit: Tantek Çelik]