Spelling Suggestion on the JVM #5: Implementing Norvig's Algo in Xtend
Wed, Feb 7, 2018
This tutorial project implements a basic spelling suggestion service. The motivation is didactic:
becoming familiar with “better Java” languages through a simple but instructive example. Ease of
implementation outranks optimal performance so readers can focus on the JVM languages as such.
Examples and new concepts are introduced in Java 9 first so they’re immediately understandable to
the experienced Java programmer. Crucially, the Java implementation is accompanied by equivalent
(but idiomatic) implementations in today’s most relevant alternative JVM languages.
Code is available at Spellbound’s GitHub repo.
This fifth and last post presents an idiomatic Xtend implementation of Norvig’s spelling corrector.
Everything having to do with spelling correction revolves around a curated list of valid words
we refer to as the dictionary.
In our dictionary implementation we associate every word with an integer rank indicating how
frequently the word is used in relation to all other words. The higher the rank, the more
frequently used the word is. Thus, for instance, “the” has rank 106295 while “triose” has rank