File Format

Since multiple modules have to communicate, we need a common format to represent words, definitions and crosswords. This page describes which options we considered and which one we selected.

Formats available

We compared XML with JSON. Both can represent a mapping between keys and values. We chose JSON mainly because it is less verbose.

As a JSON parser, we are currently using the one included with the Play Framework. Some good documentation can be found here and here.

Another common (proprietary) file format for crosswords is PUZ (reverse-engineered documentation can be found here, a viewer here). It is composed of:

We have written a parser that converts PUZ files to our internal format, but we do not plan to use PUZ internally because it is quite cumbersome to manipulate. Instead, we will use JSON.

JSON Format specification

In what follows, attributes in bold are mandatory and those in italic are optional. Any additional attribute is accepted, but we should avoid unnecessary information and try to keep the files homogeneous.

Crossword

Clue

Here is an example of a crossword grid:

{
  "source" : "The Guardian",
  "language" : "eng",
  "title" : "Cryptic crossword No 26,515",
  "categories" : [ "cryptic" ],
  "url" : "http://www.theguardian.com/crosswords/cryptic/26515",
  "author" : "Picaroon",
  "date" : "2015-03-10",
  "words" : [ {
    "word" : "OGRE",
    "clue" : "Revolting, I must grab right brute! (4)",
    "x" : 3,
    "y" : 0,
    "dir" : "South"
  }, {
    "word" : "BOXERS",
    "clue" : "Men in gloves and underpants? (6)",
    "x" : 0,
    "y" : 5,
    "dir" : "East"
  },
  ...
  ]
}
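
As an illustration of how a module might consume such a grid, here is a minimal Python sketch that lays the words out cell by cell. It rests on assumptions that this page does not state: "x" is taken to be the column and "y" the row of the first letter, "East" is taken to mean left to right and "South" top to bottom. The names STEP and place_words are hypothetical and not part of our code base.

```python
import json

# Assumed direction convention: "East" moves along x, "South" moves along y.
STEP = {"East": (1, 0), "South": (0, 1)}

def place_words(crossword):
    """Return a dict mapping (x, y) to the letter stored in that cell."""
    cells = {}
    for entry in crossword["words"]:
        dx, dy = STEP[entry["dir"]]
        for i, letter in enumerate(entry["word"]):
            pos = (entry["x"] + i * dx, entry["y"] + i * dy)
            # Two crossing words must agree on the letter they share.
            if cells.setdefault(pos, letter) != letter:
                raise ValueError(f"conflicting letters at {pos}")
    return cells

# Only the two words shown in the example above; the rest is elided there.
sample = json.loads("""
{
  "title": "Cryptic crossword No 26,515",
  "words": [
    {"word": "OGRE", "clue": "Revolting, I must grab right brute! (4)",
     "x": 3, "y": 0, "dir": "South"},
    {"word": "BOXERS", "clue": "Men in gloves and underpants? (6)",
     "x": 0, "y": 5, "dir": "East"}
  ]
}
""")
cells = place_words(sample)
```

Under these assumptions, cells[(3, 0)] holds the first letter of OGRE and cells[(0, 5)] the first letter of BOXERS. Any attribute the sketch does not read (such as "clue" or "title") is simply ignored, which matches the rule that additional attributes are accepted.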

Definition

  • "word": the word that is defined
  • "equivalents": an array of equivalent terms (alternative spellings, acronyms...)
  • "associated": an array of close terms (synonyms, antonyms...)
  • "definitions": an array of definitions (sentences that explain the word)
  • "examples": an array of related sentences (in which the word appears, but which do not necessarily explain it).

Here is an example of a definition:

{
  "word" : "head",
  "equivalents" : [ "hed", "heed", ... ],
  "associated" : [ "Anatomy", "boss", "tail", "turk's head", ... ],
  "definitions" : [ "part of the body", "topmost or leading part", "leader or chief", ... ],
  "examples" : [ "Ere foul sin, gathering head, shall break into corruption.", "to head an army, an expedition, or a riot", ... ]
}
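
A reader of this format could normalise a definition along the following lines. This is only a sketch: it assumes that "word" is the single mandatory attribute and that a missing array can be treated as empty, neither of which is stated on this page. The names LIST_FIELDS and load_definition are hypothetical.

```python
import json

# The four array-valued attributes listed above.
LIST_FIELDS = ("equivalents", "associated", "definitions", "examples")

def load_definition(text):
    """Parse one definition and fill in missing arrays with empty lists."""
    data = json.loads(text)
    # Assumption: "word" is mandatory and must be a string.
    if not isinstance(data, dict) or not isinstance(data.get("word"), str):
        raise ValueError('a definition needs a string "word" attribute')
    for field in LIST_FIELDS:
        # Assumption: an absent array is equivalent to an empty one.
        value = data.setdefault(field, [])
        if not isinstance(value, list):
            raise ValueError(f'"{field}" must be an array')
    # Unknown attributes are kept untouched, since extras are accepted.
    return data

definition = load_definition(
    '{"word": "head", "definitions": ["part of the body"]}'
)
```

Filling in empty arrays means that consumers can iterate over every field without first checking whether it is present.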