Worzzler
  • Home
  • About
    • What is Worzzler?
    • Features
    • Customizable Challenge
    • Video
    • Screenshots
    • in Game Music
    • Topics and Dictionaries >
      • Extra Topics (Free)
      • Demo Topics (Included)
      • Language Dictionaries
    • Elenesski Games
  • Social
    • @Worzzler
    • #worzzler
    • Facebook
  • Support
    • Community (Forum)
    • Suggestions
    • User Defined Topics
  • Download
  • Puzzle Books

How language topic dictionaries were built

What are language topic dictionaries?

Worzzler currently offers topic dictionaries for English and 8 other languages (covering about 2.5 billion speakers worldwide).  Each language contains 4 to 8 letter words, split into Easy, Medium, Hard and Master dictionaries:
  • Danish
  • English
  • French
  • German
  • Italian
  • Spanish
  • Portuguese
  • Portuguese (Brazil)
  • Russian

Notice about language encryption

People can create their own topics, and the dictionaries are where the majority of their effort and value goes.  We encrypt these dictionaries to help protect that effort.  DIC files are encrypted; UNDIC files, where they are provided, are the unencrypted versions of the dictionaries.
[Picture: Example of Unencrypted Dictionary]
[Picture: Example of Encrypted Dictionary]

English words Topic dictionaries

The English words dictionaries were sourced from Google's top 1,000, 10,000 and 50,000 word lists, and then parsed against a Scrabble™-like dictionary containing 80,000 words.  For example, the Hard dictionary has fewer than 50,000 words because it only contains 4 to 8 letter words.  All words are then vetted against a Scrabble™-like dictionary to remove common names, cities, countries, states, contractions, phrases and inappropriate words.

The result is three dictionaries containing approximately 800, 5,500 and 30,500 words respectively.  The Master dictionary contains 80,000 words.
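A minimal sketch of why each English dictionary ends up smaller than its source list. The function name, parameters and sample words are hypothetical; the actual tooling is not described here. It keeps only those of the top N ranked words that are 4 to 8 letters long and also appear in a reference word list:

```python
def build_english_tier(ranked_words, reference_words, top_n, min_len=4, max_len=8):
    """Keep top-N words that pass the length check and exist in the reference list."""
    kept = []
    for word in ranked_words[:top_n]:
        if min_len <= len(word) <= max_len and word in reference_words:
            kept.append(word)
    return kept

# Made-up sample data, for illustration only.
ranked = ["the", "that", "with", "london", "people", "don't", "because"]
reference = {"the", "that", "with", "people", "because"}

tier = build_english_tier(ranked, reference, top_n=6)
print(tier)
```

In this sample, "the" is too short, "london" and "don't" are not in the reference list, and "because" falls outside the top 6, so only three of the six candidate words survive.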

The English Demo topic only contains the Easy and Medium dictionaries.

Foreign language topic dictionaries

We used the FrequencyWords library (https://github.com/hermitdave/FrequencyWords/), which in turn used the open subtitles database as its source for foreign language words.  Additional information can be found here: https://invokeit.wordpress.com/frequency-word-lists/

These lists count how often each word occurs.  Each one contains thousands of words paired with their counts, ordered with the most frequent words first.  The list for Catalan, for example, begins like this:
  • que 84784
  • no 78724
  • de 75624
  • la 71818
  • a 60868
  • el 59982
  • i 48501
  • és 42272
  • un 36975
  • per 36498
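A list in this "word count" format can be read with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes one whitespace-separated word and integer count per line, as in the excerpt above.

```python
def parse_frequency_lines(lines):
    """Parse 'word count' lines into (word, count) pairs, most frequent first."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        entries.append((parts[0], int(parts[1])))
    # The lists are already ordered; sorting again is a defensive step.
    entries.sort(key=lambda entry: entry[1], reverse=True)
    return entries

sample = ["que 84784", "no 78724", "de 75624", "la 71818", "a 60868"]
entries = parse_frequency_lines(sample)
print(entries[0])
```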

For each language we selected, we applied the following rules:
  • Words must use only the lower-case letters of the language.  This avoids pseudo words, proper names, hyphenated words, etc.
  • Words must be 4 to 8 letters in length.
  • Words cannot exist in our "English Words" Master Dictionary.
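The three rules above can be sketched as a single predicate. Everything named here is an assumption for illustration: the German alphabet set, the tiny stand-in for the English Master dictionary, and the function name are hypothetical, not the tooling actually used.

```python
import unicodedata

# Hypothetical stand-ins; the real alphabets and Master dictionary are not shown here.
GERMAN_ALPHABET = set("abcdefghijklmnopqrstuvwxyzäöüß")
ENGLISH_MASTER = {"taxi", "hotel"}

def keep_word(word, alphabet, english_master, min_len=4, max_len=8):
    """Return True if the word passes the alphabet, length and English-exclusion rules."""
    word = unicodedata.normalize("NFC", word)  # treat accented letters as single characters
    if not (min_len <= len(word) <= max_len):
        return False
    if any(ch not in alphabet for ch in word):
        return False  # upper case, hyphens, digits, foreign letters
    return word not in english_master

candidates = ["straße", "Berlin", "e-mail", "taxi", "und", "mädchen"]
kept = [w for w in candidates if keep_word(w, GERMAN_ALPHABET, ENGLISH_MASTER)]
print(kept)
```

Checking each character against an explicit alphabet, rather than calling a generic lower-case test, is what lets one rule reject capitals, hyphens and letters from other languages at the same time.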

The result should be a word that appears in that language's lexicon.  We went back and forth over the presence of foreign words (including English words that weren't in our Master dictionary).  We decided that while they may not be formal words of that language, they are part of its lexicon, where words from other languages are mixed in with the core words of the language in question.  This was an acceptable compromise considering just how time-consuming it is to create a set of dictionaries.

We then produced 4 dictionaries with these word counts and difficulty ratings:
  • Top 1,000    [Difficulty: 1]
  • Top 8,000    [Difficulty: 2]
  • Top 25,000   [Difficulty: 4]
  • Top 60,000   [Difficulty: 6]
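A rough sketch of cutting one ranked word list into these four tiers. Two assumptions are made here and are not confirmed by the text: that the cut is applied to the already filtered list, and that the tiers map in order to Easy, Medium, Hard and Master. The names `TIERS` and `build_tiers` are hypothetical.

```python
# (name, word count, difficulty); the name-to-tier mapping is an assumption.
TIERS = [
    ("Easy", 1_000, 1),
    ("Medium", 8_000, 2),
    ("Hard", 25_000, 4),
    ("Master", 60_000, 6),
]

def build_tiers(ranked_words):
    """Slice a most-frequent-first word list into the four tiered dictionaries."""
    return {
        name: {"difficulty": difficulty, "words": ranked_words[:size]}
        for name, size, difficulty in TIERS
    }

# Placeholder words standing in for a real filtered frequency list.
ranked = [f"word{i}" for i in range(70_000)]
tiers = build_tiers(ranked)
print({name: len(tier["words"]) for name, tier in tiers.items()})
```

Under this reading, each tier is a prefix of the next, so every Easy word also appears in Medium, Hard and Master.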

The names of the dictionaries are translated from these English words using Google Translate:
  • Easy
  • Medium
  • Hard
  • Master

As for the games, we use the exact same games and patterns found in the English Words topic.  These are 8 puzzle collections, which produce a grid of 32 game types for each of the 6 match types (192 combinations).  With a billion game numbers, each topic has 192 billion game possibilities.
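The arithmetic behind that figure, written out as a small check. The constant names are illustrative only; the numbers come from the paragraph above.

```python
GAME_TYPES = 32               # game types per match type
MATCH_TYPES = 6               # match types
GAME_NUMBERS = 1_000_000_000  # a billion game numbers

combinations = GAME_TYPES * MATCH_TYPES          # 192
total_possibilities = combinations * GAME_NUMBERS

print(f"{combinations} combinations, {total_possibilities:,} possible games per topic")
```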

To learn about the word distributions by topic, see this page.

What is Worzzler?

In this game, your goal is to find a set of words that use all the letters presented to you in a grid.  Words are formed when the letters are adjacent to each other.  Each letter is used only once.

Contact Us
