What are language topic dictionaries?
Worzzler currently offers topic dictionaries in 9 languages: English plus 8 others, covering about 2.5 billion speakers worldwide. Each language has Easy, Medium, Hard and Master dictionaries containing words of 4 to 8 letters:
- Danish
- English
- French
- German
- Italian
- Spanish
- Portuguese
- Portuguese (Brazil)
- Russian
Notice about language encryption
People can create their own topics, and most of their effort and value goes into the dictionaries. We encrypt the dictionaries to help protect that effort. DIC files are encrypted; UNDIC files, where provided, are the unencrypted versions of the same dictionaries.
English Words topic dictionaries
The English Words dictionaries were sourced from Google's top 1,000, 10,000 and 50,000 word lists. All words were then vetted against a Scrabble™-like dictionary of 80,000 words to remove common names, cities, countries, states, contractions, phrases and inappropriate words. Only words of 4 to 8 letters were kept, which is why, for example, the Hard dictionary contains fewer than 50,000 words.
The result is three dictionaries containing approximately 800, 5,500 and 30,500 words, respectively. The Master dictionary contains 80,000 words.
The English Demo topic only contains the Easy and Medium dictionaries.
Foreign language topic dictionaries
We used the FrequencyWords library (https://github.com/hermitdave/FrequencyWords/), which in turn uses the OpenSubtitles database as its source for foreign language words. Additional information can be found here: https://invokeit.wordpress.com/frequency-word-lists/
Each list gives a per-word count for many thousands of words, ordered by count with the most frequent words first. For example, the Catalan list begins:
- que 84784
- no 78724
- de 75624
- la 71818
- a 60868
- el 59982
- i 48501
- és 42272
- un 36975
- per 36498
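As a rough illustration of reading such a list, the sketch below parses lines into (word, count) pairs. It assumes each line holds one word and one integer count separated by whitespace; the function name parse_frequency_lines is hypothetical and is not part of Worzzler or the FrequencyWords library.

```python
def parse_frequency_lines(lines):
    """Parse 'word count' lines into (word, count) pairs.

    Assumption: one word and one integer count per line, separated by
    whitespace. Lines that do not match that shape are skipped.
    """
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        entries.append((parts[0], int(parts[1])))
    return entries


# The first few Catalan entries quoted above.
sample = ["que 84784", "no 78724", "de 75624", "és 42272"]
entries = parse_frequency_lines(sample)
print(entries[0])  # ('que', 84784)
```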
For each language we selected, we apply the following filters:
- Words must use only the lower-case letters of the language. This avoids pseudo-words, proper names, hyphenated words, etc.
- Words must be 4 to 8 letters in length.
- Words cannot exist in our "English Words" Master dictionary.
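The three filters above can be sketched as a single predicate. This is a minimal illustration, not Worzzler's actual code: the function name is_candidate, the FRENCH_LETTERS alphabet and the tiny english_master set are all hypothetical stand-ins chosen for the example.

```python
import unicodedata

# Hypothetical alphabet for the example: lower-case French letters.
FRENCH_LETTERS = set("abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ")


def is_candidate(word, alphabet, english_master):
    """Return True if the word passes all three filters."""
    # Normalize so an accented letter is one code point, not letter + accent.
    word = unicodedata.normalize("NFC", word)
    return (
        4 <= len(word) <= 8                     # 4 to 8 letters
        and all(ch in alphabet for ch in word)  # lower-case letters only
        and word not in english_master          # not an English Words entry
    )


# Stand-in for the English Words Master dictionary (assumed contents).
english_master = {"table", "point"}

words = ["maison", "Paris", "table", "peut-être", "est", "être"]
kept = [w for w in words if is_candidate(w, FRENCH_LETTERS, english_master)]
print(kept)  # ['maison', 'être']
```

Because capitals and hyphens are not in the alphabet set, proper names and hyphenated words fall out without needing separate rules.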
The result should be a word that appears in that language's lexicon. We went back and forth over whether to keep foreign words (including English words that weren't in our Master dictionary). We decided that, while they may not be formal words of the language, they are part of its lexicon, where words from other languages mix with the core words. This was an acceptable compromise considering how time-consuming it is to create a set of dictionaries.
We then produced 4 dictionaries with the following word counts and difficulties:
- Top 1,000 [Difficulty: 1]
- Top 8,000 [Difficulty: 2]
- Top 25,000 [Difficulty: 4]
- Top 60,000 [Difficulty: 6]
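One way to picture the four cut-offs is as prefixes of a single list ranked by frequency. The sketch below is an assumption-laden illustration: the names TIERS and build_tiers are hypothetical, the pairing of Easy, Medium, Hard and Master with the four cut-offs is inferred from their order, and the source does not say whether the cut is taken before or after the word filters are applied.

```python
# (name, top-N cut-off, difficulty); the name pairing is an assumption.
TIERS = [
    ("Easy", 1_000, 1),
    ("Medium", 8_000, 2),
    ("Hard", 25_000, 4),
    ("Master", 60_000, 6),
]


def build_tiers(entries, tiers=TIERS):
    """Map each tier name to the N most frequent words.

    entries: iterable of (word, count) pairs in any order.
    """
    ranked = sorted(entries, key=lambda e: e[1], reverse=True)
    return {name: [w for w, _ in ranked[:top_n]] for name, top_n, _ in tiers}


# Tiny demonstration with made-up cut-offs so the output stays readable.
demo_entries = [("de", 75624), ("que", 84784), ("no", 78724)]
demo = build_tiers(demo_entries, tiers=[("Top2", 2, 1), ("Top3", 3, 2)])
print(demo)  # {'Top2': ['que', 'no'], 'Top3': ['que', 'no', 'de']}
```

Under this reading each larger dictionary contains every word of the smaller ones, since all four are prefixes of the same ranking.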
The names of the dictionaries are translated from these English words using Google Translate:
- Easy
- Medium
- Hard
- Master
As for the games, we use exactly the same games and patterns found in the English Words topic, organized into 8 puzzle collections. This produces a grid of 32 game types for each of the 6 match types. With a billion game numbers, each topic has 192 billion possible games.
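The 192 billion figure follows from multiplying the three quantities given above. A short check, using only those numbers (the variable names are just for illustration):

```python
GAME_TYPES = 32               # game types in the grid
MATCH_TYPES = 6               # match types
GAME_NUMBERS = 1_000_000_000  # a billion game numbers

combinations = GAME_TYPES * MATCH_TYPES  # 192 game/match combinations
total = combinations * GAME_NUMBERS      # games per topic

print(f"{total:,}")  # 192,000,000,000
```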
To learn about the word distributions by topic, see this page.