This project allows people to train a variant of GPT-2 that makes up words, definitions, and examples from scratch.
For example:
incromulentness (noun)
lack of sincerity or candor
"incromulentness in the manner of speech"
Check out https://www.thisworddoesnotexist.com for a demo.
Check out https://twitter.com/robo_define for a Twitter bot demo.
Python deps are in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/cpu_deploy_environment.yml
Pre-trained model files:
- Blacklist: https://storage.googleapis.com/this-word-does-not-exist-models/blacklist.pickle.gz
- Forward Model (word -> definition): https://storage.googleapis.com/this-word-does-not-exist-models/forward-dictionary-model-v1.tar.gz
- Inverse model (definition -> word): https://storage.googleapis.com/this-word-does-not-exist-models/inverse-dictionary-model-v1.tar.gz
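One way to fetch the three files listed above is sketched here using only the Python standard library. The helper names (`MODEL_URLS`, `local_name`, `fetch`, `download_all`) are hypothetical and not part of this repository. The sketch only downloads the files; this README does not say whether the archives must be unpacked before use, so that step is left out.

```python
import urllib.request
from pathlib import Path

# URLs copied from the list above; the dictionary keys are illustrative labels.
BASE = "https://storage.googleapis.com/this-word-does-not-exist-models/"
MODEL_URLS = {
    "blacklist": BASE + "blacklist.pickle.gz",
    "forward": BASE + "forward-dictionary-model-v1.tar.gz",
    "inverse": BASE + "inverse-dictionary-model-v1.tar.gz",
}


def local_name(url: str) -> str:
    """Return the file name component of a URL."""
    return url.rsplit("/", 1)[-1]


def fetch(url: str, dest_dir) -> Path:
    """Download url into dest_dir unless the file is already there."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / local_name(url)
    if not target.exists():
        # Hits the network only when there is no cached copy.
        urllib.request.urlretrieve(url, str(target))
    return target


def download_all(dest_dir="models") -> dict:
    """Fetch every file in MODEL_URLS and return a label -> local path map."""
    return {label: fetch(url, dest_dir) for label, url in MODEL_URLS.items()}
```

Calling `download_all("models")` would place the three files under `./models` and return their local paths, skipping any file that is already present.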
To use them:
    from title_maker_pro.word_generator import WordGenerator

    word_generator = WordGenerator(
        device="cpu",
        forward_model_path="<somepath1>",
        inverse_model_path="<somepath2>",
        blacklist_path="<blacklist>",
        quantize=False,
    )

    # a word from scratch:
    print(word_generator.generate_word())

    # definition for a word you make up
    print(word_generator.generate_definition("glooberyblipboop"))

    # new word made up from a definition
    print(word_generator.generate_word_from_definition("a word that does not exist"))

For raw thoughts, take a look at some of the notebooks in https://github.com/turtlesoupy/this-word-does-not-exist/tree/master/notebooks
To train, you'll need to find a dictionary. There is code to extract from:
- Apple dictionaries in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/dictionary_definition.py (e.g. /System/Library/Assets/com_apple_MobileAsset_DictionaryServices_dictionaryOSX/)
- Urban Dictionary in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/urban_dictionary_scraper.py
After extracting a dictionary you can use the master training script: https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/train.py. A sample recent run is https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/scripts/sample_run_parsed_dictionary.sh
To run the website locally with the aiohttp development server:

    cd ./website
    pip install -r requirements.txt
    pip install aiohttp-devtools
    adev runserver