- add a character vocab to the preprocessor
- add a dataset loader for the language model dataset
- other minor adjustments
- keep only a small sample of example data for the language model