Change Log

All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning starting with version 0.7.0.

[Unreleased] - master

Note

This version is not yet released and is under active development.

Added

Changed

Removed

Fixed

[0.8.6] - 2017-05-15

Fixed

  • Fixed duckling dimension persistence. fixes #358

[0.8.5] - 2017-05-10

Fixed

  • Fixed pypi installation dependencies (e.g. flask). fixes #354

[0.8.4] - 2017-05-10

Fixed

  • Fixed CRF model training without entities. fixes #345

[0.8.3] - 2017-05-10

Fixed

  • Fixed Luis emulation and added test to catch regression. Fixes #353

[0.8.2] - 2017-05-08

Fixed

  • deepcopy of context #343

[0.8.1] - 2017-05-08

Fixed

  • NER training reuses context in between requests

[0.8.0] - 2017-05-08

Added

  • ngram character featurizer (allows better handling of out-of-vocab words)
  • replaced pre-wired backends with more flexible pipeline definitions
  • return top 10 intents with sklearn classifier #199
  • python type annotations for nearly all public functions
  • support for arbitrary spacy language model names
  • duckling components to provide normalized output for structured entities
  • Conditional random field entity extraction (Markov model for entity tagging; better named entity recognition with small and medium amounts of training data, and comparable results with large amounts)
  • allow naming of trained models instead of generated model names
  • dynamic check of requirements for the different components & error messages on missing dependencies
  • support for using multiple entity extractors and combining results downstream

Changed

  • unified tokenizers, classifiers and feature extractors to implement common component interface

  • src directory renamed to rasa_nlu

  • when loading data in a foreign format (api.ai, luis, wit) the data gets properly split into intent & entity examples

  • Configuration:
    • added max_number_of_ngrams
    • removed backend and added pipeline as a replacement
    • added luis_data_tokenizer
    • added duckling_dimensions
  • parser output format changed

    from {"intent": "greeting", "confidence": 0.9, "entities": []}

    to {"intent": {"name": "greeting", "confidence": 0.9}, "entities": []}

  • entities output format changed

    from {"start": 15, "end": 28, "value": "New York City", "entity": "GPE"}

    to {"extractor": "ner_mitie", "processors": ["ner_synonyms"], "start": 15, "end": 28, "value": "New York City", "entity": "GPE"}

    where extractor denotes the entity extractor that originally found an entity, and processors lists the components that alter entities, such as the synonym component.

  • camel cased MITIE classes (e.g. MITIETokenizer → MitieTokenizer)

  • model metadata changed, see migration guide

  • updated to spacy 1.7 and dropped training and loading capabilities for the spacy component (breaks existing spacy models!)

  • introduced compatibility with both Python 2 and 3
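
Client code that still expects the pre-0.8.0 response layout may need a small adapter. The sketch below illustrates the two output format changes listed above by converting an old-style parse result into the new shape. The helper names (upgrade_entity, upgrade_parse_result) and the "unknown" placeholder are hypothetical and not part of rasa_nlu; old results carry no extractor information, so the placeholder value is an assumption.

```python
from typing import Any, Dict


def upgrade_entity(entity: Dict[str, Any], extractor: str = "unknown") -> Dict[str, Any]:
    """Return a copy of a pre-0.8.0 entity with the 0.8.0 bookkeeping keys added."""
    upgraded = dict(entity)
    # Assumption: the old format never recorded which component found the
    # entity, so a placeholder extractor name is filled in here.
    upgraded.setdefault("extractor", extractor)
    # No post-processing components are known for an old-style entity.
    upgraded.setdefault("processors", [])
    return upgraded


def upgrade_parse_result(old: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a pre-0.8.0 parse result into the 0.8.0 layout."""
    # Carry over any keys that are not affected by the format change.
    new = {k: v for k, v in old.items()
           if k not in ("intent", "confidence", "entities")}
    # The intent name and its confidence are now nested under "intent".
    new["intent"] = {"name": old["intent"], "confidence": old["confidence"]}
    new["entities"] = [upgrade_entity(e) for e in old.get("entities", [])]
    return new


old_result = {"intent": "greeting", "confidence": 0.9, "entities": []}
print(upgrade_parse_result(old_result))
```

Going the other way (new to old) only requires reading result["intent"]["name"] and result["intent"]["confidence"] and ignoring the extra entity keys.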

Removed

Fixed

  • properly parse str in addition to unicode #210
  • support entity only training #181
  • resolved conflicts between metadata and configuration values #219
  • removed tokenization when reading Luis.ai data (they changed their format) #241

[0.7.4] - 2017-03-27

Fixed

  • fixed failed loading of example data after renaming attributes, i.e. “KeyError: ‘entities’”

[0.7.3] - 2017-03-15

Fixed

  • fixed regression in mitie entity extraction on special characters
  • fixed spacy fine tuning and entity recognition on passed language instance

[0.7.2] - 2017-03-13

Fixed

  • python documentation about calling rasa NLU from python

[0.7.1] - 2017-03-10

Fixed

  • mitie tokenization value generation #207, thanks @cristinacaputo
  • changed log file extension from .json to .log, since the contained text is not proper json

[0.7.0] - 2017-03-10

This is a major version update. Please also have a look at the Migration Guide.

Added

  • Changelog ;)
  • option to use multi-threading during classifier training
  • entity synonym support
  • proper temporary file creation during tests
  • mitie_sklearn backend using mitie tokenization and sklearn classification
  • option to fine-tune spacy NER models
  • multithreading support of built-in REST server (e.g. using gunicorn)
  • multitenancy implementation to allow loading multiple models which share the same backend

Fixed

  • error propagation on failed vector model loading (spacy)
  • escaping of special characters during mitie tokenization

[0.6-beta] - 2017-01-31