Project author: usriva2405

Project description:
ML-based chatbot for banking applications
Language: Python
Repository: git://github.com/usriva2405/tulo-chatbot.git
Created: 2019-09-15T14:23:01Z
Project community: https://github.com/usriva2405/tulo-chatbot

License: MIT License


Tulo-Chatbot

This is an ML (machine learning) based, intent-specific (category-specific) conversational bot with the following features:

  1. It is domain agnostic. Provide it with the right training data and it should work out of the box (to demonstrate a use case, the bundled training data is for the banking domain).
  2. It classifies each incoming query into a category (intent).
  3. If a query cannot be classified, it is stored for later training.
  4. It is REST API driven (via Flask). It can also be extended to messaging platforms such as Slack, Skype, WhatsApp, WeChat, and Telegram (a Telegram prototype is implemented; see below for reference).
  5. The APIs require authentication and authorization.
  6. Models can be trained and retrained on the fly.
  7. Everything (model creation, training, querying) is database driven (MongoDB, with a Redis server for caching).
  8. It supports multilingual training.
  9. It supports trainable expletive query management.
  10. It is ready for deployment on Heroku out of the box (more on this later).
  11. Model selection and vector selection can be extended with custom implementations.
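
Features 2 and 3 can be illustrated with a minimal sketch. This is not the project's classifier: it assumes a toy token-overlap score, a hypothetical threshold parameter, made-up training data, and an in-memory list standing in for the place where unclassifiable queries would be stored.

```python
import re
from collections import Counter

# Hypothetical training data: intent name -> example utterances.
TRAINING = {
    "balance": ["show my balance", "what is my account balance"],
    "transfer": ["transfer money to savings", "send money to a friend"],
}

unclassified = []  # stand-in for the store of queries kept for later training


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_profiles(training):
    """Build one bag-of-words Counter per intent."""
    return {
        intent: Counter(tok for sample in samples for tok in tokenize(sample))
        for intent, samples in training.items()
    }


def classify(query, profiles, threshold=0.5):
    """Return the best intent, or None (and store the query) if the score is low."""
    tokens = tokenize(query)
    best_intent, best_score = None, 0.0
    for intent, profile in profiles.items():
        # Fraction of query tokens that also occur in the intent's examples.
        score = sum(1 for t in tokens if t in profile) / max(len(tokens), 1)
        if score > best_score:
            best_intent, best_score = intent, score
    if best_score < threshold:
        unclassified.append(query)
        return None
    return best_intent


profiles = build_profiles(TRAINING)
print(classify("Can you show my balance?", profiles))   # balance
print(classify("What's the weather like?", profiles))   # None
print(unclassified)
```

The second query scores below the threshold, so it is returned as None and kept in the unclassified list, where it could later be labeled and added to the training data.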

In the pipeline (major upgrades). Star it, watch it, contribute actively to it!

  1. Each user account can support bots in multiple projects, each in multiple languages.
  2. Support for follow-up queries and for custom variables in input and output.
  3. Small talk support (this was part of version 1, but the feature broke during the redesign). Move to spaCy for small talk.
  4. Improve classification accuracy by using a normalizer (for spelling mistakes), NLTK for preprocessing, lemmatization, and stemming.
  5. Add a GUI for an improved user experience (this will most likely be a separate project).
  6. Other recommendations about project structure, deployment best practices, etc.

Tech/Infra Stack

Python 3.6+

MongoDB

Heroku (for deployment)

Redis (for decentralized caching)

Actors and Systems

Users -> Brokers -> Language -> TrainedClassifier

  1. User - the person who creates the chatbot. The bot can be deployed as a “Bot as a Service”.
  2. Brokers - projects under which chatbots are created. A user can create multiple brokers (Bank Bot, HR Bot, Restaurant Bot).
  3. Language - under each broker, a chatbot can handle multiple languages, with one classifier trained per language. The language is passed as an input parameter.
  4. TrainedClassifier - the trained model for a given language. Refer to the REQUEST objects below to see how to make multilingual queries.
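
The hierarchy above can be sketched with plain dataclasses. All class and field names here are illustrative and are not taken from the project's db_model package; the fallback to the broker's default language when no language is given is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TrainedClassifier:
    """Stand-in for a trained model; a real one would wrap a fitted estimator."""
    lang: str


@dataclass
class Broker:
    """A project (e.g. a bank bot) holding one classifier per language."""
    name: str
    default_lang: str
    classifiers: Dict[str, TrainedClassifier] = field(default_factory=dict)

    def classifier_for(self, lang: Optional[str] = None) -> TrainedClassifier:
        """Pick the classifier for `lang`; assumed fallback: the default language."""
        return self.classifiers[lang or self.default_lang]


@dataclass
class User:
    """Account owner; may create several brokers."""
    email: str
    brokers: Dict[str, Broker] = field(default_factory=dict)


bank = Broker(name="Bank Bot", default_lang="en-US")
bank.classifiers["en-US"] = TrainedClassifier(lang="en-US")
bank.classifiers["hi-IN"] = TrainedClassifier(lang="hi-IN")

user = User(email="owner@example.com")
user.brokers["bank"] = bank

print(user.brokers["bank"].classifier_for("hi-IN").lang)  # hi-IN
print(user.brokers["bank"].classifier_for().lang)         # en-US
```

Keying classifiers by language code mirrors the "one classifier trained per language" rule: a query is routed by broker first, then by its lang value.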

Project Structure

modules
|
|__ controllers
|__ data
|   |__ dao
|   |__ db_model
|   |__ dto
|__ nlp_engine
|   |__ classifier_instance
|   |__ model_builder
|   |__ model_selection
|   |__ vector_selection
|__ saved_models *not used anymore*
|__ services
|__ utils

  1. controllers - endpoints exposed for Flask and Telegram (and, going forward, for any other endpoint)
  2. data
    -> dao - DAOs for MongoDB
    -> db_model - all the models required by the project
    -> dto - mostly response objects
  3. nlp_engine
    -> classifier_instance - trained instance of a model, which is pickled and stored in the database after training
    -> model_builder - training classes
    -> model_selection - models used for classification. Extend your models here
    -> vector_selection - vector implementations for bag-of-words models
  4. services - intermediate layer between the controllers and the DAOs, plus any additional requirements
  5. utils - miscellaneous methods
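
The classifier_instance idea (train, pickle, store, load later) can be sketched as follows. This is a simplified illustration, not the project's code: the "model" is a plain keyword-to-intent dictionary, the function names are hypothetical, and an in-memory dict stands in for the MongoDB collection.

```python
import pickle

# Stand-in for a database collection, keyed by (broker_id, lang).
model_store = {}


def train(samples):
    """Toy 'training': learn a keyword -> intent lookup from (keyword, intent) pairs."""
    return {keyword.lower(): intent for keyword, intent in samples}


def predict(model, query):
    """Return the intent of the first known keyword in the query, else None."""
    for word in query.lower().split():
        if word in model:
            return model[word]
    return None


def save_model(broker_id, lang, model):
    """Serialize the trained model to bytes and store it."""
    model_store[(broker_id, lang)] = pickle.dumps(model)


def load_model(broker_id, lang):
    """Fetch the stored bytes and rebuild the trained model."""
    return pickle.loads(model_store[(broker_id, lang)])


save_model("demo-broker", "en-US", train([("balance", "check_balance")]))
restored = load_model("demo-broker", "en-US")
print(predict(restored, "show my balance"))  # check_balance
```

Because pickle.loads can execute arbitrary code, pickled models should only ever be loaded from a database the service itself controls.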

Setup and installation

  1. Set up MongoDB. Sample data is available in the docs -> db folder. Create a database called tulo_bot and import all of the sample data into it.
  2. Ensure the modules listed in requirements.txt are installed, for example with pip install -r requirements.txt (TODO: make installation of all requirements script based).
  3. Set up Redis.
  4. Provide the appropriate URLs and credentials for MongoDB and Redis in config.yaml.

Run Flask API

  1. Run flask_controller.py

APIs

AUTHENTICATION

REQUEST

URL : /authenticate

body :

{
  "email": "utkarshsrivastava.aiml@gmail.com",
  "password": "password1"
}

RESPONSE

Returns the list of available brokers (ID and default language) and an auth token.
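
As a rough illustration, the authentication request could be built with the Python standard library as shown below. The base URL (Flask's default development address), the use of POST, and the JSON content type are assumptions, since the README does not state them; build_auth_request is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: Flask's default development address


def build_auth_request(email, password, base_url=BASE_URL):
    """Build (but do not send) the request for /authenticate."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/authenticate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts POST with a JSON body
    )


request = build_auth_request("user@example.com", "password1")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing the client at a live server.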

RETRAIN

REQUEST

URL : /retrain

body :

{
  "token": "<<use token generated from login here>>",
  "broker_id": "5d9e1f9d6ecaa9720db58964",
  "lang": "en-US"
}

QUERY

REQUEST

URL : /query

body :

{
  "token": "<<use token generated from login here>>",
  "broker_id": "5d9e1f9d6ecaa9720db58964",
  "lang": "en-US",
  "query": "Can you show my balance?"
}

LOGOUT

REQUEST

URL : /logout

body :

{
  "token": "<<use token generated from login here>>"
}
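
The four endpoints are typically used as one session: authenticate, optionally retrain, query, then log out. The sketch below shows that flow with a hypothetical TuloClient class and a fake transport, so it runs without a server. The response shape is an assumption: the README only says that authentication returns a token, so the "token" key name is a guess.

```python
class TuloClient:
    """Thin client sketch. `post` is any callable(path, payload) returning a dict."""

    def __init__(self, post):
        self._post = post
        self.token = None

    def login(self, email, password):
        response = self._post("/authenticate", {"email": email, "password": password})
        # Assumption: the auth token comes back under a "token" key.
        self.token = response["token"]
        return response

    def retrain(self, broker_id, lang):
        return self._post("/retrain", {
            "token": self.token, "broker_id": broker_id, "lang": lang,
        })

    def query(self, broker_id, lang, text):
        return self._post("/query", {
            "token": self.token, "broker_id": broker_id, "lang": lang, "query": text,
        })

    def logout(self):
        response = self._post("/logout", {"token": self.token})
        self.token = None  # the token should not be reused after logout
        return response


calls = []


def fake_post(path, payload):
    """Record the call and return a canned reply instead of using the network."""
    calls.append((path, payload))
    return {"token": "demo-token"} if path == "/authenticate" else {}


client = TuloClient(fake_post)
client.login("user@example.com", "password1")
client.retrain("5d9e1f9d6ecaa9720db58964", "en-US")
client.query("5d9e1f9d6ecaa9720db58964", "en-US", "Can you show my balance?")
client.logout()
print([path for path, _ in calls])
```

To talk to a real deployment, pass a function that sends the payload as JSON to the server and returns the decoded response in place of fake_post.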

For more details, refer to the wiki: https://github.com/usriva2405/tulo-chatbot/wiki