Monitor a Watson Machine Learning Model with Watson OpenScale on Cloud Pak for Data
Note: This code pattern is part of a series that explores telecom call-drop predictions using IBM Cloud Pak for Data, Data Virtualization, Watson OpenScale, and Cognos Analytics. Other assets in the series include:
Businesses today are increasingly certain that AI will be a driving force in the evolution of their industries over the next few years. However, successfully infusing AI into a product or solution is hampered by many factors that challenge its widespread adoption in the business and stand in the way of achieving the expected outcomes. A few are listed below:
What if there is one console that makes it easier for business users to track and measure AI outcomes?
In this code pattern we demonstrate a way to monitor your AI models in an application using Watson OpenScale, illustrated with an example of a telecom call-drop prediction model. After completing the code pattern, you will learn:
Note: This code pattern includes steps to use WML on IBM Cloud as well as WML within your Cloud Pak for Data instance.
Note: This step is only for WML on IBM Cloud users.
Create a new Watson Machine Learning instance on IBM Cloud. Log in to IBM Cloud, or sign up for an IBM Cloud account if you don't have one, by following the on-screen instructions.
Set the location to the Dallas region and hit Create.
Once the instance is created, click on Service Credentials. Click on New Credentials, then click on View Credentials and copy them using the copy icon.
NOTE: Save the credentials. It will be required when running the notebook.
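For reference, the copied credentials for a Watson Machine Learning instance on IBM Cloud look roughly like the following sketch. The exact fields vary by service version, and all values here are placeholders:

```python
# Placeholder WML credentials as copied from Service Credentials >
# View Credentials -- values here are illustrative only.
WML_CREDENTIALS = {
    "apikey": "<your-api-key>",
    "instance_id": "<your-instance-id>",
    "url": "https://us-south.ml.cloud.ibm.com",
}
```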
Once you log in to your Cloud Pak for Data instance, click on the (☰) menu icon in the top left corner of your screen and click Projects.
When you reach the project list, click Create and choose Empty Project. You will be navigated to a new page where you can enter the desired name (for example, Telco_CallDrop). Once you click OK you will go to a new screen; click Create to complete your project creation.
In the created project, go to the Settings tab and click Associate a Deployment Space.
In the created project page, click on the Add to Project button, then click Notebook.
You will be navigated to a new page. Click on the From URL tab.
Enter the name TowerA_Call_Drop and click on the Create button.
Note: Choose the Python 3.6 environment.
You will run cells individually by highlighting each cell and then clicking the Run button at the top of the notebook. While a cell is running, an asterisk ([*]) will show up to the left of the cell. When the cell has finished executing, a sequential number will show up (e.g., [17]).
NOTE: For reference, we have included a completed notebook in the /examples directory of this repo. This version of the notebook includes all the executed steps and outputs. See https://github.com/IBM/icp4d-telco-monitor-models-with-wml-openscale/blob/master/examples/Setup_your_AIOS_Dashboard-Example.jupyter-py36.ipynb
Click the Run icon and install the necessary packages described in the first 3 cells. Then restart the kernel by selecting Kernel > Restart.
Right-click on Download and click Save link as. Save the file with a .csv extension. Note: Ensure it is not saved as a text file.
In section 2.2 Add Dataset, highlight the blank cell by clicking on it. Click the 10/01 (Find and add data) button, then browse to select the downloaded data set.
Select the call_drop_data_train.csv file that you added to the project. Under the Insert to code option, click Insert Pandas DataFrame.
IMPORTANT: Ensure the variable name is set to df_data_1.
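The inserted cell is auto-generated and varies by environment; what matters is that it loads the file into a pandas DataFrame named df_data_1. A rough, local-equivalent sketch:

```python
import pandas as pd

# The auto-generated cell reads the uploaded CSV from project storage;
# locally, the equivalent is a plain read_csv into df_data_1.
df_data_1 = pd.read_csv("call_drop_data_train.csv")
df_data_1.head()
```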
Before running the notebook, configure the following (see the sketch after this list):

- In section 2.1 Global Variables, enter the required values, including the name of an empty schema in your database.
- In section 2.3 Add your WML Credentials, add the credentials you copied earlier when creating the Watson Machine Learning instance. For WML within Cloud Pak for Data, use the Deployment Space associated with the project, as created in Step 2.
- In section 2.4 Update your AIOS Credentials, add the necessary Cloud Pak for Data instance details as instructed in the cell. Replace the <> with the information within the brackets.
- In section 2.5 Add your Db Credentials, add your database credentials and make sure the keys given in the template have values filled in.
- In section 4.3 Setup the Datamart on AI OpenScale, if the Datamart already exists, it will be used. If not, the notebook will create the Datamart, using these credentials to configure it.
- Run the cells up to section 4.5 Create a new Subscription. Before running the remaining cells (again, please keep in mind it is better to run cell by cell), update the variable subscription_id in section 4.6 Perform Initial Payload Logging with the ID you created in the previous cell. You will find this ID in the table above the cell, as instructed in the notebook.
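For orientation, here is a minimal sketch of what these configuration cells might contain. Every name and credential below is a placeholder, not the notebook's exact code:

```python
# Hedged sketch of the notebook's configuration cells.
# Every value below is a placeholder -- replace with your own details.

# 2.1 Global Variables: the schema must already exist and be empty.
MODEL_NAME = "call_drop_model_towerA"        # hypothetical model name
DEPLOYMENT_NAME = "call_drop_deploy_towerA"  # hypothetical deployment name
SCHEMA_NAME = "CALLDROP_SCHEMA"              # empty schema in your database

# 2.4 Update your AIOS Credentials: Cloud Pak for Data instance details.
AIOS_CREDENTIALS = {
    "url": "<cloud-pak-for-data-url>",
    "username": "<username>",
    "password": "<password>",
}

# 2.5 Add your Db Credentials: fill in every key given in the template.
DB_CREDENTIALS = {
    "hostname": "<db-host>",
    "port": "<db-port>",
    "db": "<database-name>",
    "username": "<db-username>",
    "password": "<db-password>",
}

# 4.6 Perform Initial Payload Logging: paste the subscription ID shown
# in the table above that cell.
subscription_id = "<subscription-id-from-section-4.5>"
```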
Open the Watson OpenScale add-on on Cloud Pak for Data. Click the add-ons icon located in the top right corner of the panel to show the list of add-ons, then use the option menu in the Watson OpenScale add-on tile to open it.
You will see the initial setup looking like this:
Included in the data directory of this repo is a file named Telco_training_final.csv. This contains call drop data for multiple cell towers. If you filter the data on the field outgoing_site_id, you can create a file for each cell tower you would like to monitor. Using the same notebook, you can create a new deployment for each model.
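A minimal pandas sketch of that per-tower split (the column name outgoing_site_id comes from the data set; the file paths are assumptions):

```python
import pandas as pd

# Load the combined call-drop data covering all cell towers.
df = pd.read_csv("data/Telco_training_final.csv")

# Write one training CSV per tower so each gets its own model deployment.
for tower_id, tower_df in df.groupby("outgoing_site_id"):
    tower_df.to_csv(f"call_drop_data_train_{tower_id}.csv", index=False)
```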
In the following description, we show a dashboard that consists of call drop prediction models for 3 different towers - A, B and C.
OpenScale is configured so that it can monitor how your models are performing over time. The following screen shot gives one such snapshot:
As we can see, the model for Tower C demonstrates a fairness bias warning of 92%.
Data in this day and age comes from a wide variety of sources. Being able to give the right input to a machine learning model is crucial to a successful machine learning implementation in an application. However, if the underlying data is biased in any form, there is a risk that structural inequalities and unfair biases are not just replicated, but also amplified.
Hence, Watson OpenScale is useful when you need to identify bias in real time (i.e., as payload flows into your machine learning system).
Now that we have identified a bias, let us see its consequence and what exactly is causing it in this example.
Select the desired deployment using options > View Details
. You will see a full dashboard with various charts telling you how your model has been performing over various time intervals.
You may not see the same values as in the screenshots below. Since we have only done an initial setup, you will only see one line in your charts.
Now select _conds in the Fairness menu on the left side. If you hover over the graph, you can select Click to view details. Select a timeline according to your current date.
To interpret the details, have a look at the screenshot below:
Our dataset's _conds field contains the values Clear, Fog, Haze, Partly Cloudy, and Rain. The graph above suggests that the value Fog carries a bias.
As we can see from the legend, outcomes are split into Favorable Outcome and Unfavorable Outcome. The bias value shows that records with the value Fog receive a lower percentage of favorable outcomes than records with the other values. Why does this affect my model? It creates a disparity, and thereby allows your model to become biased towards the unfavorable outcome (in the case of fog).
The best way to explain the consequence of the disparity is with an example: suppose we allow the disparity to creep in. Our model will slowly start believing that the moment the weather condition is foggy, it needs to predict the unfavorable outcome (1), meaning the call will drop. Now, this may not be the case in reality. There may be other conditions along with the fog that are causing call drops. Hence, it is important to keep monitoring your machine learning model in real time.
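To make the disparity concrete, here is a toy calculation in the spirit of OpenScale's fairness score: the favorable-outcome rate for the monitored group (Fog) relative to a reference group. All counts below are made up for illustration:

```python
# Hypothetical counts: "favorable" means the call did NOT drop.
favorable_fog, total_fog = 46, 100    # made-up numbers
favorable_ref, total_ref = 50, 100    # made-up reference group (e.g., Clear)

fog_rate = favorable_fog / total_fog  # 0.46
ref_rate = favorable_ref / total_ref  # 0.50

# 100% means both groups fare equally well; 92% (as on the dashboard
# above) means Fog records see only 92% as many favorable outcomes.
fairness = 100 * fog_rate / ref_rate
print(f"fairness score: {fairness:.0f}%")  # -> 92%
```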
Now click on the View Transactions button shown in the previous screenshot. You will be presented with a list of transactions. A transaction is basically a record of the data that flows into your model. It gives you the logs, and if a bias is identified, it will be flagged as well.
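Under the hood, each transaction comes from payload logging. The following sketch uses the ibm_ai_openscale client that this pattern's notebook is built on; treat the exact method names and the request shape as assumptions that may differ across client versions:

```python
from ibm_ai_openscale import APIClient4ICP
from ibm_ai_openscale.supporting_classes import PayloadRecord

# AIOS_CREDENTIALS as configured in section 2.4 of the notebook.
client = APIClient4ICP(AIOS_CREDENTIALS)
subscription = client.data_mart.subscriptions.get(subscription_id)

# One scoring request/response pair becomes one logged transaction
# (field names below are illustrative, not the full model schema).
request_data = {"fields": ["outgoing_site_id", "_conds"],
                "values": [["towerA", "Fog"]]}
response_data = {"fields": ["prediction"], "values": [[1]]}
subscription.payload_logging.store(
    records=[PayloadRecord(request=request_data, response=response_data)]
)
```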
Come back to the dashboard and explore the Area under ROC section. Again, hover over the graph and select Click to view details.
In this example, we have set up only the Area under ROC as a quality measure, as shown in the image above.
You will receive estimates and the configured quality monitor values, as shown below. The image shows the overall True Positive and False Positive ratio for the monitored model.
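For reference, the notebook enables this quality monitor programmatically. With the same client it looks roughly like the following; the threshold and record count here are illustrative assumptions, not the notebook's exact values:

```python
# Evaluate quality (Area under ROC) once at least 50 feedback records
# have accumulated; alert if the score falls below 0.8. Both numbers
# are illustrative.
subscription.quality_monitoring.enable(threshold=0.8, min_records=50)
```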
This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.