Explainability – Explainable AI

Step 5: Explainability

Explainability needs to be configured in two places while working on AutoML images: during the training phase of the model and while deploying the model. Follow the steps below to configure explainability of the model during the training phase.

The steps shown below are for the Integrated gradients method of explainability, as shown in the following screenshot:

Figure 10.7: Explainability of image classification model

  1. Enable Generate explainable bitmaps.
  2. Set Visualization type to Outlines (Pixels is another option, useful to understand which pixels play an important role in the prediction).
  3. For Color map, select Pink/Green (pink and green are used to highlight areas on the image).
  4. The Clip below and Clip above parameters are used to reduce noise. Enter 70 and 99.9 for Clip below and Clip above respectively.
  5. Select Original under Overlay type (pixels will be highlighted on top of the original image).
  6. Enter 50 for the Number of integral steps (increasing this parameter reduces the approximation error).

Scroll down and follow the steps below to set the parameters for the XRAI method:

Figure 10.8: Explainability of image classification model (XRAI)

  1. Choose the Color map.
  2. The Clip below and Clip above parameters are used to reduce noise. Enter 70 and 99.9 for Clip below and Clip above respectively.
  3. Select Original under Overlay type (pixels will be highlighted on top of the original image).
  4. Enter 50 for the Number of integral steps (increasing this parameter reduces the approximation error).
  5. Click CONTINUE.
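The console settings above (for both the Integrated gradients and XRAI methods) can be captured in code for reference. The sketch below only assembles the same values into a plain Python dictionary; the key names are our own illustration, not the Vertex AI API field names.

```python
def build_explanation_config(step_count: int = 50) -> dict:
    """Collect the console explainability settings into one place."""
    visualization = {
        "color_map": "pink_green",   # pink/green highlights on the image
        "clip_below": 70,            # clip percentiles to reduce noise
        "clip_above": 99.9,
        "overlay_type": "original",  # draw highlights on the original image
    }
    return {
        "integrated_gradients": {
            "step_count": step_count,  # more steps -> lower approximation error
            "visualization": {"type": "outlines", **visualization},
        },
        "xrai": {
            "step_count": step_count,
            "visualization": dict(visualization),
        },
    }

config = build_explanation_config()
print(config["integrated_gradients"]["visualization"]["type"])  # outlines
```

Keeping the values in one function like this makes it easy to reuse the same settings when the model is later deployed.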

Step 6: Compute and pricing

Follow the steps below to configure the budget for model training:

Figure 10.9: Compute and training for image classification model

  1. Set the minimum node hours to 8 (8 is the minimum value for image data).
  2. Click START TRAINING.

It will take a few hours to train the image classification model. Once training completes, predictions, along with explanations for those predictions, can be obtained.
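Once the model is deployed, explanations are requested alongside predictions. The sketch below only prepares the request payload by base64-encoding an image file (the format AutoML image endpoints expect for online prediction); the endpoint path and the final explain() call are assumptions and are left commented out:

```python
import base64

def build_explain_instance(image_path: str) -> dict:
    """Base64-encode an image file into the instance format used for prediction requests."""
    with open(image_path, "rb") as f:
        content = base64.b64encode(f.read()).decode("utf-8")
    return {"content": content}

# Hypothetical usage against a deployed endpoint (requires google-cloud-aiplatform):
# from google.cloud import aiplatform
# endpoint = aiplatform.Endpoint("projects/.../locations/us-central1/endpoints/...")
# response = endpoint.explain(instances=[build_explain_instance("kayak.jpg")])
# print(response.explanations)
```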

Data for Explainable AI exercise – Explainable AI

In this exercise, we will use the AutoML capabilities of Vertex AI to understand how explainable AI can explain model predictions for image data and tabular data. The data used for AutoML tables and images will be reused for this exercise. (Refer to the chapters Introduction to Vertex AI and AutoML Tabular and AutoML Image, text and pre-built models.)

The AutoML_image_data_exai bucket is created under us-central1 (single region).

Three folders containing image data (Cise_ships, Ferry_boat, and Kayak) are uploaded, and a CSV file (class_labels.csv) is created as shown in Chapter 3, AutoML Image, text and pre-built models (refer to Figures 3.1 and 3.2 for CSV creation). heart_2020_train_data.csv, which will be used for AutoML tables, is uploaded to the same bucket. Figure 10.2 shows the data uploaded to cloud storage:

Figure 10.2: Data uploaded to the cloud storage

Model training for image data

The initial steps for dataset creation for the image data are unchanged. Refer to the Image dataset creation section of Chapter 3, AutoML Image, text and pre-built models, and follow the steps below for AutoML model training.

Step 1: Train new model

The newly created dataset will be listed under the Datasets section of Vertex AI. Follow the steps below to initiate model training, as shown in the following screenshot:

Figure 10.3: Image dataset created on Vertex AI

  1. Click on Datasets section of Vertex AI (open the newly created image dataset).
  2. Click TRAIN NEW MODEL.

Step 2: Training method selection

The Training method step is no different from the one described in Chapter 3, AutoML Image, text and pre-built models. Follow these steps to set the training method:

Figure 10.4: Training method selection

  1. Select AutoML.
  2. Select Cloud (we will deploy the model to get predictions and explanations).
  3. Click on CONTINUE.

Step 3: Model details

Follow the steps mentioned below to set the model details:

Figure 10.5: Image classification model details

  1. Select Train new model.
  2. Provide a Name for the model.
  3. Provide Description for the model.
  4. Under Data split select Randomly assigned.
  5. Click CONTINUE.

Step 4: Training options

Follow the steps below to set the training options:

Figure 10.6: Image classification training options

  1. Select the Default training method.
  2. Click CONTINUE (incremental training does not need to be enabled for explainable AI).

Feature attribution methods – Explainable AI

Each feature attribution approach is based on Shapley values, an algorithm from cooperative game theory that assigns credit for a given outcome to each participant in a game. Applied to machine learning models, this means each model feature is treated as if it were a “player” in the game. Vertex Explainable AI assigns a certain amount of credit to each individual feature based on its weight in the overall prediction:

  • Sampled Shapley method: The sampled Shapley technique provides an estimate of the actual Shapley values through sampling. Tabular models created using AutoML use the sampled Shapley approach to determine feature relevance. For these models, which are meta-ensembles of tree and neural network structures, the sampled Shapley method performs quite well.
  • Integrated gradients method: The integrated gradients approach calculates the gradient of the prediction output with respect to the input features along an integral path. The gradients are calculated at different intervals of a scaling parameter, and the size of each interval is computed using the Gaussian quadrature rule. (For image data, imagine this scaling parameter as a “slider” scaling the image’s pixels from black up to their original values.) The gradients are integrated as follows:
    • The integral is approximated using a weighted average of the gradients.
    • The element-wise product of the averaged gradients and the original input is calculated.
  • XRAI method: To discover which regions of an image contribute the most to a particular class prediction, the XRAI method combines the integrated gradients method with some additional steps:
    • Pixel-level attribution: XRAI performs pixel-level attribution for the input image. In this stage, XRAI applies the integrated gradients method with both a black and a white baseline.
    • Oversegmentation: Independently of pixel-level attribution, XRAI oversegments the image into small patches. XRAI uses Felzenszwalb’s graph-based technique to construct the image segments.
    • Region selection: XRAI aggregates the pixel-level attribution within each segment to compute that segment’s attribution density. Using these values, XRAI ranks each segment and orders the segments from most positive to least positive. This identifies which parts of the image contribute most strongly to a particular class prediction, and which parts of the image are most salient.
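The sampled Shapley idea, estimating each feature's credit by averaging its marginal contribution over random feature orderings, can be demonstrated on a toy model. This is a minimal, self-contained sketch in pure Python, not the Vertex AI implementation:

```python
import random

def sampled_shapley(f, x, baseline, samples=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled feature orderings (Monte Carlo)."""
    rng = random.Random(seed)
    n = len(x)
    phi = [0.0] * n
    for _ in range(samples):
        order = list(range(n))
        rng.shuffle(order)              # one random "joining" order
        current = list(baseline)        # all features start at their baseline values
        prev = f(current)
        for i in order:
            current[i] = x[i]           # feature i joins the game
            now = f(current)
            phi[i] += now - prev        # its marginal contribution in this order
            prev = now
    return [p / samples for p in phi]

# Toy model with an interaction term, so order matters
f = lambda v: 3 * v[0] + 2 * v[1] + v[0] * v[1]
phi = sampled_shapley(f, x=[1.0, 1.0], baseline=[0.0, 0.0])
# The estimates sum (exactly, by telescoping) to f(x) - f(baseline) = 6
```

Note the completeness property visible in the last comment: the attributions always account for the full difference between the prediction and the baseline prediction.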

Feature-based explanations are supported for all types of models. For AutoML image models, classification models are supported; for AutoML tabular models, both classification and regression models are supported.
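Similarly, the integrated gradients recipe (average the gradients along a scaling path from a baseline to the input, then take the element-wise product with the input difference) can be sketched numerically. The toy function and finite-difference gradients below are our own illustration, not Vertex AI's implementation:

```python
def integrated_gradients(f, x, baseline, steps=50, h=1e-5):
    """Approximate integrated-gradients attributions for a scalar function f."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps   # midpoint of each interval on the path
        point = [baseline[i] + alpha * (x[i] - baseline[i]) for i in range(n)]
        for i in range(n):          # central-difference gradient at this point
            up = point[:]; up[i] += h
            dn = point[:]; dn[i] -= h
            avg_grad[i] += (f(up) - f(dn)) / (2 * h)
    # element-wise product of averaged gradients and (input - baseline)
    return [(x[i] - baseline[i]) * avg_grad[i] / steps for i in range(n)]

f = lambda v: v[0] ** 2 + 2 * v[1]   # toy model
attr = integrated_gradients(f, x=[3.0, 1.0], baseline=[0.0, 0.0])
# attributions sum to f(x) - f(baseline) = 9 + 2 = 11
```

Increasing `steps` tightens the approximation of the path integral, which is exactly why the console's Number of integral steps parameter reduces approximation error.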

Example-based explanations – Explainable AI

In the case of example-based explanations, Vertex AI uses nearest neighbor search to produce a list of examples (typically taken from the training set) that are most similar to the input. Because users can reasonably expect similar inputs to produce similar predictions, these examples can be used to explore and explain the behavior of the model.

Consider the following scenario: users have a model that analyzes photos to determine whether they depict a bird or an airplane, but the model incorrectly identifies certain birds as planes. To figure out what is going on, we can use example-based explanations to retrieve photos from the training set that are similar to the misclassified ones. Looking at those examples, we see that many of the incorrectly identified birds, and the training examples similar to them, are dark silhouettes, and that most of the dark silhouettes in the training set were airplanes. This suggests that the quality of the model could be improved by including more silhouetted birds in the training set.

Example-based explanations can also help identify ambiguous inputs that might benefit from human labelling. Models that provide an embedding (latent representation) for their inputs are supported. Tree-based models, which do not provide embeddings for their inputs, are not supported for example-based explanations.
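As a rough illustration of the nearest-neighbor idea behind example-based explanations, the sketch below searches toy 2-D “embeddings” by cosine similarity; the vectors and labels are invented for illustration and stand in for the model's learned representations:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_examples(query, train_embeddings, k=2):
    """Return indices of the k training examples most similar to the query."""
    ranked = sorted(range(len(train_embeddings)),
                    key=lambda i: cosine(query, train_embeddings[i]),
                    reverse=True)
    return ranked[:k]

# Toy embeddings: dark silhouettes cluster together regardless of label
train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.15, 0.85]]
labels = ["plane", "plane", "bird", "bird"]
query = [0.85, 0.15]   # a misclassified dark-silhouette bird
print([labels[i] for i in nearest_examples(query, train)])  # ['plane', 'plane']
```

The retrieved neighbors being all planes mirrors the bird/plane scenario above: the silhouetted bird sits in a region of embedding space dominated by plane training examples.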

Feature-based explanations

Feature-based explanations explain model output in terms of the input features. Feature attributions show how much each feature in the model contributed to the prediction made for a particular instance. When users request predictions, they get predicted values appropriate to the model being used; when users request explanations, feature attribution information is provided along with the predictions.

Feature attributions work on image and tabular data and are supported for AutoML and custom-trained models (classification models only for image data; classification and regression models for tabular data).
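When explanation responses come back, the attributions usually need to be ranked for presentation. The sketch below works on a mock payload whose field names and values are our own simplification, not the exact Vertex AI response schema:

```python
def top_attributions(explanation: dict, k: int = 3):
    """Sort feature attributions by absolute contribution, largest first."""
    attrs = explanation["feature_attributions"]
    return sorted(attrs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]

# Mock payload shaped loosely like an explanation for a tabular model;
# the feature names and attribution values here are illustrative only.
mock = {
    "predicted_class": "Yes",
    "feature_attributions": {"age": 0.21, "bmi": 0.05, "smoking": 0.31, "sex": -0.02},
}
for name, value in top_attributions(mock):
    print(f"{name}: {value:+.2f}")
```

Sorting by absolute value matters: a strongly negative attribution is just as informative as a strongly positive one.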

Need of Explainable AI – Explainable AI

Artificial intelligence has the ability to automate judgments, and the outcomes of those decisions may have both beneficial and detrimental effects on businesses. It is essential to understand how AI comes to its conclusions, just as it is essential to understand how a hiring decision is made for the business. A great number of companies are interested in using AI but are hesitant to hand over decision-making authority to the model simply because they do not yet trust it. Explainability is beneficial in this regard, since it offers insights into the decision-making process that models use. Explainable AI is a crucial component in applying ethics to the use of AI in business; it is predicated on the notion that AI-based applications and technology should not be opaque “black box” models that are incomprehensible to regular people. Figure 10.1 shows the difference between AI and Explainable AI:

Figure 10.1: Explainable AI

In the majority of scenarios, developing complex models is far easier than convincing stakeholders that the model is capable of producing decisions superior to those produced by humans. Making a better judgment is not the same thing as achieving higher accuracy scores or a lower RMSE; accurate input data may yield a correct conclusion, but in many cases the person making the decision is the one who needs to comprehend it. For them to feel at ease handing decision-making over to the model, they need to understand how the model came to its conclusions.

Explainable AI is essential to the development of responsible AI because it offers an adequate amount of transparency and responsibility for the choices made by complicated AI systems. This is of utmost importance when it comes to artificial intelligence systems that have a substantial influence on the lives of people.

XAI on Vertex AI

Explainable AI of Vertex AI provides explanations that are either feature-based or example-based in order to give a better understanding of how models make decisions. Anyone who builds or uses machine learning will gain new abilities if they learn how a model behaves and how it is influenced by its training dataset. These new abilities will allow users to improve their models, increase their confidence in their predictions, and understand when and why things work.

What is Explainable AI – Explainable AI

Introduction

This last chapter of the book covers explainable AI. We will start by understanding what explainable AI is, why it is needed, how explainable AI works on Vertex AI (for image and tabular data), and how to get explanations from a deployed model.

Structure

In this chapter, we will discuss the following topics:

  • What is Explainable AI
  • Need of Explainable AI
  • XAI on Vertex AI
  • Data for Explainable AI exercise
  • Model training for image data
  • Image classification model deployment
  • Explanations for image classification
  • Tabular classification model deployment
  • Explanations for tabular data
  • Deletion of resources
  • Limitations of Explainable AI

Objectives

By the end of this chapter, you will have a good idea about explainable AI and will know how to get the explanations from the deployed model in Vertex AI.

What is Explainable AI

Explainable AI (XAI) is a subfield of Artificial Intelligence (AI) that focuses on developing methods and strategies for using AI in a way that makes the outcomes of the solution understandable to human specialists. The mission of XAI is to ensure that AI systems are transparent about not just the function they perform but also the purpose they serve. Interpretability is the broader umbrella under AI, with explainable AI as one of its subcategories. Thanks to a model’s interpretability, users can grasp what the model is learning, the additional information it can provide, and the reasoning behind its judgments about the real-world problem we are seeking to solve.

Explainable AI is one of the core ideas that define trust in AI systems (along with accountability, reproducibility, lack of machine bias, and resiliency). The development of explainable AI is an aim and ambition shared by data scientists and machine learning technologists.

Fetching feature values – Vertex AI Feature Store

Step 12: Fetching feature values
Feature values can be extracted from the feature store with the help of the online serving client created in Step 6 (Connecting to the feature store). Run the following lines of code to fetch data from the feature store for a specific employee ID:
resp_data = client_data.streaming_read_feature_values(
    featurestore_online_service.StreamingReadFeatureValuesRequest(
        entity_type=client_admin.entity_type_path(
            Project_id, location, featurestore_name, Entity_name
        ),
        entity_ids=["65438"],
        feature_selector=FeatureSelector(
            id_matcher=IdMatcher(
                ids=["employee_id", "education", "gender", "no_of_trainings", "age"]
            )
        ),
    )
)
print(resp_data)

The output is stored in the resp_data variable, which is an iterator. Run the following lines of code to extract and parse the data from the iterator:
names_col = []
for resp in resp_data:
    if resp.header.feature_descriptors != "":
        for head in resp.header.feature_descriptors:
            names_col.append(head.id)
    try:
        values = []
        for items in resp.entity_view.data:
            if items.value.string_value != "":
                values.append(items.value.string_value)
            elif items.value.int64_value != "":
                values.append(items.value.int64_value)
    except:
        pass
print("Feature Names", names_col)
print("Feature Values", values)

The output of the cell is shown in Figure 9.21:

Figure 9.21: Data extracted from the feature store
Deleting resources
We have utilized cloud storage to store the data; delete the CSV file from cloud storage manually. The feature store is a cost-incurring resource on GCP, so ensure it is deleted. The feature store cannot be deleted from the web console or GUI; we need to delete it programmatically. Run the lines of code below to delete both feature stores (one was created using the GUI and one using Python). Check the landing page of the feature store after running the code to ensure they are deleted (the landing page should look like Figure 9.4):
client_admin.delete_featurestore(
    request=fs_s.DeleteFeaturestoreRequest(
        name=client_admin.featurestore_path(Project_id, location, featurestore_name),
        force=True,
    )
).result()
featurestore_name = "employee_fs_gui"
client_admin.delete_featurestore(
    request=fs_s.DeleteFeaturestoreRequest(
        name=client_admin.featurestore_path(Project_id, location, featurestore_name),
        force=True,
    )
).result()

Best practices for feature store

Listed below are a few best practices for using the Vertex AI Feature Store:

  1. Model features shared across entities: Some features might be used by multiple entities (like clicks per product at the user level). In this kind of scenario, it is best to create a separate entity type to group shared features.
  2. Access control for multiple teams: Multiple teams, like data scientists, ML researchers, DevOps, and so on, may require access to the same feature store but with different levels of permissions. Resource-level IAM policies can be used to restrict access to the feature store or to a particular entity type.
  3. Ingesting historical data (backfilling): It is recommended to stop online serving while ingesting the historical data to prevent any changes to the online store.
  4. Cost optimization:
    1. Autoscaling: Instead of maintaining a high node count, autoscaling lets the Vertex AI Feature Store analyze traffic patterns and automatically adjust the number of nodes up or down based on CPU utilization, which also works better for cost optimization.
    2. It is recommended to provide a startTime in the batchReadFeatureValues or exportFeatureValues request to optimize offline storage costs during batch serving and batch export.

Conclusion

In this chapter, we learned about the feature store of Vertex AI, and worked on the creation of the feature store, entity type, adding features, and ingesting feature values using web console and Python.

In the next chapter, we will start understanding explainable AI, and how explainable AI works on Vertex AI.

Questions

  1. What are the different input sources from which data can be ingested into a feature store?
  2. Can feature stores have multiple entity types?
  3. What are the scenarios in which using a feature store brings value?

Creation of entity type – Vertex AI Feature Store

Step 8: Creation of entity type
The entity type will be created under the newly created feature store using the create_entity_type method. Run the code below in a new cell to create the entity type. The output of the last line in the code provides the path of the entity type created. Check the feature store landing page; the newly created feature store and entity type will be displayed:
entity_creation = client_admin.create_entity_type(
    fs_s.CreateEntityTypeRequest(
        parent=client_admin.featurestore_path(Project_id, location, featurestore_name),
        entity_type_id=Entity_name,
        entity_type=entity_type.EntityType(
            description="employee entity",
        ),
    )
)
print(entity_creation.result())

Step 9: Creation of feature
Once the feature store and entity type are created, features need to be created before ingesting the feature values. For each feature, the feature ID, type, and description are provided. Run the following code in a new cell to add the features:
client_admin.batch_create_features(
    parent=client_admin.entity_type_path(
        Project_id, location, featurestore_name, Entity_name
    ),
    requests=[
        fs_s.CreateFeatureRequest(
            feature=feature.Feature(
                value_type=feature.Feature.ValueType.INT64,
                description="employee id",
            ),
            feature_id="employee_id",
        ),
        fs_s.CreateFeatureRequest(
            feature=feature.Feature(
                value_type=feature.Feature.ValueType.STRING,
                description="education",
            ),
            feature_id="education",
        ),
        fs_s.CreateFeatureRequest(
            feature=feature.Feature(
                value_type=feature.Feature.ValueType.STRING,
                description="gender",
            ),
            feature_id="gender",
        ),
        fs_s.CreateFeatureRequest(
            feature=feature.Feature(
                value_type=feature.Feature.ValueType.INT64,
                description="no_of_trainings",
            ),
            feature_id="no_of_trainings",
        ),
        fs_s.CreateFeatureRequest(
            feature=feature.Feature(
                value_type=feature.Feature.ValueType.INT64,
                description="age",
            ),
            feature_id="age",
        ),
    ],
).result()

Once the features are created, they are displayed in the output of the cell as shown in Figure 9.19:

Figure 9.19: Addition of features to the feature store using Python

Step 10: Define the ingestion job
As seen in the web console, feature values can be ingested from cloud storage or BigQuery. We shall use the same CSV file that was uploaded to cloud storage. Importantly, we must also supply timestamp information while ingesting the values; timestamps can be provided in the code, or a separate column in the data can contain the timestamp information. Timestamp information must be in google.protobuf.Timestamp format. Run the following code in a new cell to define the ingestion job:
seconds = int(datetime.datetime.now().timestamp())
timestamp_input = Timestamp(seconds=seconds)
ingest_data_csv = fs_s.ImportFeatureValuesRequest(
    entity_type=client_admin.entity_type_path(
        Project_id, location, featurestore_name, Entity_name
    ),
    csv_source=io.CsvSource(
        gcs_source=io.GcsSource(
            uris=["gs://feature_store_input/employee_promotion_data_fs.csv"]
        )
    ),
    entity_id_field="employee_id",
    feature_specs=[
        ImportFeatureValuesRequest.FeatureSpec(id="employee_id"),
        ImportFeatureValuesRequest.FeatureSpec(id="education"),
        ImportFeatureValuesRequest.FeatureSpec(id="gender"),
        ImportFeatureValuesRequest.FeatureSpec(id="no_of_trainings"),
        ImportFeatureValuesRequest.FeatureSpec(id="age"),
    ],
    feature_time=timestamp_input,
    worker_count=1,
)

Note: If all feature values were generated at the same time, there is no need to have a timestamp column. Users can specify the timestamp as part of the ingestion request.
Step 11: Initiation of ingestion job
The ingestion job needs to be initiated after it is defined. Run the following lines of code to begin the ingestion process:
ingest_data = client_admin.import_feature_values(ingest_data_csv)
ingest_data.result()

Once the ingestion is complete, it will provide information on the number of feature values ingested as shown in Figure 9.20:

Figure 9.20: Ingestion of feature values using Python

Working on feature store using Python – Vertex AI Feature Store

In the previous section, we worked on a feature store for the creation and uploading of feature values using the GUI approach. In this section, we shall create another feature store, ingest values, and also fetch the values from the feature store.

We will use a Python 3 notebook file to run the commands for working on the feature store. Follow the steps below to create the notebook file and run the Python code given in this section.

Step 1: Create a Python notebook file
Once the workbench is created, open Jupyterlab and follow the steps mentioned in Figure 9.17 to create a Python notebook file:

Figure 9.17: New launcher window of notebook

  1. Click the new launcher.
  2. Double-click the Python 3 notebook file to create one.

Step 2: Package installation
Run the following commands to install the Google Cloud AI Platform package (it will take a few minutes to install):
USER = "--user"
!pip install {USER} google-cloud-aiplatform

Step 3: Kernel restart
Type the following commands in the next cell to restart the kernel (users can restart the kernel from the GUI as well):
import os
import IPython
if not os.getenv(""):
    IPython.Application.instance().kernel.do_shutdown(True)

Step 4: Importing the installed packages
Run the following-mentioned codes in a new cell to import the required packages:
import google.cloud.aiplatform_v1
from google.cloud.aiplatform_v1.types import featurestore_service as fs_s
from google.cloud.aiplatform_v1.types import featurestore as fs
from google.cloud.aiplatform_v1.types import feature
from google.cloud.aiplatform_v1.types import entity_type
from google.cloud.aiplatform_v1.types import io
from google.protobuf.timestamp_pb2 import Timestamp
from google.cloud.aiplatform_v1.types.featurestore_service import ImportFeatureValuesRequest
from google.cloud.aiplatform_v1.types import FeatureSelector, IdMatcher
from google.cloud.aiplatform_v1.types import featurestore_online_service
import datetime

Step 5: Setting up the project and other variables
Run the following-mentioned line of codes in a new cell to set the project to the current one and also define variables to store the path for multiple purposes:
Project_id = "vertex-ai-gcp-1"
featurestore_name = "employee_fs_pysdk"
Entity_name = "emp_entity_pysdk"
location = "us-central1"
endpoint = "us-central1-aiplatform.googleapis.com"

Step 6: Connecting to the feature store
Connecting to the feature store is the first step. We create a connection through FeaturestoreServiceClient to create the feature store and ingest values into it. FeaturestoreOnlineServingServiceClient is used to fetch feature values from the feature store. Run the lines of code below to complete the connection:
client_admin = google.cloud.aiplatform_v1.FeaturestoreServiceClient(
    client_options={"api_endpoint": endpoint}
)
client_data = google.cloud.aiplatform_v1.FeaturestoreOnlineServingServiceClient(
    client_options={"api_endpoint": endpoint}
)
fs_resource_path = client_admin.common_location_path(Project_id, location)

Step 7: Creation of feature store
Instead of using the feature store created through the GUI approach, we will create a new one. The feature store name, location, and project information were already set in Step 5 (Setting up the project and other variables). Run the following code to create the feature store. The status of the feature store will be displayed in the results, as shown in Figure 9.18. Feature store creation is a long-running, asynchronous operation; other API calls, like updating or removing feature stores, follow the same procedure.
create_fs = client_admin.create_featurestore(
    fs_s.CreateFeaturestoreRequest(
        parent=fs_resource_path,
        featurestore_id=featurestore_name,
        featurestore=fs.Featurestore(
            online_serving_config=fs.Featurestore.OnlineServingConfig(
                fixed_node_count=1
            )
        ),
    )
)
print(create_fs.result())
client_admin.get_featurestore(
    name=client_admin.featurestore_path(Project_id, location, featurestore_name)
)

Figure 9.18: Feature store creation using Python

Entity type created successfully – Vertex AI Feature Store

Step 6: Entity type created successfully

The entity type is created successfully as shown in Figure 9.8 under the selected feature store:

Figure 9.8: Entity type created and listed on the landing page

  1. Click the newly created Entity type.

Step 7: Creation of features

Once the entity type is created, features need to be created before ingesting the values. Follow the steps mentioned in Figure 9.9 to create features:

Figure 9.9: Creation of features

  1. Click ADD FEATURES.

A new side tab will pop out to enter the features, as shown in Figure 9.10. Follow the steps below to create the features:

Figure 9.10: Adding user input for feature creation

  1. Enter the Feature name.
  2. Enter the Value type stored in that feature.
  3. Enter the Description for the feature.
  4. Click Add Another Feature to add new features.
  5. Click SAVE, once all the features are added.

Step 8: Features created successfully

Once the features are created successfully, they are displayed on the entity type page as shown in Figure 9.11:

Figure 9.11: Features listed under the entity type

  1. Newly created features are displayed in tabular format.
  2. Click on Ingest Values to add the feature values.

Step 9: Ingesting feature values

Follow the steps mentioned in Figure 9.12 to initiate the ingestion of feature values:

Figure 9.12: Importing data to features

  1. Data can be ingested from cloud storage or BigQuery. Select Cloud Storage CSV file.
  2. Select the CSV file from the cloud storage by clicking BROWSE.
  3. Click CONTINUE.

After selecting the data source, we need to map the columns of the data source to the features. Follow the steps mentioned in the Figure 9.13 to map the features:

Figure 9.13: Mapping of columns to features

  1. Add employee ID, since that is the column containing unique values.
  2. Select to enter the Timestamp manually. If the data contains timestamp values, that column can be used here.
  3. Select the date and time.
  4. Map the column names in the CSV file to the features.
  5. Click INGEST to initiate the ingestion job.

Step 10: Ingestion job successful

Once the feature values are ingested successfully, the ingestion job status will be updated as shown in Figure 9.14:

Figure 9.14: Ingestion jobs of feature store

  1. The ingestion job is completed successfully.

Step 11: Landing page of feature store after the creation of feature store, entity type, and features

The landing page of the feature store is shown in Figure 9.15; all the features under the entity type and feature store are listed in tabular format:

Figure 9.15: Landing page of feature store after the creation of features

  1. Click the age feature. The window will navigate to the properties of the feature as shown in Figure 9.16:

Figure 9.16: Properties of feature

  1. For each feature, Feature Properties consisting of basic information and statistics are displayed.
  2. Metrics are populated if monitoring is enabled for the feature store and for that particular feature.