Keras is an open-source neural network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit or Theano. In this particular project, I'll use it with R, since Keras has been available in R for quite some time. Besides the keras package, I'll incorporate the lime package, which lets the user pry open black-box machine learning models and explain their outcomes on a per-observation basis. LIME can also be used with mlr, caret, h2o and xgboost, which are among the most popular packages for supervised machine learning out there.

The goal of this project is to see whether the vgg16 model is capable of recognizing my friend's dog Oscar. Vgg16 is a pretrained model that ships with Keras as one of its default applications.

The first thing to do is load these three libraries.

library(keras)   # model definition and prediction
library(lime)    # model explanations
library(magick)  # image reading and plotting

Afterwards, load the vgg16 application. The vgg16 model is an image classification model that was built as part of the ImageNet competition, where the goal is to classify pictures into 1,000 categories with the highest accuracy. As we can see, it is fairly complicated.

model <- application_vgg16(
  weights = "imagenet",
  include_top = TRUE
)
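
To get a sense of how deep the network actually is, you can print a layer-by-layer overview; this is standard keras usage rather than part of the original snippet:

summary(model)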

In order to create an explainer we will need to pass in the training data as well. For image data the training data is really only used to tell lime that we are dealing with an image model, so any image will suffice. In this particular case, we will see whether vgg16 recognizes Oscar the dog.

img <- image_read('cucek.jpg')                 # read the photo of Oscar
img_path <- file.path(tempdir(), 'cucek.jpg')  # write a copy to a temporary path
image_write(img, img_path)
plot(as.raster(img))                           # display the image

As with text models, the explainer will need to know how to prepare the input data for the model. For keras models this means formatting the image data as tensors. Thankfully, keras comes with a lot of tools for reshaping image data:

image_prep <- function(x) {
  arrays <- lapply(x, function(path) {
    img <- image_load(path, target_size = c(224, 224))  # resize to the vgg16 input size
    x <- image_to_array(img)                             # convert to a 224 x 224 x 3 array
    x <- array_reshape(x, c(1, dim(x)))                  # add the batch dimension
    x <- imagenet_preprocess_input(x)                    # apply ImageNet preprocessing
  })
  do.call(abind::abind, c(arrays, list(along = 1)))      # stack all images into one tensor
}
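
As a quick sanity check (my own addition, not part of the original write-up), the prepared tensor should have the shape vgg16 expects: one image of 224 x 224 pixels with 3 colour channels.

dim(image_prep(img_path))
# expected to print: 1 224 224 3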

We now have everything we need to build an explainer for understanding how the vgg16 neural network makes its predictions. Before we go along, let's see what the model thinks of our Oscar:

res <- predict(model, image_prep(img_path))

imagenet_decode_predictions(res)

  class_name class_description      score
1  n02094114   Norfolk_terrier 0.21228586
2  n02098413             Lhasa 0.18369357
3  n02094433 Yorkshire_terrier 0.13259803
4  n02102318    cocker_spaniel 0.12664761
5  n02097474   Tibetan_terrier 0.08206796

It seems that the model thinks there is around a 21% probability that the image contains a Norfolk_terrier. After a bit of research on the internet, I wanted to see what a Lhasa is; it turns out it is better known by its full name, Lhasa Apso, a non-sporting dog breed that originates from Tibet.

The next step is to implement lime and sort out the class labels of the model.

We are used to classifiers knowing their class labels, but this is not the case for keras. Motivated by this, LIME now has a way to define/overwrite the class labels of a model, using the as_classifier() function. Let's build our explainer:

model_labels <- readRDS(system.file('extdata', 'imagenet_labels.rds', package = 'lime'))

explainer <- lime(img_path, as_classifier(model, model_labels), image_prep)

LIME comes with a function to assess the superpixel segmentation before beginning the explanation, and it is recommended to play with it a bit; with time you'll likely get a feel for the right values:

plot_superpixels(img_path)
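
The segmentation can be tuned through the n_superpixels and weight arguments; the values below are just an example of a finer segmentation, not settings taken from the original text:

plot_superpixels(img_path, n_superpixels = 200, weight = 40)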

The output of an image explanation is a data frame of the same format as that from tabular and text data. Each feature will be a superpixel, and the pixel range of the superpixel will be used as its description. Usually the explanation will only make sense in the context of the image itself, so the new version of LIME also comes with a plot_image_explanation() function to do just that.
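
The explanation object itself is computed with lime's explain() function; the settings below (two labels and up to 20 superpixel features) are illustrative choices on my part rather than values dictated by the package:

explanation <- explain(img_path, explainer, n_labels = 2, n_features = 20)

Let's see what our explanation has to tell us: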

plot_image_explanation(explanation)

We can see that, for both of the top predicted classes, the model focuses on the dog: in the first batch it recognized the face of the dog, while in the second one the floor was picked up as well. The fit is fairly large, and I believe a more stable result could be achieved by setting the number of permutations higher than the default of 10 per batch.
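
If you want to experiment with that, the number of permutations can be passed to explain(); the sketch below assumes the image method accepts n_permutations in the same way as the tabular one, and the value 2000 is only an example:

explanation <- explain(img_path, explainer, n_labels = 2, n_features = 20, n_permutations = 2000)

Overall, image explanations do take time, and they should primarily be used to better understand the model and potentially to improve image classification.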

 
