A notebook about the results of Deep Learning with real world data
Author
Jürgen Kanz
Title
A notebook about the results of Deep Learning with real world data
Description
A notebook about the results of Deep Learning with real world data
Category
Working Material
Keywords
Deep Learning, Neural Networks, Classification
URL
http://www.notebookarchive.org/2019-01-6genez1/
DOI
https://notebookarchive.org/2019-01-6genez1
Date Added
2019-01-14
Date Last Modified
2019-01-14
File Size
3.21 megabytes
Supplements
Rights
Redistribution rights reserved

In[]:=
Clear["`Global`*"]
A notebook about the results of Deep Learning with real world data
Author: Jürgen Kanz
In[]:=
Hyperlink["www.juergen-kanz.de","http://www.juergen-kanz.de"]
Version 1.0 - January 14th, 2019
Introduction
I have been working with Wolfram Mathematica 11.3.0 for a few months, so I am still a beginner as a programmer. The objective of this notebook is to show the application of different neural networks to real world data. That sounds very ambitious, but it is not. The real world data here is an image of me and my dog “Kira”. We want to see what results the algorithms produce, with one question in mind: can we trust the results without critical thinking?
I comment on every single result, whether it meets my expectations or not.
Several neural networks are used to fulfill the required tasks. Wolfram has done a great job of developing and collecting these networks and making them available to Mathematica users. Thank you, Stephen Wolfram.
Parts of the code are taken from the notebook “Wolfram Neural Net Repository” by Meghan Rieu-Werden and Matteo Salvarezza, Wolfram Research, Inc.
The following hyperlink takes interested readers to the “Wolfram Neural Net Repository”.
In[]:=
Hyperlink["WOLFRAM NEURAL NET REPOSITORY","http://resources.wolframcloud.com/NeuralNetRepository/"]
Initialization
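The Object Detection section below relies on two helper functions, netEvaluator and highlightDetection, which follow the example code from the “Wolfram Neural Net Repository” credited in the introduction. As a rough illustration of the kind of post-processing such helpers perform, here is a sketch (not the original code) of an intersection-over-union measure and a simple non-maximum suppression; the box format {{xmin, ymin}, {xmax, ymax}} and the {box, score} detection format are assumptions of this sketch.
In[]:=
(* Illustrative sketch only: boxes are corner pairs {{xmin, ymin}, {xmax, ymax}}, detections are {box, score} pairs. *)
iou[{{x1_, y1_}, {x2_, y2_}}, {{u1_, v1_}, {u2_, v2_}}] := Module[{iw, ih, inter},
  iw = Max[0, Min[x2, u2] - Max[x1, u1]]; (* overlap width *)
  ih = Max[0, Min[y2, v2] - Max[y1, v1]]; (* overlap height *)
  inter = iw ih;
  inter/((x2 - x1) (y2 - y1) + (u2 - u1) (v2 - v1) - inter)];
nonMaxSuppression[detections_List, overlapThreshold_ : 0.45] := Module[
  {sorted = ReverseSortBy[detections, Last], kept = {}},
  Do[
   If[AllTrue[kept, iou[First[d], First[#]] < overlapThreshold &],
    AppendTo[kept, d]],
   {d, sorted}];
  kept]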
Input: Real world data
The input is an image of me and my dog “Kira”.
In[]:=
img = ...; (* the embedded photo of the author and his dog Kira goes here *)
Object Detection
Feature extraction
In[]:=
yolo=NetModel["YOLO V2 Trained on MS-COCO Data"];NetExtract[yolo,"Conv"]
Classification
In[]:=
NetExtract[yolo,"Region"]
Bounding box prediction
In[]:=
NetInformation[NetExtract[yolo,"BoxTransformation"],"SummaryGraphic"]
Net prediction
In[]:=
res=yolo[img];Dimensions/@res
Final result
In[]:=
detection=netEvaluator[yolo][img]
Combine the detection result and the input image:
In[]:=
highlightDetection[img,detection]
Result:
Good, the algorithm detects two elements - person and dog - in the image.
General-purpose object classification:
In[]:=
net=NetModel["Wolfram ImageIdentify Net for WL 11.1"]
Identify the main object in the image:
In[]:=
net[img]
Obtain the top 5 guesses:
In[]:=
net[img,{"TopProbabilities",5}]
Result:
The task for the algorithm here is to identify the main object in the image. No surprise, it is my dog. The breed is classified as an “Alaskan malamute”. Well, that is not correct: “Kira” is a mongrel; her father is a German Shepherd and her mother a Collie.
I have to admit that Kira looks very similar to an Alaskan malamute, as you can see in the picture above.
Object subitizing:
In[]:=
net=NetModel["Inception V1 Trained on Extended Salient Object Subitizing Data"]
Count the number of prominent items in an image:
In[]:=
net[img]
Obtain all the probabilities:
In[]:=
net[img,"Probabilities"]
Result:
Interestingly, the network counts only one prominent item, and it is not clear whether the dog or the person is meant.
Scene recognition:
In[]:=
net=NetModel["Inception V1 Trained on Places365 Data"]
In[]:=
net[img]
Obtain the top 10 guesses:
In[]:=
net[img,{"TopProbabilities",10}]
Result:
The neural network has to identify the background and surroundings. It tells us it is a Field Road. In reality it is more of a Forest Path or a Forest Road, although honestly I do not know the difference between a Forest Path and a Forest Road.
So the result is not as expected, but acceptable. Perhaps additional training images could improve the outcome.
Geolocation:
In[]:=
net=NetModel["ResNet-101 Trained on YFCC100m Geotagged Data"]
Determine the geolocation of a photograph:
In[]:=
net[img]
Show the position on the map:
In[]:=
GeoGraphics[%]
Obtain the top 50 guesses:
In[]:=
topGuesses=net[img,{"TopProbabilities",50}]
Plot these locations on the world map, with the size of the location marker proportional to the probability:
In[]:=
GeoBubbleChart[topGuesses, ChartStyle -> Opacity[0.4], ImageSize -> Large]
Result:
This is a great feature. The algorithm tries to identify the geographical position where the image was taken. Unfortunately, it does not work well in this case. The picture was taken here:
In[]:=
GeoGraphics[GeoPosition[{52.263417,9.452044}]]
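To quantify how far off the prediction is, one can measure the distance between the net's top guess and the actual location. This is a small sketch using GeoDistance with the coordinates shown above; at this point net is still the geolocation model.
In[]:=
(* Sketch: distance between the predicted position net[img] (a GeoPosition) and the actual location of the photo. *)
GeoDistance[net[img], GeoPosition[{52.263417, 9.452044}]]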
Facial Analysis
In[]:=
faces=ImageTrim[img,#]&/@FindFaces[img]
Age estimation:
In[]:=
net=NetModel["Age Estimation VGG-16 Trained on IMDB-WIKI Data"]
Estimate the age:
In[]:=
net[faces]
Get the top 10 guesses:
In[]:=
net[faces,{"TopProbabilities",10}]
Plot the probability distribution over possible ages:
In[]:=
ListPlot[net[faces, "Probabilities"], PlotStyle -> {Red}, Filling -> Axis]
Result:
Good, my face has been detected correctly, but the estimated age is not accurate enough. I was 53 years old when the image was taken. I assume more training examples could improve the network's results.
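As a complement to the single best guess, the probability distribution plotted above can be summarized as a probability-weighted mean age. The sketch below assumes that the "Probabilities" output is one association from age to probability per detected face, as used in the ListPlot above.
In[]:=
(* Sketch: probability-weighted mean age for the first detected face. *)
ageProbs = First[net[faces, "Probabilities"]];
meanAge = Total[Keys[ageProbs] Values[ageProbs]]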
Gender prediction:
In[]:=
net=NetModel["Gender Prediction VGG-16 Trained on IMDB-WIKI Data"]
Guess the gender:
In[]:=
net[faces]
Obtain the probabilities:
In[]:=
net[faces,"Probabilities"]
Result:
Yes, I am a man.
Yes, I am a man.
Facial keypoints:
In[]:=
net=NetModel["Vanilla CNN for Facial Landmark Regression"]
Get the facial keypoints (eyes, nose, mouth):
In[]:=
keypoints=net[faces]
Visualize the predictions:
In[]:=
colorCodes = <|"LeftEye" -> ..., "RightEye" -> ..., "Nose" -> ..., "LeftMouth" -> ..., "RightMouth" -> ...|>; (* each keypoint name maps to the color swatch embedded in the original notebook *)
In[]:=
MapThread[HighlightImage[#1, {PointSize[0.04], Riffle[Values@colorCodes, #2]}, DataRange -> {{0, 1}, {0, 1}}] &, {faces, keypoints}]
Result:
Good, the facial keypoints are correctly placed in the image.
3D facial model:
In[]:=
net=NetModel["Unguided Volumetric Regression Net for 3D Face Reconstruction"]
Get the 3D volume:
In[]:=
volume = net[...]; (* the argument is the face image embedded in the original notebook *)
In[]:=
Dimensions[volume]
Visualize the model:
In[]:=
image3D = Image3D[volume, ViewPoint -> Below, BoxRatios -> {1, 1, 0.5}, ImageSize -> Medium]
In[]:=
ImageMesh[image3D, Method -> "DualMarchingCubes", ViewPoint -> Below, BoxRatios -> {1, 1, 0.5}]
Result:
Cool, the next step will be a 3D print of my face.
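If that 3D print is ever attempted, the mesh produced by ImageMesh can be exported directly to an STL file; the file name below is only an example.
In[]:=
(* Sketch: write the reconstructed face mesh out for 3D printing; "face.stl" is an illustrative file name. *)
faceMesh = ImageMesh[image3D, Method -> "DualMarchingCubes"];
Export["face.stl", faceMesh]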
NotablePerson
In[]:=
photos = {...}; (* the five embedded portrait photos from the original notebook *)
Classify["NotablePerson", photos]
In[]:=
who=Classify["NotablePerson",photos,{"TopProbabilities",3}];Row[{photos[[#]],Column[who[[#]]]}]&/@{1,5}
Result:
Okay, I am not a notable person. It is interesting that this classifier believes I am Winston Churchill. That is of course wrong, but apparently Winston bears some resemblance to me, or vice versa.
Summary:
I am surprised by the level of accuracy that Mathematica and the accompanying neural networks already provide today. Of course, the results are not perfect. Hence, critical thinking about the results is still necessary. Here the input belongs to my own domain: I have knowledge about the image. Another person looking at the results does not have my knowledge, so misinterpretations or simply wrong decisions cannot be avoided. This can be a danger in other application areas.
Cite this as: Jürgen Kanz, "A notebook about the results of Deep Learning with real world data" from the Notebook Archive (2018), https://notebookarchive.org/2019-01-6genez1