Wednesday, September 19, 2018

SAP Leonardo Machine Learning APIs on the Go


Working for the d-shop, first in Silicon Valley and now in Toronto, allows me to use my creativity and grab any new gadget that hits the market.

This time, it was the Oculus Go’s turn 😉 and what’s the Oculus Go? Well, it is a standalone VR headset, which basically means…no tangled cables 😉

For this project I had the chance to work with either Unity or Unreal Engine…I had used Unity many times to develop Oculus Rift and Microsoft HoloLens applications…so I thought Unreal Engine would be a better choice this time…even though I had never used it in a big project before…especially because nothing beats Unreal when it comes to graphics…

With Unreal chosen…I needed to make another decision…C++ or Blueprints…well…while I have used C++ in the past for a couple of Cinder applications…Blueprints looked better as I wanted to develop faster and without too many complications…and well…that’s only half the truth…sometimes Blueprints can become really messy 😊

Just so you know, I used Unreal Engine 4.20.2 and created a Blueprints application.



From the beginning I knew that I wanted to use the SAP Leonardo Machine Learning APIs…as I had used them before for my blog “Cozmo, read to me”, where I used a Cozmo robot, OpenCV and SAP Leonardo’s OCR API to read a whiteboard with a handwritten message and have Cozmo read it out loud.

The idea

This time, I wanted to showcase more than just one API…so I needed to choose which ones…luckily that wasn’t really hard…most APIs are more “Enterprise” oriented…so that left me with “Image Classification, OCR and Language Translation”…

With all that decided…I still needed to figure out how to use those APIs…I mean…the Oculus Go is Virtual Reality…so no chance of looking at something, taking a picture and sending it to the API…

So, I thought…why don’t I use Blender (which is an Open-Source 3D computer graphics software toolset) and make some models…then I can render those models…take a picture and send it to the API…and having models means…I could turn them into “.fbx” files and load them into Unreal for a nicer experience…

With the OCR and Language Translation APIs…it was different…as I needed images with text…so I decided to use Inkscape (which is an open-source vector graphics editor).

The implementation

When I first started working on the project…I knew I needed to go step by step…so I first did a Windows version of the app…then ported it to Android (which was pretty easy BTW) and finally ported it to the Oculus Go (which was kind of painful…)

So, sadly I’m not going to be able to put any source code here…simply because I used Blueprints…and I’m not sure you would like to reproduce them by hand ☹ You will see what I mean later in this blog…

Anyway…let’s keep going 😊

When I thought about this project, the first thing that came into my mind was…I want to have a d-shop room…with some desks…a sign for each API…some lights would be nice as well…



So, it doesn’t look that bad, huh?

Next, I wanted to work on the “Image Classification” API…I wanted that scene to be fairly similar…but with only one desk in the middle…which later turned into a pedestal…with the 3D objects rotating on top of it…then there should be a space ready to show the results coming back from the API…also…arrows to let the user change the 3D model…and a house icon to allow the user to go back to the “Showfloor”…




You will notice two things right away…first…what is that ball supposed to be? Well…that’s just a placeholder that will be replaced by the 3D models 😊 Also…you can see a black poster that says “SAP Leonardo Output”…that’s hidden and only becomes visible when we launch the application…

For the “Optical Character Recognition” and “Language Translation” scenes…it’s pretty much the same although the last one doesn't have arrows 😊





The problems

So that’s pretty much how the scenes are related…but of course…I hit the first issue fast…how to call the APIs using Blueprints? I looked online and most of the plugins are paid ones…but luckily I found a free one that really surprised me…UnrealJSONQuery works like a charm and is not that hard to use…but of course…I needed to change a couple of things in the source code (like adding the header for the API key and changing the parameter used to upload files). Then I simply recompiled it and voila! I got JSON in my application 😉

But you want to know what I changed, right? Sure thing 😊 I simply unzipped the file, went to JSONQuery --> Source --> JSONQuery --> Private and opened JsonFieldData.cpp

Here I added a new header with (“APIKey”, “MySAPLeonardoAPIKey”) and then I looked for PostRequestWithFile and changed the “file” parameter to “files”…

To compile the source code, I simply created a new C++ project, then a “plugins” folder in the root folder of my project and put everything from the downloaded folder inside…opened the project…let it compile and then re-created everything from my previous project…once that was done…everything started to work perfectly…

So, let’s see part of the Blueprint used to call the API…




Basically, we need to create the JSON, call the API and then read the result and extract the information.
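
In case you want to see what that Blueprint is actually doing on the wire…here’s a rough Python sketch of the equivalent HTTP call (the same pattern I use in the “Cozmo, read to me” post further down)…a multipart POST with the “APIKey” header and a “files” parameter…I’m showing the OCR endpoint here, and the Image Classification and Language Translation APIs follow the same pattern with their own endpoints and payloads…the file name is just an example…

# Rough sketch only...this is not the Blueprint itself, just the request it performs.
import requests

url = "https://sandbox.api.sap.com/ml/ocr/ocr"   # the other APIs have their own endpoints
headers = {"APIKey": "MySAPLeonardoAPIKey", "Accept": "application/json"}
files = {"files": open("render.png", "rb")}      # "files", not "file"...the same change I made in the plugin

response = requests.post(url, files=files, headers=headers)
print(response.json()["predictions"][0])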

Everything was going fine and dandy…until I realized that I needed to package the images generated by Blender…I had no idea how to do it…so luckily…the Victory Plugin came to the rescue 😉 Victory has some nodes that allow you to read directories from inside the packaged application…so I was all set 😊

This is how the Victory plugin looks when used in a Blueprint…




The Models

For the 3D models, as I said…I used Blender…I modeled them using “Cycles Render”, baked the materials and then rendered the images using “Blender Render” to be able to generate the .fbx files…
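
If you ever want to script that last export step instead of clicking through the UI…a one-liner from Blender’s Python console does it…take this as an illustration only (I did everything by hand, and the file name is just an example)…

# Illustration only...export the selected, already-baked model to FBX using Blender's bpy API.
import bpy

bpy.ops.export_scene.fbx(filepath="apples.fbx", use_selection=True)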





If the apples look kind of metallic or wax-like…blame my poor lighting skills ☹

When loaded into Unreal…the models look really nice…


Now…I know you want to see how a full Blueprint screen looks…this one is for the 3D models on the Image Classification scene…


Complicated? Well...kind of…usually Blueprints are like that…but they are pretty powerful…

Here’s another one…this time for the “Right Arrow” which allows us to change models…


Looks weird…but works just fine 😉



You may notice that both “Image Classification” and “OCR” have Right and Left arrows…so I needed to reuse some variables, and they needed to be shared between Blueprints…so…for that I created a “Game Instance” where I simply created a bunch of public variables that could then be shared and updated.

If you’re wondering what I used Inkscape for…well…I wanted to have a kind of neon sign image and a handwritten image…



From Android to Oculus Go

You may wonder…why would anything change from Android to the Oculus Go? Aren’t they both Android based? Well…yes…but still…thanks to personal experience…I know that things change a lot…

First…on Android…I created the scenes…and everything was fine…on the Oculus Go…no new scenes were loaded…when I clicked on a sign…the first level loaded itself… ☹ Why? Because I needed to include them in the list of scenes to be packaged…

And the funny thing is that the default projects folder for Unreal is “Documents”…so when I tried to add the scenes it complained because the path was too long…so I needed to clone the project and move it to a folder on C:\

Also…when switching from Windows to Android…it was as simple as changing the “Click” to “Touch”…but for the Oculus Go…well…I needed to create a “Pawn”…where I put a camera, a motion controller, and a pointer (acting like a laser pointer)…here I switched the “Touch” for a “Motion Controller Thumbstick”…and then from there I needed to control all the navigation details…very tricky…

Another thing that changed completely was the “SAP Leonardo Output”…let’s see how that looked on Android…



Here you can see that I used a “HUD”…so wherever you look…the HUD will go with you…

On the Oculus Go…this didn’t happen at all…first I needed to put a black image as a background…

Then I needed to create an actor and put the HUD inside…turning it into a 3D HUD…




The final product

When everything was done…I simply packaged my app and loaded it onto the Oculus Go…and by using Vysor I was able to record a simple session so you can see how this looks in real life 😉 Of course…the downside (because first…I’m too lazy to keep figuring things out and second because it’s too much hassle) is that you need to run this from the “Unknown Sources” section on the Oculus Go…but…it’s there and working and that’s all that matters 😉

Here’s the video so you can fully grasp what this application is all about 😊





I hope you like it 😉

Greetings,

Blag.
SAP Labs Network.




Monday, July 30, 2018

Cozmo, read to me


Do you know Cozmo? The friendly robot from Anki? Well...here he is...

Cozmo is a programmable robot that has many features...and one of those is a camera...so you can have Cozmo take a picture of something...and then do something with that picture...

To code for Cozmo you need to use Python...actually...Python 3 -;)

For this blog, we're going to need a couple of things...so let's install them...

pip3 install 'cozmo[camera]'

This will install the Cozmo SDK...and you will need to install the Cozmo app on your phone as well...

If you have the SDK installed already, you may want to upgrade it because if you don't have the latest version it might not work...

pip3 install --upgrade cozmo

Now, we need a couple of extra things...

sudo apt-get install python-pygame
pip3 install pillow
pip3 install numpy

pygame is a game development framework.
pillow is a wrapper around the PIL library and it's used to manage images.
numpy allows us to work with numerical arrays in Python...which is how OpenCV represents images.

That was the easy part...as now we need to install OpenCV...which allows us to manipulate images and video...

This one is a little bit tricky, so if you get stuck...search on Google or just drop me a message...

First, make sure that OpenCV is not installed by removing it...unless you are sure it's working properly for you...

sudo apt-get remove opencv

Then, install the following prerequisites...

sudo apt-get install build-essential cmake pkg-config yasm python-numpy

sudo apt-get install libjpeg-dev libjpeg8-dev libtiff5-dev libjasper-dev libpng12-dev

sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libdc1394-22-dev

sudo apt-get install libxvidcore-dev libx264-dev libxine-dev libfaac-dev

sudo apt-get install libgtk-3-dev libtbb-dev libqt4-dev libmp3lame-dev

sudo apt-get install libatlas-base-dev gfortran

sudo apt-get install libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libxvidcore-dev x264 v4l-utils

If by any chance, something is not available on your system, simply remove it from the list and try again...unless you're like me and want to spend hours trying to get everything...

Now, we need to download the OpenCV source code so we can build it...from the source...

wget -O opencv-3.4.0.zip https://github.com/opencv/opencv/archive/3.4.0.zip
unzip opencv-3.4.0.zip   # This should produce the folder opencv-3.4.0

Then, we need to download the contributions because there are some things not bundled in OpenCV by default...and you might need them for any other project...

wget -O opencv_contrib-3.4.0.zip https://github.com/opencv/opencv_contrib/archive/3.4.0.zip
unzip opencv_contrib-3.4.0.zip   # This should produce the folder opencv_contrib-3.4.0

As we have both folders, we can start compiling...

cd opencv-3.4.0
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D INSTALL_PYTHON_EXAMPLES=OFF \
-D CMAKE_CXX_COMPILER=/usr/bin/g++ \
-D INSTALL_C_EXAMPLES=OFF \
-D OPENCV_EXTRA_MODULES_PATH=/YourPath/opencv_contrib-3.4.0/modules \
-D PYTHON_EXECUTABLE=/usr/bin/python3.6 \
-D WITH_FFMPEG=OFF \
-D BUILD_OPENCV_APPS=OFF \
-D BUILD_OPENCD_TS=OFF \
-D WITH_LIBV4L=OFF \
-D WITH_CUDA=OFF \
-D WITH_V4L=ON \
-D WITH_QT=ON \
-D WITH_LAPACK=OFF \
-D WITH_OPENCV_BIOINSPIRED=OFF \
-D WITH_XFEATURES2D=ON \
-D WITH_OPENCL=OFF \
-D WITH_FACE=ON \
-D ENABLE_PRECOMPILED_HEADERS=ON \
-D WITH_OPENCL=OFF \
-D WITH_OPENCL_SVM=OFF \
-D WITH_OPENCLAMDFFT=OFF \
-D WITH_OPENCLAMDBLAS=OFF \
-D WITH_OPENCV_DNN=OFF \
-D BUILD_OPENCV_APPS=ON \
-D BUILD_EXAMPLES=OFF ..

Pay extra attention here, as you need to pass the correct path to your opencv_contrib folder...so it's better to pass the full path to avoid making mistakes...

And yes...that's a pretty long command for a build...and it took me a long time to make it work...as you need to figure out all the parameters...

Once we're done, we need to run make...as cmake only prepares the recipe...

make -j2

If there's any mistake, simply do this...

make clean
make

Then, we can finally install OpenCV by doing this...

sudo make install
sudo ldconfig

To test that it's working properly...simply do this...

python3
>>> import cv2


If you don't have any errors...then we're good to go -;)

That was quite a lot of work...anyway...we need an extra tool to make sure our image gets nicely processed...

Download textcleaner and put it in the same folder as your Python script...

And...just in case you're wondering...yes...we're going to have Cozmo take a picture...we're going to process it...use SAP Leonardo's OCR API and then have Cozmo read it back to us...cool, huh?
SAP Leonardo's OCR API is still on version 2Alpha1...but regardless of that...it works amazingly well -;)

Although keep in mind that if the result is not always accurate, that's because of the lighting, the position of the image, your handwriting and the fact that the OCR API is still in Alpha...

Ok...so first things first...we need a white board...


And yes...my handwriting is far from being good... -:(

Now, let's jump into the source code...


CozmoOCR.py
import cozmo
from cozmo.util import degrees
import PIL
import cv2
import numpy as np
import os
import requests
import json
import re
import time
import pygame
import _thread

def input_thread(L):
    input()
    L.append(None)

def process_image(image_name):
    image = cv2.imread(image_name)

    img = cv2.resize(image, (600, 600))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    blur = cv2.GaussianBlur(img, (5, 5), 0)
    denoise = cv2.fastNlMeansDenoising(blur)
    thresh = cv2.adaptiveThreshold(denoise, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    blur1 = cv2.GaussianBlur(thresh, (5, 5), 0)
    dst = cv2.GaussianBlur(blur1, (5, 5), 0)

    cv2.imwrite('imggray.png', dst)

    cmd = './textcleaner -g -e normalize -o 12 -t 5 -u imggray.png out.png'

    os.system(cmd)

def ocr():
    url = "https://sandbox.api.sap.com/ml/ocr/ocr"

    img_path = "out.png"

    files = {'files': open(img_path, 'rb')}

    headers = {
        'APIKey': "APIKey",
        'Accept': "application/json",
    }

    response = requests.post(url, files=files, headers=headers)

    json_response = json.loads(response.text)
    json_text = json_response['predictions'][0]
    json_text = re.sub('\n', ' ', json_text)
    json_text = re.sub('3', 'z', json_text)
    json_text = re.sub('0|O', 'o', json_text)
    return json_text

def cozmo_program(robot: cozmo.robot.Robot):
    robot.camera.color_image_enabled = False
    L = []
    _thread.start_new_thread(input_thread, (L,))
    robot.set_head_angle(degrees(20.0)).wait_for_completed()
    while True:
        if L:
            filename = "Message" + ".png"
            pic_filename = filename
            latest_image = robot.world.latest_image.raw_image
            latest_image.convert('L').save(pic_filename)
            robot.say_text("Picture taken!").wait_for_completed()
            process_image(filename)
            message = ocr()
            print(message)
            robot.say_text(message, use_cozmo_voice=True,
                           duration_scalar=0.5).wait_for_completed()
            break

pygame.init()
cozmo.run_program(cozmo_program, use_viewer=True, force_viewer_on_top=True)


Let's analyze the code a little bit...

We're going to use threads, as we need to have a window (powered by Pygame) where we can see what Cozmo is looking at, and another thread waiting for us to press "Enter" as the command to have Cozmo take a picture.

Basically, when we run the application, Cozmo will move his head and get into picture mode...then, if we press "Enter" (on the terminal screen) it will take a picture and send it to our OpenCV processing function.

This function will simply grab the image, scale it, make it grayscale and apply a GaussianBlur to blur the image, removing noise and reducing detail. Then we're going to apply denoising to get rid of dust and fireflies...apply a threshold to separate the white and black pixels, and apply a couple more blurs...

Finally, we're going to call textcleaner to further remove noise and make the image cleaner...

So, here is the original picture taken by Cozmo...


This is the picture after our OpenCV post-processing...


And finally, this is our image after using textcleaner...

Finally, once we have the image the way we wanted, we can call the OCR API which is pretty straightforward...

To get the API Key, simply go to https://api.sap.com/api/ocr_api/overview and log in...

Once we have the response back from the API, we can do some Regular Expression cleanup just to fix characters that tend to get wrongly recognized...

Finally, we can have Cozmo read the message out loud -;) And just for demonstration purposes...


Here, I was lucky enough that the lighting and everything was perfectly set up...so it was a pretty clean response...further tests were pretty bad -:( But again...it's important to have good lighting...

Of course...you want to see a video of the process in action, right? Well...funny enough...my first try was perfect! Even better than this one...but I didn't shoot the video -:( Further tries were pretty crappy until I could get something acceptable...and that's what you're going to watch now...the sun coming through the window didn't help me...but it's pretty good anyway...


Hope you liked this blog -:)

Greetings,

Blag.
SAP Labs Network.

Monday, May 21, 2018

The Blagchain



Lately, I have been learning about Blockchain and Ethereum. Two really nice and interesting topics...but as they say...the best way to learn is by doing...so I put myself to work on the Blagchain.

So, what's the Blagchain? Basically, it's a small Blockchain application that picks some things from Blockchain and some things from Ethereum, and it was built as an educational thing...in the Blagchain you can get a user, post a product or buy one, and everything will be stored in a chain-like structure...

Before we jump into the screenshots...let me tell you about the technology I chose for this little project...

There are many technologies out there...so choosing the right one is always a hard thing...halfway through you can realize that nope...that was not the smartest decision...some other language can do a better job in less time...or maybe that particular feature is not available and you didn't know it because you had never needed it before...

When I started learning about Blockchain and Ethereum...I knew I wanted to build the Blagchain using a web interface...so the first languages that came to mind were out of the question...basically because they don't provide web interfaces or simply because it would be too painful to build the app using them...also I wanted a language with few dependencies and with easy installation and extension...I wanted an easy but fast language...and then...almost instantly I knew which one I had to use...

Crystal is similar to Ruby but faster...and nicer -;) Also...it has Kemal, a Sinatra-like web framework...

When I discovered Crystal I was really impressed by how well it is designed...especially because...it's still in Alpha! How can such a young language be so good? Beats me...but Crystal is really impressive...

Anyway...let's see how the Blagchain works...

For sure...it's not a dapp...but that's fine because you only use it locally...it uses two web applications that run on different ports...one working as the server and the other working as the client...


You can add a new product...


You can see here that we have our Genesis Block and a new block for the posting of a product (and they are connected via the Previous Hash)...you can also see that any transaction will cost us 0.1 Blagcoin...
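
If you're wondering how that chain-like structure hangs together...here's a tiny Python sketch of the idea (the real Blagchain is written in Crystal, so this is just an illustration and the field names are made up)...every block stores the hash of the previous block, so the whole history stays linked...

# Illustration only...a minimal hash-linked chain like the one the Blagchain keeps.
import hashlib
import json
import time

def make_block(data, previous_hash):
    block = {
        "timestamp": time.time(),
        "data": data,                    # e.g. the product posted or bought
        "previous_hash": previous_hash,  # links this block to the one before it
    }
    block["hash"] = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

chain = [make_block({"event": "genesis"}, previous_hash="0")]  # the Genesis Block

# Every transaction costs 0.1 Blagcoin and points back to the latest block
chain.append(make_block({"user": "Blag", "action": "post", "article": "Book", "fee": 0.1},
                        previous_hash=chain[-1]["hash"]))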


Now, we can use another browser to create a new user...


As this user didn't create the product...he/she can buy it...and add a new transaction to the chain...


Money (Blagcoin) goes from one account to the other. The chain grows and everything is recorded...


What if you don't have enough Blagcoin to buy something?


Now...if you like this kind of thing...this is how many lines of code it took me...

Blagchain.cr (Server part) --> 129 lines
BlagchainClient.cr (Client part) --> 125 lines
index.ecr (HTML, Bootstrap and JQuery) --> 219 lines

So not even 500 lines of code for the whole application...that's pretty cool, huh? -;)

And yes...I know you want to see a little bit of the source code, right? Well...why not -:)

BlagchainClient.cr
post "/sellArticle" do |env|
  user = env.params.body["user"]
  article = env.params.body["article"]
  description = env.params.body["description"]
  price = env.params.body["price"]
  amount = (env.session.float("amount") - 0.1).round(2)
  env.session.float("amount", amount)
  HTTP::Client.post("http://localhost:3000/addTransaction", form: "user=" + user + 
                    "&article=" + article + "&description=" + description + "&price=" + price)
  env.session.bool("flag", true)
  env.redirect "/Blagchain"
end

Greetings,

Blag.
SAP Labs Network.

Wednesday, January 17, 2018

Wooden Puzzle - My first Amazon Sumerian Game

If you read my previous blog Amazon Sumerian - First impressions you will know that I wasn't going to stop there -;)

I have been able to play a lot with Sumerian and most importantly...to learn a lot...the tutorials are pretty good so you should read them even if you don't have access to Sumerian yet...

One thing that I always wanted to do...was to animate my Cozmo model...the one I made in Blender...


I tried doing it in Blender (rigging it and doing the animation, but it was getting weird as it worked fine in Blender but not in Sumerian...now I know why...but at the time I got frustrated) but failed...so instead I thought of doing it in Sumerian using its own tools...

I gotta admit...at first it didn't work...but then I kept exploring and realized that the Timeline was my friend...and after a lot of testing...I got it working -;)

Here is how it looks...


So just go to Cozmo and click on the robot to start the animation and then click on him again to restart the animation...

Simple but really cool -:)

After that...I started thinking about doing something else...something more interesting and this time involving some programming...which is actually JavaScript and not NodeJS like I thought initially -:(

Anyway...I tried to do that once in Unity and also in Flare3D, but didn't have enough luck...although fair enough...at that time I didn't know Blender...so I put myself to work on it...


I designed a Wooden Puzzle board using Blender, then imported it into Sumerian and applied a Mesh Collision to it...that way...the ball can run around the board and fall down if it rolls over a hole...

Here is how it looks...




To play...simply use the cursor keys to move the board and guide the ball from "Start" to "Finish". Pressing "r" restarts the game.

Here's the link to play it "Wooden Puzzle"...

Was it hard to build? Not really -:) Sumerian is very awesome and pretty powerful...on top of that...the Sumerian team is really nice and they are always more than willing to help...

So far...my Sumerian experience has been nothing but joy...so I can see myself doing more and more projects...

Of course...I'm already working on a couple more -;) Especially one involving the Oculus Rift...but that will take more time for sure...as I need to do a lot of Blender work...

Have you tried Sumerian? Not yet? Why don't you go ahead and request access?

Greetings,

Blag.
Development Culture.

Friday, December 22, 2017

Amazon Sumerian - First impressions

For those who know me and for those who don't...I work as a Developer Evangelist...my main job is to learn, explore and evangelize new technologies and programming languages...so of course...AR/VR has been on my plate for quite some time...

I have played with Unity3D and Unreal Engine...and of course I have developed for the Google Glass, Microsoft HoloLens and Oculus Rift...

When the good folks at Amazon announced Amazon Sumerian, you can imagine that I was completely thrilled -:D

So yesterday, I finally got accepted into the Beta program, so of course I started to follow a couple of tutorials and get to know the tool -;)

Please be advised that I'm just starting...so I haven't tried or used everything...I want to go step by step following the tutorials and trying to understand everything in the most positive way...

Have I mentioned that Sumerian runs in your browser? How crazy is that? No installation...just fire up your browser and start building AR/VR experiences...

When you first launch it, you will be presented with the following screen...



Where you can create a new scene or simply use a template.

Sumerian provides many tutorials, and so far I have only made my way through the first 3...


So here's how my TV room looks...


As you can see...Sumerian is a full-blown editor that provides all the tools that you can find in any other editor...plus many things that I believe are brand new and exciting...

Of course, you can preview your work...


As for the TV Room tutorial...the idea is that below the TV screen there's an Amazon Echo, so you can press it to change the videos presented on the screen. For this you need to use a State Machine and also create a script that will manage the different videos. For the scripting you need to use NodeJS...which is really nice as it's the language that I mainly use when developing applications for Alexa...



This is how my TV Room looks when playing a video in render mode -:)


Before moving on to learn more about Sumerian...I need to say that the navigation system doesn't seem to be too good right now...you can use the mouse buttons, Tab and Shift...but the arrow keys or WASD don't seem to work like you would expect in Unity3D or Unreal Engine...I have forwarded my question to the Sumerian Team on Slack...so I will update this post as soon as I get an answer :)

*UPDATE* By following the "Lights and Camera" tutorial I found out that while the default camera doesn't allow fine-grained navigation...the FlyCam does! -:D All good in the hood -;)

Till next time,

Blag.
Development Culture.

Tuesday, May 2, 2017

Blender Lego Art for HoloLens

This blog was originally posted on Blender Lego Art for HoloLens.


Who doesn’t love Lego? And if you have used Blender before…who doesn’t love Blender? -:)

Combining both seemed like a great idea, so that’s what I did…Using Blender I created some pretty simple Lego pieces that can be used to build both simple and complicated models.


A single piece is 0.25 by 0.25 and it’s made out of a single vertex. In the image, the colors are just used to give an idea of the different pieces.

The main point is simply to create a new Blender file, append the different pieces and start building.

At first it’s kind of complicated because you need to deal with the X, Y and Z positions…but once you get used to it…it becomes a little bit addictive -:)


Now, the name of the blog is Lego Art, right? So…if you look online for Pixel Art instead, you will find a lot of nice images…like this one…


The perfect candidate for Lego construction! By putting the pieces together and simply assigning them the right material…we can get this…


And with some more time and dedication…we can get this…



Anyway…with a collection of models…we can bundle them together into a Microsoft HoloLens application -;)

The application itself is easy…you start by looking at all the models on a shelf…you can select one, make it smaller or bigger, turn it left or right, or simply go back to the shelf.

Here’s a video showing how it looks.



Greetings,

Blag.
Development Culture.