Google can tell dogs from mops. Can you?
Bret McGowen presented on serverless machine learning at Google. You can watch his full talk here, but here are my notes.
Serverless
Four principles:
- no need to manage/think about servers
- no upfront provisioning; scales with use (you can't get capacity planning wrong)
- pay per use
- stateless/ephemeral
Serverless at Google:
- Background functions: Cloud Storage, Cloud Pub/Sub
- HTTP functions: API, Webhooks, Browser
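As a minimal sketch of the HTTP-function flavor, here is what a Python Cloud Function could look like. On Google Cloud the framework passes in a Flask request object; the `FakeRequest` stub below is a hypothetical stand-in so the sketch runs locally with no server to manage:

```python
def hello_http(request):
    """HTTP-triggered function: no server to manage, pay per invocation."""
    # Only request.args is used here, so a simple stub can mimic Flask's request.
    name = request.args.get("name", "world")
    return f"Hello, {name}!"


class FakeRequest:
    """Hypothetical local stand-in for the Flask request (illustration only)."""
    def __init__(self, args=None):
        self.args = args or {}


print(hello_http(FakeRequest({"name": "serverless"})))  # Hello, serverless!
```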
Machine Learning
Machine learning is using many examples to answer questions.
Machine Learning at Google:
- Use your own data: TensorFlow and Cloud Machine Learning Engine
- Pretrained ML models: Cloud (Vision, Speech, Natural Language, Translation) API, Cloud Video Intelligence
Specifics on capabilities of Cloud Vision API:
- Label detection (dog or mop?)
- Face detection (within the photo, here is the location of the face)
- OCR (read text from photos)
- Explicit content detection (violence/adult)
- Landmark detection (that's the Eiffel tower!)
- Logo detection (identify brand and product logos)
Other Cloud Vision features:
- crop hints - suggested crop dimensions
- web annotations - suggests related metadata from the web - e.g. from a photo of an iconic car, it can identify the model, the film it appeared in, and where it probably is, and return other matching images to back it up.
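The Vision API features above map to feature types in a single `images:annotate` request. Below is a sketch of the JSON request body; the bucket path is a placeholder, and authentication is omitted:

```python
import json

def vision_request(image_uri, features, max_results=5):
    """Build the JSON body for a Cloud Vision images:annotate call."""
    return {
        "requests": [{
            # Point the API at an image already in Cloud Storage.
            "image": {"source": {"imageUri": image_uri}},
            # One entry per capability, e.g. LABEL_DETECTION, WEB_DETECTION.
            "features": [{"type": f, "maxResults": max_results} for f in features],
        }]
    }

body = vision_request("gs://my-bucket/dog-or-mop.jpg",
                      ["LABEL_DETECTION", "CROP_HINTS", "WEB_DETECTION"])
print(json.dumps(body, indent=2))
```

The same body shape works for the other detections (faces, OCR, landmarks, safe search) by swapping the feature type names.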
Cloud event trigger walkthrough
Cloud Storage -> Cloud Functions -> Cloud Vision API
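The pipeline above can be sketched as a background Cloud Function: it fires when a file lands in Cloud Storage and forwards the file's `gs://` URI to the Vision API. `call_vision_api` is a hypothetical stand-in for the real client call (e.g. via the `google-cloud-vision` library), stubbed so the sketch runs locally:

```python
def call_vision_api(gcs_uri):
    """Stand-in for a real Vision API label-detection call (stubbed)."""
    return {"labels": ["dog"], "source": gcs_uri}


def on_upload(data, context=None):
    """Cloud Storage trigger: `data` carries the bucket and object name."""
    gcs_uri = f"gs://{data['bucket']}/{data['name']}"
    result = call_vision_api(gcs_uri)
    print(f"Labels for {gcs_uri}: {result['labels']}")
    return result


on_upload({"bucket": "uploads", "name": "photo.jpg"})
```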
NLP: entity extraction, sentiment analysis, and syntax analysis (parses a sentence down to lemmas so you can see the parts-of-speech dependency graph)
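All three NLP operations (`analyzeEntities`, `analyzeSentiment`, `analyzeSyntax`) take the same document payload; a sketch of that shared request body:

```python
def nl_request(text):
    """Request body shared by the Natural Language API analyze* endpoints."""
    return {
        # Inline plain text; HTML and Cloud Storage sources are also supported.
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }

body = nl_request("Google can tell dogs from mops.")
```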
Speech API
Speech-to-text transcription in 110 languages.
Azar - a video chat app that uses the Cloud Speech API and Cloud Translation API so users who speak different languages can talk to each other.
The API can also return a timestamp for each word, on top of the transcript.
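A sketch of a Speech API `recognize` request body; `enableWordTimeOffsets` is the flag that asks for per-word timestamps alongside the transcript. The audio URI is a placeholder:

```python
def speech_request(gcs_uri, language="en-US"):
    """Request body for a Cloud Speech API recognize call."""
    return {
        "config": {
            "languageCode": language,
            # Ask for a timestamp on each recognized word.
            "enableWordTimeOffsets": True,
        },
        "audio": {"uri": gcs_uri},
    }

req = speech_request("gs://my-bucket/talk.flac")
```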
Video Intelligence API
Analyzes an entire video and labels the entities that appear in it.
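A sketch of the corresponding `videos:annotate` request body; the video URI is a placeholder, and `LABEL_DETECTION` asks the API to label what appears across the whole video:

```python
def video_request(gcs_uri):
    """Request body for a Video Intelligence API videos:annotate call."""
    return {
        "inputUri": gcs_uri,
        # Other features (e.g. shot change detection) can be added here.
        "features": ["LABEL_DETECTION"],
    }

req = video_request("gs://my-bucket/clip.mp4")
```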