Cloud Vision API and Its Use Cases

July 8, 2019

Tudip

What is a Vision API?

A vision API is an API for performing computer vision tasks. People can derive useful information from an image with ease; a vision API lets a computer do the same, for example by detecting what an image contains.

What is Google’s Cloud Vision API?

Google has packaged its machine learning models behind an API that lets developers use its vision technology. The Cloud Vision API can classify images into thousands of categories and assign them sensible labels. Its most powerful feature is detecting entities (objects) within an image. It can also detect faces and pieces of text within an image.

Google’s Vision API lets you do two things:

Use the API directly from your code to perform powerful image analysis, and do so at scale.
Build custom models, through Google's companion AutoML Vision product, when the pre-trained models are not flexible enough for your particular use case.
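To make the first option concrete, here is a minimal sketch of the JSON body that the API's REST method images:annotate expects. The helper names (image_from_bytes, image_from_uri, build_annotate_request) and the example bucket path are hypothetical and exist only for this illustration. The sketch only builds the request; sending it requires an API key or OAuth token and is left out.

```python
import base64
import json

# REST endpoint for image annotation. Calling it needs credentials, which this sketch omits.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def image_from_bytes(data):
    """Describe an inline image; the bytes are sent as base64-encoded text."""
    return {"content": base64.b64encode(data).decode("ascii")}


def image_from_uri(uri):
    """Describe an image by reference (a gs:// or http(s):// URI)."""
    return {"source": {"imageUri": uri}}


def build_annotate_request(image, feature_types, max_results=10):
    """Return the JSON-serializable body for a single-image annotate call."""
    return {
        "requests": [
            {
                "image": image,
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


# The bucket and file name below are made up for the example.
body = build_annotate_request(
    image_from_uri("gs://example-bucket/photo.jpg"),
    ["LABEL_DETECTION", "TEXT_DETECTION"],
)
print(json.dumps(body, indent=2))
```

In practice most projects would use one of Google's client libraries instead of hand-built JSON, but the request shape stays the same: one image plus a list of the features you want analyzed.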

Cloud Vision API features

Label detection: It identifies and extracts information about entities within an image, across a broad group of categories such as objects, locations, activities, animal species, products, and more.
Landmark detection: It detects popular natural and man-made structures within an image, such as famous buildings and monuments, and returns the landmark's name together with its geographic coordinates.
Web detection: It detects references to an image on the Web. The Vision API can perform Web detection directly on an image file located in Google Cloud Storage or on the Web, without the need to send the contents of the image file in the request.
Face detection: It finds human faces in digital images. The main aim of face detection is to determine whether an image contains any faces and where they are; the API also reports facial features and likely emotions such as joy or surprise, but it does not identify individual people. Face detection can be used in many areas such as security, biometrics, law enforcement, entertainment, personal safety, etc.
Optical character recognition: OCR converts an image of printed or handwritten text into a form that software such as a word processor can recognize and manipulate as letters, words, and numerals. OCR is used to store the contents of books and documents without the need for retyping.
Content moderation: It detects explicit content such as adult content or violent content within an image. This feature uses five categories (“adult”, “spoof”, “medical”, “violence”, and “racy”) and returns the likelihood that each is present in a given image.
Logo detection: It detects brand logos within images.
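Each feature above corresponds to a feature type in the REST request and to an annotation field in the response. The sketch below lists those type names and shows one way to read a response. The summarize helper and the short keys of FEATURE_TYPES are hypothetical, and SAMPLE_RESPONSE is a hand-written stand-in with invented values, not real API output.

```python
# REST feature type for each feature described above (dictionary keys are our own shorthand).
FEATURE_TYPES = {
    "label": "LABEL_DETECTION",
    "landmark": "LANDMARK_DETECTION",
    "web": "WEB_DETECTION",
    "face": "FACE_DETECTION",
    "ocr": "TEXT_DETECTION",
    "moderation": "SAFE_SEARCH_DETECTION",
    "logo": "LOGO_DETECTION",
}

# Hand-written sample shaped like one entry of the "responses" array; all values are invented.
SAMPLE_RESPONSE = {
    "labelAnnotations": [
        {"description": "Dog", "score": 0.97},
        {"description": "Grass", "score": 0.41},
    ],
    "logoAnnotations": [{"description": "ExampleBrand", "score": 0.88}],
    "textAnnotations": [
        {"description": "GOOD BOY"},  # the first entry holds the full detected text
        {"description": "GOOD"},
        {"description": "BOY"},
    ],
    "faceAnnotations": [],
}


def summarize(response, min_score=0.5):
    """Collect the most commonly used fields from one annotate response."""
    labels = [
        item["description"]
        for item in response.get("labelAnnotations", [])
        if item.get("score", 0) >= min_score
    ]
    logos = [item["description"] for item in response.get("logoAnnotations", [])]
    text_items = response.get("textAnnotations", [])
    full_text = text_items[0]["description"] if text_items else ""
    return {
        "labels": labels,
        "logos": logos,
        "text": full_text,
        "face_count": len(response.get("faceAnnotations", [])),
    }


print(summarize(SAMPLE_RESPONSE))
```

The min_score cut-off of 0.5 is an arbitrary choice for the example; a suitable threshold depends on how costly a wrong label is in your application.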

Use cases of Google’s Cloud Vision API

Google Lens is an image recognition mobile app developed by Google using Cloud Vision API. It is designed to bring up relevant information using visual analysis. When you point the phone's camera at an object, Google Lens attempts to identify the object or read labels and text, and then shows relevant search results and information. Lens is also integrated with the Google Photos and Google Assistant apps.
Cloud Vision API is used to simplify driver's license validation, a job that was previously done manually. Images uploaded by users that are not real driving licenses can easily be rejected.
Recognize the content of the image: We can restrict users to uploading only a certain type of image. For example, if you run a website about animals and want users to upload images of animals and nothing else, the API can analyze the image content so that only the required images are accepted and displayed.
Inappropriate content detection: Similar to the previous point, the API can be used to automatically prevent users from uploading images with inappropriate content.
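The last two use cases can be combined into a single upload check. The sketch below is one possible approach, assuming the image has already been analyzed with label detection and content moderation. The is_upload_allowed function, its thresholds, and the choice to moderate only "adult", "violence", and "racy" are assumptions made for illustration, and the sample responses are invented.

```python
# Likelihood values returned by content moderation, from least to most likely.
LIKELIHOOD_ORDER = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]


def is_upload_allowed(response, allowed_labels, min_score=0.7,
                      block_at="LIKELY", moderated=("adult", "violence", "racy")):
    """Accept an image only if it is safe and shows an allowed kind of content."""
    safe_search = response.get("safeSearchAnnotation", {})
    limit = LIKELIHOOD_ORDER.index(block_at)
    # Reject when any moderated category is at or above the blocking likelihood.
    for category in moderated:
        likelihood = safe_search.get(category, "UNKNOWN")
        if LIKELIHOOD_ORDER.index(likelihood) >= limit:
            return False
    # Accept only when a confident label matches the allow-list (case-insensitive).
    allowed = {label.lower() for label in allowed_labels}
    return any(
        item["description"].lower() in allowed and item.get("score", 0) >= min_score
        for item in response.get("labelAnnotations", [])
    )


# Invented sample responses for an animal website.
dog_photo = {
    "labelAnnotations": [{"description": "Dog", "score": 0.95}],
    "safeSearchAnnotation": {"adult": "VERY_UNLIKELY", "violence": "UNLIKELY", "racy": "UNLIKELY"},
}
car_photo = {
    "labelAnnotations": [{"description": "Car", "score": 0.98}],
    "safeSearchAnnotation": {"adult": "VERY_UNLIKELY", "violence": "VERY_UNLIKELY", "racy": "VERY_UNLIKELY"},
}

animals = ["dog", "cat", "bird"]
print(is_upload_allowed(dog_photo, animals))
print(is_upload_allowed(car_photo, animals))
```

Running the moderation check first means an unsafe image is rejected regardless of its labels. Treating a missing category as "UNKNOWN" lets the image through, so a stricter site might choose to reject such images instead.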