Machine Learning: Going Deep on Image Analysis

North Highland was recently contacted by a major hotel chain to help ensure that the thousands of online photos of its properties meet the company's high quality standards.  For example, each property must have pictures showing the entire hotel from the outside, the reception area with a receptionist, two pictures of the pool, and a picture of a room that shows both the bed and the TV.

So how do you quickly and cost-effectively ensure thousands of online photos meet the requirements?  You can hire a team to manually inspect every photo, or you can leverage one of the many machine learning image recognition services available today to do the work for you.

What’s an image recognition service?  It’s a machine learning model that has been trained on billions of photos to automatically detect the objects within them, and it is offered through a programming interface (API) at a very low cost.

How does it work?  Simply put, you submit a photo and it returns words or phrases describing the content (e.g. this picture contains a pool, a person, and a lounge chair).  Many of the big tech companies offer these services - pre-trained - to recognize thousands of objects out of the box.
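To make this concrete, here is a minimal sketch of how returned labels could be checked against the hotel photo standards described earlier. The response shape (a list of name/confidence pairs per photo), the label names, the minimum counts, and the 0.8 threshold are all assumptions for illustration; real services use their own formats and vocabularies.

```python
from collections import Counter

# Hypothetical mapping: requirement -> (labels one photo must contain, photos needed).
REQUIREMENTS = {
    "exterior": ({"hotel", "building"}, 1),
    "reception": ({"reception", "person"}, 1),
    "pool": ({"pool"}, 2),
    "room": ({"bed", "television"}, 1),
}

def confident_labels(labels, min_confidence=0.8):
    """Return the lowercased label names whose confidence meets the threshold."""
    return {l["name"].lower() for l in labels if l["confidence"] >= min_confidence}

def missing_requirements(photos, min_confidence=0.8):
    """Return {requirement: shortfall} for each requirement the photo set fails.

    `photos` is a list with one entry per photo; each entry is the list of
    label dicts (assumed shape: {"name": str, "confidence": float}).
    """
    counts = Counter()
    for labels in photos:
        found = confident_labels(labels, min_confidence)
        for name, (needed, _) in REQUIREMENTS.items():
            if needed <= found:  # every needed label appears in this photo
                counts[name] += 1
    return {
        name: required - counts[name]
        for name, (_, required) in REQUIREMENTS.items()
        if counts[name] < required
    }
```

A property whose photos come back with an empty result meets every requirement in the table; anything else tells a reviewer exactly which shots are still missing.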

What can I use it for?  Beyond identifying objects in a photo, you can use it to build a richer metadata inventory of your entire image library (what are all the identifiable objects in my images?), to detect inappropriate content, or to learn more about your customers from the photos they post on social media.  You can also pair it with sentiment analysis APIs to tease out topics or emotion within photos, perform similarity searches to populate “you might also like” lists, detect landmarks (e.g. power a search engine on a vacation rental site to find properties with a view of the Eiffel Tower), read street signs and house numbers, detect logos, and more.
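As one illustration of the metadata inventory idea, the sketch below builds a simple index from each label to the photos that contain it. It assumes you have already collected the labels for each photo from a recognition service; the function names and file names are hypothetical.

```python
from collections import defaultdict

def build_label_index(photo_labels):
    """Map each lowercased label to the set of photo ids that carry it.

    `photo_labels` is assumed to be {photo_id: [label, ...]}.
    """
    index = defaultdict(set)
    for photo_id, labels in photo_labels.items():
        for label in labels:
            index[label.lower()].add(photo_id)
    return dict(index)

def photos_with_all(index, *labels):
    """Return the photo ids that carry every one of the given labels."""
    sets = [index.get(label.lower(), set()) for label in labels]
    return set.intersection(*sets) if sets else set()
```

With an index like this, a question such as "which photos show both a pool and a person?" becomes a single lookup: photos_with_all(index, "pool", "person").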

How do you leverage them?  They’re typically available as RESTful APIs that you can call from your favorite programming language, be it Java, Python, Ruby, R, Node.js or others.  They are also extremely affordable: a single API call typically costs around a tenth of a penny.
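The sketch below shows roughly what such a call looks like from Python, along with the cost arithmetic. The endpoint URL, the JSON body shape, and the bearer-token header are placeholders rather than any particular vendor's API, and the price is simply the tenth-of-a-penny figure quoted above.

```python
import base64
import json
import urllib.request
from decimal import Decimal

# Hypothetical endpoint; substitute the URL documented by your chosen provider.
API_URL = "https://vision.example.com/v1/labels"
COST_PER_CALL_USD = Decimal("0.001")  # "a tenth of a penny" per call (assumed)

def build_label_request(image_bytes, api_key, url=API_URL):
    """Build (but do not send) a POST request carrying a base64-encoded image."""
    body = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def estimated_cost_usd(photo_count, cost_per_call=COST_PER_CALL_USD):
    """Estimate the total cost, assuming one API call per photo."""
    return photo_count * cost_per_call
```

Sending the request is then a matter of passing it to urllib.request.urlopen. At the assumed rate, running 50,000 photos through the service would cost about $50.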

Which one is right for me?  One of the greatest benefits is that they are always improving accuracy, at no cost or disruption to you.  However, it doesn’t mean they will always solve your problem, or that they don’t have strengths and weaknesses.  As always, it’s best to try a few of them first to see which gives you the best result.  If you’re really adventurous, there are even aggregator services being developed independently that can call multiple services at once and combine the results to get the best of all worlds.
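To give a feel for what combining results might involve, here is a minimal sketch that averages the confidence each service reports for a label. The input shape (one label-to-confidence dictionary per service) and the averaging rule are assumptions; an actual aggregator may weight services differently or reconcile their vocabularies first.

```python
def combine_label_scores(service_results):
    """Average each label's confidence across services.

    `service_results` is assumed to be a list with one {label: confidence}
    dict per service. A label a service did not report counts as 0 for it.
    """
    if not service_results:
        return {}
    totals = {}
    for result in service_results:
        for label, confidence in result.items():
            key = label.lower()
            totals[key] = totals.get(key, 0.0) + confidence
    return {label: total / len(service_results) for label, total in totals.items()}

def consensus_labels(service_results, min_score=0.5):
    """Return the labels whose averaged score meets the threshold, sorted by name."""
    combined = combine_label_scores(service_results)
    return sorted(label for label, score in combined.items() if score >= min_score)
```

Averaging rewards labels that several services agree on and discounts a label that only one service reports, which is one simple way to trade a little recall for fewer false positives.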

What if my use case is too specific for the pre-trained object lists?  For example, what if the service can’t recognize the TV in the hotel room because the TV is at an angle in the photo?  Fortunately, these services often make it easy to train the algorithm for new purposes using your own photos.  If you already have a team, such as an off-shore team, that has manually identified objects in many photos, use their labeled photos to train the model to fit your exact scenario.
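Whatever service you train, it helps to hold back some of the human-labeled photos so you can check whether the custom model really does recognize that angled TV. The sketch below prepares such a split; the manifest fields and the 20 percent holdout are assumptions, and each service defines its own upload format for training data.

```python
import hashlib

def assign_split(photo_id, holdout_percent=20):
    """Deterministically assign a photo to 'train' or 'holdout' by hashing its id."""
    digest = hashlib.sha256(photo_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "holdout" if bucket < holdout_percent else "train"

def build_manifest(labeled_photos, holdout_percent=20):
    """Turn (photo_id, label) pairs from human reviewers into manifest rows."""
    return [
        {
            "photo": photo_id,
            "label": label,
            "split": assign_split(photo_id, holdout_percent),
        }
        for photo_id, label in labeled_photos
    ]
```

Hashing the photo id rather than picking at random means a photo always lands in the same split, even as the team keeps adding newly labeled photos.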

These emerging services add an exciting new capability to your data and analytics offerings, and are continually pushing the envelope of what’s possible.  So roll up your sleeves and learn how to use them, or hire someone who already knows how, because your competition probably has.
