Use face recognition on WhatsApp group pictures: Part 1

Pratik Barasia
Jan 3, 2021

Have you ever attended an event where all the pictures were shared on a WhatsApp group, and you had to download hundreds of images just to find the few with you in them? At least I had this problem at most of the weddings I attended. I just wanted to look at my own pictures (narcissist alert), but I had to download every one of them to find mine. They took up a lot of space on my phone, and it was hard to delete all the pictures I had downloaded in vain.

I had this idea of a service that could read all these photos and give me only mine. How cool would that be? So let’s figure out what we need to achieve that.

  1. Facial recognition: This was always a blocker for me, because I thought I had to implement a facial recognition algorithm in my own code, and even the thought of integrating something that complex made me procrastinate for a long time.
    Luckily, I found that Google and Amazon both offer facial recognition as a service on their platforms. After long consideration, I chose aws-rekognition for my project, and I will let you know why in the coming sections.
  2. Platform to share and retrieve photos: My initial idea was to build an app that everyone would use to share these photos instead of WhatsApp, where you could just click on your detected face to retrieve all the other photos of you. The same could be used to get photos of any other person as well, maybe your friend, wife or parents.
    This started a search for tutorials on building apps, and I came across Flutter. I was pretty impressed by what can be done with Flutter, and it also gave me options to integrate with Google’s facial recognition capability, so I set up my dev environment for mobile development and started learning Flutter.
    But while I was discussing this idea with a friend, he pointed out how unlikely it was that everyone would adopt this app, and that was a point of resistance for the whole idea. So I thought of exploring ways to read the photos directly from WhatsApp. That brings us to our next requirement.
  3. WhatsApp API: WhatsApp has its own business API, but the whole setup gave me nightmares. There were also issues with sending messages, in groups and in general, because every message you send must first be created as a message template. This meant a lot of inflexibility and unknowns in sending and receiving via WhatsApp.
    I also tried several services like Twilio, but none of them were flexible enough to play around with and use in my project. Luckily, I found the website https://chat-api.com/en/?lang=EN . I tested it using their sample messages and it could do everything I wanted. It could read messages, send messages to a group and also send images without issues. The only catch was that it’s a paid service, but it allows a 3-day free trial, and that was enough for me to test it extensively.

Now that all the requirements were covered, it was time to start exploring AWS Rekognition.

AWS Rekognition

To use any AWS service, you first need to create an account. Amazon gives first-time users one year of free but limited usage of most of its services. Let’s create an account at https://aws.amazon.com/free .

The second step is to create an IAM user, which is the identity the project will use to call AWS services. Make sure to save the access key and secret key, because we will be using them in the project. Also keep in mind the region you select during setup, as the speed and cost of your usage depend on it, and we will need the region name in the project to access AWS services.
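As a rough illustration of how the access key, secret key and region name might be wired into a project, here is a minimal Python sketch. It assumes the project uses Python with the boto3 SDK, which this article does not specify, and that the values are exported in the standard environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION. The helper names are made up for this example.

```python
import os


def rekognition_client_kwargs(env=None):
    """Collect the credentials and region as keyword arguments for boto3.client().

    `env` defaults to the process environment; passing a dict makes this testable.
    """
    env = os.environ if env is None else env
    # boto3.client() keyword name -> standard AWS environment variable
    required = {
        "aws_access_key_id": "AWS_ACCESS_KEY_ID",
        "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
        "region_name": "AWS_DEFAULT_REGION",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[var] for kwarg, var in required.items()}


def make_rekognition_client():
    """Create a Rekognition client (assumes `pip install boto3` has been run)."""
    import boto3  # third-party AWS SDK for Python, imported lazily

    return boto3.client("rekognition", **rekognition_client_kwargs())
```

Keeping the keys in environment variables rather than in the source code also means they do not end up in a public repository by accident.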

Once you have these, you need to give your IAM user permission to access the AWS Rekognition API. To do that, click on “Add Permission”, search for AmazonRekognitionFullAccess, and attach it to the user.

So now you are ready to access the power of Amazon, but how do you do that? You can download aws-cli and set up your credentials. Go to this URL for more information on how to install it: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html . Also look for the section on setting up credentials.

Once the CLI is set up, let’s get to know how AWS Rekognition works. Here is the documentation for it: https://docs.aws.amazon.com/rekognition/latest/dg/what-is.html
and here is a list of the APIs it provides:
https://docs.aws.amazon.com/cli/latest/reference/rekognition/

Now I know it’s a little overwhelming, so I will share the APIs we will use and add some details. But before that, you can play around and get a feel for the power of this service using their demo: https://eu-central-1.console.aws.amazon.com/rekognition/home?region=eu-central-1#/label-detection

The APIs we will be using, and how we will use them:

  1. detectFaces: This API detects the faces in a given image. It returns metadata including the coordinates of each detected face.
  2. indexFaces: AWS Rekognition can build an index of faces, which can later be used to recognize those faces in other images. We will use this API to store the faces detected in the first step.
  3. searchFacesByImage: This API searches the index for faces matching a face in an image. We will use it to see which faces are already indexed and which ones we still need to index. It will also be used to create a mapping of images to faces, which we can store in a DB.
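The three calls above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from the linked repository: it assumes a boto3 Rekognition client is passed in as `client`, that the collection already exists, and that each image is handled one face at a time. The function names, the 90.0 threshold and the return shape are made up for this example.

```python
def pixel_box(bounding_box, image_width, image_height):
    """Convert a Rekognition BoundingBox (ratios of the image size) to pixels."""
    left = int(bounding_box["Left"] * image_width)
    top = int(bounding_box["Top"] * image_height)
    width = int(bounding_box["Width"] * image_width)
    height = int(bounding_box["Height"] * image_height)
    return left, top, width, height


def find_or_index_face(client, collection_id, image_bytes, threshold=90.0):
    """Return (face_id, was_new) for the image, or None if no face is found."""
    image = {"Bytes": image_bytes}

    # detectFaces: skip images that contain no face at all.
    detected = client.detect_faces(Image=image)
    if not detected["FaceDetails"]:
        return None

    # searchFacesByImage: is this face already in the collection?
    found = client.search_faces_by_image(
        CollectionId=collection_id,
        Image=image,
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    if found["FaceMatches"]:
        return found["FaceMatches"][0]["Face"]["FaceId"], False

    # indexFaces: unknown face, so add it to the collection.
    indexed = client.index_faces(
        CollectionId=collection_id, Image=image, MaxFaces=1
    )
    if not indexed["FaceRecords"]:
        return None
    return indexed["FaceRecords"][0]["Face"]["FaceId"], True
```

Note that searchFacesByImage only searches with the largest face in the image. For a group photo, one option is to crop each face using the boxes from detectFaces (that is what `pixel_box` is for) and run the search on each crop separately.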

Since the indexed faces are stored in a collection, we first need to create one, which we can do using aws-cli. Let’s call our collection “MyCollection” and create it with this command:
aws rekognition create-collection --collection-id "MyCollection"
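If you would rather create the collection from code than from the command line, the same step could look roughly like this in Python. It is a sketch that assumes a boto3 Rekognition client is passed in as `client`. The helper name `ensure_collection` is made up for this example.

```python
def ensure_collection(client, collection_id):
    """Create the collection if it is not listed yet; return True if created."""
    token = None
    while True:
        # list_collections is paginated, so follow NextToken until it runs out.
        kwargs = {"NextToken": token} if token else {}
        page = client.list_collections(**kwargs)
        if collection_id in page.get("CollectionIds", []):
            return False
        token = page.get("NextToken")
        if not token:
            break
    client.create_collection(CollectionId=collection_id)
    return True
```

Calling `ensure_collection(client, "MyCollection")` at startup means the project can be run repeatedly without failing because the collection already exists.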

The complete code can be found here:
https://github.com/pbarasia/facerec

I am planning to post two more parts of this article to go through the whole project. Follow me to get an update when the next post is up.
