Google Lens, my private translator!

Khalida Douibi
2 min read · Jun 19, 2022


Image copyright: https://blog.google/products/translate/google-translates-instant-camera-translation-gets-upgrade/

Recently, I moved to a Flemish region where everything is written in Dutch, a language that I don't speak! All the documents related to my stay and my work are in Dutch too. Can you imagine the challenge I faced every day before discovering the power of Google Lens? I had used the application many times before, taking photos of my plants to look up their names and the conditions they need to stay healthy and beautiful (I'm passionate about plants!). Then I thought: what if I try it for translation, will it work?

I was surprised because it works very well, even in real time. I only need to take a picture and choose to translate from Dutch to English or French, and that's it! Technology is a lifesaver! That's why I'm passionate about AI, ML, and the data science field!

I was curious about the algorithms behind Google Lens, so I read more about them. Let's summarize briefly below.

  • Google Lens relies on computer vision, machine learning, and the Google Knowledge Graph to detect objects and text in an image (1). For translation in context, Lens uses Google Translate's neural machine translation (NMT) algorithms to translate entire sentences at a time, instead of word by word, in order to preserve proper grammar and diction (2).
  • Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems (3).
  • Google Lens also uses the context of the original text to present the translation. It redistributes the translation into lines of similar length and selects a font size to match. It also matches the colors of the translation and its background to the original, using a heuristic that assumes the background and the text differ in luminosity and that the background takes up the majority of the space. This lets Lens classify each pixel as either background or text. It then samples the average color from the two regions so that the translated text matches the original (4).
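To make the color-matching idea in the last point more concrete, here is a minimal Python sketch of that kind of heuristic. It is only an illustration under my own assumptions, not Google's implementation: the function names, the midpoint threshold, and the luminance weights (the Rec. 709 coefficients, applied directly to 8-bit RGB values as an approximation) are all choices made for this example.

```python
from statistics import mean


def luminance(rgb):
    """Approximate brightness of an (R, G, B) pixel using Rec. 709 weights."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def average_color(pixels):
    """Per-channel average of a list of (R, G, B) pixels, or None if empty."""
    if not pixels:
        return None
    return tuple(round(mean(channel)) for channel in zip(*pixels))


def split_text_and_background(pixels):
    """Return (text_color, background_color) for a patch of pixels.

    Hypothetical heuristic: split pixels at the midpoint between the darkest
    and brightest luminance, then treat the larger group as background
    (assumed to cover most of the patch) and the smaller one as text.
    """
    lums = [luminance(p) for p in pixels]
    threshold = (min(lums) + max(lums)) / 2
    dark = [p for p, lum in zip(pixels, lums) if lum < threshold]
    light = [p for p, lum in zip(pixels, lums) if lum >= threshold]
    if len(dark) > len(light):
        background, text = dark, light
    else:
        background, text = light, dark
    return average_color(text), average_color(background)


# A tiny patch: mostly near-white pixels with a few near-black "text" pixels.
patch = [(250, 250, 250)] * 8 + [(10, 10, 10)] * 2
text_color, background_color = split_text_and_background(patch)
print("text:", text_color, "background:", background_color)
```

With the two average colors in hand, a renderer could paint the background color over the original text region and draw the translation in the text color. A real system would work on image regions rather than a flat list of pixels and would need to handle gradients and noisy photos.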

The topic is very interesting and deserves a deeper look. The resources below explain it in more depth.

For further reading:

1: https://analyticsindiamag.com/these-machine-learning-techniques-make-google-lens-a-success/

2: https://ai.googleblog.com/2019/09/giving-lens-new-reading-capabilities-in.html#:~:text=Lens%20uses%20Google%20Translate's%20neural,context%20of%20the%20original%20text.

3: Wu, Yonghui, Schuster, Mike, Chen, Zhifeng, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.

4: https://analyticsindiamag.com/google-lens-ai-factor/

Written by Khalida Douibi

Sn. Data Scientist. PhD. Biomedical Informatics, Machine learning
