One of the features I’m most impressed with in ChatGPT-4 is its OCR capabilities ⬇️
I inputted a picture of a Pokemon card, and it was able to:
⚫ read blurry text description
⚫ assess the quality of the card
⚫ recognize the Pokemon depicted
⚫ correctly count and interpret symbols
⚫ extract text and numbers regardless of its position
I’m surprised this isn’t talked about more because it makes many OCR API’s obsolete.
For example using Amazon Textract to achieve this same objective would require extra logic to scan for text above, left, right, and below a key.
It also doesn't handle symbols, synonyms, and abbreviations well.
What was many lines of code and error prone before is now replaced with just a few lines of code using OpenAI’s API.
If you're interested in building on these API's, I've linked to the GPT-4 Vision docs here: