MedImageInsight: Open-Source Medical Image Embedding Model - now on HuggingFace

Aleksander Obuchowski - Nov 4 - Dev Community

TLDR: check out the model at https://huggingface.co/lion-ai/MedImageInsights

Making Medical Image AI Actually Accessible

If you've ever tried implementing a medical imaging model, you know the drill: promising papers, complicated setup processes, and documentation that assumes you are a certified Azure developer. When I came across the MedImageInsight model, it was the same story - great potential buried under layers of enterprise infrastructure.

It took me 6 hours to access the model - first I had to register on Azure, set up a payment organization, and so on. Then, although the code repository was there, there was no option to clone or download it, so I had to manually copy the contents of each file! Not to mention that the model was shared as an MLflow artifact, so it came wrapped in a ton of unnecessary code. But since the model is released under an MIT license, I decided to share my custom implementation on HuggingFace so you don't have to go through the same hell I did.

What's MedImageInsight Anyway?

At its core, MedImageInsight is a dual-encoder model (think CLIP, but for medical images) that can:

  • Convert medical images into meaningful embeddings
  • Match images with text descriptions
  • Perform zero-shot classification on medical conditions
  • Handle multiple medical imaging modalities (X-rays, CT scans, etc.)

The model was trained on a massive dataset of medical images and their descriptions, learning to create a shared embedding space for both images and text. This means you can throw new medical conditions at it without retraining, and it'll do a decent job at identifying them.
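
To make the shared embedding space concrete, here is a minimal, self-contained sketch of how image-text matching works once you have the vectors. The embeddings below are random placeholders and the 512-dimensional size is an assumption for illustration; in practice both would come from the model's image and text encoders (see example.py in the repo).

import numpy as np

rng = np.random.default_rng(0)
dim = 512  # embedding size is an assumption for illustration

image_embedding = rng.normal(size=dim)        # would come from the image encoder
text_embeddings = rng.normal(size=(3, dim))   # would come from the text encoder
labels = ["pneumonia", "pleural effusion", "no finding"]

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Cosine similarity in the shared space: the best-matching description
# is simply the text vector closest to the image vector.
scores = normalize(text_embeddings) @ normalize(image_embedding)
print(labels[int(np.argmax(scores))])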

Why Another Implementation?

The original implementation required:

  • An Azure account
  • MLflow setup
  • Multiple enterprise-level configurations
  • Dealing with undocumented dependencies
  • Coffee. Lots of coffee.

After spending way too much time setting it up, I decided to strip it down to its essentials. No shade to the original authors - they created an amazing model. But not everyone needs enterprise-grade MLflow pipelines to run a few predictions.

How It Actually Works

At its heart, MedImageInsight uses a technique called contrastive learning to create a shared understanding between medical images and their descriptions. Think of it as teaching the model to speak two languages fluently: the language of images and the language of medical terminology.
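
For the curious, here is a rough sketch of what CLIP-style contrastive training looks like. This is an illustration of the general recipe, not the authors' actual training code, and it assumes PyTorch and a batch of already-encoded image/text pairs.

import torch
import torch.nn.functional as F

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Pull matching image/text pairs together, push mismatched pairs apart.
    image_embs = F.normalize(image_embs, dim=-1)
    text_embs = F.normalize(text_embs, dim=-1)

    # Similarity of every image in the batch to every text in the batch
    logits = image_embs @ text_embs.t() / temperature

    # The i-th image should match the i-th text, and vice versa
    targets = torch.arange(logits.size(0))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy batch of 8 already-encoded image/text pairs
loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())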

The Power of Zero-Shot Learning

Traditional machine learning models are like students who can only answer questions they've seen before. Zero-shot learning models, on the other hand, are like students who can apply their knowledge to entirely new situations.

MedImageInsight achieves this through a clever architectural design:

  1. One part of the model learns to understand medical images
  2. Another part learns to understand medical terminology
  3. Both parts are trained to translate their understanding into the same "language" (a shared vector space)

This means if you show the model a chest X-ray and ask "Is there pneumonia?", it doesn't need to have seen pneumonia examples during training. Instead, it understands both what pneumonia means textually and what to look for in the image.
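
Here is a hedged sketch of how that zero-shot step plays out in practice: the candidate conditions are written out as text prompts and scored against the image in the shared space. The encode_image/encode_text calls are placeholders for the repo's actual encoders, and the random stand-in vectors just keep the snippet runnable.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

labels = ["pneumonia", "cardiomegaly", "no acute findings"]
prompts = [f"chest x-ray showing {label}" for label in labels]

# image_vec = encode_image("chest_xray.png")   # placeholder for the real encoder
# text_vecs = encode_text(prompts)             # placeholder for the real encoder
rng = np.random.default_rng(1)                 # stand-in vectors so the sketch runs
image_vec = rng.normal(size=256)
text_vecs = rng.normal(size=(3, 256))

image_vec = image_vec / np.linalg.norm(image_vec)
text_vecs = text_vecs / np.linalg.norm(text_vecs, axis=1, keepdims=True)

# Similarities become pseudo-probabilities over conditions the model never
# saw as explicit classes - that is the "zero-shot" part.
probs = softmax(text_vecs @ image_vec / 0.07)
for label, p in zip(labels, probs):
    print(f"{label}: {p:.2f}")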

Getting Started

  1. Clone the repo:
git clone https://huggingface.co/lion-ai/MedImageInsights
  2. Install dependencies (we use uv because it's fast and deterministic):
uv sync
  3. Run the example:
uv run example.py

That's it. No Azure setup, no MLflow, no enterprise infrastructure required.

What's Next?

We're working on:

  • Better explainability (what is the model actually looking at?)
  • Compatibility with HuggingFace's transformers library
  • More example notebooks for specific use cases
  • Performance optimizations

Contributing

Found a bug? Have an improvement in mind? The repository is actually open source (imagine that!), and we welcome contributions.

Resources

  • Model on HuggingFace: https://huggingface.co/lion-ai/MedImageInsights