Fuzzify

Name Matching Engine

Pronunciation-aware search that matches how names sound, not how they're spelled.

Built for public-sector lookup workflows where phonetic accuracy across languages matters more than exact string matches.

Llama 3.2

Chroma DB

Flutter

IPA Phonetics

Phonetic Search

IPA Variant Generation

Vector Similarity Ranking

Real-world Record Matching

Fine-tuned LLM Pipeline

April 9, 2024

What Fuzzify Solves

Public databases often break down on names because spelling is rarely stable. A single person might be entered as Laxmi, Lakshmi, Laxhmi, or Lackshmy, which means exact-match search quietly misses valid records. Fuzzify was built to make those searches tolerant, pronunciation-aware, and useful in real field conditions where speed matters.

Core Idea

Instead of comparing raw spellings, Fuzzify converts a name into likely pronunciation variants and then searches for the closest phonetic match. That lets the system treat different spellings as related signals instead of unrelated strings.
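To make the contrast concrete, here is a minimal sketch of spelling comparison versus pronunciation comparison. The IPA strings are hand-written approximations used only for illustration, not output from the Fuzzify model, and `difflib.SequenceMatcher` is swapped in as a simple string-similarity measure; the project itself ranks by vector similarity.

```python
from difflib import SequenceMatcher

# Hypothetical, hand-written IPA approximations (assumption: not model output).
APPROX_IPA = {
    "Laxmi": "ləkʃmi",
    "Lakshmi": "ləkʃmi",
    "Laxhmi": "ləkʃmi",
    "Lackshmy": "lækʃmi",
    "Ramesh": "rəmeːʃ",
}


def exact_match(a: str, b: str) -> bool:
    """Spelling-level comparison: any difference in letters is a miss."""
    return a == b


def phonetic_similarity(a: str, b: str) -> float:
    """Compare the approximate pronunciations rather than the spellings."""
    return SequenceMatcher(None, APPROX_IPA[a], APPROX_IPA[b]).ratio()


if __name__ == "__main__":
    for other in ("Lakshmi", "Lackshmy", "Ramesh"):
        print(other, exact_match("Laxmi", other), phonetic_similarity("Laxmi", other))
```

Under these assumed transcriptions, every spelling variant fails the exact match, while the pronunciation-level score stays high for the variants and drops for an unrelated name.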

How The Pipeline Works

A user enters the name they want to search.
A fine-tuned Llama 3.2 1B model predicts likely pronunciations in IPA form.
A custom embedder turns those pronunciation forms into vectors.
The vectors are stored and queried in Chroma DB.
Cosine similarity surfaces the closest matches across spelling variants.
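The steps above can be sketched end to end in plain Python. Everything here is an assumption made for illustration: `predict_ipa` is a lookup table standing in for the fine-tuned model, `embed` is a simple hashed character-bigram embedder rather than the project's custom one, and `PhoneticIndex` is an in-memory stand-in for the Chroma DB collection.

```python
import math
import zlib

DIM = 256  # assumed vector size for this sketch

# Hypothetical stand-in for the fine-tuned model: hand-written, approximate IPA.
IPA_VARIANTS = {
    "laxmi": ["ləkʃmi"],
    "lakshmi": ["ləkʃmi", "lʌkʃmiː"],
    "lackshmy": ["lækʃmi"],
    "ramesh": ["rəmeːʃ"],
    "suresh": ["sureːʃ"],
}


def predict_ipa(name):
    """Return likely IPA variants; unknown names fall back to the lowercased spelling."""
    key = name.lower()
    return IPA_VARIANTS.get(key, [key])


def embed(ipa):
    """Hash character bigrams of an IPA string into a unit-length vector."""
    padded = "^" + ipa + "$"  # boundary markers so word edges count as features
    vec = [0.0] * DIM
    for i in range(len(padded) - 1):
        bucket = zlib.crc32(padded[i:i + 2].encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity; inputs are already unit length, so this is a dot product."""
    return sum(x * y for x, y in zip(a, b))


class PhoneticIndex:
    """In-memory stand-in for the vector store."""

    def __init__(self):
        self._rows = []  # (record_id, vector), one row per pronunciation variant

    def add(self, record_id, name):
        for ipa in predict_ipa(name):
            self._rows.append((record_id, embed(ipa)))

    def query(self, name, k=3):
        """Rank records by their best-scoring variant against any query variant."""
        best = {}
        for ipa in predict_ipa(name):
            q = embed(ipa)
            for record_id, vec in self._rows:
                score = cosine(q, vec)
                if score > best.get(record_id, -1.0):
                    best[record_id] = score
        ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
        return ranked[:k]


if __name__ == "__main__":
    index = PhoneticIndex()
    for record_id, name in [("R1", "Laxmi"), ("R2", "Lakshmi"), ("R3", "Ramesh")]:
        index.add(record_id, name)
    print(index.query("Lackshmy", k=2))
```

Storing one vector per pronunciation variant and keeping the best score per record is one possible way to let a single record match through any of its plausible pronunciations; the source does not say how Fuzzify handles this.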

Why This Approach Matters

It improves recall for transliterated and inconsistently spelled names.
It keeps the search lightweight enough for practical deployment by relying on a compact 1B-parameter model and a vector lookup.
It moves the matching logic closer to how names sound, not just how they are typed.
It reduces the friction of searching messy public datasets under time pressure.

Product Shape

Fuzzify was designed as a complete product flow, not just a model experiment.

Flutter handled the mobile-first search experience.
FastAPI served the retrieval pipeline.
Chroma DB stored the phonetic vectors.
Unsloth was used to fine-tune the Llama 3.2 1B model.

Outcome

Built for the Smart India Hackathon (SIH) Grand Finale 2024, Fuzzify demonstrates how a compact LLM pipeline can improve name lookup in public-sector workflows where data quality is uneven but search accuracy is critical.