India/Kerala

Fuzzify

April 9, 2024
Public databases often break down on names because spelling is rarely stable. A single person might be entered as Laxmi, Lakshmi, Laxhmi, or Lackshmy, so an exact-match search quietly misses valid records. Fuzzify was built to make those searches tolerant, pronunciation-aware, and fast enough for field conditions where speed matters.
Instead of comparing raw spellings, Fuzzify converts a name into likely pronunciation variants and then searches for the closest phonetic match. That lets the system treat different spellings as related signals instead of unrelated strings. A search runs in five steps:
  1. A user enters the name they want to search.
  2. A fine-tuned Llama 3.2 1B model predicts likely pronunciations in IPA form.
  3. A custom embedder turns those pronunciation forms into vectors.
  4. The vectors are stored and queried in Chroma DB.
  5. Cosine similarity surfaces the closest matches across spelling variants.
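The five steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Fuzzify's actual code: `predict_pronunciations` is a hypothetical rule-based stand-in for the fine-tuned Llama 3.2 1B model (it emits rough phonetic strings, not model-predicted IPA), `embed` uses character bigram counts in place of the custom embedder, and an in-memory list stands in for Chroma DB.

```python
import math
from collections import Counter

# Hypothetical stand-in for the fine-tuned model: a few rewrite rules that
# turn a spelling into a rough phonetic string. Real IPA prediction is far richer.
RULES = [("ck", "k"), ("x", "ks"), ("sh", "ʃ"), ("y", "i")]


def predict_pronunciations(name):
    """Return likely pronunciation variants for a name (step 2)."""
    base = name.lower()
    for old, new in RULES:
        base = base.replace(old, new)
    alt = base.replace("ks", "kʃ")  # "x" is often pronounced like "ksh" here
    return list(dict.fromkeys([base, alt]))  # de-duplicate, keep order


def embed(pronunciation):
    """Turn a pronunciation into a sparse bigram-count vector (step 3)."""
    padded = f"^{pronunciation}$"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (step 5)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(names):
    """Store one vector per pronunciation variant (in-memory stand-in for step 4)."""
    return [(name, embed(p)) for name in names for p in predict_pronunciations(name)]


def search(index, query, k=3):
    """Return the k records whose best variant is closest to the query."""
    query_vecs = [embed(p) for p in predict_pronunciations(query)]
    best = {}
    for name, vec in index:
        score = max(cosine(q, vec) for q in query_vecs)
        best[name] = max(score, best.get(name, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:k]


index = build_index(["Lakshmi", "Laxhmi", "Lackshmy", "Ramesh", "Suresh"])
results = search(index, "Laxmi")
for name, score in results:
    print(f"{name}: {score:.2f}")
```

In this toy index, a query for "Laxmi" ranks the three spelling variants above the unrelated names, because every variant collapses to the same phonetic string before it is embedded. Keeping one vector per pronunciation variant, and scoring a record by its best variant, is one simple way to let a single record match several plausible pronunciations.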
Matching on sound rather than spelling has several practical benefits:
  • It improves recall for transliterated and inconsistently spelled names.
  • It keeps the search lightweight enough for practical deployment.
  • It moves the matching logic closer to how names sound, not just how they are typed.
  • It reduces the friction of searching messy public datasets under time pressure.
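The recall claim can be made concrete with a toy comparison. The sketch below is illustrative only: the records are made-up examples built from the spellings mentioned earlier, and `sound_key` is a hypothetical, crude normalizer standing in for pronunciation-based matching. Recall here means the fraction of relevant records that a search actually returns.

```python
def exact_match(query, records):
    """Baseline: return only records spelled exactly like the query."""
    return [r for r in records if r == query]


def sound_key(name):
    """Hypothetical crude phonetic key; a stand-in for pronunciation matching."""
    key = name.lower()
    for old, new in [("ck", "k"), ("x", "ks"), ("sh", "s"), ("y", "i")]:
        key = key.replace(old, new)
    return key


def sound_match(query, records):
    """Return records whose phonetic key equals the query's key."""
    key = sound_key(query)
    return [r for r in records if sound_key(r) == key]


def recall(found, relevant):
    """Fraction of relevant records that were retrieved."""
    return len(set(found) & set(relevant)) / len(relevant)


records = ["Laxmi", "Lakshmi", "Laxhmi", "Lackshmy", "Ramesh"]
relevant = records[:4]  # the four spellings of the same name

exact_recall = recall(exact_match("Laxmi", records), relevant)
sound_recall = recall(sound_match("Laxmi", records), relevant)
print(exact_recall, sound_recall)  # 0.25 1.0
```

On this toy set, exact matching finds one of the four relevant records, while the sound-based key finds all four without pulling in the unrelated name. Real datasets would need a proper evaluation set, and the numbers here say nothing about Fuzzify's measured performance.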
Fuzzify was designed as a complete product flow, not just a model experiment.
  • Flutter handled the mobile-first search experience.
  • FastAPI served the retrieval pipeline.
  • Chroma DB stored the phonetic vectors.
  • Unsloth was used to fine-tune the Llama 3.2 1B model.
Built for the Smart India Hackathon (SIH) Grand Finale 2024, Fuzzify demonstrates how a compact LLM pipeline can improve name lookup in public-sector workflows where data quality is uneven but search accuracy is critical.