
Description

Jellyfish is a Python library for approximate and phonetic matching of strings.

Written by James Turk and Michael Stephens.

See https://github.com/jamesturk/jellyfish/graphs/contributors for contributors.

Programming language: Python
License: BSD 2-clause "Simplified" License
Tags: Text Processing, Data Analysis, Linguistic, Diff


README

Overview

jellyfish is a library for approximate & phonetic matching of strings.

Source: https://github.com/jamesturk/jellyfish

Documentation: https://jamesturk.github.io/jellyfish/

Issues: https://github.com/jamesturk/jellyfish/issues


Included Algorithms

String comparison:

  • Levenshtein Distance
  • Damerau-Levenshtein Distance
  • Jaro Distance
  • Jaro-Winkler Distance
  • Match Rating Approach Comparison
  • Hamming Distance
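
To make the first entry in the list above concrete, here is a minimal pure-Python sketch of Levenshtein distance using the standard two-row dynamic-programming approach. The name `levenshtein_sketch` is hypothetical and chosen for illustration; this is not jellyfish's own implementation, only a sketch of the quantity that a Levenshtein distance function computes.

```python
def levenshtein_sketch(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn `a` into `b`.

    Hypothetical helper for illustration; not part of jellyfish.
    """
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute, or keep a match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein_sketch("jellyfish", "smellyfish"))  # 2
print(levenshtein_sketch("jellyfish", "jellyfihs"))   # 2
```

Plain Levenshtein counts a swap of two adjacent characters as two edits, which is why the second call gives 2. Damerau-Levenshtein treats an adjacent transposition as a single edit.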

Phonetic encoding:

  • American Soundex
  • Metaphone
  • NYSIIS (New York State Identification and Intelligence System)
  • Match Rating Codex
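
As an illustration of what a phonetic encoding does, the following is a minimal sketch of the classic American Soundex rules, assuming ASCII input. The names `SOUNDEX_CODES` and `soundex_sketch` are hypothetical; this is not the library's implementation, and edge cases may be handled differently there.

```python
# Consonant groups of American Soundex. Vowels, Y, H and W carry no digit.
SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex_sketch(name: str) -> str:
    """Return a four-character Soundex code (hypothetical helper)."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""

    code = letters[0]  # the first letter is kept as-is
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        # Emit a digit only when it differs from the previous coded sound.
        if digit and digit != prev:
            code += digit
        # H and W do not separate equal codes; vowels do.
        if ch not in "HW":
            prev = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")  # pad short codes with zeros


print(soundex_sketch("Jellyfish"))  # J412
print(soundex_sketch("Robert"))     # R163
print(soundex_sketch("Rupert"))     # R163
```

Names that sound alike, such as Robert and Rupert, collapse to the same code, which is what makes phonetic encodings useful for matching misspelled or variant names.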

Example Usage

>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> # Older jellyfish releases named the next function jaro_distance.
>>> jellyfish.jaro_similarity('jellyfish', 'smellyfish')
0.8962962962962964
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1

>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'