Description
Jellyfish is a Python library for approximate and phonetic matching of strings.
Written by James Turk and Michael Stephens.
See https://github.com/jamesturk/jellyfish/graphs/contributors for contributors.
jellyfish alternatives and similar packages
Based on the "Text Processing" category.
- Lark: Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
- TextDistance: 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- msgspec: A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML.
- python-user-agents: A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.
- Levenshtein: The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity.
- pyparsing: DISCONTINUED. Python library for creating PEG parsers [Moved to: https://github.com/pyparsing/pyparsing]
- Construct: Declarative data structures for python that allow symmetric parsing and building.
- AnyAscii: Unicode to ASCII transliteration - C Elixir Go Java JS Julia PHP Python Ruby Rust Shell .NET
README
Overview
jellyfish is a library for approximate & phonetic matching of strings.
Source: https://github.com/jamesturk/jellyfish
Documentation: https://jamesturk.github.io/jellyfish/
Issues: https://github.com/jamesturk/jellyfish/issues
Included Algorithms
String comparison:
- Levenshtein Distance
- Damerau-Levenshtein Distance
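- Jaro Distance (a similarity score from 0 to 1, where 1 means identical strings)
- Jaro-Winkler Distance (Jaro with extra weight given to a shared prefix)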
- Match Rating Approach Comparison
- Hamming Distance
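To make the first entry in the list above concrete, here is a minimal pure-Python sketch of Levenshtein distance using the standard row-by-row dynamic programming formulation. The function name `levenshtein` is chosen for illustration and this is not jellyfish's own implementation; it only shows what the metric counts.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("jellyfish", "smellyfish"))  # 2
```

Only two rows of the table are kept in memory at a time, so the space cost grows with the length of the second string rather than with the product of both lengths.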
Phonetic encoding:
- American Soundex
- Metaphone
- NYSIIS (New York State Identification and Intelligence System)
- Match Rating Codex
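As an illustration of what a phonetic encoding does, the following is a rough sketch of the commonly described American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent letters with the same digit, and pad to four characters. The names `SOUNDEX_CODES` and `soundex_sketch` are hypothetical, only ASCII letters are handled, and the library's own implementation may treat edge cases differently.

```python
# Digit groups used by American Soundex; vowels, H, W and Y have no digit.
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex_sketch(name: str) -> str:
    """Return a four-character Soundex-style code for an ASCII name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same digit
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
            if len(code) == 4:
                break
        previous = digit  # a vowel resets this, so a repeat after it counts
    return code.ljust(4, "0")


print(soundex_sketch("Jellyfish"))  # J412
print(soundex_sketch("Robert"))     # R163
```

Names that sound alike tend to share a code (for example "Robert" and "Rupert" both give R163), which is what makes such encodings useful for matching misspelled names.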
Example Usage
>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> # jaro_similarity was named jaro_distance in older releases
>>> jellyfish.jaro_similarity('jellyfish', 'smellyfish')
0.8962962962962964
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1
>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'
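The Jaro score for 'jellyfish' and 'smellyfish' shown above can be reproduced by hand. The sketch below follows the textbook definition of Jaro similarity (matching characters within a sliding window, then counting transpositions); the function name `jaro` is illustrative and this is not the library's own code.

```python
def jaro(a: str, b: str) -> float:
    """Textbook Jaro similarity: 0.0 for no similarity, 1.0 for equal strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters count as matching only if they are at most `window` apart.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order are transpositions;
    # each swapped pair is counted once.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a)
            + matches / len(b)
            + (matches - transpositions) / matches) / 3


print(round(jaro("jellyfish", "smellyfish"), 4))  # 0.8963
```

Here eight characters match with no transpositions, so the score is (8/9 + 8/10 + 8/8) / 3, which is approximately 0.8963.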