Engineering packages

Showing projects tagged as Text Processing and Engineering

  • gensim

    9.5 8.6 L3 Python
    Topic Modelling for Humans
  • Pattern

    9.1 0.0 L2 Python
    Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
  • Stanza

    8.5 9.5 Python
    Official Stanford NLP Python Library for Many Human Languages
  • coala

    8.1 0.0 L4 Python
    coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
  • Chinese Character to Pinyin Conversion Tool (Python version)

    7.8 7.4 Python
    Converts Chinese characters to pinyin (pypinyin)
  • sumy

    7.3 6.9 L5 Python
    Module for automatic summarization of text documents and HTML pages.
  • TextDistance

    6.9 5.8 Python
    Compute distance between sequences. 30+ algorithms, pure Python implementation, common interface, optional use of external libraries.
  • pdftabextract

    6.5 1.6 L3 Python
    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
  • polyglot

    6.4 0.0 Python
    Multilingual text (NLP) processing toolkit
  • langid.py

    6.4 0.0 L3 Python
    Stand-alone language identification system
  • aeneas

    6.2 0.0 L3 Python
    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
  • quepy

    5.7 0.0 L5 Python
    A Python framework to transform natural language questions into queries in a database query language.
  • pymorphy2

    4.7 0.0 Python
    Morphological analyzer / inflection engine for Russian and Ukrainian languages.
  • trafilatura

    3.7 9.1 Python
    Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
  • Kotori

    1.7 7.2 Python
    A flexible data historian based on InfluxDB, Grafana, MQTT and more. Free, open, simple.
  • PatZilla

    1.7 6.7 Python
    PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
  • htmldate

    1.5 8.8 Python
    Fast and robust date extraction from web pages, with Python or on the command-line
  • pntl

    0.9 2.0 Python
    Practical Natural Language Processing Tools for Humans is built on top of Senna Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), named entity recognition (NER), semantic role labeling (SRL) and syntactic parsing (PSG) with skip-gram, all in Python, with more features to be added. The website given is for downloading the Senna tool.