Popularity: 9.9 (Stable)

Activity: 10.0 (Growing)

Stars: 41,863

Watchers: 1,116

Forks: 17,292

Last Commit: 2 days ago

Code Quality Rank: L2

Programming language: Python

License: BSD 3-clause "New" or "Revised" License

Tags: Science And Data Analysis Scientific Engineering

Latest version: v1.3.0.dev0

Pandas alternatives and similar packages

Based on the "Science and Data Analysis" category.

NumPy

9.8 10.0 L1 Pandas VS NumPy

The fundamental package for scientific computing with Python.
SymPy

9.4 10.0 L2 Pandas VS SymPy

A computer algebra system written in pure Python

SciPy

9.4 9.9 L2 Pandas VS SciPy

SciPy library main repository
NetworkX

9.4 9.6 L3 Pandas VS NetworkX

Network Analysis in Python
Dask

9.2 9.7 L2 Pandas VS Dask

Parallel computing with task scheduling
statsmodels

9.2 9.4 L3 Pandas VS statsmodels

Statsmodels: statistical modeling and econometrics in Python
Numba

8.9 9.9 L3 Pandas VS Numba

NumPy aware dynamic Python compiler using LLVM
PyMC

8.9 9.4 L4 Pandas VS PyMC

Bayesian Modeling and Probabilistic Programming in Python
PyGWalker

8.7 9.6 Pandas VS PyGWalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
Biopython

8.3 9.6 L2 Pandas VS Biopython

Official git repository for Biopython (originally converted from CVS)
astropy

8.2 9.9 L2 Pandas VS astropy

Astronomy and astrophysics core library
orange

8.1 9.6 L2 Pandas VS orange

Orange: Interactive data analysis
Interactive Parallel Computing with IPython

7.4 8.3 L3 Pandas VS Interactive Parallel Computing with IPython

IPython Parallel: Interactive Parallel Computing in Python
blaze

7.3 0.0 L4 Pandas VS blaze

NumPy and Pandas interface to Big Data
RDKit

7.1 9.5 L1 Pandas VS RDKit

The official sources for the RDKit library
Cubes

5.9 0.0 L3 Pandas VS Cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
Open Mining

5.7 0.0 L3 Pandas VS Open Mining

Business Intelligence (BI) in Python, OLAP
Fugue

5.6 6.7 Pandas VS Fugue

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
bcbio-nextgen

5.4 6.7 L3 Pandas VS bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
NIPY

5.4 8.4 L3 Pandas VS NIPY

Workflows and interfaces for neuroimaging packages
bcolz

4.7 0.0 Pandas VS bcolz

A columnar data container that can be compressed.
bccb

4.5 4.4 L4 Pandas VS bccb

Incubator for useful bioinformatics code, primarily in Python and R
Neupy

4.4 0.0 L5 Pandas VS Neupy

NeuPy is a Tensorflow based python library for prototyping and building neural networks
Bubbles

3.6 0.0 L5 Pandas VS Bubbles

[NOT MAINTAINED] Bubbles – Python ETL framework
PyDy

3.5 3.9 L3 Pandas VS PyDy

Multibody dynamics tool kit.
harold

2.4 1.8 L2 Pandas VS harold

An open-source systems and controls toolbox for Python3
signac

2.4 8.4 Pandas VS signac

Manage large and heterogeneous data spaces on the file system.
LynxKite

2.1 6.4 Pandas VS LynxKite

The complete graph data science platform
PatZilla

2.0 5.4 Pandas VS PatZilla

PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Kotori

2.0 6.8 Pandas VS Kotori

A flexible data historian based on InfluxDB, Grafana, MQTT, and more. Free, open, simple.
Terkin

1.8 0.0 Pandas VS Terkin

Datalogger for MicroPython and CPython.
dask-memusage

0.9 0.0 Pandas VS dask-memusage

A low-impact profiler to figure out how much memory each task in Dask is using
cclib

0.9 Pandas VS cclib

A library for parsing and interpreting the results of computational chemistry packages.
ElasticBatch

0.8 0.0 Pandas VS ElasticBatch

Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames
Open Babel

- Pandas VS Open Babel

A chemical toolbox designed to speak the many languages of chemical data.

* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.

README

pandas: powerful Python data analysis toolkit

What is it?

pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way towards this goal.
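
As a rough illustration of what "labeled" data means here, the following minimal sketch builds the two core structures, Series and DataFrame, and selects values by label. The fruit names and numbers are made up for the example and do not come from the pandas documentation.

```python
import pandas as pd

# A Series is a one-dimensional array whose values carry labels (the index).
prices = pd.Series([1.5, 2.25, 3.0], index=["apple", "banana", "cherry"])

# A DataFrame is a two-dimensional table with labeled rows and columns.
# The example data below is hypothetical.
stock = pd.Series([10, 0, 7], index=["apple", "banana", "cherry"])
inventory = pd.DataFrame({"price": prices, "stock": stock})

# Values are looked up by label rather than by position.
banana_price = prices["banana"]

# Boolean conditions select rows; the row labels are preserved.
in_stock = inventory[inventory["stock"] > 0]
print(in_stock)
```

Because every value keeps its label, later operations such as filtering or arithmetic can refer to "banana" or "price" directly instead of tracking integer positions.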

Main Features

Here are just a few of the things that pandas does well:

Easy handling of missing data (represented as NaN, NA, or NaT) in floating point as well as non-floating point data
Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects
Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations
Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data
Easy conversion of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects
Intelligent label-based slicing, fancy indexing, and subsetting of large data sets
Intuitive merging and joining of data sets
Flexible reshaping and pivoting of data sets
Hierarchical labeling of axes (possible to have multiple labels per tick)
Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving/loading data from the ultrafast HDF5 format
Time series-specific functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging
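
Several of the features listed above can be sketched in a few lines. This is an illustrative walk-through, not an excerpt from the pandas documentation: the data, column names, and variable names are all invented for the example.

```python
import io

import numpy as np
import pandas as pd

# Missing data: NaN marks the gap, and fillna replaces it.
readings = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
filled = readings.fillna(0.0)

# Automatic alignment: arithmetic matches on labels, not positions.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])
aligned_sum = left + right  # "a" and "d" have no partner, so they are NaN

# IO: read delimited text into a DataFrame (an in-memory CSV here).
csv_text = (
    "region,quarter,revenue\n"
    "north,Q1,100\n"
    "south,Q1,80\n"
    "north,Q2,120\n"
    "south,Q2,90\n"
)
sales = pd.read_csv(io.StringIO(csv_text))

# Group by: split the rows by region, then sum revenue within each group.
revenue_by_region = sales.groupby("region")["revenue"].sum()

# Merging: attach a manager to each sales row via the shared "region" key.
managers = pd.DataFrame({"region": ["north", "south"], "manager": ["Ana", "Ben"]})
merged = sales.merge(managers, on="region", how="left")

# Reshaping: pivot the long table into one row per region, one column per quarter.
wide = sales.pivot(index="region", columns="quarter", values="revenue")

# Time series: generate a daily date range, then compute a moving
# window statistic and a one-day lag.
days = pd.date_range("2021-01-01", periods=6, freq="D")
daily = pd.Series(range(6), index=days)
rolling_mean = daily.rolling(window=3).mean()
lagged = daily.shift(1)
```

Each step returns a new object and leaves its input unchanged, which makes it straightforward to chain or inspect intermediate results.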

Where to get it

The source code is currently hosted on GitHub at: https://github.com/pandas-dev/pandas

Binary installers for the latest released version are available at the Python Package Index (PyPI) and on Conda.

# conda
conda install pandas

# or PyPI
pip install pandas
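
After either command finishes, one quick way to confirm the package is importable is to print its version from Python. This is a small sanity check offered as a suggestion, not a step from the official installation instructions.

```python
import pandas as pd

# __version__ reports the version of the pandas package that was imported.
installed_version = pd.__version__
print("pandas version:", installed_version)
```

For a fuller report that also lists the versions of dependencies, pandas provides `pd.show_versions()`.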

Dependencies

See the full installation instructions for minimum supported versions of required, recommended and optional dependencies.

Installation from sources

To install pandas from source you need Cython in addition to the normal dependencies above. Cython can be installed from PyPI:

pip install cython

In the pandas directory (same one where you found this file after cloning the git repo), execute:

python setup.py install

or for installing in development mode:

python -m pip install -e . --no-build-isolation --no-use-pep517

or alternatively

python setup.py develop

See the full instructions for installing from source.

License

BSD 3 (see the LICENSE file in the repository)

Documentation

The official documentation is hosted on PyData.org: https://pandas.pydata.org/pandas-docs/stable

Background

Work on pandas started at AQR (a quantitative hedge fund) in 2008 and has been under active development since then.

Getting Help

For usage questions, the best place to go is Stack Overflow. General questions and discussions can also take place on the pydata mailing list.

Discussion and Development

Most development discussions take place on GitHub in this repo. The pandas-dev mailing list can also be used for specialized discussions or design issues, and a Slack channel is available for quick development-related questions.

Contributing to pandas

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

A detailed overview on how to contribute can be found in the contributing guide.

If you are simply looking to start working with the pandas codebase, navigate to the GitHub "issues" tab and start looking through interesting issues. A number of issues are listed under the "Docs" and "good first issue" labels where you could start out.

You can also triage issues, which may include reproducing bug reports or asking for vital information such as version numbers or reproduction instructions. If you would like to start triaging issues, one easy way to get started is to subscribe to pandas on CodeTriage.

Or maybe, through using pandas, you have an idea of your own, or you are looking for something in the documentation and thinking "this can be improved". You can do something about it!

Feel free to ask questions on the mailing list or on Slack.

As contributors and maintainers of this project, you are expected to abide by pandas' code of conduct. More information can be found at: Contributor Code of Conduct

*Note that all licence references and agreements mentioned in the Pandas README section above are relevant to that project's source code only.

Pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more