Description

Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries.

Programming language: Python
License: Apache License 2.0
Latest version: v0.9.0


README

Bytewax is an open source Python framework for building highly scalable dataflows in a streaming or batch context.

Get started

Check out our getting started guide.

About

Bytewax lets you build Python-based dataflows to process your data for augmentation, advanced analysis, machine learning and more. It is based on Timely Dataflow, which is a cyclic dataflow computational model.

At a high level, dataflow programming is a paradigm in which program execution is conceptualized as data flowing through a series of operator-based steps. Operators are the processing primitives of Bytewax. Each operator gives you a “shape” of data transformation, and you pass it a function that customizes it for the specific task you need. We call the combination of an operator and its custom logic function a dataflow step. You chain steps together in a dataflow to solve your high-level data processing problem.
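
To make the operator and step vocabulary concrete, here is a minimal plain-Python sketch. It does not use Bytewax: `MiniFlow` and its methods are hypothetical names invented for this illustration, and the real operators are documented in the User Guide. Each method plays the role of an operator (the “shape”), the function passed to it is the custom logic, and each call adds one step to the chain.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class MiniFlow:
    """Hypothetical toy dataflow for illustration only; not the Bytewax API."""

    def __init__(self) -> None:
        # Each step is an (operator name, logic function) pair.
        self.steps: List[Tuple[str, Callable[[Any], Any]]] = []

    def map(self, fn: Callable[[Any], Any]) -> "MiniFlow":
        # "map" operator: transform every item with fn.
        self.steps.append(("map", fn))
        return self

    def filter(self, pred: Callable[[Any], bool]) -> "MiniFlow":
        # "filter" operator: keep only items for which pred is true.
        self.steps.append(("filter", pred))
        return self

    def run(self, items: Iterable[Any]) -> Iterator[Any]:
        # Push each item through every step, in the order the steps were added.
        for item in items:
            keep = True
            for op, fn in self.steps:
                if op == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                yield item


# Two steps: square each item, then keep the even results.
flow = MiniFlow().map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = list(flow.run(range(6)))
print(result)  # [0, 4, 16]
```

The sketch processes items one at a time in a single thread; it only shows how steps are composed, not how Bytewax schedules or distributes them.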

At a high level, Bytewax provides a few major benefits:

  • The operators in Bytewax are largely “data-parallel”, meaning they can operate on independent parts of the data concurrently.
  • You can express higher-level control constructs, like iteration.
  • You can develop and run your code locally, and then scale that code to multiple workers or processes without changes.
  • Bytewax can be used in both a streaming and a batch context.
  • You can leverage the Python ecosystem directly.
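
The “data-parallel” idea in the list above can be sketched in plain Python. This is an illustration under stated assumptions, not how Bytewax is implemented: `parallel_map` and `square` are hypothetical helpers, and a standard-library thread pool stands in for Bytewax workers. The input is split into independent partitions, the same logic function runs on each partition, and the results are merged.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def square(x: int) -> int:
    return x * x


def parallel_map(
    fn: Callable[[int], int], items: Sequence[int], workers: int = 3
) -> List[int]:
    """Hypothetical helper: apply fn to independent partitions concurrently."""
    # Round-robin split: partition i gets items i, i + workers, i + 2 * workers, ...
    partitions = [list(items[i::workers]) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(lambda part: [fn(x) for x in part], partitions))
    # Merging partitions loses the input order, so sort the combined output.
    return sorted(y for part in mapped for y in part)


squares = parallel_map(square, list(range(10)))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# The logic function is unchanged when the worker count changes.
assert parallel_map(square, list(range(10)), workers=1) == squares
```

Because `square` needs no information from other items, the partitions never have to coordinate, which is what makes the step data-parallel. Threads in CPython do not speed up CPU-bound work; the pool here only demonstrates the partitioning pattern.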

Community

Slack is the main forum for communication and discussion.

GitHub Issues are reserved for actual issues. Please use the Slack community for discussions.

Code of Conduct

Usage

Install the latest release with pip:

pip install bytewax

Example

Here is an example of a simple dataflow program using Bytewax:

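from bytewax import Dataflow, run


# Build the dataflow; each method call below adds one step.
flow = Dataflow()
# Square every item that flows through.
flow.map(lambda x: x * x)
# Collect the output of the previous step so run() can return it.
flow.capture()


if __name__ == "__main__":
    # The input is a sequence of (epoch, item) pairs; the captured output
    # comes back as (epoch, item) pairs and is sorted by epoch for printing.
    for epoch, x in sorted(run(flow, enumerate(range(10)))):
        print(x)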

Running the program prints the following output:

0
1
4
9
16
25
36
49
64
81

For a more complete example, and documentation on the available operators, check out the User Guide.

For an exhaustive list of examples, check out the /examples folder.

License

Bytewax is licensed under the Apache-2.0 license.

Contributing

Contributions are welcome! This community and project would not be what it is without the contributors. All contributions, from bug reports to new features, are welcome and encouraged.

With ❤️ Bytewax

