Data Flow Facilitator for Machine Learning (dffml) v0.4.0 Release Notes
Release Date: 2021-02-18
- 🆕 New model for Anomaly Detection
- Ability to specify maximum number of contexts running at a time
- CLI and Python example usage of Custom Neural Network
- 💅 PyTorch loss function entrypoint style loading
- 👍 Custom Neural Network, last layer support for pre-trained models
- Example usage of sklearn operations
- Example Flower17 species image classification
- Config loading ability from the CLI using "@" before the filename
- 📄 Docstrings and doctestable example for DataFlowPreprocessSource
- XGBoost Regression Model
- Pre-Trained PyTorch torchvision Models
- Spacy model for NER
- Ability to rename outputs using GetSingle
- Tutorial for using NLP operations with models
- 🔌 Operations plugin for NLP wrapping spacy and scikit functions
- 👌 Support for default value in a Definition
- Source for reading images in directories
- 🔌 Operations plugin for image preprocessing
- daal4py based linear regression model
- DataFlowPreprocessSource can take a config file as dataflow via the CLI.
- 👌 Support for link on conditions in dataflow diagrams
- `edit all` command to edit records in bulk
- 👌 Support for Tensorflow 2.2
- Vowpal Wabbit Models
- 👍 Python 3.8 support
- binsec branch to
- ✅ Doctestable example for
- ✅ Doctestable examples to
- shouldi got an operation to run Dependency-check on Java code.
- `run` functions in high level API
- ✅ Doctestable examples to
- 📜 Source for parsing
- ✅ Tests for noasync high level API.
- ✅ Tests for load and save functions in high level API.
- `Operation` inputs and outputs default to empty `dict` if not given.
- Ability to export any object with `dffml service dev export`
- Complete example for dataflow run cli command
- ✅ Tests for default configs instantiation.
- Example ffmpeg operation.
- 🚀 Operations to deploy docker container on receiving github webhook.
- 🆕 New use case `Redeploying dataflow on webhook` in docs.
- 📚 Documentation for creating Source for new File types taking `.ini` as an example.
- 🆕 New input modes, output modes for HTTP API dataflow registration.
- Usage example for tfhub text classifier.
- `AssociateDefinition` output operation to map definition names to values produced as a result of passing Inputs with those definitions to operations.
- DataFlows now have a syntax for providing a set of definitions that will override the operations default definition for a given input.
- Source which modifies record features as they are read from another source. Useful for modifying datasets as they are used with ML commands or editing in bulk.
- Auto create Definition for the `op` when they might have a spec, subspec.
- `shouldi use` command which detects the language of the codebase given via path to directory or Git repo URL and runs the appropriate static analyzers.
- 👌 Support for entrypoint style loading of operations and seed inputs in
- Definition for output of the function that
- 🔦 Expose high level load, run and save functions to noasync.
- Operation to verify secret for GitHub webhook.
- Option to modify flow and add config in
- Ability to use a function as a data source via the
- 👉 Make every model's directory property required
- 🆕 New model AutoClassifierModel based on
- 🆕 New model AutoSklearnRegressorModel based on
- Example showing usage of locks in dataflow.
- `service dev install` command to let users not install certain core plugins
- HTTP service got a `-redirect` flag which allows for URL redirection via an HTTP 307 response
- 👌 Support for immediate response in HTTP service
- Daal4py example usage.
- Gitter chatbot tutorial.
- Option to run dataflow without sources from cli.
- ✅ Sphinx extension for automated testing of tutorials (consoletest)
- Example of software portal using DataFlows and HTTP service
- Retry parameter to `Operation`. Allows for setting the number of times an operation should be retried before its exception is raised.

### 🔄 Changed
- 👀 Renamed
- 📇 Renamed configloader/png to configloader/image and added support for loading JPEG and TIFF file formats
- Update record `__str__` method to output in tabular format
- ⚡️ Update MNIST use case to normalize image arrays.
- `arg_` notation replaced with `CONFIG = ExampleConfig` style syntax for parsing all command line arguments.
- 🚚 Moved usage/io.rst to docs/tutorials/dataflows/io.rst
- `edit` command substituted with
- `Edit on Github` button now hidden for plugins.
- ✅ Doctests now run via unittests
- Every class and function can now be imported from the top level module
- `op` attempts to create `Definition`s for each argument if `inputs` are not given.
- 0️⃣ Classes now use `CONFIG` if it has a default for every field and
- Models now dynamically import third party modules.
- Memory dataflow classes now use auto args and config infrastructure
- `dffml list records` command prints Records as JSON using
- 🔋 Feature class in `dffml/feature/feature.py` initializes a feature object
- All DefFeatures() functions are substituted with Features()
- All feature.type() and feature.length() are substituted with feature.type and feature.length
- FileSource takes pathlib.Path as filename
- ✅ Tensorflow tests re-run themselves up to 6 times to stop them from failing the CI due to their randomly initialized weights making them fail ~2% of the time
- 💅 Any plugin can now be loaded via its entrypoint style path
- `with_features` now raises a helpful error message if no records with matching features were found
- Split out model tutorial into writing the model, and another tutorial for packaging the model.
- ✅ IntegrationCLITestCase creates a new directory and chdir into it for each test
- ✅ Automated testing of Automating Classification tutorial
- `dffml version` command now prints git repo hash and if the repo is dirty

### 🛠 Fixed
- `export_value` now converts numpy array to JSON serializable datatype
- CSV source overwriting configloaded data to every row
- Race condition in `MemoryRedundancyChecker` when there are more than 4 possible parameter sets for an operation.
- 📜 Typing of config values for numpy parsed docstrings where type should be tuple or list
- Model predict methods now use `SourcesContext.with_features`

### ✂ Removed
- ✅ Monitor class and associated tests (unused)
- DefinedFeature class in
- DefFeature function in
- load_def function in Feature class in
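
The retry parameter added to `Operation` in this release is described only briefly, so the following is a minimal sketch of the general idea, not dffml's implementation. The helper `run_with_retry`, the `flaky` coroutine, and the `calls` counter are hypothetical names introduced for illustration, and it is an assumption here that `retry=N` means N additional attempts after the first failure.

```python
import asyncio


async def run_with_retry(func, *args, retry=0):
    """Call ``func`` and retry up to ``retry`` more times before re-raising.

    Hypothetical helper: assumes retry=N means N extra attempts.
    """
    attempts = retry + 1
    for attempt in range(1, attempts + 1):
        try:
            return await func(*args)
        except Exception:
            # Out of retries: let the last exception propagate to the caller
            if attempt == attempts:
                raise


calls = {"count": 0}  # hypothetical counter, only used for this demo


async def flaky(value):
    """Hypothetical operation that fails on its first two calls."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return value * 2


# Two retries are enough for the third call to succeed
result = asyncio.run(run_with_retry(flaky, 21, retry=2))
print(result, calls["count"])
```

With `retry=0` the first exception would propagate immediately, which matches the described behavior of raising only once the allowed retries are exhausted.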
Previous changes from v0.3.7
[0.3.7] - 2020-04-14
- IO operations demo and
- Python prompts `>>>` can now be enabled or disabled for easy copying of code into interactive sessions.
- Whitespace check now checks .rst and .md files too.
- `GetMulti` operation which gets all Inputs of a given definition
- ✅ Python usage example for LogisticRegression and its related tests.
- 👌 Support for async generator operations
- Example CLI commands and Python code for
- `save` function in high level API to quickly save all given records to a
- 🔧 Ability to configure sources and models for HTTP API from command line when
- 📚 Documentation page for command line usage of HTTP API
- Usage of HTTP API to the quickstart to use trained model
- 🔌 Renamed
- CSV source sorts feature names within headers when saving
- 🚚 Moved HTTP service testing code to HTTP service
- 🔌 Exporting plugins
- 📜 Issue parsing string values when using the `dataflow run` command and specifying extra inputs.
- Unused imports
- IO operations demo and