Data Flow Facilitator for Machine Learning (dffml) v0.3.0 Release Notes

Release Date: 2019-10-26
  • [0.3.0] - 2019-10-26

    Added

    • Real DataFlows, see operations tutorial and usage examples
    • Async helper concurrently accepts an optional nocancel keyword argument
      which, if set, is a set of tasks not to cancel when the concurrently
      execution loop completes.
    • FileSourceTest has a test_label method which checks that a FileSource knows
      how to properly load and save repos under a given label.
    • Test case for Merge CLI command
    • Repo.feature method to select a single piece of feature data within a repo.
    • Dev service to help with hacking on DFFML and to create models from templates
      in the skel/ directory.
    • Classification type parameter to DNNClassifierModelConfig to specify the
      data type of the given classification options.
    • util.cli CMD classes have their argparse description set to their docstring.
    • util.cli CMD classes can specify the formatter class used in
      argparse.ArgumentParser via the CLI_FORMATTER_CLASS property.
    • Skeleton for service creation was added
    • Simple Linear Regression model from scratch
    • Scikit Linear Regression model
    • Community link in CONTRIBUTING.md.
    • Explained three main parts of DFFML on docs homepage
    • Documentation on how to use ML models on docs Models plugin page.
    • Mailing list info
    • Issue template for questions
    • Multiple Scikit Models with dynamic config
    • Entrypoint listing command to development service to aid in debugging issues
      with entrypoints.
    • HTTP API service to enable interacting with DFFML over HTTP. Currently
      includes APIs for configuring and using Sources and Models.
    • MySQL protocol source to work with data from a MySQL protocol compatible db
    • shouldi example got a bandit operation which tells users not to install if
      there are more than 5 issues of high severity and confidence.
    • dev service got the ability to run a single operation in a standalone fashion.
    • About page to docs.
    • Tensorflow DNNEstimator based regression model.
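
The two util.cli CMD items above (description taken from the docstring, formatter chosen via CLI_FORMATTER_CLASS) can be illustrated with plain argparse. This is only a sketch under assumptions: the CMD base class and its parser() classmethod below are hypothetical stand-ins, not DFFML's actual implementation. Only the CLI_FORMATTER_CLASS property name comes from the notes.

```python
import argparse


class CMD:
    """Hypothetical stand-in for a util.cli CMD base class."""

    # Formatter class handed to argparse.ArgumentParser
    CLI_FORMATTER_CLASS = argparse.HelpFormatter

    @classmethod
    def parser(cls):
        # Hypothetical helper: the class docstring becomes the description
        return argparse.ArgumentParser(
            prog=cls.__name__.lower(),
            description=cls.__doc__,
            formatter_class=cls.CLI_FORMATTER_CLASS,
        )


class Merge(CMD):
    """Merge repos from one source
    into another source."""

    # Keep the docstring's line breaks in the help output
    CLI_FORMATTER_CLASS = argparse.RawDescriptionHelpFormatter


merge_parser = Merge.parser()
```

Calling merge_parser.print_help() would then show the docstring as the command description, laid out by whichever formatter class the subclass selected.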

    Changed

    • feature/codesec became its own branch, binsec
    • BaseOrchestratorContext run_operations strict now defaults to true. With
      strict set to true, errors are raised and not just logged.
    • MemoryInputNetworkContext got an sadd method which is shorthand for creating
      a MemoryInputSet with a StringInputSetContext.
    • MemoryOrchestrator basic_config method takes list of operations and optional
      config for them.
    • shouldi example uses updated MemoryOrchestrator.basic_config method and
      includes more explanation in comments.
    • CSVSource allows for setting the Repo's src_url from a csv column
    • util Entrypoint defines a new class for each loaded class and sets the
      ENTRY_POINT_LABEL parameter within the newly defined class.
    • Tensorflow model removed usages of repo.classifications methods.
    • Entrypoint prints traceback of loaded classes to standard error if they fail
      to load.
    • Updated Tensorflow model README.md to match functionality of
      DNNClassifierModel.
    • DNNClassifierModel no longer splits data for the user.
    • Update pip in Dockerfile.
    • Restructured documentation
    • Ran black on whole codebase, including all submodules
    • CI style check now checks whole codebase
    • Merged HACKING.md into CONTRIBUTING.md
    • shouldi example runs bandit now in addition to safety
    • The way safety gets called
    • Switched documentation to Read The Docs theme
    • Models yield only a repo object, rather than a repo together with the value
      and confidence of the prediction. Models are now responsible for calling
      the predicted method on the repo. This will ease the process of making
      predict feature specific.
    • Updated Tensorflow model README.md to include usage of regression model
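
The strict change noted above (errors raised rather than only logged) can be pictured with a small sketch. The run_operations function here is a hypothetical, synchronous illustration of the behaviour described. It is not the BaseOrchestratorContext method, and the logger name is made up.

```python
import logging

LOGGER = logging.getLogger("strict_sketch")


def run_operations(operations, strict=True):
    """Hypothetical runner: call each operation and collect the results."""
    results = []
    for operation in operations:
        try:
            results.append(operation())
        except Exception:
            if strict:
                # strict mode: propagate the error to the caller
                raise
            # non-strict mode: log the error and keep going
            LOGGER.exception("operation %s failed", operation.__name__)
    return results


def ok():
    return 1


def broken():
    raise ValueError("bad input")
```

With strict=False, the failing operation is logged and the remaining operations still run. With the new default, the first failure stops the run and surfaces to the caller.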

    Fixed

    • Docs get version from dffml.version.VERSION.
    • FileSource zipfiles are wrapped with TextIOWrapper because CSVSource expects
      the underlying file object to return str instances rather than bytes.
    • FileSourceTest inherits from SourceTest and is used to test json and csv
      sources.
    • A temporary directory is used to replicate mktemp -u functionality so as to
      provide tests using a FileSource with a valid tempfile name.
    • Labels for JSON sources
    • Labels for CSV sources
    • util.cli CMD classes correctly set the description of subparsers instead of
      their help. They also accept the CLI_FORMATTER_CLASS property.
    • CSV source now has entry_point decoration
    • JSON source now has entry_point decoration
    • Strict flag in df.memory is now on by default
    • Dynamically created scikit models get config args correctly
    • Renamed DNNClassifierModelContext first init arg from config to features
    • BaseSource now has base_entry_point decoration
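
The zipfile fix above rests on a standard-library behaviour: files opened from a zip archive yield bytes, while the csv module expects str. A minimal standalone sketch of the wrapping, using only zipfile, io, and csv (the archive contents and the data.csv name are made up for illustration, and this is not FileSource's actual code):

```python
import csv
import io
import zipfile

# Build a small zip archive in memory holding one CSV file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data.csv", "key,value\na,1\nb,2\n")
buffer.seek(0)

with zipfile.ZipFile(buffer) as archive:
    # ZipFile.open returns a binary file object that yields bytes
    with archive.open("data.csv") as binary_file:
        # TextIOWrapper decodes the bytes so csv receives str lines
        text_file = io.TextIOWrapper(binary_file, encoding="utf-8", newline="")
        rows = list(csv.DictReader(text_file))
```

Passing binary_file to csv.DictReader directly would fail, because the reader would receive bytes instead of str.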

    Removed

    • Repo objects are no longer classification specific. Their classify,
      classified, and classification methods were removed.