Changelog History
v1.2 Changes
August 09, 2019
MLflow 1.2 includes the following major features and improvements:
- Experiments now have editable tags and descriptions (#1630, #1632, #1678, @ankitmathur-db)
- Search latency has been significantly reduced in the SQLAlchemyStore (#1660, @t-henri)
More features and improvements
- Backend stores now support run tag values up to 5000 characters in length. Some store implementations may support longer tag values (#1687, @ankitmathur-db)
- Gunicorn options can now be configured for the ``mlflow models serve`` CLI with the ``GUNICORN_CMD_ARGS`` environment variable (#1557, @LarsDu)
- Jsonnet artifacts can now be previewed in the UI (#1683, @ankitmathur-db)
- Adds an optional ``python_version`` argument to ``mlflow_install`` for specifying the Python version (e.g. "3.5") to use within the conda environment created for installing the MLflow CLI. If ``python_version`` is unspecified, ``mlflow_install`` defaults to using Python 3.6. (#1722, @smurching)
Bug fixes and documentation updates
- [Tracking] The Autologging feature is now more resilient to tracking errors (#1690, @apurva-koti)
- [Tracking] The ``runs`` field in the ``GetExperiment.Response`` proto has been deprecated and will be removed in MLflow 2.0. Please use the ``Search Runs`` API for fetching runs instead (#1647, @dbczumar)
- [Projects] Fixed a bug that prevented Docker-based MLflow Projects from logging artifacts to the ``LocalArtifactRepository`` (#1450, @nlaille)
- [Projects] Running MLflow projects with the ``--no-conda`` flag in R no longer requires Anaconda to be installed (#1650, @spadarian)
- [Models/Scoring] Fixed a bug that prevented Spark UDFs from being loaded on Databricks (#1658, @smurching)
- [UI] AJAX requests made by the MLflow Server Frontend now specify correct MIME-Types (#1679, @ynotzort)
- [UI] Previews now render correctly for artifacts with uppercase file extensions (e.g., ``.JSON``, ``.YAML``) (#1664, @ankitmathur-db)
- [UI] Fixed a bug that caused search API errors to surface a Niagara Falls page (#1681, @dbczumar)
- [Installation] MLflow dependencies are now selected properly based on the target installation platform (#1643, @akshaya-a)
- [UI] Fixed a bug where the "load more" button in the experiment view did not appear on browsers in Windows (#1718, @Zangr)
Small bug fixes and doc updates (#1663, #1719, @dbczumar; #1693, @max-allen-db; #1695, #1659, @smurching; #1675, @jdlesage; #1699, @ankitmathur-db; #1696, @aarondav; #1710, #1700, #1656, @apurva-koti)
v1.1 Changes
July 22, 2019
MLflow 1.1 includes several major features and improvements:
In MLflow Tracking:
- Experimental support for autologging from TensorFlow and Keras. Calling ``mlflow.tensorflow.autolog()`` will enable automatic logging of metrics and optimizer parameters from TensorFlow to MLflow (see the sketch after this list). The feature works with TensorFlow versions ``1.12 <= v < 2.0``. (#1520, #1601, @apurva-koti)
- Parallel coordinates plot in the MLflow compare-run UI. Adds out-of-the-box support for a parallel coordinates plot. The plot allows users to observe relationships between an n-dimensional set of parameters and metrics. It visualizes all runs as lines that are color-coded based on the value of a metric (e.g. accuracy), and shows what parameter values each run took on. (#1497, @Zangr)
- Pandas-based search API. Adds the ability to return the results of a search as a pandas DataFrame using the new ``mlflow.search_runs`` API. (#1483, #1548, @max-allen-db)
- Java fluent API. Adds a new set of APIs to create and log to MLflow runs. This API contrasts with the existing low-level ``MlflowClient`` API, which simply wraps the REST APIs. The new fluent API allows you to create and log runs much as you would with the Python fluent API. (#1508, @andrewmchen)
- Run tags improvements. Adds the ability to add and edit tags from the run view UI, delete tags from the API, and view tags in the experiment search view. (#1400, #1426, @Zangr; #1548, #1558, @ankitmathur-db)
- Search API improvements. Adds order by and pagination to the search API. Pagination allows you to read a large set of runs in small, page-sized chunks, so clients and backend implementations can handle an unbounded set of runs in a scalable manner. (#1444, @sueann; #1437, #1455, #1482, #1485, #1542, @aarondav; #1567, @max-allen-db; #1217, @mparkhe)
- Windows support for running the MLflow tracking server and UI. (#1080, @akshaya-a)
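A minimal Python sketch combining the autologging and pandas search APIs above; the experiment ID, metric name, and filter string are illustrative rather than part of the release:

```python
import mlflow
import mlflow.tensorflow

# Enable experimental autologging before training (TensorFlow 1.12 <= v < 2.0).
mlflow.tensorflow.autolog()

# ... train a TensorFlow/Keras model here; metrics and optimizer parameters
# are logged to the active MLflow run automatically ...

# Query finished runs as a pandas DataFrame using the new search API.
runs = mlflow.search_runs(
    experiment_ids=["0"],                    # illustrative experiment ID
    filter_string="metrics.accuracy > 0.9",  # illustrative filter
    order_by=["metrics.accuracy DESC"],
)
print(runs[["run_id", "metrics.accuracy"]].head())
```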
In MLflow Projects:
- Experimental support to run Docker-based MLprojects in Kubernetes. Adds the first fully open-source remote execution backend for MLflow projects. With this, you can leverage elastic compute resources managed by Kubernetes for ML training. For example, you can run a grid search over a set of hyperparameters by running several instances of an MLproject in parallel. (#1181, @marcusrehm, @tomasatdatabricks, @andrewmchen; #1566, @stbof, @dbczumar; #1574 @dbczumar)
More features and improvements
In MLflow Tracking:
- Paginated "load more" and backend sorting for the experiment search view UI. This change allows the UI to scalably display the sorted runs from large experiments. (#1564, @Zangr)
- Search results are encoded in the URL. This allows you to share searches through their URL and to deep link to them. (#1416, @apurva-koti)
- Ability to serve the MLflow UI behind ``jupyter-server-proxy`` or outside of the root path ``/``. Prior to MLflow 1.1, the UI could only be hosted on ``/``, since the JavaScript made requests directly to ``/ajax-api/...``. With this patch, MLflow makes requests to ``ajax-api/...``, a path relative to where the HTML is being served. (#1413, @xhochy)
In MLflow Models:
- Update ``mlflow.spark.log_model()`` to accept descendants of ``pyspark.Model`` (#1519, @ankitmathur-db)
- Support for saving custom Keras models with ``custom_objects``. This field is semantically equivalent to the ``custom_objects`` parameter of the ``keras.models.load_model()`` function (a sketch follows this list) (#1525, @ankitmathur-db)
- New, more performant split-orient-based input format for the pyfunc scoring server (#1479, @lennon310)
- Ability to specify gunicorn server options for the pyfunc scoring server built with ``mlflow models build-docker``. (#1428, @lennon310)
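A minimal sketch of the ``custom_objects`` support described above; the ``swish`` activation is a hypothetical custom object standing in for any user-defined layer or function:

```python
import keras
from keras import backend as K
import mlflow
import mlflow.keras

def swish(x):                                   # hypothetical custom activation
    return x * K.sigmoid(x)

model = keras.models.Sequential([
    keras.layers.Dense(4, activation=swish, input_shape=(3,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

with mlflow.start_run():
    # custom_objects is stored alongside the model and forwarded to
    # keras.models.load_model() when the model is loaded back.
    mlflow.keras.log_model(model, artifact_path="model",
                           custom_objects={"swish": swish})
```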
Bug fixes and documentation updates
- [Tracking] Fix database migration for MySQL. ``mlflow db upgrade`` should now work for MySQL backends. (#1404, @sueann)
- [Tracking] Make the CLI ``mlflow server`` and ``mlflow ui`` commands work with SQLAlchemy URIs that specify a database driver. (#1411, @sueann)
- [Tracking] Fix usability bugs related to the FTP artifact repository. (#1398, @kafendt; #1421, @nlaille)
- [Tracking] Return appropriate HTTP status codes for MLflowException (#1434, @max-allen-db)
- [Tracking] Fix sorting by user ID in the experiment search view. (#1401, @andrewmchen)
- [Tracking] Allow calling log_metric with NaNs and infs. (#1573, @tomasatdatabricks)
- [Tracking] Fixes an infinite loop in downloading artifacts logged via dbfs and retrieved via S3. (#1605, @sueann)
- [Projects] Docker projects should preserve directory structure (#1436, @ahutterTA)
- [Projects] Fix conda activation for newer versions of conda. (#1576, @avinashraghuthu, @smurching)
- [Models] Allow you to log TensorFlow Keras models from the ``tf.keras`` module. (#1546, @tomasatdatabricks)
Small bug fixes and doc updates (#1463, @mateiz; #1641, #1622, #1418, @sueann; #1607, #1568, #1536, #1478, #1406, #1408, @smurching; #1504, @LizaShak; #1490, @acroz; #1633, #1631, #1603, #1589, #1569, #1526, #1446, #1438, @apurva-koti; #1456, @Taur1ne; #1547, #1495, @aarondav; #1610, #1600, #1492, #1493, #1447, @tomasatdatabricks; #1430, @javierluraschi; #1424, @nathansuh; #1488, @henningsway; #1590, #1427, @Zangr; #1629, #1614, #1574, #1521, #1522, @dbczumar; #1577, #1514, @ankitmathur-db; #1588, #1566, @stbof; #1575, #1599, @max-allen-db; #1592, @abaveja313; #1606, @andrewmchen)
v1.0 Changes
June 03, 2019
MLflow 1.0 includes many significant features and improvements. From this version, MLflow is no longer beta, and all APIs except those marked as experimental are intended to be stable until the next major version. As such, this release includes a number of breaking changes.
Major features, improvements, and breaking changes
- Support for recording, querying, and visualizing metrics along a new "step" axis (x coordinate), providing increased flexibility for examining model performance relative to training progress. For example, you can now record performance metrics as a function of the number of training iterations or epochs. MLflow 1.0's enhanced metrics UI enables you to visualize the change in a metric's value as a function of its step, augmenting MLflow's existing UI for plotting a metric's value as a function of wall-clock time. (#1202, #1237, @dbczumar; #1132, #1142, #1143, @smurching; #1211, #1225, @Zangr; #1372, @stbof)
- Search improvements. MLflow 1.0 includes additional support in both the API and UI for searching runs within a single experiment or a group of experiments. The search filter API supports a simplified version of the ``SQL WHERE`` clause. In addition to searching using a run's metrics and params, the API has been enhanced to support a subset of run attributes as well as user and `system tags <https://mlflow.org/docs/latest/tracking.html#system-tags>`_. For details see `Search syntax <https://mlflow.org/docs/latest/search-syntax.html#syntax>`_ and `examples for programmatically searching runs <https://mlflow.org/docs/latest/search-syntax.html#programmatically-searching-runs>`_. (#1245, #1272, #1323, #1326, @mparkhe; #1052, @Zangr; #1363, @aarondav)
- Logging metrics in batches. MLflow 1.0 now has a ``runs/log-batch`` REST API endpoint for logging multiple metrics, params, and tags in a single API request. The endpoint is useful for performant logging of multiple metrics at the end of a model training epoch (see `example <https://github.com/mlflow/mlflow/blob/bb8c7602dcb6a3a8786301fe6b98f01e8d3f288d/examples/hyperparam/search_hyperopt.py#L161>`_), or logging of many input model parameters at the start of training. You can call this batched-logging endpoint from Python (``mlflow.log_metrics``, ``mlflow.log_params``, ``mlflow.set_tags``), R (``mlflow_log_batch``), and Java (``MlflowClient.logBatch``); a short Python sketch combining batched logging with the new step axis appears at the end of these notes. (#1214, @dbczumar; see 0.9.1 and 0.9.0 for other changes)
- Windows support for MLflow Tracking. The Tracking portion of the MLflow client is now supported on Windows. (#1171, @eedeleon, @tomasatdatabricks)
- HDFS support for artifacts. A Hadoop artifact repository with Kerberos authorization support was added, so you can use HDFS to log and retrieve models and other artifacts. (#1011, @jaroslawk)
- CLI command to build Docker images for serving. Added an ``mlflow models build-docker`` CLI command for building a Docker image capable of serving an MLflow model. The model is served at port 8080 within the container by default. Note that this API is experimental and does not guarantee that the arguments or the format of the Docker container will remain the same. (#1329, @smurching, @tomasatdatabricks)
- New ``onnx`` model flavor for saving, loading, and evaluating ONNX models with MLflow. ONNX flavor APIs are available in the ``mlflow.onnx`` module. (#1127, @avflor, @dbczumar; #1388, #1389, @dbczumar)
- Major breaking changes:
  - Some of the breaking changes involve database schema changes in the SQLAlchemy tracking store. If your database instance's schema is not up-to-date, MLflow will issue an error at the start-up of ``mlflow server`` or ``mlflow ui``. To migrate an existing database to the newest schema, you can use the ``mlflow db upgrade`` CLI command. (#1155, #1371, @smurching; #1360, @aarondav)
  - [Installation] The MLflow Python package no longer depends on ``scikit-learn``, ``mleap``, or ``boto3``. If you want to use the ``scikit-learn`` support, the ``MLeap`` support, or the ``s3`` artifact repository / ``sagemaker`` support, you will have to install these respective dependencies explicitly. (#1223, @aarondav)
  - [Artifacts] In the Models API, an artifact's location is now represented as a URI. See the `documentation <https://mlflow.org/docs/latest/tracking.html#artifact-locations>`_ for the list of accepted URIs. (#1190, #1254, @dbczumar; #1174, @dbczumar, @sueann; #1206, @tomasatdatabricks; #1253, @stbof)
    - The affected methods are:
      - Python: ``<model-type>.load_model``, ``azureml.build_image``, ``sagemaker.deploy``, ``sagemaker.run_local``, ``pyfunc._load_model_env``, ``pyfunc.load_pyfunc``, and ``pyfunc.spark_udf``
      - R: ``mlflow_load_model``, ``mlflow_rfunc_predict``, ``mlflow_rfunc_serve``
      - CLI: ``mlflow models serve``, ``mlflow models predict``, ``mlflow sagemaker``, ``mlflow azureml`` (with the new ``--model-uri`` option)
    - To allow referring to artifacts in the context of a run, MLflow introduces a new URI scheme of the form ``runs:/<run_id>/relative/path/to/artifact``. (#1169, #1175, @sueann)
  - [CLI] ``mlflow pyfunc`` and ``mlflow rfunc`` commands have been unified as ``mlflow models`` (#1257, @tomasatdatabricks; #1321, @dbczumar)
  - [CLI] ``mlflow artifacts download``, ``mlflow artifacts download-from-uri`` and ``mlflow download`` commands have been consolidated into ``mlflow artifacts download`` (#1233, @sueann)
  - [Runs] Expose ``RunData`` fields (``metrics``, ``params``, ``tags``) as dictionaries. Note that the ``mlflow.entities.RunData`` constructor still accepts lists of ``metric``/``param``/``tag`` entities. (#1078, @smurching)
  - [Runs] Rename ``run_uuid`` to ``run_id`` in Python, Java, and the REST API. Where necessary, MLflow will continue to accept ``run_uuid`` until MLflow 1.1. (#1187, @aarondav)

Other breaking changes

CLI:
- The ``--file-store`` option is deprecated in the ``mlflow server`` and ``mlflow ui`` commands. (#1196, @smurching)
- The ``--host`` and ``--gunicorn-opts`` options are removed from the ``mlflow ui`` command. (#1267, @aarondav)
- Arguments to ``mlflow experiments`` subcommands, notably ``--experiment-name`` and ``--experiment-id``, are now options (#1235, @sueann)
- ``mlflow sagemaker list-flavors`` has been removed (#1233, @sueann)

Tracking:
- The ``user`` property of ``Run``s has been moved to tags (similarly, the ``run_name``, ``source_type``, and ``source_name`` properties were moved to tags in 0.9.0). (#1230, @acroz; #1275, #1276, @aarondav)
- In R, the return values of experiment CRUD APIs have been updated to more closely match the REST API. In particular, ``mlflow_create_experiment`` now returns a string experiment ID instead of an experiment, and the other APIs return NULL. (#1246, @smurching)
- ``RunInfo.status``'s type is now string. (#1264, @mparkhe)
- Remove deprecated ``RunInfo`` properties from ``start_run``. (#1220, @aarondav)
- As deprecated in 0.9.1 and before, the ``RunInfo`` fields ``run_name``, ``source_name``, ``source_version``, ``source_type``, and ``entry_point_name`` and the ``SearchRuns`` field ``anded_expressions`` have been removed from the REST API and the Python, Java, and R tracking client APIs. They are still available as tags, documented in the REST API documentation. (#1188, @aarondav)

Models and deployment:
- In Python, require arguments as keywords in the ``log_model``, ``save_model`` and ``add_to_model`` methods in the ``tensorflow`` and ``mleap`` modules to avoid breaking changes in the future (#1226, @sueann)
- Remove the unsupported ``jars`` argument from ``spark.log_model`` in Python (#1222, @sueann)
- Introduce ``pyfunc.load_model`` to be consistent with other Models modules. ``pyfunc.load_pyfunc`` will be deprecated in the near future. (#1222, @sueann)
- Rename the ``dst_path`` parameter in ``pyfunc.save_model`` to ``path`` (#1221, @aarondav)
- R flavors refactor (#1299, @kevinykuo)
  - ``mlflow_predict()`` has been added in favor of ``mlflow_predict_model()`` and ``mlflow_predict_flavor()``, which have been removed.
  - ``mlflow_save_model()`` is now a generic and ``mlflow_save_flavor()`` is no longer needed and has been removed.
  - ``mlflow_predict()`` takes ``...`` to pass to underlying predict methods.
  - ``mlflow_load_flavor()`` now has the signature ``function(flavor, model_path)`` and flavor authors should implement ``mlflow_load_flavor.mlflow_flavor_{FLAVORNAME}``. The flavor argument is inferred from the inputs of the user-facing ``mlflow_load_model()`` and does not need to be explicitly provided by the user.

Projects:
- Remove and rename some ``projects.run`` parameters for generality and consistency. (#1222, @sueann)
- In R, the ``mlflow_run`` API for running MLflow projects has been modified to more closely reflect the Python ``mlflow.run`` API. In particular, the order of the ``uri`` and ``entry_point`` arguments has been reversed and the ``param_list`` argument has been renamed to ``parameters``. (#1265, @smurching)

R:
- Remove the ``mlflow_snapshot`` and ``mlflow_restore_snapshot`` APIs. Also, the ``r_dependencies`` argument used to specify the path to a packrat r-dependencies.txt file has been removed from all APIs. (#1263, @smurching)
- The ``mlflow_cli`` and ``crate`` APIs are now private. (#1246, @smurching)

Environment variables:
- Prefix environment variables with "MLFLOW_" (#1268, @aarondav). Affected variables are:
  - [Tracking] ``_MLFLOW_SERVER_FILE_STORE``, ``_MLFLOW_SERVER_ARTIFACT_ROOT``, ``_MLFLOW_STATIC_PREFIX``
  - [SageMaker] ``MLFLOW_SAGEMAKER_DEPLOY_IMG_URL``, ``MLFLOW_DEPLOYMENT_FLAVOR_NAME``
  - [Scoring] ``MLFLOW_SCORING_SERVER_MIN_THREADS``, ``MLFLOW_SCORING_SERVER_MAX_THREADS``

More features and improvements
- [Tracking] Non-default driver support for SQLAlchemy backends: ``db+driver`` is now a valid tracking backend URI scheme (#1297, @drewmcdonald; #1374, @mparkhe)
- [Tracking] Validate the backend store URI before starting the tracking server (#1218, @luke-zhu, @sueann)
- [Tracking] Add a ``GetMetricHistory`` client API in Python and Java corresponding to the REST API. (#1178, @smurching)
- [Tracking] Add a ``view_type`` argument to ``MlflowClient.list_experiments()`` in Python. (#1212, @smurching)
- [Tracking] Dictionary values provided to ``mlflow.log_params`` and ``mlflow.set_tags`` in Python can now be non-string types (e.g., numbers), and they are automatically converted to strings. (#1364, @aarondav)
- [Tracking] R API additions to be at parity with the REST API and Python (#1122, @kevinykuo)
- [Tracking] Limit the number of results returned from the ``SearchRuns`` API and UI for faster loading (#1125, @mparkhe; #1154, @andrewmchen)
- [Artifacts] To avoid having many copies of large model files in serving, ``ArtifactRepository.download_artifacts`` no longer copies local artifacts (#1307, @andrewmchen; #1383, @dbczumar)
- [Artifacts][Projects] Support GCS in download utilities. ``gs://bucket/path`` files are now supported by the ``mlflow artifacts download`` CLI command and as parameters of type ``path`` in MLproject files. (#1168, @drewmcdonald)
- [Models] All Python models exported by MLflow now declare ``mlflow`` as a dependency by default. In addition, we introduce an ``--install-mlflow`` flag users can pass to the ``mlflow models serve`` and ``mlflow models predict`` commands to force installation of the latest version of MLflow into the model's environment. (#1308, @tomasatdatabricks)
- [Models] Update model flavors to lazily import dependencies in Python. Modules that define Model flavors now import extra dependencies such as ``tensorflow``, ``scikit-learn``, and ``pytorch`` inside individual methods, ensuring that these modules can be imported and explored even if the dependencies have not been installed on your system. Also, the ``DEFAULT_CONDA_ENVIRONMENT`` module variable has been replaced with a ``get_default_conda_env()`` function for each flavor. (#1238, @dbczumar)
- [Models] It is now possible to pass extra arguments to ``mlflow.keras.load_model`` that will be passed through to ``keras.load_model``. (#1330, @yorickvP)
- [Serving] For better performance, switch to ``gunicorn`` for serving Python models. This does not change the user interface. (#1322, @tomasatdatabricks)
- [Deployment] For SageMaker, use the uniquely-generated model name as the S3 bucket prefix instead of requiring one. (#1183, @dbczumar)
- [REST API] Add support for API paths without the ``preview`` component. The ``preview`` paths will be deprecated in a future version of MLflow. (#1236, @mparkhe)

Bug fixes and documentation updates
- [Tracking] Log metric timestamps in milliseconds by default (#1177, @smurching; #1333, @dbczumar)
- [Tracking] Fix a bug when deserializing an integer experiment ID for runs in ``SQLAlchemyStore`` (#1167, @smurching)
- [Tracking] Ensure unique constraint names in the MLflow tracking database (#1292, @smurching)
- [Tracking] Fix base64 encoding for basic auth in the R tracking client (#1126, @freefrag)
- [Tracking] Correctly handle ``file:`` URIs for the ``--backend-store-uri`` option in the ``mlflow server`` and ``mlflow ui`` CLI commands (#1171, @eedeleon, @tomasatdatabricks)
- [Artifacts] Update artifact repository download methods to return absolute paths (#1179, @dbczumar)
- [Artifacts] Make FileStore respect the default artifact location (#1332, @dbczumar)
- [Artifacts] Fix ``log_artifact`` failures due to an existing directory on the FTP server (#1327, @kafendt)
- [Artifacts] Fix GCS artifact logging of subdirectories (#1285, @jason-huling)
- [Projects] Fix a bug where the ``SQLite`` database file was not shared with the Docker container (#1347, @tomasatdatabricks; #1375, @aarondav)
- [Java] Mark ``sendPost`` and ``sendGet`` as experimental (#1186, @aarondav)
- [Python][CLI] Mark ``azureml.build_image`` as experimental (#1222, #1233 @sueann)
- [Docs] Document public MLflow environment variables (#1343, @aarondav)
- [Docs] Document MLflow system tags for runs (#1342, @aarondav)
- [Docs] Autogenerate CLI documentation to include subcommands and descriptions (#1231, @sueann)
- [Docs] Update the run selection description in ``mlflow_get_run`` in the R documentation (#1258, @dbczumar)
- [Examples] Update examples to reflect API changes (#1361, @tomasatdatabricks; #1367, @mparkhe)

Small bug fixes and doc updates (#1359, #1350, #1331, #1301, #1270, #1271, #1180, #1144, #1135, #1131, #1358, #1369, #1368, #1387, @aarondav; #1373, @akarloff; #1287, #1344, #1309, @stbof; #1312, @hchiuzhuo; #1348, #1349, #1294, #1227, #1384, @tomasatdatabricks; #1345, @withsmilo; #1316, @ancasarb; #1313, #1310, #1305, #1289, #1256, #1124, #1097, #1162, #1163, #1137, #1351, @smurching; #1319, #1244, #1224, #1195, #1194, #1328, @dbczumar; #1213, #1200, @Kublai-Jing; #1304, #1320, @andrewmchen; #1311, @Zangr; #1306, #1293, #1147, @mateiz; #1303, @gliptak; #1261, #1192, @eedeleon; #1273, #1259, @kevinykuo; #1277, #1247, #1243, #1182, #1376, @mparkhe; #1210, @vgod-dbx; #1199, @ashtuchkin; #1176, #1138, #1365, @sueann; #1157, @cclauss; #1156, @clemens-db; #1152, @pogil; #1146, @srowen; #875, #1251, @jimthompson5802)
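To round out these notes, a minimal sketch combining the batched-logging APIs with the new step axis; the parameter, tag, and metric names and values are illustrative:

```python
import mlflow

with mlflow.start_run():
    # Batched logging: a single REST request per call instead of one per value.
    mlflow.log_params({"lr": 0.01, "batch_size": 64})
    mlflow.set_tags({"team": "forecasting", "framework": "sklearn"})

    # Step-axis metrics: record one value per epoch and plot it against `step` in the UI.
    for epoch in range(3):
        train_loss = 1.0 / (epoch + 1)  # placeholder value
        mlflow.log_metric("train_loss", train_loss, step=epoch)
```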
v0.9.1 Changes
April 21, 2019
MLflow 0.9.1 is a patch release on top of 0.9.0 containing mostly bug fixes and internal improvements. We have also included one breaking API change in preparation for additions in MLflow 1.0 and later. This release also includes significant improvements to the Search API.
Breaking changes:
- [Tracking] Generalized experiment_id to string (from a long) to be more permissive of different ID types in different backend stores. While breaking for the REST API, this change is backwards compatible for python and R clients. (#1067, #1034 @eedeleon)
More features and improvements:
- [Search][API] Moving search filters into a query-string-based syntax, with Java client, Python client, and UI support. This also improves quote, period, and special character handling in query strings and adds the ability to search on tags in the filter string. (#1042, #1055, #1063, #1068, #1099, #1106 @mparkhe; #1025 @andrewmchen; #1060 @smurching)
- [Tracking] Limits and validations to batch-logging APIs in the OSS server (#958 @smurching)
- [Tracking][Java] Java client API for batch-logging (#1081 @mparkhe)
- [Tracking] Improved consistency of handling multiple metric values per timestamp across tracking stores (#972, #999 @dbczumar)
Bug fixes and documentation updates:
- [Tracking][Python] Reintroduces the ``parent_run_id`` argument to ``MlflowClient.create_run``. This API is planned for removal in MLflow 1.0 (#1137 @smurching)
- [Tracking][Python] Provide default implementations of AbstractStore log methods (#1051 @acroz)
- [R] (Released on CRAN as MLflow 0.9.0.1) Small bug fixes with R (#1123 @smurching; #1045, #1017, #1019, #1039, #1048, #1098, #1101, #1107, #1108, #1119 @tomasatdatabricks)
Small bug fixes and doc updates (#1024, #1029 @bayethiernodiop; #1075 @avflor; #968, #1010, #1070, #1091, #1092 @smurching; #1004, #1085 @dbczumar; #1033, #1046 @sueann; #1053 @tomasatdatabricks; #987 @hanyucui; #935, #941 @jimthompson5802; #963 @amilbourne; #1016 @andrewmchen; #991 @jaroslawk; #1007 @mparkhe)
v0.9.0 Changes
March 13, 2019
Major features:
- Support for running MLflow Projects in Docker containers. This allows you to include non-Python dependencies in your project environments and provides stronger isolation when running projects. See the `Projects documentation <https://mlflow.org/docs/latest/projects.html>`_ for more information. (#555, @marcusrehm; #819, @mparkhe; #970, @dbczumar)
- Database stores for the MLflow Tracking Server. Support for a scalable and performant backend store was one of the top community requests. This feature enables you to connect to local or remote SQLAlchemy-compatible databases (currently supported flavors include MySQL, PostgreSQL, SQLite, and MS SQL) and is compatible with the file-backed store. See the `Tracking Store documentation <https://mlflow.org/docs/latest/tracking.html#storage>`_ for more information. (#756, @AndersonReyes; #800, #844, #847, #848, #860, #868, #975, @mparkhe; #980, @dbczumar)
- Simplified custom Python model packaging. You can easily include custom preprocessing and postprocessing logic, as well as data dependencies, in models with the ``python_function`` flavor using updated ``mlflow.pyfunc`` Python APIs. For more information, see the `Custom Python Models documentation <https://mlflow.org/docs/latest/models.html#custom-python-models>`_. (#791, #792, #793, #830, #910, @dbczumar)
- Plugin systems allowing third-party libraries to extend MLflow functionality. The `proposal document <https://gist.github.com/zblz/9e337a55a7ba73314890be68370fa69a>`_ gives the full detail of the three main changes (a setuptools sketch follows this list):
  - You can register additional providers of tracking stores using the ``mlflow.tracking_store`` entrypoint. (#881, @zblz)
  - You can register additional providers of artifact repositories using the ``mlflow.artifact_repository`` entrypoint. (#882, @mociarain)
  - The logic generating run metadata from the run context (e.g. ``source_name``, ``source_version``) has been refactored into an extendable system of run context providers. Plugins can register additional providers using the ``mlflow.run_context_provider`` entrypoint, which add to or overwrite tags set by the base library. (#913, #926, #930, #978, @acroz)
- Support for HTTP authentication to the Tracking Server in the R client. You can now connect to secure Tracking Servers using credentials set in environment variables, or provide custom plugins for setting the credentials. As an example, this release contains a Databricks plugin that can detect existing Databricks credentials to allow you to connect to the Databricks Tracking Server. (#938, #959, #992, @tomasatdatabricks)
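A minimal setuptools sketch of how a third-party package might register providers through these entrypoints; the package, module, and class names are hypothetical:

```python
# setup.py for a hypothetical package "mlflow-mystore" that registers a
# custom tracking store and artifact repository for "mystore://..." URIs.
from setuptools import setup

setup(
    name="mlflow-mystore",
    version="0.1.0",
    packages=["mlflow_mystore"],
    install_requires=["mlflow"],
    entry_points={
        # entry point name = URI scheme, value = "scheme=module:Class"
        "mlflow.tracking_store": "mystore=mlflow_mystore.store:MyStore",
        "mlflow.artifact_repository": "mystore=mlflow_mystore.artifacts:MyArtifactRepository",
    },
)
```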
Breaking changes:
- [Scoring] The ``pyfunc`` scoring server now expects requests with the ``application/json`` content type to contain JSON-serialized pandas DataFrames in the split format, rather than the records format (see the sketch below). See the `documentation on deployment <https://mlflow.org/docs/latest/models.html#deploy-a-python-function-model-as-a-local-rest-api-endpoint>`_ for more detail. (#960, @dbczumar) Also, when reading pandas DataFrames from JSON, the scoring server no longer automatically infers data types, as this can result in unintentional conversion of data types (#916, @mparkhe).
- [API] Remove ``GetMetric`` and ``GetParam`` from the REST API, as they are subsumed by ``GetRun``. (#879, @aarondav)
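A minimal sketch of a client request in the new split format, assuming a pyfunc model is already being served locally; the host, port, and columns are illustrative:

```python
import pandas as pd
import requests

df = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})

# The scoring server expects the pandas "split" orientation for
# application/json requests as of this release.
resp = requests.post(
    "http://127.0.0.1:5001/invocations",
    data=df.to_json(orient="split"),
    headers={"Content-Type": "application/json"},
)
print(resp.json())
```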
More features and improvements:
- [UI] Add a button for downloading artifacts (#967, @mateiz)
- [CLI] Add CLI commands for runs: you can now ``list``, ``delete``, ``restore``, and ``describe`` runs through the CLI (#720, @DorIndivo)
- [CLI] The ``run`` command can now take ``--experiment-name`` as an argument, as an alternative to the ``--experiment-id`` argument. You can also choose to set the ``_EXPERIMENT_NAME_ENV_VAR`` environment variable instead of passing in the value explicitly. (#889, #894, @mparkhe)
- [Examples] Add an image classification example with Keras. (#743, @tomasatdatabricks)
- [Artifacts] Add ``get_artifact_uri()`` and ``_download_artifact_from_uri`` convenience functions (#779)
- [Artifacts] Allow writing Spark models directly to the target artifact store when possible (#808, @smurching)
- [Models] PyTorch model persistence improvements to allow persisting definitions and dependencies outside the immediate scope:
  - Add a ``code_paths`` parameter to ``mlflow.pytorch.save_model`` and ``mlflow.pytorch.log_model`` to allow external module dependencies to be specified as paths to Python files. (#842, @dbczumar)
  - Improve ``mlflow.pytorch.save_model`` to capture class definitions from notebooks and the ``__main__`` scope (#851, #861, @dbczumar)
- [Runs][R] Allow the client to infer context info when creating a new run in the fluent API (#958, @tomasatdatabricks)
- [Runs][UI] Support Git commit hyperlinks for GitLab and Bitbucket. Previously the clickable hyperlink was generated only for GitHub pages. (#901)
- [Search][API] Allow param values to contain any content, not just alphanumeric characters, ``.``, and ``-`` (#788, @mparkhe)
- [Search][API] Support a "filter" string in the ``SearchRuns`` API. Corresponding UI improvements are planned for the future (#905, @mparkhe)
- [Logging] Basic support for LogBatch. NOTE: The feature is currently experimental and the behavior is expected to change in the near future. (#950, #951, #955, #1001, @smurching)
Bug fixes and documentation updates:
- [Artifacts] Fix empty-file upload to DBFS in ``log_artifact`` and ``log_artifacts`` (#895, #818, @smurching)
- [Artifacts] S3 artifact store: fix path resolution error when the artifact root is the bucket root (#928, @dbczumar)
- [UI] Fix a bug with Databricks notebook URL links (#891, @smurching)
- [Export] Fix for missing run name in CSV export (#864, @jimthompson5802)
- [Example] Correct missing tensorboardX module error in the PyTorch example when running in the MLflow Docker container (#809, @jimthompson5802)
- [Scoring][R] Fix local serving of rfunc models (#874, @kevinykuo)
- [Docs] Improve flavor-specific documentation in the Models documentation (#909, @dbczumar)
Small bug fixes and doc updates (#822, #899, #787, #785, #780, #942, @hanyucui; #862, #904, #954, #806, #857, #845, @stbof; #907, #872, @smurching; #896, #858, #836, #859, #923, #939, #933, #931, #952, @dbczumar; #880, @zblz; #876, @acroz; #827, #812, #816, #829, @jimthompson5802; #837, #790, #897, #974, #900, @mparkhe; #831, #798, @aarondav; #814, @sueann; #824, #912, @mateiz; #922, #947, @tomasatdatabricks; #795, @KevYuen; #676, @mlaradji; #906, @4n4nd; #777, @tmielika; #804, @alkersan)
v0.9.0.1 Changes
April 09, 2019
Bugfix release (PyPI only) with the following changes:
- Rebuilt MLflow JS assets to fix an issue where form input was broken in MLflow 0.9.0 (identified in #1056, #1113 by @shu-yusa, @timothyjlaurent)
v0.8.2 Changes
January 28, 2019
MLflow 0.8.2 is a patch release on top of 0.8.1 containing only bug fixes and no breaking changes or features.
Bug fixes:
- [Python API] CloudPickle has been added to the set of MLflow library dependencies, fixing missing import errors when attempting to save models (#777, @tmielika)
- [Python API] Fixed a malformed logging call that prevented ``mlflow.sagemaker.push_image_to_ecr()`` invocations from succeeding (#784, @jackblandin)
- [Models] PyTorch models can now be saved with code dependencies, allowing model classes to be loaded successfully in new environments (#842, #836, @dbczumar)
- [Artifacts] Fixed a timeout when logging zero-length files to DBFS artifact stores (#818, @smurching)
Small docs updates (#845, @stbof; #840, @grahamhealy20; #839, @wilderrodrigues)
v0.8.1 Changes
December 21, 2018
MLflow 0.8.1 introduces several significant improvements:
- Improved UI responsiveness and load time, especially when displaying experiments containing hundreds to thousands of runs.
- Improved visualizations, including interactive scatter plots for MLflow run comparisons.
- Expanded support for scoring Python models as Spark UDFs. For more information, see the `updated documentation for this feature <https://mlflow.org/docs/latest/models.html#export-a-python-function-model-as-an-apache-spark-udf>`_.
- By default, saved models will now include a Conda environment specifying all of the dependencies necessary for loading them in a new environment.
Features:
- [API/CLI] Support for running MLflow projects from ZIP files (#759, @jmorefieldexpe)
- [Python API] Support for passing model conda environments as dictionaries to ``save_model`` and ``log_model`` functions (see the sketch after this list) (#748, @dbczumar)
- [Models] Default Anaconda environments have been added to many Python model flavors. By default, models produced by ``save_model`` and ``log_model`` functions will include an environment that specifies all of the versioned dependencies necessary to load and serve the models. Previously, users had to specify these environments manually. (#705, #707, #708, #749, @dbczumar)
- [Scoring] Support for synchronous deployment of models to SageMaker (#717, @dbczumar)
- [Tracking] Include the Git repository URL as a tag when tracking an MLflow run within a Git repository (#741, @whiletruelearn, @mateiz)
- [UI] Improved runs UI performance by using a react-virtualized table to optimize row rendering (#765, #762, #745, @smurching)
- [UI] Significant performance improvements for rendering run metrics, tags, and parameter information (#764, #747, @smurching)
- [UI] Scatter plots, including run comparison plots, are now interactive (#737, @mateiz)
- [UI] Extended CSRF support by allowing the MLflow UI server to specify a set of expected headers that clients should set when making AJAX requests (#733, @aarondav)
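A minimal sketch of passing a conda environment as a dictionary to ``log_model``; the toy model and package pins are illustrative:

```python
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# Toy scikit-learn model, trained on a trivial dataset.
model = LogisticRegression().fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])

# A conda environment provided as a dictionary rather than a YAML file.
conda_env = {
    "name": "sklearn-model-env",
    "channels": ["defaults"],
    "dependencies": ["python=3.6", "scikit-learn", "cloudpickle"],
}

with mlflow.start_run():
    mlflow.sklearn.log_model(model, artifact_path="model", conda_env=conda_env)
```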
Bug fixes and documentation updates:
- [Python/Scoring] MLflow Python models that produce Pandas DataFrames can now be evaluated as Spark UDFs correctly. Spark UDF outputs containing multiple columns of primitive types are now supported (#719, @tomasatdatabricks)
- [Scoring] Fixed a serialization error that prevented models served with Azure ML from returning Pandas DataFrames (#754, @dbczumar)
- [Docs] New example demonstrating how the MLflow REST API can be used to create experiments and log run information (#750, kjahan)
- [Docs] R documentation has been updated for clarity and style consistency (#683, @stbof)
- [Docs] Added clarification about user setup requirements for executing remote MLflow runs on Databricks (#736, @andyk)
Small bug fixes and doc updates (#768, #715, @smurching; #728, dodysw; #730, mshr-h; #725, @kryptec; #769, #721, @dbczumar; #714, @stbof)
v0.8.0 Changes
November 08, 2018
MLflow 0.8.0 introduces several major features:
Dramatically improved UI for comparing experiment run results:
- Metrics and parameters are by default grouped into a single column, to avoid an explosion of mostly-empty columns. Individual metrics and parameters can be moved into their own column to help compare across rows.
- Runs that are "nested" inside other runs (e.g., as part of a hyperparameter search or multistep workflow) now show up grouped by their parent run, and can be expanded or collapsed altogether. Runs can be nested by calling ``mlflow.start_run`` or ``mlflow.run`` while already within a run (see the sketch after this list).
- Run names (as opposed to automatically generated run UUIDs) now show up instead of the run ID, making comparing runs in graphs easier.
- The state of the run results table, including filters, sorting, and expanded rows, is persisted in browser local storage, making it easier to go back and forth between an individual run view and the table.
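A minimal sketch of the nested-run pattern described above, written against the current Python fluent API, where the ``nested=True`` flag marks the child run; run names and values are placeholders:

```python
import mlflow

with mlflow.start_run(run_name="hyperparam-search"):                   # parent run
    for lr in (0.1, 0.01):
        with mlflow.start_run(run_name="child-%s" % lr, nested=True):  # nested child run
            mlflow.log_param("lr", lr)
            mlflow.log_metric("val_accuracy", 0.9)                     # placeholder metric
```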
Support for deploying models as Docker containers directly to Azure Machine Learning Service Workspace (as opposed to the previously-recommended solution of Azure ML Workbench).
Breaking changes:
- [CLI] ``mlflow sklearn serve`` has been removed in favor of ``mlflow pyfunc serve``, which takes the same arguments but works against any pyfunc model (#690, @dbczumar)
Features:
- [Scoring] The pyfunc server and SageMaker now support the pandas "split" JSON format in addition to the "records" format. The split format allows the client to specify the order of columns, which is necessary for some model formats. We recommend switching client code over to use this new format (by sending the Content-Type header ``application/json; format=pandas-split``), as it will become the default JSON format in MLflow 0.9.0. (#690, @dbczumar)
- [UI] Add compact experiment view (#546, #620, #662, #665, @smurching)
- [UI] Add support for viewing & tracking nested runs in experiment view (#588, @andrewmchen; #618, #619, @aarondav)
- [UI] Persist experiments view filters and sorting in browser local storage (#687, @smurching)
- [UI] Show run name instead of run ID when present (#476, @smurching)
- [Scoring] Support for deploying Models directly to Azure Machine Learning Service Workspace (#631, @dbczumar)
- [Server/Python/Java] Add ``rename_experiment`` to Tracking API (#570, @aarondav)
- [Server] Add ``get_experiment_by_name`` to RestStore (#592, @dmarkhas)
- [Server] Allow passing gunicorn options when starting mlflow server (#626, @mparkhe)
- [Python] Cloudpickle support for sklearn serialization (#653, @dbczumar)
- [Artifacts] FTP artifactory store added (#287, @Shenggan)
Bug fixes and documentation updates:
- [Python] Update TensorFlow integration to match API provided by other flavors (#612, @dbczumar; #670, @mlaradji)
- [Python] Support for TensorFlow 1.12 (#692, @smurching)
- [R] Explicitly loading Keras module at predict time no longer required (#586, @kevinykuo)
- [R] pyfunc serve can correctly load models saved with the R Keras support (#634, @tomasatdatabricks)
- [R] Increase network timeout of calls to the RestStore from 1 second to 60 seconds (#704, @aarondav)
- [Server] Improve errors returned by RestStore (#582, @andrewmchen; #560, @smurching)
- [Server] Deleting the default experiment no longer causes it to be immediately recreated (#604, @andrewmchen; #641, @schipiga)
- [Server] Azure Blob Storage artifact repo supports Windows paths (#642, @marcusrehm)
- [Server] Improve behavior when environment and run files are corrupted (#632, #654, #661, @mparkhe)
- [UI] Improve error page when viewing nonexistent runs or views (#600, @andrewmchen; #560, @andrewmchen)
- [UI] UI no longer throws an error if all experiments are deleted (#605, @andrewmchen)
- [Docs] Include diagram of workflow for multistep example (#581, @dennyglee)
- [Docs] Add reference tags and R and Java APIs to tracking documentation (#514, @stbof)
- [Docs/R] Use CRAN installation (#686, @javierluraschi)
Small bug fixes and doc updates (#576, #594, @javierluraschi; #585, @kevinykuo; #593, #601, #611, #650, #669, #671, #679, @dbczumar; #607, @suzil; #583, #615, @andrewmchen; #622, #681, @aarondav; #625, @pogil; #589, @tomasatdatabricks; #529, #635, #684, @stbof; #657, @mvsusp; #682, @mateiz; #678, vfdev-5; #596, @yutannihilation; #663, @smurching)
v0.7.0 Changes
October 01, 2018
MLflow 0.7.0 introduces several major features:
- An R client API (to be released on CRAN soon)
- Support for deleting runs (API + UI)
- UI support for adding notes to a run
The release also includes bugfixes and improvements across the Python and Java clients, tracking UI, and documentation.
Breaking changes:
- [Python] The per-flavor implementation of load_pyfunc has been made private (#539, @tomasatdatabricks)
- [REST API, Java] logMetric now accepts a double metric value instead of a float (#566, @aarondav)
Features:
- [R] Support for R (#370, #471, @javierluraschi; #548 @kevinykuo)
- [UI] Add support for adding notes to Runs (#396, @aadamson)
- [Python] Python API, REST API, and UI support for deleting Runs (#418, #473, #526, #579 @andrewmchen)
- [Python] Set a tag containing the branch name when executing a branch of a Git project (#469, @adrian555)
- [Python] Add a set_experiment API to activate an experiment before starting runs (see the sketch after this list) (#462, @mparkhe)
- [Python] Add arguments for specifying a parent run to tracking & projects APIs (#547, @andrewmchen)
- [Java] Add Java set tag API (#495, @smurching)
- [Python] Support logging a conda environment with sklearn models (#489, @dbczumar)
- [Scoring] Support downloading MLflow scoring JAR from Maven during scoring container build (#507, @dbczumar)
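A minimal sketch of the new ``set_experiment`` API in context; the experiment name and logged values are illustrative:

```python
import mlflow

# Activate an experiment before starting runs; subsequent runs are logged under it.
mlflow.set_experiment("demo-experiment")

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.78)
```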
Bug fixes:
- [Python] Print errors when the Databricks run fails to start (#412, @andrewmchen)
- [Python] Fix Spark ML PyFunc loader to work on Spark driver (#480, @tomasatdatabricks)
- [Python] Fix Spark ML load_pyfunc on distributed clusters (#490, @tomasatdatabricks)
- [Python] Fix error when downloading artifacts from a run's artifact root (#472, @dbczumar)
- [Python] Fix DBFS upload file-existence-checking logic during Databricks project execution (#510, @smurching)
- [Python] Support multi-line and unicode tags (#502, @mparkhe)
- [Python] Add missing DeleteExperiment, RestoreExperiment implementations in the Python REST API client (#551, @mparkhe)
- [Scoring] Convert Spark DataFrame schema to an MLeap schema prior to serialization (#540, @dbczumar)
- [UI] Fix bar chart always showing in metric view (#488, @smurching)
Small bug fixes and doc updates (#467 @drorata; #470, #497, #508, #518 @dbczumar; #455, #466, #492, #504, #527 @aarondav; #481, #475, #484, #496, #515, #517, #498, #521, #522, #573 @smurching; #477 @parkerzf; #494 @jainr; #501, #531, #532, #552 @mparkhe; #503, #520 @dmatrix; #509, #532 @tomasatdatabricks; #484, #486 @stbof; #533, #534 @javierluraschi; #542 @GCBallesteros; #511 @AdamBarnhard)