H2O v3.26.0.1 Release Notes

  • Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-yau/1/index.html

    Bug

    • [PUBDEV-5595] - Removed an unnecessary warning in the predict function that occurred when a test set was missing fold_column.
    • [PUBDEV-6359] - AutoML no longer continues training models after a job cancellation.
    • [PUBDEV-6453] - Fixed an issue that caused h2o Docker image builds to fail.
    • [PUBDEV-6552] - In XGBoost, parallel sparse matrix conversion no longer uses a non-thread-safe API.
    • [PUBDEV-6569] - AutoML uses a default value of 5 for score_tree_interval with all algorithms.
    • [PUBDEV-6576] - Fixed an issue that caused the Python client API to break when passing a frame to the constructor.
    • [PUBDEV-6601] - In Flow, you can now specify blending_frame and max_runtime_per_model when running AutoML.
    • [PUBDEV-6627] - Frame Summary is now available when running the Python client in Zeppelin.
    • [PUBDEV-6657] - Fixed an issue that caused H2O.CLOUD._memary(idx).getTimestamp to return 0 rather than the timestamp of the remote node.
    • [PUBDEV-6661] - Fixed a link function NPE in MOJOs.
    • [PUBDEV-6673] - Fixed the frame.tocsv signature. Instead of passing true, false, this now takes CSVStreamParams.

    New Feature

    • [PUBDEV-4076] - Added support for a custom Loss Metric in GBM.
    • [PUBDEV-6089] - When running AutoML in R or Python, an EventLog is now available.
    • [PUBDEV-6090] - When polling an AutoML run, an EventLog now displays rather than a progress bar.
    • [PUBDEV-6108] - CoxPH is now available in the Python client.
    • [PUBDEV-6134] - Added support for SVM in the h2o-3 R and Python clients.
    • [PUBDEV-6492] - Added Isolation Forest to Flow.
    • [PUBDEV-6510] - In XGBoost, improved the performance of moving sparse matrices to off-heap memory.
    • [PUBDEV-6518] - Logs from H2O can now be downloaded in plain text format.

    Task

    • [PUBDEV-6015] - Deprecated support for Java 7.
    • [PUBDEV-6611] - Fixed an issue that caused h2o.scale to corrupt the frame when run over a frame with categorical columns.
    • [PUBDEV-6619] - Removed the Deep Water booklet from H2O-3 builds.

    Improvement

    • [PUBDEV-5316] - AutoML runtime information is now stored and available in an EventLog.
    • [PUBDEV-5885] - Users can now pass an ID to training_frame in h2o.StackedEnsemble.
    • [PUBDEV-6410] - Added early stopping options to Isolation Forest.
    • [PUBDEV-6438] - Users can now build 2D Partial Dependence plots with the R and Python clients.
    • [PUBDEV-6482] - When loading MOJOs that were trained on older versions of H2O-3 into newer versions of H2O-3, users can now access all the information that was saved in the model object and use the MOJO to score.
    • [PUBDEV-6543] - Users can now specify a row_index parameter when building PDPs. This allows partial dependence to be calculated for a single row.
    • [PUBDEV-6553] - Users can now specify a row_index parameter when building PDPs in Flow.
    • [PUBDEV-6573] - Enabled Java scoring for XGBoost MOJOs.
    • [PUBDEV-6590] - Users can now delete an AutoML instance and all its dependencies (including models) from any client.
    • [PUBDEV-6617] - h2o.mojo_predict_csv() and h2o.mojo_predict_pandas() now accept a setInvNumNA parameter.
    • [PUBDEV-6621] - Added support for TreeShap in DRF.
    • [PUBDEV-6633] - Added a feature_frequencies function in GBM, DRF, and IF, which retrieves the number of times a feature was used on a prediction path in a tree model.
    • [PUBDEV-6634] - Users can now retrieve variable split information in the Isolation Forest output.
    • [PUBDEV-6646] - Created a SharedTreeMojoModelWithContributions class, which provides a central location of contribs for DRF and GBM MOJOs.
    • [PUBDEV-6647] - ScoreContributionsTask is no longer abstract.
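
    To illustrate what PUBDEV-6633 means by "the number of times a feature was used on a prediction path", here is a minimal pure-Python sketch on a toy tree. It is not H2O's implementation and does not call the H2O API. The nested-dict tree layout, the TREE constant, and the helper names path_feature_counts and sum_feature_counts are hypothetical and exist only for this example.

```python
from collections import Counter

# Toy decision tree as nested dicts (hypothetical layout, not H2O's format).
# Internal nodes split on "feature" at "threshold"; leaves carry a "value".
TREE = {
    "feature": "age",
    "threshold": 30,
    "left": {"value": 0.1},
    "right": {
        "feature": "income",
        "threshold": 50000,
        "left": {
            "feature": "age",
            "threshold": 60,
            "left": {"value": 0.4},
            "right": {"value": 0.6},
        },
        "right": {"value": 0.9},
    },
}


def path_feature_counts(tree, row):
    """Count how often each feature is tested on the path this row takes."""
    counts = Counter()
    node = tree
    while "value" not in node:
        feature = node["feature"]
        counts[feature] += 1
        # Go left when the row's value is below the threshold, else right.
        node = node["left"] if row[feature] < node["threshold"] else node["right"]
    return counts


def sum_feature_counts(trees, row):
    """Sum the per-tree path counts over every tree in an ensemble."""
    total = Counter()
    for tree in trees:
        total += path_feature_counts(tree, row)
    return dict(total)


if __name__ == "__main__":
    row = {"age": 45, "income": 40000}
    print(path_feature_counts(TREE, row))
    print(sum_feature_counts([TREE, TREE], row))
```

    In this sketch, a row with age 45 and income 40000 passes two splits on age and one on income before reaching a leaf, so age is counted twice for that tree.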

    Docs

    • [PUBDEV-6452] - Clarified in the GLM docs that h2o-3 determines the values of alpha and theta by minimizing the negative log-likelihood plus the same Regularization Penalty.
    • [PUBDEV-6500] - Created an initial alpha version of the SVM documentation.
    • [PUBDEV-6554] - Added upload_custom_distribution to the Parameters Appendix.
    • [PUBDEV-6604] - Removed the note in the XGBoost documentation indicating that "Multi-node support is currently available as a Beta feature."
    • [PUBDEV-6608] - SVM R client documentation is now available.
    • [PUBDEV-6610] - Explained how the nthreads parameter can impact reproducibility.
    • [PUBDEV-6613] - Added stopping parameters to the Isolation Forest chapter.
    • [PUBDEV-6642] - Fixed the parameters listing display for predict and predict_leaf_node_assignment in the Python documentation.
    • [PUBDEV-6644] - DRF is now included in the list of supported algorithms for predict_contributions.
    • [PUBDEV-6648] - Added more examples to the Predict topic.
    • [PUBDEV-6650] - Improved the Data Manipulation Python documentation.
    • [PUBDEV-6651] - Improved the Modeling functions in the Python documentation.
    • [PUBDEV-6653] - Improved the tree_class Python documentation.
    • [PUBDEV-6654] - Improved the Model Metrics Python documentation.
    • [PUBDEV-6656] - Improved the GLM documentation by informing users that they can only specify a list in the GLM interactions parameter.
    • [PUBDEV-6660] - Updated the Flow documentation to include Isolation Forest.
    • [PUBDEV-6663] - Improved the Python documentation for h2o.frame().
    • [PUBDEV-6664] - Added examples to the TargetEncoding Python documentation.