H2O v3.24.0.1 release notes

Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-yates/1/index.html

Bug

[PUBDEV-6159] - The AutoMLTest.java test suite now runs correctly on a local machine.
[PUBDEV-6189] - Fixed an issue in as_date that occurred when the column included NAs.
[PUBDEV-6208] - AutoML no longer fails if one of the Stacked Ensemble models is deleted.
[PUBDEV-6230] - Removed ellipses after the H2O server link when launching the Python client.
[PUBDEV-6231] - In Deep Learning, fixed an issue that occurred when running one-hot-encoding on categoricals.
[PUBDEV-6262] - When running GBM in R without specifically setting a seed, users can now extract the seed that was used to build the model and reproduce that model.
[PUBDEV-6266] - In predictions, fixed an issue that resulted in a "Categorical value out of bounds error" when calling a model.
[PUBDEV-6284] - The Python API no longer reverses the labels for positive and negative values in the standardized coefficients plot legend.
[PUBDEV-6346] - In R, fixed an issue that caused group_by mean to calculate only one column when multiple columns were specified.
[PUBDEV-6350] - Fixed an issue that caused the confusion_matrix method to return matrices for other metrics.
[PUBDEV-6357] - Fixed an issue that resulted in a "Categorical value out of bounds error" when calling a model using Python.
[PUBDEV-6360] - Improved the error message that displays when a user attempts to modify an Enum/categorical column as if it were a string.
[PUBDEV-6367] - Rows that start with a # symbol are no longer dropped during the import process.
[PUBDEV-6368] - Fixed an SVM import failure.
[PUBDEV-6376] - Fixed an issue that caused the default StackedEnsemble prediction to fail when applied to a test dataset without a response column.
[PUBDEV-6379] - Fixed handling of BAD state in CategoricalWrapperVec.

New Feature

[PUBDEV-4680] - Added Blending mode to Stacked Ensembles, which can be specified with the blending_frame parameter. With Blending mode, you do not use cross-validation predictions to train the metalearner. Instead, you score the base models on a holdout set and use those predicted values.
[PUBDEV-5801] - Model output now includes column names and types.
[PUBDEV-5809] - AutoML now includes a max_runtime_secs_per_model option.
[PUBDEV-5925] - In GLM, added support for the negative binomial family.
[PUBDEV-5980] - Exposed Java target encoding to R.
[PUBDEV-6056] - For GBM and XGBoost models, users can now generate feature contributions (SHAP values).
[PUBDEV-6136] - Added support for Generic Models, which provide a means to use external, pretrained MOJO models in H2O for scoring. Currently only GBM, DRF, IF, and GLM MOJO models are supported.
[PUBDEV-6180] - Added the blending_frame parameter to Stacked Ensembles in Flow.
[PUBDEV-6196] - Added an include_algos parameter to AutoML in the R and Python APIs. Note that in Flow, users can specify exclude_algos only.
[PUBDEV-6339] - In the R and Python clients, added a function that calculates the chunk size based on the raw size of the data, the number of CPU cores, and the number of nodes.
[PUBDEV-6344] - Added the ability to import from Hive using metadata from Metastore.
[PUBDEV-6358] - Users can now choose the database where import_sql_select creates a temporary table.
[PUBDEV-6365] - Added support for monotonicity constraints for binomial GBMs.
[PUBDEV-6374] - Users can now define custom HTTP headers using an -add_http_header option.
[PUBDEV-6386] - XGBoost MOJO now uses the Java predictor by default.
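
The Blending mode described in PUBDEV-4680 above can be illustrated with a small, library-free sketch. This is not H2O's implementation and does not call the H2O API: the two toy base models, the single-weight metalearner, and the data are all assumptions invented for illustration. It only shows the flow the note describes, where base models are trained on the training data, scored on a holdout set, and the metalearner is fit on those holdout predictions instead of on cross-validation predictions.

```python
from statistics import mean

# Toy (x, y) data, invented for illustration only.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
# Holdout rows; these play the role that blending_frame plays in the note above.
blend = [(5.0, 10.3), (6.0, 11.6), (7.0, 14.2)]


def fit_mean_model(rows):
    """Hypothetical base model A: always predicts the mean training target."""
    avg = mean(y for _, y in rows)
    return lambda x: avg


def fit_slope_model(rows):
    """Hypothetical base model B: least-squares line through the origin."""
    slope = sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)
    return lambda x: slope * x


def fit_metalearner(preds_a, preds_b, targets):
    """Return the weight w that minimises sum((y - (w*a + (1-w)*b))**2)."""
    num = sum((a - b) * (y - b) for a, b, y in zip(preds_a, preds_b, targets))
    den = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    return num / den


def sse(model, rows):
    """Sum of squared errors of a model on (x, y) rows."""
    return sum((y - model(x)) ** 2 for x, y in rows)


# 1. Train the base models on the training data only.
model_a = fit_mean_model(train)
model_b = fit_slope_model(train)

# 2. Score the base models on the holdout rows (no cross-validation involved).
preds_a = [model_a(x) for x, _ in blend]
preds_b = [model_b(x) for x, _ in blend]

# 3. Fit the metalearner on those holdout predictions.
w = fit_metalearner(preds_a, preds_b, [y for _, y in blend])


def ensemble(x):
    """Blend the two base models using the learned weight."""
    return w * model_a(x) + (1 - w) * model_b(x)


print(f"weight on model A: {w:.3f}")
print(f"holdout SSE: A={sse(model_a, blend):.3f} "
      f"B={sse(model_b, blend):.3f} ensemble={sse(ensemble, blend):.3f}")
```

Because the metalearner here is fit on the same holdout rows it is evaluated on, its holdout error can never exceed that of either base model. A real workflow would keep a separate test set for an honest estimate.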

Task

[PUBDEV-4982] - Fixed an issue that caused the pyunit_lending_club_munging_assembly_large.py and pyunit_assembly_munge_large.py tests to sometimes fail when run inside a Docker container.
[PUBDEV-5876] - Simplified and improved the GLM COD implementation.

Improvement

[PUBDEV-5491] - SQLite support is available via any JDBC driver in streaming mode.
[PUBDEV-5993] - Updated Retrofit and OkHttp dependencies.
[PUBDEV-6129] - Target Encoding is now available in the Python client.
[PUBDEV-6176] - Moved StackedEnsembleModel to the hex.ensemble package. In prior versions, this was in the root hex package.
[PUBDEV-6188] - Secret key ID and secret key can now be set for the s3:// AWS protocol. In the R client, use: h2o.setS3Credentials(accessKeyId, accesSecretKey). In the Python client, use: from h2o.persist import set_s3_credentials, followed by: set_s3_credentials(access_key_id, secret_access_key).
[PUBDEV-6217] - Users can now specify AWS credentials at runtime.
[PUBDEV-6254] - The new blending_frame parameter is now available in AutoML.
[PUBDEV-6334] - Fixed an error in the Javadoc for the Frame.java sort function.
[PUBDEV-6363] - Fixed Hive delegation token generation.
[PUBDEV-6388] - Reordered the algorithms trained in AutoML and prioritized hardcoded XGBoost models.
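
PUBDEV-6129 above makes Target Encoding available in the Python client. For readers unfamiliar with the technique, the following is a minimal, library-free sketch of plain mean target encoding: each categorical level is replaced by the average of the target over the rows that have that level. It does not use the H2O API and is not H2O's implementation; the function names, the fallback to the global mean for unseen levels, and the data are assumptions made for illustration.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets):
    """Compute the mean target per category level, plus the global mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, target in zip(categories, targets):
        sums[level] += target
        counts[level] += 1
    mapping = {level: sums[level] / counts[level] for level in sums}
    prior = sum(targets) / len(targets)  # used for levels never seen in fitting
    return mapping, prior


def apply_target_encoding(categories, mapping, prior):
    """Replace each level with its fitted mean, or the prior if unseen."""
    return [mapping.get(level, prior) for level in categories]


# Toy data, invented for illustration only.
colors = ["red", "blue", "red", "green", "blue", "red"]
clicked = [1, 0, 1, 0, 1, 0]

mapping, prior = fit_target_encoding(colors, clicked)
encoded = apply_target_encoding(["red", "blue", "purple"], mapping, prior)
print(mapping)
print(encoded)
```

Encoding the same rows that were used to compute the means leaks the target into the features, so in practice the mapping is usually fit on held-out data or out-of-fold rows.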

Docs

[PUBDEV-4977] - Removed the FAQ indicating that Java 9 was not yet supported.
[PUBDEV-6136] - Added a "Generic Models" chapter to the Algorithms section.
[PUBDEV-6179] - Added the blending_frame parameter to the Stacked Ensembles documentation.
[PUBDEV-6280] - Added information about the Negative Binomial family to the GLM booklet and the user guide.
[PUBDEV-6289] - Improved the R and Python client documentation for the sum function.
[PUBDEV-6331] - Added include_algos, exclude_algos, max_models, and max_runtime_secs_per_model examples to the Parameters appendix.
[PUBDEV-6362] - In the User Guide and the R and Python documentation, replaced references to "H2O Cloud" with "H2O Cluster".
[PUBDEV-6375] - Added information about predict_contributions to the Performance and Prediction chapter.
[PUBDEV-6381] - In the GBM chapter, noted that monotone_constraints is available for Bernoulli distributions in addition to Gaussian distributions.
Improved the GBM Reproducibility FAQ.
