H2O/CHANGELOG and H2O Releases (Page 10)

All Versions

188

Latest Version

3.38.0.3

Avg Release Cycle

13 days

Changelog History

Page 8

v3.22.1.4 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xu/4/index.html

Bug

[PUBDEV-6242] - Users can now save and load Isolation Forest models. 🛠 [PUBDEV-6264] - In K-Means, fixed an issue in which time columns were treated as if they were categorical. 0️⃣ [PUBDEV-6267] - Fixed Autoencoder calculateReconstructionErrorPerRowData error and set the default value of the result MSE to -1.

Improvement

[HEXDEV-733] - When using h2o.import_sql_table to read from a Hive table, the username and password no longer appear in the logs. [PUBDEV-6207] - Monotone constraints are now exposed in Flow. [PUBDEV-6277] - The check for constants in response columns is now optional for all models.

📄 Docs

📚 [PUBDEV-6032] - Added to the documentation that MOJO/POJO predict cannot parse columns enclosed in double quotes (for example, ""2""). ⚡️ [PUBDEV-6174] - Updated the description for Gini in the User Guide. 🛠 [PUBDEV-6183] - Fixed the equation for Tweedie Deviance in the GLM booklet and in the User Guide. [PUBDEV-6199] - Added a "Tokenize Strings" topic to the Data Manipulation chapter. [PUBDEV-6245] - Added predict_leaf_node_assignment information to the User Guide in the Performance and Prediction chapter. 📚 [PUBDEV-6253] - Noted in the documentation that the custom and custom_increasing stopping metric options are not available in the R client.
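
For readers who want to check the Tweedie Deviance entry numerically, here is a minimal sketch of the commonly cited textbook form of the weighted Tweedie deviance for a variance power 1 < p < 2. The changelog does not reproduce the corrected equation, so this form, the function name `tweedie_deviance`, and its arguments are assumptions for illustration, not a copy of the H2O documentation.

```python
def tweedie_deviance(y, mu, p=1.5, weights=None):
    """Weighted Tweedie deviance for variance power 1 < p < 2.

    Assumed textbook form (not taken from the H2O docs):
    D = 2 * sum_i w_i * ( y_i^(2-p) / ((1-p)(2-p))
                          - y_i * mu_i^(1-p) / (1-p)
                          + mu_i^(2-p) / (2-p) )
    Requires y_i >= 0 and mu_i > 0.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch only covers 1 < p < 2")
    if weights is None:
        weights = [1.0] * len(y)
    total = 0.0
    for w, yi, mi in zip(weights, y, mu):
        total += w * (
            yi ** (2.0 - p) / ((1.0 - p) * (2.0 - p))
            - yi * mi ** (1.0 - p) / (1.0 - p)
            + mi ** (2.0 - p) / (2.0 - p)
        )
    return 2.0 * total
```

A perfect prediction (every `mu` equal to its `y`) gives a deviance of zero, which is a quick sanity check for any implementation of this form.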
v3.22.1.3 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xu/3/index.html

Bug

[PUBDEV-6186] - Improved error handling for a wrong Hive JDBC connector error. 🛠 [PUBDEV-6233] - Fixed an issue that caused H2O clusters to fail to come up on Cloudera 6 with HTTPS.

New Feature

👍 [PUBDEV-6216] - Added Hive with Kerberos support for H2O on Hadoop.

📄 Docs

⚡️ [PUBDEV-6219] - Updated the default value for min_rows in the User Guide when used with XGBoost, DRF, and Isolation Forest.
v3.22.1.2 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xu/2/index.html

Bug

🚀 [PUBDEV-6109] - In Flow, fixed an issue that caused POJOs, MOJOs, and genmodel.jar to fail to download. This occurred when Flow was launched via Enterprise Steam and in any deployment where user_context was specified. 🛠 [PUBDEV-6177] - Fixed an issue that caused H2OTree to fail with Isolation Forest models trained on data with categorical columns. [PUBDEV-6178] - When a new tree is assembled from a model, the root node now includes information about the split feature in the description array. 🛠 [PUBDEV-6181] - Fixed an issue where Flow failed to provide the ability to ignore certain columns. 🛠 [PUBDEV-6192] - In Flow, fixed an issue where users were not able to select a frame when splitting a dataset. [PUBDEV-6197] - Setting the ignored_columns parameter via the Python API now works correctly. 🚀 [PUBDEV-6198] - Fixed an issue that caused H2O to hang in Sparkling Water deployments. [PUBDEV-6200] - Splitting frames now works correctly in Flow. [PUBDEV-6201] - Import SQL Table now works correctly in Flow. 🛠 [PUBDEV-6203] - Fixed an issue with imports in Flow. 🛠 [PUBDEV-6204] - Fixed interaction pairs for GLM in Flow. 🛠 [PUBDEV-6206] - Fixed broken "Combine predictions with frame" in Flow.

New Feature

👍 [PUBDEV-6146] - Added support for HDP 3.1.

Task

[PUBDEV-6171] - Fixed the pyunit_pubdev_3500_max_k_large.py unit test. [PUBDEV-6172] - Fixed the runit_PUBDEV_5705_drop_columns_parser_gz.R unit test.

Improvement

✅ [PUBDEV-6167] - Increased the XGBoost stress test timeout. [PUBDEV-6188] - Implemented secret key credentials for s3:// AWS protocol. [PUBDEV-6205] - Renamed .jade files to .pug.

📄 Docs

👍 [PUBDEV-6165] - Added HDP 3.0 and 3.1 to list of supported Hadoop versions. ⚡️ [PUBDEV-6190] - Updated wording for the K-Means Scoring History Graph. This graph shows the number of iterations vs. the within-cluster sum of squares.
v3.22.1.1 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xu/1/index.html

Bug

✅ [PUBDEV-5236] - PCA tests now work correctly with the "from h2o.estimators.pca import H2OPrincipalComponentAnalysisEstimator" import statement. ✅ [PUBDEV-5956] - Fixed an AutoMLTest test that was leaking keys in KeepCrossValidationFoldAssignment test. [PUBDEV-6081] - Reduced the Invocation JMH level setup/teardown to only the training model. 0️⃣ [PUBDEV-6124] - In XGBoost, the default value of L2 regularization for tree models is now 1, which is consistent with native XGBoost. 🛠 [PUBDEV-6157] - Fixed an issue that caused Stacked Ensembles to fail with GLM metalearner when the same H2O instance was used to train a GLM multinomial classification model with more classes than what is used in Stacked Ensembles.

New Feature

[PUBDEV-5261] - Users can now specify custom and custom_increasing when setting the stopping_metric parameter in GBM and DRF. [PUBDEV-5770] - Checkpoints can now be exported when running Grid Search or AutoML.

Task

[PUBDEV-5894] - Added support for CDH 6.0, which includes Hadoop 3 support. Be sure to review https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_600_release_notes.html for more information. ✅ [PUBDEV-5953] - Fixed an AutoMLTest that was leaking keys. ✅ [PUBDEV-6085] - Added a test that runs multiple nfolds>0 DRF models in parallel. 👍 [PUBDEV-6153] - Added support for CDH 6.1.

Improvement

🏗 [PUBDEV-5820] - Hadoop builds now work with Jetty 8 and 9. 💅 [PUBDEV-5897] - R examples in the R package docs now use Hadley's style guide.

📄 Docs

📚 [PUBDEV-6048] - Added documentation for the new stopping_metric options in GBM and DRF. 👍 [PUBDEV-6154] - Added CDH 6 and 6.1 to list of supported Hadoop versions. ⚡️ [PUBDEV-6156] - In the XGBoost chapter, updated the default value for reg_lambda to be 1.
v3.22.0.5 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xia/5/index.html

Bug

🚀 [PUBDEV-6198] - Fixed an H2O hang issue in Sparkling Water deployments.
v3.22.0.4 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xia/4/index.html

Bug

🚀 [PUBDEV-6109] - In Flow, fixed an issue that caused POJOs, MOJOs, and genmodel.jar to fail to download. This occurred when Flow was launched via Enterprise Steam and in any deployment where user_context was specified. [PUBDEV-6166] - On the external backend, H2O now explicitly passes the timestamp from the Spark Driver node.
v3.22.0.3 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xia/3/index.html

Bug

🛠 [PUBDEV-5829] - Fixed an issue with the REST API. Calling "get model" no longer returns 0 for the timestamp of the model. [PUBDEV-5959] - The PySparkling client no longer hangs after re-connecting to the H2O external backend. 🛠 [PUBDEV-5990] - Fixed an OOM issue in h2o.arrange. 🛠 [PUBDEV-6059] - Fixed an issue that caused importing Parquet files with large Double data to fail. [PUBDEV-6076] - After applying group_by to a time stamped column, the original time stamp format is now retained. 0️⃣ [PUBDEV-6079] - In AutoML, cross-validation metrics are now used for early stopping by default. Because of this, the validation_frame argument is now ignored unless nfolds==0 and, in that case, will be used for early stopping. 🛠 [PUBDEV-6098] - Fixed an issue that caused the MOJO visualizer to fail for Isolation Forest models. [PUBDEV-6101] - StackedEnsembleMojoModel is now serializable. 🛠 [PUBDEV-6107] - In the R client, fixed an error that occurred when running getModelTree. 🚀 [PUBDEV-6109] - In Flow, fixed an issue that caused POJOs, MOJOs, and genmodel.jar to fail to download. This occurred when Flow was launched via Enterprise Steam and in any deployment where user_context was specified. 🛠 [PUBDEV-6111] - Fixed the formula used for calculating L2 distance. [PUBDEV-6117] - The Python client now allows users to compare H2O XGBoost with native XGBoost using any H2O frame. The convert_H2OFrame_2_DMatrix method accepts any H2O frame and can convert it to valid data for native XGBoost. [PUBDEV-6120] - H2O XGBoost now reports correct variable importances. The variable importances are computed from the gains of their respective loss functions during tree construction. [PUBDEV-6122] - Users can now save PDP plots. 🛠 [PUBDEV-6123] - Fixed an issue that resulted in a SQL exception when connecting H2O to a SQL server and importing a table. 🛠 [PUBDEV-6137] - Fixed an issue with GCS support on Hadoop environments.
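
To illustrate the idea behind the variable importance entry (PUBDEV-6120), here is a minimal sketch of gain-based importance: sum the loss reduction ("gain") of every split that uses a feature, then scale the sums so they add up to 1. The function name `importance_from_gains`, the `(feature, gain)` input layout, and the normalization step are assumptions for illustration; the changelog does not describe how H2O scales or reports the values.

```python
from collections import defaultdict


def importance_from_gains(splits):
    """Aggregate per-split gains into relative feature importances.

    splits: iterable of (feature_name, gain) pairs, one pair per tree split
    (a hypothetical input format chosen for this sketch).
    Returns a dict ordered from most to least important, with values summing
    to 1.0 (or all zeros if no split produced any gain).
    """
    totals = defaultdict(float)
    for feature, gain in splits:
        totals[feature] += gain

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {feature: 0.0 for feature in totals}

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {feature: gain / grand_total for feature, gain in ranked}
```

A feature used in many low-gain splits can still rank below a feature used once in a high-gain split, which is the main difference from counting how often a feature appears.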

New Feature

[PUBDEV-1984] - Added monotonic variables for GBM. [PUBDEV-6030] - EasyPredictModelWrapper now calculates reconstruction errors for AutoEncoder. [PUBDEV-6091] - When running a grid search, the grid summary table now includes a timestamp column that shows when each model was added.

Improvement

[PUBDEV-5865] - In GBM, users can now specify the monotone_constraints parameter. [PUBDEV-6106] - Prediction contributions from each tree from MOJO to easywrapper are now exposed. ⚡️ [PUBDEV-6110] - Updated Gradle to version 5.0. 🛠 [PUBDEV-6115] - Fixed the output of rankTsv in the AutoML leaderboard.

📄 Docs

⚡️ [PUBDEV-4377] - Updated the Prediction section to include information on how the prediction threshold is selected for classification problems. ⚡️ [PUBDEV-6105] - Updated the description of enum_limited to indicate that T=1024. [PUBDEV-6148] - In the GBM chapter, added monotone_constraints to list of available parameters.
v3.22.0.2 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xia/2/index.html

Bug

📜 [PUBDEV-3281] - Fixed an issue that caused the ARFF parser to parse some files incorrectly. 🛠 [PUBDEV-4737] - When performing a grid search in Python, fixed an issue that caused all models to return a model.type of "supervised." [PUBDEV-5352] - When running DRF in the Python client, checkpointing on new data now works correctly. 🛠 [PUBDEV-5869] - Fixed an issue that caused the confusion matrix recall and precision values to be switched. 🛠 [PUBDEV-6036] - In the Python client, fixed an issue that caused the offset_column parameter to be ignored when it was passed in the GLM train statement. [PUBDEV-6042] - The H2O Tree Handler now works correctly on Isolation Forest models. 🛠 [PUBDEV-6046] - When running AutoML, fixed an issue that resulted in a "Failed to get metric: auc from ModelMetrics type BinomialGLM" message. [PUBDEV-6050] - In Flow, Precision and Recall definitions are no longer inverted in the confusion matrix. 🛠 [PUBDEV-6052] - Fixed the error message that displays when converting from a pandas dataframe to an h2oframe in Python 3.6. 🛠 [PUBDEV-6054] - In XGBoost, fixed an issue that resulted in a "Maximum amount of file descriptors hit" message. 🛠 [PUBDEV-6060] - Fixed the description of sample_rate in Isolation Forest. 0️⃣ [PUBDEV-6063] - Cross validation models are no longer deleted by default. 🛠 [PUBDEV-6065] - When viewing an AutoML leaderboard, fixed an issue that resulted in an ArrayIndexOutOfBoundsException if sort_metric was specified but no model was built.
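
Because two of the fixes above (PUBDEV-5869 and PUBDEV-6050) concern swapped precision and recall values, a short reminder of the standard definitions may help when checking output from an affected release. This is a minimal sketch of the usual binary-classification formulas, not H2O code; the function name `precision_recall` is hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    precision = tp / (tp + fp)  -> share of predicted positives that are correct
    recall    = tp / (tp + fn)  -> share of actual positives that were found
    Returns 0.0 for a metric whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

With 8 true positives, 2 false positives, and 8 false negatives, precision is 8/10 and recall is 8/16, so a report showing the two numbers the other way round points to the swap described in these entries.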

New Feature

[PUBDEV-5766] - Added monotonicity constraints to H2O XGBoost.

Task

[PUBDEV-6039] - When generating MOJOs, h2o-genmodel.jar now includes a check for MOJO version 1.3 to determine whether the h2o-genmodel.jar and the MOJO version can work together. Prior versions of h2o-3 did not include MOJO 1.3, and as a result, MOJOs silently returned predicted values executed on an empty vector.

Improvement

📜 [PUBDEV-5705] - With a new skipped_columns option, users can now specify to drop specific columns before parsing. Note that this functionality is not supported for SVMLight or Avro file formats. [PUBDEV-6062] - The GLM multinomial coefficient table now includes the original levels as column names.

📄 Docs

🐎 [PUBDEV-3216] - Created new Performance & Prediction and Variable Importance sections in the User Guide. 0️⃣ [PUBDEV-5313] - Updated the default value of categorical_encoding for XGBoost. This defaults to Auto (which is one_hot_encoding). ⚡️ [PUBDEV-6012] - In the parameter entry for weights_column, updated the example to exclude the weight column in the list of predictors. ⚡️ [PUBDEV-6016] - In the DRF FAQ, updated the "What happens when you try to predict on a categorical level not seen during training?" question. 📄 [PUBDEV-6025] - TargetingEncoder is now included in the Python module docs. 📚 [PUBDEV-6041] - In GLM, updated the documentation to indicate that coordinate_descent is no longer experimental. [PUBDEV-6064] - Added default values for max_depth, sample_size, and sample_rate. Also added a parameter description entry for sample_size, showing an Isolation Forest example. [PUBDEV-6086] - Added the new monotone_constraints option to the XGBoost chapter.
v3.22.0.1 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-xia/1/index.html

Bug

[PUBDEV-5023] - In Python, the metalearner method is only available for Stacked Ensembles. ✅ [PUBDEV-5658] - Fixed an issue that caused micro benchmark tests to fail to run in the jmh directory. 🛠 [PUBDEV-5663] - Fixed an issue that caused H2O to fail to export dataframes to S3. [PUBDEV-5745] - Added the keep_cross_validation_models argument to Grid Search. [PUBDEV-5746] - Improved efficiency of the keep_cross_validation_models parameter in AutoML. [PUBDEV-5777] - Simplified the comparison of H2OXGBoost with native XGBoost when using the Python client. 🛠 [PUBDEV-5780] - Fixed JDBC ingestion for Teradata databases. [PUBDEV-5824] - In the Python client and the Java API, multiple runs of the same AutoML instance no longer fail training new "Best Of Family" SE models that would include the newly generated models. 🛠 [PUBDEV-5873] - Fixed an issue that resulted in an AssertionError when calling cbind from the Python client. [PUBDEV-5881] - AutoML now enforces case for the sort_metric option when using the Java API. ⚙ [PUBDEV-5903] - In AutoML, StackedEnsemble models are now always trained, even if the max_runtime_secs limit is reached. 📚 [PUBDEV-5904] - In the R client, added documentation for helper functions. [PUBDEV-5922] - Renamed x to X in the H2O-sklearn fit method to be consistent with the sklearn API. 🔀 [PUBDEV-5924] - Merging datasets now works correctly. 🏗 [PUBDEV-5931] - Building on Maven with h2o-ext-xgboost on versions later than 3.18.0.11 no longer results in a dependency error. 📜 [PUBDEV-5933] - Fixed a Java 11 ORC file parsing failure. ⬆️ [PUBDEV-5954] - Upgraded the version of the lodash package used in H2O Flow. [PUBDEV-5967] - -ip localhost now works correctly on WSL. 📜 [PUBDEV-5971] - CSV/ARFF Parser no longer treats blank lines as data lines with NAs. [PUBDEV-5976] - Starting h2o-3 from the Python Client no longer fails on Java 10.0.2.
🛠 [PUBDEV-5995] - Fixed an issue that caused StackedEnsemble MOJO model to return an "IllegalArgumentException: categorical value out of range" message. 🚚 [PUBDEV-5996] - Removed the "nclasses" parameter from tree traversal routines. [PUBDEV-5998] - Exposed H2OXGBoost parameters used to train a model to the Python API. Previously, this information was visible in the Java backend but was not passed back to the Python API. 🚚 [PUBDEV-5999] - Removed "illegal reflective access" warnings when starting H2O-3 with Java 10. [PUBDEV-6004] - In Stacked Ensembles, changes made to data during scoring now apply to all models. ⚡️ [PUBDEV-6005] - When running AutoML in Flow, updated the list of algorithms that can be selected in the "Exclude These Algorithms" section.

New Feature

[PUBDEV-5170] - Individual predictions of GBM trees are now exposed in the MOJO API. [PUBDEV-5378] - Exposed target encoding in the Java API. [PUBDEV-5399] - The keep_cross_validation_fold_assignment option is now available in AutoML. 👍 [PUBDEV-5609] - Added support for the Isolation Forest algorithm in H2O-3. Note that this is a Beta version of the algorithm. [PUBDEV-5668] - Added the keep_cross_validation_fold_assignment option to AutoML in Flow. 🔖 [PUBDEV-5681] - h2o.connect no longer ignores strict_version_check=FALSE when connecting to a Steam cluster. 👷 [PUBDEV-5695] - Created an R demo for CoxPH. This is available here. [PUBDEV-5775] - It is now possible to combine two models into one MOJO, with the second model using the prediction from the first model as a feature. These models can be from any algorithm or combination of algorithms except Word2Vec. [PUBDEV-5852] - Implemented h2oframe.fillna(method='backward'). [PUBDEV-5977] - Improved the speed of AutoML training on smaller datasets in client mode (Sparkling Water). [PUBDEV-5979] - Exposed Java Target Encoding in the Python client. 🚚 [PUBDEV-5988] - Users can now specify a -features parameter when starting h2o from the command line. This allows users to remove experimental or beta algorithms when starting H2O-3. Available options for this parameter include beta, stable, and experimental.
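
As a rough picture of what a backward fill (PUBDEV-5852) does, the sketch below fills each missing value from the next non-missing value that follows it, working on a plain Python list rather than an H2O frame. The function name `fill_backward` and the `maxlen` cap on how many consecutive missing values get filled are assumptions for illustration; the changelog does not spell out the exact semantics of the H2O method.

```python
def fill_backward(values, maxlen=1):
    """Fill None entries using the next non-missing value in the sequence.

    At most `maxlen` consecutive missing values before each valid value are
    filled; anything further away stays None. Trailing missing values have
    no later value to copy from, so they are left as None.
    """
    filled = list(values)
    next_valid = None   # closest non-missing value to the right
    distance = 0        # how many missing entries since that value
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is not None:
            next_valid = filled[i]
            distance = 0
        elif next_valid is not None:
            distance += 1
            if distance <= maxlen:
                filled[i] = next_valid
    return filled
```

For example, `fill_backward([None, None, 3, None, 5], maxlen=1)` leaves the first entry missing because it is two steps away from the nearest later value.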

Task

[PUBDEV-4507] - Added XGBoost to AutoML. [PUBDEV-5696] - Added an option to allow users to use a user-specified JDBC driver. [PUBDEV-5722] - Exposed pr_auc to areas where you can find AUC, including scoring_history, model summary. Also added h2o.pr_auc() in R. 👍 [PUBDEV-5901] - Added support for Java 11. 📚 [PUBDEV-6001] - Improved the AutoML documentation in the User Guide.

Improvement

[PUBDEV-5590] - Added a MAX_USR_CONNECTIONS_KEY argument to limit the number of sessions for import_sql_table. 🐎 [PUBDEV-5669] - Improved performance gap when importing data using Hive2. [PUBDEV-5719] - Improved and cleaned up output for the h2o.mojo_predict_csv and h2o.mojo_predict_df functions. [PUBDEV-5743] - Users can now visualize XGBoost trees when running predictions. [PUBDEV-5761] - Added weights to partial dependence plots. Also added a level for missing values. [PUBDEV-5822] - Users can now download the genmodel.jar in Flow for completed models. [PUBDEV-5886] - In AutoML, changed the default for keep_cross_validation_models and keep_cross_validation_predictions from True to False. 👍 [PUBDEV-5888] - Added support for predicting using the XGBoost Predictor. ⚡️ [PUBDEV-5909] - In XGBoost, optimized the matrix exchange between Java and native C++ code. [PUBDEV-5913] - Improved the h2o-3 README for installing in R and IntelliJ IDEA. [PUBDEV-5927] - Introduced a simple "streaming" mode that allows H2O to read from a table using basic SQL:92 constructs. [PUBDEV-5929] - In AutoML, stopping_metric is now based on sort_metric. [PUBDEV-5952] - The requirements.txt file now includes the Colorama version. [PUBDEV-5961] - In lockable.java, delete is now final in order to prevent inconsistent overrides. ⏪ [PUBDEV-5964] - Reverted AutoML naming change from Auto.Algo to Auto.algo. [PUBDEV-6000] - In AutoML, automatic partitioning of the validation frame now uses 10% of the training data instead of 20%. [PUBDEV-6002] - Changed model and grid indexing in autogenerated model names in AutoML to be 1 instead of 0 indexed. [PUBDEV-6017] - Allow public access to H2O instances started from R/Python. This can be done with the new bind_to_localhost (Boolean) parameter, which can be specified in h2o.init().

📄 Docs

🏗 [PUBDEV-4505] - Added Scala and Java examples to the Building and Extracting a MOJO topic. [PUBDEV-4590] - Added a Scala example to the Stacked Ensembles topic. 📚 [PUBDEV-5949] - Added Tree class method to the Python module documentation. 📚 [PUBDEV-5641] - Removed references to UDP in the documentation. 🚚 [PUBDEV-5664] - Removed Sparkling Water topics from H2O-3 User Guide. These are in the Sparkling Water User Guide. 🔊 [PUBDEV-5674] - Added a Resources section to the Overview and included links to the awesome-h2o repository, H2O.ai blogs, and customer use cases. 📚 [PUBDEV-5693] - Updated GCP Installation documentation with information about quota limits. 📚 [PUBDEV-5709] - Updated Gains/Lift documentation. 16 groups are now used by default. [PUBDEV-5756] - Added Python examples to the Cross-Validation topic in the User Guide. [PUBDEV-5762] - Added loss_by_col and loss_by_col_idx to list of GLRM parameters. [PUBDEV-5810] - Updated documentation for class_sampling_factors. balance_classes must be enabled when using class_sampling_factors. 🐳 [PUBDEV-5839] - Added a Python example for initializing and starting h2o-3 in Docker. 📚 [PUBDEV-5857] - Updated the Admin menu documentation in Flow after adding "Download Gen Model" option. 👍 [PUBDEV-5905] - In GBM and DRF, enum_limited is a supported option for categorical_encoding. 💻 [PUBDEV-5962] - Added the -notify_local flag to list of flags available when starting H2O-3 from the command line. 📚 [PUBDEV-5982] - Added documentation for Isolation Forest (beta).
v3.20.0.9 Changes

🚀 Download at: http://h2o-release.s3.amazonaws.com/h2o/rel-wright/9/index.html

Bug

🛠 [PUBDEV-5930] - Fixed an issue that caused H2O to fail when loading a GLRM model.

Improvement

[PUBDEV-5938] - log4j.properties can be loaded from classpath. 🔧 [PUBDEV-5939] - Buffer configuration is now available for http/https connections.
