Changelog History
-
v0.8.0 Changes
December 18, 2019
Ray 0.8.0 Release Notes
This is the first release with gRPC direct calls enabled by default for both tasks and actors, which substantially improves task submission performance.
Highlights
- Enable gRPC direct calls by default (#6367). In this mode, actor tasks are sent directly from actor to actor over gRPC; the Raylet only coordinates actor creation. Similarly, normal tasks are submitted directly from worker to worker over gRPC; the Raylet only coordinates the scheduling decisions. In addition, small objects (<100KB in size) are no longer placed in the object store; they are inlined into task submissions and returns when possible.
Note: in some cases, reconstruction of large evicted objects is not possible with direct calls. To revert to the 0.7.7 behavior, set the environment variable `RAY_FORCE_DIRECT=0`.
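As a minimal sketch, the variable must be set in the driver's environment before Ray is initialized. The `use_direct_calls` check below is purely illustrative of how such a flag might be read; it is not Ray's actual internal logic.

```python
import os

# Revert to the pre-0.8.0 (Raylet-mediated) code path by disabling
# direct calls before Ray starts up.
os.environ["RAY_FORCE_DIRECT"] = "0"

# Illustrative check only: Ray reads this flag internally at startup.
use_direct_calls = os.environ.get("RAY_FORCE_DIRECT", "1") != "0"
print(use_direct_calls)  # False
```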
Core
- [Dashboard] Add remaining features from old dashboard (#6489)
- Ray Kubernetes Operator Part 1: readme, structure, config and CRD related files (#6332)
- Make sure numpy >= 1.16.0 is installed for fast pickling support (#6486)
- Avoid workers starting with the same random seed (#6471)
- Properly handle a forwarded task that gets forwarded back (#6271)
RLlib
- (Bug fix) Remove the extra 0.5 in the Diagonal Gaussian entropy (#6475)
- AlphaZero and Ranked reward implementation (#6385)
Tune
- Add example and tutorial for DCGAN (#6400)
- Report trials by state fairly (#6395)
- Fixed bug in PBT where the initial trial result is empty (#6351)
Other Libraries
- [sgd] Add support for multi-model multi-optimizer training (#6317)
- [serve] Added deadline awareness (#6442)
- [projects] Return parameters for a command (#6409)
- [streaming] Streaming data transfer and python integration (#6185)
Thanks
We thank the following contributors for their work on this release:
@zplizzi, @istoica, @ericl, @mehrdadn, @walterddr, @ujvl, @alindkhare, @timgates42, @chaokunyang, @eugenevinitsky, @kfstorm, @Maltimore, @visatish, @simon-mo, @AmeerHajAli, @wumuzi520, @robertnishihara, @micafan, @pcmoritz, @zhijunfu, @edoakes, @sytelus, @ffbin, @richardliaw, @Qstar, @stephanie-wang, @Coac, @mitchellstern, @MissiontoMars, @deanwampler, @hhbyyh, @raulchen
-
v0.7.7 Changes
December 16, 2019
Ray 0.7.7 Release Notes
Highlights
- Remote functions and actors now support keyword and positional arguments (#5606).
- `ray.get` now supports a `timeout` argument (#6107). If the object isn't available before the timeout passes, a `RayTimeoutError` is raised.
- Ray now supports detached actors (#6036), which persist beyond the lifetime of the script that creates them and can be referred to by a user-defined name.
- Added documentation for how to deploy Ray on YARN clusters using Skein (#6119, #6173).
- The Ray scheduler now attempts to schedule tasks fairly to avoid starvation (#5851).
Core
- Progress towards a new backend architecture where tasks and actor tasks are submitted directly between workers. #5783, #5991, #6040, #6054, #6075, #6088, #6122, #6147, #6171, #6177, #6118, #6188, #6259, #6277
- Progress towards Windows compatibility. #6071, #6204, #6205, #6282
- Now using cloudpickle_fast for serialization by default, which supports more types of Python objects without sacrificing performance. #5658, #5805, #5960, #5978
- Various bugfixes. #5946, #6175, #6176, #6231, #6253, #6257, #6276
RLlib
- Now using PyTorch's built-in check for GPU availability. #5890
- Fixed APEX priorities returning zero all the time. #5980
- Fixed leak of TensorFlow assign operations in DQN/DDPG. #5979
- Fixed choosing the wrong neural network model for Atari in 0.7.5. #6087
- Added large-scale regression test for RLlib. #6093
- Fixed and added test for LR annealing config. #6101
- Reduced log verbosity. #6154
- Added a microbatch optimizer with an A2C example. #6161
Tune
- Search algorithms now use early stopped trials for optimization. #5651
- Metrics are now output in a tabular format, with errors on a separate table. #5822
- In the distributed setting, checkpoints are now deleted automatically post-sync using an rsync flag. Checkpoints on the driver are garbage collected according to the policy defined by the user. #5877
- A much faster ExperimentAnalysis tool. #5962
- Trial executor callbacks now take in a `Runner` parameter. #5868
- Fixed `queue_trials` to enable cluster autoscaling with a CPU-only head node. #5900
- Added a TensorBoardX logger. #6133
Other Libraries
- Serving: Progress towards a new Ray serving library. #5854, #5886, #5894, #5929, #5937, #5961, #6051
Thanks
We thank the following contributors for their amazing contributions:
@zhuohan123, @jovany-wang, @micafan, @richardliaw, @waldroje, @mitchellstern, @visatish, @mehrdadn, @istoica, @ericl, @adizim, @simon-mo, @lsklyut, @zhu-eric, @pcmoritz, @hhbyyh, @suquark, @sotte, @hershg, @pschafhalter, @stackedsax, @edoakes, @mawright, @stephanie-wang, @ujvl, @ashione, @couturierc, @AdamGleave, @robertnishihara, @DaveyBiggers, @daiyaanarfeen, @danyangz, @AmeerHajAli, @mimoralea
-
v0.7.6 Changes
October 24, 2019
Ray 0.7.6 Release Notes
Highlights
The Ray autoscaler now supports Kubernetes as a backend (#5492). This makes it possible to start a Ray cluster on top of your existing Kubernetes cluster with a simple shell command.
- Please see the Kubernetes section of the autoscaler documentation to get started.
- This is a new feature and may be rough around the edges. If you run into problems or have suggestions for how to improve Ray on Kubernetes, please file an issue.
The Ray cluster dashboard has been revamped (#5730, #5857) to improve the UI and include logs and error messages. More improvements will be coming in the near future.
- You can try out the dashboard by starting Ray with `ray.init(include_webui=True)` or `ray start --include-webui`.
- Please let us know if you have suggestions for what would be most useful to you in the new dashboard.
Core
- Progress towards refactoring the Python worker on top of the core worker. #5750, #5771, #5752
- Fix an issue in local mode where multiple actors didn't work properly. #5863
- Fix class attributes and methods for actor classes. #5802
- Improvements in error messages and handling. #5782, #5746, #5799
- Serialization improvements. #5841, #5725
- Various documentation improvements. #5801, #5792, #5414, #5747, #5780, #5582
RLlib
- Added a link to BAIR blog posts in the documentation. #5762
- Tracing for eager TensorFlow policies with `tf.function`. #5705
Tune
- Improved MedianStoppingRule. #5402
- Add PBT + Memnn example. #5723
- Add support for function-based stopping condition. #5754
- Save/Restore for Suggestion Algorithms. #5719
- TensorBoard HParams for TF2.0. #5678
Thanks
We thank the following contributors for their amazing contributions:
@hershg, @JasonWayne, @kfstorm, @richardliaw, @batzner, @vakker, @robertnishihara, @stephanie-wang, @gehring, @edoakes, @zhijunfu, @pcmoritz, @mitchellstern, @ujvl, @simon-mo, @ecederstrand, @mawright, @ericl, @anthonyhsyu, @suquark, @waldroje
-
v0.7.5
September 17, 2019 -
v0.7.4 Changes
September 05, 2019
Ray 0.7.4 Release Notes
Highlights
There were many documentation improvements (#5391, #5389, #5175). As we continue to improve the documentation we value your feedback through the "Doc suggestion?" link at the top of the documentation. Notable improvements:
- We've added guides for best practices using TensorFlow and PyTorch.
- We've revamped the Walkthrough page for Ray users, providing a better experience for beginners.
- We've revamped guides for using Actors and inspecting internal state.
Ray now supports memory limits to ensure memory-intensive applications run predictably and reliably. You can activate them through the `ray.remote` decorator:

```python
@ray.remote(
    memory=2000 * 1024 * 1024,
    object_store_memory=200 * 1024 * 1024)
class SomeActor(object):
    def __init__(self, a, b):
        pass
```

You can set limits for the heap and the object store; see the documentation.
There is now preliminary support for projects; see the project documentation. Projects allow you to package your code and easily share it with others, ensuring a reproducible cluster setup. To get started, you can run:

```shell
# Create a new project.
ray project create <project-name>
# Launch a session for the project in the current directory.
ray session start
# Open a console for the given session.
ray session attach
# Stop the given session and all of its worker nodes.
ray session stop
```

Check out the examples. This is an actively developed new feature, so we appreciate your feedback!
Breaking change: The `redis_address` parameter was renamed to `address` (#5412, #5602), and the former will be removed in the future.
Core
- Move Java bindings on top of the core worker #5370
- Improve log file discoverability #5580
- Clean up and improve error messages #5368, #5351
RLlib
- Support custom action space distributions #5164
- Add TensorFlow eager support #5436
- Add autoregressive KL #5469
- Autoregressive Action Distributions #5304
- Implement MADDPG agent #5348
- Port Soft Actor-Critic on Model v2 API #5328
- More examples: Add CARLA community example #5333 and rock paper scissors multi-agent example #5336
- Moved RLlib to top level directory #5324
Tune
- Experimental Implementation of the BOHB algorithm #5382
- Breaking change: Nested dictionary results are now flattened for CSV writing: `{"a": {"b": 1}} => {"a/b": 1}` #5346
- Add Logger for MLflow #5438
- TensorBoard support for TensorFlow 2.0 #5547
- Added examples for XGBoost and LightGBM #5500
- HyperOptSearch now has warmstarting #5372
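The nested-result flattening described above can be sketched with a small helper. This is a hypothetical illustration of the behavior, not Tune's actual implementation:

```python
def flatten_result(nested, parent=""):
    """Flatten a nested dict, joining key paths with '/'."""
    flat = {}
    for key, value in nested.items():
        path = f"{parent}/{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the accumulated key path.
            flat.update(flatten_result(value, path))
        else:
            flat[path] = value
    return flat

print(flatten_result({"a": {"b": 1}}))  # {'a/b': 1}
```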
Other Libraries
- SGD: Tune interface for Pytorch MultiNode SGD #5350
- Serving: The old version of ray.serve was deprecated #5541
- Autoscaler: Fix ssh control path limit #5476
- Dev experience: Ray CI tracker online at https://ray-travis-tracker.herokuapp.com/
- Various fixes: Fix log monitor issues #4382 #5221 #5569, the top-level ray directory was cleaned up #5404
Thanks
We thank the following contributors for their amazing contributions:
@jon-chuang, @lufol, @adamochayon, @idthanm, @RehanSD, @ericl, @michaelzhiluo, @nflu, @pengzhenghao, @hartikainen, @wsjeon, @raulchen, @TomVeniat, @layssi, @jovany-wang, @llan-ml, @ConeyLiu, @mitchellstern, @gregSchwartz18, @jiangzihao2009, @jichan3751, @mhgump, @zhijunfu, @micafan, @simon-mo, @richardliaw, @stephanie-wang, @edoakes, @akharitonov, @mawright, @robertnishihara, @lisadunlap, @flying-mojo, @pcmoritz, @jredondopizarro, @gehring, @holli, @kfstorm
-
v0.7.3
July 31, 2019 -
v0.7.2 Changes
July 03, 2019
Core
- Improvements
- Python
- Java
- Allow users to set JVM options at actor creation time. #4970
- Internal
- Performance
Tune
- Add directional metrics for components. #4120, #4915
- Disallow setting `resources_per_trial` when it is already configured. #4880
- Make PBT quantile fraction configurable. #4912
RLlib
- Add QMIX mixer parameters to optimizer param list. #5014
- Allow Torch policies access to full action input dict in `extra_action_out_fn`. #4894
- Allow access to batches prior to postprocessing. #4871
- Throw an error if `sample_async` is used with PyTorch for A3C. #5000
- Patterns & User Experience
- Documentation
Other Libraries
- Add support for distributed training with PyTorch. #4797, #4933
- Autoscaler will kill workers on exception. #4997
- Fix handling of non-integral timeout values in `signal.receive`. #5002
Thanks
We thank the following contributors for their amazing contributions: @jiangzihao2009, @raulchen, @ericl, @hershg, @kfstorm, @kiddyboots216, @jovany-wang, @pschafhalter, @richardliaw, @robertnishihara, @stephanie-wang, @simon-mo, @zhijunfu, @ls-daniel, @ajgokhale, @rueberger, @suquark, @guoyuhong, @jovany-wang, @pcmoritz, @hartikainen, @timonbimon, @TianhongDai
-
v0.7.1 Changes
June 23, 2019
Core
- Change global state API. #4857
  - `ray.global_state.client_table()` -> `ray.nodes()`
  - `ray.global_state.task_table()` -> `ray.tasks()`
  - `ray.global_state.object_table()` -> `ray.objects()`
  - `ray.global_state.chrome_tracing_dump()` -> `ray.timeline()`
  - `ray.global_state.cluster_resources()` -> `ray.cluster_resources()`
  - `ray.global_state.available_resources()` -> `ray.available_resources()`
- Export remote functions lazily. #4898
- Begin moving worker code to C++. #4875, #4899, #4898
- Upgrade arrow to latest master. #4858
- Upload wheels to S3 under `<branch-name>/<commit-id>`. #4949
- Add hash table to Redis-Module. #4911
- Initial support for distributed training with PyTorch. #4797
Tune
- Disallow setting `resources_per_trial` when it is already configured. #4880
- Initial experiment tracking support. #4362
RLlib
- Begin deprecating Python 2 support in RLlib. #4832
- TensorFlow 2 compatibility. #4802
- Allow Torch policies access to full action input dict in `extra_action_out_fn`. #4894
- Allow access to batches prior to postprocessing. #4871
- Port algorithms to `build_trainer()` pattern. #4823
- Rename `PolicyEvaluator` -> `RolloutWorker`. #4820
- Rename `PolicyGraph` -> `Policy`, move from evaluation/ to policy/. #4819
- Support continuous action distributions in IMPALA/APPO. #4771
(Revision: 6/23/2019 - Accidentally included commits that were not part of the release.)
-
v0.7.0 Changes
May 18, 2019
Core
- Backend bug fixes. #4766, #4763, #4605
- Add experimental API for creating resources at runtime. #3742
RLlib
- Remove dependency on TensorFlow. #4764
- TD3/DDPG improvements and MuJoCo benchmarks. #4694
- Evaluation mode implementation for rllib.Trainer class. #4647
- Replace ray.get() with ray_get_and_free() to automatically free object store memory. #4586
- RLlib bug fixes. #4736, #4735, #4652, #4630
-
v0.6.6 Changes
April 19, 2019
Core
- Add `delete_creating_tasks` option for `internal.free()`. #4588
Tune
- Add filter flag for Tune CLI. #4337
- Better handling of `tune.function` in global checkpoint. #4519
- Add compatibility to nevergrad 0.2.0+. #4529
- Add `--columns` flag for CLI. #4564
- Add checkpoint eraser. #4490
- Fix checkpointing for Gym types. #4619
RLlib
- Report sampler performance metrics. #4427
- Ensure stats are consistently reported across all algos. #4445
- Cleanup `TFPolicyGraph`. #4478
- Make batch timeout for remote workers tunable. #4435
- Fix inconsistent weight assignment operations in `DQNPolicyGraph`. #4504
- Add support for LR schedule to DQN/APEX. #4473
- Add option for RNN state and value estimates to span episodes. #4429
- Create a combination of `ExternalEnv` and `MultiAgentEnv`, called `ExternalMultiAgentEnv`. #4200
- Support `prev_state`/`prev_action` in rollout and fix multiagent. #4565
- Support torch device and distributions. #4553
Java
- TestNG outputs more verbose error messages. #4507
- Implement `GcsClient`. #4601
- Avoid unnecessary memory copy and add a benchmark. #4611
Autoscaler
- Add