MXNet v1.0.0 Release Notes
Release Date: 2017-12-04
MXNet Change Log
1.0.0
Performance
- Enhanced the performance of the sparse.dot operator.
- MXNet now automatically sets OpenMP to use all available CPU cores to maximize CPU utilization when OMP_NUM_THREADS is not set.
- Unary and binary operators now avoid using OpenMP on small arrays when multithreading overhead would hurt performance.
- Significantly improved performance of broadcast_add, broadcast_mul, etc. on CPU.
- Added bulk execution to imperative mode. You can control segment size with mxnet.engine.bulk. As a result, the speed of Gluon in hybrid mode is improved, especially on small networks and multiple GPUs.
- Improved speed for ctypes invocation from the Python frontend.
New Features - Gradient Compression [Experimental]
- Speed up multi-GPU and distributed training by compressing communication of gradients. This is especially effective when training networks with large fully-connected layers. In Gluon this can be activated with compression_params in Trainer.
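As a rough illustration of the idea, the sketch below quantizes each gradient value to one of three levels and carries the quantization error into the next step. The helper name two_bit_compress, the threshold of 0.5, and the exact quantization rule are assumptions made for illustration and are not taken from these notes. The hypothetical make_trainer helper shows how compression_params might be passed to a Gluon Trainer, assuming a dict with 'type' and 'threshold' keys.

```python
# Illustrative sketch only: threshold quantization with residual feedback.
COMPRESSION_PARAMS = {"type": "2bit", "threshold": 0.5}  # assumed example values


def two_bit_compress(grad, residual, threshold=0.5):
    """Return (quantized, new_residual) for one gradient vector.

    Each value of grad + residual is mapped to -threshold, 0.0 or +threshold;
    whatever was not sent is kept in the residual for the next step.
    """
    quantized, new_residual = [], []
    for g, r in zip(grad, residual):
        acc = g + r
        if acc >= threshold:
            q = threshold
        elif acc <= -threshold:
            q = -threshold
        else:
            q = 0.0
        quantized.append(q)
        new_residual.append(acc - q)
    return quantized, new_residual


def make_trainer(net):
    """Hypothetical helper: build a Gluon Trainer with gradient compression."""
    from mxnet import gluon  # imported lazily so the sketch above runs without MXNet

    return gluon.Trainer(
        net.collect_params(),
        "sgd",
        {"learning_rate": 0.1},
        kvstore="device",
        compression_params=COMPRESSION_PARAMS,
    )


grad = [0.7, 0.2, -0.9, 0.3]
step1, residual = two_bit_compress(grad, [0.0] * len(grad))
step2, residual = two_bit_compress(grad, residual)
```

In this sketch, small values such as 0.3 are not dropped: they accumulate in the residual and are sent once the running total crosses the threshold.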
New Features - Support of NVIDIA Collective Communication Library (NCCL) [Experimental]
- Use kvstore='nccl' for (in some cases) faster training on multiple GPUs.
- Significantly faster than kvstore='device' when batch size is small.
- It is recommended to set the environment variable NCCL_LAUNCH_MODE to PARALLEL when using NCCL version 2.1 or newer.
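A minimal sketch of the setup described above, assuming an MXNet build with NCCL support. The helper name create_nccl_kvstore is hypothetical; the environment variable is set from Python here only for convenience and could equally be exported in the shell.

```python
import os

# Set this before MXNet (and therefore NCCL) is initialized in the process.
os.environ["NCCL_LAUNCH_MODE"] = "PARALLEL"


def create_nccl_kvstore():
    """Hypothetical helper: create the NCCL-backed key-value store."""
    import mxnet as mx  # lazy import; assumes an MXNet build with NCCL support

    return mx.kv.create("nccl")
```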
New Features - Advanced Indexing [General Availability]
- NDArray now supports advanced indexing (both slice and assign) as specified by the numpy standard: https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing with the following restrictions:
  - If key is a list type, only a list of integers is supported, e.g. key=[1, 2] is supported, while key=[[1, 2]] is not.
  - Ellipsis (...) and np.newaxis are not supported.
  - Boolean array indexing is not supported.
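The restrictions above can be read as a checklist on the indexing key. The sketch below encodes only those three rules in plain Python; the function name unsupported_reason is hypothetical, and this is not MXNet's own validation logic.

```python
def unsupported_reason(key):
    """Return why `key` hits a listed restriction, or None if it hits none."""
    parts = key if isinstance(key, tuple) else (key,)
    for part in parts:
        if part is Ellipsis:
            return "Ellipsis is not supported"
        if part is None:  # np.newaxis is an alias for None
            return "np.newaxis is not supported"
        if isinstance(part, list):
            # bool is a subclass of int in Python, so test for it first
            if any(isinstance(item, bool) for item in part):
                return "boolean indexing is not supported"
            if not all(isinstance(item, int) for item in part):
                return "only a flat list of integers is supported"
    return None


ok_key = [1, 2]              # flat list of integers
nested_key = [[1, 2]]        # nested list
mixed_key = (slice(1, 3), [0, 2])  # basic slice combined with an integer list
```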
New Features - Gluon [General Availability]
- Performance optimizations discussed above.
- Added support for loading data in parallel with multiple processes to gluon.data.DataLoader. The number of workers can be set with num_workers. Windows is not supported yet.
- Added Block.cast to support networks with different data types, e.g. float16.
- Added Lambda block for wrapping a user-defined function as a block.
- Generalized gluon.data.ArrayDataset to support an arbitrary number of arrays.
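The Gluon additions listed above might be combined as in the following sketch. The array shapes, layer sizes, and the helper name build_loader_and_net are illustrative assumptions; the worker count is passed through the DataLoader keyword num_workers.

```python
# Assumed example settings for the data loader.
LOADER_KWARGS = {"batch_size": 4, "shuffle": True, "num_workers": 2}


def build_loader_and_net():
    """Hypothetical helper exercising ArrayDataset, DataLoader, Lambda and cast."""
    import mxnet as mx  # lazy import so this file loads without MXNet installed
    from mxnet import gluon

    # ArrayDataset with three arrays instead of the usual (data, label) pair.
    features = mx.nd.random.uniform(shape=(16, 8))
    labels = mx.nd.arange(16)
    weights = mx.nd.ones((16,))
    dataset = gluon.data.ArrayDataset(features, labels, weights)

    # Batches are prepared by worker processes when num_workers > 0.
    loader = gluon.data.DataLoader(dataset, **LOADER_KWARGS)

    # Lambda wraps a plain function as a block.
    net = gluon.nn.Sequential()
    net.add(gluon.nn.Dense(4))
    net.add(gluon.nn.Lambda(lambda x: mx.nd.relu(x)))
    net.initialize()

    # Block.cast converts the parameters to another data type.
    net.cast("float16")
    return loader, net
```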
New Features - ARM / Raspberry Pi support [Experimental]
- MXNet now compiles and runs on ARMv6, ARMv7, and ARM64, including Raspberry Pi devices. See https://github.com/apache/incubator-mxnet/tree/master/docker_multiarch for more information.
New Features - NVIDIA Jetson support [Experimental]
- MXNet now compiles and runs on NVIDIA Jetson TX2 boards with GPU acceleration.
- You can install the Python MXNet package on a Jetson board by running:
  $ pip install mxnet-jetson-tx2
New Features - Sparse Tensor Support [General Availability]
- Added more sparse operators: contrib.SparseEmbedding, sparse.sum and sparse.mean.
- Added asscipy() for easier conversion to scipy.
- Added check_format() for sparse ndarrays to check if the array format is valid.
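To give a sense of what a format check on a CSR array can involve, the sketch below validates the three CSR components in plain Python. The specific rules and the function name csr_format_error are assumptions about what "valid" typically means for CSR, not a description of MXNet's implementation. The hypothetical mxnet_roundtrip helper shows how check_format() and asscipy() might be called on the same data (asscipy() additionally needs scipy installed).

```python
def csr_format_error(data, indices, indptr, shape):
    """Return a description of the first CSR inconsistency found, or None."""
    rows, cols = shape
    if len(indptr) != rows + 1:
        return "indptr must have rows + 1 entries"
    if indptr[0] != 0 or indptr[-1] != len(data):
        return "indptr must start at 0 and end at len(data)"
    if len(indices) != len(data):
        return "indices and data must have the same length"
    for row in range(rows):
        start, end = indptr[row], indptr[row + 1]
        if start > end:
            return "indptr must be non-decreasing"
        row_indices = indices[start:end]
        if any(not 0 <= c < cols for c in row_indices):
            return "column index out of range"
        if any(a >= b for a, b in zip(row_indices, row_indices[1:])):
            return "column indices must be strictly increasing within a row"
    return None


# A 3x4 matrix: row 0 has entries in columns 0 and 2, row 1 is empty,
# row 2 has one entry in column 1.
data = [1.0, 2.0, 3.0]
indices = [0, 2, 1]
indptr = [0, 2, 2, 3]
shape = (3, 4)


def mxnet_roundtrip():
    """Hypothetical helper: build the same matrix in MXNet and convert it."""
    import mxnet as mx  # lazy import so the checker above runs without MXNet

    csr = mx.nd.sparse.csr_matrix((data, indices, indptr), shape=shape)
    csr.check_format()    # raises if the array format is invalid
    return csr.asscipy()  # scipy.sparse.csr_matrix
```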
Bug-fixes
- Fixed a[-1] indexing not working on NDArray.
- Fixed expand_dims if axis < 0.
- Fixed a bug that caused topk to produce incorrect results on large arrays.
- Improved numerical precision of unary and binary operators for float64 data.
- Fixed derivatives of log2 and log10. They used to be the same as the derivative of log.
- Fixed a bug that caused MXNet to hang after fork. Note that you still cannot use the GPU in child processes after fork due to limitations of CUDA.
- Fixed a bug that caused CustomOp to fail when using auxiliary states.
- Fixed a security bug that caused MXNet to listen on all available interfaces when running training in distributed mode.
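For reference on the log2 and log10 fix: the derivative of log base b is 1 / (x ln b), so it differs from the derivative of the natural log, 1 / x, by a constant factor. The short check below uses only Python's standard library (not MXNet) to compare a finite-difference estimate against these formulas; the evaluation point x = 3.0 is arbitrary.

```python
import math


def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 3.0
d_log = 1.0 / x                       # derivative of natural log
d_log2 = 1.0 / (x * math.log(2))      # derivative of log2
d_log10 = 1.0 / (x * math.log(10))    # derivative of log10

est_log2 = numeric_derivative(math.log2, x)
est_log10 = numeric_derivative(math.log10, x)
```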
Doc Updates
- Added a security best practices document under the FAQ section.
- Fixed license headers, including restoring copyright attributions.
- Documentation updates.
- Links for viewing source.
For more information and examples, see the full release notes.