Changelog History
-
v0.5.19 Changes
This release brings optimization improvements for code that uses dictionaries. Subscripts are now lowered to dictionary accesses where possible, and there is new code generation for known dictionary values. Besides this there is the usual range of bug fixes.
Bug Fixes
- Fix, attribute assignments or deletions where the assigned value or the attribute source was statically raising crashed the compiler.
- Fix, the evaluation order of source and value in attribute assignments was considered in the wrong order during optimization.
- Windows: Fix, when ``g++`` is in the path, it was not used automatically, but now it is.
- Windows: Detect the 32 bits variant of MinGW64 too.
- Python3.4: The finalize of compiled generators could corrupt reference counts for shared generator objects. Fixed in 0.5.18.1 already.
- Python3.5: The finalize of compiled coroutines could corrupt reference counts for shared generator objects.
Optimization
- When a variable is known to have dictionary shape (assigned from a constant value, the result of the ``dict`` built-in, or a general dictionary creation), or the branch merge thereof, we lower subscripts from expecting mapping nodes to dictionary specific nodes. These generate more efficient code, and some are then known to not raise an exception.

  .. code-block:: python

     def someFunction(a, b):
         value = {a: b}
         value["c"] = 1
         return value

  The above function is not yet fully optimized (dictionary key/value tracing is not yet finished), however it at least knows that no exception can be raised from assigning ``value["c"]`` anymore and creates more efficient code for the typical ``result = {}`` functions.
- The use of "logical" sharing during optimization has been replaced with checks for actual sharing. So closure variables that were written to in dead code no longer inhibit optimization of the then no longer shared local variable.
- Global variable traces now decide definite writes faster, without the need to check traces for this each time.
Cleanups
- No longer using "logical sharing" allowed removing that function entirely.
- "Technical sharing" is used less often for decisions during optimization, relying more often on the proper variable registry instead.
- Variables are now connected with their global variable trace statically, avoiding the need to check the variable registry for it.
- Removed old and mostly unused "assume unclear locals" indications, we use global variable traces for this now.
Summary
This release aimed at dictionary tracing. As a first step, the value assignment is now traced to have a dictionary shape, and this is then used to lower operations that used to be normal subscript operations on mappings into more specific dictionary operations.
Making use of knowledge about dictionary values, i.e. tracing keys and values, is not yet in scope, but expected to follow. We got the first signs of type inference here, but to really take advantage, more specific shape tracing will be needed.
-
v0.5.18 Changes
This release mainly has a scalability focus. While there are a few compatibility improvements, the larger goal has been to make Nuitka compilation and the final C compilation faster.
Bug Fixes
- Compatibility: The nested arguments functions can now be called using their keyword arguments.

  .. code-block:: python

     # Nested arguments are Python2 only syntax.
     def someFunction(a, (b, c)):
         return a, b, c

     someFunction(a = 1, **{".1" : (2, 3)})

- Compatibility: Generators with Python3.4 or higher now also have a ``__del__`` attribute, and therefore properly participate in finalization. This should improve their interactions with garbage collection reference cycles, although no issues had been observed so far.
- Windows: Was outputting command line arguments debug information at program start. `Issue#284 <http://bugs.nuitka.net/issue284>`__. Fixed in 0.5.17.1 already.
Optimization
- Code generated for parameter parsing is now a lot less verbose. Python level loops and conditionals that generated code for each variable have been replaced with generic C level code. This will speed up the backend compilation by a lot.
- Function calls with constant arguments were sped up specifically, as their call is now fully prepared, and yet uses less code. Variable arguments are also faster, and all defaulted arguments are also much faster. Method calls are not affected by these improvements though.
- Nested argument functions now have a quick call entry point as well, making them faster to call too.
- The ``slice`` built-in, and internal creation of slices (e.g. in re-formulations of Python3 slices as subscripts) cannot raise. `Issue#262 <http://bugs.nuitka.net/issue262>`__.
- Standalone: Avoid inclusion of bytecode of ``unittest.test``, ``sqlite3.test``, ``distutils.test``, and ``ensurepip``. These are not needed, but simply bloat the amount of bytecode used on e.g. macOS. `Issue#272 <http://bugs.nuitka.net/issue272>`__.
- Speed up compilation with Nuitka itself by avoiding copying and constructing variable lists as much as possible, using an always accurate variable registry.
Cleanups
- Nested argument functions of Python2 are now re-formulated into a wrapping function that directly calls the actual function body with the unpacking of nested arguments done in nodes explicitly. This allows for better optimization and checks of these steps and potential in-lining of these functions too.
- Unified slice object creation and built-in ``slice`` nodes, these were two distinct nodes before.
- The code generation for all statement kinds is now done via dispatching from a dictionary instead of long ``elif`` chains.
- Named nodes more consistently, e.g. all loop related nodes start with ``Loop`` now, making them easier to group.
- Parameter specifications got simplified to work without variables where it is possible.
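The switch from ``elif`` chains to dictionary dispatch mentioned above can be pictured with a minimal sketch. All names here (``STATEMENT_HANDLERS``, ``generate_statement_code``, the statement dictionaries) are hypothetical and only illustrate the general technique, they are not Nuitka's actual code generation API.

```python
def generate_return(statement):
    # Hypothetical handler for a "return" statement description.
    return "return %s;" % statement["value"]


def generate_break(statement):
    # Hypothetical handler for a "break" statement description.
    return "break;"


# One table lookup replaces a long "if kind == ... elif kind == ..." chain.
STATEMENT_HANDLERS = {
    "return": generate_return,
    "break": generate_break,
}


def generate_statement_code(statement, handlers=STATEMENT_HANDLERS):
    kind = statement["kind"]
    try:
        handler = handlers[kind]
    except KeyError:
        raise ValueError("no code generator registered for %r" % kind)
    return handler(statement)
```

Adding a new statement kind then only needs a new table entry, and an unknown kind fails in one place rather than falling through a chain.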
Organizational
- Nuitka is now available on the social code platform GitLab as well.
Summary
Long standing weaknesses have been addressed in this release, and quite a few structural cleanups have been performed. For example, strengthening the role of the variable registry so that it is always accurate lays the groundwork for further improvement of optimization.
However, this release cycle was mostly dedicated to the performance of the actual compilation, and more accurate information was needed to e.g. not search for information that should be instantly available.
Upcoming releases will focus on usability issues and further optimization. It was nice however to see speedups of created code even from these scalability improvements.
-
v0.5.17 Changes
This release is a major feature release, as it adds full support for Python3.5 and its coroutines. In addition, in order to properly support coroutines, the generator implementation got enhanced. On top of that, there is the usual range of corrections.
Bug Fixes
- Windows: Command line arguments that are unicode strings were not working properly.
- Compatibility: Fix, only the code object attached to exceptions contained all variable names, but not the one of the function object.
- Python3: Support for virtualenv on Windows was using non-portable code and therefore failing. `Issue#266 <http://bugs.nuitka.net/issue266>`__.
- The tree displayed with ``--display-tree`` duplicated all functions and did not resolve source lines for functions. It also displayed unused functions, which is not helpful.
- Generators with parameters leaked C level memory for each instance of them, leading to memory bloat for long running programs that use a lot of generators. Fixed in 0.5.16.1 already.
- Don't drop positional arguments when called with ``--run``, also make it an error if they are present without that option.
New Features
- Added full support for Python3.5, coroutines work now too.
Optimization
- Optimized frame access of generators to not use both a local frame variable and the frame object stored in the generator object itself. This gave about 1% speed up to setting them up.
- Avoid having multiple code objects for functions that can raise and have local variables. Previously one code object would be used to create the function (with parameter variable names only) and when raising an exception, another one would be used (with all local variable names). Creating them both at start-up was wasteful and also needed two tuples to be created, thus more constants setup code.
- The entry point for generators is now shared code instead of being generated for each one over and over. This should make things more cache local and also results in less generated C code.
- When creating frame codes, avoid working with strings, but use proper emission for less memory churn during code generation.
Organizational
- Updated the key for the Debian/Ubuntu repositories to remain valid for 2 more years.
- Added support for Fedora 23.
- MinGW32 is no longer supported, use MinGW64 in the 32 bits variant, which has fewer issues.
Cleanups
- Detecting the function type ahead of time allows handling generators differently from normal functions immediately.
- Massive removal of code duplication between normal functions and generator functions. The latter are now normal functions creating generator objects, which makes them much more lightweight.
- The ``return`` statement in generators is now immediately set to the proper node as opposed to doing this in variable closure phase only. We can now use the ahead knowledge of the function type.
- The ``nonlocal`` statement is now immediately checked for syntax errors as opposed to doing that only in variable closure phase.
- The name of contraction making functions is no longer skewed to empty, but the real thing instead. The code name is solved differently now.
- The ``local_locals`` mode for function nodes was removed, it was always true ever since Python2 list contractions stopped using pseudo functions.
- The outline nodes allowed providing a body when creating them, although creating that body required using the outline node already to create temporary variables. Removed that argument.
- Removed PyLint false positive annotations no longer needed for PyLint 1.5 and solved some TODOs.
- Code objects are now mostly created from specs (not yet complete) which are attached and shared between statement frames and function creation nodes, in order to have less guess work to do.
Tests
- Added the CPython3.5 test suite.
- Updated generated doctests to fix typos and use common code in all CPython test suites.
Summary
This release continues to address technical debt. Adding support for Python3.5 was the major driving force, while at the same time removing obstacles to the changes that were needed for coroutine support.
With Python3.5 sorted out, it will be time to focus on general optimization again, but there is more technical debt related to classes, so the cleanup has to continue.
-
v0.5.16 Changes
This is a maintenance release, largely intended to put out improved support for new platforms and minor corrections. It should improve the speed for standalone mode, and compilation in general for some use cases, but this is mostly to clean up open ends.
Bug Fixes
- Fix, the ``len`` built-in could give false values for dictionary and set creations with the same element.

  .. code-block:: python

     # This was falsely optimized to 2 even if "a is b and a == b" was true.
     len({a, b})

- Python: Fix, the ``gi_running`` attribute of generators is no longer an ``int``, but ``bool`` instead.
- Python3: Fix, the ``int`` built-in with two arguments, value and base, raised ``UnicodeDecodeError`` instead of ``ValueError`` for illegal bytes given as value.
- Python3: Using ``tokenize.open`` to read source code, instead of reading manually and decoding from ``tokenize.detect_encoding``, this handles corner cases more compatibly.
- Fix, the PyLint warnings plug-in could crash in some cases, made sure it's more robust.
- Windows: Fix, the combination of AnaConda Python, MinGW 64 bits and mere acceleration was not working. `Issue#254 <http://bugs.nuitka.net/issue254>`__.
- Standalone: Preserve not only namespace packages created by ``.pth`` files, but also make the imports done by them. This makes it more compatible with uses of it in Fedora 22.
- Standalone: The extension modules could be duplicated, turned this into an error and cache finding them during compile time and during early import resolution to avoid duplication.
- Standalone: Handle "not found" from ``ldd`` output, on some systems not all the libraries wanted are accessible for every library.
- Python3.5: Fixed support for namespace packages, these were not yet working for that version.
- Python3.5: Fixes lack of support for unpacking in normal ``tuple``, ``list``, and ``set`` creations.

  .. code-block:: python

     [*a] # this has become legal in 3.5 and now works too.

  Now also gives compatible ``SyntaxError`` for earlier versions. Python2 was good already.
- Python3.5: Fix, need to reduce compiled functions to ``__qualname__`` value, rather than just ``__name__`` or else pickling methods doesn't work.
- Python3.5: Fix, added ``gi_yieldfrom`` attribute to generator objects.
- Windows: Fixed harmless warnings for Visual Studio 2015 in ``--debug`` mode.
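Several of the fixes above are about matching CPython behaviour exactly. As a rough sketch, assuming Python 3.5 or later, the reference behaviour can be checked with a small script like the following. The helper name ``reference_behaviour`` is made up for this illustration and is not part of Nuitka or its test suite.

```python
def reference_behaviour():
    results = {}

    # The same object twice in a set display yields a set of one element.
    a = b = object()
    results["len_same_element"] = len({a, b})

    # gi_running of a generator is a bool, not an int.
    def gen():
        yield 1

    results["gi_running_type"] = type(gen().gi_running)

    # Illegal bytes given to int() with a base give a plain ValueError.
    # UnicodeDecodeError is a subclass of ValueError, so the exact type is kept.
    try:
        int(b"\xff", 16)
    except ValueError as exc:
        results["int_error_type"] = type(exc)

    # Unpacking inside a list display is legal from Python 3.5 on.
    items = (1, 2)
    results["unpacked"] = [*items]

    return results
```

Running the same script uncompiled and compiled, then comparing the returned dictionaries, is one way to spot this kind of incompatibility.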
Optimization
- Re-formulate ``exec`` and ``eval`` to default to ``globals()`` as the default for the locals dictionary in modules.
- The ``try`` node was making a description of nodes moved to the outside when shrinking its scope, which was using a lot of time, just to not be output, now these can be postponed.
- Refactored how freezing of bytecode works. Uncompiled modules are now explicit nodes too, and in the registry. We only have one or the other of it, avoiding compiling both.
Tests
- When ``strace`` or ``dtruss`` are not found, give a proper error message, so people know what to do.
- The doc tests extracted and then generated for CPython3 test suites were not printing the expressions of the doc test, leading to largely decreased test coverage here.
- The CPython 3.4 test suite is now also using common runner code, and avoids ignoring all Nuitka warnings, instead more white listing was added.
- Started to run CPython 3.5 test suite almost completely, but coroutines are blocking some parts of that, so the tests that use this feature are currently skipped.
- Removed more CPython tests that access the network and are generally useless for testing Nuitka.
- When comparing outputs, normalize typical temporary file names used on posix systems.
- Coverage tests have made some progress, and some changes were made due to its results.
- Added test to cover the too complex code module of the ``idna`` module.
- Added Python3.5 only test for unpacking variants.
Cleanups
- Prepare the plug-in interface to allow suppression of import warnings to access the node doing it, making the import node accessible.
- Have a dedicated class function body object, which is a specialization of the function body node base class. This allowed removing class specific code from that class.
- The use of "win_target" as a scons parameter was useless. Make more consistent use of it as a flag indicator in the scons file.
- Compiled types were mixing uses of ``compiled_`` prefixes, sometimes with a space, sometimes with an underscore.
Organizational
- Improved support for Python3.5 missing compatibility with new language features.
- Updated the Developer Manual with changes that SSA is now a fact.
- Added Python3.5 Windows MSI downloads.
- Added repository for Ubuntu Wily (15.10) for download. Removed Ubuntu Utopic package download, no longer supported by Ubuntu.
- Added repository with RPM packages for Fedora 22.
Summary
So this release is mostly to lower the technical debt incurred that holds it back from supporting making more interesting changes. Upcoming releases may continue that trend for some time.
This release is mostly about catching up with Python3.5, to make sure we did not miss anything important. The new function body variants will make it easier to implement coroutines, and help with optimization and compatibility problems that remain for Python3 classes.
Ultimately it will be nice to require a lot fewer checks for when function in-lining is going to be acceptable. Also code generation will need a continued push to use the new structure in preparation for making type specific code generation a reality.
-
v0.5.15 Changes
This release enables SSA based optimization, the huge leap, not so much in terms of actual performance increase, but for now making the things possible that will allow it.
This has been in the making literally for years. Over and over, there was just "one more thing" needed. But now it's there.
The release includes much stuff, and there is a perspective on the open tasks in the summary, but first on to the many details.
Bug Fixes
- Standalone: Added implicit import for ``reportlab`` package configuration dynamic import. Fixed in 0.5.14.1 already.
- Standalone: Fix, compilation of the ``ctypes`` module could happen for some import patterns, and then prevented the distribution from containing all necessary libraries. Now it is made sure to not include both the compiled and the frozen form. `Issue#241 <http://bugs.nuitka.net/issue241>`__. Fixed in 0.5.14.1 already.
- Fix, compilation for conditional statements where the boolean check on the condition cannot raise, could fail compilation. `Issue#240 <http://bugs.nuitka.net/issue240>`__. Fixed in 0.5.14.2 already.
- Fix, the ``__import__`` built-in was making static optimization assuming compile time constants to be strings, which in the error case they are not, which was crashing the compiler. `Issue#240 <http://bugs.nuitka.net/issue245>`__.

  .. code-block:: python

     __import__(("some.module",)) # tuples don't work

  This error only became apparent because Nuitka now forward propagates values in some cases.
- Windows: Fix, when installing Python2 only for the user, the detection of it via registry failed as it was only searching the system key. This was `a github pull request <https://github.com/kayhayen/Nuitka/pull/8>`__. Fixed in 0.5.14.3 already.
- Some modules have extremely complex expressions requiring too deep recursion to work on all platforms. These modules are now included entirely as bytecode fallback. `Issue#240 <http://bugs.nuitka.net/issue240>`__.
- The standard library may contain broken code due to installation mistakes. We have to ignore their ``SyntaxError``. `Issue#244 <http://bugs.nuitka.net/issue244>`__.
- Fix, pickling compiled methods was failing with the wrong kind of error, because they should not implement ``__reduce__``, but only ``__deepcopy__``. `Issue#219 <http://bugs.nuitka.net/issue219>`__.
- Fix, when running under ``wine``, the check for scons binary was fooled by existence of ``/usr/bin/scons``. `Issue#251 <http://bugs.nuitka.net/issue251>`__.
New Features
- Added experimental support for Python3.5, coroutines don't work yet, but it works perfectly as a 3.4 replacement.
- Added experimental Nuitka plug-in framework, and use it for the packaging of Qt plugins in standalone mode. The API is not yet stable nor polished.
- New option ``--debugger`` that makes ``--run`` execute directly in ``gdb`` and gives a stack trace on crash.
- New option ``--profile`` executes compiled binary and outputs measured performance with ``vmprof``. This is work in progress and not functional yet.
- Started work on ``--graph`` to render the SSA state into diagrams. This is work in progress and not functional yet.
- Plug-in framework added. Not yet ready for users. Working ``PyQt4`` and ``PyQt5`` plug-in support. Experimental Windows ``multiprocessing`` support. Experimental PyLint warnings disable support. More to come.
- Added support for AnaConda accelerated mode on macOS by modifying the rpath to the Python DLL.
- Added experimental support for ``multiprocessing`` on Windows, which needs monkey patching of the module to support compiled methods.
Optimization
- The SSA analysis is now enabled by default, eliminating variables that are not shared, and can be forward propagated. This is currently limited mostly to compile time constants, but things won't remain that way.
- Code generation for many constructs now takes into account if a specific operation can raise or not. If e.g. an attribute look-up is known to not raise, then that is now decided by the node the look-up is done on, which can then more often determine this, or even the value directly.
- Calls to C-API that we know cannot raise, no longer check, but merely assert the result.
- For attribute look-up and other operations that might be known to not raise, we now only assert that it succeeds.
- Built-in look-ups cannot fail, merely assert that.
- Creation of built-in exceptions never raises, merely assert that too.
- More Python operation slots now have their own computations and some of these gained overloads for more compile time constant optimization.
- When taking an iterator cannot raise, this is now detected more often.
- The ``try``/``finally`` construct is now represented by duplicating the final block into all kinds of handlers (``break``, ``continue``, ``return``, or ``except``) and optimized separately. This allows for SSA to trace values more correctly.
- The ``hash`` built-in now has dedicated node and code generation too. This is mostly intended to represent the side effects of dictionary look-up, but gives more compact and faster code too.
- The ``type`` built-in cannot raise and has no side effect.
- Speed improvement for in-place float operations for ``+=`` and ``*=``, as these will be common cases.
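The idea of duplicating the final block into every kind of exit can be sketched at source level. This is a hand-written illustration under simplifying assumptions, not the node structure Nuitka actually builds. The function names and the ``cleanup`` helper are hypothetical.

```python
def cleanup(log):
    log.append("cleanup")


def with_finally(items, log):
    for item in items:
        try:
            if item < 0:
                continue
            if item == 0:
                break
            log.append(item)
        finally:
            cleanup(log)
    return log


def with_duplicated_final_block(items, log):
    for item in items:
        try:
            if item < 0:
                exit_kind = "continue"
            elif item == 0:
                exit_kind = "break"
            else:
                log.append(item)
                exit_kind = "normal"
        except BaseException:
            cleanup(log)  # copy of the final block for the exception exit
            raise
        if exit_kind == "continue":
            cleanup(log)  # copy for the continue exit
            continue
        if exit_kind == "break":
            cleanup(log)  # copy for the break exit
            break
        cleanup(log)  # copy for the normal exit
    return log
```

Each copy of the final block sits on exactly one control flow path, so an analysis can trace values for that path alone instead of merging all exits in one shared block.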
Tests
- Made the construct based testing executable with Python3.
- Removed warnings using the new PyLint warnings plug-in for the reflected test. Nuitka now uses the PyLint annotations to not warn. Also do not go into PyQt for reflected test, not needed. Many Python3 improvements for cases where there are differences to report.
- The optimization tests no longer use 2to3, made the tests portable to all versions.
- Checked more in-place operations for speed.
Organizational
- Many improvements to the coverage taking. We can hope to see public data from this, some improvements were triggered from this already, but full runs of the test suite with coverage data collection are yet to be done.
Summary
The release includes many important new directions. Coverage analysis will be important to remain certain of test coverage of Nuitka itself. This is mostly done, but needs more work to complete.
Then the graphing surely will help us to debug and understand code examples. So instead of tracing, and reading stuff, we should visualize things, to more clearly see how things evolve under optimization iteration, and where exactly one thing goes wrong. This will be improved as it proves necessary to do just that. So far, this has been rare. Expect this to become end user capable with time. If only to allow you to understand why Nuitka won't optimize code of yours, and what change of Nuitka it will need to improve.
The comparative performance benchmarking is clearly the most important thing to have for users. It deserves to be the top priority. Thanks to the PyPy tool ``vmprof``, we may already be there on the data taking side, but the presenting and correlation part is still open and a fair bit of work. It will be most important to empower users to make competent performance bug reports, now that Nuitka enters the phase where these things matter.
This is a lot of ground to cover, more than ever. We can make this compiler, but only if you help will it arrive in your lifetime.
-
v0.5.14 Changes
This release is an intermediate step towards value propagation, which is not considered ready for stable release yet. The major point is the elimination of the ``try``/``finally`` expressions, as they are problems to SSA. The ``try``/``finally`` statement change is delayed.
There are also a lot of bug fixes, and enhancements to code generation, as well as major cleanups of code base.
Bug Fixes
- Python3: Added support for star assignments with trailing targets.

  .. code-block:: python

     *a, b = 1, 2

  This raised ``ValueError`` before.
- Python3: Properly detect illegal double star assignments.

  .. code-block:: python

     *a, *b = c

- Python3: Properly detect the syntax error to star assign from non-tuple/list.

  .. code-block:: python

     *a = 1

- Python3.4: Fixed a crash of the binary when copying dictionaries with split tables received as star arguments.
- Python3: Fixed reference loss, when using ``raise a from b`` where ``b`` was an exception instance. Fixed in 0.5.13.8 already.
- Windows: Fix, the flag ``--disable-windows-console`` was not properly handled for MinGW32 run time resulting in a crash.
- Python2.7.10: Was not recognizing this as a 2.7.x variant and therefore not applying minor version compatibility levels properly.
- Fix, when choosing to have frozen source references, code objects were not using the same value as ``__file__`` did for its filename.
- Fix, when re-executing itself to drop the ``site`` module, make sure we find the same file again, and not according to the ``PYTHONPATH`` changes coming from it. `Issue#223 <http://bugs.nuitka.net/issue223>`__. Fixed in 0.5.13.4 already.
- Enhanced code generation for ``del variable`` statements, where it's clear that the value must be assigned.
- When pressing CTRL-C, the stack traces from both Nuitka and Scons were given, we now avoid the one from Scons.
- Fix, the dump from ``--xml`` no longer contains functions that have become unused during analysis.
- Standalone: Creating or running programs from inside unicode paths was not working on Windows. `Issue#231 <http://bugs.nuitka.net/issue231>`__ and `Issue#229 <http://bugs.nuitka.net/issue229>`__. Fixed in 0.5.13.7 already.
- Namespace package support was not yet complete, importing the parent of a package was still failing. `Issue#230 <http://bugs.nuitka.net/issue231>`__. Fixed in 0.5.13.7 already.
- Python2.6: Compatibility for exception check messages enhanced with newest minor releases.
- Compatibility: The ``NameError`` in classes needs to say ``global name`` and not just ``name`` too.
- Python3: Fixed creation of XML representation, now done without ``lxml`` as it doesn't support needed features on that version. Fixed in 0.5.13.5 already.
- Python2: Fix, when creating code for the largest negative constant to still fit into ``int``, that was only working in the main module. `Issue#228 <http://bugs.nuitka.net/issue228>`__. Fixed in 0.5.13.5 already.
- Compatibility: The ``print`` statement raised an assertion on unicode objects that could not be encoded with ``ascii`` codec.
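The three star assignment cases from the list above can be checked against CPython with a short Python3 script. This is only a sketch of the expected reference behaviour. The helper ``is_syntax_error`` is a hypothetical name introduced for this example.

```python
def is_syntax_error(source):
    # Compile only, so the code is never executed.
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


# A star target followed by a trailing target collects the leading values.
*first, last = 1, 2

double_star_rejected = is_syntax_error("*a, *b = c")
bare_star_rejected = is_syntax_error("*a = 1")
trailing_target_accepted = not is_syntax_error("*a, b = 1, 2")
```

Both illegal forms are rejected at compile time, while the trailing target form compiles and assigns a list to the star target.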
New Features
- Added support for Windows 10.
- Followed changes for Python 3.5 beta 2. Still only usable as a Python 3.4 replacement, no new features.
- Using a self compiled Python running from the source tree is now supported.
- Added support for ``AnaConda`` Python distribution. As it doesn't install the Python DLL, we copy it along for acceleration mode.
- Added support for Visual Studio 2015. `Issue#222 <http://bugs.nuitka.net/issue222>`__. Fixed in 0.5.13.3 already.
- Added support for self compiled Python versions running from build tree, this is intended to help debug things on Windows.
Optimization
- Function in-lining is now present in the code, but still disabled, because it needs more changes in other areas, before we can generally do it.
- Trivial outlines, result of re-formulations or function in-lining, are now in-lined, in case they just return an expression.
- The re-formulation for ``or`` and ``and`` has been given up, eliminating the use of a ``try``/``finally`` expression, at the cost of dedicated boolean nodes and code generation for these.

  This saves around 8% of compile time memory for Nuitka, and allows for faster and more complete optimization, and gets rid of a complicated structure for analysis.
- When a frame is used in an exception, its locals are detached. This was done more often than necessary and even for frames that are not necessarily our own ones. This will speed up some exception cases.
- When the default arguments, or the keyword default arguments (Python3) or the annotations (Python3) were raising an exception, the function definition is now replaced with the exception, saving a code generation. This happens frequently with Python2/Python3 compatible code guarded by version checks.
- The SSA analysis for loops now properly traces "break" statement situations and merges the post-loop situation from all of them. This significantly improves optimization of code following the loop.
- The SSA analysis of ``try``/``finally`` statements has been greatly enhanced. The handler for ``finally`` is now optimized for exception raise and no exception raise individually, as well as for ``break``, ``continue`` and ``return`` in the tried code. The SSA analysis for after the statement is now the result of merging these different cases, should they not abort.
- The code generation for ``del`` statements is now taking advantage should there be definite knowledge of previous value. This speeds them up slightly.
- The SSA analysis of ``del`` statements now properly decides if the statement can raise or not, allowing for more optimization.
- For list contractions, the re-formulation was enhanced using the new outline construct instead of a pseudo function, leading to better analysis and code generation.
- Comparison chains are now re-formulated into outlines too, allowing for better analysis of them.
- Exceptions raised in function creations, e.g. in default values, are now propagated, eliminating the function's code. This happens most often with Python2/Python3 in branches. On the other hand, function creations that cannot raise are also annotated now.
- Closure variables that become unreferenced outside of the function become normal variables leading to better tracing and code generation for them.
- Function creations cannot raise unless their defaults, keyword defaults or annotations do.
- Built-in references can now be converted to strings at compile time, e.g. when printed.
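To picture what re-formulating a comparison chain into an outline means, here is a hand-written source level equivalent. It is an assumption about the general shape only, not the nodes Nuitka generates. The names ``chained``, ``outlined`` and ``trace_calls`` are hypothetical.

```python
def chained(a, b, c):
    # The middle operand is evaluated once, the last one only if needed.
    return a() < b() < c()


def outlined(a, b, c):
    # Explicit temporaries, behaving like a small function with a return value.
    left = a()
    middle = b()
    first_result = left < middle
    if not first_result:
        return first_result
    return middle < c()


def trace_calls(func, values):
    # Build operand functions that record the order in which they are called.
    calls = []

    def make(value):
        def operand():
            calls.append(value)
            return value

        return operand

    result = func(*[make(value) for value in values])
    return result, calls
```

With explicit temporaries, each operand evaluation is a separate step that can be analysed on its own, which is the benefit the changelog entry points at.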
Organizational
- Removed gitorious mirror of the git repository, they shut down.
- Make it more clear in the documentation that Python2 is needed at compile time to create Python3 executables.
Cleanups
- Moved more parts of code generation to their own modules, and used registry for code generation for more expression kinds.
- Unified ``try``/``except`` and ``try``/``finally`` into a single construct that handles both through ``try``/``except``/``break``/``continue``/``return`` semantics. Finally is now solved via duplicating the handler into cases necessary.

  No longer are nodes annotated with information if they need to publish the exception or not, this is now all done with the dedicated nodes.
- The ``try``/``finally`` expressions have been replaced with outline function bodies, that instead of side effect statements, are more like functions with return values, allowing for easier analysis and dedicated code generation of much lower complexity.
- No more "tolerant" flag for release nodes, we now decide this fully based on SSA information.
- Added helper for assertions that code flow does not reach certain positions, e.g. a function must return or raise, aborting statements do not continue and so on.
- To keep cloning of code parts as simple as possible, the limited use of ``makeCloneAt`` has been changed to a new ``makeClone`` which produces identical copies, which is what we always do. And a generic cloning based on "details" has been added, requiring to make constructor arguments and details complete and consistent.
- The re-formulation code helpers have been improved to be more convenient at creating nodes.
- The old ``nuitka.codegen`` module ``Generator`` was still used for many things. These now all got moved to appropriate code generation modules, and their users got updated, also moving some code generator functions in the process.
- The module ``nuitka.codegen.CodeTemplates`` got replaced with direct uses of the proper topic module from ``nuitka.codegen.templates``, with some more added, and their names harmonized to be more easily recognizable.
- Added more assertions to the generated code, to aid bug finding.
- The autoformat now sorts pylint markups for increased consistency.
- Releases no longer have a ``tolerant`` flag, this was not needed anymore as we use SSA.
- Handle CTRL-C in scons code preventing per job messages that are not helpful and avoid tracebacks from scons, also remove more unused tools like ``rpm`` from our in-line copy.
Tests
- Added the CPython3.4 test suite.
- The CPython3.2, CPython3.3, and CPython3.4 test suites now run with Python2 giving the same errors. Previously there were a few specific errors, some with line numbers, some with a different ``SyntaxError`` being raised, due to different order of checks.

  This increases the coverage of the exception raising tests somewhat.

- Also the CPython3.x test suites now all pass with debug Python, as does the CPython 2.6 test suite with 2.6 now.
- Added tests to cover all forms of unpacking assignments supported in Python3, to be sure there are no other errors unknown to us.
- Started to document the reference count tests, and to make them more robust against SSA optimization. This will take some time and is work in progress.
- Made the compile library test robust against modules that raise a syntax error, checking that Nuitka does the same.
- Refined more tests to be directly executable with Python3, this is an ongoing effort.
Summary
This release is clearly major. It represents a huge step forward for Nuitka as it improves nearly every aspect of code generation and analysis. Removing the ``try``/``finally`` expression nodes proved to be necessary in order to even have the correct SSA in their cases. Very important optimization was blocked by it.

Going forward, the ``try``/``finally`` statements will be removed and dead variable elimination will happen, which then will give function inlining. This is expected to happen in one of the next releases.

This release is a consolidation of 8 hotfix releases, and many refactorings needed towards the next big step, which might also break things, and for that reason is going to get its own release cycle.
-
v0.5.13 Changes
This release contains the first use of SSA for value propagation and massive amounts of bug fixes and optimization. Some of the bugs that were delivered as hotfixes were only revealed when doing the value propagation, as they still could apply to real code.
Bug Fixes
- Fix, relative imports in packages were not working with absolute imports enabled via future flags. Fixed in 0.5.12.1 already.
- Loops were not properly degrading knowledge from inside the loop at loop exit, and therefore this could have led to missing checks and releases in code generation for cases with ``del`` statements in the loop body. Fixed in 0.5.12.1 already.
- The ``or`` and ``and`` re-formulation could trigger false assertions, due to early releases for compatibility. Fixed in 0.5.12.1 already.
- Fix, optimization of calls of constant objects (always an exception), crashed the compiler. This corrects `Issue#202 <http://bugs.nuitka.net/issue202>`__. Fixed in 0.5.12.2 already.
- Standalone: Added support for ``site.py`` installations with a leading ``def`` or ``class`` statement, which is defeating our attempt to patch ``__file__`` for it. This corrects `Issue#189 <http://bugs.nuitka.net/issue189>`__.
- Compatibility: In full compatibility mode, the tracebacks of ``or`` and ``and`` expressions are now as wrong as they are in CPython. Does not apply to ``--improved`` mode.
- Standalone: Added missing dependency on ``QtGui`` by ``QtWidgets`` for PyQt5.
- macOS: Improved parsing of ``otool`` output to avoid duplicate entries, which can also be entirely wrong in the case of Qt plugins at least.
- Avoid relative paths for main program with file reference mode ``original``, as it otherwise changes as the file moves.
- MinGW: The created modules depended on MinGW to be in ``PATH`` for their usage. This is no longer necessary, as we now link these libraries statically for modules too.
- Windows: For modules, the option ``--run`` to immediately load the modules had been broken for a while.
- Standalone: Ignore Windows DLLs that were attempted to be loaded, but then failed to load. This happens e.g. when both PySide and PyQt are installed, and could cause the dreaded conflicting DLLs message. The DLL loaded in error is now ignored, which avoids this.
- MinGW: The resource file used might be empty, in which case it doesn't get created, avoiding an error due to that.
- MinGW: Modules can now be created again. The run time relative code uses an API that is WinXP only, and MinGW failed to find it without guidance.
Optimization
- Make direct calls out of called function creations. Initially this applies to lambda functions only, but it's expected to become common place in coming releases. This is now 20x faster than CPython.

  .. code-block:: python

     # Nuitka avoids creating a function object, parsing function arguments:
     (lambda x: x)(something)

- Propagate assignments from non-mutable constants forward based on SSA information. This is the first step of using SSA for real compile time optimization.
- Specialized the creation of call nodes at creation, avoiding to have all kinds be the most flexible form (keyword and plain arguments), but instead only what kind of call they really are. This saves lots of memory, and makes the tree faster to visit.
- Added support for optimizing the ``slice`` built-in with compile time constant arguments to constants. The re-formulation for slices in Python3 uses these a lot, and the lack of this optimization prevented a bunch of optimization in this area. For Python2 the built-in is optimized too, but that is probably not as important.
- Added support for optimizing ``isinstance`` calls with compile time constant arguments. This avoids static exception raises in the ``exec`` re-formulation which tests for ``file`` type, and then optimization couldn't tell that a ``str`` is not a ``file`` instance. Now it can.
- Lower in-place operations on immutable types to normal operations. This will allow to compile time compute these more accurately.
- The re-formulation of loops puts the loop condition as a conditional statement with break. The ``not`` that needs to apply was only added in later optimization, leading to unnecessary compile time efforts.
- Removed per variable trace visit from optimization, removing useless code and compile time overhead. We are going to optimize things by making decisions in assignment and reference nodes based on forward looking statements using the last trace collection.
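The loop re-formulation mentioned in the list above can be pictured at the source level. This is only a sketch of the idea and not Nuitka's actual node tree; the function names ``loopOriginal`` and ``loopReformulated`` are made up for illustration.

.. code-block:: python

   def loopOriginal(values):
       result = []
       index = 0
       while index < len(values):
           result.append(values[index] * 2)
           index += 1
       return result


   def loopReformulated(values):
       result = []
       index = 0
       # The loop itself becomes unconditional. The loop condition turns into
       # a conditional statement with "break", which needs the "not" applied.
       while True:
           if not (index < len(values)):
               break
           result.append(values[index] * 2)
           index += 1
       return result

Both functions give the same result for any list, the second form only makes the exit condition an explicit statement that optimization can work on.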
New Features
- Added experimental support for Python 3.5, which seems to be passing the test suites just fine. The new ``@`` matrix multiplication operators are not yet supported though.
- Added support for patching source on the fly. This is used to work around a (now fixed) issue with ``numexpr.cpuinfo`` making type checks with the ``is`` operation, about the only thing we cannot detect.
Organizational
- Added repository for Ubuntu Vivid (15.04) for download. Removed Ubuntu Saucy and Ubuntu Raring package downloads, these are no longer supported by Ubuntu.
- Added repository for Debian Stretch, after Jessie release.
- Make it more clear in the documentation that in order to compile Python3, a Python2 is needed to execute Scons, but that the end result is a Python3 binary.
- The PyLint checker tool now can operate on directories given on the command line, and whitelists an error that is Windows only.
Cleanups
- Split up standalone code further, moving ``depends.exe`` handling to a separate module.
- Reduced code complexity of scons interface.
- Cleaned up where trace collection is being done. It was partially still done inside the collection itself instead of in the owner.
- In case of conflicting DLLs for standalone mode, these are now output with nicer formatting, that makes it easy to recognize what is going on.
- Moved code to fetch ``depends.exe`` to dedicated module, so it's not as much in the way of standalone code.
Tests
- Made ``BuiltinsTest`` directly executable with Python3.
- Added construct test to demonstrate the speed up of direct lambda calls.
- The deletion of ``@test`` for the CPython test suite is more robust now, esp. on Windows, the symbolic links are now handled.
- Added test to cover ``or`` usage with in-place assignment.
- Cover local relative ``import from .`` with ``absolute_import`` future flag enabled.
- Again, more basic tests are now directly executable with Python3.
Summary
This release is major due to the amount of ground covered. The reduction in memory usage of Nuitka itself (the C++ compiler will still use much memory) is massive and an important aspect of scalability too.

Then the SSA changes are truly the first sign of major improvements to come. In their current form, without eliminating dead assignments, the full advantage is not taken yet, but the next releases will do this, and that's a major milestone for Nuitka.

The other optimizations mostly stem from looking at things closer, and trying to work towards function in-lining, for which we are making a lot of progress now.
-
v0.5.12 Changes
This release contains massive amounts of corrections for long standing issues in the import recursion mechanism, as well as for standalone issues now visible after the ``__file__`` and ``__path__`` values have changed to become runtime dependent values.
Bug Fixes
- Fix, the ``__path__`` attribute for packages was still the original filename's directory, even if file reference mode was ``runtime``.
- The use of ``runtime`` as default file reference mode for executables, even if not in standalone mode, was making acceleration harder than necessary. Changed to ``original`` for that case. Fixed in 0.5.11.1 already.
- The constant value for the smallest ``int`` that is not yet a ``long`` is created using ``1`` due to C compiler limitations, but ``1`` was not yet initialized properly, if this was a global constant, i.e. used in multiple modules. Fixed in 0.5.11.2 already.
- Standalone: Recent fixes around ``__path__`` revealed issues with PyWin32, where modules from ``win32com.shell`` were not properly recursed to. Fixed in 0.5.11.2 already.
- The importing of modules with the same name as a built-in module inside a package falsely assumed these were the built-ins which need not exist, and then didn't recurse into them. This affected standalone mode the most, as the module was then missing entirely. This corrects `Issue#178 <http://bugs.nuitka.net/issue178>`__.

  .. code-block:: python

     # Inside "x.y" module:
     import x.y.exceptions

- Similarly, the importing of modules with the same name as standard library modules could go wrong. This corrects `Issue#184 <http://bugs.nuitka.net/issue184>`__.

  .. code-block:: python

     # Inside "x.y" module:
     import x.y.types
- Importing modules on Windows and macOS was not properly checking the case, making it associate wrong modules from files with mismatching case. This corrects `Issue#188 <http://bugs.nuitka.net/issue188>`__.
- Standalone: Importing with ``from __future__ import absolute_import`` would prefer relative imports still. This corrects `Issue#187 <http://bugs.nuitka.net/issue188>`__.
- Python3: Code generation for ``try``/``return expr``/``finally`` could lose exceptions when ``expr`` raised an exception, leading to a ``RuntimeError`` for ``NULL`` return value. The real exception was lost.
- Lambda expressions that were directly called with star arguments caused the compiler to crash.

  .. code-block:: python

     (lambda *args: args)(*args)  # was crashing Nuitka
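The case checking problem from the list above is easy to reproduce on case insensitive file systems, where ``os.path.exists`` accepts any spelling of a filename. A minimal sketch of a stricter check follows. The helper name ``hasExactCase`` is hypothetical and this is not the code Nuitka uses, it only illustrates comparing against the directory listing.

.. code-block:: python

   import os


   def hasExactCase(filename):
       """Check that the last path component exists with exactly this spelling."""
       dirname, basename = os.path.split(os.path.abspath(filename))

       try:
           # The directory listing gives the spelling stored on disk, which
           # "os.path.exists" ignores on case insensitive file systems.
           return basename in os.listdir(dirname)
       except OSError:
           return False

With a file ``types.py`` on disk, the check accepts ``types.py`` but rejects ``Types.py``, on both case sensitive and case insensitive file systems.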
New Optimization
- Focusing on compile time memory usage, cyclic dependencies of trace merges that prevented them from being released, even when replaced, were removed.
- More memory efficient updating of global SSA traces, reducing memory usage during optimization by ca. 50%.
- Code paths that cannot and therefore must not happen are now more clearly indicated to the backend compiler, allowing for slightly better code to be generated by it, as it can tell that certain code flows need not be merged.
New Features
- Standalone: On systems where ``.pth`` files inject Python packages at launch, these are now detected and taken into account. Previously Nuitka did not recognize them, due to lack of ``__init__.py`` files. These are mostly pip installations of e.g. ``zope.interface``.
- Added option ``--explain-imports`` to debug the import resolution code of Nuitka.
- Added option ``--show-memory`` to display the amount of memory used in total and how it's spread across the different node types during compilation.
- The option ``--trace-execution`` now also covers early program initialisation before any Python code runs, to ease finding bugs in this domain as well.
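For readers not familiar with ``.pth`` files, the following sketch shows the plain CPython mechanism behind them, using the standard ``site`` module. It only covers the simple case of a ``.pth`` file adding a directory to ``sys.path``, and says nothing about how Nuitka detects such packages. The function name ``demonstratePthFile`` and the file name ``demo.pth`` are made up for illustration.

.. code-block:: python

   import os
   import site
   import sys
   import tempfile


   def demonstratePthFile():
       site_dir = tempfile.mkdtemp()
       package_dir = tempfile.mkdtemp()

       # A ".pth" file lists one directory per line to be added to "sys.path".
       with open(os.path.join(site_dir, "demo.pth"), "w") as pth_file:
           pth_file.write(package_dir + "\n")

       # Process the ".pth" files of that directory, as done for the site
       # packages directories at Python launch.
       site.addsitedir(site_dir)

       return os.path.abspath(package_dir) in sys.path

Packages found this way need no ``__init__.py`` next to the ``.pth`` file, which matches the description above of why they were previously not recognized.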
Organizational
- Changed default for file reference mode to ``original`` unless standalone or module mode are used. For mere acceleration, breaking the reading of data files from ``__file__`` is useless.
- Added check that the in-line copy of scons is not run with Python3, which is not supported. Nuitka works fine with Python3, but a Python2 is required to execute scons.
- Discover more kinds of Python2 installations on Linux/macOS installations.
- Added instructions for macOS to the download page.
Cleanups
- Moved ``oset`` and ``odict`` modules which provide ordered sets and dictionaries into a new package ``nuitka.container`` to clean up the top level scope.
- Moved ``SyntaxErrors`` to ``nuitka.tree`` package, where it is used to format error messages.
- Moved ``nuitka.Utils`` package to ``nuitka.utils.Utils`` creating a whole package for utils, so as to better structure them for their purpose.
Summary
This release is a major maintenance release. Support for namespace modules injected by ``*.pth`` is a major step for new compatibility. The import logic improvements expand the ability of standalone mode widely. Many more use cases will now work out of the box, and fewer errors will be found on case insensitive systems.

Aside from memory issues, there is no new optimization though, as many of these improvements could not be delivered as hotfixes (too invasive code changes), and should be out to the users as a stable release. Real optimization changes have been postponed to the next release.
-
v0.5.11 Changes
The last release represented a significant change and introduced a few regressions, which got addressed with hot fix releases. But it also had a focus on cleaning up open optimization issues that were postponed in the last release.
New Features
- The filenames of source files as found in the ``__file__`` attribute are now made relative for all modes, not just standalone mode.

  This makes it possible to put data files along side compiled modules in a deployment. This solves `Issue#170 <http://bugs.nuitka.net/issue170>`__.

Bug Fixes
- Local functions that reference themselves were not released. They now are.

  .. code-block:: python

     def someFunction():
         def f():
             f()  # referencing 'f' in 'f' caused the garbage collection to fail.

  Recent changes to code generation attached closure variable values to the function object, so now they can be properly visited. This corrects `Issue#45 <http://bugs.nuitka.net/issue45>`__. Fixed in 0.5.10.1 already.

- Python2.6: The complex constants with real or imaginary parts ``-0.0`` were collapsed with constants of value ``0.0``. This became more evident after we started to optimize the ``complex`` built-in. Fixed in 0.5.10.1 already.

  .. code-block:: python

     complex(0.0, 0.0)
     complex(-0.0, -0.0)  # Could be confused with the above.

- Complex call helpers could leak references to their arguments. This was a regression. Fixed in 0.5.10.1 already.
- Parameter variables offered as closure variables were not properly released, only the cell object was, but not the value. This was a regression. Fixed in 0.5.10.1 already.
- Compatibility: The exception type given when accessing local variable values not initialized in a closure taking function, needs to be ``NameError`` and ``UnboundLocalError`` for accesses in the providing function. Fixed in 0.5.10.1 already.
- Fix support for "venv" on systems, where the system Python uses symbolic links too. This is the case at least on Mageia Linux. Fixed in 0.5.10.2 already.
- Python3.4: On systems where ``long`` and ``Py_ssize_t`` are different (e.g. Win64) iterators could be corrupted if used by uncompiled Python code. Fixed in 0.5.10.2 already.
- Fix, generator objects didn't release weak references to them properly. Fixed in 0.5.10.2 already.
- Compatibility: The ``__closure__`` attribute of functions was so far not supported, and rarely missing. Recent changes made it easy to expose, so now it was added. This corrects `Issue#45 <http://bugs.nuitka.net/issue45>`__.
- macOS: A linker warning about deprecated linker option ``-s`` was solved by removing the option.
- Compatibility: Nuitka was enforcing the ``__doc__`` attribute to be a string object, and gave a misleading error message. This check must not be done though, ``__doc__`` can be any type in Python. This corrects `Issue#177 <http://bugs.nuitka.net/issue177>`__.
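The ``-0.0`` issue from the list above comes from the fact that signed zeros compare equal in Python, so a constant table keyed on equality alone cannot tell them apart. A minimal sketch of a comparison that keeps them distinct follows. The helper name ``sameComplexConstant`` is hypothetical and is not taken from Nuitka.

.. code-block:: python

   import math


   def sameComplexConstant(a, b):
       """Compare complex constants so that signed zeros stay distinct."""
       return (
           a == b
           # The sign survives in "copysign", even for zero values.
           and math.copysign(1.0, a.real) == math.copysign(1.0, b.real)
           and math.copysign(1.0, a.imag) == math.copysign(1.0, b.imag)
       )

Plain ``complex(0.0, 0.0) == complex(-0.0, -0.0)`` is true, while the helper reports these two values as different constants.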
New Optimization
- Variables that need not be shared, because the uses in closure taking functions were eliminated, no longer use cell objects.
- The ``try``/``except`` and ``try``/``finally`` statements now both have actual merging for SSA, allowing for better optimization of code behind it.

  .. code-block:: python

     def f():
         try:
             a = something()
         except:
             return 2

         # Since the above exception handling cannot continue the code flow,
         # we do not have to invalidate the trace of "a", and e.g. do not have
         # to generate code to check if it's assigned.
         return a

  Since ``try``/``finally`` is used in almost all re-formulations of complex Python constructs this is improving SSA application widely. The uses of ``try``/``except`` in user code will no longer degrade optimization and code generation efficiency as much as they did.

- The ``try``/``except`` statement now reduces the scope of the tried block if possible. When no statement raised, the handling was already removed, but leading and trailing statements that cannot raise were not considered.

  .. code-block:: python

     def f():
         try:
             b = 1
             a = something()
             c = 1
         except:
             return 2

  This is now optimized to:

  .. code-block:: python

     def f():
         b = 1
         try:
             a = something()
         except:
             return 2
         c = 1

  The impact on execution speed may be marginal, but it is definitely going to improve the branch merging to be added later. Note that ``c`` can only be optimized, because the exception handler is aborting, otherwise it would change behaviour.

- The creation of code objects for standalone mode and now all code objects was creating a distinct filename object for every function in a module, despite them having the same content. This was wasteful for module loading. Now it's done only once.

  Also, when having multiple modules, the code to build the run time filename used for code objects, was calling import logic, and doing lookups to find ``os.path.join`` again and again. These are now cached, speeding up the use of many modules as well.

Cleanups
- Nuitka used to have "variable usage profiles" and still used them to decide if a global variable is written to, in which case, it stays away from doing optimization of it to built-in lookups, and later calls.

  They have been replaced by "global variable traces", which collect the traces to a variable across all modules and functions. While this is now only a replacement, getting rid of old code and basing it on SSA, later it will also allow to become more correct and more optimized.

- The standalone now queries its hidden dependencies from a plugin framework, which will become an interface to Nuitka internals in the future.
Testing
- The use of deep hashing of constants allows us to check if constants become mutated during the run-time of a program. This allows to discover corruption should we encounter it.
- The tests of CPython are now also run with Python in debug mode, but only on Linux, enhancing reference leak coverage.
- The CPython test parts which had been disabled due to reference cycles involving compiled functions, or usage of ``__closure__`` attribute, were reactivated.
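The idea of deep hashing of constants mentioned in the list above can be sketched in pure Python. This is an illustration under assumptions only: Nuitka does this in its generated C code, and the function name ``deepHash`` as well as the choice of SHA-256 are made up for this example.

.. code-block:: python

   import hashlib


   def deepHash(value):
       """Hash a constant by content, recursing into containers."""
       digest = hashlib.sha256(type(value).__name__.encode("utf8"))

       if isinstance(value, (list, tuple)):
           for element in value:
               digest.update(deepHash(element))
       elif isinstance(value, dict):
           # Sort by "repr" to not depend on insertion order.
           for key in sorted(value, key=repr):
               digest.update(deepHash(key))
               digest.update(deepHash(value[key]))
       elif isinstance(value, (set, frozenset)):
           for element in sorted(value, key=repr):
               digest.update(deepHash(element))
       else:
           digest.update(repr(value).encode("utf8"))

       return digest.digest()

Recording ``deepHash(constant)`` at creation and comparing it again at program exit reveals a mutation such as ``constant["a"].append(3)``, which an identity check of the constant object would not notice.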
Organizational
- Since Google Code has shut down, it has been removed from the Nuitka git mirrors.
Summary
This release brings exciting new optimization with the focus on the ``try`` constructs, which are now handled in a more optimal way. It is also a maintenance release, bringing out compatibility improvements, important bug fixes, and important usability features for the deployment of modules and packages, that further expand the use cases of Nuitka.

The git flow had to be applied this time to get out fixes for regressions that the big change of the last release brought, so this is also to consolidate these and the other corrections into a full release before making more invasive changes.

The cleanups are leading the way to expanded SSA applied to global variable and shared variable values as well. Already the built-in detection is now based on global SSA information, which was an important step ahead.
-
v0.5.10 Changes
This release has a focus on code generation optimization. Doing major changes away from "C++-ish" code to "C-ish" code, many constructs are now faster or got looked at and optimized.
Bug Fixes
- Compatibility: The variable name in locals for the iterator provided to the generator expression should be ``.0``, now it is.
- Generators could leak frames until program exit, these are now properly freed immediately.
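The ``.0`` name from the list above can be observed in CPython itself, which is the behaviour compiled code has to match. The sketch below inspects the code object of a generator expression. The function name is made up for illustration, and the attribute layout is that of CPython and not a language guarantee.

.. code-block:: python

   def getGeneratorExpressionVariableNames():
       generator = (x * 2 for x in range(3))

       # The outermost iterable is evaluated eagerly and handed to the
       # generator expression as an argument named ".0".
       return generator.gi_code.co_varnames

On CPython the returned names include ``.0`` next to the loop variable ``x``.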
New Optimization
- Faster exception save and restore functions that might be in-lined by the backend C compiler.
- Faster error checks for many operations, where these errors are expected, e.g. instance attribute lookups.
- Do not create traceback and locals dictionary for frame when ``StopIteration`` or ``GeneratorExit`` are raised. These tracebacks were wasted, as they were immediately released afterwards.
- Closure variables to functions and parameters of generator functions are now attached to the function and generator objects.
- The creation of functions with closure taking was accelerated.
- The creation and destruction of generator objects was accelerated.
- The re-formulation for in-place assignments got simplified and got faster doing so.
- In-place operations of ``str`` were always copying the string, even if this was not necessary. This corrects `Issue#124 <http://bugs.nuitka.net/issue124>`__.

  .. code-block:: python

     a += b  # Was not re-using the storage of "a" in case of strings

- Python2: Additions of ``int`` for Python2 are now even faster.
- Access to local variable values got slightly accelerated at the expense of closure variables.
- Added support for optimizing the ``complex`` built-in.
- Removing unused temporary and local variables as a result of optimization, these previously still allocated storage.
Cleanup
- The use of C++ classes for variable objects was removed. Closure variables are now attached as ``PyCellObject`` to the function objects owning them.
- The use of C++ context classes for closure taking and generator parameters has been replaced with attaching values directly to functions and generator objects.
- The indentation of code template instantiations spanning multiple lines was not in all cases proper. We were using emission objects that handle new lines in code and mere ``list`` objects that don't handle them, in mixed forms. Now only the emission objects are used.
- Some templates with C++ helper functions that had no variables got changed to be properly formatted templates.
- The internal API for handling of exceptions is now more consistent and used more efficiently.
- The printing helpers got cleaned up and moved to static code, removing any need for forward declaration.
- The use of ``INCREASE_REFCOUNT_X`` was removed, it got replaced with proper ``Py_XINCREF`` usages. The function was once required before "C-ish" lifted the need to do everything in one function call.
- The use of ``INCREASE_REFCOUNT`` got reduced. See above for why that is any good. The idea is that ``Py_INCREF`` must be good enough, and that we want to avoid the C function it was, even if in-lined.
- The ``assertObject`` function that checks if an object is not ``NULL`` and has positive reference count, i.e. is sane, got turned into a preprocessor macro.
- Deep hashes of constant values are created in ``--debug`` mode, which cover also mutable values, and attempt to depend on actual content. These are checked at program exit for corruption. This may help uncover bugs.
Organizational
- Speedcenter has been enhanced with better graphing and has more benchmarks now. More work will be needed to make it useful.
- Updates to the Developer Manual, reflecting the current near finished state of "C-ish" code generation.
Tests
- New reference count tests to cover generator expressions and their usage got added.
- Many new construct based tests got added, these will be used for performance graphing, and serve as micro benchmarks now.
- Again, more basic tests are directly executable with Python3.
Summary
This is the next evolution of "C-ish" coming to pass. The use of C++ has for all practical purposes vanished. It will remain an ongoing activity to clear that up and become real C. The C++ classes were a huge road block to many things, that now will become simpler. One example of these were in-place operations, which now can be dealt with easily.

Also, lots of polishing and tweaking was done while adding construct benchmarks that were made to check the impact of these changes. Here, generators probably stand out the most, as some of the missed optimization got revealed and then addressed.

Their speed increases will be visible to some programs that depend a lot on generators.

This release is clearly major in that the most important issues got addressed. Future releases will provide more tuning and completeness, but structurally the "C-ish" migration has succeeded, and now we can reap the benefits in the coming releases. More work will be needed for all in-place operations to be accelerated.

More work will be needed to complete this, but it's good that this is coming to an end, so we can focus on SSA based optimization for the major gains to be had.