| Age | Commit message (Collapse) | Author |
|
Fleshed out invoke and const-string support. Fixed a bug in Phi node
insertion.
With this CL, the "Recursive Fibonacci" and "HelloWorld" milestones are
met.
Added are a set of "HL" (for High-Level) invoke intrinsics. Until we
complete the merging of the Quick & Iceland runtime models the invoke
code sequences are slightly different. Thus, the Greenland IR needs
to represent invokes at a somewhat higher level than Iceland. The
test for fast/slow path needs to happen during the lowering of the
HLInvokeXXX intrinsics in both the Quick and Portable paths.
This will generally be the case in the short term - push fast/slow
path determination below the Greenland IR level. As unification
proceeds, we'll pull as much as makes sense into the common front end.
Change-Id: I0a18edf1be18583c0afdc3f7e10a3e4691968e77
|
|
Funnily enough, no-one was using the default on purpose anyway.
Change-Id: Id7b576565c1929087c05834078147dbbc8e635b3
|
|
First of several CLs to bring code closer to alignment with Art and LLVM
standards. Move to 2-space indenting. Sticking with 80-col line
length (which LLVM apparently also wants). LLVM also prefers camel
case names, so keeping Dalvik convention there as well (for now).
Change-Id: I351ab234e640678d97747377cccdd6df0a770f4a
|
|
Change-Id: I53b457cab9067369320457549071fc3e4c23c81b
|
|
Three changes for the price of one:
1. Fix the mutex diagnostics so they work right during startup and shutdown.
2. Fix a memory leak in common_test.
3. Fix memory corruption in the compiler; we were calling memset(3) on a struct
with non-POD members.
Thanks, as usual, to valgrind(1) for the latter two (and several bugs in
earlier attempts at the former).
Change-Id: I15e1ffb01e73e4c56a5bbdcaa7233a4b5221e08a
|
|
Add ability for the compiler to allocate new frame temporaries
that play nicely with the register allocation mechanism. To do this
we assign negative virtual register numbers and give them SSA names.
As part of this change, I did a general cleanup of the ssa naming.
An ssa name (or SReg) is in index into an array of (virtual reg, subscript)
pairs. Previously, 16 bits were allocated for the reg and the subscript.
This CL expands the virtual reg and subscript to 32 bits each.
Method* is now treated as a RegLocation, and will be subject to
temp register tracking and reuse. This CL does not yet include
support for promotion of Method* - that will show up in the next one.
Also included is the beginning of a basic block optimization pass (not
yet in a runable state, so conditionally compiled out).
(cherry picked from commit f689ffec8827f1dd6b31084f8a6bb240338c7acf)
Change-Id: Ibbdeb97fe05d0e33c1f4a9a6ccbdef1cac7646fc
|
|
Seeing new instances of this C-ism go in makes me a sad panda.
Change-Id: Ie3dd414b8b5e57a4164e88eb2d8559545569628d
|
|
The compiler inherited a simple memory management scheme that
involved malloc'ng a clump of memory, allocating out of that
clump for each unit of compilation, and then resetting after
the compilation was complete. Simple & fast, but built with the
expectation of a single compiler worker thread.
This change moves the memory allocation arena into the
CompilationUnit structure, and makes it private for each method
compilation. Unlike the old scheme, allocated memory is returned
to the system following completion (whereas before it was reused
for the next compilation).
As of this CL, each compilation is completely independent.
The changes involved were mostly mechanical to pass around the
cUnit pointer to anything which might need to allocate, but the
accretion of crud has moved me much closer to the point that
all of this stuff gets ripped out and replaced.
Change-Id: I19dda0a7fb5aa228f6baee7ae5293fdd174c8337
|
|
Significant reduction in memory usage by the compiler.
o Estimated sizes of growable lists to avoid waste
o Changed basic block predecessor structure from a growable bitmap
to a growable list.
o Conditionalized code which produced disassembly strings.
o Avoided generating some dataflow-related structures when compiling
in dataflow-disabled mode.
o Added memory usage statistics
o Eliminated floating point usage as a barrier to disabling expensive
dataflow analysis for very large init routines.
o Because iterating through sparse bit maps is much less of a concern now,
removed earlier hack that remembered runs of leading and trailing
zeroes.
Also, some general tuning.
o Minor tweaks to register utilties
o Speed up the assembly loop
o Rewrite of the bit vector iterator
Our previous worst-case method originally consumed 360 megabytes, but through
earlier changes was whittled down to 113 megabytes. Now it consumes 12 (which
so far appears to close to the highest compiler heap usage of anything
I've seen).
Post-wipe cold boot time is now less than 7 minutes.
Installation time for our application test cases also shows a large
gain - typically 25% to 40% speedup.
Single-threaded host compilation of core.jar down to <3.0s, boot.oat builds
in 17.2s. Next up: multi-threaded compilation.
Change-Id: I493d0d584c4145a6deccdd9bff344473023deb46
|
|
Rework key portions of the dataflow analysis code to avoid the
massive slowdowns we're seeing with methods containing more than
20,000 basic blocks.
Add a somewhat hacky lookup cache in front of findBlock. Later
we'll want to rework the "GrowableList" mechanism to incorporate
this properly.
Add memory to the bit vector utilities to speed up the iterators
for large sparse vectors. (Similarly, in the future we'll want
to include a rewrite of the home-grown bit-vector manipluation
utilites to be better with large ones).
With this CL, compilation speed on the degenerate cases is > 10x
faster. Compilation of typically-sized methods will see a smallish
improvement.
Change-Id: I7f086c1229dd4fe62f0a5fc361234bf204ebc2b1
|
|
This leaves us with just the mspace stuff and three libdex functions to clean
up. We deliberately expose the JII API, and I don't think there's anything we
can really do about the art_..._from_code stuff (and at least that starts with
"art_").
Change-Id: I77e58e8330cd2afeb496642302dfe3311e68091a
|
|
Restructured the type inference mechanism, added lots of DCHECKS,
bumped the default memory allocation size to reflect AOT
compilation and tweaked the bit vector manipulation routines
to be better at handling large sparse vectors (something the old
trace JIT didn't encounter enough to care).
With this CL, optimization is back on by default. Should also see
a significant boost in compilation speed (~2x better for boot.oat).
Change-Id: Ifd134ef337be173a1be756bb9198b24c5b4936b3
|
|
I plan to enable some of the old-world basic block optimizations.
Those care about temp register status, so we needed a bit of
cleanup on the temp tracking.
Change-Id: I317bce1b91a73ec9589c20ed5bfe00d53994991a
|
|
Also replaced static function defs with a STATIC macro to make normally
hidden functions visible to DCHECK's traceback listing). Additionally,
added some portions of the new type & size inference mechanism (but not
taking advantage of them yet).
Change-Id: Ib42a08777f28ab879d0df37617e1b77e3f09ba52
|
|
Cleanly compiles, but not integrated. Old-world dependencies captured
in hacked-up temporary files "Dalvik.h" and "HackStubs.cc".
Dalvik.h is a placeholder that captures all of the constants, struct
definitions and inline functions the compiler needs. It largely consists
of declaration fragments of libdex, Object.h, DvmDex.h and Thread.h.
HackStubs.cc contains empty shells for some required libdex routines.
Change-Id: Ia479dda41da4e3162ff6df383252fdc7dbf38d71
|