| page.title=Performance Tips |
| page.article=true |
| @jd:body |
| |
| <div id="tb-wrapper"> |
| <div id="tb"> |
| |
| <h2>In this document</h2> |
| <ol class="nolist"> |
| <li><a href="#ObjectCreation">Avoid Creating Unnecessary Objects</a></li> |
| <li><a href="#PreferStatic">Prefer Static Over Virtual</a></li> |
| <li><a href="#UseFinal">Use Static Final For Constants</a></li> |
| <li><a href="#GettersSetters">Avoid Internal Getters/Setters</a></li> |
| <li><a href="#Loops">Use Enhanced For Loop Syntax</a></li> |
| <li><a href="#PackageInner">Consider Package Instead of Private Access with Private Inner Classes</a></li> |
| <li><a href="#AvoidFloat">Avoid Using Floating-Point</a></li> |
| <li><a href="#UseLibraries">Know and Use the Libraries</a></li> |
| <li><a href="#NativeMethods">Use Native Methods Carefully</a></li> |
| <li><a href="#native_methods">Use Native Methods Judiciously</a></li> |
| <li><a href="#closing_notes">Closing Notes</a></li> |
| </ol> |
| |
| </div> |
| </div> |
| |
| <p>This document primarily covers micro-optimizations that can improve overall app performance |
| when combined, but it's unlikely that these changes will result in dramatic |
| performance effects. Choosing the right algorithms and data structures should always be your |
| priority, but is outside the scope of this document. You should use the tips in this document |
| as general coding practices that you can incorporate into your habits for general code |
| efficiency.</p> |
| |
| <p>There are two basic rules for writing efficient code:</p> |
| <ul> |
| <li>Don't do work that you don't need to do.</li> |
| <li>Don't allocate memory if you can avoid it.</li> |
| </ul> |
| |
| <p>One of the trickiest problems you'll face when micro-optimizing an Android |
| app is that your app is certain to be running on multiple types of |
| hardware. Different versions of the VM running on different |
| processors running at different speeds. It's not even generally the case |
| that you can simply say "device X is a factor F faster/slower than device Y", |
| and scale your results from one device to others. In particular, measurement |
| on the emulator tells you very little about performance on any device. There |
| are also huge differences between devices with and without a |
| <acronym title="Just In Time compiler">JIT</acronym>: the best |
| code for a device with a JIT is not always the best code for a device |
| without.</p> |
| |
| <p>To ensure your app performs well across a wide variety of devices, ensure |
| your code is efficient at all levels and agressively optimize your performance.</p> |
| |
| |
| <h2 id="ObjectCreation">Avoid Creating Unnecessary Objects</h2> |
| |
| <p>Object creation is never free. A generational garbage collector with per-thread allocation |
| pools for temporary objects can make allocation cheaper, but allocating memory |
| is always more expensive than not allocating memory.</p> |
| |
| <p>As you allocate more objects in your app, you will force a periodic |
| garbage collection, creating little "hiccups" in the user experience. The |
| concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work |
| should always be avoided.</p> |
| |
| <p>Thus, you should avoid creating object instances you don't need to. Some |
| examples of things that can help:</p> |
| |
| <ul> |
| <li>If you have a method returning a string, and you know that its result |
| will always be appended to a {@link java.lang.StringBuffer} anyway, change your signature |
| and implementation so that the function does the append directly, |
| instead of creating a short-lived temporary object.</li> |
| <li>When extracting strings from a set of input data, try |
| to return a substring of the original data, instead of creating a copy. |
| You will create a new {@link java.lang.String} object, but it will share the {@code char[]} |
| with the data. (The trade-off being that if you're only using a small |
| part of the original input, you'll be keeping it all around in memory |
| anyway if you go this route.)</li> |
| </ul> |
| |
| <p>A somewhat more radical idea is to slice up multidimensional arrays into |
| parallel single one-dimension arrays:</p> |
| |
| <ul> |
| <li>An array of {@code int}s is a much better than an array of {@link java.lang.Integer} |
| objects, |
| but this also generalizes to the fact that two parallel arrays of ints |
| are also a <strong>lot</strong> more efficient than an array of {@code (int,int)} |
| objects. The same goes for any combination of primitive types.</li> |
| |
| <li>If you need to implement a container that stores tuples of {@code (Foo,Bar)} |
| objects, try to remember that two parallel {@code Foo[]} and {@code Bar[]} arrays are |
| generally much better than a single array of custom {@code (Foo,Bar)} objects. |
| (The exception to this, of course, is when you're designing an API for |
| other code to access. In those cases, it's usually better to make a small |
| compromise to the speed in order to achieve a good API design. But in your own internal |
| code, you should try and be as efficient as possible.)</li> |
| </ul> |
| |
| <p>Generally speaking, avoid creating short-term temporary objects if you |
| can. Fewer objects created mean less-frequent garbage collection, which has |
| a direct impact on user experience.</p> |
| |
| |
| |
| |
| <h2 id="PreferStatic">Prefer Static Over Virtual</h2> |
| |
| <p>If you don't need to access an object's fields, make your method static. |
| Invocations will be about 15%-20% faster. |
| It's also good practice, because you can tell from the method |
| signature that calling the method can't alter the object's state.</p> |
| |
| |
| |
| |
| |
| <h2 id="UseFinal">Use Static Final For Constants</h2> |
| |
| <p>Consider the following declaration at the top of a class:</p> |
| |
| <pre> |
| static int intVal = 42; |
| static String strVal = "Hello, world!"; |
| </pre> |
| |
| <p>The compiler generates a class initializer method, called |
| <code><clinit></code>, that is executed when the class is first used. |
| The method stores the value 42 into <code>intVal</code>, and extracts a |
| reference from the classfile string constant table for <code>strVal</code>. |
| When these values are referenced later on, they are accessed with field |
| lookups.</p> |
| |
| <p>We can improve matters with the "final" keyword:</p> |
| |
| <pre> |
| static final int intVal = 42; |
| static final String strVal = "Hello, world!"; |
| </pre> |
| |
| <p>The class no longer requires a <code><clinit></code> method, |
| because the constants go into static field initializers in the dex file. |
| Code that refers to <code>intVal</code> will use |
| the integer value 42 directly, and accesses to <code>strVal</code> will |
| use a relatively inexpensive "string constant" instruction instead of a |
| field lookup.</p> |
| |
| <p class="note"><strong>Note:</strong> This optimization applies only to primitive types and |
| {@link java.lang.String} constants, not arbitrary reference types. Still, it's good |
| practice to declare constants <code>static final</code> whenever possible.</p> |
| |
| |
| |
| |
| |
| <h2 id="GettersSetters">Avoid Internal Getters/Setters</h2> |
| |
| <p>In native languages like C++ it's common practice to use getters |
| (<code>i = getCount()</code>) instead of accessing the field directly (<code>i |
| = mCount</code>). This is an excellent habit for C++ and is often practiced in other |
| object oriented languages like C# and Java, because the compiler can |
| usually inline the access, and if you need to restrict or debug field access |
| you can add the code at any time.</p> |
| |
| <p>However, this is a bad idea on Android. Virtual method calls are expensive, |
| much more so than instance field lookups. It's reasonable to follow |
| common object-oriented programming practices and have getters and setters |
| in the public interface, but within a class you should always access |
| fields directly.</p> |
| |
| <p>Without a <acronym title="Just In Time compiler">JIT</acronym>, |
| direct field access is about 3x faster than invoking a |
| trivial getter. With the JIT (where direct field access is as cheap as |
| accessing a local), direct field access is about 7x faster than invoking a |
| trivial getter.</p> |
| |
| <p>Note that if you're using <a href="{@docRoot}tools/help/proguard.html">ProGuard</a>, |
| you can have the best of both worlds because ProGuard can inline accessors for you.</p> |
| |
| |
| |
| |
| |
| <h2 id="Loops">Use Enhanced For Loop Syntax</h2> |
| |
| <p>The enhanced <code>for</code> loop (also sometimes known as "for-each" loop) can be used |
| for collections that implement the {@link java.lang.Iterable} interface and for arrays. |
| With collections, an iterator is allocated to make interface calls |
| to {@code hasNext()} and {@code next()}. With an {@link java.util.ArrayList}, |
| a hand-written counted loop is |
| about 3x faster (with or without JIT), but for other collections the enhanced |
| for loop syntax will be exactly equivalent to explicit iterator usage.</p> |
| |
| <p>There are several alternatives for iterating through an array:</p> |
| |
| <pre> |
| static class Foo { |
| int mSplat; |
| } |
| |
| Foo[] mArray = ... |
| |
| public void zero() { |
| int sum = 0; |
| for (int i = 0; i < mArray.length; ++i) { |
| sum += mArray[i].mSplat; |
| } |
| } |
| |
| public void one() { |
| int sum = 0; |
| Foo[] localArray = mArray; |
| int len = localArray.length; |
| |
| for (int i = 0; i < len; ++i) { |
| sum += localArray[i].mSplat; |
| } |
| } |
| |
| public void two() { |
| int sum = 0; |
| for (Foo a : mArray) { |
| sum += a.mSplat; |
| } |
| } |
| </pre> |
| |
| <p><code>zero()</code> is slowest, because the JIT can't yet optimize away |
| the cost of getting the array length once for every iteration through the |
| loop.</p> |
| |
| <p><code>one()</code> is faster. It pulls everything out into local |
| variables, avoiding the lookups. Only the array length offers a performance |
| benefit.</p> |
| |
| <p><code>two()</code> is fastest for devices without a JIT, and |
| indistinguishable from <strong>one()</strong> for devices with a JIT. |
| It uses the enhanced for loop syntax introduced in version 1.5 of the Java |
| programming language.</p> |
| |
| <p>So, you should use the enhanced <code>for</code> loop by default, but consider a |
| hand-written counted loop for performance-critical {@link java.util.ArrayList} iteration.</p> |
| |
| <p class="note"><strong>Tip:</strong> |
| Also see Josh Bloch's <em>Effective Java</em>, item 46.</p> |
| |
| |
| |
| <h2 id="PackageInner">Consider Package Instead of Private Access with Private Inner Classes</h2> |
| |
| <p>Consider the following class definition:</p> |
| |
| <pre> |
| public class Foo { |
| private class Inner { |
| void stuff() { |
| Foo.this.doStuff(Foo.this.mValue); |
| } |
| } |
| |
| private int mValue; |
| |
| public void run() { |
| Inner in = new Inner(); |
| mValue = 27; |
| in.stuff(); |
| } |
| |
| private void doStuff(int value) { |
| System.out.println("Value is " + value); |
| } |
| }</pre> |
| |
| <p>What's important here is that we define a private inner class |
| (<code>Foo$Inner</code>) that directly accesses a private method and a private |
| instance field in the outer class. This is legal, and the code prints "Value is |
| 27" as expected.</p> |
| |
| <p>The problem is that the VM considers direct access to <code>Foo</code>'s |
| private members from <code>Foo$Inner</code> to be illegal because |
| <code>Foo</code> and <code>Foo$Inner</code> are different classes, even though |
| the Java language allows an inner class to access an outer class' private |
| members. To bridge the gap, the compiler generates a couple of synthetic |
| methods:</p> |
| |
| <pre> |
| /*package*/ static int Foo.access$100(Foo foo) { |
| return foo.mValue; |
| } |
| /*package*/ static void Foo.access$200(Foo foo, int value) { |
| foo.doStuff(value); |
| }</pre> |
| |
| <p>The inner class code calls these static methods whenever it needs to |
| access the <code>mValue</code> field or invoke the <code>doStuff()</code> method |
| in the outer class. What this means is that the code above really boils down to |
| a case where you're accessing member fields through accessor methods. |
| Earlier we talked about how accessors are slower than direct field |
| accesses, so this is an example of a certain language idiom resulting in an |
| "invisible" performance hit.</p> |
| |
| <p>If you're using code like this in a performance hotspot, you can avoid the |
| overhead by declaring fields and methods accessed by inner classes to have |
| package access, rather than private access. Unfortunately this means the fields |
| can be accessed directly by other classes in the same package, so you shouldn't |
| use this in public API.</p> |
| |
| |
| |
| |
| <h2 id="AvoidFloat">Avoid Using Floating-Point</h2> |
| |
| <p>As a rule of thumb, floating-point is about 2x slower than integer on |
| Android-powered devices.</p> |
| |
| <p>In speed terms, there's no difference between <code>float</code> and |
| <code>double</code> on the more modern hardware. Space-wise, <code>double</code> |
| is 2x larger. As with desktop machines, assuming space isn't an issue, you |
| should prefer <code>double</code> to <code>float</code>.</p> |
| |
| <p>Also, even for integers, some processors have hardware multiply but lack |
| hardware divide. In such cases, integer division and modulus operations are |
| performed in software—something to think about if you're designing a |
| hash table or doing lots of math.</p> |
| |
| |
| |
| |
| <h2 id="UseLibraries">Know and Use the Libraries</h2> |
| |
| <p>In addition to all the usual reasons to prefer library code over rolling |
| your own, bear in mind that the system is at liberty to replace calls |
| to library methods with hand-coded assembler, which may be better than the |
| best code the JIT can produce for the equivalent Java. The typical example |
| here is {@link java.lang.String#indexOf String.indexOf()} and |
| related APIs, which Dalvik replaces with |
| an inlined intrinsic. Similarly, the {@link java.lang.System#arraycopy |
| System.arraycopy()} method |
| is about 9x faster than a hand-coded loop on a Nexus One with the JIT.</p> |
| |
| |
| <p class="note"><strong>Tip:</strong> |
| Also see Josh Bloch's <em>Effective Java</em>, item 47.</p> |
| |
| |
| |
| |
| <h2 id="NativeMethods">Use Native Methods Carefully</h2> |
| |
| <p>Developing your app with native code using the |
| <a href="{@docRoot}tools/sdk/ndk/index.html">Android NDK</a> |
| isn't necessarily more efficient than programming with the |
| Java language. For one thing, |
| there's a cost associated with the Java-native transition, and the JIT can't |
| optimize across these boundaries. If you're allocating native resources (memory |
| on the native heap, file descriptors, or whatever), it can be significantly |
| more difficult to arrange timely collection of these resources. You also |
| need to compile your code for each architecture you wish to run on (rather |
| than rely on it having a JIT). You may even have to compile multiple versions |
| for what you consider the same architecture: native code compiled for the ARM |
| processor in the G1 can't take full advantage of the ARM in the Nexus One, and |
| code compiled for the ARM in the Nexus One won't run on the ARM in the G1.</p> |
| |
| <p>Native code is primarily useful when you have an existing native codebase |
| that you want to port to Android, not for "speeding up" parts of your Android app |
| written with the Java language.</p> |
| |
| <p>If you do need to use native code, you should read our |
| <a href="{@docRoot}guide/practices/jni.html">JNI Tips</a>.</p> |
| |
| <p class="note"><strong>Tip:</strong> |
| Also see Josh Bloch's <em>Effective Java</em>, item 54.</p> |
| |
| |
| |
| |
| |
| <h2 id="Myths">Performance Myths</h2> |
| |
| |
| <p>On devices without a JIT, it is true that invoking methods via a |
| variable with an exact type rather than an interface is slightly more |
| efficient. (So, for example, it was cheaper to invoke methods on a |
| <code>HashMap map</code> than a <code>Map map</code>, even though in both |
| cases the map was a <code>HashMap</code>.) It was not the case that this |
| was 2x slower; the actual difference was more like 6% slower. Furthermore, |
| the JIT makes the two effectively indistinguishable.</p> |
| |
| <p>On devices without a JIT, caching field accesses is about 20% faster than |
| repeatedly accessing the field. With a JIT, field access costs about the same |
| as local access, so this isn't a worthwhile optimization unless you feel it |
| makes your code easier to read. (This is true of final, static, and static |
| final fields too.) |
| |
| |
| |
| <h2 id="Measure">Always Measure</h2> |
| |
| <p>Before you start optimizing, make sure you have a problem that you |
| need to solve. Make sure you can accurately measure your existing performance, |
| or you won't be able to measure the benefit of the alternatives you try.</p> |
| |
| <p>Every claim made in this document is backed up by a benchmark. The source |
| to these benchmarks can be found in the <a |
| href="http://code.google.com/p/dalvik/source/browse/#svn/trunk/benchmarks">code.google.com |
| "dalvik" project</a>.</p> |
| |
| <p>The benchmarks are built with the |
| <a href="http://code.google.com/p/caliper/">Caliper</a> microbenchmarking |
| framework for Java. Microbenchmarks are hard to get right, so Caliper goes out |
| of its way to do the hard work for you, and even detect some cases where you're |
| not measuring what you think you're measuring (because, say, the VM has |
| managed to optimize all your code away). We highly recommend you use Caliper |
| to run your own microbenchmarks.</p> |
| |
| <p>You may also find |
| <a href="{@docRoot}tools/debugging/debugging-tracing.html">Traceview</a> useful |
| for profiling, but it's important to realize that it currently disables the JIT, |
| which may cause it to misattribute time to code that the JIT may be able to win |
| back. It's especially important after making changes suggested by Traceview |
| data to ensure that the resulting code actually runs faster when run without |
| Traceview.</p> |
| |
| <p>For more help profiling and debugging your apps, see the following documents:</p> |
| |
| <ul> |
| <li><a href="{@docRoot}tools/debugging/debugging-tracing.html">Profiling with |
| Traceview and dmtracedump</a></li> |
| <li><a href="{@docRoot}tools/debugging/systrace.html">Analyzing UI Performance |
| with Systrace</a></li> |
| </ul> |
| |