Fix intersecting live ranges created by instruction scheduler

When scheduling code like the following:
LOOP:
   v2=phi(v0, v1)
   use(v2)
   v1=...
   goto LOOP

the instruction scheduler can move 'v1=...' before 'use(v2)'. This
causes live ranges of v1 and v2 to intersect and results to a MOV
instruction to be created.

The CL fixes this.

Improvements, Pixel3:
  Little CPU, arm64
    micro/GCCLoops
      Example12       14.1%
      Example10b      11.0%
      Example23       8.1%
      Example24       6.6%
      Example10a      5.0%
    FFT workload      4.7%
    Compress workload 1.2%

  Little CPU, arm32
    micro/GCCLoops
      Example23         7.5%
      Example24         4.3%
    MonteCarlo workload 1.35%

  Big CPU, arm32 and arm64
    No significant improvements

No significant regressions (> 5%) are found.

Test: test.py --host --optimizing --jit --gtest
Test: test.py --target --optimizing --jit
Test: run-gtests.sh
Change-Id: I1e4282af18f2d51fde5325a0c00a57e8bbc4fbed
diff --git a/compiler/optimizing/scheduler.h b/compiler/optimizing/scheduler.h
index a97eda7..f7180a0 100644
--- a/compiler/optimizing/scheduler.h
+++ b/compiler/optimizing/scheduler.h
@@ -183,7 +183,9 @@
 
   void AddOtherPredecessor(SchedulingNode* predecessor) {
     // Check whether the predecessor has been added earlier.
-    if (HasOtherDependency(predecessor)) {
+    // As an optimization of the scheduling graph, we don't need to create another dependency if
+    // there is a data dependency between scheduling nodes.
+    if (HasOtherDependency(predecessor) || HasDataDependency(predecessor)) {
       return;
     }
     other_predecessors_.push_back(predecessor);
@@ -362,6 +364,25 @@
     AddDependency(node, dependency, /*is_data_dependency*/false);
   }
 
+  // Analyze whether the scheduling node has cross-iteration dependencies which mean it uses
+  // values defined on the previous iteration.
+  //
+  // Supported cases:
+  //
+  //   L:
+  //     v2 = loop_head_phi(v1)
+  //     instr1(v2)
+  //     v1 = instr2
+  //     goto L
+  //
+  // In such cases moving instr2 before instr1 creates intersecting live ranges
+  // of v1 and v2. As a result a separate register is needed to keep the value
+  // defined by instr2 which is only used on the next iteration.
+  // If instr2 is not moved, no additional register is needed. The register
+  // used by instr1 is reused.
+  // To prevent such a situation a "other" dependency between instr1 and instr2 must be set.
+  void AddCrossIterationDependencies(SchedulingNode* node);
+
   // Add dependencies nodes for the given `SchedulingNode`: inputs, environments, and side-effects.
   void AddDependencies(SchedulingNode* node, bool is_scheduling_barrier = false);