Suppose we have a tool that generates a Ninja file from some other description (think Kati and makefiles), and during testing we discover a regression. Suppose further that the generated Ninja file is large (think millions of lines) and that the new Ninja file has its build statements and rules in a slightly different order. Because the tool generates the rule names, the real differences are drowned in noise in the output of the diff command. Enter Canoninja.
Canoninja renames each Ninja rule to the hash of its contents. After that, we can just sort the build statements, and a simple comm command immediately reveals the essential difference between the files.
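The renaming idea can be sketched in a few lines of Python. This is only an illustration, not Canoninja's implementation: the function name `canonicalize` is hypothetical, and hashing the stripped body lines with SHA-1 and prefixing the digest with `R` is an assumption, so the names it produces are not expected to match Canoninja's.

```python
import hashlib
import re

# "rule NAME" header line.
RULE_RE = re.compile(r"^rule\s+(\S+)\s*$")
# "build OUTPUTS: RULE INPUTS..."; escaped colons ($:) in outputs are ignored here.
BUILD_RE = re.compile(r"^(build\s+[^:]*:\s*)(\S+)(.*)$")


def canonicalize(text):
    """Rename each rule to 'R' + SHA-1 of its body (assumed hashing scheme)."""
    lines = text.splitlines()
    renames = {}

    # Pass 1: compute a content-based name for every rule.
    i = 0
    while i < len(lines):
        match = RULE_RE.match(lines[i])
        if not match:
            i += 1
            continue
        j = i + 1
        # The rule body is the run of indented lines after the header.
        while j < len(lines) and lines[j].startswith((" ", "\t")):
            j += 1
        body = "\n".join(line.strip() for line in lines[i + 1:j])
        renames[match.group(1)] = "R" + hashlib.sha1(body.encode()).hexdigest()
        i = j

    # Pass 2: rewrite rule headers and the rule name used by build statements.
    out = []
    for line in lines:
        match = RULE_RE.match(line)
        if match:
            out.append("rule " + renames[match.group(1)])
            continue
        match = BUILD_RE.match(line)
        if match and match.group(2) in renames:
            line = match.group(1) + renames[match.group(2)] + match.group(3)
        out.append(line)
    return "\n".join(out) + "\n"
```

Because the new name depends only on the rule body, two rules with the same description and command get the same name no matter which `ruleN` the generator assigned to them.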
Consider the following makefile:

    second :=
    first: foo
    foo:
    	@echo foo
    second: bar
    bar:
    	@echo bar

Depending on the Kati version, converting it to a Ninja file yields either:

    $ cat /tmp/1.ninja
    # Generated by kati 06f2569b2d16628608c000a76e3d495a5a5528cb
    pool local_pool
     depth = 72
    build _kati_always_build_: phony
    build first: phony foo
    rule rule0
     description = build $out
     command = /bin/sh -c "echo foo"
    build foo: rule0
    build second: phony bar
    rule rule1
     description = build $out
     command = /bin/sh -c "echo bar"
    build bar: rule1
    default first

or

    $ cat 2.ninja
    # Generated by kati 371194da71b3e191fea6f2ccceb7b061bd0de310
    pool local_pool
     depth = 72
    build _kati_always_build_: phony
    build second: phony bar
    rule rule0
     description = build $out
     command = /bin/sh -c "echo bar"
    build bar: rule0
    build first: phony foo
    rule rule1
     description = build $out
     command = /bin/sh -c "echo foo"
    build foo: rule1
    default first

This is a quirk in Kati, see https://github.com/google/kati/issues/238
Trying to find the difference between the targets isn't much help even after sorting them, because the generated rule names differ:

    $ diff <(grep '^build' /tmp/1.ninja|sort) <(grep '^build' /tmp/2.ninja | sort)
    1c1
    < build bar: rule1
    ---
    > build bar: rule0
    3c3
    < build foo: rule0
    ---
    > build foo: rule1

However, running these files through canoninja yields

    $ canoninja /tmp/1.ninja
    # Generated by kati 06f2569b2d16628608c000a76e3d495a5a5528cb
    pool local_pool
     depth = 72
    build _kati_always_build_: phony
    build first: phony foo
    rule R2f9981d3c152fc255370dc67028244f7bed72a03
     description = build $out
     command = /bin/sh -c "echo foo"
    build foo: R2f9981d3c152fc255370dc67028244f7bed72a03
    build second: phony bar
    rule R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
     description = build $out
     command = /bin/sh -c "echo bar"
    build bar: R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
    default first

and

    $ ~/go/bin/canoninja /tmp/2.ninja
    # Generated by kati 371194da71b3e191fea6f2ccceb7b061bd0de310
    pool local_pool
     depth = 72
    build _kati_always_build_: phony
    build second: phony bar
    rule R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
     description = build $out
     command = /bin/sh -c "echo bar"
    build bar: R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
    build first: phony foo
    rule R2f9981d3c152fc255370dc67028244f7bed72a03
     description = build $out
     command = /bin/sh -c "echo foo"
    build foo: R2f9981d3c152fc255370dc67028244f7bed72a03
    default first

and when we extract only build statements and sort them, we see that both Ninja files define the same graph:

    $ diff <(~/go/bin/canoninja /tmp/1.ninja | grep '^build' | sort) \
           <(~/go/bin/canoninja /tmp/2.ninja | grep '^build' | sort)

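For readers who prefer to script the comparison, here is a rough Python equivalent of the grep, sort, and compare pipeline. The helper names `build_statements` and `compare` are hypothetical, and the inputs are assumed to be the text of two Ninja files that have already been canonicalized. Unlike diff or comm, it uses sets, so duplicated build lines are collapsed.

```python
def build_statements(ninja_text):
    """Return the set of lines that start with 'build' (like grep '^build')."""
    return {line for line in ninja_text.splitlines() if line.startswith("build")}


def compare(left_text, right_text):
    """Return (only_in_left, only_in_right) as sorted lists of build lines."""
    left = build_statements(left_text)
    right = build_statements(right_text)
    return sorted(left - right), sorted(right - left)
```

Two empty lists mean that both files contain the same build statements, which corresponds to the diff command above printing nothing.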