Blackfin: make deferred hardware errors more exact
Hardware errors on the Blackfin architecture are queued by nature of the
hardware design. Things that could generate a hardware level queue up at
the system interface and might not process until much later, at which
point the system would send a notification back to the core.
As such, it is possible for user space code to do something that would
trigger a hardware error, but have it delay long enough for the process
context to switch. So when the hardware error does signal, we mistakenly
evaluate it as a different process or as kernel context and panic (erp!).
This makes it pretty difficult to find the offending context. But wait,
there is good news somewhere.
By forcing a SSYNC in the interrupt entry, we force all pending queues at
the system level to be processed and all hardware errors to be signaled.
Then we check the current interrupt state to see if the hardware error is
now signaled. If so, we re-queue the current interrupt and return thus
allowing the higher priority hardware error interrupt to process properly.
Since we haven't done any other context processing yet, the right context
will be selected and killed. There is still the possibility that the
exact offending instruction will be unknown, but at least we'll have a
much better idea of where to look.
The downside of course is that this causes system-wide syncs at every
interrupt point which results in significant performance degradation.
Since this situation should not occur in any properly configured system
(as hardware errors are triggered by things like bad pointers), make it a
debug configuration option and disable it by default.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
4 files changed