| ACPI on ARMv8 Servers |
| --------------------- |
| ACPI can be used for ARMv8 general purpose servers designed to follow |
| the ARM SBSA (Server Base System Architecture) [0] and SBBR (Server |
| Base Boot Requirements) [1] specifications. Please note that the SBBR |
| can be retrieved simply by visiting [1], but the SBSA is currently only |
| available to those with an ARM login due to ARM IP licensing concerns. |
| |
| The ARMv8 kernel implements the reduced hardware model of ACPI version |
| 5.1 or later. Links to the specification and all external documents |
| it refers to are managed by the UEFI Forum. The specification is |
| available at http://www.uefi.org/specifications and documents referenced |
| by the specification can be found via http://www.uefi.org/acpi. |
| |
| If an ARMv8 system does not meet the requirements of the SBSA and SBBR, |
| or cannot be described using the mechanisms defined in the required ACPI |
| specifications, then ACPI may not be a good fit for the hardware. |
| |
| While the documents mentioned above set out the requirements for building |
| industry-standard ARMv8 servers, they also apply to more than one operating |
| system. The purpose of this document is to describe the interaction between |
| ACPI and Linux only, on an ARMv8 system -- that is, what Linux expects of |
| ACPI and what ACPI can expect of Linux. |
| |
| |
| Why ACPI on ARM? |
| ---------------- |
| Before examining the details of the interface between ACPI and Linux, it is |
| useful to understand why ACPI is being used. Several technologies already |
| exist in Linux for describing non-enumerable hardware, after all. In this |
| section we summarize a blog post [2] from Grant Likely that outlines the |
| reasoning behind ACPI on ARMv8 servers. A good portion of the summary text |
| is taken almost directly from that post. |
| |
| The short form of the rationale for ACPI on ARM is: |
| |
| -- ACPI’s byte code (AML) allows the platform to encode hardware behavior, |
| while DT explicitly does not support this. For hardware vendors, being |
| able to encode behavior is a key tool used in supporting operating |
| system releases on new hardware. |
| |
| -- ACPI’s OSPM defines a power management model that constrains what the |
| platform is allowed to do into a specific model, while still providing |
| flexibility in hardware design. |
| |
| -- In the enterprise server environment, ACPI has established bindings (such |
| as for RAS) which are currently used in production systems. DT does not. |
| Such bindings could be defined in DT at some point, but doing so means ARM |
| and x86 would end up using completely different code paths in both firmware |
| and the kernel. |
| |
| -- Choosing a single interface to describe the abstraction between a platform |
| and an OS is important. Hardware vendors would not be required to implement |
| both DT and ACPI if they want to support multiple operating systems. And, |
| agreeing on a single interface instead of being fragmented into per-OS |
| interfaces makes for better interoperability overall. |
| |
| -- The new ACPI governance process works well and Linux is now at the same |
| table as hardware vendors and other OS vendors. In fact, there is no |
| longer any reason to feel that ACPI only belongs to Windows or that |
| Linux is in any way secondary to Microsoft in this arena. The move of |
| ACPI governance into the UEFI forum has significantly opened up the |
| specification development process, and currently, a large portion of the |
| changes being made to ACPI are being driven by Linux. |
| |
| Key to the use of ACPI is the support model. For servers in general, the |
| responsibility for hardware behaviour cannot solely be the domain of the |
| kernel, but rather must be split between the platform and the kernel, in |
| order to allow for orderly change over time. ACPI frees the OS from needing |
| to understand all the minute details of the hardware so that the OS doesn’t |
| need to be ported to each and every device individually. It allows the |
| hardware vendors to take responsibility for power management behaviour without |
| depending on an OS release cycle which is not under their control. |
| |
| ACPI is also important because hardware and OS vendors have already worked |
| out the mechanisms for supporting a general purpose computing ecosystem. The |
| infrastructure is in place, the bindings are in place, and the processes are |
| in place. DT does exactly what Linux needs it to when working with vertically |
| integrated devices, but there are no good processes for supporting what the |
| server vendors need. Linux could potentially get there with DT, but doing so |
| really just duplicates something that already works. ACPI already does what |
| the hardware vendors need, Microsoft won’t collaborate on DT, and hardware |
| vendors would still end up providing two completely separate firmware |
| interfaces -- one for Linux and one for Windows. |
| |
| |
| Kernel Compatibility |
| -------------------- |
| One of the primary motivations for ACPI is standardization, and using that |
| to provide backward compatibility for Linux kernels. In the server market, |
| software and hardware are often used for long periods. ACPI allows the |
| kernel and firmware to agree on a consistent abstraction that can be |
| maintained over time, even as hardware or software change. As long as the |
| abstraction is supported, systems can be updated without necessarily having |
| to replace the kernel. |
| |
| When a Linux driver or subsystem is first implemented using ACPI, it by |
| definition ends up requiring a specific version of the ACPI specification |
| -- its baseline. ACPI firmware must continue to work, even though it may |
| not be optimal, with the earliest kernel version that first provides support |
| for that baseline version of ACPI. There may be a need for additional drivers, |
| but adding new functionality (e.g., CPU power management) should not break |
| older kernel versions. Further, ACPI firmware must also work with the most |
| recent version of the kernel. |
| |
| |
| Relationship with Device Tree |
| ----------------------------- |
| ACPI support in drivers and subsystems for ARMv8 should never be mutually |
| exclusive with DT support at compile time. |
| |
| At boot time the kernel will only use one description method depending on |
| parameters passed from the boot loader (including kernel bootargs). |
| |
| Regardless of whether DT or ACPI is used, the kernel must always be capable |
| of booting with either scheme (in kernels with both schemes enabled at compile |
| time). |
| |
| |
| Booting using ACPI tables |
| ------------------------- |
| The only defined method for passing ACPI tables to the kernel on ARMv8 |
| is via the UEFI system configuration table. To be explicit, this |
| means that ACPI is only supported on platforms that boot via UEFI. |
| |
| When an ARMv8 system boots, it can either have DT information, ACPI tables, |
| or in some very unusual cases, both. If no command line parameters are used, |
| the kernel will try to use DT for device enumeration; if there is no DT |
| present, the kernel will try to use ACPI tables, but only if they are present. |
| If neither is available, the kernel will not boot. If acpi=force is used |
| on the command line, the kernel will attempt to use ACPI tables first, but |
| fall back to DT if there are no ACPI tables present. The basic idea is that |
| the kernel will not fail to boot unless it absolutely has no other choice. |
| |
| Processing of ACPI tables may be disabled by passing acpi=off on the kernel |
| command line; this is the default behavior. |
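| |
| The selection rules above can be summarized in a short user-space C sketch. |
| It is only a model of the behavior described in this section, not the |
| kernel's implementation; the enum, select_hw_desc() and its parameters are |
| hypothetical names introduced for illustration. |
| |
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Which hardware description the kernel ends up using. */
enum hw_desc { HW_DESC_NONE, HW_DESC_DT, HW_DESC_ACPI };

/*
 * Hypothetical helper modelling the rules in the text. acpi_param is the
 * value of the acpi= kernel parameter, or NULL when it is absent.
 * HW_DESC_NONE means the kernel has nothing to boot with.
 */
static enum hw_desc select_hw_desc(const char *acpi_param,
                                   bool have_dt, bool have_acpi)
{
    bool force = acpi_param && strcmp(acpi_param, "force") == 0;
    bool off = acpi_param && strcmp(acpi_param, "off") == 0;

    if (force) {
        /* acpi=force: ACPI first, fall back to DT. */
        if (have_acpi)
            return HW_DESC_ACPI;
        return have_dt ? HW_DESC_DT : HW_DESC_NONE;
    }

    /* No parameter, or acpi=off: DT first. */
    if (have_dt)
        return HW_DESC_DT;

    /* Without a DT, ACPI is used only if present and not disabled. */
    if (!off && have_acpi)
        return HW_DESC_ACPI;

    return HW_DESC_NONE;
}

/* Usage examples, one per case described in the text. */
static void select_hw_desc_examples(void)
{
    assert(select_hw_desc(NULL, true, true) == HW_DESC_DT);
    assert(select_hw_desc(NULL, false, true) == HW_DESC_ACPI);
    assert(select_hw_desc(NULL, false, false) == HW_DESC_NONE);
    assert(select_hw_desc("force", true, true) == HW_DESC_ACPI);
    assert(select_hw_desc("force", true, false) == HW_DESC_DT);
    assert(select_hw_desc("off", false, true) == HW_DESC_NONE);
}
```
| |
| Treating acpi=off as "DT only" is an assumption of this sketch, based on the |
| statement that acpi=off disables processing of ACPI tables. |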
| |
| In order for the kernel to load and use ACPI tables, the UEFI implementation |
| MUST set the ACPI_20_TABLE_GUID to point to the RSDP table (the table with |
| the ACPI signature "RSD PTR "). If this pointer is incorrect and acpi=force |
| is used, the kernel will disable ACPI and try to use DT to boot instead; the |
| kernel has, in effect, determined that ACPI tables are not present at that |
| point. |
| |
| If the pointer to the RSDP table is correct, the table will be mapped into |
| the kernel by the ACPI core, using the address provided by UEFI. |
| |
| The ACPI core will then use the address in the RSDP table to find the XSDT |
| (eXtended System Description Table). The XSDT in turn provides the addresses |
| of all other ACPI tables provided by the system firmware; the ACPI core will |
| then traverse this table and map in the tables listed. |
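| |
| The traversal of the XSDT can be sketched in user-space C. This is a |
| simplified model, not the ACPI core's code: it assumes the XSDT has already |
| been copied into a byte buffer, and it relies on the 36-byte common table |
| header, the 64-bit entries, and the sum-to-zero checksum rule from the ACPI |
| specification. The helper names (read_le32(), read_le64(), |
| acpi_checksum_ok(), xsdt_entries()) are hypothetical. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_SDT_HEADER_LEN 36u   /* size of the common table header */

/* ACPI tables are little-endian; decode without assuming host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t read_le64(const uint8_t *p)
{
    return (uint64_t)read_le32(p) | ((uint64_t)read_le32(p + 4) << 32);
}

/* A table is valid only if all of its bytes sum to zero modulo 256. */
static int acpi_checksum_ok(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + table[i]);
    return sum == 0;
}

/*
 * Copy up to max table addresses out of an XSDT image of avail bytes.
 * Returns the number of entries in the XSDT, or -1 if the image is not
 * a valid XSDT.
 */
static int xsdt_entries(const uint8_t *xsdt, size_t avail,
                        uint64_t *out, int max)
{
    uint32_t len;
    int count;

    assert(xsdt != NULL && max >= 0);

    if (avail < ACPI_SDT_HEADER_LEN || memcmp(xsdt, "XSDT", 4) != 0)
        return -1;

    /* The Length field sits at offset 4 and covers header plus entries. */
    len = read_le32(xsdt + 4);
    if (len < ACPI_SDT_HEADER_LEN || len > avail ||
        (len - ACPI_SDT_HEADER_LEN) % 8 != 0 ||
        !acpi_checksum_ok(xsdt, len))
        return -1;

    count = (int)((len - ACPI_SDT_HEADER_LEN) / 8);
    for (int i = 0; i < count && i < max; i++)
        out[i] = read_le64(xsdt + ACPI_SDT_HEADER_LEN + 8 * (size_t)i);

    return count;
}
```
| |
| Each address returned this way would then be mapped and checked in the same |
| manner before the table it points to is used. |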
| |
| The ACPI core will ignore any provided RSDT (Root System Description Table). |
| RSDTs have been deprecated and are ignored on arm64 since they only allow |
| for 32-bit addresses. |
| |
| Further, the ACPI core will only use the 64-bit address fields in the FADT |
| (Fixed ACPI Description Table). Any 32-bit address fields in the FADT will |
| be ignored on arm64. |
| |
| Hardware reduced mode (see Section 4.1 of the ACPI 6.1 specification) will |
| be enforced by the ACPI core on arm64. Doing so allows the ACPI core to |
| run less complex code since it no longer has to provide support for legacy |
| hardware from other architectures. Any fields that are not to be used for |
| hardware reduced mode must be set to zero. |
| |
| For the ACPI core to operate properly, and in turn provide the information |
| the kernel needs to configure devices, it expects to find the following |
| tables (all section numbers refer to the ACPI 6.1 specification): |
| |
| -- RSDP (Root System Description Pointer), section 5.2.5 |
| |
| -- XSDT (eXtended System Description Table), section 5.2.8 |
| |
| -- FADT (Fixed ACPI Description Table), section 5.2.9 |
| |
| -- DSDT (Differentiated System Description Table), section |
| 5.2.11.1 |
| |
| -- MADT (Multiple APIC Description Table), section 5.2.12 |
| |
| -- GTDT (Generic Timer Description Table), section 5.2.24 |
| |
| -- If PCI is supported, the MCFG (Memory mapped ConFiGuration |
| Table), section 5.2.6, specifically Table 5-31. |
| |
| -- If booting without a console=<device> kernel parameter is |
| supported, the SPCR (Serial Port Console Redirection table), |
| section 5.2.6, specifically Table 5-31. |
| |
| -- If necessary to describe the I/O topology, SMMUs and GIC ITSs, |
| the IORT (Input Output Remapping Table), section 5.2.6, specifically |
| Table 5-31. |
| |
| -- If NUMA is supported, the SRAT (System Resource Affinity Table) |
| and SLIT (System Locality distance Information Table), sections |
| 5.2.16 and 5.2.17, respectively. |
| |
| If the above tables are not all present, the kernel may or may not be |
| able to boot properly since it may not be able to configure all of the |
| devices available. This list of tables is not meant to be all inclusive; |
| in some environments other tables may be needed (e.g., any of the APEI |
| tables from section 18) to support specific functionality. |
| |
| |
| ACPI Detection |
| -------------- |
| Drivers should determine their probe() type by checking for a null |
| value for ACPI_HANDLE, or checking .of_node, or other information in |
| the device structure. This is detailed further in the "Driver |
| Recommendations" section. |
| |
| In non-driver code, if the presence of ACPI needs to be detected at |
| run time, then check the value of acpi_disabled. If CONFIG_ACPI is not |
| set, acpi_disabled will always be 1. |
| |
| |
| Device Enumeration |
| ------------------ |
| Device descriptions in ACPI should use standard recognized ACPI interfaces. |
| These may contain less information than is typically provided via a Device |
| Tree description for the same device. This is also one of the reasons that |
| ACPI can be useful -- the driver takes into account that it may have less |
| detailed information about the device and uses sensible defaults instead. |
| If done properly in the driver, the hardware can change and improve over |
| time without the driver having to change at all. |
| |
| Clocks provide an excellent example. In DT, clocks need to be specified |
| and the drivers need to take them into account. In ACPI, the assumption |
| is that UEFI will leave the device in a reasonable default state, including |
| any clock settings. If for some reason the driver needs to change a clock |
| value, this can be done in an ACPI method; all the driver needs to do is |
| invoke the method and not concern itself with what the method needs to do |
| to change the clock. Changing the hardware can then take place over time |
| by changing what the ACPI method does, and not the driver. |
| |
| In DT, the parameters needed by the driver to set up clocks as in the example |
| above are known as "bindings"; in ACPI, these are known as "Device Properties" |
| and provided to a driver via the _DSD object. |
| |
| ACPI tables are described with a formal language called ASL, the ACPI |
| Source Language (section 19 of the specification). This means that there |
| are always multiple ways to describe the same thing -- including device |
| properties. For example, device properties could use an ASL construct |
| that looks like this: Name(KEY0, "value0"). An ACPI device driver would |
| then retrieve the value of the property by evaluating the KEY0 object. |
| However, using Name() this way has multiple problems: (1) ACPI limits |
| names ("KEY0") to four characters unlike DT; (2) there is no industry |
| wide registry that maintains a list of names, minimizing re-use; (3) |
| there is also no registry for the definition of property values ("value0"), |
| again making re-use difficult; and (4) how does one maintain backward |
| compatibility as new hardware comes out? The _DSD method was created |
| to solve precisely these sorts of problems; Linux drivers should ALWAYS |
| use the _DSD method for device properties and nothing else. |
| |
| The _DSM object (ACPI Section 9.14.1) could also be used for conveying |
| device properties to a driver. Linux drivers should only expect it to |
| be used if _DSD cannot represent the data required, and there is no way |
| to create a new UUID for the _DSD object. Note that there is even less |
| regulation of the use of _DSM than there is of _DSD. Drivers that depend |
| on the contents of _DSM objects will be more difficult to maintain over |
| time because of this; as of this writing, the use of _DSM is the cause |
| of quite a few firmware problems and is not recommended. |
| |
| Drivers should look for device properties in the _DSD object ONLY; the _DSD |
| object is described in the ACPI specification section 6.2.5, but this only |
| describes how to define the structure of an object returned via _DSD, and |
| how specific data structures are defined by specific UUIDs. Linux should |
| only use the _DSD Device Properties UUID [5]: |
| |
| -- UUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301 |
| |
| -- http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf |
| |
| The UEFI Forum provides a mechanism for registering device properties [4] |
| so that they may be used across all operating systems supporting ACPI. |
| Device properties that have not been registered with the UEFI Forum should |
| not be used. |
| |
| Before creating new device properties, check to be sure that they have not |
| been defined before and either registered in the Linux kernel documentation |
| as DT bindings, or the UEFI Forum as device properties. While we do not want |
| to simply move all DT bindings into ACPI device properties, we can learn from |
| what has been previously defined. |
| |
| If it is necessary to define a new device property, or if it makes sense to |
| synthesize the definition of a binding so it can be used in any firmware, |
| both DT bindings and ACPI device properties for device drivers have review |
| processes. Use them both. When the driver itself is submitted for review |
| to the Linux mailing lists, the device property definitions needed must be |
| submitted at the same time. A driver that supports ACPI and uses device |
| properties will not be considered complete without their definitions. Once |
| the device property has been accepted by the Linux community, it must be |
| registered with the UEFI Forum [4], which will review it again for consistency |
| within the registry. This may require iteration. The UEFI Forum, though, |
| will always be the canonical site for device property definitions. |
| |
| It may make sense to provide notice to the UEFI Forum that there is the |
| intent to register a previously unused device property name as a means of |
| reserving the name for later use. Other operating system vendors will |
| also be submitting registration requests and this may help smooth the |
| process. |
| |
| Once registration and review have been completed, the kernel provides an |
| interface for looking up device properties in a manner independent of |
| whether DT or ACPI is being used. This API should be used [6]; it can |
| eliminate some duplication of code paths in driver probing functions and |
| discourage divergence between DT bindings and ACPI device properties. |
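| |
| As a rough user-space illustration of the defaults-first style that such a |
| lookup interface encourages, consider the sketch below. It is not kernel |
| code: in the kernel the lookup would go through the interface declared in |
| include/linux/property.h (for example, device_property_read_u32()). The |
| struct fw_prop table, prop_read_u32(), struct dev_cfg, and the property names |
| "fifo-depth" and "timeout-ms" are hypothetical and are not registered |
| device properties. |
| |
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for properties supplied by firmware (DT or _DSD). */
struct fw_prop {
    const char *name;
    uint32_t value;
};

/*
 * Look up a u32 property by name. Returns 0 and stores the value on
 * success; returns -EINVAL and leaves *val untouched if it is absent.
 */
static int prop_read_u32(const struct fw_prop *props, size_t n,
                         const char *name, uint32_t *val)
{
    assert(name != NULL && val != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].name, name) == 0) {
            *val = props[i].value;
            return 0;
        }
    }
    return -EINVAL;
}

/* Per-device state that the rest of a driver would operate on. */
struct dev_cfg {
    uint32_t fifo_depth;
    uint32_t timeout_ms;
};

/* Start from sensible defaults, then apply whatever firmware describes. */
static void dev_cfg_init(struct dev_cfg *cfg,
                         const struct fw_prop *props, size_t n)
{
    cfg->fifo_depth = 16;
    cfg->timeout_ms = 100;

    /* A missing property is not an error here; the default simply stays. */
    (void)prop_read_u32(props, n, "fifo-depth", &cfg->fifo_depth);
    (void)prop_read_u32(props, n, "timeout-ms", &cfg->timeout_ms);
}
```
| |
| Because the lookup does not care where the properties came from, the same |
| dev_cfg_init() path serves firmware that describes every property and |
| firmware that describes none of them. |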
| |
| |
| Programmable Power Control Resources |
| ------------------------------------ |
| Programmable power control resources include such resources as voltage/current |
| providers (regulators) and clock sources. |
| |
| With ACPI, the kernel clock and regulator framework is not expected to be used |
| at all. |
| |
| The kernel assumes that power control of these resources is represented with |
| Power Resource Objects (ACPI section 7.1). The ACPI core will then handle |
| correctly enabling and disabling resources as they are needed. In order to |
| get that to work, ACPI assumes each device has defined D-states and that these |
| can be controlled through the optional ACPI methods _PS0, _PS1, _PS2, and _PS3; |
| in ACPI, _PS0 is the method to invoke to turn a device full on, and _PS3 is for |
| turning a device full off. |
| |
| There are two options for using those Power Resources. They can: |
| |
| -- be managed in a _PSx method which gets called on entry to power |
| state Dx. |
| |
| -- be declared separately as power resources with their own _ON and _OFF |
| methods. They are then tied back to D-states for a particular device |
| via _PRx which specifies which power resources a device needs to be on |
| while in Dx. The kernel then tracks the number of devices using a power |
| resource and calls _ON/_OFF as needed. |
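| |
| The bookkeeping behind the second option amounts to reference counting. The |
| user-space sketch below models that idea only; it is not the kernel's ACPI |
| power resource code. The struct and function names are hypothetical, and the |
| on_calls and off_calls counters stand in for evaluating _ON and _OFF. |
| |
```c
#include <assert.h>

/* Hypothetical model of one power resource shared by several devices. */
struct power_resource {
    unsigned int users;      /* devices that currently need it on */
    unsigned int on_calls;   /* times _ON would have been evaluated */
    unsigned int off_calls;  /* times _OFF would have been evaluated */
};

/* A device enters a D-state whose _PRx list names this resource. */
static void power_resource_get(struct power_resource *pr)
{
    if (pr->users++ == 0)
        pr->on_calls++;      /* first user: turn the resource on */
}

/* A device leaves that D-state and no longer needs the resource. */
static void power_resource_put(struct power_resource *pr)
{
    assert(pr->users > 0);   /* every put must match an earlier get */

    if (--pr->users == 0)
        pr->off_calls++;     /* last user gone: turn the resource off */
}
```
| |
| With two devices sharing a resource, _ON is evaluated once when the first |
| device needs it and _OFF once after the second device releases it. |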
| |
| The kernel ACPI code will also assume that the _PSx methods follow the normal |
| ACPI rules for such methods: |
| |
| -- If either _PS0 or _PS3 is implemented, then the other method must also |
| be implemented. |
| |
| -- If a device requires usage or setup of a power resource when on, the ASL |
| should arrange for it to be allocated/enabled in the _PS0 method. |
| |
| -- Resources allocated or enabled in the _PS0 method should be disabled |
| or de-allocated in the _PS3 method. |
| |
| -- Firmware will leave the resources in a reasonable state before handing |
| over control to the kernel. |
| |
| Such code in _PSx methods will of course be very platform specific. But, |
| this allows the driver to abstract out the interface for operating the device |
| and avoid having to read special non-standard values from ACPI tables. Further, |
| abstracting the use of these resources allows the hardware to change over time |
| without requiring updates to the driver. |
| |
| |
| Clocks |
| ------ |
| ACPI makes the assumption that clocks are initialized by the firmware -- |
| UEFI, in this case -- to some working value before control is handed over |
| to the kernel. This has implications for devices such as UARTs, or SoC-driven |
| LCD displays, for example. |
| |
| When the kernel boots, the clocks are assumed to be set to reasonable |
| working values. If for some reason the frequency needs to change -- e.g., |
| throttling for power management -- the device driver should expect that |
| process to be abstracted out into some ACPI method that can be invoked |
| (please see the ACPI specification for further recommendations on standard |
| methods to be expected). The only exceptions to this are CPU clocks where |
| CPPC provides a much richer interface than ACPI methods. If the clocks |
| are not set, there is no direct way for Linux to control them. |
| |
| If an SoC vendor wants to provide fine-grained control of the system clocks, |
| they could do so by providing ACPI methods that could be invoked by Linux |
| drivers. However, this is NOT recommended and Linux drivers should NOT use |
| such methods, even if they are provided. Such methods are not currently |
| standardized in the ACPI specification, and using them could tie a kernel |
| to a very specific SoC, or tie an SoC to a very specific version of the |
| kernel, both of which we are trying to avoid. |
| |
| |
| Driver Recommendations |
| ---------------------- |
| DO NOT remove any DT handling when adding ACPI support for a driver. The |
| same device may be used on many different systems. |
| |
| DO try to structure the driver so that it is data-driven. That is, set up |
| a struct containing internal per-device state based on defaults and whatever |
| else must be discovered by the driver probe function. Then, have the rest |
| of the driver operate off of the contents of that struct. Doing so should |
| allow most divergence between ACPI and DT functionality to be kept local to |
| the probe function instead of being scattered throughout the driver. For |
| example: |
| |
| static int device_probe_dt(struct platform_device *pdev) |
| { |
|         /* DT specific functionality */ |
|         ... |
| } |
| |
| static int device_probe_acpi(struct platform_device *pdev) |
| { |
|         /* ACPI specific functionality */ |
|         ... |
| } |
| |
| static int device_probe(struct platform_device *pdev) |
| { |
|         ... |
|         /* of_node is a pointer; it is NULL when the device has no DT node */ |
|         struct device_node *node = pdev->dev.of_node; |
|         ... |
| |
|         if (node) |
|                 ret = device_probe_dt(pdev); |
|         else if (ACPI_HANDLE(&pdev->dev)) |
|                 ret = device_probe_acpi(pdev); |
|         else |
|                 /* other initialization */ |
|                 ... |
|         /* Continue with any generic probe operations */ |
|         ... |
| } |
| |
| DO keep the MODULE_DEVICE_TABLE entries together in the driver to make it |
| clear the different names the driver is probed for, both from DT and from |
| ACPI: |
| |
| static const struct of_device_id virtio_mmio_match[] = { |
|         { .compatible = "virtio,mmio", }, |
|         { } |
| }; |
| MODULE_DEVICE_TABLE(of, virtio_mmio_match); |
| |
| static const struct acpi_device_id virtio_mmio_acpi_match[] = { |
|         { "LNRO0005", }, |
|         { } |
| }; |
| MODULE_DEVICE_TABLE(acpi, virtio_mmio_acpi_match); |
| |
| |
| ASWG |
| ---- |
| The ACPI specification changes regularly. During the year 2014, for instance, |
| version 5.1 was released and version 6.0 substantially completed, with most of |
| the changes being driven by ARM-specific requirements. Proposed changes are |
| presented and discussed in the ASWG (ACPI Specification Working Group) which |
| is a part of the UEFI Forum. The current version of the ACPI specification |
| is 6.1, released in January 2016. |
| |
| Participation in this group is open to all UEFI members. Please see |
| http://www.uefi.org/workinggroup for details on group membership. |
| |
| It is the intent of the ARMv8 ACPI kernel code to follow the ACPI specification |
| as closely as possible, and to only implement functionality that complies with |
| the released standards from UEFI ASWG. As a practical matter, there will be |
| vendors that provide bad ACPI tables or violate the standards in some way. |
| If this is because of errors, quirks and fix-ups may be necessary, but will |
| be avoided if possible. If there are features missing from ACPI that preclude |
| it from being used on a platform, ECRs (Engineering Change Requests) should be |
| submitted to ASWG and go through the normal approval process; for those that |
| are not UEFI members, many other members of the Linux community are and would |
| likely be willing to assist in submitting ECRs. |
| |
| |
| Linux Code |
| ---------- |
| Individual items specific to Linux on ARM, contained in the Linux |
| source code, are in the list that follows: |
| |
| ACPI_OS_NAME This macro defines the string to be returned when |
| an ACPI method invokes the _OS method. On ARM64 |
| systems, this macro will be "Linux" by default. |
| The command line parameter acpi_os=<string> |
| can be used to set it to some other value. The |
| default value for other architectures is "Microsoft |
| Windows NT", for example. |
| |
| ACPI Objects |
| ------------ |
| Detailed expectations for ACPI tables and objects are listed in the file |
| Documentation/arm64/acpi_object_usage.txt. |
| |
| |
| References |
| ---------- |
| [0] http://silver.arm.com -- document ARM-DEN-0029, or newer |
| "Server Base System Architecture", version 2.3, dated 27 Mar 2014 |
| |
| [1] http://infocenter.arm.com/help/topic/com.arm.doc.den0044a/Server_Base_Boot_Requirements.pdf |
| Document ARM-DEN-0044A, or newer: "Server Base Boot Requirements, System |
| Software on ARM Platforms", dated 16 Aug 2014 |
| |
| [2] http://www.secretlab.ca/archives/151, 10 Jan 2015, Copyright (c) 2015, |
| Linaro Ltd., written by Grant Likely. |
| |
| [3] AMD ACPI for Seattle platform documentation: |
| http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Seattle_ACPI_Guide.pdf |
| |
| [4] http://www.uefi.org/acpi -- please see the link for the "ACPI _DSD Device |
| Property Registry Instructions" |
| |
| [5] http://www.uefi.org/acpi -- please see the link for the "_DSD (Device |
| Specific Data) Implementation Guide" |
| |
| [6] Kernel code for the unified device property interface can be found in |
| include/linux/property.h and drivers/base/property.c. |
| |
| |
| Authors |
| ------- |
| Al Stone <al.stone@linaro.org> |
| Graeme Gregory <graeme.gregory@linaro.org> |
| Hanjun Guo <hanjun.guo@linaro.org> |
| |
| Grant Likely <grant.likely@linaro.org>, for the "Why ACPI on ARM?" section |