| Linux for S/390 and zSeries |
| |
| Common Device Support (CDS) |
| Device Driver I/O Support Routines |
| |
| Authors : Ingo Adlung |
| Cornelia Huck |
| |
| Copyright, IBM Corp. 1999-2002 |
| |
| Introduction |
| |
| This document describes the common device support routines for Linux/390. |
| Unlike other hardware architectures, ESA/390 defines a unified I/O access |
| method. This relieves device drivers of having to deal with different bus |
| types, polling versus interrupt processing, shared versus non-shared |
| interrupt processing, DMA versus port I/O (PIO), and other hardware |
| features. However, this implies that either every single device driver |
| needs to implement the hardware I/O attachment functionality itself, or the |
| operating system provides a unified method to access the hardware, |
| providing all the functionality that every single device driver would |
| otherwise have to provide itself. |
| |
| This document does not intend to explain the ESA/390 hardware architecture in |
| every detail. This information can be obtained from the ESA/390 Principles of |
| Operation manual (IBM Form. No. SA22-7201). |
| |
| In order to build common device support for ESA/390 I/O interfaces, a |
| functional layer was introduced that provides generic I/O access methods to |
| the hardware. |
| |
| The common device support layer comprises the I/O support routines defined |
| below. Some of them implement common Linux device driver interfaces, while |
| some of them are ESA/390 platform specific. |
| |
| Note: |
| In order to write a driver for S/390, you also need to look into the interface |
| described in Documentation/s390/driver-model.txt. |
| |
| Note for porting drivers from 2.4: |
| The major changes are: |
| * The functions use a ccw_device instead of an irq (subchannel). |
| * All drivers must define a ccw_driver (see driver-model.txt) and the associated |
| functions. |
| * request_irq() and free_irq() are no longer done by the driver. |
| * The oper_handler is (more or less) replaced by the probe() and set_online() |
| functions of the ccw_driver. |
| * The not_oper_handler is (more or less) replaced by the remove() and |
| set_offline() functions of the ccw_driver. |
| * The channel device layer is gone. |
| * The interrupt handlers must be adapted to use a ccw_device as argument. |
| Moreover, they don't return a devstat, but an irb. |
| * Before initiating an I/O request, the options must be set via |
| ccw_device_set_options(). |
| |
| read_dev_chars() |
| read device characteristics |
| |
| read_conf_data() |
| read_conf_data_lpm() |
| read configuration data. |
| |
| ccw_device_get_ciw() |
| get commands from extended sense data. |
| |
| ccw_device_start() |
| ccw_device_start_timeout() |
| ccw_device_start_key() |
| ccw_device_start_key_timeout() |
| initiate an I/O request. |
| |
| ccw_device_resume() |
| resume channel program execution. |
| |
| ccw_device_halt() |
| terminate the current I/O request processed on the device. |
| |
| do_IRQ() |
| generic interrupt routine. This function is called by the interrupt entry |
| routine whenever an I/O interrupt is presented to the system. The do_IRQ() |
| routine determines the interrupt status and calls the device specific |
| interrupt handler according to the rules (flags) defined during I/O request |
| initiation with ccw_device_start(). |
| |
| The next chapters describe the functions other than do_IRQ() in more detail. |
| The do_IRQ() interface is not described, as it is called from the Linux/390 |
| first level interrupt handler only and does not comprise a device driver |
| callable interface. Instead, the functional description of ccw_device_start() |
| also describes the input to the device specific interrupt handler. |
| |
| Note: All explanations apply also to the 64 bit architecture s390x. |
| |
| |
| Common Device Support (CDS) for Linux/390 Device Drivers |
| |
| General Information |
| |
| The following chapters describe the I/O related interface routines the |
| Linux/390 common device support (CDS) provides to allow for device specific |
| driver implementations on the IBM ESA/390 hardware platform. Those interfaces |
| are intended to provide the functionality required by every device driver |
| implementation to drive a specific hardware device on the ESA/390 |
| platform. Some of the interface routines are specific to Linux/390 and some |
| of them can be found in other Linux platform implementations too. |
| Miscellaneous function prototypes, data declarations, and macro definitions |
| can be found in the architecture specific C header file |
| linux/include/asm-s390/irq.h. |
| |
| Overview of CDS interface concepts |
| |
| Unlike other hardware platforms, the ESA/390 architecture doesn't define |
| interrupt lines managed by a specific interrupt controller and bus systems |
| that may or may not allow for shared interrupts, DMA processing, etc. Instead, |
| the ESA/390 architecture implements a so-called channel subsystem that |
| provides a unified view of the devices physically attached to the system. |
| Though the ESA/390 hardware platform knows about a huge variety of different |
| peripheral attachments like disk devices (aka. DASDs), tapes, communication |
| controllers, etc., they can all be accessed by a well-defined access method and |
| they all present I/O completion in a unified way: I/O interruptions. Every |
| single device is uniquely identified to the system by a so-called subchannel, |
| where the ESA/390 architecture allows for 64k devices to be attached. |
| |
| Linux, however, was first built on the Intel PC architecture, with its two |
| cascaded 8259 programmable interrupt controllers (PICs), that allow for a |
| maximum of 15 different interrupt lines. All devices attached to such a system |
| share those 15 interrupt levels. Devices attached to the ISA bus system must |
| not share interrupt levels (aka. IRQs), as the ISA bus is based on edge-triggered |
| interrupts. MCA, EISA, PCI and other bus systems are based on level-triggered |
| interrupts, and therefore allow for shared IRQs. However, if multiple devices |
| present their hardware status by the same (shared) IRQ, the operating system |
| has to call every single device driver registered on this IRQ in order to |
| determine the device driver owning the device that raised the interrupt. |
| |
| In order not to introduce a new I/O concept to the common Linux code, |
| Linux/390 preserves the IRQ concept and semantically maps the ESA/390 |
| subchannels to Linux as IRQs. This allows Linux/390 to support up to 64k |
| different IRQs, each uniquely representing a single device. |
| |
| Up to kernel 2.4, Linux/390 used to provide interfaces via the IRQ (subchannel). |
| For internal use of the common I/O layer, these are still there. However, |
| device drivers should use the new calling interface via the ccw_device only. |
| |
| During its startup the Linux/390 system checks for peripheral devices. Each |
| of those devices is uniquely defined by a so called subchannel by the ESA/390 |
| channel subsystem. While the subchannel numbers are system generated, each |
| subchannel also takes a user defined attribute, the so called device number. |
| Both subchannel number and device number cannot exceed 65535. During driverfs |
| initialisation, the information about control unit type and device type, which |
| imply the specific I/O commands (channel command words - CCWs) needed to |
| operate the device, is gathered. Device drivers can retrieve this set of |
| hardware information during their initialization step to recognize the devices |
| they support, using the information saved in the struct ccw_device given to |
| them. This method implies that Linux/390 doesn't need to probe for free (not |
| armed) interrupt request lines (IRQs) to drive its devices with. Where |
| applicable, the device drivers can use read_dev_chars() to retrieve device |
| characteristics. This can be done without having to request device ownership |
| first. |
| |
| In order to allow for easy I/O initiation the CDS layer provides a |
| ccw_device_start() interface that takes a device specific channel program (one |
| or more CCWs) as input, sets up the required architecture specific control |
| blocks and initiates an I/O request on behalf of the device driver. The |
| ccw_device_start() routine lets the driver specify whether it expects the CDS |
| layer to notify it of every interrupt it observes, or of final status only. |
| See ccw_device_start() for more details. A device driver must never issue |
| ESA/390 I/O commands itself, but must use the Linux/390 CDS interfaces instead. |
| |
| To cancel a long-running I/O request, the CDS layer provides the |
| ccw_device_halt() function. Some devices require that a HALT SUBCHANNEL (HSCH) |
| command be issued initially, without any pending I/O requests. This use is |
| also covered by ccw_device_halt(). |
| |
| |
| read_dev_chars() - Read Device Characteristics |
| |
| This routine returns the characteristics for the device specified. |
| |
| The function is meant to be called with an irq handler in place; that is, |
| at earliest during set_online() processing. |
| |
| While the request is processed synchronously, the device interrupt |
| handler is called for final ending status. In case of error situations the |
| interrupt handler may recover appropriately. The device irq handler can |
| recognize the corresponding interrupts by the interruption parameter being |
| 0x00524443. The ccw_device must not be locked prior to calling read_dev_chars(). |
| |
| The function may be called enabled or disabled. |
| |
| int read_dev_chars(struct ccw_device *cdev, void **buffer, int length); |
| |
| cdev - the ccw_device the information is requested for. |
| buffer - pointer to a buffer pointer. The buffer pointer itself |
| must contain a valid buffer area. |
| length - length of the buffer provided. |
| |
| The read_dev_chars() function returns : |
| |
| 0 - successful completion |
| -ENODEV - cdev invalid |
| -EINVAL - an invalid parameter was detected, or the function was called early. |
| -EBUSY - an irrecoverable I/O error occurred or the device is not |
| operational. |
| |
| |
| read_conf_data(), read_conf_data_lpm() - Read Configuration Data |
| |
| Retrieve the device dependent configuration data. Please have a look at your |
| device dependent I/O commands for the device specific layout of the node |
| descriptor elements. read_conf_data_lpm() will retrieve the configuration data |
| for a specific path. |
| |
| The function is meant to be called with the device already enabled; that is, |
| at earliest during set_online() processing. |
| |
| The function may be called enabled or disabled, but the device must not be |
| locked. |
| |
| int read_conf_data(struct ccw_device *cdev, void **buffer, int *length); |
| int read_conf_data_lpm(struct ccw_device *cdev, void **buffer, int *length, |
| __u8 lpm); |
| |
| cdev - the ccw_device the data is requested for. |
| buffer - Pointer to a buffer pointer. The read_conf_data() routine |
| will allocate a buffer and initialize the buffer pointer |
| accordingly. It's the device driver's responsibility to |
| release the kernel memory if no longer needed. |
| length - Length of the buffer allocated and retrieved. |
| lpm - Logical path mask to be used for retrieving the data. If |
| zero the data is retrieved on the next path available. |
| |
| The read_conf_data() function returns : |
| 0 - Successful completion |
| -ENODEV - cdev invalid. |
| -EINVAL - An invalid parameter was detected, or the function was called early. |
| -EIO - An irrecoverable I/O error occurred or the device is |
| not operational. |
| -ENOMEM - The read_conf_data() routine couldn't obtain storage. |
| -EOPNOTSUPP - The device doesn't support the read configuration |
| data command. |
| |
| |
| ccw_device_get_ciw() - get command information word |
| |
| This call enables a device driver to get information about supported commands |
| from the extended SenseID data. |
| |
| struct ciw * |
| ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd); |
| |
| cdev - The ccw_device for which the command is to be retrieved. |
| cmd - The command type to be retrieved. |
| |
| ccw_device_get_ciw() returns: |
| NULL - No extended data available, invalid device or command not found. |
| !NULL - The command requested. |
| |
| |
| ccw_device_start() - Initiate I/O Request |
| |
| The ccw_device_start() routines are the I/O request front end. All |
| device driver I/O requests must be issued using these routines. A device driver |
| must not issue ESA/390 I/O commands itself. Instead the ccw_device_start() |
| routines provide all interfaces required to drive arbitrary devices. |
| |
| This description also covers the status information passed to the device |
| driver's interrupt handler as this is related to the rules (flags) defined |
| with the associated I/O request when calling ccw_device_start(). |
| |
| int ccw_device_start(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| unsigned long flags); |
| int ccw_device_start_timeout(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| unsigned long flags, |
| int expires); |
| int ccw_device_start_key(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| __u8 key, |
| unsigned long flags); |
| int ccw_device_start_key_timeout(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| __u8 key, |
| unsigned long flags, |
| int expires); |
| |
| cdev : ccw_device the I/O is destined for |
| cpa : logical start address of channel program |
| intparm : user specific interrupt information; will be presented |
| back to the device driver's interrupt handler. Allows a |
| device driver to associate the interrupt with a |
| particular I/O request. |
| lpm : defines the channel path to be used for a specific I/O |
| request. A value of 0 will make cio use the opm. |
| key : the storage key to use for the I/O (useful for operating on a |
| storage with a storage key != default key) |
| flags : defines the action to be performed for I/O processing |
| expires : timeout value in jiffies. The common I/O layer will terminate |
| the running program after this and call the interrupt handler |
| with ERR_PTR(-ETIMEDOUT) as irb. |
| |
| Possible flag values are : |
| |
| DOIO_ALLOW_SUSPEND - channel program may become suspended |
| DOIO_DENY_PREFETCH - don't allow for CCW prefetch; usually |
| this implies the channel program might |
| become modified |
| DOIO_SUPPRESS_INTER - don't call the handler on intermediate status |
| |
| The cpa parameter points to the first format 1 CCW of a channel program : |
| |
| struct ccw1 { |
| __u8 cmd_code;/* command code */ |
| __u8 flags; /* flags, like IDA addressing, etc. */ |
| __u16 count; /* byte count */ |
| __u32 cda; /* data address */ |
| } __attribute__ ((packed,aligned(8))); |
| |
| with the following CCW flags values defined : |
| |
| CCW_FLAG_DC - data chaining |
| CCW_FLAG_CC - command chaining |
| CCW_FLAG_SLI - suppress incorrect length |
| CCW_FLAG_SKIP - skip |
| CCW_FLAG_PCI - PCI |
| CCW_FLAG_IDA - indirect addressing |
| CCW_FLAG_SUSPEND - suspend |
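| |
| As an illustration of the layout above, the following userspace sketch fills |
| two format 1 CCWs and links them with command chaining. It is not kernel code: |
| the struct is mirrored with fixed-width types, the flag values are assumed to |
| match the kernel's cio header, and EXAMPLE_CMD_WRITE, EXAMPLE_CMD_READ, |
| ccw_fill() and ccw_chain() are hypothetical names used only here. A real |
| driver would store a 31-bit real address in cda; the pointer truncation below |
| merely stands in for that. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CCW flag bits; values assumed to match the kernel's cio header. */
#define CCW_FLAG_DC      0x80
#define CCW_FLAG_CC      0x40
#define CCW_FLAG_SLI     0x20

/* Hypothetical device command codes, for illustration only. */
#define EXAMPLE_CMD_WRITE 0x01
#define EXAMPLE_CMD_READ  0x02

/* Userspace mirror of struct ccw1 using fixed-width types. */
struct ccw1 {
	uint8_t  cmd_code;	/* command code */
	uint8_t  flags;		/* flags, like IDA addressing, etc. */
	uint16_t count;		/* byte count */
	uint32_t cda;		/* data address */
} __attribute__ ((packed, aligned(8)));

/* Fill one CCW. The address truncation is a stand-in for a real address. */
void ccw_fill(struct ccw1 *ccw, uint8_t cmd, uint8_t flags,
	      void *data, uint16_t count)
{
	assert(ccw != NULL);
	ccw->cmd_code = cmd;
	ccw->flags = flags;
	ccw->count = count;
	ccw->cda = (uint32_t)(uintptr_t)data;
}

/* Command-chain n CCWs: every CCW except the last gets CCW_FLAG_CC. */
void ccw_chain(struct ccw1 *prog, size_t n)
{
	size_t i;

	assert(prog != NULL && n > 0);
	for (i = 0; i + 1 < n; i++)
		prog[i].flags |= CCW_FLAG_CC;
	/* The last CCW must end the chain. */
	prog[n - 1].flags &= (uint8_t)~(CCW_FLAG_CC | CCW_FLAG_DC);
}
```

| The first element of such an array is what would be passed as the cpa |
| argument of ccw_device_start(). |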
| |
| |
| Via ccw_device_set_options(), the device driver may specify the following |
| options for the device: |
| |
| DOIO_EARLY_NOTIFICATION - allow for early interrupt notification |
| DOIO_REPORT_ALL - report all interrupt conditions |
| |
| |
| The ccw_device_start() function returns : |
| |
| 0 - successful completion or request successfully initiated |
| -EBUSY - The device is currently processing a previous I/O request, or there |
| is a status pending at the device. |
| -ENODEV - cdev is invalid, the device is not operational or the ccw_device is |
| not online. |
| |
| When the I/O request completes, the CDS first level interrupt handler will |
| accumulate the status in a struct irb and then call the device interrupt handler. |
| The intparm field will contain the value the device driver has associated with a |
| particular I/O request. If a pending device status was recognized, |
| intparm will be set to 0 (zero). This may happen during I/O initiation or delayed |
| by an alert status notification. In any case this status is not related to the |
| current (last) I/O request. In case of a delayed status notification no special |
| interrupt will be presented to indicate I/O completion as the I/O request was |
| never started, even though ccw_device_start() returned with successful completion. |
| |
| The irb may contain an error value, and the device driver should check for this |
| first: |
| |
| -ETIMEDOUT: the common I/O layer terminated the request after the specified |
| timeout value |
| -EIO: the common I/O layer terminated the request due to an error state |
| |
| If the concurrent sense flag in the extended status word in the irb is set, the |
| field irb->scsw.count describes the number of device specific sense bytes |
| available in the extended control word irb->scsw.ecw[0]. No device sensing by |
| the device driver itself is required. |
| |
| The device interrupt handler can use the following definitions to investigate |
| the primary unit check source coded in sense byte 0 : |
| |
| SNS0_CMD_REJECT 0x80 |
| SNS0_INTERVENTION_REQ 0x40 |
| SNS0_BUS_OUT_CHECK 0x20 |
| SNS0_EQUIPMENT_CHECK 0x10 |
| SNS0_DATA_CHECK 0x08 |
| SNS0_OVERRUN 0x04 |
| SNS0_INCOMPL_DOMAIN 0x01 |
| |
| Depending on the device status, several of those values may be set together. |
| Please refer to the device specific documentation for details. |
| |
| The irb->scsw.cstat field provides the (accumulated) subchannel status : |
| |
| SCHN_STAT_PCI - program controlled interrupt |
| SCHN_STAT_INCORR_LEN - incorrect length |
| SCHN_STAT_PROG_CHECK - program check |
| SCHN_STAT_PROT_CHECK - protection check |
| SCHN_STAT_CHN_DATA_CHK - channel data check |
| SCHN_STAT_CHN_CTRL_CHK - channel control check |
| SCHN_STAT_INTF_CTRL_CHK - interface control check |
| SCHN_STAT_CHAIN_CHECK - chaining check |
| |
| The irb->scsw.dstat field provides the (accumulated) device status : |
| |
| DEV_STAT_ATTENTION - attention |
| DEV_STAT_STAT_MOD - status modifier |
| DEV_STAT_CU_END - control unit end |
| DEV_STAT_BUSY - busy |
| DEV_STAT_CHN_END - channel end |
| DEV_STAT_DEV_END - device end |
| DEV_STAT_UNIT_CHECK - unit check |
| DEV_STAT_UNIT_EXCEP - unit exception |
| |
| Please see the ESA/390 Principles of Operation manual for details on the |
| individual flag meanings. |
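| |
| A handler typically reduces the two status bytes to a decision. The sketch |
| below is one possible policy, not the one the common I/O layer prescribes: it |
| treats any subchannel status other than PCI as a channel error, checks for |
| unit check next, and calls the request done once channel end and device end |
| have both been seen. The bit values are assumed to match the kernel headers; |
| struct io_status, enum io_result and classify_status() are hypothetical. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Device status bits; values assumed to match the kernel headers. */
#define DEV_STAT_ATTENTION	0x80
#define DEV_STAT_STAT_MOD	0x40
#define DEV_STAT_CU_END		0x20
#define DEV_STAT_BUSY		0x10
#define DEV_STAT_CHN_END	0x08
#define DEV_STAT_DEV_END	0x04
#define DEV_STAT_UNIT_CHECK	0x02
#define DEV_STAT_UNIT_EXCEP	0x01

/* Subchannel status bits used below; values assumed as above. */
#define SCHN_STAT_PCI		0x80
#define SCHN_STAT_PROG_CHECK	0x20

/* Simplified stand-in for the dstat/cstat fields of irb->scsw. */
struct io_status {
	uint8_t dstat;
	uint8_t cstat;
};

enum io_result { IO_INTERMEDIATE, IO_DONE, IO_UNIT_CHECK, IO_CHANNEL_ERROR };

enum io_result classify_status(const struct io_status *st)
{
	const uint8_t final = DEV_STAT_CHN_END | DEV_STAT_DEV_END;

	assert(st != NULL);
	/* Policy assumption: everything except PCI is a channel error. */
	if (st->cstat & ~SCHN_STAT_PCI)
		return IO_CHANNEL_ERROR;
	/* Unit check means sense data should be examined. */
	if (st->dstat & DEV_STAT_UNIT_CHECK)
		return IO_UNIT_CHECK;
	/* Channel end and device end together mark final status. */
	if ((st->dstat & final) == final)
		return IO_DONE;
	return IO_INTERMEDIATE;
}
```

| Because the status is accumulated, testing with a mask rather than for |
| equality keeps the check valid when additional bits are set. |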
| |
| Usage Notes : |
| |
| Before calling ccw_device_start() the device driver must ensure disabled state, |
| i.e. the I/O mask value in the PSW must be disabled. This can be accomplished |
| by calling local_irq_save(flags). The current PSW flags are preserved and |
| can be restored by local_irq_restore(flags) at a later time. |
| |
| If the device driver violates this rule while running in a uni-processor |
| environment an interrupt might be presented prior to the ccw_device_start() |
| routine returning to the device driver main path. In this case we will end in a |
| deadlock situation as the interrupt handler will try to obtain the irq |
| lock the device driver still owns (see below) ! |
| |
| The driver must ensure that it holds the device specific lock. This can be |
| accomplished by |
| |
| (i) spin_lock(get_ccwdev_lock(cdev)), or |
| (ii) spin_lock_irqsave(get_ccwdev_lock(cdev), flags) |
| |
| Option (i) should be used if the calling routine is running disabled for |
| I/O interrupts (see above) already. Option (ii) obtains the device gate and |
| puts the CPU into I/O disabled state while preserving the current PSW flags. |
| |
| The device driver is allowed to issue the next ccw_device_start() call from |
| within its interrupt handler already. It is not required to schedule a |
| bottom-half, unless a nondeterministically long-running error recovery procedure |
| or similar needs to be scheduled. During I/O processing the Linux/390 generic |
| I/O device driver support has already obtained the IRQ lock, i.e. the handler |
| must not try to obtain it again when calling ccw_device_start() or we end in a |
| deadlock situation! |
| |
| If a device driver relies on an I/O request being completed before starting the |
| next one, it can reduce I/O processing overhead by chaining a NoOp I/O command |
| CCW_CMD_NOOP to the end of the submitted CCW chain. This will force Channel-End |
| and Device-End status to be presented together, with a single interrupt. |
| However, this should be used with care as it implies the channel will remain |
| busy, not being able to process I/O requests for other devices on the same |
| channel. Therefore e.g. read commands should never use this technique, as the |
| result will be presented by a single interrupt anyway. |
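| |
| Appending such a NoOp could be sketched as follows, in userspace rather than |
| kernel code. The struct mirrors ccw1 with fixed-width types, the constant |
| values are assumed to match the kernel headers, and ccw_append_noop() is a |
| hypothetical helper. Giving the NoOp the SLI flag and a zero count is an |
| assumption of this sketch, since the command transfers no data. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values assumed to match the kernel's cio header. */
#define CCW_CMD_NOOP	0x03
#define CCW_FLAG_CC	0x40
#define CCW_FLAG_SLI	0x20

/* Userspace mirror of struct ccw1 using fixed-width types. */
struct ccw1 {
	uint8_t  cmd_code;
	uint8_t  flags;
	uint16_t count;
	uint32_t cda;
} __attribute__ ((packed, aligned(8)));

/*
 * Append a NoOp CCW to a program of 'used' CCWs stored in an array of
 * 'capacity' entries. Returns the new number of CCWs, or 'used'
 * unchanged when there is no room left.
 */
size_t ccw_append_noop(struct ccw1 *prog, size_t used, size_t capacity)
{
	assert(prog != NULL && used > 0);
	if (used >= capacity)
		return used;

	/* The former last CCW now chains to the NoOp. */
	prog[used - 1].flags |= CCW_FLAG_CC;

	prog[used].cmd_code = CCW_CMD_NOOP;
	prog[used].flags = CCW_FLAG_SLI;	/* assumption, see lead-in */
	prog[used].count = 0;			/* no data is transferred */
	prog[used].cda = 0;
	return used + 1;
}
```

| As noted above, this keeps the channel busy until the device has finished, |
| so it is worth applying only where the driver really depends on it. |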
| |
| In order to minimize I/O overhead, a device driver should use the |
| DOIO_REPORT_ALL option only if the device can report intermediate interrupt |
| information prior to device-end and the device driver urgently relies on that |
| information. In this case all I/O interruptions are presented to the device |
| driver until final status is recognized. |
| |
| If a device is able to recover from asynchronously presented I/O errors, it can |
| perform overlapping I/O using the DOIO_EARLY_NOTIFICATION flag. While some |
| devices always report channel-end and device-end together, with a single |
| interrupt, others present primary status (channel-end) when the channel is |
| ready for the next I/O request and secondary status (device-end) when the data |
| transmission has been completed at the device. |
| |
| The above flag makes it possible to exploit this feature, e.g. for |
| communication devices that can handle lost data on the network, to allow for |
| enhanced I/O processing. |
| |
| Unless the channel subsystem at any time presents a secondary status interrupt, |
| exploiting this feature will cause only primary status interrupts to be |
| presented to the device driver while overlapping I/O is performed. When a |
| secondary status without error (alert status) is presented, this indicates |
| successful completion for all overlapping ccw_device_start() requests that have |
| been issued since the last secondary (final) status. |
| |
| Channel programs that intend to set the suspend flag on a channel command word |
| (CCW) must start the I/O operation with the DOIO_ALLOW_SUSPEND option or the |
| suspend flag will cause a channel program check. At the time the channel program |
| becomes suspended an intermediate interrupt will be generated by the channel |
| subsystem. |
| |
| ccw_device_resume() - Resume Channel Program Execution |
| |
| If a device driver chooses to suspend the current channel program execution by |
| setting the CCW suspend flag on a particular CCW, the channel program execution |
| is suspended. In order to resume channel program execution the CIO layer |
| provides the ccw_device_resume() routine. |
| |
| int ccw_device_resume(struct ccw_device *cdev); |
| |
| cdev - ccw_device the resume operation is requested for |
| |
| The ccw_device_resume() function returns: |
| |
| 0 - suspended channel program is resumed |
| -EBUSY - status pending |
| -ENODEV - cdev invalid or not-operational subchannel |
| -EINVAL - resume function not applicable |
| -ENOTCONN - there is no I/O request pending for completion |
| |
| Usage Notes: |
| Please have a look at the ccw_device_start() usage notes for more details on |
| suspended channel programs. |
| |
| ccw_device_halt() - Halt I/O Request Processing |
| |
| Sometimes a device driver might need to stop the processing of a long-running |
| channel program, or the device might require that a halt subchannel (HSCH) |
| I/O command be issued initially. For those purposes the ccw_device_halt() |
| function is provided. |
| |
| int ccw_device_halt(struct ccw_device *cdev, |
| unsigned long intparm); |
| |
| cdev : ccw_device the halt operation is requested for |
| intparm : interruption parameter; value is only used if no I/O |
| is outstanding, otherwise the intparm associated with |
| the I/O request is returned |
| |
| The ccw_device_halt() function returns : |
| |
| 0 - successful completion or request successfully initiated |
| -EBUSY - the device is currently busy, or status pending. |
| -ENODEV - cdev invalid. |
| -EINVAL - The device is not operational or the ccw device is not online. |
| |
| Usage Notes : |
| |
| A device driver may write a never-ending channel program by writing a channel |
| program that at its end loops back to its beginning by means of a transfer in |
| channel (TIC) command (CCW_CMD_TIC). Usually this is performed by network |
| device drivers by setting the PCI CCW flag (CCW_FLAG_PCI). Once this CCW is |
| executed a program controlled interrupt (PCI) is generated. The device driver |
| can then perform an appropriate action. Before interrupting an outstanding |
| read to a network device (with or without PCI flag), a ccw_device_halt() |
| is required to end the pending operation. |
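| |
| A looping program of this kind could be laid out as in the userspace sketch |
| below: a run of read CCWs, each raising a PCI, followed by a TIC that points |
| back to the first CCW. The constant values are assumed to match the kernel |
| headers; EXAMPLE_CMD_READ and build_read_ring() are hypothetical, setting PCI |
| on every read is just one possible choice, and the pointer truncation stands |
| in for the real addresses a driver would supply. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values assumed to match the kernel's cio header. */
#define CCW_CMD_TIC	0x08
#define CCW_FLAG_CC	0x40
#define CCW_FLAG_PCI	0x08

/* Hypothetical device read command, for illustration only. */
#define EXAMPLE_CMD_READ 0x02

/* Userspace mirror of struct ccw1 using fixed-width types. */
struct ccw1 {
	uint8_t  cmd_code;
	uint8_t  flags;
	uint16_t count;
	uint32_t cda;
} __attribute__ ((packed, aligned(8)));

/*
 * Build n chained read CCWs followed by a TIC back to prog[0].
 * prog must provide room for n + 1 CCWs; bufs must hold n * buflen bytes.
 */
void build_read_ring(struct ccw1 *prog, size_t n,
		     uint8_t *bufs, uint16_t buflen)
{
	size_t i;

	assert(prog != NULL && bufs != NULL && n > 0);
	for (i = 0; i < n; i++) {
		prog[i].cmd_code = EXAMPLE_CMD_READ;
		/* Chain to the next CCW and raise a PCI when executed. */
		prog[i].flags = CCW_FLAG_CC | CCW_FLAG_PCI;
		prog[i].count = buflen;
		prog[i].cda = (uint32_t)(uintptr_t)(bufs + i * buflen);
	}

	/* The TIC carries no flags and no count, only the branch target. */
	prog[n].cmd_code = CCW_CMD_TIC;
	prog[n].flags = 0;
	prog[n].count = 0;
	prog[n].cda = (uint32_t)(uintptr_t)&prog[0];
}
```

| Since such a program never ends on its own, ccw_device_halt() is the way to |
| stop it. |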
| |
| |
| Miscellaneous Support Routines |
| |
| This chapter describes various routines to be used in a Linux/390 device |
| driver programming environment. |
| |
| get_ccwdev_lock() |
| |
| Get the address of the device specific lock. This is then used in |
| spin_lock() / spin_unlock() calls. |
| |
| |
| __u8 ccw_device_get_path_mask(struct ccw_device *cdev); |
| |
| Get the mask of the paths currently available for cdev. |