From nobody Fri Feb 18 21:28:10 2022 X-Original-To: dev-commits-src-branches@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 97F6C19DA739; Fri, 18 Feb 2022 21:28:10 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4K0lDB3XnJz3G9w; Fri, 18 Feb 2022 21:28:10 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1645219690; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=14Vp+hN5ZNBy4QqrzLnQP9eqoCuVqUzjTCjs75Mjtoo=; b=nUY6YmbpB8RCWxSgUhXLuRBnx/LnEZGdBuQzBi2EOWDiN3Udfi4OOufn6huoE50u+gvyqY zh4lfXsPl6XZMBPlnlfu8oSfDvqeRfyZsicPC5dnkoZSSoA6N3tcWNx+MCD1T5sIa8nNtW 3rxzQ2+isyWg7HEk012H1tkilE12PW8X9CuAMAYJgO4kAWswdgNIiU4GfWUGNUOgBk3ddW MOEshYTRIOcMm573yIrf2wQ8fjB9SGmCMV4DnKxCVt63Yb7RlxmKUBkFcLwdbCIvaqZ0sE vIgMGGXHkbVcYkQljbtugFg6FHg4T2hbPSnR/o7PYzJ/v2h0ZDdQnl8j0C0wIw== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 575A6245F5; Fri, 18 Feb 2022 21:28:10 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 21ILSAuw004571; Fri, 18 Feb 2022 21:28:10 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 21ILSAUG004569; Fri, 18 Feb 2022 21:28:10 GMT (envelope-from git) Date: Fri, 18 Feb 2022 21:28:10 GMT Message-Id: <202202182128.21ILSAUG004569@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: "Kenneth D. Merry" Subject: git: acbae1011ed2 - stable/12 - Switch to using drive-supplied timeouts for the sa(4) driver. List-Id: Commits to the stable branches of the FreeBSD src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-branches@freebsd.org X-BeenThere: dev-commits-src-branches@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: ken X-Git-Repository: src X-Git-Refname: refs/heads/stable/12 X-Git-Reftype: branch X-Git-Commit: acbae1011ed2c13821f580ac8811a0cb27c44765 Auto-Submitted: auto-generated ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1645219690; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=14Vp+hN5ZNBy4QqrzLnQP9eqoCuVqUzjTCjs75Mjtoo=; b=YNE4NQ35JypE6yMrqYUcZvbNOIrdvRF+/oS8ucdMkkLmKVeHpuBc+VXjr0r2gJ7zzCejgW 3Vsrd2WU6jaBvTm3xxM7AMOssV7n53SsWOeZnUSp3wuT8aoc+jCPwjwkqGVoAdF+X+sYNe d+ZJUV52YVrzyW4CXLUTUNApuGzFLgwUhbQQFueprmj9coemFgJBs7DoH8GwsGMChyk7e3 L/rFbpYxzsFAvjkpnWTFTqj1+W6efTRtXoDdlec2yfBPuymKiFsy89SWrMMt9kFo5ct0oa z491Wkch7sTyxZCOfdsEvZ4vLe7QA88AoFYwBJmyNIuv5XUvJMeMQ++lLYz7tg== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1645219690; a=rsa-sha256; cv=none; b=Z3GqF0Y6ao1sV/mxXsfNAkNbHvvtVioaHRcNOxPE62pMDCKFKfhXeGIENefAX7Ah1tP4qt goWCdrcbxki7Oog80cwlIY4ZEC32dKDoMg3poyGEHm8DXz2PuPAa4x5EGbMoawt4MNDgKf Gf+RG6wIKPd4dJuB2+hZ30rgRkhbtgCFAh5HEV39Y3d3NjRDP8FavvNHdIdBEJrreTvHJH ljL17UGIITkqG72mVL3Bi5AaZGdA/ERY8Thvx9IzfIj2YWG2E9Oe7BM/AG3i6eTgBtRr61 ugIDtIGaCZ1SnsN4G8oV3EzxHVvz2XfAFjNtmd8pEapK6cV1QG7fg+zsahygpA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none X-ThisMailContainsUnwantedMimeParts: N The branch stable/12 has been updated by ken: URL: https://cgit.FreeBSD.org/src/commit/?id=acbae1011ed2c13821f580ac8811a0cb27c44765 commit acbae1011ed2c13821f580ac8811a0cb27c44765 Author: Kenneth D. Merry AuthorDate: 2022-01-13 21:07:58 +0000 Commit: Kenneth D. Merry CommitDate: 2022-02-18 20:23:50 +0000 Switch to using drive-supplied timeouts for the sa(4) driver. Summary: The sa(4) driver has historically used tape drive timeouts that were one-size fits all, with compile-time options to adjust a few of them. LTO-9 drives (and presumably other tape drives in the future) implement a tape characterization process that happens the first time a tape is loaded. The characterization process formats the tape to account for the temperature and humidity in the environment it is being used in. The process for LTO-9 tapes can take from 20 minutes (I have observed 17-18 minutes) to 2 hours according to the documentation. As a result, LTO-9 drives have significantly longer recommended load times than previous LTO generations. To handle this, change the sa(4) driver over to using timeouts supplied by the tape drive using the timeout descriptors obtained through the REPORT SUPPORTED OPERATION CODES command. That command was introduced in SPC-4. IBM tape drives going back to at least LTO-5 report timeout values. Oracle/Sun/StorageTek tape drives going back to at least the T10000C report timeout values. HP LTO-5 and newer drives report timeout values. The sa(4) driver only queries drives that claim to support SPC-4. This makes the timeout settings automatic and accurate for newer tape drives. Also, add loader tunable and sysctl support so that the user can override individual command type timeouts for all tape drives in the system, or only for specific drives. The new global (these affect all tape drives) loader tunables are: kern.cam.sa.timeout.erase kern.cam.sa.timeout.load kern.cam.sa.timeout.locate kern.cam.sa.timeout.mode_select kern.cam.sa.timeout.mode_sense kern.cam.sa.timeout.prevent kern.cam.sa.timeout.read kern.cam.sa.timeout.read_position kern.cam.sa.timeout.read_block_limits kern.cam.sa.timeout.report_density kern.cam.sa.timeout.reserve kern.cam.sa.timeout.rewind kern.cam.sa.timeout.space kern.cam.sa.timeout.tur kern.cam.sa.timeout.write kern.cam.sa.timeout.write_filemarks The new per-instance loader tunable / sysctl variables are: kern.cam.sa.%d.timeout.erase kern.cam.sa.%d.timeout.load kern.cam.sa.%d.timeout.locate kern.cam.sa.%d.timeout.mode_select kern.cam.sa.%d.timeout.mode_sense kern.cam.sa.%d.timeout.prevent kern.cam.sa.%d.timeout.read kern.cam.sa.%d.timeout.read_position kern.cam.sa.%d.timeout.read_block_limits kern.cam.sa.%d.timeout.report_density kern.cam.sa.%d.timeout.reserve kern.cam.sa.%d.timeout.rewind kern.cam.sa.%d.timeout.space kern.cam.sa.%d.timeout.tur kern.cam.sa.%d.timeout.write kern.cam.sa.%d.timeout.write_filemarks The values are reported and set in units of thousandths of a second. share/man/man4/sa.4: Document the new loader tunables in the sa(4) man page. sys/cam/scsi/scsi_sa.c: Add a new timeout_info array to the softc. Add a default timeouts array, along with descriptions. Add a new sysctl tree to the softc to handle the timeout sysctl values. Add a new function, saloadtotunables(), that will load the global loader tunables first and then any per-instance loader tunables second. Add creation of the new timeout sysctl variables in sasysctlinit(). Add a new, optional probe state to the sa(4) driver. We previously didn't do any probing, but now we probe for timeout descriptors if the drive claims to support SPC-4 or later. In saregister(), we check the SCSI revision and either launch the probe state machine, or announce the device and become ready. In sastart() and sadone(), add support for the new SA_STATE_PROBE. If we're probing, we don't go through saerror(), since that is currently only written to handle I/O errors in the normal state. Change every place in the sa(4) driver that fills in timeout values in a CCB to use the new timeout_info[] array in the softc. Add a new saloadtimeouts() routine to parse the returned timeout descriptors from a completed REPORT SUPPORTED OPERATION CODES command, and set the values for the commands we support. Add comments explaining the priority order of the various sources of timeout values. Also, explain that the probe that pulls in drive recommended timeouts via the REPORT SUPPORTED OPERATION CODES command is in a race with the thread that creates the sysctl variables. Because of that race, it is important that the sysctl thread not load any timeout values from the kernel environment. Sponsored by: Spectra Logic Test Plan: Try this out with a variety of tape drives and make sure the timeouts that result (sysctl kern.cam.sa to see them) are reasonable. Reviewers: #manpages, #cam Subscribers: imp Differential Revision: https://reviews.freebsd.org/D33883 (cherry picked from commit 5719b5a1bb643d5622557afe78dca63a800d9b7c) (cherry picked from commit bcff64c54a74268742f52d40d1eb2acd8ab6f07d) (cherry picked from commit 6e8a2f04001735353e445570f0d83aa88d4b9b37) --- share/man/man4/sa.4 | 96 +++++++- sys/cam/scsi/scsi_sa.c | 592 ++++++++++++++++++++++++++++++++++++++++++++----- 2 files changed, 628 insertions(+), 60 deletions(-) diff --git a/share/man/man4/sa.4 b/share/man/man4/sa.4 index c7336748d432..c30db5e44d88 100644 --- a/share/man/man4/sa.4 +++ b/share/man/man4/sa.4 @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd May 5, 2017 +.Dd January 18, 2022 .Dt SA 4 .Os .Sh NAME @@ -323,6 +323,100 @@ The driver does not currently use the RECOVER BUFFERED DATA command. .El +.Sh TIMEOUTS +The +.Nm +driver has a set of default timeouts for SCSI commands (READ, WRITE, TEST UNIT +READY, etc.) that will likely work in most cases for many tape drives. +.Pp +For newer tape drives that claim to support the SPC-4 +standard (SCSI Primary Commands 4) or later standards, the +.Nm +driver will attempt to use the REPORT SUPPORTED OPERATION CODES command to +fetch timeout descriptors from the drive. +If the drive does report timeout descriptors, the +.Nm +driver will use the drive's recommended timeouts for commands. +.Pp +The timeouts in use are reported in units of +.Sy thousandths +of a second via the +.Va kern.cam.sa.%d.timeout.* +.Xr sysctl 8 +variables. +.Pp +To override either the default timeouts, or the timeouts recommended by the +drive, you can set one of two sets of loader tunable values. +If you have a drive that supports the REPORT SUPPORTED OPERATION CODES +timeout descriptors (see the +.Xr camcontrol 8 +.Va opcodes +subcommand) it is generally best to use those values. +The global +.Va kern.cam.sa.timeout.* +values will override the timeouts for all +.Nm +driver instances. +If there are 5 tape drives in the system, they'll all get the same timeouts. +The +.Va kern.cam.sa.%d.timeout.* +values (where %d is the numeric +.Nm +instance number) will override the global timeouts as well as either the +default timeouts or the timeouts recommended by the drive. +.Pp +To set timeouts after boot, the per-instance timeout values, for example: +.Va kern.cam.sa.0.timeout.read , +are available as sysctl variables. +.Pp +If a tape drive arrives after boot, the global tunables or per-instance +tunables that apply to the newly arrived drive will be used. +.Pp +Loader tunables: +.Pp +.Bl -tag -compact +.It kern.cam.sa.timeout.erase +.It kern.cam.sa.timeout.locate +.It kern.cam.sa.timeout.mode_select +.It kern.cam.sa.timeout.mode_sense +.It kern.cam.sa.timeout.prevent +.It kern.cam.sa.timeout.read +.It kern.cam.sa.timeout.read_position +.It kern.cam.sa.timeout.read_block_limits +.It kern.cam.sa.timeout.report_density +.It kern.cam.sa.timeout.reserve +.It kern.cam.sa.timeout.rewind +.It kern.cam.sa.timeout.space +.It kern.cam.sa.timeout.tur +.It kern.cam.sa.timeout.write +.It kern.cam.sa.timeout.write_filemarks +.El +.Pp +Loader tunable values and +.Xr sysctl 8 +values: +.Pp +.Bl -tag -compact +.It kern.cam.sa.%d.timeout.erase +.It kern.cam.sa.%d.timeout.locate +.It kern.cam.sa.%d.timeout.mode_select +.It kern.cam.sa.%d.timeout.mode_sense +.It kern.cam.sa.%d.timeout.prevent +.It kern.cam.sa.%d.timeout.read +.It kern.cam.sa.%d.timeout.read_position +.It kern.cam.sa.%d.timeout.read_block_limits +.It kern.cam.sa.%d.timeout.report_density +.It kern.cam.sa.%d.timeout.reserve +.It kern.cam.sa.%d.timeout.rewind +.It kern.cam.sa.%d.timeout.space +.It kern.cam.sa.%d.timeout.tur +.It kern.cam.sa.%d.timeout.write +.It kern.cam.sa.%d.timeout.write_filemarks +.El +.Pp +As mentioned above, the timeouts are set and reported in +.Sy thousandths +of a second, so be sure to account for that when setting them. .Sh IOCTLS The .Nm diff --git a/sys/cam/scsi/scsi_sa.c b/sys/cam/scsi/scsi_sa.c index 4b71a4179efc..67e868ab500e 100644 --- a/sys/cam/scsi/scsi_sa.c +++ b/sys/cam/scsi/scsi_sa.c @@ -4,7 +4,7 @@ * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 1999, 2000 Matthew Jacob - * Copyright (c) 2013, 2014, 2015 Spectra Logic Corporation + * Copyright (c) 2013, 2014, 2015, 2021 Spectra Logic Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -85,7 +85,7 @@ __FBSDID("$FreeBSD$"); #define SA_ERASE_TIMEOUT 4 * 60 #endif #ifndef SA_REP_DENSITY_TIMEOUT -#define SA_REP_DENSITY_TIMEOUT 90 +#define SA_REP_DENSITY_TIMEOUT 1 #endif #define SCSIOP_TIMEOUT (60 * 1000) /* not an option */ @@ -115,7 +115,7 @@ __FBSDID("$FreeBSD$"); static MALLOC_DEFINE(M_SCSISA, "SCSI sa", "SCSI sequential access buffers"); typedef enum { - SA_STATE_NORMAL, SA_STATE_ABNORMAL + SA_STATE_NORMAL, SA_STATE_PROBE, SA_STATE_ABNORMAL } sa_state; #define ccb_pflags ppriv_field0 @@ -126,27 +126,28 @@ typedef enum { typedef enum { - SA_FLAG_OPEN = 0x0001, - SA_FLAG_FIXED = 0x0002, - SA_FLAG_TAPE_LOCKED = 0x0004, - SA_FLAG_TAPE_MOUNTED = 0x0008, - SA_FLAG_TAPE_WP = 0x0010, - SA_FLAG_TAPE_WRITTEN = 0x0020, - SA_FLAG_EOM_PENDING = 0x0040, - SA_FLAG_EIO_PENDING = 0x0080, - SA_FLAG_EOF_PENDING = 0x0100, + SA_FLAG_OPEN = 0x00001, + SA_FLAG_FIXED = 0x00002, + SA_FLAG_TAPE_LOCKED = 0x00004, + SA_FLAG_TAPE_MOUNTED = 0x00008, + SA_FLAG_TAPE_WP = 0x00010, + SA_FLAG_TAPE_WRITTEN = 0x00020, + SA_FLAG_EOM_PENDING = 0x00040, + SA_FLAG_EIO_PENDING = 0x00080, + SA_FLAG_EOF_PENDING = 0x00100, SA_FLAG_ERR_PENDING = (SA_FLAG_EOM_PENDING|SA_FLAG_EIO_PENDING| SA_FLAG_EOF_PENDING), - SA_FLAG_INVALID = 0x0200, - SA_FLAG_COMP_ENABLED = 0x0400, - SA_FLAG_COMP_SUPP = 0x0800, - SA_FLAG_COMP_UNSUPP = 0x1000, - SA_FLAG_TAPE_FROZEN = 0x2000, - SA_FLAG_PROTECT_SUPP = 0x4000, + SA_FLAG_INVALID = 0x00200, + SA_FLAG_COMP_ENABLED = 0x00400, + SA_FLAG_COMP_SUPP = 0x00800, + SA_FLAG_COMP_UNSUPP = 0x01000, + SA_FLAG_TAPE_FROZEN = 0x02000, + SA_FLAG_PROTECT_SUPP = 0x04000, SA_FLAG_COMPRESSION = (SA_FLAG_COMP_SUPP|SA_FLAG_COMP_ENABLED| SA_FLAG_COMP_UNSUPP), - SA_FLAG_SCTX_INIT = 0x8000 + SA_FLAG_SCTX_INIT = 0x08000, + SA_FLAG_RSOC_TO_TRY = 0x10000, } sa_flags; typedef enum { @@ -155,6 +156,64 @@ typedef enum { SA_MODE_OFFLINE = 0x02 } sa_mode; +typedef enum { + SA_TIMEOUT_ERASE, + SA_TIMEOUT_LOAD, + SA_TIMEOUT_LOCATE, + SA_TIMEOUT_MODE_SELECT, + SA_TIMEOUT_MODE_SENSE, + SA_TIMEOUT_PREVENT, + SA_TIMEOUT_READ, + SA_TIMEOUT_READ_BLOCK_LIMITS, + SA_TIMEOUT_READ_POSITION, + SA_TIMEOUT_REP_DENSITY, + SA_TIMEOUT_RESERVE, + SA_TIMEOUT_REWIND, + SA_TIMEOUT_SPACE, + SA_TIMEOUT_TUR, + SA_TIMEOUT_WRITE, + SA_TIMEOUT_WRITE_FILEMARKS, + SA_TIMEOUT_TYPE_MAX +} sa_timeout_types; + +/* + * These are the default timeout values that apply to all tape drives. + * + * We get timeouts from the following places in order of increasing + * priority: + * 1. Driver default timeouts. (Set in the structure below.) + * 2. Timeouts loaded from the drive via REPORT SUPPORTED OPERATION + * CODES. (If the drive supports it, SPC-4/LTO-5 and newer should.) + * 3. Global loader tunables, used for all sa(4) driver instances on + * a machine. + * 4. Instance-specific loader tunables, used for say sa5. + * 5. On the fly user sysctl changes. + * + * Each step will overwrite the timeout value set from the one + * before, so you go from general to most specific. + */ +static struct sa_timeout_desc { + const char *desc; + int value; +} sa_default_timeouts[SA_TIMEOUT_TYPE_MAX] = { + {"erase", ERASE_TIMEOUT}, + {"load", REWIND_TIMEOUT}, + {"locate", SPACE_TIMEOUT}, + {"mode_select", SCSIOP_TIMEOUT}, + {"mode_sense", SCSIOP_TIMEOUT}, + {"prevent", SCSIOP_TIMEOUT}, + {"read", IO_TIMEOUT}, + {"read_block_limits", SCSIOP_TIMEOUT}, + {"read_position", SCSIOP_TIMEOUT}, + {"report_density", REP_DENSITY_TIMEOUT}, + {"reserve", SCSIOP_TIMEOUT}, + {"rewind", REWIND_TIMEOUT}, + {"space", SPACE_TIMEOUT}, + {"tur", SCSIOP_TIMEOUT}, + {"write", IO_TIMEOUT}, + {"write_filemarks", IO_TIMEOUT}, +}; + typedef enum { SA_PARAM_NONE = 0x000, SA_PARAM_BLOCKSIZE = 0x001, @@ -357,6 +416,7 @@ struct sa_softc { uint8_t density_type_bits[SA_DENSITY_TYPES]; int density_info_valid[SA_DENSITY_TYPES]; uint8_t density_info[SA_DENSITY_TYPES][SRDS_MAX_LENGTH]; + int timeout_info[SA_TIMEOUT_TYPE_MAX]; struct sa_prot_info prot_info; @@ -415,6 +475,8 @@ struct sa_softc { struct task sysctl_task; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; + struct sysctl_ctx_list sysctl_timeout_ctx; + struct sysctl_oid *sysctl_timeout_tree; }; struct sa_quirk_entry { @@ -587,6 +649,8 @@ static int saspace(struct cam_periph *periph, int count, scsi_space_code code); static void sadevgonecb(void *arg); static void sasetupdev(struct sa_softc *softc, struct cdev *dev); +static void saloadtotunables(struct sa_softc *softc); +static void sasysctlinit(void *context, int pending); static int samount(struct cam_periph *, int, struct cdev *); static int saretension(struct cam_periph *periph); static int sareservereleaseunit(struct cam_periph *periph, @@ -604,6 +668,7 @@ static void safilldenstypesb(struct sbuf *sb, int *indent, int is_density); static void safilldensitysb(struct sa_softc *softc, int *indent, struct sbuf *sb); +static void saloadtimeouts(struct sa_softc *softc, union ccb *ccb); #ifndef SA_DEFAULT_IO_SPLIT @@ -2220,7 +2285,9 @@ sacleanup(struct cam_periph *periph) cam_periph_unlock(periph); if ((softc->flags & SA_FLAG_SCTX_INIT) != 0 - && sysctl_ctx_free(&softc->sysctl_ctx) != 0) + && (((softc->sysctl_timeout_tree != NULL) + && (sysctl_ctx_free(&softc->sysctl_timeout_ctx) != 0)) + || sysctl_ctx_free(&softc->sysctl_ctx) != 0)) xpt_print(periph->path, "can't remove sysctl context\n"); cam_periph_lock(periph); @@ -2291,12 +2358,47 @@ sasetupdev(struct sa_softc *softc, struct cdev *dev) softc->num_devs_to_destroy++; } +/* + * Load the global (for all sa(4) instances) and per-instance tunable + * values for timeouts for various sa(4) commands. This should be run + * after the default timeouts are fetched from the drive, so the user's + * preference will override the drive's defaults. + */ +static void +saloadtotunables(struct sa_softc *softc) +{ + int i; + char tmpstr[80]; + + for (i = 0; i < SA_TIMEOUT_TYPE_MAX; i++) { + int tmpval, retval; + + /* First grab any global timeout setting */ + snprintf(tmpstr, sizeof(tmpstr), "kern.cam.sa.timeout.%s", + sa_default_timeouts[i].desc); + retval = TUNABLE_INT_FETCH(tmpstr, &tmpval); + if (retval != 0) + softc->timeout_info[i] = tmpval; + + /* + * Then overwrite any global timeout settings with + * per-instance timeout settings. + */ + snprintf(tmpstr, sizeof(tmpstr), "kern.cam.sa.%u.timeout.%s", + softc->periph->unit_number, sa_default_timeouts[i].desc); + retval = TUNABLE_INT_FETCH(tmpstr, &tmpval); + if (retval != 0) + softc->timeout_info[i] = tmpval; + } +} + static void sasysctlinit(void *context, int pending) { struct cam_periph *periph; struct sa_softc *softc; - char tmpstr[32], tmpstr2[16]; + char tmpstr[64], tmpstr2[16]; + int i; periph = (struct cam_periph *)context; /* @@ -2331,6 +2433,32 @@ sasysctlinit(void *context, int pending) OID_AUTO, "inject_eom", CTLFLAG_RW, &softc->inject_eom, 0, "Queue EOM for the next write/read"); + sysctl_ctx_init(&softc->sysctl_timeout_ctx); + softc->sysctl_timeout_tree = SYSCTL_ADD_NODE(&softc->sysctl_timeout_ctx, + SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "timeout", + CTLFLAG_RD | CTLFLAG_MPSAFE, 0, "Timeouts"); + if (softc->sysctl_timeout_tree == NULL) + goto bailout; + + for (i = 0; i < SA_TIMEOUT_TYPE_MAX; i++) { + snprintf(tmpstr, sizeof(tmpstr), "%s timeout", + sa_default_timeouts[i].desc); + + /* + * Do NOT change this sysctl declaration to also load any + * tunable values for this sa(4) instance. In other words, + * do not change this to CTLFLAG_RWTUN. This function is + * run in parallel with the probe routine that fetches + * recommended timeout values from the tape drive, and we + * don't want the values from the drive to override the + * user's preference. + */ + SYSCTL_ADD_INT(&softc->sysctl_timeout_ctx, + SYSCTL_CHILDREN(softc->sysctl_timeout_tree), + OID_AUTO, sa_default_timeouts[i].desc, CTLFLAG_RW, + &softc->timeout_info[i], 0, tmpstr); + } + bailout: /* * Release the reference that was held when this task was enqueued. @@ -2348,7 +2476,7 @@ saregister(struct cam_periph *periph, void *arg) caddr_t match; char tmpstr[80]; int error; - + int i; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) { printf("saregister: no getdev CCB, can't register device\n"); @@ -2392,6 +2520,15 @@ saregister(struct cam_periph *periph, void *arg) } else softc->quirks = SA_QUIRK_NONE; + + /* + * Initialize the default timeouts. If this drive supports + * timeout descriptors we'll overwrite these values with the + * recommended timeouts from the drive. + */ + for (i = 0; i < SA_TIMEOUT_TYPE_MAX; i++) + softc->timeout_info[i] = sa_default_timeouts[i].value; + /* * Long format data for READ POSITION was introduced in SSC, which * was after SCSI-2. (Roughly equivalent to SCSI-3.) If the drive @@ -2405,6 +2542,19 @@ saregister(struct cam_periph *periph, void *arg) if (cgd->inq_data.version <= SCSI_REV_CCS) softc->quirks |= SA_QUIRK_NO_LONG_POS; + /* + * The SCSI REPORT SUPPORTED OPERATION CODES command was added in + * SPC-4. That command optionally includes timeout data for + * different commands. Timeout values can vary wildly among + * different drives, so if the drive itself has recommended values, + * we will try to use them. Set this flag to indicate we're going + * to ask the drive for timeout data. This flag also tells us to + * wait on loading timeout tunables so we can properly override + * timeouts with any user-specified values. + */ + if (SID_ANSI_REV(&cgd->inq_data) >= SCSI_REV_SPC4) + softc->flags |= SA_FLAG_RSOC_TO_TRY; + if (cgd->inq_data.spc3_flags & SPC3_SID_PROTECT) { struct ccb_dev_advinfo cdai; struct scsi_vpd_extended_inquiry_data ext_inq; @@ -2553,7 +2703,9 @@ saregister(struct cam_periph *periph, void *arg) softc->density_type_bits[3] = SRDS_MEDIUM_TYPE | SRDS_MEDIA; /* * Bump the peripheral refcount for the sysctl thread, in case we - * get invalidated before the thread has a chance to run. + * get invalidated before the thread has a chance to run. Note + * that this runs in parallel with the probe for the timeout + * values. */ cam_periph_acquire(periph); taskqueue_enqueue(taskqueue_thread, &softc->sysctl_task); @@ -2564,8 +2716,41 @@ saregister(struct cam_periph *periph, void *arg) */ xpt_register_async(AC_LOST_DEVICE, saasync, periph, periph->path); - xpt_announce_periph(periph, NULL); - xpt_announce_quirks(periph, softc->quirks, SA_QUIRK_BIT_STRING); + /* + * See comment above, try fetching timeout values for drives that + * might support it. Otherwise, use the defaults. + * + * We get timeouts from the following places in order of increasing + * priority: + * 1. Driver default timeouts. + * 2. Timeouts loaded from the drive via REPORT SUPPORTED OPERATION + * CODES. (We kick that off here if SA_FLAG_RSOC_TO_TRY is set.) + * 3. Global loader tunables, used for all sa(4) driver instances on + * a machine. + * 4. Instance-specific loader tunables, used for say sa5. + * 5. On the fly user sysctl changes. + * + * Each step will overwrite the timeout value set from the one + * before, so you go from general to most specific. + */ + if (softc->flags & SA_FLAG_RSOC_TO_TRY) { + /* + * Bump the peripheral refcount while we are probing. + */ + cam_periph_acquire(periph); + softc->state = SA_STATE_PROBE; + xpt_schedule(periph, CAM_PRIORITY_DEV); + } else { + /* + * This drive doesn't support Report Supported Operation + * Codes, so we load the tunables at this point to bring + * in any user preferences. + */ + saloadtotunables(softc); + + xpt_announce_periph(periph, NULL); + xpt_announce_quirks(periph, softc->quirks, SA_QUIRK_BIT_STRING); + } return (CAM_REQ_CMP); } @@ -2755,7 +2940,9 @@ again: (softc->flags & SA_FLAG_FIXED) != 0, length, (bp->bio_flags & BIO_UNMAPPED) != 0 ? (void *)bp : bp->bio_data, bp->bio_bcount, SSD_FULL_SIZE, - IO_TIMEOUT); + (bp->bio_cmd == BIO_READ) ? + softc->timeout_info[SA_TIMEOUT_READ] : + softc->timeout_info[SA_TIMEOUT_WRITE]); start_ccb->ccb_h.ccb_pflags &= ~SA_POSITION_UPDATED; start_ccb->ccb_h.ccb_bp = bp; bp = bioq_first(&softc->bio_queue); @@ -2768,6 +2955,59 @@ again: } break; } + case SA_STATE_PROBE: { + int num_opcodes; + size_t alloc_len; + uint8_t *params; + + /* + * This is an arbitrary number. An IBM LTO-6 drive reports + * 67 entries, and an IBM LTO-9 drive reports 71 entries. + * There can theoretically be more than 256 because + * service actions of a particular opcode are reported + * separately, but we're far enough ahead of the practical + * number here that we don't need to implement logic to + * retry if we don't get all the timeout descriptors. + */ + num_opcodes = 256; + + alloc_len = num_opcodes * + (sizeof(struct scsi_report_supported_opcodes_descr) + + sizeof(struct scsi_report_supported_opcodes_timeout)); + + params = malloc(alloc_len, M_SCSISA, M_NOWAIT| M_ZERO); + if (params == NULL) { + /* + * If this happens, go with default + * timeouts and announce the drive. + */ + saloadtotunables(softc); + + softc->state = SA_STATE_NORMAL; + + xpt_announce_periph(periph, NULL); + xpt_announce_quirks(periph, softc->quirks, + SA_QUIRK_BIT_STRING); + xpt_release_ccb(start_ccb); + cam_periph_release_locked(periph); + return; + } + + scsi_report_supported_opcodes(&start_ccb->csio, + /*retries*/ 3, + /*cbfcnp*/ sadone, + /*tag_action*/ MSG_SIMPLE_Q_TAG, + /*options*/ RSO_RCTD, + /*req_opcode*/ 0, + /*req_service_action*/ 0, + /*data_ptr*/ params, + /*dxfer_len*/ alloc_len, + /*sense_len*/ SSD_FULL_SIZE, + /*timeout*/ softc->timeout_info[SA_TIMEOUT_TUR]); + + xpt_action(start_ccb); + break; + } case SA_STATE_ABNORMAL: default: panic("state 0x%x in sastart", softc->state); @@ -2786,17 +3026,79 @@ sadone(struct cam_periph *periph, union ccb *done_ccb) softc = (struct sa_softc *)periph->softc; csio = &done_ccb->csio; - - softc->dsreg = MTIO_DSREG_REST; - bp = (struct bio *)done_ccb->ccb_h.ccb_bp; error = 0; - if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { - if ((error = saerror(done_ccb, 0, 0)) == ERESTART) { + + if (softc->state == SA_STATE_NORMAL) { + softc->dsreg = MTIO_DSREG_REST; + bp = (struct bio *)done_ccb->ccb_h.ccb_bp; + + if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { + if ((error = saerror(done_ccb, 0, 0)) == ERESTART) { + /* + * A retry was scheduled, so just return. + */ + return; + } + } + } else if (softc->state == SA_STATE_PROBE) { + bp = NULL; + if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { /* - * A retry was scheduled, so just return. + * Note that on probe, we just run through + * cam_periph_error(), since saerror() has a lot of + * special handling for I/O errors. We don't need + * that to get the opcodes. We either succeed + * after a retry or two, or give up. We don't + * print sense, we don't need to worry the user if + * this drive doesn't support timeout descriptors. */ - return; + if ((error = cam_periph_error(done_ccb, 0, + SF_NO_PRINT)) == ERESTART) { + /* + * A retry was scheduled, so just return. + */ + return; + } else if (error != 0) { + /* We failed to get opcodes. Give up. */ + + saloadtotunables(softc); + + softc->state = SA_STATE_NORMAL; + + xpt_release_ccb(done_ccb); + + xpt_announce_periph(periph, NULL); + xpt_announce_quirks(periph, softc->quirks, + SA_QUIRK_BIT_STRING); + cam_periph_release_locked(periph); + return; + } } + /* + * At this point, we have succeeded, so load the timeouts + * and go into the normal state. + */ + softc->state = SA_STATE_NORMAL; + + /* + * First, load the timeouts we got from the drive. + */ + saloadtimeouts(softc, done_ccb); + + /* + * Next, overwrite the timeouts from the drive with any + * loader tunables that the user set. + */ + saloadtotunables(softc); + + xpt_release_ccb(done_ccb); + xpt_announce_periph(periph, NULL); + xpt_announce_quirks(periph, softc->quirks, + SA_QUIRK_BIT_STRING); + cam_periph_release_locked(periph); + return; + } else { + panic("state 0x%x in sadone", softc->state); } if (error == EIO) { @@ -2896,13 +3198,15 @@ samount(struct cam_periph *periph, int oflags, struct cdev *dev) if (softc->flags & SA_FLAG_TAPE_MOUNTED) { ccb = cam_periph_getccb(periph, 1); scsi_test_unit_ready(&ccb->csio, 0, NULL, - MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, IO_TIMEOUT); + MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_TUR]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); if (error == ENXIO) { softc->flags &= ~SA_FLAG_TAPE_MOUNTED; scsi_test_unit_ready(&ccb->csio, 0, NULL, - MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, IO_TIMEOUT); + MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_TUR]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); } else if (error) { @@ -2923,7 +3227,8 @@ samount(struct cam_periph *periph, int oflags, struct cdev *dev) } ccb = cam_periph_getccb(periph, 1); scsi_test_unit_ready(&ccb->csio, 0, NULL, - MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, IO_TIMEOUT); + MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_TUR]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); } @@ -2944,7 +3249,8 @@ samount(struct cam_periph *periph, int oflags, struct cdev *dev) * *Very* first off, make sure we're loaded to BOT. */ scsi_load_unload(&ccb->csio, 2, NULL, MSG_SIMPLE_Q_TAG, FALSE, - FALSE, FALSE, 1, SSD_FULL_SIZE, REWIND_TIMEOUT); + FALSE, FALSE, 1, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_LOAD]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); @@ -2953,7 +3259,8 @@ samount(struct cam_periph *periph, int oflags, struct cdev *dev) */ if (error) { scsi_rewind(&ccb->csio, 2, NULL, MSG_SIMPLE_Q_TAG, - FALSE, SSD_FULL_SIZE, REWIND_TIMEOUT); + FALSE, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_REWIND]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); } @@ -2982,11 +3289,12 @@ samount(struct cam_periph *periph, int oflags, struct cdev *dev) scsi_sa_read_write(&ccb->csio, 0, NULL, MSG_SIMPLE_Q_TAG, 1, FALSE, 0, 8192, (void *) rblim, 8192, SSD_FULL_SIZE, - IO_TIMEOUT); + softc->timeout_info[SA_TIMEOUT_READ]); (void) cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); scsi_rewind(&ccb->csio, 1, NULL, MSG_SIMPLE_Q_TAG, - FALSE, SSD_FULL_SIZE, REWIND_TIMEOUT); + FALSE, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_REWIND]); error = cam_periph_runccb(ccb, saerror, CAM_RETRY_SELTO, SF_NO_PRINT | SF_RETRY_UA, softc->device_stats); @@ -3002,7 +3310,8 @@ samount(struct cam_periph *periph, int oflags, struct cdev *dev) * Next off, determine block limits. */ scsi_read_block_limits(&ccb->csio, 5, NULL, MSG_SIMPLE_Q_TAG, - rblim, SSD_FULL_SIZE, SCSIOP_TIMEOUT); + rblim, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_READ_BLOCK_LIMITS]); error = cam_periph_runccb(ccb, saerror, CAM_RETRY_SELTO, SF_NO_PRINT | SF_RETRY_UA, softc->device_stats); @@ -3619,7 +3928,7 @@ retry: scsi_mode_sense(&ccb->csio, 5, NULL, MSG_SIMPLE_Q_TAG, FALSE, SMS_PAGE_CTRL_CURRENT, (params_to_get & SA_PARAM_COMPRESSION) ? cpage : SMS_VENDOR_SPECIFIC_PAGE, mode_buffer, mode_buffer_len, - SSD_FULL_SIZE, SCSIOP_TIMEOUT); + SSD_FULL_SIZE, softc->timeout_info[SA_TIMEOUT_MODE_SENSE]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); @@ -3682,7 +3991,7 @@ retry: scsi_mode_sense(&ccb->csio, 2, NULL, MSG_SIMPLE_Q_TAG, FALSE, SMS_PAGE_CTRL_CURRENT, SMS_VENDOR_SPECIFIC_PAGE, mode_buffer, mode_buffer_len, SSD_FULL_SIZE, - SCSIOP_TIMEOUT); + softc->timeout_info[SA_TIMEOUT_MODE_SENSE]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); @@ -3749,7 +4058,8 @@ retry: /*data_ptr*/ softc->density_info[i], /*length*/ sizeof(softc->density_info[i]), /*sense_len*/ SSD_FULL_SIZE, - /*timeout*/ REP_DENSITY_TIMEOUT); + /*timeout*/ + softc->timeout_info[SA_TIMEOUT_REP_DENSITY]); error = cam_periph_runccb(ccb, saerror, 0, SF_NO_PRINT, softc->device_stats); status = ccb->ccb_h.status & CAM_STATUS_MASK; @@ -3811,7 +4121,8 @@ retry: /*param_len*/ dp_len, /*minimum_cmd_size*/ 10, /*sense_len*/ SSD_FULL_SIZE, - /*timeout*/ SCSIOP_TIMEOUT); + /*timeout*/ + softc->timeout_info[SA_TIMEOUT_MODE_SENSE]); /* * XXX KDM we need to be able to set the subpage in the * fill function. @@ -4039,7 +4350,8 @@ retry_length: /*param_len*/ dp_len, /*minimum_cmd_size*/ 10, /*sense_len*/ SSD_FULL_SIZE, - /*timeout*/ SCSIOP_TIMEOUT); + /*timeout*/ + softc->timeout_info[SA_TIMEOUT_MODE_SELECT]); error = cam_periph_runccb(ccb, saerror, 0, 0, softc->device_stats); if (error != 0) @@ -4309,7 +4621,8 @@ retry: /* It is safe to retry this operation */ scsi_mode_select(&ccb->csio, 5, NULL, MSG_SIMPLE_Q_TAG, (params_to_set & SA_PARAM_COMPRESSION)? TRUE : FALSE, - FALSE, mode_buffer, mode_buffer_len, SSD_FULL_SIZE, SCSIOP_TIMEOUT); + FALSE, mode_buffer, mode_buffer_len, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_MODE_SELECT]); error = cam_periph_runccb(ccb, saerror, 0, sense_flags, softc->device_stats); @@ -4626,7 +4939,7 @@ saprevent(struct cam_periph *periph, int action) /* It is safe to retry this operation */ scsi_prevent(&ccb->csio, 5, NULL, MSG_SIMPLE_Q_TAG, action, - SSD_FULL_SIZE, SCSIOP_TIMEOUT); + SSD_FULL_SIZE, softc->timeout_info[SA_TIMEOUT_PREVENT]); error = cam_periph_runccb(ccb, saerror, 0, sf, softc->device_stats); if (error == 0) { @@ -4652,7 +4965,7 @@ sarewind(struct cam_periph *periph) /* It is safe to retry this operation */ scsi_rewind(&ccb->csio, 2, NULL, MSG_SIMPLE_Q_TAG, FALSE, - SSD_FULL_SIZE, REWIND_TIMEOUT); + SSD_FULL_SIZE, softc->timeout_info[SA_TIMEOUT_REWIND]); softc->dsreg = MTIO_DSREG_REW; error = cam_periph_runccb(ccb, saerror, 0, 0, softc->device_stats); @@ -4684,7 +4997,7 @@ saspace(struct cam_periph *periph, int count, scsi_space_code code) /* This cannot be retried */ scsi_space(&ccb->csio, 0, NULL, MSG_SIMPLE_Q_TAG, code, count, - SSD_FULL_SIZE, SPACE_TIMEOUT); + SSD_FULL_SIZE, softc->timeout_info[SA_TIMEOUT_SPACE]); /* * Clear residual because we will be using it. @@ -4765,7 +5078,8 @@ sawritefilemarks(struct cam_periph *periph, int nmarks, int setmarks, int immed) softc->dsreg = MTIO_DSREG_FMK; /* this *must* not be retried */ scsi_write_filemarks(&ccb->csio, 0, NULL, MSG_SIMPLE_Q_TAG, - immed, setmarks, nmarks, SSD_FULL_SIZE, IO_TIMEOUT); + immed, setmarks, nmarks, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_WRITE_FILEMARKS]); softc->dsreg = MTIO_DSREG_REST; @@ -4833,7 +5147,8 @@ sagetpos(struct cam_periph *periph) /*data_ptr*/ (uint8_t *)&long_pos, /*length*/ sizeof(long_pos), /*sense_len*/ SSD_FULL_SIZE, - /*timeout*/ SCSIOP_TIMEOUT); + /*timeout*/ + softc->timeout_info[SA_TIMEOUT_READ_POSITION]); softc->dsreg = MTIO_DSREG_RBSY; error = cam_periph_runccb(ccb, saerror, 0, SF_QUIET_IR, @@ -4929,7 +5244,8 @@ sardpos(struct cam_periph *periph, int hard, u_int32_t *blkptr) ccb = cam_periph_getccb(periph, 1); scsi_read_position(&ccb->csio, 1, NULL, MSG_SIMPLE_Q_TAG, - hard, &loc, SSD_FULL_SIZE, SCSIOP_TIMEOUT); + hard, &loc, SSD_FULL_SIZE, + softc->timeout_info[SA_TIMEOUT_READ_POSITION]); softc->dsreg = MTIO_DSREG_RBSY; error = cam_periph_runccb(ccb, saerror, 0, 0, softc->device_stats); softc->dsreg = MTIO_DSREG_REST; @@ -4997,7 +5313,8 @@ sasetpos(struct cam_periph *periph, int hard, struct mtlocate *locate_info) /*partition*/ locate_info->partition, /*logical_id*/ locate_info->logical_id, /*sense_len*/ SSD_FULL_SIZE, - /*timeout*/ SPACE_TIMEOUT); + /*timeout*/ + softc->timeout_info[SA_TIMEOUT_LOCATE]); *** 212 LINES SKIPPED ***