Date: Mon, 20 May 2024 13:33:12 GMT
Message-Id: <202405201333.44KDXCLb045744@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Dag-Erling Smørgrav
Subject: git: 974ea6b297f8 - main - libdiff: Detect and recover from file truncation.
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
X-BeenThere: dev-commits-src-main@freebsd.org
Sender: owner-dev-commits-src-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: des
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 974ea6b297f8f9821bbb60670e2b90ba9989b283
Auto-Submitted: auto-generated

The branch main has been updated by des:

URL: https://cgit.FreeBSD.org/src/commit/?id=974ea6b297f8f9821bbb60670e2b90ba9989b283

commit 974ea6b297f8f9821bbb60670e2b90ba9989b283
Author:     Dag-Erling Smørgrav
AuthorDate: 2024-05-20 13:26:33 +0000
Commit:     Dag-Erling Smørgrav
CommitDate: 2024-05-20 13:26:33 +0000

    libdiff: Detect and recover from file truncation.

    If a memory-mapped file is truncated before we get to the end, the
    atomizer may catch SIGBUS.  Detect that, reduce the input length to
    what we were actually able to read, and set a flag so the caller can
    take further action (e.g. warn the user and / or start over).

    Sponsored by:   Klara, Inc.
    Reviewed by:    markj
    Differential Revision:  https://reviews.freebsd.org/D45217
---
 contrib/libdiff/include/diff_main.h     |  1 +
 contrib/libdiff/lib/diff_atomize_text.c | 35 ++++++++++++++++++++++++++++-----
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/contrib/libdiff/include/diff_main.h b/contrib/libdiff/include/diff_main.h
index 04a6c6e748c9..11700580db4b 100644
--- a/contrib/libdiff/include/diff_main.h
+++ b/contrib/libdiff/include/diff_main.h
@@ -118,6 +118,7 @@ struct diff_data {
 
 /* Flags set by file atomizer. */
 #define DIFF_ATOMIZER_FOUND_BINARY_DATA	0x00000001
+#define DIFF_ATOMIZER_FILE_TRUNCATED	0x00000002
 
 /* Flags set by caller of diff_main(). */
 #define DIFF_FLAG_IGNORE_WHITESPACE	0x00000001
diff --git a/contrib/libdiff/lib/diff_atomize_text.c b/contrib/libdiff/lib/diff_atomize_text.c
index 32023105af94..d8a69733fc00 100644
--- a/contrib/libdiff/lib/diff_atomize_text.c
+++ b/contrib/libdiff/lib/diff_atomize_text.c
@@ -16,6 +16,8 @@
  */
 
 #include
+#include <setjmp.h>
+#include <signal.h>
 #include
 #include
 #include
@@ -122,10 +124,18 @@ diff_data_atomize_text_lines_fd(struct diff_data *d)
 	return DIFF_RC_OK;
 }
 
+static sigjmp_buf diff_data_signal_env;
+static void
+diff_data_signal_handler(int sig)
+{
+	siglongjmp(diff_data_signal_env, sig);
+}
+
 static int
 diff_data_atomize_text_lines_mmap(struct diff_data *d)
 {
-	const uint8_t *pos = d->data;
+	struct sigaction act, oact;
+	const uint8_t *volatile pos = d->data;
 	const uint8_t *end = pos + d->len;
 	bool ignore_whitespace = (d->diff_flags & DIFF_FLAG_IGNORE_WHITESPACE);
 	bool embedded_nul = false;
@@ -136,8 +146,22 @@ diff_data_atomize_text_lines_mmap(struct diff_data *d)
 
 	ARRAYLIST_INIT(d->atoms, 1 << pow2);
 
+	sigemptyset(&act.sa_mask);
+	act.sa_flags = 0;
+	act.sa_handler = diff_data_signal_handler;
+	sigaction(SIGBUS, &act, &oact);
+	if (sigsetjmp(diff_data_signal_env, 0) > 0) {
+		/*
+		 * The file was truncated while we were reading it.  Set
+		 * the end pointer to the beginning of the line we were
+		 * trying to read, adjust the file length, and set a flag.
+		 */
+		end = pos;
+		d->len = end - d->data;
+		d->atomizer_flags |= DIFF_ATOMIZER_FILE_TRUNCATED;
+	}
 	while (pos < end) {
-		const uint8_t *line_end = pos;
+		const uint8_t *line_start = pos, *line_end = pos;
 		unsigned int hash = 0;
 
 		while (line_end < end && *line_end != '\r' && *line_end != '\n') {
@@ -164,15 +188,16 @@ diff_data_atomize_text_lines_mmap(struct diff_data *d)
 
 		*atom = (struct diff_atom){
 			.root = d,
-			.pos = (off_t)(pos - d->data),
-			.at = pos,
-			.len = line_end - pos,
+			.pos = (off_t)(line_start - d->data),
+			.at = line_start,
+			.len = line_end - line_start,
 			.hash = hash,
 		};
 
 		/* Starting point for next line: */
 		pos = line_end;
 	}
+	sigaction(SIGBUS, &oact, NULL);
 
 	/* File are considered binary if they contain embedded '\0' bytes. */
 	if (embedded_nul)