git: 1af7d5f38953 - main - libfetch: don't include fragments in HTTP requests

From: Pietro Cerutti <gahr_at_FreeBSD.org>
Date: Wed, 21 Aug 2024 12:35:33 UTC
The branch main has been updated by gahr:

URL: https://cgit.FreeBSD.org/src/commit/?id=1af7d5f389536a2f391153513d95d92ffdf360e4

commit 1af7d5f389536a2f391153513d95d92ffdf360e4
Author:     Pietro Cerutti <gahr@FreeBSD.org>
AuthorDate: 2024-08-21 12:35:27 +0000
Commit:     Pietro Cerutti <gahr@FreeBSD.org>
CommitDate: 2024-08-21 12:35:27 +0000

    libfetch: don't include fragments in HTTP requests
    
    Summary:
    Fragments are reserved for client-side processing, see
    https://www.rfc-editor.org/rfc/rfc9110.html#section-7.1
    
    Also, some servers don't like to receive HTTP requests with fragments.
    
    ```
    $ fetch 'https://dropbox.com/a/b'
    fetch: https://dropbox.com/a/b: Not Found
    
    $ fetch 'https://dropbox.com/a/b#'
    fetch: https://dropbox.com/a/b#: Bad Request
    ```
    
    This is a real-world scenario: a Dropbox download link (eventually)
    redirects to a URL with a fragment:
    
    ```
    $ fetch -v 'https://www.dropbox.com/sh/<some>/<thing>?dl=1' 2>&1 | grep requesting
    requesting https://www.dropbox.com/sh/<some>/<thing>?dl=1
    requesting https://www.dropbox.com/scl/fo/<foo>/<bar>?rlkey=<baz>&dl=1
    requesting https://<boo>.dl.dropboxusercontent.com/zip_download_get/<some-long-string>#
    ```
    
    See how the last redirect ends with a `#`.
    
    Currently, libfetch includes the trailing fragment in the request,
    which makes it impossible to download the file.
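    
    The intended behaviour can be sketched in isolation as follows. This
    is only an illustration, not the libfetch code: strip_fragment() is a
    hypothetical helper, and the whitespace handling done by the real
    parser is left out. It copies the document part of a URL and stops at
    the first `#`, so the fragment never reaches the request line.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>
    
    /*
     * Hypothetical helper, not part of libfetch: copy src into dst,
     * stopping at the end of the string or at the first '#'.
     * Returns the number of bytes copied (excluding the terminating
     * NUL), or (size_t)-1 if dst is too small.
     */
    size_t
    strip_fragment(const char *src, char *dst, size_t dstsize)
    {
    	size_t n = 0;
    
    	assert(src != NULL && dst != NULL && dstsize > 0);
    	while (src[n] != '\0' && src[n] != '#') {
    		/* always leave room for the terminating NUL */
    		if (n + 1 >= dstsize)
    			return ((size_t)-1);
    		dst[n] = src[n];
    		n++;
    	}
    	dst[n] = '\0';
    	return (n);
    }
    ```
    
    With this sketch, both "/a/b#" and "/a/b#top" yield "/a/b", while a
    URL without a fragment is copied unchanged.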
    
    Differential Revision:  https://reviews.freebsd.org/D46318
    MFC after:              2 weeks
---
 lib/libfetch/fetch.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/libfetch/fetch.c b/lib/libfetch/fetch.c
index 12cbd0fb746f..97fc04bb09a6 100644
--- a/lib/libfetch/fetch.c
+++ b/lib/libfetch/fetch.c
@@ -447,7 +447,10 @@ nohost:
 			goto ouch;
 		}
 		u->doc = doc;
-		while (*p != '\0') {
+		/* fragments are reserved for client-side processing, see
+		 * https://www.rfc-editor.org/rfc/rfc9110.html#section-7.1
+		 */
+		while (*p != '\0' && *p != '#') {
 			if (!isspace((unsigned char)*p)) {
 				*doc++ = *p++;
 			} else {