Re: pkgs contain non URL safe characters

From: Ronald Klop <ronald-lists_at_klop.ws>
Date: Tue, 01 Mar 2022 11:57:56 UTC

On 2/17/22 03:05, Aristedes Maniatis wrote:
> Just to check this behaviour, I used tcpdump to see what the request looked like from pkg-fetch.
> 
> 
>      123.ish.com.au.15580 > pkg0.twn.freebsd.org.http: Flags [P.], cksum 0x80e0 (incorrect -> 0xfc82), seq 1:184, ack 1, win 1027, options [nop,nop,TS val 975600196 ecr 3136747760], length 183: HTTP, length: 183
>      GET /FreeBSD:13:amd64/quarterly/All/openjdk11-11.0.13+8.1.pkg HTTP/1.1
>      Host: pkgmir.geo.freebsd.org
>      Accept: */*
>      User-Agent: pkg/1.17.5
>      Range: bytes=6733824-
>      Connection: close
> 
> 
> You can see in there that the + is not URL encoded. Is it expected that pkg uses URL standards for its repository? If not, any advice on how to host a repository on a commercial service like AWS cloudfront?
> 
> Should we rewrite all our files with + symbols to spaces? Should pkg names only contain URL safe characters? Or should pkg-fetch be fixed to encode URLs?
> 
> 
> I took a quick look at the source for pkg.c and where it calls fetchXGet but I can't understand where any URL encoding might happen.
> 
> 
> Ari
> 
> 
> On 14/2/2022 11:18am, Aristedes Maniatis wrote:
>> Some packages contain "+" symbol which is a way of encoding spaces in a URL. This means that I'm having trouble hosting our pkg repository behind cloudfront/S3.
>>
>> I wasn't sure where to post this issue, so I put more details here: https://github.com/freebsd/poudriere/issues/976
>>
>>
>> Is there a workaround for this issue? Could pkg-fetch escape such characters when interacting with a http repository?
>>
>>
>> Cheers
>>
>> Ari
>>
>>


Hi,

I looked into this a bit and have not seen another answer on the ML yet.

I think this describes it pretty clearly and also points to the official HTTP specifications.
https://stackoverflow.com/questions/2678551/when-should-space-be-encoded-to-plus-or-20

TL;DR:
The + character is not special in the path part of the URL; it stands for a space only in form-encoded query strings. The request sent by pkg is compliant with the specs.
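
To illustrate the difference, here is a minimal sketch in Python (standard library only). The path is the one from the tcpdump output quoted above; the helper names are my own and purely illustrative. It is not how pkg or CloudFront are implemented, it only shows the two decoding rules side by side.

```python
from urllib.parse import quote, unquote, unquote_plus

# Request path taken from the tcpdump output quoted above.
PKG_PATH = "/FreeBSD:13:amd64/quarterly/All/openjdk11-11.0.13+8.1.pkg"


def decode_as_path(raw: str) -> str:
    """Path rules: only percent-escapes are decoded, '+' stays a literal plus."""
    return unquote(raw)


def decode_as_form_query(raw: str) -> str:
    """Form-encoding rules (application/x-www-form-urlencoded): '+' becomes a space."""
    return unquote_plus(raw)


def encode_plus(raw: str) -> str:
    """Hypothetical client-side workaround: send '+' as %2B, keep '/' and ':' as-is."""
    return quote(raw, safe="/:")


if __name__ == "__main__":
    print(decode_as_path(PKG_PATH))        # file name keeps its '+'
    print(decode_as_form_query(PKG_PATH))  # '+' has been turned into a space
    print(encode_plus(PKG_PATH))           # '+' is sent as %2B
```

A server that applies the form-encoding rule to the path looks for a file name containing a space and does not find it. A path with %2B decodes back to + under both rules, so it would be unambiguous, but whether pkg should send it that way is a separate question.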

I'm aware that there is a difference between what the specs say and what browsers and servers do in real life.
Still, why does CloudFront decode a + to a space in this part of the URL?

Regards,
Ronald.