Closed
Labels
- stdlib: Standard Library Python modules in the Lib/ directory
- type-bug: An unexpected behavior, bug, or error
Description
Background
RFC 3986 defines a host as follows:

    host = IP-literal / IPv4address / reg-name

where

    IP-literal  = "[" ( IPv6address / IPvFuture ) "]"
    reg-name    = *( unreserved / pct-encoded / sub-delims )
    IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet

WhatWG says that "A valid host string must be a valid domain string, a valid IPv4-address string, or: U+005B ([), followed by a valid IPv6-address string, followed by U+005D (])."
The Bug
This is the code from Lib/urllib/parse.py:196-208 used for retrieving the hostname from the netloc:

    @property
    def _hostinfo(self):
        netloc = self.netloc
        _, _, hostinfo = netloc.rpartition('@')
        _, have_open_br, bracketed = hostinfo.partition('[')
        if have_open_br:
            hostname, _, port = bracketed.partition(']')
            _, _, port = port.partition(':')
        else:
            hostname, _, port = hostinfo.partition(':')
        if not port:
            port = None
        return hostname, port

It will incorrectly retrieve IPv4 addresses and regular name hosts from inside brackets. This is in violation of both specifications.
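To see why bracketed non-IPv6 hosts slip through, the partition steps of the property quoted above can be traced in isolation. This is a minimal sketch: `split_hostinfo` is a hypothetical standalone helper that mirrors the quoted logic and is not part of `urllib.parse`. Because it uses only string methods, it behaves the same on any Python version.

```python
def split_hostinfo(netloc):
    """Hypothetical standalone copy of the partition steps quoted above."""
    # Drop any userinfo ("user@" or "user:password@").
    _, _, hostinfo = netloc.rpartition('@')
    # Look for an opening bracket anywhere in the host part.
    _, have_open_br, bracketed = hostinfo.partition('[')
    if have_open_br:
        # Everything up to "]" becomes the hostname; its content is never checked.
        hostname, _, port = bracketed.partition(']')
        _, _, port = port.partition(':')
    else:
        hostname, _, port = hostinfo.partition(':')
    return hostname, (port or None)


print(split_hostinfo('user@[regname]'))         # ('regname', None)
print(split_hostinfo('user@[127.0.0.1]:8080'))  # ('127.0.0.1', '8080')
print(split_hostinfo('[::1]:443'))              # ('::1', '443')
```

The first two results show a reg-name and an IPv4 address being taken from inside brackets, which neither grammar allows. Only the third input is a valid IP-literal.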
Minimally reproducible example:

    from urllib.parse import urlsplit

    parsedURL = urlsplit('scheme://user@[regname]/Path')
    print(parsedURL.hostname)  # Prints 'regname'

Your environment
- CPython versions tested on:
  - 3.12a7 (23cf1e2)
  - 3.10.10
- Operating system and architecture:
  - Arch Linux x86_64
Linked PRs
- gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format #103849
- [3.11] gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (GH-103849) #104349
- [3.10] gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (#103849) #126975
- [3.9] gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (#103849) #126976
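The linked PRs are described as checking that a bracketed host is in IPv6 or IPvFuture format. The following is a rough sketch of what such a check could look like, not the code from those PRs: the function name `is_valid_bracketed_host` and the regular expression are assumptions, written from the RFC 3986 `IPvFuture` rule (`"v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )`).

```python
import ipaddress
import re

# Assumed pattern for RFC 3986 IPvFuture: "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
_IPVFUTURE = re.compile(r"v[0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


def is_valid_bracketed_host(host):
    """Return True if `host` (the text between "[" and "]") is IPv6 or IPvFuture.

    Hypothetical helper for illustration; not part of urllib.parse.
    """
    if _IPVFUTURE.fullmatch(host):
        return True
    try:
        # IPv6Address rejects IPv4 addresses and plain names with a ValueError subclass.
        ipaddress.IPv6Address(host)
    except ValueError:
        return False
    return True


print(is_valid_bracketed_host("::1"))        # True
print(is_valid_bracketed_host("regname"))    # False
print(is_valid_bracketed_host("127.0.0.1"))  # False
```

With a check like this applied to the bracketed part, the reproducer's `[regname]` host would be rejected instead of being returned as the hostname.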