SiddiqSoft.SplitUri
Header-only C++20 library for parsing Uri.
Getting started
- This library does not use any Windows-specific code. However, use Visual Studio 2019 v16.11.2 or newer, as support for <format> is not present for GCC or Clang!
- On Windows with Visual Studio, use the NuGet package!
- Make sure you use c++latest, as <format> is no longer in the c++20 option pending ABI resolution.
Dependencies
None. However, if you’re using the nlohmann.json library, then we will enable the serializers for that library.
API
The anatomy of a Uri can be summarized as follows:
        userinfo       host      port
        ┌──┴───┐ ┌──────┴──────┐ ┌┴┐
https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top
└─┬─┘   └───────────┬──────────────┘└───────┬───────┘ └───────────┬─────────────┘ └┬┘
scheme          authority                  path                 query          fragment
In this library, we focus on the http and https schemes.
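To make the anatomy above concrete, here is a minimal sketch of how the five top-level pieces can be separated. It is not the library's implementation: `UriParts` and `splitParts` are hypothetical names invented for this illustration, and the sketch assumes a well-formed `scheme://authority/path?query#fragment` input with no validation or percent-decoding.

```cpp
#include <string_view>

// Hypothetical illustration type; this is NOT the library's Uri class.
struct UriParts
{
    std::string_view scheme, authority, path, query, fragment;
};

// Hypothetical helper: separates the top-level pieces of a URI.
// Assumes well-formed input; performs no validation or decoding.
constexpr UriParts splitParts(std::string_view uri)
{
    UriParts p {};
    // Scheme: everything before "://".
    if (auto pos = uri.find("://"); pos != std::string_view::npos) {
        p.scheme = uri.substr(0, pos);
        uri.remove_prefix(pos + 3);
    }
    // Fragment: everything after the first '#'.
    if (auto pos = uri.find('#'); pos != std::string_view::npos) {
        p.fragment = uri.substr(pos + 1);
        uri        = uri.substr(0, pos);
    }
    // Query: everything after the first '?'.
    if (auto pos = uri.find('?'); pos != std::string_view::npos) {
        p.query = uri.substr(pos + 1);
        uri     = uri.substr(0, pos);
    }
    // Authority runs up to the first '/'; the remainder is the path.
    if (auto pos = uri.find('/'); pos != std::string_view::npos) {
        p.authority = uri.substr(0, pos);
        p.path      = uri.substr(pos);
    }
    else {
        p.authority = uri;
    }
    return p;
}
```

The fragment is stripped before the query because a `?` is allowed to appear inside a fragment, while a `#` cannot appear inside a query.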
class siddiqsoft::Uri
Signature
template <typename T = char, class Auth = AuthorityHttp<T>>
    requires (same_as<char, T> || same_as<wchar_t, T>) &&
             same_as<Auth, AuthorityHttp<T>>
class Uri
{
private:
    basic_string<T> sourceUri {};

public:
    UriScheme scheme {UriScheme::WebHttp};
    Auth authority {};
    vector<std::basic_string<T>> path {};
    map<basic_string<T>, basic_string<T>> query {};
    basic_string<T> fragment {};
    basic_string<T> urlPart {};
    basic_string<T> queryPart {};

    Uri();
    Uri(const basic_string<T>& aEndpoint);
    operator basic_string<T>() const;
    const basic_string<T> string() const;
    static Uri<T, Auth> split(const basic_string<T>& aEndpoint);
};
We use a requires clause from <concepts> to ensure that the class is only instantiated for char or wchar_t, that is, for std::string or std::wstring. When possible, we can update this library to support char8_t.
Member Variables
Variable | Description |
---|---|
std::basic_string<CharT> - sourceUri | Stores the original string. |
UriScheme - scheme | Defaults to {UriScheme::WebHttp}. Represents the scheme for this Uri. |
Auth - authority | Template type that represents the authority. The current implementation defaults to AuthorityHttp<> and must not be changed; the requires clause on the template enforces this. |
std::vector<std::basic_string<CharT>> - path | An array of strings representing the segments in the path. |
std::map<std::basic_string<CharT>,std::basic_string<CharT>> - query | A collection of key-value elements in the query section. |
std::basic_string<CharT> - fragment | The fragment segment of the Uri. |
std::basic_string<CharT> - urlPart | Shortcut to the url segment (the remainder after the authority section). |
std::basic_string<CharT> - queryPart | Shortcut to the query segment. |
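As a rough illustration of what the path and query members hold, the sketch below breaks a path into segments and a query into key-value pairs. `splitPath` and `splitQuery` are hypothetical helpers written for this example, not functions of the library; the sketch assumes narrow strings, no percent-decoding, and that a key without `=` maps to an empty value.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: splits "/a/b/" into {"a", "b"}, skipping empty segments.
inline std::vector<std::string> splitPath(const std::string& path)
{
    std::vector<std::string> segments;
    std::size_t              start = 0;
    while (start < path.size()) {
        auto end = path.find('/', start);
        if (end == std::string::npos) end = path.size();
        if (end > start) segments.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return segments;
}

// Hypothetical helper: splits "k1=v1&k2" into {{"k1", "v1"}, {"k2", ""}}.
inline std::map<std::string, std::string> splitQuery(const std::string& query)
{
    std::map<std::string, std::string> items;
    std::size_t                        start = 0;
    while (start < query.size()) {
        auto end = query.find('&', start);
        if (end == std::string::npos) end = query.size();
        const auto pair = query.substr(start, end - start);
        const auto eq   = pair.find('=');
        if (!pair.empty()) {
            // A key without '=' is stored with an empty value (assumption).
            if (eq == std::string::npos) items[pair] = "";
            else items[pair.substr(0, eq)] = pair.substr(eq + 1);
        }
        start = end + 1;
    }
    return items;
}
```

Because the query is stored in a std::map, repeated keys collapse to the last value seen and the original ordering of the parameters is not preserved.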
Member Functions
Uri::Uri
Uri();
Default (empty) constructor.
Uri::Uri
Uri(const std::basic_string<CharT>&) noexcept;
Delegates to the split function.
Uri::split
static Uri<CharT, Auth> split(const std::basic_string<CharT>&) noexcept(false);
since v1.8.0
This method performs the parsing, splitting and general decomposition of the source string.
It may throw!
returns
A Uri<> object
Uri::operator basic_string()
operator basic_string<CharT>() const;
Convenience operator that delegates to string().
returns
A std::basic_string<CharT>.
Uri::string
const std::basic_string<CharT> string() const;
Rebuild the Uri string.
returns
Returns the sourceUri member variable or rebuilds the string via std::format.
struct siddiqsoft::AuthorityHttp
Signature
template <typename CharT>
    requires std::same_as<char, CharT> || std::same_as<wchar_t, CharT>
struct AuthorityHttp
{
    std::basic_string<CharT> userInfo {};
    std::basic_string<CharT> host {};
    uint16_t port {0};

    operator std::basic_string<CharT>() const;
};
Member Variables
Variable | Description |
---|---|
std::basic_string<CharT> - userInfo | The user information portion of the authority (the part before the @), if present. |
std::basic_string<CharT> - host | The host name. |
uint16_t - port | The port number. Initialized to 0. |
Member Functions
operator std::basic_string<CharT>() const
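The conversion operator is not described further here. As a sketch of what such an operator might do, the narrow-only stand-in below renders the authority as `[userInfo@]host[:port]`. `AuthoritySketch` is a hypothetical type, and omitting a zero port is an assumption of this example, not documented behaviour of the library.

```cpp
#include <cstdint>
#include <string>

// Hypothetical narrow-only stand-in for AuthorityHttp<char>.
struct AuthoritySketch
{
    std::string   userInfo {};
    std::string   host {};
    std::uint16_t port {0};

    // Renders [userInfo@]host[:port]; a zero port is omitted (assumption).
    operator std::string() const
    {
        std::string out;
        if (!userInfo.empty()) out += userInfo + "@";
        out += host;
        if (port != 0) out += ":" + std::to_string(port);
        return out;
    }
};
```

Because the operator is implicit, an `AuthoritySketch` can be passed wherever a `std::string` is expected.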
enum siddiqsoft::UriScheme
Signature
enum class UriScheme
{
    WebHttp, WebHttps,
    Ldap, Mailto, News, Tel, Telnet, Urn, // Not supported
    Unknown
};
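The example output in this README shows port 443 for an https Uri that carries no explicit port, which suggests that the scheme determines a default port. The sketch below shows one way such a mapping could look. `schemeName` and `defaultPort` are hypothetical helpers, and the enum is repeated only so that the sketch compiles on its own.

```cpp
#include <cstdint>
#include <string_view>

// Mirror of the enum above so that this sketch is self-contained.
enum class UriScheme { WebHttp, WebHttps, Ldap, Mailto, News, Tel, Telnet, Urn, Unknown };

// Hypothetical helper: textual name for the two supported schemes.
constexpr std::string_view schemeName(UriScheme s)
{
    switch (s) {
        case UriScheme::WebHttp: return "http";
        case UriScheme::WebHttps: return "https";
        default: return ""; // unsupported schemes have no name in this sketch
    }
}

// Hypothetical helper: well-known default port, or 0 when none applies.
constexpr std::uint16_t defaultPort(UriScheme s)
{
    switch (s) {
        case UriScheme::WebHttp: return 80;
        case UriScheme::WebHttps: return 443;
        default: return 0;
    }
}
```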
static siddiqsoft::SplitUri
template <typename T = char>
requires(std::same_as<char, T> || std::same_as<wchar_t, T>)
static Uri<T, AuthorityHttp<T>> SplitUri(const std::basic_string<T>& aEndpoint);
template <typename T = char>
requires(std::same_as<char, T> || std::same_as<wchar_t, T>)
static Uri<T, AuthorityHttp<T>> SplitUri(const T* aEndpoint);
This convenience wrapper instantiates a Uri<> object via its constructor.
NOTE: There is a SplitUri(const T*) overload in addition to SplitUri(const std::basic_string<T>&) to account for the failed implicit conversion from T* to basic_string.
static siddiqsoft::w2n
Signature
[[nodiscard]] static inline std::string w2n(const std::wstring& ws);
Wrapper atop wcstombs_s with a limit of 256 characters for the destination. Used internally by the various serializer functions.
It is meant to be used internally to handle conversion of small data sets.
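For readers on platforms without wcstombs_s, the idea can be sketched with the portable std::wcstombs. `narrowSmall` is a hypothetical name, not the library's w2n. The silent truncation of longer input and the empty result on a conversion error are assumptions of this sketch; how w2n itself treats those cases is not stated here.

```cpp
#include <array>
#include <cstdlib>
#include <string>

// Hypothetical portable stand-in for w2n; the library wraps wcstombs_s instead.
[[nodiscard]] inline std::string narrowSmall(const std::wstring& ws)
{
    std::array<char, 256> buffer {};
    // Convert at most 255 bytes so that the buffer is never overrun.
    const std::size_t written = std::wcstombs(buffer.data(), ws.c_str(), buffer.size() - 1);
    // wcstombs returns (size_t)-1 when a character cannot be converted.
    if (written == static_cast<std::size_t>(-1)) return {};
    return std::string(buffer.data(), written);
}
```

The conversion depends on the current C locale, so in the default "C" locale only plain ASCII content is guaranteed to convert.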
Examples
#include "siddiqsoft/SplitUri.hpp"
..
..
using namespace siddiqsoft::splituri_literals;
// Use the literal operator helper.
auto uri= "https://www.google.com/search?q=siddiqsoft"_Uri;
// Outputs https://www.google.com/search?q=siddiqsoft
std::cout << std::format("{}", uri) << std::endl;
A more thorough example
using namespace siddiqsoft::splituri_literals;
auto u = "https://www.google.com/search?flag&q=siddiqsoft#v1"_Uri;
std::cerr << u.authority.host << std::endl;
std::cerr << u.authority.port << std::endl;
std::cerr << u.urlPart << std::endl;
std::cerr << u.queryPart << std::endl;
std::cerr << u.fragment << std::endl;
std::cerr << nlohmann::json(u.path).dump() << std::endl;
std::cerr << nlohmann::json(u.query).dump() << std::endl;
std::cerr << std::format("{}", u.scheme) << "...." << nlohmann::json(u.scheme).dump() << std::endl;
std::cerr << std::format("{}", u.authority) << std::endl;
std::cerr << std::format("{}", u) << std::endl;
And the corresponding output
www.google.com
443
/search?flag&q=siddiqsoft#v1
flag&q=siddiqsoft
v1
["search"]
{"flag":"", "q":"siddiqsoft"}
https...."https"
www.google.com:443
https://www.google.com/search?flag&q=siddiqsoft#v1
Notes
There are two internal helpers used by the library. They may be used as you wish, but note the inline documentation and accommodate accordingly.
- _NORW is a macro that I gleaned from the <string> header. Basically, it allows us to use a single narrow string in cases such as format strings and other non-Unicode-sensitive strings without the ugly L"" prefix or having to write an explicit case for char and wchar_t.
- w2n is a wrapper for wcstombs_s with a limit of 256 characters for the destination. It is meant to be used internally to handle conversion of small data.
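The technique behind a macro like _NORW can be sketched as follows. This is not the library's definition: `chooseLiteral` and `NARROW_OR_WIDE` are hypothetical names, and the sketch only shows the general pattern of pasting an L prefix onto a literal and selecting the matching variant by character type.

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical helper: returns the narrow or the wide literal depending on CharT.
template <typename CharT>
[[nodiscard]] constexpr const CharT* chooseLiteral(const char* narrow, const wchar_t* wide)
{
    if constexpr (std::is_same_v<CharT, char>) return narrow;
    else return wide;
}

// Hypothetical macro name; L##Literal produces the wide twin of the literal.
#define NARROW_OR_WIDE(CharT, Literal) chooseLiteral<CharT>(Literal, L##Literal)
```

Inside a template over CharT, `NARROW_OR_WIDE(CharT, "{}://{}")` then yields a `const char*` or a `const wchar_t*` from a single spelling of the literal.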