SiddiqSoft.SplitUri

Header-only C++20 library for parsing a Uri.

Getting started

Dependencies

None. However, if your project uses the nlohmann.json library, the serializers for that library are enabled.


API

The anatomy of a Uri can be summarized as follows:

           userinfo       host      port
           ┌──┴───┐ ┌──────┴──────┐ ┌┴┐
   https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top
   └─┬─┘   └───────────┬──────────────┘└───────┬───────┘ └───────────┬─────────────┘ └┬┘
   scheme          authority                  path                 query           fragment

In this library, we focus on the http and https schemes.
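
The diagram above can be read as a short sequence of delimiter searches. The following is a minimal sketch of that idea and not the library's implementation: `UriParts`, `splitParts` and `splitPartsExample` are hypothetical names, and the code performs no validation or percent-decoding.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical holder for the five top-level pieces shown in the diagram.
struct UriParts
{
    std::string scheme, authority, path, query, fragment;
};

// Splits "scheme://authority/path?query#fragment" into its pieces.
// Illustrative only: no validation, no percent-decoding.
inline UriParts splitParts(std::string_view uri)
{
    UriParts out;

    // Fragment: everything after the first '#'.
    if (auto pos = uri.find('#'); pos != std::string_view::npos) {
        out.fragment = std::string(uri.substr(pos + 1));
        uri          = uri.substr(0, pos);
    }
    // Query: everything after the first '?'.
    if (auto pos = uri.find('?'); pos != std::string_view::npos) {
        out.query = std::string(uri.substr(pos + 1));
        uri       = uri.substr(0, pos);
    }
    // Scheme: everything before "://".
    if (auto pos = uri.find("://"); pos != std::string_view::npos) {
        out.scheme = std::string(uri.substr(0, pos));
        uri        = uri.substr(pos + 3);
    }
    // The authority runs up to the first '/'; the rest is the path.
    const auto slash = uri.find('/');
    out.authority    = std::string(uri.substr(0, slash));
    if (slash != std::string_view::npos) out.path = std::string(uri.substr(slash));
    return out;
}

// Usage: the Uri from the diagram.
inline void splitPartsExample()
{
    const auto p = splitParts(
        "https://john.doe@www.example.com:123/forum/questions/?tag=networking&order=newest#top");
    assert(p.scheme == "https");
    assert(p.authority == "john.doe@www.example.com:123");
    assert(p.path == "/forum/questions/");
    assert(p.query == "tag=networking&order=newest");
    assert(p.fragment == "top");
}
```

Searching for the fragment and query first keeps a `/`, `?` or `:` that appears late in the string from being mistaken for an authority or path delimiter.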


class siddiqsoft::Uri

Signature

    template <typename T = char, class Auth = AuthorityHttp<T>>
        requires (same_as<char, T> || same_as<wchar_t, T>) &&
                  same_as<Auth, AuthorityHttp<T>>
    class Uri
    {
    private:
        basic_string<T>                       sourceUri {};

    public:
        UriScheme                             scheme {UriScheme::WebHttp};
        Auth                                  authority {};
        vector<std::basic_string<T>>          path {};
        map<basic_string<T>, basic_string<T>> query {};
        basic_string<T>                       fragment {};
        basic_string<T>                       urlPart {};
        basic_string<T>                       queryPart {};

        Uri();
        Uri(const basic_string<T>& aEndpoint);

                              operator basic_string<T>() const;
        const basic_string<T> string() const;
        void                  split(const basic_string<T>& aEndpoint);
    };

We use a requires clause (from <concepts>) to ensure that the character type is either char or wchar_t, so only std::string and std::wstring are accepted. Support for char8_t may be added when possible.

Member Variables

| Variable | Type | Description |
|----------|------|-------------|
| sourceUri | std::basic_string<CharT> | Stores the original string. |
| scheme | UriScheme | Represents the scheme for this Uri. Defaults to {UriScheme::WebHttp}. |
| authority | Auth | Template type representing the authority. The current implementation defaults to AuthorityHttp<> and must not be changed; the requires clause enforces this. |
| path | std::vector<std::basic_string<CharT>> | An array of strings representing the segments in the path. |
| query | std::map<std::basic_string<CharT>,std::basic_string<CharT>> | A collection of key-value elements in the query section. |
| fragment | std::basic_string<CharT> | The fragment segment of the Uri. |
| urlPart | std::basic_string<CharT> | Shortcut to the url segment (the remainder after the authority section). |
| queryPart | std::basic_string<CharT> | Shortcut to the query segment. |
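
The path and query members hold the decomposed form of their respective segments. As a rough sketch of what that decomposition could look like (an assumption for illustration, not the library's code), the hypothetical helpers `pathSegments` and `queryPairs` below split on `/` and on `&` and `=`:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Splits "/forum/questions/" into {"forum", "questions"}; empty segments are skipped.
inline std::vector<std::string> pathSegments(std::string_view path)
{
    std::vector<std::string> out;
    std::size_t              start = 0;
    while (start <= path.size()) {
        auto end = path.find('/', start);
        if (end == std::string_view::npos) end = path.size();
        if (end > start) out.emplace_back(path.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

// Splits "flag&q=siddiqsoft" into {"flag": "", "q": "siddiqsoft"}.
inline std::map<std::string, std::string> queryPairs(std::string_view query)
{
    std::map<std::string, std::string> out;
    std::size_t                        start = 0;
    while (start < query.size()) {
        auto end = query.find('&', start);
        if (end == std::string_view::npos) end = query.size();
        const auto item = query.substr(start, end - start);
        if (!item.empty()) {
            // A key without '=' maps to an empty value.
            const auto eq    = item.find('=');
            const auto key   = item.substr(0, eq);
            const auto value = (eq == std::string_view::npos) ? std::string_view {} : item.substr(eq + 1);
            out.emplace(std::string(key), std::string(value));
        }
        start = end + 1;
    }
    return out;
}

// Usage with the path and query of a search Uri.
inline void segmentsExample()
{
    const auto segments = pathSegments("/search");
    assert(segments.size() == 1 && segments[0] == "search");

    const auto pairs = queryPairs("flag&q=siddiqsoft");
    assert(pairs.at("flag").empty());
    assert(pairs.at("q") == "siddiqsoft");
}
```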

Member Functions

Uri::Uri

    Uri();

Default (empty) constructor.

Uri::Uri

    Uri(const std::basic_string<CharT>&) noexcept;

Delegates to the split function.

Uri::split

    static Uri<CharT, Auth> split(const std::basic_string<CharT>&) noexcept(false);

Available since v1.8.0. This method performs the parsing, splitting and general decomposition of the source string.
It may throw, as indicated by noexcept(false).

returns

A Uri<> object

Uri::operator basic_string()

    operator basic_string<CharT>() const;

Convenience operator; delegates to string().

returns

A std::basic_string<CharT> is returned.

Uri::string

    const std::basic_string<CharT> string() const;

Rebuild the Uri string.

returns

Returns the sourceUri member variable or rebuilds the string via std::format.


struct siddiqsoft::AuthorityHttp

Signature

    template <typename CharT>
        requires std::same_as<char, CharT> || std::same_as<wchar_t, CharT>
    struct AuthorityHttp
    {
        std::basic_string<CharT> userInfo {};
        std::basic_string<CharT> host {};
        uint16_t                 port {0};

        operator std::basic_string<CharT>() const;
    };

Member Variables

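| Variable | Type | Description |
|----------|------|-------------|
| userInfo | std::basic_string<CharT> | The user information component of the authority (john.doe in the anatomy diagram). |
| host | std::basic_string<CharT> | The host component of the authority (www.example.com in the anatomy diagram). |
| port | uint16_t | The port component of the authority (123 in the anatomy diagram). The member initializer sets it to 0. |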

Member Functions

    operator std::basic_string<CharT>() const;

Converts the authority to a std::basic_string<CharT>.
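
The relationship between the three authority fields and their textual form can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `AuthoritySketch`, `parseAuthority` and `toString` are hypothetical names, IPv6 literals are not handled, and a port of 0 is simply omitted when rebuilding the text.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical stand-in for the three authority fields described above.
struct AuthoritySketch
{
    std::string   userInfo {};
    std::string   host {};
    std::uint16_t port {0};
};

// Parses "[userinfo@]host[:port]". IPv6 literals are not handled.
inline AuthoritySketch parseAuthority(std::string_view text)
{
    AuthoritySketch a;
    if (auto at = text.find('@'); at != std::string_view::npos) {
        a.userInfo = std::string(text.substr(0, at));
        text       = text.substr(at + 1);
    }
    if (auto colon = text.rfind(':'); colon != std::string_view::npos) {
        const auto digits = text.substr(colon + 1);
        // from_chars leaves the port untouched (0) when the digits are invalid.
        std::from_chars(digits.data(), digits.data() + digits.size(), a.port);
        text = text.substr(0, colon);
    }
    a.host = std::string(text);
    return a;
}

// Rebuilds the textual form; the port is omitted when it is 0.
inline std::string toString(const AuthoritySketch& a)
{
    std::string out;
    if (!a.userInfo.empty()) out += a.userInfo + "@";
    out += a.host;
    if (a.port != 0) out += ":" + std::to_string(a.port);
    return out;
}

// Usage with the authority from the anatomy diagram.
inline void authorityExample()
{
    const auto a = parseAuthority("john.doe@www.example.com:123");
    assert(a.userInfo == "john.doe");
    assert(a.host == "www.example.com");
    assert(a.port == 123);
    assert(toString(a) == "john.doe@www.example.com:123");
}
```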

enum siddiqsoft::UriScheme

Signature

    enum class UriScheme
    {
        WebHttp,  WebHttps,
        Ldap,  Mailto,  News,  Tel,  Telnet,  Urn, // Not supported
        Unknown
    };
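
As an illustration of how the two supported schemes might be recognized, the sketch below repeats the enumeration locally so that it compiles on its own. `schemeFromText` and `defaultPort` are hypothetical helpers, not library functions, and mapping every other scheme to `Unknown` is an assumption of this sketch.

```cpp
#include <cstdint>
#include <string_view>

// Local copy of the enumeration so the sketch is self-contained.
enum class UriScheme
{
    WebHttp, WebHttps,
    Ldap, Mailto, News, Tel, Telnet, Urn, // Not supported
    Unknown
};

// Hypothetical helper: maps scheme text to an enumerator.
constexpr UriScheme schemeFromText(std::string_view text)
{
    if (text == "http") return UriScheme::WebHttp;
    if (text == "https") return UriScheme::WebHttps;
    return UriScheme::Unknown; // everything else is treated as unsupported here
}

// Hypothetical helper: well-known default port for the two web schemes.
constexpr std::uint16_t defaultPort(UriScheme scheme)
{
    switch (scheme) {
        case UriScheme::WebHttp: return 80;
        case UriScheme::WebHttps: return 443;
        default: return 0;
    }
}

static_assert(defaultPort(schemeFromText("https")) == 443);
static_assert(defaultPort(schemeFromText("http")) == 80);
```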

static siddiqsoft::SplitUri

    template <typename T = char>
        requires(std::same_as<char, T> || std::same_as<wchar_t, T>)
    static Uri<T, AuthorityHttp<T>> SplitUri(const std::basic_string<T>& aEndpoint);

    template <typename T = char>
        requires(std::same_as<char, T> || std::same_as<wchar_t, T>)
    static Uri<T, AuthorityHttp<T>> SplitUri(const T* aEndpoint);

This convenience wrapper instantiates a Uri<> object via its constructor.

NOTE: SplitUri(const T*) exists in addition to SplitUri(const std::basic_string<T>&) because a T* argument is not implicitly converted to basic_string<T> when the template parameter is deduced.
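
The need for the pointer overload can be reproduced in isolation. In the sketch below, `countChars` is a hypothetical function template that stands in for SplitUri; only the overload pattern is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Deduction needs an actual basic_string<T> argument. A string literal
// does not match, because implicit conversions are not considered while
// T is being deduced.
template <typename T>
std::size_t countChars(const std::basic_string<T>& text)
{
    return text.size();
}

// The pointer overload lets T be deduced from a string literal and
// forwards to the basic_string version.
template <typename T>
std::size_t countChars(const T* text)
{
    return countChars(std::basic_string<T>(text));
}

// Usage: all three argument kinds resolve to a viable overload.
inline void countCharsExample()
{
    assert(countChars("abc") == 3);                 // const char*    -> T = char
    assert(countChars(L"wide") == 4);               // const wchar_t* -> T = wchar_t
    assert(countChars(std::string("hello")) == 5);  // basic_string<char>
}
```

Without the second overload, the first two calls would fail to compile even though std::string is constructible from a string literal.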

static siddiqsoft::w2n

Signature

    [[nodiscard]] static inline std::string w2n(const std::wstring& ws);

Wrapper around wcstombs_s with the destination limited to 256 characters. It is used internally by the various serializer functions and is meant for converting small data sets.
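
For readers on toolchains where wcstombs_s is not available, a comparable bounded conversion can be sketched with the standard std::wcstombs. `narrowSketch` is a hypothetical name, and the truncation and error behavior shown here are assumptions of this sketch rather than documented behavior of w2n.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for w2n that uses std::wcstombs.
[[nodiscard]] inline std::string narrowSketch(const std::wstring& ws)
{
    std::array<char, 256> buffer {}; // fixed-size destination, zero-initialized
    // Convert at most 255 bytes so the buffer always stays null-terminated.
    const std::size_t written = std::wcstombs(buffer.data(), ws.c_str(), buffer.size() - 1);
    if (written == static_cast<std::size_t>(-1)) return {}; // unconvertible character
    return std::string(buffer.data(), written);
}

// Usage: short ASCII input round-trips; long input is truncated.
inline void narrowExample()
{
    assert(narrowSketch(L"hello") == "hello");
    assert(narrowSketch(std::wstring(300, L'a')).size() == 255);
}
```

The result of the conversion depends on the current C locale; plain ASCII input converts the same way in the default "C" locale.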


Examples

#include <format>
#include <iostream>
#include "siddiqsoft/SplitUri.hpp"
// ...
using namespace siddiqsoft::literals;
// Use the literal operator helper.
auto uri = "https://www.google.com/search?q=siddiqsoft"_Uri;
// Outputs https://www.google.com/search?q=siddiqsoft
std::cout << std::format("{}", uri) << std::endl;

A more thorough example

using namespace siddiqsoft::literals;

auto u = "https://www.google.com/search?flag&q=siddiqsoft#v1"_Uri;

std::cerr << u.authority.host << std::endl;
std::cerr << u.authority.port << std::endl;
std::cerr << u.urlPart << std::endl;
std::cerr << u.queryPart << std::endl;
std::cerr << u.fragment << std::endl;
std::cerr << nlohmann::json(u.path).dump() << std::endl;
std::cerr << nlohmann::json(u.query).dump() << std::endl;
std::cerr << std::format("{}", u.scheme) << "...." << nlohmann::json(u.scheme).dump() << std::endl;
std::cerr << std::format("{}", u.authority) << std::endl;
std::cerr << std::format("{}", u) << std::endl;

And the corresponding output

www.google.com
443
/search?flag&q=siddiqsoft#v1
flag&q=siddiqsoft
v1
["search"]
{"flag":"", "q":"siddiqsoft"}
https...."https"
www.google.com:443
https://www.google.com/search?flag&q=siddiqsoft#v1

Notes

There are two internal functions used by the library. They may be used as you wish, but note the inline documentation and accommodate accordingly.