Making HTTP REST Request in C++

Introduction

Today, I am going to show you on how to make HTTP request to a REST server using C++ Requests library by Huu Nguyen. Mr Nguyen is heavily influenced by Python Requests design philosoply when writing C++ Requests. Those who had used or familiar with Python Requests, should feel right at home with C++ Requests.

To demostrate our client code, we need a web server that we can make our request to so in our case, we’ll use ASP.NET Web API version 2 to implement our CRUD API. The web server is not this article’s focus but I shall still devote some time to explain the Web API code. For those readers not interested in the server code (because they are not using ASP.NET), they can skip to the client section.

ASP.NET Web API

I am not going to go through the details on how to setup the ASP.NET Web API project. Interested readers can read the this tutorial and that tutorial provided by Microsoft.

The Web API is based loosely on MVC design. MVC stands for Model, View and Controller. Model represents the data layer, usually they are class that models after data design in storage, View represents the presentation layer while Controller is the business logic layer. Strictly speaking, a pure ASP.NET Web API server does not serve out HTML pages, so it does not have the presentation layer. In our example, we have the Product class as our data Model.

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Qty { get; set; }
    public decimal Price { get; set; }
}

In our ProductsController, Product is stored in a static Dictionary which is not persistent, meaning to say the data disappear after the Web API server shuts down. But this should suffice for our demostration without involving the use of database.

public class ProductsController : ApiController
{
    static Dictionary products = new Dictionary();

This is the Create method to create a Product object to store in our dictionary. [FromBody] attribute means the Product item shall be populated with the contents found in the request body.

[HttpPost]
public IHttpActionResult Create([FromBody] Product item)
{
    if (item == null)
    {
        return BadRequest();
    }

    products[item.Id] = item;

    return Ok();
}

To test our code, I use curl command. If you have already had Postman installed, you can use that as well. I am old school, so I prefer to use curl command directly.

curl -XPOST http://localhost:51654/api/products/create -H 'Content-Type: application/json' -d'{"Id":1, "Name":"ElectricFan","Qty":14,"Price":20.90}'
  • -X specifies the HTTP verb, POST which corresponds to create or update method.
  • 2nd argument is the URL to which this request should go to.
  • -H specifies the HTTP headers. We send JSON string so we set ‘Content-Type’ to ‘application/json’.
  • -d specifies the content body of the request. It can be seen clearly that the keys in the json dictionary correspond exactly to the Product members.

The output returned by curl is empty when the post request is successful. To see our created Product, we need to have the retrieval method which is discussed shortly below.

The methods to retrieve all products and a single product are listed below. Note: HTTP GET verb is used for data retrieval.

[HttpGet]
public List GetAll()
{
    List temp = new List();
    foreach(var item in products)
    {
        temp.Add(item.Value);
    }
    return temp;
}

[HttpGet]
public IHttpActionResult GetProduct(long id)
{
    try
    {
        return Ok(products[id]);
    }
    catch (System.Collections.Generic.KeyNotFoundException)
    {
        return NotFound();
    }
}

The respective curl commands retrieve all and a single Product based on id (which is 1). The commandline argument is similar to what I have explained in above, so I skip them.

curl -XGET http://localhost:51654/api/products'

curl -XGET http://localhost:51654/api/products/1'

The output is

[{"Id":1,"Name":"ElectricFan","Qty":14,"Price":20.90}]

{"Id":1,"Name":"ElectricFan","Qty":14,"Price":20.90}

We see the 1st output is enclosed by [] because 1st command returns a collection of Product objects but in our case, we only have 1 Product right now.

Lastly, we have the Update and Delete method. The difference between HTTP POST and PUT verb, is that PUT is purely a update method whereas POST create the object if it does not exist but POST can used for updating as well.

[HttpPut]
public IHttpActionResult Update(long id, [FromBody] Product item)
{
    if (item == null || item.Id != id)
    {
        return BadRequest();
    }

    if(products.ContainsKey(id)==false)
    {
        return NotFound();
    }
    var product = products[id];

    product.Name = item.Name;
    product.Qty = item.Qty;
    product.Price = item.Price;

    return Ok();
}
[HttpDelete]
public IHttpActionResult Delete(long id)
{
    var product = products[id];
    if (product == null)
    {
        return NotFound();
    }

    products.Remove(id);

    return Ok();
}

The respective curl commands below updates and deletes Product correspond to id=1.

curl -XPUT http://localhost:51654/api/products/1 -H 'Content-Type: application/json' -d'{"Id":1, "Name":"ElectricFan","Qty":15,"Price":29.80}'

curl -XDELETE http://localhost:51654/api/products/1

To see the Product is really updated or deleted, we have to use the retrieval curl command shown above.

C++ Client Code

At last, we have to come to main focus of this article! To able to use C++ Requests, please clone or download it at here and include its cpr header. Alternatively for Visual C++ users, you can install C++ Requests via vcpkg. C++ Requests is abbreviated as cpr in vcpkg.

.\vcpkg install cpr
#include <cpr/cpr.h>

To send a POST request to create a Product, we put our Product json in raw string literal inside the cpr::Body. Otherwise, without raw string literal, we have to escape all the double quotes found in our json string.

auto r = cpr::Post(cpr::Url{ "http://localhost:51654/api/products/create" },
    cpr::Body{ R"({"Id":1, "Name":"ElectricFan","Qty":14,"Price":20.90})" },
    cpr::Header{ { "Content-Type", "application/json" } });

Compare this C++ code to raw curl command, we can see which information goes to where.

curl -XPOST http://localhost:51654/api/products/create -H 'Content-Type: application/json' -d'{"Id":1, "Name":"ElectricFan","Qty":14,"Price":20.90}'

After product creation, we try to retrieve it using the C++ code below.

auto r = cpr::Get(cpr::Url{ "http://localhost:51654/api/products/1" });

The output is the same as the one from curl command which isn’t strange since C++ Requests utilize libcurl underneath to do its work.

{"Id":1,"Name":"ElectricFan","Qty":14,"Price":20.90}

The full C++ code to do CRUD with ASP.NET Web API is listed below with its output. By the way, CRUD is short for Create, Retrieve, Update and Delete. Be sure your ASP.NET Web API is up and running before running the C++ code below.

int main()
{
    {
        std::cout << "Action: Create Product with Id = 1" << std::endl;
        auto r = cpr::Post(cpr::Url{ "http://localhost:51654/api/products/create" },
            cpr::Body{ R"({"Id":1, "Name":"ElectricFan","Qty":14,"Price":20.90})" },
            cpr::Header{ { "Content-Type", "application/json" } });
        std::cout << "Returned Status:" << r.status_code << std::endl;
    }
    {
        std::cout << "Action: Retrieve the product with id = 1" << std::endl;
        auto r = cpr::Get(cpr::Url{ "http://localhost:51654/api/products/1" });
        std::cout << "Returned Text:" << r.text << std::endl;
    }
    {
        std::cout << "Action: Update Product with Id = 1" << std::endl;
        auto r = cpr::Post(cpr::Url{ "http://localhost:51654/api/products/1" },
            cpr::Body{ R"({"Id":1, "Name":"ElectricFan","Qty":15,"Price":29.80})" },
            cpr::Header{ { "Content-Type", "application/json" } });
        std::cout << "Returned Status:" << r.status_code << std::endl;
    }
    {
        std::cout << "Action: Retrieve all products" << std::endl;
        auto r = cpr::Get(cpr::Url{ "http://localhost:51654/api/products" });
        std::cout << "Returned Text:" << r.text << std::endl;
    }
    {
        std::cout << "Action: Delete the product with id = 1" << std::endl;
        auto r = cpr::Delete(cpr::Url{ "http://localhost:51654/api/products/1" });
        std::cout << "Returned Status:" << r.status_code << std::endl;
    }
    {
        std::cout << "Action: Retrieve all products" << std::endl;
        auto r = cpr::Get(cpr::Url{ "http://localhost:51654/api/products" });
        std::cout << "Returned Text:" << r.text << std::endl;
    }

    return 0;
}

The output as mentioned is shown below. I only display the returned text when the CRUD supports it, otherwise I just display the status. HTTP Status 200 means successful HTTP request. For example, Create/Update/Delete operation does not return any text, so I just display their status.

Action: Create Product with Id = 1
Returned Status:200

Action: Retrieve the product with id = 1
Returned Text:{"Id":1,"Name":"ElectricFan","Qty":14,"Price":20.90}

Action: Update Product with Id = 1
Returned Status:200

Action: Retrieve all products
Returned Text:[{"Id":1,"Name":"ElectricFan","Qty":15,"Price":29.80}]

Action: Delete the product with id = 1
Returned Status:200

Action: Retrieve all products
Returned Text:[]

For users looking to send request with parameters like below, you can make use of the cpr::Parameters.

http://www.example.com/products?quota=500&sold=true

C++ code for the above url example.

auto r = cpr::Get(cpr::Url{ "http://www.example.com/products" },
    cpr::Parameters{{"quota", "500"}, {"sold", "true"}});

Source code written for this article is hosted at Github.

 

C++11: Compile-time Conditional Types

Introduction

C++11 introduces std::conditional to give C++ developer the flexibility to choose a type based on the compile-time condition.

template< bool B, class T, class F >
struct conditional;

If the boolean parameter of the std::conditional is true, then the delved type is class T or else it is class F. Below is an example on how to use std::conditional. Before to use std::conditional, we have to include type_traits header. The typeinfo header is included because of typeid.

#include <iostream>
#include <type_traits>
#include <typeinfo>

int main() 
{
    typedef std::conditional<true, int, double>::type Type1;
    typedef std::conditional<false, int, double>::type Type2;
 
    std::cout << typeid(Type1).name() << '\n';
    std::cout << typeid(Type2).name() << '\n';
}

Output of the program is given below.

int
double

Applying to Endian Swapping

Let us apply what we have just learnt to implement endian swapping. We can determine whether to do endian swap at runtime. However we can make that decision in compile time to get rid of the runtime check to stave off some execution time. The obvious disadvantage to this approach is we cannot change our decision during runtime. Let us delve into the code explanation. First we define our Endian enum.

enum class Endian
{
    Big,
    Little
};
using BigEndian = std::integral_constant<Endian, Endian::Big>;
using LittleEndian = std::integral_constant<Endian, Endian::Little>;

The swap_endian_if_same_endian_is_false functions are defined below. The 2nd parameter determine whether the endian of the platform and data are the same. If it is false_type, then it must be swapped. We do nothing in the true_type case.

template<typename T>
void swap_endian_if_same_endian_is_false(T& val, std::false_type) // false_type means different endian, therefore must swap
{
    std::is_arithmetic<T> is_arithmetic_type;

    swap_endian_if_arithmetic(val, is_arithmetic_type);
}

template<typename T>
void swap_endian_if_same_endian_is_false(T& val, std::true_type)
{
    // same endian so do nothing.
}

We can determine the endian of the platform and data are the same with std::is_same. For this article, we are not going to bother using std::is_same. We short-circuit the check with std::false_type to force swapping.

using same_endian_type = std::is_same;

In the swap function mentioned above, we use is_arithmetic to check the type is integer or floating point before calling swap_endian_if_arithmetic. If T is not arithmetic, it is a no-op.

template<typename T>
void swap_endian_if_arithmetic(T& val, std::true_type)
{
    swap_endian(val, number_type<T>());
}

template<typename T>
void swap_endian_if_arithmetic(T& val, std::false_type)
{
    // T is not arithmetic so do nothing
}

These are the 5 overloaded swap_endian functions.

template<typename T>
void swap_endian(T& ui, UnknownSize)
{
}

template<typename T>
void swap_endian(T& ui, SizeOf1)
{
}

template<typename T>
void swap_endian(T& ui, SizeOf8)
{
    union EightBytes
    {
        T ui;
        uint8_t arr[8];
    };

    EightBytes fb;
    fb.ui = ui;
    // swap the endian
    std::swap(fb.arr[0], fb.arr[7]);
    std::swap(fb.arr[1], fb.arr[6]);
    std::swap(fb.arr[2], fb.arr[5]);
    std::swap(fb.arr[3], fb.arr[4]);

    ui = fb.ui;
}

template<typename T>
void swap_endian(T& ui, SizeOf4)
{
    union FourBytes
    {
        T ui;
        uint8_t arr[4];
    };

    FourBytes fb;
    fb.ui = ui;
    // swap the endian
    std::swap(fb.arr[0], fb.arr[3]);
    std::swap(fb.arr[1], fb.arr[2]);

    ui = fb.ui;
}

template<typename T>
void swap_endian(T& ui, SizeOf2)
{
    union TwoBytes
    {
        T ui;
        uint8_t arr[2];
    };

    TwoBytes fb;
    fb.ui = ui;
    // swap the endian
    std::swap(fb.arr[0], fb.arr[1]);

    ui = fb.ui;
}

Which swap_endian function is selected by C++ compiler is determined by the 2nd parameter type which are empty structure with a default constructor.

struct SizeOf1 { SizeOf1() { std::cout << "Size:1" << std::endl; } };
struct SizeOf2 { SizeOf2() { std::cout << "Size:2" << std::endl; } };
struct SizeOf4 { SizeOf4() { std::cout << "Size:4" << std::endl; } };
struct SizeOf8 { SizeOf8() { std::cout << "Size:8" << std::endl; } };
struct UnknownSize { UnknownSize() { std::cout << "Size:Unknown" << std::endl; } };

number_type is alias template which make use of nested std::conditional to determine the type to be SizeOf1, SizeOf2, SizeOf4, SizeOf8 or UnknownSize based on the sizeof(T). sizeof(T) is always evaluated at compile time.

template <class T>
using number_type =
typename std::conditional<
    sizeof(T) == 1,
    SizeOf1,
    typename std::conditional<
    sizeof(T) == 2,
    SizeOf2,
    typename std::conditional<
    sizeof(T) == 4,
    SizeOf4,
    typename std::conditional<
    sizeof(T) == 8,
    SizeOf8,
    UnknownSize
    >::type
    >::type
    >::type
>::type;

Here is an example of swapping the integer twice. std::false_type is specified for the 2nd argument to force swapping.

int main()
{
    int num = 1;
    std::cout << num << std::endl;
    swap_endian_if_same_endian_is_false(num, std::false_type());
    std::cout << num << std::endl;
    swap_endian_if_same_endian_is_false(num, std::false_type());
    std::cout << num << std::endl;

    return 0;
}

The output of the main function is shown.

1
Size:4
16777216
Size:4
1

std::conditional is a useful tool we can add to our arsenal when used judiciously. The repository for this article is at Github.

References

Design Patterns in Modern C++ Book

DesignPatternCpp

Just as I was scouring Amazon for new C++ book. I came across this Design Patterns book with examples based on Modern C++ by Dmitri Nesteruk. A quick look at content table uncovers the null object pattern, curiously recurring template pattern and SOLID design principles in addition to the classic design patterns in the GoF book. I have watched Design Pattern lessons taught by same author on Pluralsight.

SolidDesign

If time permits, I’ll do a book review after I finish the book.

Use STL copy, Not memcpy to Copy Array

Many C++ developers like to use memcpy() to copy POD arrays to extract the maximum performance during the copy. See the POD structure below.

struct MyStruct
{
    int n;
    double d;
};

What if a new unsuspecting developer adds a string member and is not aware that the code uses memcpy() to do copying? memcpy() only makes shallow copies. The code no longer works correctly. This is where STL copy() comes to the rescue. If the array contains type which is TriviallyCopyable, it calls memmove(), else it calls the assignment operator. The calls are determined at compile time. memmove() is similar to memcpy() that they both do copying, except memmove() allows the destination and source to overlap.

struct MyStruct
{
    int n;
    double d;
    std::string s; // Unsuspecting developer add this member!
};

Use the debugger to step into the first copy() you find it uses memmove() while the second copy() does not.

#include <iostream>
#include <string>
#include <algorithm>
#include <vector>

// TriviallyCopyable structure
struct MyStruct
{
    int n;
    double d;
};

struct MyStruct2
{
    int n;
    double d;
    std::string s;
};

int main()
{
    std::vector<MyStruct> v1{ {1, 1.0}, {2, 2.0} };
    std::vector<MyStruct> v2{ 2 };
    std::copy(v1.begin(), v1.end(), v2.begin()); // calls memmove()

    for (const auto& o : v2)
    {
        std::cout << "After v2 copy, n:" << o.n << ", d:" << o.d << std::endl;
    }
    std::vector<MyStruct2> v3{ { 1, 1.0, "Hello" },{ 2, 2.0, "World" } };
    std::vector<MyStruct2> v4{ 2 };
    std::copy(v3.begin(), v3.end(), v4.begin()); // Does not call memmove()

    for (const auto& o : v4)
    {
        std::cout << "After v4 copy, n:" << o.n << ", d:" << o.d << ", s:" << o.s << std::endl;
    }

    return 0;
}

The tip is to use STL copy() wherever possible to copy array. copy() delegates the calls to memmove() when the type is TriviallyCopyable.

References

 

Erase-remove Idiom Revisited

Recently, I reread Erase-remove Idiom on classic Effective STL by Scott Meyers, which is dated by now, so I also consulted the same topic on another STL book published in 2017. How to remove elements from container is a common C++ interview question, so you may want to sit up and pay attention here.

std::vector contains an erase function to remove elements. The problem is once erase is called, all iterators are invalidated. That means if std::vector contains more than 1 element which satisfy the criteria to be removed, developer has to restart iterating by getting a new iterator from vector::begin(). A much better way is to use erase-remove Idiom. First, we use the STL remove/remove_if to move the removed elements to back of the vector container. STL remove/remove_if, despite their name, cannot remove anything from the std::vector because they operate on iterators and are not aware of the underlying container object. So after calling remove/remove_if, we have to call container’s erase to actually erase the elements.

remove takes in value to be removed as the last argument: Any element matching the same value is ‘removed’. Whereas remove_if takes in functor or lambda as predicate in the last argument: Any element which satisfies predicate is ‘removed’.

  // initializes a vector that holds the numbers from 0-9.
  std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
  print(v);
 
  // removes all elements with the value 5
  v.erase( std::remove( v.begin(), v.end(), 5 ), v.end() ); 
  print(v);

  // removes all odd numbers
  v.erase( std::remove_if(v.begin(), v.end(), is_odd), v.end() );
  print(v);
std::remove/std::remove_if Output:
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 6 7 8 9
0 2 4 6 8

From Wikipedia Erase–remove idiom

STL remove/remove_if returns Past-the-end iterator for the new range of values (if this is not end, then it points to an unspecified value, and so do iterators to any values between this iterator and end).

If no element is ‘removed’, remove/remove_if returns v.end(), so the erase call actually does nothing and no element is erased.

v.erase( v.end(), v.end() );

STL remove/remove_if shifts the rest of the elements to fill the gap left by removed element. The performance could be improved if maintaining relative order between the elements is not a requirement. Mentioned in Mastering the C++17 STL by Arthur O’Dwyer’, unstable_remove has been proposed for future standardization but has not yet adopted into the STL. unstable_remove does not shift the remaining element to fill the ‘hole’, instead it moves the last element to fill the ‘hole’.

namespace my 
{
    template<class BidirIt, class T>
    BidirIt unstable_remove(BidirIt first, BidirIt last, const T& value)
    {
        while (true) 
        {
            // Find the first instance of "value"...
            first = std::find(first, last, value);
            // ...and the last instance of "not value"...
            do 
            {
                if (first == last) 
                    return last;
                --last;
            } 
            while (*last == value);
            // ...and move the latter over top of the former.
            *first = std::move(*last);
            // Rinse and repeat.
            ++first;
        }
    }
    template<class BidirIt, class Pred>
    BidirIt unstable_remove_if(BidirIt first, BidirIt last, Pred predicate)
    {
        while (true) 
        {
            // Find the first instance of "value"...
            first = std::find_if(first, last, predicate);
            // ...and the last instance of "not value"...
            do 
            {
                if (first == last) 
                    return last;
                --last;
            } 
            while (predicate(*last));
            // ...and move the latter over top of the former.
            *first = std::move(*last);
            // Rinse and repeat.
            ++first;
        }
    }
} // namespace my

unstable_remove usage is the same as std::remove but their output is different.

  // initializes a vector that holds the numbers from 0-9.
  std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
  print(v);
 
  // removes all elements with the value 5
  v.erase( my::unstable_remove( v.begin(), v.end(), 5 ), v.end() ); 
  print(v);

  // removes all odd numbers
  v.erase( my::unstable_remove_if(v.begin(), v.end(), is_odd), v.end() );
  print(v);
my::unstable_remove/my::unstable_remove_if Output:
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 9 6 7 8
0 8 2 6 4

In his book, Arthur also noted:

For std::list container, std::list::erase can be much faster than the Erase-remove idiom on a std::list.

Special points of interest: This idiom can be used with std::unique where consecutive same values are moved to the back of the container. The erase–remove idiom cannot be used for containers that return const_iterator (for example, set). Another important note is erase–remove idiom can only be used with containers holding elements with full value semantics without incurring resource leaks, meaning raw pointer pointing to allocated memory should not be stored in the container, smart pointer is fine.

Reference

 

My IT Certificates

The blog post is for readers who are interested to see my IT certificates.

Certified Secure Software Lifecycle Professional(CSSLP)

CSSLPsmall

ITIL® Foundation Certificate(ITIL)

ITILsmall

EC-Council Certified Security Analyst(ECSA)

ECSAsmall

Certified Ethical Hacker(CEH)

CEHv8small

EC-Council Certified Secure Programmer(ECSP)

ECSPsmall

Project Management Professional(PMP). In Singapore, for Project Managers undertaking a IT project for government, they are required to be PMP certified. I became PMP certified to prepare for the PM role.

PMP

CodeProject Most Valued Author(2019)

CP MVA

CodeProject Most Valued Author(2020)

cp-mva2020

Pythonic Operators on STL Set Algorithms

 

Table of Contents

Introduction

Ever since I started working with Python and that have gotten me into alot thinking how to redesign my libraries to be pythonic, if I were to implement them from scratch again. In this first article of the series, I want to introduce Python’s wonderful and intuitive operators for working with set algebra into C++ world. These operators are nothing more than syntatic-sugar to reduce the amount of code to write.

Table of Python Set Operators

PythonSetOperators

Set Intersection

 

set_intersection2

C++ Reference: std::set_intersection

  • Commutative

set_intersection is an algorithm that produces a set of elements which are common to both sets. It is commutative, meaning that even the 2 sets are switched places, the algorithm returns the same result.

void intersection_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> v_intersection;

    std::set_intersection(v1.begin(), v1.end(),
        v2.begin(), v2.end(),
        std::back_inserter(v_intersection));

    for (int n : v_intersection)
        std::cout << n << ' ';
}

Output

5 7

This is the example using & operator to do intersection.

void intersection_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };

    std::vector<int> v_intersection = s(v1) & s(v2);

    for (int n : v_intersection)
        std::cout << n << ' ';
}

I skip showing the output of the operator example as it is the same.

s is a function, not class. If s is to be a class, to instantiate it, a container type would have to be specified (See below).

std::vector v_intersection = s(v1) & s(v2);

In order to make use of automatic type deduction, s has to be a function that does nothing but returns the wrapper class.

#include <algorithm>
#include <iterator>

template<typename T>
struct wrapper
{
    wrapper(T& container) : cont(container) {}
    T& cont;
};

template<typename T>
wrapper<T> s(T& s_cont)
{
    return wrapper<T>(s_cont);
}

The & operator function checks whether to sort the container. Since std::sort only works with random access iterators, so we cannot use this function with STL list and slist which has non-random access iterators. In my 15 years of work, I have not seen a single use of list in any codebase.

template<typename T>
T operator&(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T v_intersection;

    std::set_intersection(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(v_intersection));

    return v_intersection;
}

All set algorithm precondition requires the ranges to be sorted, hence this is_sorted check.

Set Union

 

set_union2

C++ Reference: std::set_union

  • Commutative

set_union is an algorithm that produces a set of elements from both sets. For the elements appearing in intersection, it always picks them from the 1st set, not 2nd set.

void union_example()
{
    std::vector<int> v1 = { 1, 2, 3, 4, 5 };
    std::vector<int> v2 = {       3, 4, 5, 6, 7 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> dest1;

    std::set_union(v1.begin(), v1.end(),
        v2.begin(), v2.end(),
        std::back_inserter(dest1));

    for (const auto &i : dest1) {
        std::cout << i << ' ';
    }
    std::cout << '\n';
}

Output

1 2 3 4 5 6 7

The code required to write is much lesser therefore the code is more concise.

void union_example()
{
    std::vector<int> v1 = { 1, 2, 3, 4, 5 };
    std::vector<int> v2 = {       3, 4, 5, 6, 7 };

    std::vector<int> dest1 = s(v1) | s(v2);

    for (int n : dest1)
        std::cout << n << ' ';
}

The | operator is almost similar to & operator except that algorithm is different.

template<typename T>
T operator|(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T dest1;

    std::set_union(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(dest1));

    return dest1;
}

Set Difference

set_difference2

C++ Reference: std::set_difference

  • Non-Commutative

set_difference returns the elements in 1st set which is not in 2nd set and is represented by minus operator in Python. For obvious reason, the results is different when the arguments are swapped place. set_difference is non-commutative like minus operation.

void set_difference_example() 
{
    std::vector<int> v1{ 1, 2, 5, 5, 5,    9 };
    std::vector<int> v2{    2, 5,       7 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> diff;

    std::set_difference(v1.begin(), v1.end(), v2.begin(), v2.end(),
        std::inserter(diff, diff.begin()));

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "minus ";
    for (auto i : v2) std::cout << i << ' ';
    std::cout << "is: ";

    for (auto i : diff) std::cout << i << ' ';
    std::cout << '\n';
}

Output

1 2 5 5 5 9 minus 2 5 7 is: 1 5 5 9

This is example with minus operator.

void set_difference_example()
{
    std::vector<int> v1{ 1, 2, 5, 5, 5, 9 };
    std::vector<int> v2{    2, 5,       7 };

    std::vector<int> diff = s(v1) - s(v2);

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "minus ";
    for (auto i : v2) std::cout << i << ' ';
    std::cout << "is: ";

    for (auto i : diff) std::cout << i << ' ';
    std::cout << '\n';
}

The code for minus operator is shown below.

template<typename T>
T operator-(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T diff;

    std::set_difference(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(diff));

    return diff;
}

Set Symmetric Difference

 

set_sym_difference2

C++ Reference: std::set_symmetric_difference

  • Commutative

set_symmetric_difference computes the elements in either set but not both.

void set_symmetric_difference_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> v_symDifference;

    std::set_symmetric_difference(
        v1.begin(), v1.end(),
        v2.begin(), v2.end(),
        std::back_inserter(v_symDifference));

    for (int n : v_symDifference)
        std::cout << n << ' ';
}

Output

1 2 3 4 6 8 9 10

set_symmetric_difference is represented by logical exclusive or operator.

void set_symmetric_difference_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };

    std::vector<int> v_symDifference = s(v1) ^ s(v2);

    for (int n : v_symDifference)
        std::cout << n << ' ';
}

The code for logical exclusive or operator is shown below.

template<typename T>
T operator^(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T v_symDifference;

    std::set_symmetric_difference(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(v_symDifference));

    return v_symDifference;
}

Superset and Subset

 

set_superset2

C++ Reference: std::includes

  • Non-Commutative

STL includes can be used to find out whether a set is a superset(returns a boolean). To check if it is subset, just switch the 2 sets.

void is_superset_example()
{
    std::vector<char> v1{ 'a', 'b', 'c', 'f', 'h', 'x' };
    std::vector<char> v2{ 'a', 'b', 'c' };
    std::vector<char> v3{ 'a', 'c' };
    std::vector<char> v4{ 'g' };
    std::vector<char> v5{ 'a', 'c', 'g' };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());
    std::sort(v3.begin(), v3.end());
    std::sort(v4.begin(), v4.end());
    std::sort(v5.begin(), v5.end());

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "\nincludes:\n" << std::boolalpha;

    for (auto i : v2) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v2.begin(), v2.end()) << '\n';
    for (auto i : v3) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v3.begin(), v3.end()) << '\n';
    for (auto i : v4) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v4.begin(), v4.end()) << '\n';
    for (auto i : v5) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v5.begin(), v5.end()) << '\n';

    auto cmp_nocase = [](char a, char b) {
        return std::tolower(a) < std::tolower(b);
    };

    std::vector<char> v6{ 'A', 'B', 'C' };
    for (auto i : v6) std::cout << i << ' ';
    std::cout << ": (case-insensitive) "
        << std::includes(v1.begin(), v1.end(), v6.begin(), v6.end(), cmp_nocase)
        << '\n';
}

Output

a b c f h x
includes:
a b c : true
a c : true
g : false
a c g : false
A B C : (case-insensitive) true

The >= operator example is below. The <= operator example is not shown in this article.

void is_superset_example()
{
    std::vector<char> v1{ 'a', 'b', 'c', 'f', 'h', 'x' };
    std::vector<char> v2{ 'a', 'b', 'c' };
    std::vector<char> v3{ 'a', 'c' };
    std::vector<char> v4{ 'g' };
    std::vector<char> v5{ 'a', 'c', 'g' };

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "\nincludes:\n" << std::boolalpha;

    for (auto i : v2) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v2)) << '\n';
    for (auto i : v3) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v3)) << '\n';
    for (auto i : v4) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v4)) << '\n';
    for (auto i : v5) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v5)) << '\n';

    auto cmp_nocase = [](char a, char b) {
        return std::tolower(a) < std::tolower(b);
    };

    std::vector<char> v6{ 'A', 'B', 'C' };
    for (auto i : v6) std::cout << i << ' ';
    std::cout << ": (case-insensitive) "
        << std::includes(v1.begin(), v1.end(), v6.begin(), v6.end(), cmp_nocase)
        << '\n';
}

User cannot opt for use of a custom comparator in the >= and <= overloaded operators at the moment, as shown in the case-insensitive example. In this situation, includes has to be called directly.

// Returns true if left is superset of right?
template<typename T>
bool operator>=(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;

    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    return std::includes(
        c1.begin(), c1.end(),
        c2.begin(), c2.end());
}

// Returns true if left is subset of right?
template<typename T>
bool operator<=(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;

    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    return std::includes(
        c2.begin(), c2.end(),
        c1.begin(), c1.end());
}

“I have no use for all these!”

 

Before you are quick to exclaim that you have no use for these set algorithms, I like to show to you a typical selection example where you can use this. Imagine you are writing a subject enrollment website for college students. On the form, there are currently selected subjects which the student added, and the available subject dropdown which student can pick. It makes sense to remove subject from available dropdown after addition because you do not want the student to accidentally add the same subject twice. One way to compute leftover subjects available for selection, is to just subtract the selected set from the complete set of subjects with minus operator introduced in this article.

Article source code is hosted at Github

C++ Summing For Loop Benchmark

 

Introduction

The initial motivation is to find out the overhead of different for-loop types in C++.

Code

Copy and paste below code into Godbolt Online C++ Compiler to see the generated assembly code. Note: The array or vector are initialized in the benchmark. The simplified code below is for you to copy and paste into the Godbolt Online C++ Compiler so that you only read the relevant assembly code, including other code just adds to noise where you have to wade through to find your assembly code.

Prior to using Godbolt, I was poring the assembly code generated by Visual C++ which was difficult to read because the optimized assembly code for each single C++ line were interleaved with assembly code for other lines. My guess is reason for doing that probably is to take advantage of the CPU pipelining so that code which are not dependent on the result of previous operation, can execute independently. Using Godbolt is the best and most easy way. Trust me.

#include <cstdint>
#include <algorithm>
#include <numeric>
#include <iterator>

const size_t LEN = 1000000;

// Increment For Loop
uint64_t func1()
{
    uint64_t vec[LEN];

    uint64_t sum = 0;
    for (size_t i = 0; i < LEN; ++i)
    {
        sum += vec[i];
    }
    return sum;
}

// Range For Loop
uint64_t func2()
{
    uint64_t vec[LEN];

    uint64_t sum = 0;
    for (auto n : vec)
    {
        sum += n;
    }
    return sum;
}

// Iterator For Loop
uint64_t func3()
{
    uint64_t vec[LEN];

    uint64_t sum = 0;
    for (auto it = std::cbegin(vec); it != std::cend(vec); ++it)
    {
        sum += *it;
    }
    return sum;
}

// Accumulator
uint64_t func4()
{
    uint64_t vec[LEN];

    uint64_t sum = 0;
    const uint64_t Zero = 0;

    sum = std::accumulate(std::cbegin(vec), std::cend(vec), Zero);
    return sum;
}

Test Machine: Intel i7 6700 at 3.4 GHz

Visual C++ 2017 (15.4 Update) Result

Please ignore the sum result. I display resultant sum to prevent compiler from optimizing away for loop. Visual C++ vectorized the code with SSE2.

 Increment For Loop:  599ms, sum:500000500000
     Range For Loop:  446ms, sum:500000500000
  Iterator For Loop:  558ms, sum:500000500000
        Accumulator:  437ms, sum:500000500000

Investigation shows multiplication by 8 for array index subscripting could be the culprit in the slowdown in the Increment For Loop.

sum += vec[i];
movdqu   xmm0, XMMWORD PTR vec$[rsp+rax*8] <== multiplication by 8

As for the Range For Loop, the address is incremented by 16 (= 8 + 8 because of loop unrolling), multiplication is not used to calculate the address. Accumulator code use the same tactics. Earlier in the decade, C programmers were baffled as to why std::accumulate was faster than for loop. Now we know the reason.

As for the Iterator For Loop poor result, my guess is the iterator overhead.

Cygwin clang++ 3.9.1 Result

clang++ generated the similar code for all 4 loops, hence, similar timing. clang++ vectorized the loops with SSE2. To compile the code with clang++, use the command below.

# clang++ ForLoopBenchmark.cpp -O2 -std=c++14
 Increment For Loop:  392ms, sum:500000500000
     Range For Loop:  406ms, sum:500000500000
  Iterator For Loop:  381ms, sum:500000500000
        Accumulator:  391ms, sum:500000500000

Cygwin g++ 5.4 Result

Like clang++, g++ also generated the similar code for all 4 loops, so they had similar timing but sadly, loops are not vectorized in O2. Specifying O3 vectorize all loops and result is on par with clang++’s O2. To compile the code with g++, use the command below.

g++ ForLoopBenchmark.cpp -O2 -std=c++14
 Increment For Loop:  558ms, sum:500000500000
     Range For Loop:  552ms, sum:500000500000
  Iterator For Loop:  542ms, sum:500000500000
        Accumulator:  544ms, sum:500000500000

“Is this information even useful?”

There is FIX Protocol (for financial markets) which makes use of summing to calculate simple checksum of its message.

If you find Godbolt useful, do consider becoming a patreon of Matt Godbolt to show your appreciation and help him to defray monthly server costs. Of course, donation is not obligatory. I have been his patreon since December.

Benchmark source code is hosted here. Thanks for reading!

C++ Tip: Modification Inside const Function

 

Introduction

There are times where modification inside const member function must be done (for example, to allow for caching or memoization). The mutable keyword and const_cast are well-known in the C++ circles to work around these. The 3rd way is to use a const pointer; we cannot modify the const pointer but we can modify the contents it pointed to. Let’s dive in and discover for ourselves.

mutable Keyword

We will do a quick recap on the mutable keyword. A lock have to be acquired on mutex. So by declaring the mutex mutable, it can be used inside a const function.

class MyClass1
{
public:
    MyClass1(int value) : m_Value(value) {}

    int GetValue() const
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Value;
    }
private:
    int m_Value;
    mutable std::mutex m_Mutex;
};

int GetValue(const MyClass1& c)
{
    return c.GetValue();
}

Cast Away Constness

Next example, we will declare the member function as non-const and do a const_cast(). Typically C++ developers who do not know the mutable keyword will use this technique.

class MyClass2
{
public:
    MyClass2(int value) : m_Value(value) {}

    int GetValue() // not const
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Value;
    }
private:
    int m_Value;
    std::mutex m_Mutex;
};
int GetValue(const MyClass2& c)
{
    MyClass2& c2 = const_cast(c); // cast away the constness
    return c2.GetValue();
}

Const Pointer to Member

As advertised, the data member can be modified though the member pointer.

class MyClass3
{
public:
    MyClass3(int value) : m_Value(value), m_pMutex(&m_Mutex) {}

    int GetValue() const
    {
        std::lock_guard<std::mutex> lock(*m_pMutex);
        return m_Value;
    }
private:
    int m_Value;
    std::mutex m_Mutex;
    std::mutex* m_pMutex;
};
int GetValue(const MyClass3& c)
{
    return c.GetValue();
}

Local Pointer to Member Does Not Compile

If local pointer is used, compiler will complain “error C2440: 'initializing': cannot convert from 'const std::mutex *' to 'std::mutex *‘”.

// cannot work.
class MyClass4
{
public:
    MyClass4(int value) : m_Value(value) {}

    int GetValue() const
    {
        std::mutex* pMutex = &m_Mutex; // compiler complain
        std::lock_guard<std::mutex> lock(*pMutex);
        return m_Value;
    }
private:
    int m_Value;
    std::mutex m_Mutex;
};
int GetValue(const MyClass4& c)
{
    return c.GetValue();
}

unique_ptr

Instead of raw pointer, using unique_ptr works as well.

class MyClass5
{
public:
    MyClass5(int value) : m_Value(value), m_pMutex(std::make_unique()) {}

    int GetValue() const
    {
        std::lock_guard<std::mutex> lock(*m_pMutex);
        return m_Value;
    }
private:
    int m_Value;
    std::unique_ptr<std::mutex> m_pMutex;
};
int GetValue(const MyClass5& c)
{
    return c.GetValue();
}

The example code is hosted at Github.

C++ Tip: Make Your Class Non-Copyable Without Boost

Introduction

There are times where an object should never be passed by copy but by reference or pointer. For instance, the class has a data member (like counter or mutex) which should not be duplicated. In this tip, we take a look at 2 techniques which declare the class to be non-copyable without resorting to using Boost’s Noncopyable class.

Use delete Keyword

Delete the copy constructor and assignment operator. This works for C++11 and above.

// Works for C++11 and above
class DeletedCopyFunc
{
public:
    DeletedCopyFunc(int value): m_Value(value) {}
public:
    DeletedCopyFunc(const DeletedCopyFunc&) = delete;
    DeletedCopyFunc& operator=(const DeletedCopyFunc&) = delete;

private:
    int m_Value;
    std::mutex m_Mutex;
};

Make private

Declaring the copy constructor and assignment operator private is another way and it is perfectly fine not to define their bodies. This technique works for all C++ versions.

// Works for all C++ versions
class PrivateCopyFunc
{
public:
    PrivateCopyFunc(int value) : m_Value(value) {}
private:
    PrivateCopyFunc(const PrivateCopyFunc&);
    PrivateCopyFunc& operator=(const PrivateCopyFunc&);

private:
    int m_Value;
    std::mutex m_Mutex;
};

How Boost Does It?

It can be seen from the Boost noncopyable source that it also uses the same techniques.

class noncopyable
  {
  protected:
#if !defined(BOOST_NO_CXX11_DEFAULTED_FUNCTIONS) && 
         !defined(BOOST_NO_CXX11_NON_PUBLIC_DEFAULTED_FUNCTIONS)
      BOOST_CONSTEXPR noncopyable() = default;
      ~noncopyable() = default;
#else
      noncopyable() {}
      ~noncopyable() {}
#endif
#if !defined(BOOST_NO_CXX11_DELETED_FUNCTIONS)
      noncopyable( const noncopyable& ) = delete;
      noncopyable& operator=( const noncopyable& ) = delete;
#else
  private:  // emphasize the following members are private
      noncopyable( const noncopyable& );
      noncopyable& operator=( const noncopyable& );
#endif
  };

Bonus: When copy Constructor and Assignment Operator are Called

Having done interviews over the years, I discover to my dismay that many job candidates are not aware of when copy constructor and assignment operator are called. Run the code below to see which lines are printed.

class CopyableClass
{
public:
    CopyableClass(int value) : m_Value(value) {}
    CopyableClass(const CopyableClass& that)
    {
        std::cout << "CopyableClass Copy Constructor called!" <m_Value = that.m_Value;
    }
    CopyableClass& operator=(const CopyableClass& that)
    {
        std::cout << "CopyableClass Assignment Operator called!" <m_Value = that.m_Value;
        return *this;
    }

private:
    int m_Value;
};

int main()
{
    CopyableClass a(10);
    CopyableClass b = a; // CopyableClass Copy Constructor called!
    b = a; // CopyableClass Assignment Operator called!

    CopyableClass c(a); // CopyableClass Copy Constructor called!

    return 0;
}

The example code is hosted at Github.